Migratory patterns
- Article 20 of 26
- M-iD, July 2005
Shifting content from one system to another is never easy. But following a few key steps can ease the process.
Tim Frazer, an account manager at Macro 4 who has helped with several mainframe migration projects, says this may not be a cause for too much concern. “A mainframe application may be 30 years old, the guy who wrote it is no longer at the company and it was never properly documented anyway. But quite often there is someone in the organisation who knows or an organisation such as Capgemini or Accenture can do a detailed analysis.”
Nevertheless, the organisation should not overstate its knowledge of how its systems work, particularly if it expects an outside contractor to do the work.
Problems, problems
Shane Williams of Mobius MPS Services says that typically, he will rely on the resources of customers to get good information. “We rely on a certain level of trust in the customer about how their system is set up. But quite often we find out they don't really know how it is being used. In the end, we find that there are more exceptions flagged up [during the migration] than those that follow the rules.”
After analysing the data and the format in which it is held, the next task is to extract it. Most modern content management systems can export data in a variety of useful formats, and databases are readily accessible using standard administration tools. There are usually translators or wholesale importers for flat files, while many programs also have functions for exporting data in standard formats such as CSV, PDF, JPG or TIFF.
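As a rough sketch of the simplest case, a flat-file export to CSV can be done in a few lines of Python; the records and field names below are purely illustrative, not drawn from any particular product:

```python
import csv
import io

# Illustrative records as they might look once read out of a legacy
# system; the field names here are hypothetical.
records = [
    {"id": "1001", "title": "Annual report", "format": "TIFF"},
    {"id": "1002", "title": "Press release", "format": "PDF"},
]

# Write the records out as CSV, header row first.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["id", "title", "format"])
writer.writeheader()
writer.writerows(records)
exported = buffer.getvalue()
```

In practice the export would be driven by the source system's own tools, but the target shape - one row per item, one column per field - is the same.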
However, while HTML is also a standard format, it may present many migration challenges since it will often reflect how the old system operated. Formatting information may need to be removed, tags stripped, links re-written and poorly written code cleaned up, usually through a vendor-provided tool or custom script.
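A minimal sketch of that kind of clean-up can be built on Python's standard html.parser. The tag whitelist and the "/old-site/" link prefix below are illustrative assumptions about the legacy system, not features of any real product:

```python
from html.parser import HTMLParser

class ContentExtractor(HTMLParser):
    """Strip presentational tags and rewrite legacy links.
    KEEP and the '/old-site/' prefix are hypothetical examples."""
    KEEP = {"p", "h1", "h2", "h3", "ul", "ol", "li", "a"}

    def __init__(self):
        super().__init__()
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.KEEP:
            return  # drop formatting tags such as <font> or <center>
        if tag == "a":
            # Rewrite links that still point at the old site.
            attrs = [("href", v.replace("/old-site/", "/content/"))
                     if k == "href" else (k, v) for k, v in attrs]
        rendered = "".join(f' {k}="{v}"' for k, v in attrs)
        self.out.append(f"<{tag}{rendered}>")

    def handle_endtag(self, tag):
        if tag in self.KEEP:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

legacy = '<center><p>See <a href="/old-site/faq.htm">the FAQ</a></p></center>'
parser = ContentExtractor()
parser.feed(legacy)
clean = "".join(parser.out)
```

Real migrations need a much fuller rule set, but the pattern - walk the markup, keep what carries meaning, rewrite what points at the old system - is the same.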
There are also extraction and conversion tools that claim to be able to move data from virtually any source to another. For example, Datawatch claims its Monarch software can take data from almost any source that can print, and then re-purpose it for a new content repository.
As well as content, the organisation will usually also need to migrate metadata - usually an easy task of field mapping - and, perhaps, workflow data. The latter is far harder since there are almost no generic tools for this purpose. As a result, custom tools and scripts will need to be written or the idea of workflow migration abandoned.
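Field mapping of that sort often amounts to little more than a lookup table, with anything unmapped flagged as an exception for review. The legacy and destination field names in this sketch are hypothetical:

```python
# Hypothetical mapping from legacy metadata fields to the
# destination system's schema; all field names are illustrative.
FIELD_MAP = {
    "doc_title": "title",
    "auth_name": "author",
    "create_dt": "created",
}

def map_metadata(record: dict) -> dict:
    """Rename known fields; collect unmapped ones for manual review."""
    mapped, exceptions = {}, {}
    for field, value in record.items():
        if field in FIELD_MAP:
            mapped[FIELD_MAP[field]] = value
        else:
            exceptions[field] = value
    return {"mapped": mapped, "exceptions": exceptions}

legacy = {"doc_title": "Q2 report", "auth_name": "J. Smith",
          "dept_code": "FIN"}
result = map_metadata(legacy)
```

The exceptions bucket matters: as Williams notes above, migrations tend to surface more records that break the rules than records that follow them.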
After export, the organisation needs to consider the best way to import the data into the destination system. Structured data will often need some kind of transformation to match the destination system's structures, although in bespoke systems, this task will usually be smaller, since the organisation can specify what those structures will be.
Data cleansing
The organisation should also consider whether to clean the data at this point or whether to load it into the destination system before cleaning. This is an important process. “During the discovery phase, one of the things we often show is data in the system they didn't know was there. Lots of content management systems may have multiple copies of certain pages over and over again. They need to be cleaned up before going into the end system,” says Williams.
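One common way to find those duplicate pages is to hash each page's normalised content and flag any hash that occurs more than once. This sketch assumes the pages are plain text and is not tied to any particular system:

```python
import hashlib

def find_duplicates(pages: dict) -> dict:
    """Group page IDs by a hash of their normalised content,
    so identical copies can be reviewed before import."""
    groups = {}
    for page_id, body in pages.items():
        # Normalise trivially (trim, lower-case) before hashing.
        digest = hashlib.sha256(body.strip().lower().encode()).hexdigest()
        groups.setdefault(digest, []).append(page_id)
    # Keep only groups with more than one page: the duplicates.
    return {d: ids for d, ids in groups.items() if len(ids) > 1}

pages = {
    "p1": "Contact us",
    "p2": "About the company",
    "p3": "Contact Us ",  # duplicate differing only in case/whitespace
}
dups = find_duplicates(pages)
```

How aggressively to normalise before hashing is a policy decision: too little and near-identical copies slip through, too much and distinct pages are wrongly merged.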
IBM's Cadogan points out that early content management systems often stored longer pieces of content as separate pages, while newer systems store the whole document as an object, splitting it up at the display stage.