Migratory patterns
- Article 20 of 26
- M-iD, July 2005
Shifting content from one system to another is never easy. But following a few key steps can ease the process.
Very few IT systems last forever. Vendors go bust, get taken over or just stop developing particular products. More frequently, organisations change direction or simply want to offer new products and services and find that their existing systems will not be able to support the required changes.
So navigating the migration of content from one system to another is often an important and necessary task for many IT departments. Yet it is also fraught with difficulty.
Vendors often do not make it easy for users to abandon their products - it is not in their interests. But equally, the way one system works and stores its data is rarely similar to the way that any other system does, so a straight import and export is almost never possible.
Fortunately, there are steps that can be taken to minimise the pain and expense of such a move.
Step-by-step
The first step is to identify the business case for the move and to see which migration aspects can be justified - and what data can safely be abandoned or archived. This needs to be decided by both business and IT staff.
Bernard Cadogan, a content management technical specialist at computer giant IBM, says that often, “business people start off with a proposition that everything needs to be migrated, but without necessarily having an appreciation of what this involves in cost, timescale or disruption to existing processes.”
The content of the data needs to be closely examined first to determine what, exactly, needs to be migrated - and what can or should be dumped.
The organisation should therefore assess how much data there is, whether any of it can be left on the existing system - which might be kept on as a read-only data source - and how much is disposable or superfluous.
Once that has been accomplished and signed off by appropriate sponsors within the business, an assessment is required of what systems are in use and how much knowledge of their workings is available.
If it is a relatively recent system, there may well be a high degree of understanding about how it works, either internally or among consultants and systems integrators. But often it is an old or bespoke system that the organisation hopes to retire, about which little information might be available.
Tim Frazer, an account manager at Macro 4 who has helped with several mainframe migration projects, says this may not be a cause for too much concern. “A mainframe application may be 30 years old, the guy who wrote it is no longer at the company and it was never properly documented anyway. But quite often there is someone in the organisation who knows or an organisation such as Capgemini or Accenture can do a detailed analysis.”
Nevertheless, the organisation should not overstate its knowledge of how its systems work, particularly if it expects an outside contractor to do the work.
Problems, problems
Shane Williams of Mobius MPS Services says that typically, he will rely on the resources of customers to get good information. “We rely on a certain level of trust in the customer about how their system is set up. But quite often we find out they don't really know how it is being used. In the end, we find that there are more exceptions flagged up [during the migration] than those that follow the rules.”
After analysing the data and the format in which it is held, the next task is to extract it. Most modern content management systems can export data in a variety of useful formats and databases are readily accessible using standard administration tools. There are usually translators or wholesale importers for flat files, while many programs also have functions for exporting data in standard formats such as CSV, PDF, JPG or TIFF.
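As a sketch of the kind of flat-file export described above, a few lines of scripting are often all that is needed once the data is reachable through standard tools. The database, table and column names here are invented for illustration; a real legacy system will differ:

```python
import csv
import sqlite3

# Hypothetical legacy store, stood up in memory for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER, title TEXT, body TEXT)")
conn.execute("INSERT INTO documents VALUES (1, 'Policy', 'Full text here')")

# Dump every row to a CSV file that the destination system can import.
with open("export.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["id", "title", "body"])  # header row for the importer
    for row in conn.execute("SELECT id, title, body FROM documents"):
        writer.writerow(row)
```

CSV is rarely the whole answer - binary content and rich metadata need other formats - but for tabular data it remains the lowest common denominator between systems.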
However, while HTML is also a standard format, it may present many migration challenges since it will often reflect how the old system operated. Formatting information may need to be removed, tags stripped, links re-written and poorly written code cleaned up, usually through a vendor-provided tool or custom script.
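A minimal sketch of that sort of clean-up, using Python's standard html.parser module to drop presentational tags and keep only the text. The link-rewriting rule and the example markup are assumptions for illustration, not taken from any particular vendor tool:

```python
from html.parser import HTMLParser

class ContentExtractor(HTMLParser):
    """Strips markup, keeping text and rewriting links to relative form."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # Hypothetical rule: old absolute URLs become site-relative.
            href = dict(attrs).get("href", "")
            self.parts.append("[link: %s]" % href.replace("http://old.example.com", ""))

    def handle_data(self, data):
        self.parts.append(data)

extractor = ContentExtractor()
extractor.feed('<font size="2">Hello <a href="http://old.example.com/page1">here</a></font>')
clean = "".join(extractor.parts)
```

In practice the rewrite rules grow case by case as the old system's quirks surface, which is why this step is usually scripted rather than done with a one-size-fits-all tool.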
There are also extraction and conversion tools that claim to be able to move data from virtually any source to any other. For example, Datawatch claims its Monarch software can take data from almost any source that can print, and then re-purpose it for a new content repository.
As well as content, the organisation will usually also need to migrate metadata - usually an easy task of field mapping - and, perhaps, workflow data. The latter is far harder since there are almost no generic tools for this purpose. As a result, custom tools and scripts will need to be written or the idea of workflow migration abandoned.
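The field-mapping side of metadata migration can be as simple as a lookup table from legacy field names to their equivalents in the new schema. The names below are invented for illustration:

```python
# Hypothetical mapping of legacy metadata fields to the new schema.
FIELD_MAP = {
    "doc_title": "title",
    "created_dt": "created_date",
    "auth_name": "author",
}

def map_metadata(legacy_record):
    """Rename known fields; unmapped fields pass through unchanged
    so nothing is silently lost during the migration."""
    return {FIELD_MAP.get(key, key): value for key, value in legacy_record.items()}

mapped = map_metadata({"doc_title": "Annual report",
                       "auth_name": "J. Smith",
                       "dept": "Finance"})
```

Passing unknown fields through, rather than dropping them, makes it easier to spot gaps in the mapping during quality assurance.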
After export, the organisation needs to consider the best way to import the data into the destination system. Structured data will often need some kind of transformation to match the destination system's structures, although in bespoke systems, this task will usually be smaller, since the organisation can specify what those structures will be.
Data cleansing
The organisation should also consider whether to clean the data at this point or whether to load it into the destination system before cleaning. This is an important process. “During the discovery phase, one of the things we often show is data in the system they didn't know was there. Lots of content management systems may have multiple copies of certain pages over and over again. They need to be cleaned up before going into the end system,” says Williams.
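One common way to flag the duplicate pages Williams describes is to hash each item's content and group items that share a digest. This is a sketch, assuming the content has already been extracted as text:

```python
import hashlib
from collections import defaultdict

# Hypothetical extracted pages, keyed by page ID.
pages = {
    "page-001": "Terms and conditions",
    "page-002": "Contact us",
    "page-003": "Terms and conditions",  # byte-identical duplicate of page-001
}

# Group page IDs by the SHA-256 digest of their content.
by_digest = defaultdict(list)
for page_id, content in pages.items():
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    by_digest[digest].append(page_id)

# Any digest shared by more than one page indicates duplicated content.
duplicates = [ids for ids in by_digest.values() if len(ids) > 1]
```

Exact hashing only catches byte-identical copies; near-duplicates (reformatted or lightly edited pages) need fuzzier comparison and usually a human decision.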
IBM's Cadogan points out that early content management systems often stored longer pieces of content as separate pages, while newer systems store the whole document as an object, splitting it up at the display stage.
Any migration project needs to combine these separate pages into a single object. However, he adds that it may be more appropriate to have the transformation rules built into the final system, depending on the complexity of the transformation.
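Reassembling page-per-record content into single objects is mostly a matter of grouping fragments by document ID and concatenating them in page order. A sketch, with invented identifiers:

```python
from collections import defaultdict

# Hypothetical legacy rows: (document_id, page_number, text).
# Note the pages arrive in arbitrary order.
fragments = [
    ("DOC-7", 2, "second page."),
    ("DOC-7", 1, "First page, "),
    ("DOC-9", 1, "A one-page document."),
]

# Group page fragments under their parent document.
grouped = defaultdict(list)
for doc_id, page_no, text in fragments:
    grouped[doc_id].append((page_no, text))

# Sort each group by page number and join into one object per document.
documents = {
    doc_id: "".join(text for _, text in sorted(parts))
    for doc_id, parts in grouped.items()
}
```

Whether this runs as a pre-load script or as a transformation rule inside the destination system is, as Cadogan notes, a judgement call that depends on how complex the reassembly is.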
Quality assurance needs to be continuous during the migration, rather than at the end. Not only will that make it easier to identify problems, it will also make it easier to provide an audit trail. Unfortunately, there is no one set way of providing this kind of assurance and organisations will need to devise procedures appropriate to the different systems and data.
Macro 4's Frazer recalls one project that came close to migrating only 149 of the 150 tapes of data produced, because the customer had kept no records of the data generated during the export.
More worryingly, says IBM's Cadogan, simple “count them out, count them in” principles of assurance are unlikely to work since almost invariably, there will be some discrepancies between the indexing in a system and the number of objects, potentially because of corruption and other factors.
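Because raw counts can disagree for legitimate reasons, a reconciliation that compares identifiers on each side, reporting what is missing rather than just how many, gives a more useful audit trail. A sketch, with invented IDs:

```python
# IDs the old system's index claims to hold, versus IDs actually
# present in the destination after migration (invented examples).
source_index = {"A1", "A2", "A3", "A4"}
migrated = {"A1", "A2", "A4", "B9"}

# Set differences pinpoint the discrepancies a simple count would hide.
missing = sorted(source_index - migrated)      # indexed but never migrated
unexpected = sorted(migrated - source_index)   # migrated but not in the index
```

A simple count would report 4 objects on each side and pass; the set comparison shows one object lost and one unaccounted for, each of which needs investigating.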
The final issue is the switchover. The organisation needs to decide how much downtime it is prepared to put up with. Planning a switchover to coincide with a weekend - or better still, a bank holiday weekend - usually provides enough time for the switch.
But a phased implementation of the different aspects of the end system may be more helpful in some situations. A rapid switchover will often also force an abrupt change to new methods and processes, which can make the situation far worse if they go wrong.
Although migrating content from one system to another is by no means an easy task, if the proper procedures and processes are put in place and followed, it becomes a far more achievable project. While it may never be entirely trouble-free, it will at least be relatively painless.
