Rob Buckley – Freelance Journalist and Editor

Moving home

Shifting content from one web site to another is not an easy task. But following a number of basic rules can make it easier.

When telecoms equipment maker Ericsson needed to cut costs, one of the most obvious areas of waste was its web site. Or more specifically, its web sites: it had more than 2,000 content management systems and document management systems, both internal and external.

Not all of these, of course, were feeding Ericsson's various web sites, but the organisation nevertheless realised that these needed to be consolidated in order to cut costs, save staff time and streamline the company's branding.

It was the task of Johan Wetterhorn, the global product manager responsible for content management at Ericsson, to consolidate this mish-mash into something more coherent. After a project lasting some three years, he managed to reduce that number from 2,000 to just one.

“Just in the enterprise document management area alone, we had more than 100 local systems out there,” he says.

Even though the excitement of the dot-com era is now more than five years gone, there are still many organisations with multiple web sites running on multiple different systems. Other organisations, meanwhile, need to migrate from older systems to something newer in order to take advantage of the latest features on offer - or for any number of other reasons.

For example, the web site that worked for an organisation when it was small or had a relatively modest amount of content is unlikely to work as effectively as the organisation - or its content - grows. As a result, many organisations consider migrating to another platform, which is where they discover that there are considerable problems moving from one web content management system to another.

Problems and solutions
The simplest and most common migration is to move the system from one server to a more powerful one running exactly the same software - a task that sounds straightforward enough in principle.

If the server is to remain with the same host, the service provider should be able to manage most of the technical aspects of the relocation; service level agreements will ensure that if there are any glitches, the provider will pick up the tab. But if the organisation plans a change of service provider, some additional groundwork is necessary. Before picking a new host, says Ané-Mari Peter, co-founder and managing director of web consultancy on-Idle, organisations should first set themselves a budget and compare costs between Internet service providers (ISPs) and their various packages.

“Requirements such as technology, costs, processing speed, support services, the web space, bandwidth and so on are factors that will determine pricing, and therefore the vendor. But ensure that the service that you are moving to has solid support and preferably a human to speak to as well as email support. If appropriate, inform your customer base if the service or product that you provide may be reduced, affected or withdrawn for a period of time,” says Peter.

The problem of scripting languages
The simplest of web sites can survive on 'static' HTML pages: that is to say, pages written directly in HTML rather than constructed dynamically with the aid of a content management system.

But any site that needs to personalise pages for users, add information from databases or simply include changing data from another site will need a method for creating these 'dynamic' pages.

'Scripting' languages enable web developers to embed programs in pages that can interact with servers and insert HTML according to the developers' rules.

The most common scripting languages are the open source PHP (PHP: Hypertext Preprocessor), Microsoft's Active Server Pages (ASP), the Java J2EE (Java 2 Enterprise Edition) standard, Macromedia's ColdFusion and Java Server Pages (JSP), although there are other languages.

For any migration project, particular attention needs to be paid to both the current system's scripting language and that of the destination system. No two scripting languages have the same syntax or commands, although most work on similar principles, so if the dynamic pages are to be preserved, some kind of translation project needs to be undertaken.

According to David Macken, the managing director of public sector services supplier System Associates, only about one-tenth of web developers have enough experience in two or more scripting languages to perform such a translation. He recommends getting two developers, each with experience of one of the languages, to work together to perform the migration. “If you're going from ASP to JSP or whatever, use a different person to do it: web developers need to know enough languages as it is.”

Because documentation about how the original system works is almost always insufficient or missing, getting one of the original developers to act as half of this team can help. But Macken cautions that, even though the original company may charge for this service, its developers may be uncooperative.

“You often get politics. Company X, which has been looking after the site, has been sacked, so it is usually not very helpful. You might end up having to reverse engineer the site. Often it will be a lot simpler than that, but I have to say that's quite rare,” he says. A further complication can occur if there is also a migration in the underlying database technology that will support the new web site.

Even if the same scripting language is used in the final system, any application that uses the database back end will have to be rewritten, since scripting languages use different commands for different databases. For example, the PHP command to connect to a MySQL database is mysql_connect while the command to connect to a SQL Server database is mssql_connect.
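One common way to limit such rewrites is to keep every database-specific call behind a single function, so that a change of database touches one place rather than every page. The sketch below illustrates the idea in Python rather than PHP, with the standard library's sqlite3 module standing in for whichever database is actually in use; the names open_connection and fetch_titles and the pages table are hypothetical.

```python
import sqlite3


def open_connection(dsn):
    """The only place that knows which database driver is in use.

    Swapping database products would mean changing this function
    (and any vendor-specific SQL), not every page script.
    """
    return sqlite3.connect(dsn)


def fetch_titles(conn):
    """Page code calls this helper instead of driver functions."""
    cur = conn.cursor()
    cur.execute("SELECT title FROM pages ORDER BY id")
    return [row[0] for row in cur.fetchall()]


def demo():
    # Build a throwaway in-memory database to exercise the helpers.
    conn = open_connection(":memory:")
    conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany("INSERT INTO pages (title) VALUES (?)",
                     [("Home",), ("Contact",)])
    titles = fetch_titles(conn)
    conn.close()
    return titles
```

This does not remove the need to check vendor-specific SQL, but it keeps the connection logic out of individual pages.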

A switch in content management system will furthermore mean that the database structure itself will likely change, requiring even more rewrites. The more changes in technology that are made during the migration, the more complex the re-write becomes.

However, even a simple migration can have its share of technical challenges. Kevin Hutchinson, chief technical officer of web traffic analysis company WebtraffIQ, provides some basic advice to help avoid the most common problems.

“Organisations should initiate the change at the 'slowest' time, make back-ups first and ensure that file permissions on the new system are identical to the old one. They should check that the new vendor provides all the services they need. They should allocate at least a week of testing, then point the domain name from the old IP address to the new one. Remember it can take a day or two for the Internet to register this changeover, so allocate sufficient time,” he says. Above all, he adds, it is imperative that businesses make sure that servers, back-up servers and subsequent bandwidth are ready to take on a large amount of traffic with a successful move.
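The file-permission check in particular lends itself to a script. The following is a minimal sketch in Python, assuming both the old and the new document roots can be reached as local directories on a Unix-like system; the function names are illustrative, not part of any hosting tool.

```python
import os
import stat


def permission_map(root):
    """Map each path under root (relative to root) to its permission bits."""
    perms = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            perms[rel] = stat.S_IMODE(os.lstat(full).st_mode)
    return perms


def permission_differences(old_root, new_root):
    """Return {relative_path: (old_mode, new_mode)} where they differ.

    A mode of None means the path is missing on that side.
    """
    old, new = permission_map(old_root), permission_map(new_root)
    diffs = {}
    for rel in sorted(set(old) | set(new)):
        if old.get(rel) != new.get(rel):
            diffs[rel] = (old.get(rel), new.get(rel))
    return diffs
```

An empty result suggests the permissions match; anything else is a list of files to fix before the domain name is repointed.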

Switching platforms
More often than not, an organisation needs to migrate to a completely different platform, rather than move its existing system onto a more powerful server.

This is a bigger challenge, so any organisation contemplating such a move first needs to audit exactly how much work will be involved. Many people planning migrations assume that HTML content can simply be moved over 'as is', not realising that such a transfer is anything but trivial.

Any switch of technology will invariably require a rewriting of the majority of, if not all, existing web pages - particularly if they are 'dynamic' rather than 'static'. Even migrating relatively uncomplicated HTML pages to a new content management system will often require some cleaning up:

  • Links will need to be rewritten so that they refer correctly to the new pages. Most modern content management systems ensure consistency through a link management system rather than hard-coded links.
  • To preserve valuable deep links from other sites, organisations should get the destination system to redirect surfers to the new URLs by maintaining a database of old links and their replacements.
  • Image references will also need to be changed once they get added to the content management system's image library and extra data might be needed, such as image descriptions, dimensions, size, licensing rights and so on. This will always involve some degree of human interaction, something that should be considered when allocating resources and time.
  • Adding metadata at this stage, particularly to meet e-government guidelines, might well be necessary - or might even be one of the main reasons for the migration. A fully automated approach may be possible with certain systems, although some degree of human input will probably also be necessary.
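The first two points above, a table of old links and their replacements used both to redirect visitors and to rewrite links in migrated pages, can be sketched as follows. This is an illustration in Python under stated assumptions: the URL paths in REDIRECTS are invented, a real system would hold the mapping in a database, and the regular expression only handles double-quoted href attributes.

```python
import re

# Hypothetical mapping of old URL paths to their replacements.
REDIRECTS = {
    "/products/phones.htm": "/en/products/phones",
    "/about/press.htm": "/en/company/press",
}


def resolve_redirect(path, redirects=REDIRECTS):
    """Return (status, location) for a request to an old path.

    301 marks a permanent move; unknown paths get a 404 and no location.
    """
    target = redirects.get(path)
    if target is None:
        return 404, None
    return 301, target


HREF = re.compile(r'href="([^"]*)"')


def rewrite_links(html, redirects=REDIRECTS):
    """Replace old href values in migrated HTML with their new URLs."""
    def swap(match):
        old = match.group(1)
        # Links with no entry in the mapping are left untouched.
        return 'href="%s"' % redirects.get(old, old)
    return HREF.sub(swap, html)
```

Keeping a single mapping for both jobs means that deep links from other sites and internal links in the migrated content cannot drift apart.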

Web monkeys
David Macken, the managing director of public sector services supplier System Associates, says that the main challenge facing anyone attempting to cleanse old HTML pages is normally in how the original authors wrote the HTML in the first place - interpreting their code is never straightforward. “You're supposed to separate formatting from content [using stylesheets], but you'll always end up with a certain amount of embedded styles: people will want bullet points and so forth.”

Users will add HTML tags, often incorrectly, to format the content the way they want. When that content is added to the new system, however, this formatting is likely to override the new site's formatting, so needs to be purged or rewritten accordingly.
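A purge of this kind might look something like the sketch below, which uses Python's standard html.parser module to re-emit a fragment without presentational tags and attributes. The lists DROP_TAGS and DROP_ATTRS are assumptions and would need tuning for each site; comments and doctype declarations are silently dropped, and self-closing tags are written back in their plain form.

```python
from html import escape
from html.parser import HTMLParser

# Assumed lists of presentational markup to purge; adjust per site.
DROP_TAGS = {"font", "center"}
DROP_ATTRS = {"style", "align", "bgcolor", "color", "face", "size"}


class FormattingStripper(HTMLParser):
    """Re-emit HTML without presentational tags and attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in DROP_TAGS:
            return  # drop the tag but keep the text inside it
        kept = "".join(
            ' %s="%s"' % (name, escape(value or "", quote=True))
            for name, value in attrs
            if name not in DROP_ATTRS
        )
        self.out.append("<%s%s>" % (tag, kept))

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> have no separate end tag.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag not in DROP_TAGS:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        # Character references were decoded by the parser; re-escape them.
        self.out.append(escape(data, quote=False))


def strip_formatting(html):
    parser = FormattingStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Structural tags such as paragraphs, lists and emphasis survive, so the new site's stylesheet can take over their presentation.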

In the 1990s, many web sites were developed without the use of a content management system to streamline the storage of content and its presentation. Instead, they were almost entirely hand-coded by a handful of web developers. That can make the cleansing stage particularly problematic as the migration team struggles to understand the practices of the site's original developers - and all those that came after them. “The worst sites are ones that have been around for more than two years, especially those run without content management systems,” says Macken.

“The web team has had maybe 15 different members of staff over that time, all using different tools. Most of it is impossible to translate, particularly the stuff created by the Macromedia Fireworks application, which takes an image and chops it up - the last 5% of the HTML might be the paragraph that appears at the top of the page. It's a nightmare,” he says.

As a result, analyst group Bloor Research estimates that it costs as much as £10 to clean up just one existing web page for re-use in another system. Any kind of automated tool for cleaning up pages will therefore save a considerable amount of time and money.
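To see how quickly that figure adds up, the arithmetic can be sketched in a few lines of Python. The £10 per page is the upper figure quoted above; the automated_fraction parameter and the assumption that automatically cleaned pages cost nothing are simplifications introduced here for illustration.

```python
COST_PER_PAGE_GBP = 10  # upper figure quoted from Bloor Research


def manual_cleanup_cost(pages, automated_fraction=0.0,
                        cost_per_page=COST_PER_PAGE_GBP):
    """Upper-bound cost in pounds of cleaning the remaining pages by hand.

    automated_fraction is the (hypothetical) share of pages a tool
    cleans without human effort, treated here as costing nothing.
    """
    if not 0.0 <= automated_fraction <= 1.0:
        raise ValueError("automated_fraction must be between 0 and 1")
    return pages * (1.0 - automated_fraction) * cost_per_page
```

On these assumptions, a hypothetical 5,000-page site would cost up to £50,000 to clean entirely by hand, and half that if a tool could handle half of the pages.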

Many content management systems have import tools, but frequently these require consultants to manage the migration process successfully. Vamosa's Content Migrator and Nahava's Content Re-engineering Solution are among the most powerful tools for automating data migration.

Vamosa's software, for example, was used by the Department of Health to migrate a 70,000 page site to the government's DotP content management system, while Nahava's has been employed successfully by FedEx among others. Any organisation that is looking to migrate its own content should evaluate such tools to see if they would work with their own sites: Vamosa claims it can save four-fifths of the time it would take to conduct a migration manually.

Consultancy help might still be necessary, however, and System Associates uses its own scripts as well as a simpler tool, HTML Tidy, for its migration projects. Other tools, such as those for cleaning the badly written HTML that Microsoft Word typically generates, can also be of use.

However, warns Macken, typically only about half of the cleansing required in a content migration exercise can be automated with such tools; roughly a further one-third can be 'semi-automated', with some human intervention required; while the remainder, about 20%, will require a human being to supervise the whole translation and cleaning.

Subtleties
A move from one heavyweight content management system to another will need even more careful planning. Many of the details were covered in Best Practice last month, but there are some subtleties peculiar to some web content management systems that bear mentioning.

Not only will the content need migrating, but the templates used by the existing system will need rewriting for the new system and workflows might need migrating and converting as well. These will often be incompatible, however, and may even operate using different approaches. For example, some systems let users 'check out' content, locking it while they work on it, before releasing it by checking it back in. Others, however, will allow users to pass content to other users and allocate tasks.

Converting from one system to another simply may not be worth the effort. Frequently, starting afresh with the same users, but with content either published or ready to be workflowed anew, will provide the simplest and most cost effective approach.

Online questionnaires, especially those set up by third parties, will need either to be rewritten from scratch or to be brought over as static content, unless they were developed on a platform-independent medium, such as Macromedia Flash.

With any large site, migrating all the content at once is unfeasible, so the migration should proceed section by section. 'Spiders' - programs that can read web pages and check them for specific content - can be set up to crawl the finished pages to ensure link consistency, and both sites should be running concurrently for some time to ensure a smooth transition while bugs are ironed out.
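The core of such a spider is small. The sketch below, in Python, crawls from a starting page and reports links whose targets do not exist. To stay self-contained it reads pages from a dictionary (pages, a stand-in for real HTTP requests) and treats any link that points outside that dictionary, including links to external sites, as broken; the function and class names are illustrative.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def find_broken_links(pages, start):
    """Crawl from start and return sorted (page, target) pairs that fail.

    pages maps absolute URL -> HTML; it stands in for real HTTP fetches.
    A link counts as broken when its target is not a key of pages.
    """
    seen, queue, broken = set(), [start], []
    while queue:
        url = queue.pop()
        if url in seen:
            continue
        seen.add(url)
        extractor = LinkExtractor()
        extractor.feed(pages[url])
        for href in extractor.links:
            # Resolve relative links and ignore #fragment anchors.
            target = urldefrag(urljoin(url, href)).url
            if target in pages:
                queue.append(target)
            else:
                broken.append((url, target))
    return sorted(broken)
```

Run against each migrated section in turn, a check of this kind gives an early list of links that still point at pages left behind on the old site.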

Migrating a web site is by no means as easy as it sounds, with obstacles both technological and financial that need to be overcome. However, by using the appropriate tools, judiciously choosing what can be discarded and what should be kept, and employing the right skills, it is possible to keep the pain - and cost - to a minimum.
