Rob Buckley – Freelance Journalist and Editor

Double trouble

  • M-iD, September 2005
Duplicate records undermine the credibility of any records management system, but preventing slip-ups is a complex challenge.

Says Computacenter's Gay, "It's easy to criticise, but if people can implement it and make it work for a small team of people, very, very quickly, actually there's a lot to be said for it. Scale will determine the outcome there."

The main requirement for these capture-related processes is that all staff involved know about them, and that the processes are well documented and easy to follow.

Content management control

When dealing with electronic documents, a content management system can restrict users' ability to duplicate documents. It can also impose versioning controls and a taxonomy system. The taxonomy lets users classify documents according to their themes so that duplicates, if they do arise, are easier to spot.

Many content management systems, such as Open Text's LiveLink and IBM DB2 Common Store, will also use linking to prevent document duplication. Tracy Caughell, product manager at Open Text, explains: "If a user is unaware that a document with the exact same data is already in the system but under a different classification, LiveLink will only store one version and then use a pointer to the other document instead. That all happens without the user even knowing."
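To make the pointer idea concrete, here is a minimal Python sketch of single-instance storage. It is not how LiveLink or Common Store is implemented; the class name, the use of SHA-256 and the classification paths are all assumptions made for illustration.

```python
import hashlib


class SingleInstanceStore:
    """Hypothetical store: one copy per distinct content, many pointers to it."""

    def __init__(self):
        self.blobs = {}     # digest -> document bytes, stored once
        self.pointers = {}  # (classification, name) -> digest

    def add(self, classification, name, data):
        """File a document; return True only if its content was new to the store."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self.blobs
        if is_new:
            self.blobs[digest] = data
        # The classification entry is only a pointer to the stored content.
        self.pointers[(classification, name)] = digest
        return is_new

    def get(self, classification, name):
        """Follow the pointer and return the document bytes."""
        return self.blobs[self.pointers[(classification, name)]]


store = SingleInstanceStore()
first = store.add("Finance/Contracts", "supplier.doc", b"Terms and conditions")
second = store.add("Legal/Agreements", "supplier-copy.doc", b"Terms and conditions")
```

In this sketch the second call files the document under a different classification but stores no new content, so the user sees two entries while the system holds one copy.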

Content management systems will often allow users to check out documents so they can use them on their laptops. Without appropriate controls, this can often lead to duplication as different people work on copies of the same document before checking them back in. Tight policies that only allow a single instance of the document (or none at all) to be checked out can prevent this.
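A single-instance check-out policy of this kind can be sketched in a few lines of Python. The class and method names below are hypothetical and do not correspond to any particular product; a real system would also need persistence, time-outs and administrator overrides.

```python
class CheckoutError(Exception):
    """Raised when a check-out or check-in breaks the policy."""


class CheckoutRegister:
    """Hypothetical register allowing at most one check-out per document."""

    def __init__(self):
        self.holders = {}  # document id -> user currently holding it

    def check_out(self, doc_id, user):
        holder = self.holders.get(doc_id)
        if holder is not None:
            raise CheckoutError(f"{doc_id} is already checked out by {holder}")
        self.holders[doc_id] = user

    def check_in(self, doc_id, user):
        if self.holders.get(doc_id) != user:
            raise CheckoutError(f"{doc_id} is not checked out by {user}")
        del self.holders[doc_id]


register = CheckoutRegister()
register.check_out("policy-7", "alice")
try:
    register.check_out("policy-7", "bob")  # second copy is refused
    blocked = False
except CheckoutError:
    blocked = True
```

Because the second request is refused rather than handed a copy, there is only ever one working version to reconcile at check-in.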

Imam Hoque, head of the technology innovation group at Detica, advises organisations to use read-only document formats such as PDF to prevent alterations being made to documents if they're checked out. "You should try where possible to change business processes so that people find it easier not to send original versions of documents."

Andy Maurice, head of consulting at Iron Mountain, recommends creating a list of 'registered' and 'unregistered' documents. The registered documents are the master documents that need to be kept, whereas unregistered documents are working versions that can be deleted. The important thing, he says, is that people are aware that the unregistered versions carry no weight and that any copy they make of a registered document is unregistered and should be destroyed when finished with. Certainly, the copies should almost never be declared as records.

Similar versus same

Deciding how similar documents need to be before they're deemed duplicates is something every organisation needs to consider, sometimes case-by-case in highly regulated environments. Documents identical in every way are clearly duplicates, but documents that are identical in content yet differ in metadata could count as duplicates for organisations working in one market and not for organisations in another. "Theoretically, if the metadata associated with a document is used as part of a business process, that could change the context and meaning in which the document is being used," says Hoque. "Something like the date of approval might be an important event for auditors."

Certain systems will throw up these fuzzy duplicates as a matter of course. Emails sent to multiple recipients will create different copies of the same mail for each recipient. These might differ in metadata, such as the exact path taken to reach the recipient, the order of recipients in the to: field and so on. Yet many organisations will regard them as duplicates that take up vital space on a mail server.
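One way to treat such copies as duplicates is to compare only the fields an organisation cares about. The Python sketch below is an illustration under stated assumptions: it handles plain-text, single-part messages only, the function name `duplicate_key` is hypothetical, and the choice of fields (sender, recipients, subject, body) is a policy decision that would differ between organisations.

```python
import hashlib
from email import message_from_string
from email.utils import getaddresses


def duplicate_key(raw_message):
    """Build a key that ignores delivery path and recipient order."""
    msg = message_from_string(raw_message)
    senders = [addr.lower() for _, addr in getaddresses(msg.get_all("From", []))]
    # Sorting removes differences in the order of the To: field.
    recipients = sorted(
        addr.lower() for _, addr in getaddresses(msg.get_all("To", []))
    )
    subject = msg.get("Subject", "")
    body = msg.get_payload().strip()  # assumes a plain-text, single-part message
    # Received: headers (the delivery path) are deliberately left out.
    material = "\n".join([",".join(senders), ",".join(recipients), subject, body])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


copy_a = (
    "Received: from relay1.example.com\n"
    "From: Ann <ann@example.com>\n"
    "To: bob@example.com, carol@example.com\n"
    "Subject: Q3 figures\n"
    "\n"
    "Please see the attached figures.\n"
)
copy_b = (
    "Received: from relay2.example.com\n"
    "From: Ann <ann@example.com>\n"
    "To: carol@example.com, bob@example.com\n"
    "Subject: Q3 figures\n"
    "\n"
    "Please see the attached figures.\n"
)
same_mail = duplicate_key(copy_a) == duplicate_key(copy_b)
```

An organisation that needs the delivery path for audit purposes would add those headers back into the key, at which point the two copies would no longer match.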

Some systems can take care of this duplication automatically. IBM's Common Store system provides additional folders in Outlook and Lotus Notes, into which users can drag any mail messages they want to declare as records. At the point of declaration, the system will check for an existing copy of the message and warn the user if there is one. LiveLink, by contrast, won't try to prevent duplication of mail messages, but will use hashing algorithms to work out if email attachments in the system are identical and then consolidate them.
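The attachment approach can be illustrated with Python's standard email and hashlib modules. This is a sketch of the general technique, not LiveLink's actual algorithm; SHA-256, the helper `build_message` and the sample file names are assumptions made for the example.

```python
import hashlib
from collections import defaultdict
from email.message import EmailMessage


def build_message(subject, filename, payload):
    """Hypothetical helper that creates a message with one binary attachment."""
    msg = EmailMessage()
    msg["Subject"] = subject
    msg.set_content("See attachment.")
    msg.add_attachment(
        payload, maintype="application", subtype="octet-stream", filename=filename
    )
    return msg


def consolidate_attachments(messages):
    """Map each attachment digest to the (subject, filename) pairs carrying it."""
    by_digest = defaultdict(list)
    for msg in messages:
        for part in msg.iter_attachments():
            data = part.get_payload(decode=True)  # decoded attachment bytes
            digest = hashlib.sha256(data).hexdigest()
            by_digest[digest].append((str(msg["Subject"]), part.get_filename()))
    return dict(by_digest)


messages = [
    build_message("Budget", "budget.xls", b"spreadsheet bytes"),
    build_message("Fwd: Budget", "budget-final.xls", b"spreadsheet bytes"),
    build_message("Minutes", "minutes.doc", b"other bytes"),
]
groups = consolidate_attachments(messages)
duplicates = {d: refs for d, refs in groups.items() if len(refs) > 1}
```

Here the two budget attachments have different file names but identical bytes, so they fall under one digest and could be stored once; the minutes file stays separate.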

Deletion dangers
