Long Term Archiving

Recently, my old friends at UHY Hacker Young made the news as they celebrated the closure of a 34-year-old Corporate Recovery case – there was only one person still at the firm who had been on the case when it started, and he is now the Managing Partner.

My immediate thought (which I admit says more about me than it does about anything else) was “I wonder how many different WP systems have been used on that file in its life?”. You can envisage how the file starts with yellowing typewritten appointment documents, and moves through early impact printers with WordStar, WordPerfect, the first versions of Microsoft Word, and on to every iteration of Microsoft Office.

If that client file had been held in electronic files from the earliest days (just possible in 1975, I suppose) would they still be accessible?

One of the issues occupying the minds of Document Management professionals is the long-term viability of electronic documents. How can you be sure that the electronic file you store today will still be readable in ten or twenty years? Will your Office 2003 document be readable by a copy of Office 2023? I just checked – Word 2007 can still open WordPerfect 5 documents, but WordStar is not on the list, let alone things like IBM DisplayWrite, Samna Word, BusiPost, MultiMate, Symphony, or the other long-extinct systems that we all used in the 1980s.

There is a classic case study that is often referenced – The BBC Domesday Project.

In 1984, the BBC initiated an educational project to create an ‘Electronic Domesday Book’. Schools all over the country were invited to contribute to a collection of photos, articles and drawings from schoolchildren. This information was digitised and written to a 12″ laser disk that could be accessed via appropriate software on a BBC Microcomputer with a custom-built Philips laser-disk player – one of the first generation of consumer optical players. Nowadays we all just use Google Earth, but then it demonstrated a real breakthrough in what IT could do to collate data from multiple sources and make it navigable by anybody.

The project ran its course – everyone was very happy and life went on. The laser disk player, as is the way of things, was superseded by CD and DVD, and went out of production.

Some years later, it became clear that there was a real risk that the project would be lost to future generations because the analogue laser-disk format had fallen out of use, and it transpired that nobody (including Philips themselves) could actually locate a working example of the player. Every school had gradually replaced their BBC micros, chucked the players out, and moved on. Nobody had thought to preserve a player, because everyone assumed that there were lots of them floating about. There weren’t.

It took a real effort by enthusiasts to re-unite a working Domesday system. Further work was then needed to extract the information and transcode it into digital format so that it could be stored onto modern digital media at the National Archives.

For more details on the restoration project for the BBC Domesday system, read this…
http://www.ariadne.ac.uk/issue36/tna/

My point is – today’s ubiquitous storage format can vanish surprisingly rapidly – one day the shops are full of cassette tapes, and suddenly, they aren’t. Minidisk? DAT tapes? VHS tapes? They’re all gone or going, and they’ve all been used at some time as computer archive media.

Where your business documents are concerned, the problem is the same – I know firms that still have collections of WordPerfect documents on their network – but no working copy of WordPerfect. Luckily Microsoft Office can still read those files, but that cannot be assured for the future.
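If you want to know how exposed you are, a quick inventory is a sensible first step. Here is a minimal Python sketch that walks a folder tree and counts files whose extension is on a watch-list. The watch-list and the function name `count_legacy_files` are my own illustrative assumptions, not a definitive catalogue, and a file extension is only a hint at the real format.

```python
from collections import Counter
from pathlib import Path

# Hypothetical watch-list of extensions associated with older office software.
# Treat it as a starting point and adjust it for your own archive.
LEGACY_EXTENSIONS = {".wpd", ".wp5", ".ws", ".sam", ".wk1"}


def count_legacy_files(root):
    """Count files under `root` whose extension is on the watch-list."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        extension = path.suffix.lower()  # compare case-insensitively
        if path.is_file() and extension in LEGACY_EXTENSIONS:
            counts[extension] += 1
    return counts
```

Pointing `count_legacy_files` at the top of a shared drive gives a rough count per extension, which is enough to decide whether a conversion exercise is worth planning.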

A while back, Adobe engaged in a project with the ISO to develop a file format (PDF/A) that would remain readable over very long periods. The file format is based on good old PDF, but it includes extra information to ensure that future computer systems don’t need access to ANY external resources to make the files accessible. An increasing number of PDF products now support the PDF/A format, and many Governments now mandate use of this format for their own filing. It’s still not very well-known, but support is growing, and I have started to gently encourage the use of PDF/A over ‘normal’ PDF wherever possible.
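As a rough way to see which of your PDFs already claim to be PDF/A, the sketch below looks for the `pdfaid:part` marker that PDF/A files carry in their embedded XMP metadata. It assumes that metadata is stored uncompressed, which is normally the case for PDF/A. The function name `claimed_pdfa_part` is hypothetical, and this only detects what a file claims about itself. It does not validate conformance, which needs a proper PDF/A validator.

```python
import re
from pathlib import Path

# The marker may be written as an XML element (<pdfaid:part>1</pdfaid:part>)
# or as an attribute (pdfaid:part="1"); this pattern accepts both forms.
PDFA_PART = re.compile(rb"pdfaid:part(?:>|\s*=\s*[\"'])\s*(\d)")


def claimed_pdfa_part(path):
    """Return the PDF/A part number the file claims (e.g. 1), or None."""
    data = Path(path).read_bytes()
    match = PDFA_PART.search(data)
    return int(match.group(1)) if match else None
```

A result of `None` means no claim was found, so the file is best treated as ‘normal’ PDF until checked properly.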

Accountants only have to worry about keeping stuff for a few decades (unless you’re Hacker Young!) but records in the Nuclear industry have a statutory 150-year life, and things like military records are retained indefinitely for historical reasons.

Just as a side point – keeping stuff on paper is NOT a panacea – ask anybody hoping to access the UK 1931 census records, or patrons of Iron Mountain’s Bromley document store in 2006. Paper cannot be easily backed up – even an outdated backup beats a smoking ruin.

What to take from this?

There is a truism that a backup is not a backup until it’s been successfully restored. This is true not just of the physical media, but also of the format in which the data is stored. Guarding your data is fine, but equal attention must be given to preserving the means to read that data.
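The ‘physical media’ half of that test can be automated. The sketch below is one possible approach, not a complete backup tool: it compares SHA-256 checksums of every file in an original folder against a restored copy and lists anything missing or different. The function names are my own, and it assumes both folders are reachable from the same machine. It says nothing about whether you can still open the files, which is the format half of the problem.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(original_dir, restored_dir):
    """Return relative paths that are missing or differ in the restored copy."""
    original_dir, restored_dir = Path(original_dir), Path(restored_dir)
    problems = []
    for source in sorted(original_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(original_dir)
        restored = restored_dir / relative
        if not restored.is_file() or sha256_of(source) != sha256_of(restored):
            problems.append(relative.as_posix())
    return problems
```

An empty list from `verify_restore` means every original file came back byte-for-byte identical.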

Make Best Use of What You Have

Times are tough – and IT budgets are under pressure. Before spending money on new kit – it’s worth making sure you are getting the best out of your current systems, and there are lots of (free) ways that your current systems can be optimised.

Here are a few…

**
If you have the time (2-3 hours) and the confidence – one of the best ways to revive a struggling PC is to wipe it and re-install Windows. An old machine will have had endless installs and re-installs over its life – all of which leaves all sorts of old files, registry entries, etc on the hard disk. No amount of manual ‘spring cleaning’ ever gets rid of it all. A clean rebuild can do wonders for performance and reliability.
BUT
– Make sure you’ve backed up any and all data files, shortcuts, favourites, etc (in an office environment, of course, all important data should be on the servers NOT on desktop PCs – so there shouldn’t be much data to back up, right?) 😉
– Make sure you have all the right software CDs and licence keys for the re-install.

**
Try to standardise the PC ‘builds’ across the office – lots of different PCs with different software versions makes support more difficult (and therefore more expensive). If you can’t justify Office 2007 for everyone – then stick to Office 2003 for everyone – not a mixture. But, make sure you have rolled out the latest Office service pack (Service Pack 3 in the case of Office 2003) and installed the 2007 Compatibility module that lets you read Office 2007 files.

Office 2003 SP3
http://www.microsoft.com/Downloads/details.aspx?familyid=E25B7049-3E13-4…

Office 2003 compatibility pack for 2007 files
http://www.microsoft.com/downloads/details.aspx?FamilyId=941b3470-3ae9-4…

**
Uninstall browser add-ons like Yahoo toolbar – you don’t need them and they can slow down your PC.

**
If you have Windows XP – upgrade to Service Pack 3. It installs EVERY security and reliability patch Microsoft have released to date – in one handy package. It has also delivered performance benefits in some cases.

XP Service Pack 3
http://www.microsoft.com/downloads/details.aspx?FamilyId=5B33B5A8-5E76-4…

**
Upgrade to Internet Explorer 7 (at least) – IE7 is significantly faster and more secure than IE6, and it’s been around long enough for any major issues to be addressed. Microsoft are about to terminate support for IE6 as well – another reason to move on. If you’re up for it – go to IE8, which is Microsoft’s best browser yet – no question.

http://www.microsoft.com/downloads/details.aspx?displaylang=en&FamilyID=…

**
Upgrade to Acrobat Reader 9 – It’s the best version of Reader so far. After many years where successive versions of Reader got ever slower and more bloated, Adobe have raised their game. Reader 9 loads faster, is more secure, and is easier to use.

http://www.adobe.com/products/reader/

**
Rationalise your security approach – Each PC should have:
1. Anti-Virus
2. Anti-Spyware
3. Firewall
But not more than one of each! Some security products do all three, but if you have one of these, then clear out any other ones that may have accumulated over the years (Microsoft Defender, AdAware, etc.). Note that updating from IE6 will also help protect your PCs from the dodgier corners of the Internet.

**
Update your hardware drivers – Poorly written drivers are the single biggest reason for PC crashes (which lead to loss of work and time). Manufacturers DO review and fix problems in their drivers, and it’s worth checking the ‘support’ or ‘downloads’ sections of their web-sites for these. Also, use the ‘Windows Update’ system to check for driver updates from Microsoft.

**
Once you’ve got past this – you do have to think about spending a bit of money….

The single best hardware upgrade you can do to improve system performance is more memory (RAM) – 1GB should be a minimum with XP (2GB with Vista and Win7) – and that shouldn’t be expensive. BUT if you intend to replace PCs within the next 6 months – don’t bother, you won’t get the payback. All servers should have as much as possible (4GB on 32-bit Windows). For new PCs I’d want 4GB RAM straight away.
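For anyone wondering where the 4GB ceiling on 32-bit Windows comes from, it is simple arithmetic: a 32-bit address can point at 2 to the power 32 different bytes. The tiny sketch below (the function name is mine, purely for illustration) does the sum. In practice the usable figure is often somewhat lower, because part of that address space is reserved for hardware.

```python
def addressable_gib(address_bits):
    """Memory reachable with addresses of the given width, in GiB."""
    return 2 ** address_bits / 2 ** 30  # 2**30 bytes = 1 GiB


print(addressable_gib(32))  # prints 4.0
```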