Disaster Recovery

A friend asked me the other day if I had ever had a disaster recovery situation in my career.
Well, the answer is not really, but I have come close on a couple of occasions.

On the 12th of April 2002 (I remember clearly it was a Friday afternoon), the Distillex factory in North Shields, about 500 metres from where I worked and about the same distance from the server room, caught fire.

The site had been on fire previously, so we really did not think it was that bad. Well, it was that bad, and when the police declared a major incident we had to leave within minutes, especially once the gas tanks started flying.


It was only when I was driving home, with the specialist chemical firefighters from Middlesbrough on their way up the A19, that I thought, "oh no, the ... server room". With a slightly different wind direction and different gas tanks, I might have been making a lot of phone calls.

A few years later the Buncefield oil terminal went up near the M1 (right next to a massive business park, great idea), and there must have been a few disaster recovery plans put into action that day. I am sure it still holds the record for Britain's costliest industrial accident, at a billion pounds. So if you think it won't happen to you, think again.

Fast forward ten years and I got a phone call on a Sunday morning: nothing was working. So I logged in and, sure enough, nothing was up except the intranet and the internet site, which was all a bit strange. At this point my spider sense started tingling, and I asked the person who made the call if he could get hold of the lad on security to walk down and check the server room. Ten minutes later: "can you get yourself into work?"

I managed to get into work thirty minutes later (fortunately it was a Sunday), and I can only describe the server room as like walking into a greenhouse, with every server beeping away. Despite having the latest server room technology, ultra-sensitive smoke detectors and massive halon tanks for fire suppression, nobody had thought about heat monitoring. The air conditioning units had failed (long story) and the only server that had stayed up was the one serving the internet / intranet sites. All I can say is we got lucky that day.

So if you have fire suppression in your server room, check you have heat monitoring too, or you might need to smile politely when, months later, your baked disks all start to fail and the engineers are scratching their heads, wondering why they are seeing so many failures.
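The check itself does not need to be complicated. Here is a minimal sketch in Python of the sort of thing I mean, assuming you already have some way of getting temperature readings out of the room. The `Reading` type, the sensor names and the two thresholds are all made up for illustration, not recommendations from any vendor or standard, so pick values that suit your own kit.

```python
from dataclasses import dataclass

# Illustrative thresholds in degrees Celsius. These are assumptions,
# not vendor or standards guidance; tune them for your own room.
WARN_C = 27.0
CRITICAL_C = 35.0


@dataclass
class Reading:
    """One temperature sample from a (hypothetical) named sensor."""
    sensor: str
    celsius: float


def classify(reading, warn=WARN_C, critical=CRITICAL_C):
    """Return 'ok', 'warning' or 'critical' for a single reading."""
    if reading.celsius >= critical:
        return "critical"
    if reading.celsius >= warn:
        return "warning"
    return "ok"


def alerts(readings):
    """Return (sensor, level) pairs for every reading that is not ok."""
    results = []
    for reading in readings:
        level = classify(reading)
        if level != "ok":
            results.append((reading.sensor, level))
    return results


if __name__ == "__main__":
    # Made-up sample data: one healthy rack, one warm, one baking.
    sample = [
        Reading("rack-a-inlet", 22.5),
        Reading("rack-b-inlet", 29.0),
        Reading("rack-c-inlet", 41.0),
    ]
    for sensor, level in alerts(sample):
        print(f"{sensor}: {level}")
```

The important part is what happens with the output: it needs to reach a person, by text or phone, on a path that does not depend on the servers in the room that is overheating.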


