Wednesday, June 15, 2005

Playing with Fire

What would you do if your house burned down? Build another one, but that takes a while, so next time it catches fire you call the fire brigade. They get there in time to save some stuff, but you still have to rebuild it.

Since fires seem to happen rather often (bear with me on this, I'll get to my point eventually), you hire your own fireman. That's better: you are putting out fires early enough now that the house remains habitable more often.

But the fireman is only there during working hours, so you hire a team of firemen to give you 24x7 coverage, and tidy up the burned patches that appear every few days.

However, your spouse is not happy with the mess, the disruption and the cost of a team of firemen. A nice salesman comes by and sells you a set of smoke detectors, alarm bells and water sprinklers, and you hire a cleanup crew with mops.

It's still messy and inconvenient, and everything is slightly soggy, so you ask around. You find that some people are having fires much less often than you do, and they think it's because they had their house checked over by a building standards inspector.

The inspector looks around your house and predicts the next few things that would start a fire, so you can fix them in advance. The inspector also advises you on how to build a new house that wouldn't catch fire so easily in the first place.

So my point is that if you are running a datacenter, things will go wrong. From a performance and capacity perspective, rapid growth rates, fast-changing applications and sudden changes in user activity levels can all put you into a fire fighting situation.
You can develop a fire fighting mentality, or a fire prevention mentality. One problem is that fire fighters are heroes, but when did you last hear the story of a heroic building inspector saving thousands of lives? Few people are fans of building inspectors, but they are mandated by local government to keep people safe.

The recent upsurge of interest in the Information Technology Infrastructure Library (ITIL) seems to be a response to the auditing requirements of Sarbanes-Oxley (SOX). ITIL specifies lots of best practices for capacity planning, amongst other things, as my co-author Bill Walker described in our Sun Blueprint book on capacity planning. So it may be SOX that gets you to look at ITIL, and that helps you justify your building inspector status as the hero who helped pass a SOX audit. OK, it's a stretch, but perhaps it helps get beyond the fire fighting mentality towards predictive capacity modelling.

So how well are you doing? One way to find out is to rate yourself on a maturity model, and there is a good paper on this in the current issue (3.06) of the CMG newsletter Measure-IT.