Don’t Assume It’s the Database

What I’m going to write today is pretty obvious. However, I can’t believe how often I get it wrong. When applications begin to have performance issues…what is the first thing you suspect? In many cases the immediate suspect is the database. What’s interesting to me is that the last few times I have been working with systems supported by SQL Server, the problem has not been the database engine.
There used to be a bit of a battle in companies where they have a dedicated DBA, SAN managers, and application managers. When things are not running smoothly, or customers are wanting to get more performance, each tends to look to the performance of another layer other than the one they are responsible for.
It’s an easy out to blame another subsystem. What’s harder is to track the real bottleneck(s). If you application is not well instrumented, tracking down where things are slow can be difficult. You can look for problems at a high level. I once experienced an application that had many features working quite nicely. However, on feature was pretty much not working at all. The immediate thought was that there were blocking or deadlock issues in the database.
With a little sleuthing, our network administrator discovered that the web server CPU utilization was near 100 percent. It turned out that a security application had gone haywire and was consuming most of the server capacity. For some reason, this directly impacted one feature of our application. Restarting the security daemon fixed the whole problem. There were database problems, but they were downstream of the root cause.
In short, when things go wrong…don’t just blame the database.