Editorials

Database Disk Usage

I appreciate the comments about addressing performance issues that may stem from systems outside your control as a DBA. Some suggestions rely on tools to identify bottlenecks. Others take advantage of metrics you can already gather from SQL Server and the performance monitors built into Windows. Sometimes the issue is not with SQL Server, or Windows is not the host operating system…so other techniques become important.

Today I’d like to talk a little bit about how SQL Server specifically uses disk. Many other engines work in a similar fashion, so the concepts may be transferable. The key thing to understand about a database is the way it utilizes disk resources.

The simplest case to consider is the transaction log file(s). In general terms, the transaction log is written sequentially. The primary work is appending to the tail of the file as data is modified in tables. Little work happens at the head of the file unless a transaction log backup is being executed. This behaves much more like a typical file system workload, and disk storage used for basic file access can coexist nicely with transaction log files.
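As a rough illustration, you can see how much write activity your log files generate using SQL Server's I/O DMVs. This is just a sketch (it requires VIEW SERVER STATE permission, and the counters are cumulative since instance startup):

```sql
-- Write activity for transaction log files across all databases.
-- Counters accumulate from instance startup, so sample twice and diff
-- if you want a rate rather than a lifetime total.
SELECT  DB_NAME(mf.database_id)  AS database_name,
        mf.name                  AS logical_file_name,
        vfs.num_of_writes,
        vfs.num_of_bytes_written,
        vfs.io_stall_write_ms
FROM    sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN    sys.master_files AS mf
        ON  mf.database_id = vfs.database_id
        AND mf.file_id     = vfs.file_id
WHERE   mf.type_desc = 'LOG'
ORDER BY vfs.num_of_bytes_written DESC;
```

A log file with heavy, low-stall write traffic is behaving as expected; high write stalls on the log are a sign the underlying storage is not keeping up with the sequential stream.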

In contrast, data files are accessed in a random fashion. If you are simply adding records, yes, data may be written at the tail of the table, but a single file supports multiple tables. In fact, a single extent may even contain pages from multiple tables. This means that when your data is written to or read from disk, reaching all of the data from the different tables requires a lot of head movement to land on the correct locations. This random access, with its constant head movement on the disk, does not cooperate well with other disk consumers that want to read documents or files sequentially.

When it comes to a SAN, the desire is to share resources as much as possible to get the best value out of the SAN. So unless you have a massive SAN with lots of cache, or a SAN where your random-access load is on SSD, the different kinds of utilization can conflict. If your database data files are stored on a LUN carved out of a set of disks that also hosts LUNs supporting file access, there can be degradation. If your SAN administrator is not aware of these conflicts, that may be something you will need to pass on in a professional manner.

Before you get too concerned, be aware that most SANs have methods for performance monitoring as well. In fact, they can probably provide some very detailed statistics on the throughput specific to your system. If your IOPS are up to par, you need to look elsewhere. I once supported a database hosted on a massive SAN consisting of a RAID configuration of 500 individual disks and a huge amount of cache. Even though my system was only a small consumer of the entire SAN, the performance was stellar, primarily due to the cache on the SAN. SAN metrics showed that the majority of my database access was served from the cache. At the end of the day, the IOPS were higher than those I had experienced on a smaller dedicated SAN.
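You can also cross-check the SAN's numbers from the SQL Server side. As a sketch, this query reports average read and write latency per file as SQL Server experienced it (again cumulative since startup, so treat it as a long-run average, not a point-in-time reading):

```sql
-- Average I/O latency per database file, as seen by SQL Server.
-- NULLIF guards against divide-by-zero on files with no activity yet.
SELECT  DB_NAME(vfs.database_id)                             AS database_name,
        mf.physical_name,
        vfs.io_stall_read_ms  / NULLIF(vfs.num_of_reads,  0) AS avg_read_latency_ms,
        vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_latency_ms
FROM    sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN    sys.master_files AS mf
        ON  mf.database_id = vfs.database_id
        AND mf.file_id     = vfs.file_id
ORDER BY avg_read_latency_ms DESC;
```

If these per-file latencies look healthy while the application is slow, the bottleneck is probably not the storage path, and the SAN team's metrics should agree.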

Cheers,

Ben