SAN Solutions and SQL Server – Real-World Feedback

Featured Article(s)
Exporting your SQL Server 2005 Backup to an FTP location
Now that we have access to tools like SQL Server 2005/8, you might ask "Why would I need such rudimentary form of backup as FTP?". However, in many scenarios, mostly small scale environments where a company would like to take advantage of their FTP server and backup their site, intranet, or blog – this still makes sense.

Grid-Like Abilities for SQL Server?
Many of our readers are familiar with Oracle’s Real Application Clusters – "Oracle RAC." RAC uses grid-like capability to share application load across multiple database servers tied to a shared storage system. You may wonder if there’s a similar technology for SQL Server. Now there is. GRIDSCALE from xkoto lets you scale SQL Server horizontally across multiple, independent copies of SQL Server. It load-balances reads for scalability and propagates updates for consistency. If you’re looking for scale-out and continuous availability for SQL Server, GRIDSCALE from xkoto is worth a look.

SQL Server Show
SelectViews: Today on the show we have three interviews – Kevin Kline, Scott Golightly and Anthony D’Angelo. We talked about all sorts of things, from Business Intelligence tricks to upcoming tools.
[Watch the Show]

SAN-Oriented Experiences
wrote in with an extensive bit of food for thought – he gets our exclusive look today at what he’s learned, what it means for their installations and more.

"In 2006/7, we performed a huge migration project from a Windows NT4 Domain, to a Windows 2003 AD and migrated our core SQL platforms from SQL 7 to SQL 2000. Yes, I know, not leading edge in MS technology but we are a small organisation without the capacity to perform large scale R&D projects, so we tend to wait until a product is at least 1 year old with service packs (stable as it is likely to be) before we consider a jump.

As part of this work, most physical servers were replaced. On our site which performs the most data processing, we decided to implement a SAN for centralised storage. The decision-making process was fairly simple and sound, or so I thought. I guess like most organisations, capacity management for the storage of data has become very difficult as data growth, with email systems being used as file transfer and storage, and file sizes, multimedia and data in general just growing exponentially. I wanted a solution that did not mean replacing disks in servers, and all of the work associated with it, one or two years down the line. I was looking for a 5 year plan.

The SAN is not a cheap option to say the least, especially where disk costs keep coming down. My ROI is measured over a longer term. I wanted a solution whereby we could keep adding to the storage capacity and growing it, without un-necessary infrastructure, especially with hosting, power and cooling costs going up.

We generally are a Microsoft house with HP Proliant based server technology. In my view, reasonable quality for a reasonable price. We chose to implement an HP EVA6000 Storage Solution (Storage Works Device). This is attached to all core servers in the sites data centre via fibre through fibre switches, then fibre channel adapters. We have about 7 servers provided by our SAN, with SAN attached tape library.

Our servers have always been built with the maximum redundancies built in. In other words, multiple power supplies, dual network cards, RAID with hot spares, and hot swap for most components. On top of this, they are backed up with tight SLA support contracts with comprehensive DR contracts in place. These are great, but things break. Yes, you can shout at a supplier to get things fixed but you will suffer downtime. The redesign of our data centre looked at these problems and we tried to build extra resilience into the design. This was also a factor in the choice of our SAN.

We ended up with a core production database server which was a 4 dual processor box. In this, there are 5 local disks arranged as 2 x RAID 1 mirrors with a hot spare covering both mirror sets. These two mirrors provide for a) Operating system and Application Software (SQL) and b) System Swap Space). All data volumes are provided for by the SAN. To back this up, we have a second identical server in spec and configuration that acts as our test, development and reporting database server. With this configuration, we can be self sufficient on dealing with a critical server failure without the reliance on our support partners (although the same provision is available). So, in the event of a critical single point failure with our production server, we can shut down both servers, move the production local disks to the test / development server, reassign the SAN volumes, and bring up the production services on the backup server within minutes, leaving the production server to be repaired without further impact on production services.

The EVA6000 itself, boasts about no single points of failure (allegedly). All disks are dual interface, all controllers are duplicated, all fibre switches are dual pathed, as are the fibre channel adapters in the servers. To date, we have only suffered single disk failures. (Hope I have not spoke too soon).

The performance provided to SQL far out performs local disk in my view. Even with a RAID 0+1 (10) configuration in a server, IO performance is all about the number of spindles. I am not going to go into the theory of SAN storage as this is still rocket science to me, and really needs a storage expert to give you all the answers. But, volume configuration, RAID section etc on a SAN is all virtual. The storage server sorts out how the data is organised across your disk real-estate to gain the best resilience and performance. It’s a bit like a fish tank. If your fish tank is full of water, you buy a bigger fish tank. When you transfer the contents from one to another, the level goes down as the waters is spread over a larger area. This is no different in a SAN, the more disks that are put in, the data is spread across all disks. As this is the case, the performance actually increases as you add further disks. A disk head can only read one sector on one platter at any one time. As your data is spread out across many disks and platters, the chances of two data requests needing the same head reduce and therefore data is retrieved / stored more quickly.

The second performance advantage is in the gained trust. With on-board server disk, I have never trusted RAID controllers even with battery backed cache. I have always disabled write back cache on RAID controllers servicing SQL databases. As SQL uses a two phase commit, I was never convinced that a RAID controller would correctly complete its write cache to disk on power restoration. With it turned off, if the two phases are not completed, the transaction is rolled back and therefore, no data corruption but maybe a transaction loss. With the SAN, and its design ruling out all single point failure (alleged), the cache does not need to be disabled but in reality, is far better and more efficient that server on-board disk caching.

So, two and a half years in, and all is well. No disasters or issues at all, from an uptime perspective, and a SAN that still doesn’t even break a sweat yet. To be fair, we do not have the need for clustering or larger scale processing however, the active data in some databases exceed 30 million records and performance is still instant.

There is a downside though. Microsoft updates! A SAN, including all of the attached servers are put through a configuration tool to ensure SAN compatibility. Every time a new service pack comes along for Windows or an MS Server Application, there is a huge overhead in updating all firmware and driver aspects of the SAN, from the physical disks, controllers, SAN switches and Fibre Channel Adapters etc. This is a huge piece of work that is very complicated and if not performed in a calculated and informed way, can cause major problems, if not data loss. This needs storage experts and nerves of steel, especially from the IT manager."

Featured White Paper(s)
Best Practices in Index Maintenance – Fighting the Silent Performance Killers
Index fragmentation significantly degrades database performance. Unfortunately, neither the SQL Server native tools or in-hou… (read more)