Editorials

Unicode Compression in SQL Server 2008R2

Virtual Training – SSIS, and Replication
February 1 – 28, 2011

Here is an opportunity for you to add or extend your SQL Server skills in the areas of SQL Server Integration Services (SSIS) and/or SQL Replication. As Stephen wrote a while back, there are so many skills a SQL Server professional needs to have today, its difficult to keep up.

Eric Johnson, SQL Server MVP, has put together 12 virtual classes on SSIS and Replication. For those who register, the classes will be available for the whole month of February. An interactive DVD is also available. This allows you to study at your own pace, and fill those gaps in your tool set.

Your going to have to register.
Register today for SSIS
Register today for Replication

SharePoint Administration Expo
Friday, February 11, 2011
9 a.m. – 1 p.m. PST

This expo is all about providing the foundation for providing outstanding service with SharePoint. We’ll be going into information on access control, uses of different capabilities of SharePoint and working with SharePoint projects.

Our expert panel of presenters will be investigating access controls, focusing on points you’ll want to know about as you set up your systems and the security surrounding them. We’ll also have information about assuring a successful project to help you determine which aspects are important to consider.

There will even be information about implementation strategies and tips you can put to use immediately.

With registration, all attendees will also receive a complimentary month of full membership to SSWUG.org, where they can learn even more about SharePoint and other databases and database technologies through in-depth articles, podcasts, how-to videos and more. All the content will also be able for seven calendar days after Feb. 11, allowing attendees to revisit key portions of information at a convenience.

Register today to save your place for these great sessions!

Featured Article(s)
Using SQL Server Profiler (Part 1 of 3)
This will be a demo filled session with just a few slides to explain the basics. We will go over all facets of SQL Profiler used in daily production support and development. The first section with cover many of the common tasks. The second section with cover the different event classes to help you understand when to use them and how to better filter. The final section will cover more advanced techniques and external tools like starting a trace from TSQL, SQLDiag, and SQL Nexus among others.

Featured White Paper(s)
Essential Performance Tools for SQL Server DBAS
Optimizing SQL Server performance can be a daunting task. Especially so for an increasing number of reluctant DBAs faced with… (read more)

Unicode Compression in SQL Server 2008R2
One of the features released with SQL Server 2008R2 is the ability to compress UNICODE data. This is huge. UNICODE is implemented in your database by using NCHAR, NVARCHAR, or NTEXT datatypes.

Note: NCHAR(MAX), NVARCHAR(MAX) and NTEXT are not compressed data types. The data must be able to fit into a single page (8k) in your table in order to be compressed natively through SQL Server. See this Microsoft SQL Server Advisory Team Blog for more details on how compression is done, and what data types natively participate in compression.

I designed a database schema for presenting telephone bills a few years ago. When predicting disk usage the original estimate was based on UNICODE characters in order to allow the database to be used internationally. We did not want to back ourselves into a corder, or have to radically change the application as it was sold to countries requiring bigger character sets.

After putting together the estimates, management quickly asked if there was any way we could reduce the amount of disk space required for our customer base. We tested using a reduced character set implementation of Latin based languages only (CHAR, VARCHAR, TEXT) and found that we reduced the estimated disk space by 40%. Clearly we had a log of character data in this schema.

With SQL Server 2008 R2, we could utilize the native compression built into the storage engine and had a single schema without restrictions. At that point the question would be how things would perform. Would the additional overhead of compressing/decompressing be worth the reduced disk space? Also, what is the impact when using string manipulation functions such as CHARINDEX, SUBSTRING, LEFT, RIGHT, UPPER, LOWER?

I have been told (take this simply as rumor and not authoritative) that on SQL Server 2000 they had string manipulation routines written only for UNICODE character types and that performing those functions on non-UNICODE data required casting of the data to it’s UNICODE counterpart, implementation of the string function, and then recasting back to the original data type. I have no idea if this is the case in current versions (or, frankly, if it was truly the case in SQL Server 2000).

If so, we clearly have a big win here with compression of UNICODE datatypes. It would be a lot less overhead to compress and decompress UNICODE data than to constantly be casting data types.

Another question would be how compressed data participates in an Index? If compressed in the table, would the data also be compressed in an index? I would think not. The index would not work correctly if it the data was compressed. There’s a little something to look into.

So, there are some thoughts for today. Have you used this compression feature? Send us your comments about your experience to btaylor@sswug.org.

Cheers,

Ben