BLOB Data in NoSQL

SSWUGtv – Windows Server AppFabric
with Stephen Wynkoop

In today’s edition of SSWUGtv Steve interviews Rick Garibay on his new book: Microsoft Windows Server AppFabric Cookbook. I’ve been using Application Fabrics (Not Windows) for a few years now…can’t wait to see how Windows compares.

Watch the Show

BLOB Data in NoSQL
Today I wanted to explore a hybrid solution of relational database and NoSQL. In this brainstorm, the data of an application is stored in the relational database. The BLOB data is stored in an NoSQL data storage technology. This is similar to that of using the file system or FileStream from SQL Server but has the following advantages.

  1. Many NoSQL solutions have redundancy built in without overhead of management and sometimes not even configuration
  2. Many NoSQL solutions allow you to enable redundancy in co-located facilities making Disaster Recover less problematic
  3. NoSQL data is generally sharded data and distributed across multiple hosts enabling
  • Scalability (just add servers)
  • Reliability (self healing, redundant storage)
  • Performance (just add servers)
  • Maintainability (adding servers is simple or automated)
  • Reduced cost (can run on open source OS using commodity hardware or in Cloud)

Applications are written using a two phase commit technique. First save the BLOB using a key. Then save the meta-data in your relational database, including the utilized key.

Using a strategy pattern you could implement such a design, and later on swap it out for a file system, or other form of storage with little work and impact on your application. Simply swap out a new implementation of your BLOB persistence interface.

Let me close out this topic with some responses from readers.

I have a mapping application that uses open street map. I have been experimenting , successfully ,with using open ssl and encoding/decoding images stored in a simple csv file.

From some hard earned experience, we had to move all our blobs to the filesystem and store our docs on a networked storage server. We support both MySQL and MSSQL ( 2000 /2005 ) and notice severe performance degradation after the db grew to about 12GB in size.

For us, storing images in the db also made backups more challenging because of the additional space used.

I’d like to mention that in most RDBMs, blobs are stored onto the file system internally with an pointer in the db itself. I’m not sure how MSSQL 2012 handles this as we haven’t moved or tested any new systems with this as yet.

Removing those image blobs and saving them in the FS with just a link in the db reduced the Db sizes by about 80%, increased performance and reduced customer complaints. It does make our back up processes a little more complex since the network share also needs back up, but the trade off is well worth it.

Fred Shares In his Blog some efficient uses of separate SQL File Objects and the impact of restoring databases by FileGroup. While the blog is directed to disaster recovery, that is really the main crux of putting BLOB data in a database in the first place, performance and recovery. So, I have included that link here as well, as a tangent to the topic at hand.

Tomorrow we move on to some new material. Thank you all who have participated with great comments and resources by writing to btaylor@sswug.org.




Featured Article(s)
Troubleshooting problems with SQL Server 2008 R2 Management Studio
In this article, Alexander Chigrik explains some problems that you can have with SQL Server 2008 R2 Management Studio.

Featured White Paper(s)
Go Beyond SCOM Monitoring with Foglight for SQL Server
read more)