Performance Comparison

One of the things I like to do when a new version of a database engine is released is to compare the performance of different query techniques. Doing this comparison results in information that helps in understanding the optimizations that have been introduced with each release. I find it helpful to dispel assumptions of how certain syntax typically performs under certain circumstances. It turns out that the typical performance is not the same from one version to the next.

For example, in older versions of SQL Server I would prefer using the EXISTS operator over IN when testing for a record in TABLEA that has one or more matching records in TABLEB. If you run a query using these two different syntaxes in SQL Server 2008 and SQL Server 2014 the results are different. In SQL Server 2014, the query plan using either syntax is identical. This is not the case in 2008.

Again, using SQL Server as an example, I would compare the use of Memory Variables, Hash Temp Tables, and CTEs for building intermediate sets in a stored procedure. All of these three temporary memory objects have been available since SQL Server 2005. However, the performance characteristics have not been the same. In earlier versions the Hash Temp table method often out performed CTE or Table Variables for performance. In fact, it was true even for memory variables with unique constraints meeting the requirements of join criteria of filter criteria. Today, this is no longer the case.

Hierarchies are another kind of query that have shown optimization over time. Not all SQL engines support hierarchies using native SQL Code. Still, this is something that has demonstrated improved performance from one release over the next for those engines that do support hierarchical queries.

Queries using data structures with other than standard date and time values or spatial data types have continued to improve over time.

I’m thinking about creating a suite of standard tests one could use to measure performance of an engine against itself from one release to the next. The scenarios I have listed could be a starting point for a series of tests. What would you add? Why not leave your suggestions in our comments and see what comes up.



  • Raul Baron

    The idea is excellent

  • Eilenblogger

    Since query plans can change based on stastics as a result of more data (or less data), among other things, the suite of tests would need to be unique to a static set of data (yes?).

    I guess this could give you a baseline that might show improvements in a specific technique but I’m not sure that improvement would necessarily be realized in a real-world environment.

    And one persons real-world is vastly different from another persons real-world.

    So I guess what I’m saying is that the value of a standard suite of tests could trick you into thinking that you can change some queries using the newly discovered observation and realize some benefit.

    This may or may not be true because as you and others in database land have repeatedly said “it depends”.

    The information gained from such a suite of tests might be useful in the sense that you would have another piece of information to consider when writing, rewriting or tuning queries.

    Maybe I’m making an obvious point.

    • Ben

      You make really good points. I have been pondering that same thing as well…how to do a comparison that is as close to the same circumstances as possible.

      What would you think about having a baseline database in the oldest version of the engine that you want to include in your comparisons? Then you would upgrade the same database to incremental versions, and run the same queries to see how things perform.

      In fact, you wouldn’t necessarily have to run the suite of tests on all versions every time, unless you added new tests to the suite. If you are running the current set, you would have the historical results from previous engine versions. You would only need to upgrade the database to the new engine version under test, and add the results to the statistics.

      • Eilenblogger

        It seems salient.

        I would be interested in knowing what you learn by going through the process.

        Maybe you can write a future post of lessons learned.

  • Greg Hamilton

    Comparisons of join techniques, e.g., INNER/OUTER JOIN vs OUTER APPLY correlated sub-queries. Type coercion effects on query plans, eg, WHERE 1234 = ‘1234’. Bad query plan creation, e.g. use of explicit join hints vs letting the query optimizer try to figure it out (you’d need to mock bad query plan creation).