Use SQL 2K Functions To Optimize Your Queries - Part 2
by Ben Taylor
Introduction
What do you do when you need to run a query that returns multiple rows and for each row you need to run an additional query returning a single value? For example, you may need to compute the Median cost per item for each order in a purchase order system. SQL Server does not provide a Median aggregation function. It has SUM, COUNT, Min, MAX, STDEV and a few others. But you require the Median value for a series of rows represented by an SQL Query. Here come SQL 2K User Defined Functions to the rescue. This is the second article in a three part series demonstrating the use of User Defined Functions to perform a correlated query returning a single value that can be encapsulated (correlcated) into another query.
The first article demonstrated a method of returning multiple values for each row in a query as a single comma separated list value. This article demonstrates an User Defined Function that extends SQL Servers aggregate functions providing a median aggregation.The third article (still to come) demonstrates a method for computing percentage of investment return over a period of time specified at execution time.
Use UDF Median Function
SQL Server does not provide a Median Aggregate Function. If you don’t do a lot of statistical work you may not remember that the Median is the middle record in a set of records. The value 6 is the median of the following list:
1
4
6
9
23
The median of a list with an even number records is the record just before the middle. The value 6 is the median of the following list:
1
6
9
23
Finding a specific row from a set of records is one of the most challenging kinds of queries that can be performed in SQL Server. This process becomes even more complicated when you need to obtain the Median for multiple records. What if you wanted to return the Median Item_Amt of the Order_Items table for each record in the Orders table (See figure 1)? This can be performed with some sophisticated queries. However, an UDF makes this process much more painless, the queries are easier to understand and the query still performs quite well.

Figure 1
Create the Function
Here is the syntax for an UDF that computes the Median.
CREATE FUNCTION fnMedianOrderDesc (
@Order_ID INT)
RETURNS MONEY
AS
BEGIN
DECLARE @MEDIAN MONEY
DECLARE @RECCOUNT SMALLINT
DECLARE @ITEM_NO SMALLINT
SELECT @RECCOUNT = COUNT(*)
FROM Order_Items
WHERE Order_ID = @Order_ID
IF @RECCOUNT = 0
BEGIN
SET @MEDIAN = NULL
END
ELSE
BEGIN
IF @RECCOUNT ^ 2 = 0
BEGIN
SET @ITEM_NO = @RECCOUNT / 2
END
ELSE
BEGIN
SET @ITEM_NO = (@RECCOUNT / 2) + 1
END
SELECT @MEDIAN = Item_Amt
FROM Order_Items
WHERE Order_ID = @Order_ID
AND ITEM_NO = @Item_NO
END
RETURN @MEDIAN
END
How It Works
First the UDF is defined. A parameter is passed to the UDF to identify the Order_ID that will be used to identify the Order_Items records to be included in the determination of the Median.
CREATE FUNCTION fnMedianOrderDesc (
@Order_ID INT)
The function is further defined to return a Money data type.
RETURNS MONEY
AS
BEGIN
Next the UDF declares working variables needed to find the median within the records meeting the criteria. We are going to have to return the actual median value so that is tracked in the variable @Median. We need to find the number of records that match our criteria so we can determine what is the middle (MEDIAN) record. @RECCOUNT stores this value. We need to know what record to actually retrieve. This value is stored in @ITEM_NO.
DECLARE @MEDIAN MONEY
DECLARE @RECCOUNT SMALLINT
DECLARE @ITEM_NO SMALLINT
At this point the UDF retrieves the number of records from Order_Items that match the Order_ID sent to the function in the variable @Order_ID
SELECT
@RECCOUNT = COUNT(*)
FROM Order_Items
WHERE Order_ID = @Order_ID
If no records are in the database for @Order_ID then we want to return a MEDIAN value of NULL. So that we have only one point of return we always store the value to return in the variable @MEDIAN. In this case we store NULL.
IF @RECCOUNT = 0
BEGIN
SET @MEDIAN = NULL
END
When there are records in the database for @Order_ID we process the result differently. There are two different formulas to determine the MEDIAN record. One formula is used if the number of records is Even.
ELSE
BEGIN
A different formula is used if the number of records is Odd. A test of the MODULO of the record count with 2 determines if the number of records is even or odd. If the remainder of the variable @RECCOUNT divided by 2 is zero then there is an even number of records. Otherwise there is an odd number of records.
ELSE
BEGIN
IF @RECCOUNT ^ 2 = 0
BEGIN …
ELSE …
If there is an even number of records then we want to get the value from the Order_Item record that is the same as @RECCOUNT / 2.
BEGIN
SET @ITEM_NO = @RECCOUNT / 2
END
Otherwise we want to get the value from the Order_Item record that is the same as (@RECCOUNT / 2) + 1. This works because the datatype for @RECCOUNT is an int. Therefore the result of @RECCOUNT/2 will be rounded down for the fraction. We can then add 1 to point to the exact middle record of an ODD numbered recordset.
ELSE
BEGIN
SET @ITEM_NO = (@RECCOUNT / 2) + 1
END
Now that we know what record contains the median value we can retrieve the Item_Amt value for that record and store it in @MEDIAN as the value to return for the function.
SELECT @MEDIAN = Item_Amt
FROM Order_Items
WHERE Order_ID = @Order_ID
AND Item_NO = @Item_NO
END
The function has computed @MEDIAN based on the criteria of No Records, Even Records or Odd Records. It now completes passing the resulting formula as a Money value through the Variable @MEDIAN.
RETURN @MEDIAN
END
Use the Function
Now that we have the function written we can include it in a query. Consider the following query:
SELECT
Order_ID,
Order_Date,
Median_Amount = dbo.fnMedianOrderDesc (Order_ID)
FROM Orders
WHERE Orders.Order_Date > ’4/2/2004’
This query returns a record for every Order in the Orders Table that has a date > ‘4/2/2004 00:00:00’. It has three columns for each row showing the Order_ID, Order_Date and the MEDIAN Item_Amount for each Order_ID.
About The Author