Oracle's Enhanced Group By Clause Part 2
By John Adolph Palinski
Recent database versions enhanced Oracle's Group By clause. It allows the developer to perform a large variety of aggregations with a single pass over the target data. These enhancements increase the performance of deriving data warehouse facts. This is the second of two articles that explore these enhancements. In this article you will see how to add Grouping Set lists, Concatenated Groupings, and Composite Columns to the Group By clause. The first series article explored the Rollup and Cube list options along with the Grouping() function.
Historically, Oracle has had a Group By clause available in the Select statement. The Group By clause is used with Oracle Grouping functions (i.e. sum, avg, variance, min, max, count) and determines the granularity of an aggregation. Select statements that use the Group By clause create rows that have the same granularity. For example, a Select statement using the below Group By clause creates rows aggregated to the Sales_area, Gender, and State grain.
Group by sales_area, gender, state
With the advent of data warehouses and dimensional models, it has become necessary to create databases of aggregated or derivative facts. The derivative facts are contained in fact tables linked to dimension tables. Fact tables contain numeric values such as commissions or costs and foreign keys to the linked dimension table(s). The fact records are joined to dimension table records to produce business information for analysis. Examples of derivative facts are:
- Commissions by Sales_area
- Commissions by Sales_area, Gender
- Commissions by Sales_area, State
- Commissions by Sales_area, Gender, State
- Commissions by Gender
- Commissions by Gender, State
- Commissions by State
Using the historical Group By clause to create the various aggregations requires seven Select statements and seven separate passes through the source data. Reducing the number of source data passes and receiving the complimentary performance enhancement is the reason for the Group By clause enhancements.
Grouping sets
Grouping sets are a further enhancement of the Group By clause. This tool allows the developer to identify precise levels of aggregation. Like the Rollup and Cube lists, Grouping sets enable a single query to list multiple sets of columns which will be grouped for additional aggregate calculations. Unlike the Rollup list, which can be somewhat difficult to use in documenting specific aggregations and the Cube list which often creates more aggregations than needed, Grouping sets are an easier way to specify the precise aggregations.
Specify a Grouping set in the Group By clause using the keywords "Grouping Sets" followed by the column specification list. The following is a template:
Group by Grouping sets ((list1), (list2), (list3))
Listing 1 is an example of a Grouping set that performs four aggregations. These are:
- Sum of commissions by Sales_area, Gender, State
- Sum of commissions by Sales_area, Gender
- Sum of commissions by Gender
- Sum of commissions by State
Notice that each aggregation is documented. Compare this to Rollup and Cube list aggregations discussed in the first article and you will see that the Grouping Set enables the developer to easily define the precise number of aggregations and more importantly view the actual aggregations. The Rollup list contains a single listing that can produce a number of variations and the Cube list produces all possible combinations. You cannot identify the aggregation unless you have some knowledge of how to configure the Rollup list. A Grouping set allows you to view exactly what you will receive without the added worry of getting unintentional aggregations.
Listing 1 - Using the Grouping sets option
SQL> Select sales_area, gender, state, sum(commission)
2 From salesmen
3 Natural join salesman_commissions
4 Group by grouping sets
5 ((sales_area, gender, state),
6 (sales_area, gender),
7 (gender),
8* (state))
S G ST SUM(COMMISSION)
- - -- ---------------
S M AL 4000
W M AR 3200
...
W M WA 250
N F WI 4300
AL 4000
AR 3200
...
WA 250
WI 4300
N F 5050
W F 250
F 5300
E M 24000
N M 15450
S M 10750
W M 4800
M 55000
38 rows selected.
SQL>
Concatenated Grouping Sets
In Listing 1, a single Grouping set was used in the Group By clause. It is possible to have several Grouping sets in the Group By clause. Multiple Grouping sets are called Concatenated Group sets and yield the cross-product of grouping based upon the grouping sets. Concatenated Grouping sets are a concise way to generate large combinations of groupings.
Listing 2 is an example of Concatenated Grouping sets. This is the same basic Select statement as Listing 1, except that a new dimension, Year Hired, was added to the example Select statement. The Group By clause now has two Grouping sets, each with a list of two variables. The combination of Grouping sets produce the following aggregations:
- Sum of commissions by Sales_area, State
- Sum of commissions by Gender, State
- Sum of commissions by Sales_area, Year Hired
- Sum of commissions by Gender, Year Hired
Notice that a comma separates each Grouping set clause.
Listing 2 - Using Concatenated Grouping sets
SQL> Select sales_area, gender, state,
2 substr(hire_date, 8, 2) Yr, sum(commission)
3 From salesmen
4 Natural join salesman_commissions
5 Group by Grouping sets (sales_area, gender),
6* Grouping sets (state, substr(hire_date, 8, 2))
S G ST YR SUM(COMMISSION)
- - -- -- ---------------
E MA 18800
E NJ 1450
...
W OR 1100
W WA 250
F CA 250
...
M TX 6500
M WA 250
E 02 3250
E 90 19300
...
W 99 250
F 00 4300
F 89 250
...
M 96 1450
M 98 250
M 99 250
61 rows selected.
SQL>
Concatenation can also be performed on groupings specified with Rollup and Cube. The cross-product operation enables small numbers of concatenated groupings to generate a large number of aggregations.
Composite columns
The Group by clause also allows the use of composite columns to generate aggregations. A composite column is a list of columns treated as a single unit during the aggregation. Enclosing subsets of columns in a Rollup, Cube, or Grouping set causes levels to be omitted. The following is a typical Rollup list:
Rollup(sales_area, gender, state, substr(hire_date, 8,2))
Reviewing the list, you expect the following aggregations:
- Sales_area, Gender, State, Year_hired
- Sales_area, Gender, State
- Sales_area, Gender
- Sales_area
- Total result set
The below is a modification of the previous Rollup list. Gender and State are enclosed by parentheses. This creates a composite column of Gender, State.
Rollup(sales_area, (gender, state), substr(hire_date, 8, 2))
The aggregations are changed by the Rollup composite column. The change eliminates some aggregations. Compare the following list to the previous bulleted list and review Listing 3 to confirm. Notice that the Sales_area, Gender aggregation is omitted.
- Sales_area, Gender, State, Year_hired
- Sales_area, Gender, State
- Sales_area
- Total result set
Listing 3 - Using Composite columns in a Rollup list to reduce aggregations
SQL> Select sales_area, gender, state, substr(hire_date, 8, 2) yr, sum(commission)
2 From salesmen
3 Natural join salesman_commissions
4* Group by Rollup(sales_area, (gender, state), substr(hire_date, 8,2))
S G ST YR SUM(COMMISSION)
- - -- -- ---------------
E M MA 90 18800
E M MA 18800
E M NJ 96 1450
E M NJ 1450
E M NY 02 3250
E M NY 90 500
E M NY 3750
E 24000
...
W F CA 89 250
W F CA 250
W M AR 93 3200
W M AR 3200
W M CA 98 250
W M CA 250
W M OR 93 1100
W M OR 1100
W M WA 99 250
W M WA 250
W 5050
60300
40 rows selected.
SQL>
Between the Rollup and Cube lists and the Grouping function discussed in the first article and the Grouping Sets, Concatenated Grouping Sets, and Composite columns discussed in this article, you should have very effective tools for deriving business facts.
John Adolph Palinski is a former adjunct faculty member at the University of Nebraska at Omaha and Iowa Western Community College. He is the author of the "Oracle Database Construction Kit", Que, 1997, the "Oracle SQL and PL/SQL Handbook: A Guide for Data Administrators, Developers, and Business Analysts", Addison Wesley Longman, 2002, and "Oracle 9i Developer: Developing Web Applications With Forms Builder", Course Technology, 2003. He has also written numerous articles for periodicals such as Oracle Internals and Oracle Professional. He provides Oracle training and consulting services. One of his passions is to help customers set up logical business models to improve information efficiency and effectiveness. He may be contacted at www.oracle-trainer.com.