Improving performance using the With Clause
By John Adolph Palinski
The Oracle9i With clause is used to develop a result set before the main Select statement is executed. In-line views and subqueries in the main Select statement use this result set as a data source. The With clause may increase performance of Select statements by eliminating additional database calls.
The With clause became available with Oracle9i. The purpose of the clause is to improve the performance of complex queries. More formally, it is called the subquery factoring clause. It allows the developer to create a result set, cache the result seat, and then perform additional Select statements(s) against the cached result set. This technique allows the developer to perform secondary queries against a small result set stored in memory rather than larger original target tables.
Its possible to use the With clause result set as the data source for the Exists operator or correlated subqueries. Correlated subqueries are a Where clause condition that executes a Select statement for each row that is fetched by the master Select statement. If the subquery successfully locates a record, the Exists operator returns a Boolean True value to the master Select statement. This constant executing of subqueries can really hurt the performance of the Select statement, especially if the subquery is complex. Each time the correlated subquery is executed the Oracle database must be accessed. Repeated access slows performance. Oracle9i's With clause increases performance by caching the result set and allowing Oracle to perform the test against the cached result set rather than the Oracle database. Performance gain potential is especially likely if the With clause result set is small.
Listing 1 demonstrates how a small gain in performance is achieved using the With clause result set in a correlated subquery. The first Select statement counts the number of records from a Work_order table that does not have a related record for Planner X3221. A correlated subquery uses the Not Exists operator to test whether a related record exists in the Work_order_tasks table. The table is large, but the columns are indexed, so performance is not bad. This Select statement returned its result set in 5.04 seconds.
Listing 1 - Using a correlated subquery to count the number of work orders with tasks not created by planner X3221
SQL> select count(*)
2 from work_orders
3 where not exists (select *
4 from work_order_tasks
5 where work_order_tasks.reference_nbr
6 = work_orders.work_order_nbr
7 and work_orders.planner = 'X3221');
COUNT(*)
----------
176434
Elapsed: 00:00:05.04
Listing 2 produces the same result set. However, a With clause was added that creates a result set of Planner X3221's Work_order_task records. This result set is named "Tasks". The correlated subquery uses the Tasks result set as a data source. The change resulted in the result set returning in 3.05 seconds, a small gain in performance.
Listing 2 - Using a correlated subquery and a With clause result set to count the number of work orders with tasks not created by planner X3221
SQL> with tasks as
2 (select reference_nbr
3 from work_order_tasks
4 where wo_tsk_planner = 'X3221')
5 select count(*)
6 from work_orders
7 where not exists (select *
8 from tasks
9 where tasks.reference_nbr
10 = work_orders.work_order_nbr);
COUNT(*)
----------
176434
Elapsed: 00:00:03.05
SQL>
With clause result sets can also be used as a data source in the From clause reducing Select statement complexity. To illustrate, assume that a company has a table that contains the detail charges or costs for all of the company's departments. The president of the company has charged you to write a report that displays the three departments with the highest 2004 costs. In order to produce this result set, the developer must do the following:
- Compute a result set of departments and their 2004 costs
- Determine the top three departments
Listing 3 illustrates a Select statement that accomplishes the needed tasks. A subquery using an in-line view identifies the top three departments. The outer Select statement uses the returned department values as arguments to identify the top three departments. The outer or main Select statement re-computes the department costs for the three departments identified in the subquery. The overall Select statement essentially performs two similar queries against the same table (Detail_costs).
Listing 3 - Classic Select statement using a subquery to identify the top three departments
select deptid, sum(costs) costs
from detail_costs
where transaction_date >= '01-jan-2004'
and deptid in (select deptid
from (select deptid,
rank() over (order by sum(costs) desc) ranking
from detail_costs
where transaction_date >= '01-jan-2004'
group by deptid)
where ranking <= 3)
group by deptid;
The Listing 4 Select statement produces the same result set as Listing 3. However, it takes approximately 50% less time to produce the results (see benchmarks in Table 1). The performance increase was caused by the reduction in the data that is evaluated. The With clause computed a result set of departments and their 2004 costs. This is in reality a small subset of Detail_costs data comprising no more that 20 rows. The main Select statement, subquery, and the subquery in-line view used this small result set to determine the top three departments. Using the precompiled result set dramatically reduced the work Oracle had to do.
Listing 4 - Using the With clause to precompute a department costs result set
with dept_costs as
(select deptid, sum(costs) costs
from detail_costs
where transaction_date >= '01-jan-2004'
group by deptid)
select deptid, costs
from dept_costs
where deptid in (select deptid
from (select deptid,
rank() over (order by costs desc) ranking
from dept_costs)
where ranking <= 3);
Table 1 contains some benchmark tests. The database has approximately 5,000,000 records. Consecutive queries were run during a period when there was high usage on the server. The last benchmark was executed off hours. As you can see, the Select statement containing the With clause dramatically increased performance. It may be worth your time to investigate its use when writing statements that contain sub-queries.
Table 1 - Benchmark query times
Time Listing 1 Listing 2
12:45 PM 6 minutes 28 seconds 3 minutes 13 seconds
2:10 PM 5 minutes 24 seconds 1 minute 37 seconds
3:10 PM 5 minutes 33 seconds 3 minutes 14 seconds
4:15 PM 3 minutes 56 seconds 1 minute 15 seconds
The With clause has a simple syntax as seen below. It is the first clause and begins with the keyword With. The next word is the name of the result set. The Select statement used this name as data source name in the following select statement. The With clause Select statement is wrapped by parentheses.
with result_set_name as
(select statement)
John Adolph Palinski is a former adjunct faculty member at the University of Nebraska at Omaha and Iowa Western Community College. He is the author of the "Oracle Database Construction Kit", Que, 1997, the "Oracle SQL and PL/SQL Handbook: A Guide for Data Administrators, Developers, and Business Analysts", Addison Wesley Longman, 2002, and "Oracle 9i Developer: Developing Web Applications With Forms Builder", Course Technology, 2003. He has also written numerous articles for periodicals such as Oracle Internals and Oracle Professional. He provides Oracle training. He may be contacted at www.oracle-trainer.com.