ARTICLES

Home  > Articles  >  Oracles Merge Statement as an ETL Tool

Oracle's Merge Statement as an ETL Tool

By John Adolph Palinski

Oracle added an extremely useful tool to its 9i+ database versions.  This tool is the Merge statement.  The statement allows the developer to Insert and Update a target table using the same result set without having to use a Where clause statement to locate the Updated record.  This tool greatly simplifies the refresh of data warehouse tables.  The statement is also much simpler and enhances performance.

Prior to this statement, refreshes that required Inserts and Updates require a PL/SQL code block with various cursors, loops, and If-Then structures.  Listing 1 depicts a typical PL/SQL code block that refreshes a table.  The script has the following components:

-  A For Loop that retrieves the fresh data into the PL/SQL code block
-  A Cursor opened for each row fectched by the For Loop.  This cursor tests the target data warehouse table for record existence
-  Cursor Open, Fetch, and Close statements.
-  An If-Then-Else construct that tests the success of the cursor fetch
-  An Update statement
-  An Insert statement

The result of the script is that records that exist in the data warehouse table are updated and the records that do not exist are inserted.


Listing 1 - PL/SQL code block that refreshes a data warehouse table with Inserts and updates

begin
  for new_data in (select * from emp) -- Brings in the current records
  loop
    declare
       cursor test is select * 
                      from emp_data_warehouse
                      where emp_data_warehouse.empno = new_data_empno;
       test_var    test@rowtype;
    begin
       open test;           -- Executes a Select statement against the target data warehouse
       fetch test into test_var;          
       if test%found then    - Determines whether the new record exists in the data warehouse
           update emp_data_warehouse     -- Update the record if it exists
                 set emp_data_warehouse = new_data
           where emp_data_warehouse.empno = new_data.empno;
       else
           insert into emp_data_warehouse -- Insert the record it it doesn't exist
               (emp_data_warehouse)
           values
               (new_data);
       end if;
       close test;
     end;
     end loop;
end;
/
         


The Merge statement simplifies the code needed to perform the insert/update.  The statement increases performance by:

-  Eliminating the need to open a cursor for each row checking for existence
-  Having to locate and lock each updated record

The Merge statement eliminates the tasks since its first step is to join the new data set records with the target data warehouse data set records.  Oracle uses the power of a join to identify whether a record exists or not.  The Merge statement allows the developer to perform different actions when a match or non-match occurs.  Joining and matching result sets eliminates much of the incremental database hits caused by opening the cursor and issuing the update statement.  

Listing 2 illustrates a merge statement that performs the same insert/update as the Listing 1 PL/SQL code block.

Listing 2 - Using the Merge statement to refresh a data warehouse table


merge into emp_data_warehouse           -- Identifies the target table
using emp                               -- Identifies the source table, view, or in-line view
on (emp_data_warehouse.empno = emp.empno) -- Join condition
when matched then                      -- Reviews the joined result set and tests for a match
  update                       -- Notice the Update statement doesn't have a data source name
    set emp_data_warehouse.ename = emp.ename  -- No Where clause is needed
when not matched then  -- Review the joined result set and tests for a match
  insert
    (emp_data_warehouse.empno, emp_data_warehouse.ename)
  values
    (emp.empno, emp.ename)

Notice that there are far fewer lines in the Merge statement.  It is also much simpler.  The following is a template of the statement:

merge into refreshed_data_source
using (refreshed_from_data_source)
on (matching conditions)
when matched then
 
statement
when not matched then
  statement


Some points to consider are:

 - The Merge statement is not a PL/SQL code block.  Do not end the Insert or Update statement with a semi-colon
 - The matching conditions clause can only contain join conditions.  Filtering clauses will terminate with internal error codes.  If the refreshed_from_data_source is to be filtered, create and in-line view as the data source and add the filtering condition to the view
 - The statement can have a When Matched clause without having a When Not Matched clause.  The statement can also be valid with a When Not Matched clause without a When Matched clause
 - The Insert statement canly only placed in the When Not Matched clause
 - The Update statement can only be placed in the When Matched clause
 - Do not update any join column arguments in the matching conditions clause

The Merge statement is a very effective tool.  However, the 9i version issues incorrect error messages.  For example, if you should try to update one of the matching clause join columns, Oracle issues an "Invalid Identifier" error.  The error does not correspond to the actual error.  Oracle corrected the problems in the 10g version.  Remember the above points and being aware that the issued error message may not match the actual syntax problem will help you be successful with the Merge statement.

John Adolph Palinski is a former adjunct faculty member at the University of Nebraska at Omaha and Iowa Western Community College. He is the author of the "Oracle Database Construction Kit", Que, 1997, the "Oracle SQL and PL/SQL Handbook: A Guide for Data Administrators, Developers, and Business Analysts", Addison Wesley Longman, 2002, and "Oracle 9i Developer: Developing Web Applications With Forms Builder", Course Technology, 2003. He has also written numerous articles for periodicals such as Oracle Internals and Oracle Professional. He provides Oracle training and consulting services. One of his passions is to help customers set up logical business models to improve information efficiency and effectiveness. He may be contacted at www.oracle-trainer.com.