ARTICLES

Home  > Articles  >  Oracles hierarchical query and new 10g enhancements

Oracle's hierarchical query and new 10g enhancements

By John Adolph Palinski

The Select statement has some additional clauses that are used to place a result set into a hierarchy.  A good example of a hierarchy is the Windows My Computer dialog window.  When the window is opened, the various desktop directories and drives are displayed.  Under each drive is an offset of directories that reside on the drive.  Further, expanding the directory causes the display of child files and other directories.  This is a hierarchy, each object node has its children, grandchildren, and great-grandchildren displayed underneath.  Oracle’s Select statement has a clause that allows you to identify the level of a row (parent, child, and grandchild) and display the related row under the proper parent.  This clause is the Start With … Connect By clause.  The Start With section identifies the first parent in the hierarchy.  The Connect By section tells Oracle how to link each parent to the child or next level record (if it exists).

Listing 1 illustrates a hierarchical query.  The listing uses a default Oracle table called EMP that contains a unary relationship.  The EMP table contains employee records.  Each EMP record has an MGR field that contains another employee ID as a value.  The MGR value is a foreign key to the EMPNO column on the same table and identifies the current employee’s boss. 

Listing 1 is depiction found in many reference manuals and was not developed by the author.  The listing displays employees and the employees that report to the respective employee.  The employee records are displayed in a hierarchy.  At the top of the hierarchy is employee KING.  KING’s record is the first level under the hierarchy.  At the second level are employees JONES, BLAKE, and CLARK.  Under each of the second level employees are employees that report to these individuals.  This is a hierarchy in that the proper employee is placed under the proper boss. 

The start of the hierarchy is identified in the Start With clause section.  Records with a null MGR value are placed in the first level.  Oracle then evaluates other records looking for MGR values that match the prior or parent rows EMP.  In the case of KING, the MGR value is null.  KING’s  employee number value is the PRIOR EMPNO value.  Employees JONES, BLAKE, and CLARK are the employees that had KING’s EMPNO value in their MGR column.  Oracle then moves through the data looking for employees that have MGR equal to JONE, BLAKE, or CLARK’s EMPNO value.

Listing 1 – Listing employees in a hierarchy using the Start With … Connect By clause

SQL> column employee format a25
SQL> Select level, lpad(' ',2*(level - 1)) || ename "EMPLOYEE",
  2         empno, mgr
  3  from emp
  4  start with mgr is null
  5  connect by prior empno = mgr
  6*

     LEVEL EMPLOYEE                       EMPNO        MGR                     
---------- ------------------------- ---------- ----------                     
         1 KING                            7839                                
         2   JONES                         7566       7839                     
         3     SCOTT                       7788       7566                     
         4       ADAMS                     7876       7788                     
         3     FORD                        7902       7566                     
         4       SMITH                     7369       7902                     
         2   BLAKE                         7698       7839                     
         3     ALLEN                       7499       7698                     
         3     WARD                        7521       7698                     
         3     MARTIN                      7654       7698                     
         3     TURNER                      7844       7698                     
         3     JAMES                       7900       7698                     
         2   CLARK                         7782       7839                     
         3     MILLER                      7934       7782                     

14 rows selected.

SQL>

Notice the Listing 1 Level expression.  It is a pseudo-column used to identify the particular hierarchy level.  The Prior keyword indicates how the current row is related to the parent row.  The previous query is representive of the pre-Oracle10g hierarchical query functionality.  The functionality of the Hierarchical query is enhanced in Oracle10g.  Oracle has added some additional operators, pseudo-columns, functions, and keywords.  These are:

• Connect_by_root - Operator that returns a value from the current column’s root row.

• Connect_by_iscycle - Pseudo-column that identifies cycles.  When used in conjunction to the Nocycle keyword, the column returns a value of 1 if the current row has a child that is also a parent.  Connect_by_iscycle returns a value of 0 if the current row does not have a child that is a parent.

• Connect_by_isleaf - Pseudo-column that identifies a leaf node.  A leaf node is the lowest record in the hierarchy.  It is a record that does not have a child. 

• Sys_connect_by_path - Oracle9i function that returns a concatenated value that has each of the related row values.  The values can be separated by a character string.

• Nocycle - Keyword placed in the Connect By section that disables the error generated when the target data has a cycle.  A cycle occurs when a parent record is also a child record.

• Siblings - Oracle9i keyword used in the Order By clause to properly order the rows.  The keyword causes rows to be sorted within their particular hierarchy.  This is something that cannot be done using the normal Order by clause.

Listing 2 illustrates the Sys_connect_by_path function.  The listing is a modification of Listing 1.  The Listing 1 expressions are replaced by the Sys_connect_by_path() function.  The function has two parameters: the employee names (ENAME) and the character (-) separating the employee names.  Notice that each row now displays the full ancestral hierarchy rather than an individual employee name.  Each of the employee’s names is separated by the function character.

Listing 2 – Using the Sys_connect_by_path() function to display the full ancestral hierarchy on each row

SQL> Select sys_connect_by_path(ename, '-') hierarchy
  2  from emp
  3  start with mgr is null
  4* connect by prior empno = mgr

HIERARCHY
--------------------------------
-KING                                                                          
-KING-JONES                                                                    
-KING-JONES-SCOTT                                                              
-KING-JONES-SCOTT-ADAMS                                                        
-KING-JONES-FORD                                                               
-KING-JONES-FORD-SMITH                                                         
-KING-BLAKE                                                                    
-KING-BLAKE-ALLEN                                                              
-KING-BLAKE-WARD                                                               
-KING-BLAKE-MARTIN                                                             
-KING-BLAKE-TURNER                                                             
-KING-BLAKE-JAMES                                                              
-KING-CLARK                                                                    
-KING-CLARK-MILLER                                                             

14 rows selected.

SQL>

Listing 3 is also a revision of Listing 1.  This listing illustrates the Sibling Order By clause option.  Return to Listing 1 and notice that the child records are listed under the proper parent, but not order in any particular order.  Adding the Order By clause to the Select statement and sorting the result set by one of the expressions causes the result set to lose its hierarchy.  Ordering by Level causes all records to be sorted by Level value.  You will no longer know which hierarchy the particular row belongs to.  This same effect occurs if Employee, Empno, or Mgr is used.  As Listing 3 illustrates, using the Siblings keyword in the Order By clause causes each hierarchy to be properly sorted within the hierarchy.  The Siblings keyword replacing the By keyword.  Notice, the clause must still identify the result set value that determines the sorting within the hierarchy levels Compare the result set to Listing 1.

Listing 3 – Using the Siblings keyword to sort rows within each hierarchy

SQL> Select level, lpad(' ',2*(level - 1)) || ename "EMPLOYEE",
  2         empno, mgr
  3  from emp
  4  start with mgr is null
  5  connect by prior empno = mgr
  6* order siblings by ename

     LEVEL EMPLOYEE                       EMPNO        MGR                     
---------- ------------------------- ---------- ----------                     
         1 KING                            7839                                
         2   BLAKE                         7698       7839                     
         3     ALLEN                       7499       7698                     
         3     JAMES                       7900       7698                     
         3     MARTIN                      7654       7698                     
         3     TURNER                      7844       7698                     
         3     WARD                        7521       7698                     
         2   CLARK                         7782       7839                     
         3     MILLER                      7934       7782                     
         2   JONES                         7566       7839                     
         3     FORD                        7902       7566                     
         4       SMITH                     7369       7902                     
         3     SCOTT                       7788       7566                     
         4       ADAMS                     7876       7788                     

14 rows selected.

SQL>

Listing 4 uses the Connect_by_root operator and the Connect_by_isleaf pseudo-column.  Connect_by_root identifies a value from the root row.  In the case of Listing 4, the row is KING’s and the retained expression is Ename with a value of KING.  Notice that the Connect_by_root operator displays KING on each row since it is the root value for all hierarchies.  The Connect_by_isleaf pseudo-column is used in the fourth expression.  It displays a value of 0 if the current row has children and a value of 1 if the row does not have children.

Listing 4 – Using the Connect_by_root operator and the Connect_by_isleaf pseudo-column

SQL> Select level, lpad(' ',2*(level - 1)) || ename "EMPLOYEE",
  2         Connect_by_root ename "Head Person",
  3         Connect_by_isleaf "End of Hier"
  4  from emp
  5  start with mgr is null
  6  connect by prior empno = mgr
  7* order siblings by ename

     LEVEL EMPLOYEE                  Head Perso End of Hier                    
---------- ------------------------- ---------- -----------                    
         1 KING                      KING                 0                    
         2   BLAKE                   KING                 0                    
         3     ALLEN                 KING                 1                    
         3     JAMES                 KING                 1                    
         3     MARTIN                KING                 1                    
         3     TURNER                KING                 1                    
         3     WARD                  KING                 1                    
         2   CLARK                   KING                 0                    
         3     MILLER                KING                 1                    
         2   JONES                   KING                 0                    
         3     FORD                  KING                 0                    
         4       SMITH               KING                 1                    
         3     SCOTT                 KING                 0                    
         4       ADAMS               KING                 1                    

14 rows selected.

SQL>

Hierarchical data sets generally do not have cycles.  A cycle is a record that has an ancestor record that is also a child.  However, this condition may occur as the result of an error or for some other unknown reason.  If Oracle encounters a cycle, it issues an error message.  In order to display the error message Oracle has created the Nocycle keyword.  Oracle has also created the Connect_by_iscycle pseudo-column.  This column displays a value of 0 if the current record has children records that are not parents.  It displays a value of 1 if the record has a child record that is also a parent record.  In essence, the returning a value of 1 indicates a cycle existing. 

The default Oracle EMP table does not have a cycle, so it is necessary to modify the table creating a cycle.  Employee KING is the root record.  KING has a null MGR value since KING does not report to anyone else.  If KING’s MGR value is modified to 7900, a cycle is.  7900 is JAMES’ Empno value and KING will now report to JAMES.  Listing 5 contains a Select statement that is executed against the EMP table that contains a cycle.  Notice the error message.

Listing 5 – Hierarchical query against a table containing a cycle

SQL> Select level, lpad(' ',2*(level - 1)) || ename "EMPLOYEE",
  2         empno, mgr
  3  from emp
  4  start with empno = 7839
  5  connect by prior empno = mgr
  6  order siblings by ename
  7  ;
ERROR:
ORA-01436: CONNECT BY loop in user data 

no rows selected

SQL>

Listing 6 is the same Select statement except the Nocycle keyword is added to the Connect By section.  In addition, the Select clause now contains the Connect_by_iscycle pseudo-column.  The Select statement now executes successfully.  JAMES’ record is identified by the Connect_by_iscycle column at the beginning of a cycle.  This is determined by the value of 1 for JAMES.  The Nocycle option terminates the cycle at that point and continues the query.

Listing 6 – Using the Connect_by_iscycle pseudo-column and Nocycle keyword to identify a cycle and eliminate the error message

SQL> Select level, lpad(' ',2*(level - 1)) || ename "EMPLOYEE",
  2         empno, mgr, connect_by_iscycle
  3  from emp
  4  start with empno = 7839
  5  connect by nocycle prior empno = mgr
  6* order siblings by ename

     LEVEL EMPLOYEE            EMPNO        MGR CONNECT_BY_ISCYCLE  
---------- ------------------- ---------- ---------- ------------------  
         1 KING                      7839       7900                  0  
         2   BLAKE                   7698       7839                  0  
         3     ALLEN                 7499       7698                  0  
         3     JAMES                 7900       7698                  1  
         3     MARTIN                7654       7698                  0  
         3     TURNER                7844       7698                  0  
         3     WARD                  7521       7698                  0  
         2   CLARK                   7782       7839                  0  
         3     MILLER                7934       7782                  0  
         2   JONES                   7566       7839                  0  
         3     FORD                  7902       7566                  0  
         4       SMITH               7369       7902                  0  
         3     SCOTT                 7788       7566                  0  
         4       ADAMS               7876       7788                  0  


I believe that the new Oracle10g enhancements will make the hierarchical a much more powerful tool for you.


John Adolph Palinski is a former adjunct faculty member at the University of Nebraska at Omaha and Iowa Western Community College. He is the author of the "Oracle Database Construction Kit", Que, 1997, the "Oracle SQL and PL/SQL Handbook: A Guide for Data Administrators, Developers, and Business Analysts", Addison Wesley Longman, 2002, and "Oracle 9i Developer: Developing Web Applications With Forms Builder", Course Technology, 2003. He has also written numerous articles for periodicals such as Oracle Internals and Oracle Professional. He provides Oracle training and consulting services. One of his passions is to help customers set up logical business models to improve information efficiency and effectiveness. He may be contacted at www.oracle-trainer.com.