ARTICLES

Home  > Articles  >  Using Oracle10gs new Regexp_instr() function in place of the traditional Instr() function

Using Oracle10g's new Regexp_instr() function in place of the traditional Instr() function

By John Adolph Palinski

Oracle's new 10g database includes an enhanced version of the Instr() function. The Instr() function is used to identify the sequential position number of a target character value. This new function allows you to use POSIX and Unicode notation in a regular expression. A regular expression is a pattern used to match characters in a string. This article offers several examples of this function.

Let's begin with a review of the Instr() function. Instr() is designed to return the sequential position number of a target character or set of characters. The sequential position number of the first character in the target character set "FI" within the expression "MAFIA" is 3. This is shown in Listing 1.

Listing 1 - Using the Instr() function to determine the sequential position number of a set of characters

SQL> select instr('MAFIA', 'FI', 1, 1)
2 from dual;

INSTR('MAFIA','FI',1,1)
-----------------------
3

SQL>

Reviewing the function you can see that it has four parameters. These are:

'MAFIA'

-

The evaluated character string. In Listing 1 the evaluated expression is "MAFIA"

'FI'

-

The search character(s). The function searches the expression for these characters. The Listing 1 Instr() function searches for the character set "FI"

1

-

Search starting position. The third parameter identifies the position in the evaluated expression in which to start the search. In Listing 1, the search begins in the first position.

1

-

Occurrence. The fourth parameter identifies the target occurrence. In Listing 1, the first occurrence of the character string "FI" is the target. If the value is set to 2, the function will return a 0 since there isn't a second occurrence of the character set "FI".


The Instr() function remains in Oracle10g but is complimented with the new Regexp_instr() function. The Regexp_instr() function allows you to use POSIX and Unicode notation as a regular expression in the second or search character parameter. You are no longer limited to searches on 1 or more sequential characters. The POSIX/Unicode notation opens an infinite variety of searches. You can search for patterns such as:

- The letter "A" followed 3 positions later by another letter "A"
- A digit character (i.e. 3, 4) followed by an alpha character (i.e. A, B, C)
- The letter "P" followed by the letter "A" or the letter "E".


Listing 2 illustratrates the Regexp_instr() function in use. The function is used to identify the sequential starting position of the letter "A" followed 3 positions later by another letter "A". The notation for this pattern is "A..A". In POSIX notation periods are a wild-card symbol.

Listing 2 - Using the Regexp_instr() function to identify the sequential position of the character A followed three positions later by another letter A

SQL> Select regexp_instr('MAFIA', 'A..A', 1, 1)
2 from dual;

REGEXP_INSTR('MAFIA','A..A',1,1)
--------------------------------
2

SQL>

Listing 3 is another example of Regexp_instr(). This example uses the function to identify the sequential position number of a digit character followed by an alpha character. The POSIX/Unicode notation allows you to use character classes to check for digit, alpha, alphanumeric, lower, upper, and other type characters. The character class notation is enclosed by brackets and colons. The notation [[:digit:]][[:alpha:]] describes a character set consisting of a digit followed by an alpha character. In the following example, the Regexp_instr function is used to evaluate different expressions.

Listing 3 - Using the Regexp_instr() function to identify the sequential position of a digit character followed by an alpha character

SQL> Select regexp_instr('12-59 9Ao @3', '[[:digit:]][[:alpha:]]') exp1,
2 regexp_instr('AAAA$333333B 3', '[[:digit:]][[:alpha:]]') exp2,
3 regexp_instr(' a3 b4 c6 7v', '[[:digit:]][[:alpha:]]') exp3
4 from dual;

EXP1 EXP2 EXP3
---------- ---------- ----------
7 11 11

SQL>

In this last example the Regexp_instr() function is used to identify the sequential position number of the letter "P" followed by the letter "A" or the letter "E". In essence, it is searching for character sets: "PA" or "PE". In this example the notation uses the Alternate operator (|). This operator tells the function to evalutate for either the "A" or "E" characters.

Listing 4 - Using the Regexp_instr() function along with the Alternate operator to identify the sequential position of charater sets "PA" and "PE"

SQL> select regexp_instr('PsPaPTPRPEPPAPA', 'P(A|E)', 1, 1) exp1,
2 regexp_instr('aPaPePAPAEPPAPE', 'P(A|E)', 1, 1) exp2,
3 regexp_instr('MAFIA', 'FI', 1, 1) exp3
4 from dual;

EXP1 EXP2 EXP3
---------- ---------- ----------
9 6 3

SQL>

You may have noticed the third expression in Listing 4. The search character notation searches for the characters "FI". This is exactly the same search illustrated in Listing 1 using the Instr() function. In fact, the parameter list looks exactly the same as the one used in Listing 1. Since we can use the Regexp_instr exactly the same way we used Instr, why use the Instr() function?

You can read more about the POSIX/Unicode notation by reviewing a previous SSWUG.org article entitled "Using Oracle10g's Regexp_like evaluation operator in place of the traditional Like operator" (http://www.sswug.org/see/18302).


John Adolph Palinski is a former adjunct faculty member at the University of Nebraska at Omaha and Iowa Western Community College. He is the author of the "Oracle Database Construction Kit", Que, 1997, the "Oracle SQL and PL/SQL Handbook: A Guide for Data Administrators, Developers, and Business Analysts", Addison Wesley Longman, 2002, and "Oracle 9i Developer: Developing Web Applications With Forms Builder", Course Technology, 2003. He has also written numerous articles for periodicals such as Oracle Internals and Oracle Professional. He provides Oracle training and consulting services. One of his passions is to help customers set up logical business models to improve information efficiency and effectiveness. He may be contacted at www.oracle-trainer.com.