Chapter 58. UTL_MATCH

Table of Contents

58.1. Overview
58.2. Functions
58.2.1. EDIT_DISTANCE
58.2.2. EDIT_DISTANCE_SIMILARITY
58.2.3. JARO_WINKLER
58.2.4. JARO_WINKLER_SIMILARITY

This chapter briefly introduces the UTL_MATCH package, and describes how to use the functions of the package.

58.1. Overview

UTL_MATCH is used to calculate the degree of similarity between two strings.

It provides functions that use the Levenshtein Algorithm to calculate the Edit Distance and the Jaro-Winkler Algorithm to calculate the Match Score.

58.2. Functions

This section describes the functions provided by the UTL_MATCH package, in alphabetical order.

58.2.1. EDIT_DISTANCE

Calculates the Edit Distance between two string using the Levenshtein Algorithm.

Details about the EDIT_DISTANCE function are as follows:

  • Prototype

    UTL_MATCH.EDIT_DISTANCE 
    (
            str1       IN          VARCHAR2,
            str2       IN          VARCHAR2
    )
    RETURN PLS_INTEGER;
  • Parameters

    ParameterDescription
    str1Operand 1.
    str2Operand 2.
  • Return Value

    Return ValueDescription
    PLS_INTEGEREdit Distance between str1and str2. Returns -1 if either or both of str1or str2 is NULL.
  • Example

    BEGIN
        DBMS_OUTPUT.PUT_LINE(UTL_MATCH.EDIT_DISTANCE('apple','aaple'));
    END;
    /
    1
    

58.2.2. EDIT_DISTANCE_SIMILARITY

Calculates the Edit Distance between two strings using the Levenshtein Algorithm and normalizes the result (percent).

Details about the EDIT_DISTANCE_SIMILARITY are as follows:

  • Prototype

    UTL_MATCH.EDIT_DISTANCE_SIMILARITY 
    (
            str1       IN          VARCHAR2,
            str2       IN          VARCHAR2
    )
    RETURN PLS_INTEGER;
  • Parameters

    ParameterDescription
    str1Operand 1.
    str2Operand 2.
  • Return Value

    Return ValueDescription
    PLS_INTEGERNormalized value (to the string with the longer length) of the Edit Distance between str1and str2 in percent. Returns 100 if both strings are NULL. Returns 0 if either one of str1or str2 is NULL.
  • Example

    BEGIN
        DBMS_OUTPUT.PUT_LINE(UTL_MATCH.EDIT_DISTANCE_SIMILARITY('apple', 'aaple'));
    END;
    /
    81
    

58.2.3. JARO_WINKLER

Calculates the Match Score between two string using the Jaro-Winkler Algorithm.

Details about the JARO_WINKLER function are as follows:

  • Prototype

    UTL_MATCH.JARO_WINKLER 
    (
            str1       IN          VARCHAR2,
            str2       IN          VARCHAR2
    )
    RETURN BINARY_DOUBLE;
  • Parameters

    ParameterDescription
    str1Operand 1.
    str2Operand 2.
  • Return Value

    Return ValueDescription
    BINARY_DOUBLEMatch Score between str1and str2. Returns 1 if either or both of str1or str2 is NULL.
  • Example

    BEGIN
        DBMS_OUTPUT.PUT_LINE(UTL_MATCH.JARO_WINKLER('apple','aaple'));
    END;
    /
    .88
    

58.2.4. JARO_WINKLER_SIMILARITY

Calculates the Match Score (in percent) between two string using the Jaro-Winkler Algorithm.

Details about the JARO_WINKLER_SIMILARITY function are as follows:

  • Prototype

    UTL_MATCH.JARO_WINKLER_SIMILARITY 
    (
            str1       IN          VARCHAR2,
            str2       IN          VARCHAR2
    )
    RETURN PLS_INTEGER;
  • Parameters

    ParameterDescription
    str1Operand 1.
    str2Operand 2.
  • Return Value

    Return ValueDescription
    PLS_INTEGERMatch Score (in percent) between str1and str2. Returns 1 if either or both of str1or str2 is NULL.
  • Example

    BEGIN
        DBMS_OUTPUT.PUT_LINE(UTL_MATCH.JARO_DISTANCE_SIMILARITY('apple','aaple'));
    END;
    /
    88