Back to articles list Articles Cookbook
3 minutes read

Oracle Collations: Binary and Linguistic Collations

Oracle bases its language support on the values of parameters that begin with NLS. These parameters specify, for example, how to display currency or how the name of a day is spelled.

The table below presents some of the NLS parameters. By using one of them, NLS_SORT, we can specify the sort method (binary or linguistic) for both SQL WHERE clause operations and NLSSORT function operations.

Option Name Description
NLS_LANG The current language, territory, and database character set, which are determined by session-wide globalization parameters.
NLS_LANGUAGE The current language for the session.
NLS_SORT The sequence of character values used when sorting or comparing text.

To check the current NLS settings, type:

SELECT *
FROM v$NLS_PARAMETERS;


NLS parameters


Oracle provides two main types of collation: binary and linguistic.

Binary Collation

With a binary sort characters are sorted in the order of their binary representation. In case of different languages, a binary collation doesn't produce reasonable results. It’s a default collation of the database session for ORDER BY and BETWEEN (and other operations that support linguistic sort) and is set as ‘binary’ in NLS_COMP variable.

Linguistic Collation

A linguistic sort operates by replacing characters with other binary values that reflect the character’s proper linguistic order so that a sort returns the desired result. You can specify the collation behavior by setting NLS_COMP = linguistic.

There are three types of linguistic collation:

  1. Monolingual Collation

    A collation where two values are used to determine the relative position of a character.

    Characters are compared in two steps:

    • The major values of the characters from a table of major values (letters with the same appearance have the same major value)
    • The minor values of the characters from a table of minor values
    Glyph Major Value Minor Value
    a 15 5
    A 15 10
    ä 15 15
    Ä 15 20
    b 20 5

    Sample major, minor values for GERMAN monolingual sort

    For example: to set monolingual German sort, set NLS_SORT value to GERMAN.

  2. Multilingual Collation

    It enables you to sort data at three levels of precision:

    • Primary Level Collation – distinguishes between base letters
    • Secondary Level Collation – distinguishes between diacritics if base letters are the same
    • Tertiary Level Collation – distinguishes between base letters, diacritics and case (upper case and lower case) including special characters such as +, -, and *.

    The _M appended to the end of a sort name denotes a multilingual sort. Its absence denotes a monolingual sort. Case-sensitive and accent-insensitive sorts have _CI or _AI appended to the name. Its absence denotes case- and accent-sensitivity.

    For example: to set multilingual French for an accent-insensitive and case-insensitive, sort set NLS_SORT value to FRENCH_M_AI.

  3. Unicode Collation Algorithm (UCA)

    It defines a Default Unicode Collation Element Table (DUCET) that provides a default ordering for all languages.

List Collations

To get a list with the valid NLS_SORT values, type:

SELECT * 
FROM V$NLS_VALID_VALUES 
WHERE parameter = 'SORT';


NLS_SORT values


You Can Set NLS Parameters:

  • as initialization parameters on the instance/server:

    ALTER SYSTEM SET V$NLS_PARAMETER = 'XXX' scope = both;
    

    Example:

    ALTER SYSTEM SET NLS_SORT='RUSSIAN' SCOPE=SPFILE;
    ALTER SYSTEM SET NLS_COMP='LINUGUISTIC' SCOPE=SPFILE;
    

  • as environment variables on the client:

    % setenv NLS_SORT FRENCH
    

  • for a session:

    ALTER SESSION SET V$NLS_PARAMETER =  = 'XXX'
    

If you don’t want to globally affect the instance, you can use the NLSSORT() function to set the NLS_SORT for the scope of a specific query.

For example:

SELECT *
FROM test
ORDER BY NLSSORT(name, 'NLS_SORT=german');