Using multibyte collations

This section describes how multibyte character sets are handled. The description applies to the supported collations and to any multibyte custom collations you may create.

Sybase IQ provides collations using several multibyte character sets.

For a complete listing, see “Understanding collations”.

Sybase IQ supports variable-width character sets. In these sets, some characters are represented by one byte, and some by more than one, to a maximum of four bytes. The value of the first byte in any character indicates the number of bytes used for that character, and also indicates whether the character is a space character, a digit, or an alphabetic (alpha) character.

For the UTF8 collation, characters are represented by one to four bytes. For other multibyte collations, characters are represented by one or two bytes. For all provided multibyte collations, characters of two or more bytes are considered “alphabetic”, so they can be used in identifiers without double quotes.
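
As an illustration of how a first byte can indicate character length, the following Python sketch classifies the lead byte of a UTF-8 sequence. It follows the standard UTF-8 encoding rules and is not a description of Sybase IQ internals; the function name utf8_char_length is hypothetical.

```python
def utf8_char_length(first_byte: int) -> int:
    """Return the byte length of a UTF-8 character, given its first byte."""
    if first_byte < 0x80:
        return 1  # 0xxxxxxx: single-byte (ASCII) character
    if 0xC2 <= first_byte <= 0xDF:
        return 2  # 110xxxxx: lead byte of a two-byte character
    if 0xE0 <= first_byte <= 0xEF:
        return 3  # 1110xxxx: lead byte of a three-byte character
    if 0xF0 <= first_byte <= 0xF4:
        return 4  # 11110xxx: lead byte of a four-byte character
    # Continuation bytes (0x80-0xBF) and invalid lead bytes end up here.
    raise ValueError(f"0x{first_byte:02X} cannot start a UTF-8 character")


# Latin letter, accented letter, Thai letter, emoji.
for ch in ("A", "\u00e9", "\u0e01", "\U0001F600"):
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X}", len(encoded), utf8_char_length(encoded[0]))
```

For each sample character, the length derived from the first byte alone matches the actual encoded length.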

Sybase IQ does not support 16-bit or 32-bit character sets such as UTF-16 or UTF-32.

All client libraries other than embedded SQL are Unicode-enabled, using the UTF-16 encoding. Translation occurs between the client and the server.

Japanese language support

Sybase recommends using collation 932JPN for Japanese Windows applications. Collation 932JPN supports loading 32-bit multibyte characters that cannot be loaded into SJIS or SJIS2. SJIS and SJIS2 are older collations: SJIS remains available as an alternate collation, but SJIS2 is no longer supported. For Unix applications, use EUC_JAPAN.
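
To get a feel for how a Windows code page 932 encoding can accept characters that a stricter Shift-JIS encoding rejects, the Python sketch below compares Python's own cp932 and shift_jis codecs. These codecs are only stand-ins: the sketch assumes that 932JPN corresponds to Windows code page 932, and the exact set of characters affected in Sybase IQ may differ. The helper name encodable is hypothetical.

```python
def encodable(text: str, codec: str) -> bool:
    """Return True if every character in text can be encoded with codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# U+2460 CIRCLED DIGIT ONE is a vendor extension character in code page 932.
sample = "\u2460"

for codec in ("cp932", "shift_jis"):
    print(codec, encodable(sample, codec))
```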

Thai language support

UTF8 is the only Thai language collation supported. Before you can load data in the CP874 character set, you must convert it to UTF8. Sybase IQ provides a utility that converts data files in CP874 format into UTF8. For syntax, see the Sybase IQ Utility Guide.
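
The Python sketch below shows the kind of byte-level change a CP874 to UTF8 conversion makes. It is not the Sybase IQ utility and does not reproduce its syntax; it uses Python's built-in cp874 codec, and the function name cp874_to_utf8 is hypothetical.

```python
def cp874_to_utf8(data: bytes) -> bytes:
    """Re-encode CP874 (single-byte Thai) data as UTF-8."""
    return data.decode("cp874").encode("utf-8")


# Two Thai characters, one byte each in CP874:
# 0xA1 = THAI CHARACTER KO KAI, 0xE7 = THAI CHARACTER MAITAIKHU.
cp874_bytes = b"\xa1\xe7"
utf8_bytes = cp874_to_utf8(cp874_bytes)

# Each Thai character occupies three bytes after conversion.
print(len(cp874_bytes), len(utf8_bytes))
print(utf8_bytes)
```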

The SORTKEY() function returns values in the sort order thaidict (Thai dictionary), the Thai character set in UTF8 form. The following statements generate the same result:

SELECT c1, SORTKEY(c1) FROM T1 WHERE rid=3
SELECT c1, SORTKEY(c1, 'thaidict') FROM T1 WHERE rid=3
SELECT '\340\270\201\340\271\207', SORTKEY('\340\270\201\340\271\207') FROM T1 WHERE rid=3
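
The octal escapes \340\270\201\340\271\207 in the string literal are the UTF-8 bytes of two Thai characters. The following Python sketch decodes them to show which code points they represent; it only inspects the bytes and does not call SORTKEY() or any Sybase IQ function.

```python
import unicodedata

# The same octal escapes that appear in the SQL string literal.
literal = b"\340\270\201\340\271\207"

# Interpret the six bytes as UTF-8, giving two Thai characters.
text = literal.decode("utf-8")

for ch in text:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```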

For more details, see Chapter 5, “SQL Functions,” in the Sybase IQ Reference Manual.