Both application and server locale definitions have a character set. The application uses its character set when requesting character strings from the server. If character set translation is enabled (the default), the database server compares its character set with that of the application to determine whether character set translation is needed.
For a list of available character set labels, see “Character set labels”.
For more information about how to find locale settings, see “Determining locale information”.
The client library determines the character set as follows:
If the connection string specifies a character set, it is used.
For more information, see “CharSet connection parameter [CS]”.
For Open Client applications, the locales.dat file in the Sybase locales directory is used.
Character set information from the operating system is used to determine the locale:
On Windows operating systems, use the GetACP system call. This returns the ANSI character set, not the OEM character set.
On UNIX, default to ISO8859-1.
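The precedence above can be sketched in Python as follows. This is only an illustration of the lookup order, not the client library's actual code. The function name, its parameters, and the `cp<number>` formatting of the Windows code page are assumptions made for this sketch. The value from locales.dat is passed in rather than parsed, and any non-Windows platform is treated as UNIX.

```python
import sys


def resolve_client_charset(conn_params, open_client_charset=None,
                           platform=sys.platform):
    """Return (charset, source) using the lookup order described above.

    conn_params: dict of connection parameters (hypothetical representation).
    open_client_charset: value an Open Client application would read from
        locales.dat; parsing that file is outside this sketch.
    """
    # Connection parameter names are compared case-insensitively here.
    params = {key.lower(): value for key, value in conn_params.items()}

    # 1. A character set in the connection string (CharSet, short form CS).
    for key in ("charset", "cs"):
        if params.get(key):
            return params[key], "connection string"

    # 2. Open Client applications: the locales.dat setting.
    if open_client_charset:
        return open_client_charset, "locales.dat"

    # 3. Operating system information.
    if platform.startswith("win"):
        import ctypes  # windll is only available on Windows
        acp = ctypes.windll.kernel32.GetACP()  # ANSI code page, not OEM
        return f"cp{acp}", "GetACP"  # label format is an assumption
    return "ISO8859-1", "UNIX default"
```

For example, `resolve_client_charset({"CS": "utf8"})` stops at the first step and never consults the operating system.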
The database server determines the character set for a connection as follows:
The character set specified by the client is used if it is supported.
For more information, see “CharSet connection parameter [CS]”.
The database's character set is used if the client specifies a character set that is not supported.
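As a rough illustration of the server-side rule, the following Python sketch picks the connection character set and then checks whether translation would be needed. The function names and the `supported` set are hypothetical. The equality test in `needs_translation` is an assumption based on the statement that the server compares its character set with that of the application.

```python
def resolve_connection_charset(client_charset, database_charset, supported):
    """Pick the character set for a connection.

    supported: set of lowercase character set labels the server accepts
        (hypothetical input for this sketch).
    """
    # Use the client's character set when the server supports it.
    if client_charset and client_charset.lower() in supported:
        return client_charset
    # Otherwise fall back to the database's character set.
    return database_charset


def needs_translation(connection_charset, database_charset):
    """Assumed rule: translation is needed only when the two sets differ."""
    return connection_charset.lower() != database_charset.lower()


supported = {"utf8", "cp1252", "iso_1"}
charset = resolve_connection_charset("utf8", "cp1252", supported)
translate = needs_translation(charset, "cp1252")
```

In this example the client's utf8 is supported, so it is used for the connection, and because it differs from the database's cp1252 the sketch reports that translation is needed.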
When a new database is created, the database server determines the character set for the new database as follows:
If a collation is specified in the CREATE DATABASE statement, it is used.
The ASCHARSET environment variable is used if it exists.
Character set information from the operating system is used to determine the locale:
On Windows operating systems, use the GetACP system call. This returns the ANSI character set, not the OEM character set.
On UNIX, default to ISO8859-1.
On other platforms, use code page 1252.
When creating an IQ database, the default collation of ISO_BINENG is used if none is explicitly specified.
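The lookup order for a new database can be sketched the same way. This is a hedged illustration only. The function name, the idea of passing in the character set that belongs to the chosen collation, the `cp<number>` label format, and the list of `sys.platform` prefixes treated as UNIX are all assumptions. The IQ default collation ISO_BINENG is not modeled here.

```python
import os
import sys

# Assumption: sys.platform prefixes counted as "UNIX" in this sketch.
UNIX_PREFIXES = ("linux", "darwin", "sunos", "aix", "freebsd", "hp-ux")


def resolve_new_database_charset(collation_charset=None, env=None,
                                 platform=sys.platform):
    """Return the character set a new database would get in this sketch."""
    env = os.environ if env is None else env

    # 1. A collation named in CREATE DATABASE (its character set is passed in).
    if collation_charset:
        return collation_charset

    # 2. The ASCHARSET environment variable, if it exists.
    ascharset = env.get("ASCHARSET")
    if ascharset:
        return ascharset

    # 3. Operating system information.
    if platform.startswith("win"):
        import ctypes  # windll is only available on Windows
        return f"cp{ctypes.windll.kernel32.GetACP()}"  # ANSI, not OEM
    if platform.startswith(UNIX_PREFIXES):
        return "ISO8859-1"
    return "cp1252"  # other platforms
```

Passing `env` explicitly keeps the sketch easy to exercise without changing the real process environment.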
The following table shows the valid character set label values, together with the equivalent IANA labels and a description:
| Character set label | IANA label | Description |
|---|---|---|
| big5 | <N/A> | Traditional Chinese (cf. CP950) |
| cp437 | <N/A> | IBM CP437 - U.S. code set |
| cp850 | <N/A> | IBM CP850 - European code set |
| cp852 | <N/A> | PC Eastern Europe |
| cp855 | <N/A> | IBM PC Cyrillic |
| cp856 | <N/A> | Alternate Hebrew |
| cp857 | <N/A> | IBM PC Turkish |
| cp860 | <N/A> | PC Portuguese |
| cp861 | <N/A> | PC Icelandic |
| cp862 | <N/A> | PC Hebrew |
| cp863 | <N/A> | IBM PC Canadian French code page |
| cp864 | <N/A> | PC Arabic |
| cp865 | <N/A> | PC Nordic |
| cp866 | <N/A> | PC Russian |
| cp869 | <N/A> | IBM PC Greek |
| cp874 | <N/A> | Microsoft Thai SB code page |
| cp932 | windows-31j | Microsoft CP932 = Win31J-DBCS |
| cp936 | <N/A> | Simplified Chinese |
| cp949 | <N/A> | Korean |
| cp950 | <N/A> | PC (MS) Traditional Chinese |
| cp1250 | <N/A> | MS Windows Eastern European |
| cp1251 | <N/A> | MS Windows Cyrillic |
| cp1252 | <N/A> | MS Windows US (ANSI) |
| cp1253 | <N/A> | MS Windows Greek |
| cp1254 | <N/A> | MS Windows Turkish |
| cp1255 | <N/A> | MS Windows Hebrew |
| cp1256 | <N/A> | MS Windows Arabic |
| cp1257 | <N/A> | MS Windows Baltic |
| cp1258 | <N/A> | MS Windows Vietnamese |
| deckanji | <N/A> | DEC UNIX JIS encoding |
| euccns | <N/A> | EUC CNS encoding: Traditional Chinese with extensions |
| eucgb | <N/A> | EUC GB encoding = Simplified Chinese |
| eucjis | euc-jp | Sun EUC JIS encoding |
| eucksc | <N/A> | EUC KSC Korean encoding (cf. CP949) |
| greek8 | <N/A> | HP Greek-8 |
| iso_1 | iso_8859-1:1987 | ISO 8859-1 Latin-1 |
| iso15 | <N/A> | ISO 8859-15 Latin-1 with Euro, etc. |
| iso88592 | iso_8859-2:1987 | ISO 8859-2 Latin-2 Eastern Europe |
| iso88595 | iso_8859-5:1988 | ISO 8859-5 Latin/Cyrillic |
| iso88596 | iso_8859-6:1987 | ISO 8859-6 Latin/Arabic |
| iso88597 | iso_8859-7:1987 | ISO 8859-7 Latin/Greek |
| iso88598 | iso_8859-8:1988 | ISO 8859-8 Latin/Hebrew |
| iso88599 | iso_8859-9:1989 | ISO 8859-9 Latin-5 Turkish |
| koi8 | <N/A> | KOI-8 Cyrillic |
| mac | macintosh | Standard Mac coding |
| mac_cyr | <N/A> | Macintosh Cyrillic |
| mac_ee | <N/A> | Macintosh Eastern European |
| macgrk2 | <N/A> | Macintosh Greek |
| macturk | <N/A> | Macintosh Turkish |
| roman8 | hp-roman8 | HP Roman-8 |
| sjis | shift_jis | Shift JIS (no extensions) |
| tis620 | <N/A> | TIS-620 Thai standard |
| turkish8 | <N/A> | HP Turkish-8 |
| utf8 | utf-8 | UTF-8 treated as a character set |
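
For applications that need to move between the two naming schemes, the rows of the table that have an IANA equivalent can be captured in a small lookup. The following Python sketch is one possible way to do that. The dictionary and function names are hypothetical, and labels marked <N/A> in the table are reported as `None`.

```python
# Character set labels from the table that have an IANA equivalent.
IANA_LABELS = {
    "cp932": "windows-31j",
    "eucjis": "euc-jp",
    "iso_1": "iso_8859-1:1987",
    "iso88592": "iso_8859-2:1987",
    "iso88595": "iso_8859-5:1988",
    "iso88596": "iso_8859-6:1987",
    "iso88597": "iso_8859-7:1987",
    "iso88598": "iso_8859-8:1988",
    "iso88599": "iso_8859-9:1989",
    "mac": "macintosh",
    "roman8": "hp-roman8",
    "sjis": "shift_jis",
    "utf8": "utf-8",
}


def iana_label(charset_label):
    """Return the IANA label for a character set label, or None if the
    table lists no equivalent. Matching is case-insensitive in this sketch."""
    return IANA_LABELS.get(charset_label.strip().lower())
```

A call such as `iana_label("sjis")` returns `"shift_jis"`, while `iana_label("cp1252")` returns `None` because the table lists no IANA label for it.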