Most characters are represented by code points in the range [0x20..0xFFFF] and fit in a single 16-bit value. A surrogate pair is a pair of 16-bit values that together represent a character in the range [0x010000..0x10FFFF]. The first (high) half of the pair is in the range [0xD800..0xDBFF], and the second (low) half is in the range [0xDC00..0xDFFF]. Such a pair (H, L) represents the character computed as follows (hex arithmetic):
0x10000 + (H - 0xD800) * 0x400 + (L - 0xDC00)
For example, the character “𝛑” (U+1D6D1, the mathematical bold lowercase pi) is represented by the surrogate pair D835, DED1:
select convert(unitext, u&'\+1d6d1')
---------------------
0xd835ded1
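The surrogate arithmetic can be checked outside the server. The following is a minimal Python sketch, not part of the product; the function names `combine_surrogates` and `split_code_point` are hypothetical and exist only for illustration. It converts in both directions and assumes the standard UTF-16 rule that supplementary code points are offset by 0x10000.

```python
# Illustrative helpers only; these names are not part of any Sybase API.

def combine_surrogates(high, low):
    """Return the code point represented by the surrogate pair (high, low)."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    # Each half carries 10 bits; supplementary characters start at 0x10000.
    return 0x10000 + (high - 0xD800) * 0x400 + (low - 0xDC00)


def split_code_point(code_point):
    """Return the (high, low) surrogate pair for a supplementary code point."""
    if not (0x10000 <= code_point <= 0x10FFFF):
        raise ValueError("code point does not need a surrogate pair")
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


if __name__ == "__main__":
    print(hex(combine_surrogates(0xD835, 0xDED1)))
    high, low = split_code_point(0x1D6D1)
    print(hex(high), hex(low))
    # Cross-check against Python's own UTF-16 encoder.
    print("\U0001D6D1".encode("utf-16-be").hex())
```

Combining D835 and DED1 yields 0x1D6D1, and the UTF-16 big-endian encoding of that code point is the same four bytes that the select statement returns.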
When you specify ncr=non_ascii or ncr=non_server to generate a SQLX XML document containing non-ASCII data with surrogate pair characters, the surrogate pairs appear as single NCR characters, not as pairs:
select convert(unitext, u&'\+1d6d1') for xml option 'ncr=non_ascii'
-------------------------------
<resultset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <row>
        <C1>&#x1d6d1;</C1>
    </row>
</resultset>
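To illustrate why one NCR is the expected result, here is a minimal Python sketch that replaces non-ASCII characters with hexadecimal NCRs. The helper `to_ncr` is hypothetical and is not how the server implements the ncr option; it assumes a hexadecimal NCR form and does not escape XML markup characters such as < or &. Because it works on code points rather than 16-bit units, a supplementary character produces one reference, not two.

```python
# Illustrative helper only; not the server's implementation of the ncr option.

def to_ncr(text):
    """Replace each non-ASCII character in text with a hexadecimal NCR."""
    parts = []
    for ch in text:  # Python iterates over code points, not UTF-16 units
        if ord(ch) < 0x80:
            parts.append(ch)
        else:
            parts.append("&#x{:x};".format(ord(ch)))
    return "".join(parts)


if __name__ == "__main__":
    print(to_ncr("\U0001D6D1"))
```

The character U+1D6D1 comes out as the single reference &#x1d6d1; rather than as separate references for D835 and DED1.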
Copyright © 2005. Sybase Inc. All rights reserved.