On Thu, Nov 04, 2004 at 08:05:03PM +0100, Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum wrote:
Saying that a "string is UTF-16 encoded" implies that characters outside the Basic Multilingual Plane are encoded using surrogates (pairs of two 16-bit values).
"In computing, UTF-16 is a 16-bit Unicode Transformation Format, a character encoding form that provides a way to represent a series of abstract characters from Unicode and ISO/IEC 10646 as a series of 16-bit words suitable for storage or transmission via data networks."
And from the RFC: "In the UTF-16 encoding, characters are represented using either one or two unsigned 16-bit integers, depending on the character value." (http://www.ietf.org/rfc/rfc2781.txt)
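To make the "one or two unsigned 16-bit integers" wording concrete, here is a minimal sketch of the RFC 2781 encoding rule. It is written in Python rather than Pike purely for illustration, and the function name utf16_units is hypothetical, not part of any library:

```python
def utf16_units(cp):
    """Return the UTF-16 code units (16-bit integers) for one code point.

    Illustrative helper only; the name is an assumption, not a Pike or
    Python library function.
    """
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("code point out of Unicode range")
    if 0xD800 <= cp <= 0xDFFF:
        raise ValueError("surrogate values are not valid characters")
    if cp < 0x10000:
        # Inside the Basic Multilingual Plane: a single 16-bit unit.
        return [cp]
    # Outside the BMP: split the 20-bit offset into a surrogate pair.
    offset = cp - 0x10000
    high = 0xD800 | (offset >> 10)    # top 10 bits
    low = 0xDC00 | (offset & 0x3FF)   # bottom 10 bits
    return [high, low]


print([hex(u) for u in utf16_units(0x41)])      # one unit
print([hex(u) for u in utf16_units(0x1D11E)])   # two units (surrogate pair)
```

So a character outside the BMP occupies two 16-bit values, even though each individual unit is still 16 bits wide.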
As I said, UTF-16 implies 16-bit-wide characters and hence 16-bit-wide strings in Pike, which is why I use this term.
Regards, /Al