Unicode support

This section describes the Unicode support in STDLIB.

The Unicode changes to STDLIB fall into two groups:

  • Impact on the existing char* interfaces.

  • Addition of new wchar_t* functions which use Unicode directly.

Impact on existing interfaces

The Symbian platform is a Unicode-based operating system. Hence, all operating system services which use text require that text to be presented in the 16-bit Unicode character encoding known as UCS-2. For STDLIB on the Symbian platform, this is most significant in dealing with the names of files and directories, all of which are now Unicode sequences.

To minimise the impact of this change on existing narrow C code, the STDLIB has adopted the policy that all such names in char* interfaces will be interpreted using the UTF-8 standard, which encodes Unicode strings as sequences of 8-bit bytes. UTF-8 is a no-surprises encoding and matches the 7-bit ASCII encoding for character codes 0 to 127. Hence, existing string handling code that uses ASCII names will work without modification.
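The ASCII compatibility described above can be seen in a minimal sketch of a UTF-8 encoder for a single UCS-2 code unit. The function name ucs2_to_utf8 is hypothetical and is not part of STDLIB; it is shown only to illustrate the byte layout.

```c
#include <stddef.h>

/* Encode one UCS-2 code unit (U+0000..U+FFFF) as UTF-8.
 * Writes 1 to 3 bytes into out and returns the byte count.
 * Hypothetical helper for illustration only; it does not treat
 * surrogate values (0xD800..0xDFFF) specially. */
size_t ucs2_to_utf8(unsigned short c, unsigned char out[3])
{
    if (c < 0x80) {
        /* 7-bit ASCII: the byte is stored unchanged */
        out[0] = (unsigned char)c;
        return 1;
    }
    if (c < 0x800) {
        /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (c >> 6));
        out[1] = (unsigned char)(0x80 | (c & 0x3F));
        return 2;
    }
    /* 1110xxxx 10xxxxxx 10xxxxxx */
    out[0] = (unsigned char)(0xE0 | ((c >> 12) & 0x0F));
    out[1] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (c & 0x3F));
    return 3;
}
```

Code units below 0x80 take the first branch, so a name made only of ASCII characters has the same bytes in UTF-8 as in ASCII.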

New interfaces

The wchar_t type is defined and ISO-C standard wide character constants are supported. The wchar_t definition chosen is unsigned short to match the use of UCS-2, and a range of relevant functions now have wide character analogues which use wchar_t* in place of char*, for example:

FILE *    fopen( const char *_name, const char *_type );
FILE *    wfopen( const wchar_t *_name, const wchar_t *_type );

and

DIR *     opendir( const char * );
WDIR *    wopendir( const wchar_t * );

When such a pair of functions exists, the char* interface is implemented by converting the UTF-8 parameters to Unicode and calling the matching wchar_t* interface.
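The convert-then-delegate pattern might look roughly like the sketch below. It is not the STDLIB source: demo_open, demo_wopen, utf8_to_ucs2, ucs2_t, demo_last_len and DEMO_NAME_MAX are all hypothetical names, and demo_wopen is only a stand-in for a real wchar_t* interface such as wfopen.

```c
#include <stddef.h>

#define DEMO_NAME_MAX 256          /* assumed limit, for illustration only */

typedef unsigned short ucs2_t;     /* mirrors the 16-bit wchar_t choice */

/* Decode a NUL-terminated UTF-8 string into UCS-2 code units.
 * Returns the number of units written (excluding the terminator), or
 * (size_t)-1 on malformed input, a code point above U+FFFF, or lack of
 * space. Overlong encodings are not rejected in this sketch. */
size_t utf8_to_ucs2(const char *src, ucs2_t *dst, size_t cap)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t n = 0;

    while (*s != 0) {
        unsigned int c;
        int extra;                 /* continuation bytes still expected */

        if (*s < 0x80)                { c = *s;        extra = 0; }
        else if ((*s & 0xE0) == 0xC0) { c = *s & 0x1F; extra = 1; }
        else if ((*s & 0xF0) == 0xE0) { c = *s & 0x0F; extra = 2; }
        else return (size_t)-1;    /* stray byte, or 4-byte form beyond UCS-2 */
        s++;

        while (extra-- > 0) {
            if ((*s & 0xC0) != 0x80) return (size_t)-1;
            c = (c << 6) | (unsigned int)(*s++ & 0x3F);
        }
        if (n + 1 >= cap) return (size_t)-1;   /* keep room for the terminator */
        dst[n++] = (ucs2_t)c;
    }
    if (cap == 0) return (size_t)-1;
    dst[n] = 0;
    return n;
}

/* Records the length of the last name seen by the wide stand-in. */
size_t demo_last_len;

/* Stand-in for a wchar_t* interface: it only measures the name. */
int demo_wopen(const ucs2_t *name)
{
    size_t len = 0;
    while (name[len] != 0) len++;
    demo_last_len = len;
    return 0;
}

/* The narrow interface: convert the UTF-8 name, then delegate. */
int demo_open(const char *utf8_name)
{
    ucs2_t wide[DEMO_NAME_MAX];
    if (utf8_to_ucs2(utf8_name, wide, DEMO_NAME_MAX) == (size_t)-1)
        return -1;
    return demo_wopen(wide);
}
```

With this arrangement only the wide function needs to talk to the operating system, and the narrow function stays a thin wrapper around it.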

The mbtowc family of conversion functions is provided to convert between UTF-8 and Unicode, but there is no additional support for locales or other forms of multibyte encoding. To convert from encodings such as Shift-JIS, programmers are advised to use the CHARCONV conversion routines via C++ wrapper functions callable from C.
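As a rough illustration of the mbtowc family, the sketch below wraps the standard mbstowcs function. The wrapper name widen_name is hypothetical. On a hosted C implementation the multibyte encoding depends on the current locale, whereas the text above states that STDLIB always treats it as UTF-8, so only ASCII input is guaranteed to behave the same on both.

```c
#include <stddef.h>
#include <stdlib.h>

/* Convert a multibyte string to wide characters using mbstowcs.
 * Returns the number of wide characters written (excluding the
 * terminator), or (size_t)-1 if the input is invalid or the result,
 * including its terminator, does not fit in cap elements. */
size_t widen_name(const char *narrow, wchar_t *wide, size_t cap)
{
    size_t n;

    if (cap == 0)
        return (size_t)-1;

    n = mbstowcs(wide, narrow, cap);
    if (n == (size_t)-1)
        return (size_t)-1;         /* invalid multibyte sequence */
    if (n == cap)
        return (size_t)-1;         /* buffer filled, no terminator written */
    return n;
}
```

The extra check for n == cap is there because mbstowcs does not write a terminator when the converted string exactly fills the buffer.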

There are no implementations of wchar_t* versions of STDIO functions such as fputc.