Cnvtool Source File

The source file is a text file that contains a table of mappings between Unicode and a foreign character set. The table consists of tab-separated columns, which cnvtool uses to create a Charconv converter.

The file is case-insensitive. Comments begin with a # and extend to the end of the line. Blank lines and leading and trailing whitespace are ignored.



The first column lists the foreign character code and the second lists the corresponding Unicode character code. Both codes are in hexadecimal. The third column is optional and contains comments prefixed with a comment sign # to make the file more readable.

0x3E    0x003E    #GREATER-THAN SIGN
0x3F    0x003F    #QUESTION MARK
0x40    0x0040    #COMMERCIAL AT
0x41    0x0041    #LATIN CAPITAL LETTER A
0x42    0x0042    #LATIN CAPITAL LETTER B
0x43    0x0043    #LATIN CAPITAL LETTER C
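The rules above (comments, ignored whitespace, two hexadecimal columns) can be sketched in a few lines of Python. This is only an illustration of the file format, not cnvtool's own implementation; the function name parse_mapping_line is hypothetical, and the sketch assumes that the foreign code and the Unicode code are the first two columns and that splitting on any whitespace is acceptable.

```python
def parse_mapping_line(line):
    """Return (foreign_code, unicode_code) for a data line.

    Returns None for a blank line or a line that holds only a comment.
    Hypothetical helper for illustration; not part of cnvtool.
    """
    # A comment begins with '#' and extends to the end of the line.
    content = line.split("#", 1)[0].strip()
    if not content:
        return None
    # Assumption: the first two columns are the foreign and Unicode codes.
    columns = content.split()
    # int(..., 16) accepts an optional 0x prefix and either letter case.
    return int(columns[0], 16), int(columns[1], 16)


table = [
    "0x3E    0x003E    #GREATER-THAN SIGN",
    "",
    "   # a comment-only line",
    "0x41    0x0041    #LATIN CAPITAL LETTER A",
]
mappings = [pair for pair in map(parse_mapping_line, table) if pair is not None]
print(mappings)
```

Blank and comment-only lines yield None and are filtered out, so only the two data lines contribute mappings.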

Note: The table can contain other hexadecimal columns. In such cases, the columns that hold the foreign character code and the corresponding Unicode character code must be specified using the -columns option of cnvtool.


In some cases, the foreign character codes that appear in the source file need to be processed before they are used in the binary output file. You can specify how they must be processed by including a SET_FOREIGN_CHARACTER_CODE_PROCESSING_CODE command line in the source file as follows:

SET_FOREIGN_CHARACTER_CODE_PROCESSING_CODE return $foreignCharacterCode|0x00008080;

0x2121    0x3000    # IDEOGRAPHIC SPACE
0x2122    0x3001    # IDEOGRAPHIC COMMA
0x2123    0x3002    # IDEOGRAPHIC FULL STOP

All of the character codes following the command line are processed using the Perl code that it specifies. To stop processing, use the SET_FOREIGN_CHARACTER_CODE_PROCESSING_CODE command with no parameter.
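The on/off behaviour of the command can be modelled with a short Python sketch. It is a simulation under stated assumptions, not how cnvtool is implemented: the function convert is hypothetical, the Perl code is not executed but replaced by a Python callable passed in by the caller, and the SPACE and TILDE lines in the sample are made up to show unprocessed entries before and after the command.

```python
COMMAND = "SET_FOREIGN_CHARACTER_CODE_PROCESSING_CODE"


def convert(lines, processing):
    """Return (foreign_code, unicode_code) pairs for the given source lines.

    `processing` is a Python stand-in for the Perl code given on the command
    line; it is applied only while the command is in effect.
    """
    active = False
    result = []
    for line in lines:
        content = line.split("#", 1)[0].strip()
        if not content:
            continue
        if content.upper().startswith(COMMAND):
            # With a parameter the command starts processing;
            # with no parameter it stops processing.
            active = bool(content[len(COMMAND):].strip())
            continue
        foreign, unicode_code = (int(col, 16) for col in content.split()[:2])
        if active:
            foreign = processing(foreign)
        result.append((foreign, unicode_code))
    return result


sample = [
    "0x20    0x0020    # SPACE (illustrative line, not processed)",
    "SET_FOREIGN_CHARACTER_CODE_PROCESSING_CODE return $foreignCharacterCode|0x00008080;",
    "0x2121    0x3000    # IDEOGRAPHIC SPACE",
    "0x2122    0x3001    # IDEOGRAPHIC COMMA",
    "SET_FOREIGN_CHARACTER_CODE_PROCESSING_CODE",
    "0x7E    0x007E    # TILDE (illustrative line, not processed)",
]
pairs = convert(sample, lambda code: code | 0x00008080)
for foreign, unicode_code in pairs:
    print(f"0x{foreign:04X} -> 0x{unicode_code:04X}")
```

With the mask 0x00008080, the foreign code 0x2121 becomes 0xA1A1 and 0x2122 becomes 0xA1A2, while the lines outside the command's range keep their original codes.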

$foreignCharacterCode variable

The $foreignCharacterCode variable holds the foreign character code read from the source file (the first column). For example, if the high bit of each foreign character code is off in the source file but must be on in the output file, the Perl code (assuming the foreign character set uses only one byte for each character) is:

return $foreignCharacterCode|0x80;
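The effect of this expression on single-byte codes can be checked with a small Python equivalent. The function name set_high_bit is hypothetical, and the input values are taken from the first column of the sample table earlier in this document.

```python
def set_high_bit(foreign_code):
    """Python equivalent of the Perl expression $foreignCharacterCode|0x80."""
    return foreign_code | 0x80


for code in (0x3E, 0x3F, 0x40):
    print(f"0x{code:02X} -> 0x{set_high_bit(code):02X}")
```

A bitwise OR with 0x80 leaves the low seven bits unchanged, so 0x3E becomes 0xBE, 0x3F becomes 0xBF and 0x40 becomes 0xC0.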