mariadb/sql/share/charsets/README
unknown db3b3c1799 Associate a charset directly with its number in the Index file, and
propogate those changes through the code.  This is so that there can
be holes in the list of charsets without breaking old tables.


configure.in:
  - changed pattern for getting number from charsets Index file
mysys/charset.c:
  - changed from using a TYPELIB to a CS_ID struct, so both the name
    and the number of a charset is stored in available_charsets
sql/share/charsets/Index:
  - made the number a real part of the Index file, not just a comment
sql/share/charsets/README:
  - order is no longer significant, but each charset must be paired
    with its number
2000-08-22 16:08:34 -04:00

40 lines
1.7 KiB
Text

This directory holds configuration files which allow MySQL to work with
different character sets. It contains:
*.conf
Each conf file contains four tables which describe character types,
lower- and upper-case equivalencies and sorting orders for the
character values in the set.
Index
The Index file lists all of the available charset configurations.
Each charset is paired with a number. The number is stored
IN THE DATABASE TABLE FILES and must not be changed. Always
add new character sets to the end of the list, so that the
numbers of the other character sets will not be changed.
Compiled in or configuration file?
When should a character set be compiled in to MySQL's string library
(libmystrings), and when should it be placed in a configuration
file?
If the character set requires the strcoll functions or is a
multi-byte character set, it MUST be compiled in to the string
library. If it does not require these functions, it should be
placed in a configuration file.
If the character set uses any one of the strcoll functions, it
must define all of them. Likewise, if the set uses one of the
multi-byte functions, it must define them all. See the manual for
more information on how to add a complex character set to MySQL.
Syntax of configuration files
The syntax is very simple. Comments start with a '#' character and
proceed to the end of the line. Words are separated by arbitrary
amounts of whitespace.
For the character set configuration files, every word must be a
number in hexadecimal format. The ctype array takes up the first
257 words; the to_lower, to_upper and sort_order arrays take up 256
words each after that.