Character Sets and Collations That MySQL Supports
• The sjis character set does not support the conversion of these extension characters.
• There are several conversion rules from so-called “SHIFT JIS” to Unicode, and some characters
are converted to Unicode differently depending on the conversion rule. MySQL supports only one of
these rules (described later).
The MySQL cp932 character set is designed to solve these problems. It is available as of MySQL
5.0.3.
MySQL supports character set conversion, and IANA Shift_JIS and cp932 provide different conversion
rules, so it is important to treat them as two separate character sets.
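As a rough illustration of why the two are kept apart, the sketch below uses Python's built-in
shift_jis and cp932 codecs as stand-ins for sjis and cp932. That substitution is an assumption made
for illustration only: Python's codecs are not MySQL's conversion tables, and the helper names are
hypothetical. It shows one double-byte code that converts to different Unicode values under the two
rules, and one NEC special character that only cp932 can represent.

```python
# Assumption: Python's "shift_jis" and "cp932" codecs stand in for MySQL's
# sjis and cp932 character sets. They are not MySQL's own conversion tables.

WAVE_DASH_BYTES = b"\x81\x60"  # the same double-byte code in both encodings
CIRCLED_ONE = "\u2460"         # CIRCLED DIGIT ONE, an NEC special character


def to_unicode_hex(data: bytes, encoding: str) -> str:
    """Decode the bytes and return each code point as four hex digits."""
    return " ".join(f"{ord(ch):04X}" for ch in data.decode(encoding))


def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Same input bytes, different Unicode result depending on the rule.
sjis_result = to_unicode_hex(WAVE_DASH_BYTES, "shift_jis")
cp932_result = to_unicode_hex(WAVE_DASH_BYTES, "cp932")
print(sjis_result, cp932_result)

# The extension character converts only under cp932.
print(can_encode(CIRCLED_ONE, "shift_jis"), can_encode(CIRCLED_ONE, "cp932"))
```

If both encodings were merged into one character set, one of the two results for the same bytes
would have to be given up.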
How does cp932 differ from sjis?
The cp932 character set differs from sjis in the following ways:
• cp932
supports NEC special characters, NEC selected—IBM extended characters, and IBM
selected characters.
• Some cp932 characters have two different code points, both of which convert to the same Unicode
code point. When converting from Unicode back to cp932, one of the code points must be
selected. For this “round trip conversion,” the rule recommended by Microsoft is used. (See
http://support.microsoft.com/kb/170559/EN-US/.)
The conversion rule works like this:
• If the character is in both JIS X 0208 and NEC special characters, use the code point of JIS X
0208.
• If the character is in both NEC special characters and IBM selected characters, use the code point
of NEC special characters.
• If the character is in both IBM selected characters and NEC selected—IBM extended characters,
use the code point of IBM extended characters.
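The first of these rules can be observed with a small sketch. It again assumes that Python's
built-in cp932 codec is an acceptable stand-in for MySQL's cp932 (it is not MySQL's own
implementation), and it uses the BECAUSE sign, which has one code point in JIS X 0208 and another
among the NEC special characters.

```python
# Assumption: Python's "cp932" codec stands in for MySQL's cp932 character set.

JIS_X_0208_CODE = b"\x81\xe6"   # BECAUSE sign as encoded in JIS X 0208
NEC_SPECIAL_CODE = b"\x87\x9a"  # the same character among the NEC special characters

# Both code points convert to the same Unicode code point.
char_from_jis = JIS_X_0208_CODE.decode("cp932")
char_from_nec = NEC_SPECIAL_CODE.decode("cp932")
same_character = char_from_jis == char_from_nec

# Converting back must select one code point. Per the first rule,
# the JIS X 0208 code point is chosen over the NEC special one.
round_trip = char_from_nec.encode("cp932")

print(f"U+{ord(char_from_jis):04X}", same_character, round_trip.hex().upper())
```

A value stored with the NEC special code point therefore does not come back byte-for-byte
identical after a trip through Unicode, although it still represents the same character.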
The table shown at http://www.microsoft.com/globaldev/reference/dbcs/932.htm provides information
about the Unicode values of cp932 characters. For cp932 table entries with characters under which
a four-digit number appears, the number represents the corresponding Unicode (ucs2) encoding.
For table entries in which an underlined two-digit value appears, there is a range of cp932
character values that begin with those two digits. Clicking such a table entry takes you to a page
that displays the Unicode value for each of the cp932 characters that begin with those digits.
The following links are of special interest. They correspond to the encodings for the following sets of
characters:
• NEC special characters:
http://www.microsoft.com/globaldev/reference/dbcs/932/932_87.htm
• NEC selected—IBM extended characters:
http://www.microsoft.com/globaldev/reference/dbcs/932/932_ED.htm
http://www.microsoft.com/globaldev/reference/dbcs/932/932_EE.htm
• IBM selected characters:
http://www.microsoft.com/globaldev/reference/dbcs/932/932_FA.htm
http://www.microsoft.com/globaldev/reference/dbcs/932/932_FB.htm
http://www.microsoft.com/globaldev/reference/dbcs/932/932_FC.htm
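Assuming that the two hexadecimal digits in each page name above (87, ED, EE, FA, FB, FC) are the
first byte of the double-byte codes listed on that page, a code can be assigned to its extension
set by looking at that byte alone. The function and table names below are hypothetical and serve
only as a sketch of that lookup.

```python
from typing import Optional

# Assumption: the first byte of a double-byte code identifies its extension
# set, using the byte values that appear in the page names listed above.
EXTENSION_SETS = {
    0x87: "NEC special",
    0xED: "NEC selected IBM extended",
    0xEE: "NEC selected IBM extended",
    0xFA: "IBM selected",
    0xFB: "IBM selected",
    0xFC: "IBM selected",
}


def extension_set(code: bytes) -> Optional[str]:
    """Return the extension set of a double-byte cp932 code, or None."""
    if len(code) != 2:
        raise ValueError("expected a double-byte cp932 code")
    return EXTENSION_SETS.get(code[0])


print(extension_set(b"\x87\x40"))  # first byte 87
print(extension_set(b"\xfb\x40"))  # first byte FB
print(extension_set(b"\x82\xa0"))  # first byte outside the extension ranges
```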
• Starting from version 5.0.3, cp932 supports conversion of user-defined characters in combination
with eucjpms, and solves the problems with sjis/ujis conversion. For details, please refer to
http://www.opengroup.or.jp/jvc/cde/sjis-euc-e.html.