+----------+-----------------------------------------------------------------+
| test | CREATE DATABASE `test` /*!40100 DEFAULT CHARACTER SET latin1 */ |
+----------+-----------------------------------------------------------------+
If no COLLATE clause is shown, the default collation for the character set applies.
SHOW CREATE TABLE is similar, but displays the CREATE TABLE statement to create a given table. The column definitions indicate any character set specifications, and the table options include character set information.
The SHOW COLUMNS statement displays the collations of a table's columns when invoked as SHOW FULL COLUMNS. Columns with CHAR, VARCHAR, or TEXT data types have collations. Numeric and other noncharacter types have no collation (indicated by NULL as the Collation value). For example:
mysql> SHOW FULL COLUMNS FROM person\G
*************************** 1. row ***************************
Field: id
Type: smallint(5) unsigned
Collation: NULL
Null: NO
Key: PRI
Default: NULL
Extra: auto_increment
Privileges: select,insert,update,references
Comment:
*************************** 2. row ***************************
Field: name
Type: char(60)
Collation: latin1_swedish_ci
Null: NO
Key:
Default:
Extra:
Privileges: select,insert,update,references
Comment:
The character set is not part of the display but is implied by the collation name.
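The naming relationship described above can be sketched on the client side. This is a minimal illustration, not part of MySQL: the helper name charset_of is hypothetical, and it assumes that a collation name begins with its character set name followed by an underscore, as in the collations shown in this section.

```python
def charset_of(collation):
    """Return the character set name implied by a collation name.

    Assumption: the collation name starts with the character set name,
    followed by an underscore (for example, latin1_swedish_ci).
    """
    return collation.split("_", 1)[0]


# Collation names taken from this section of the manual.
for name in ("latin1_swedish_ci", "ucs2_danish_ci", "utf8_danish_ci"):
    print(name, "->", charset_of(name))
```

For the person table above, the name column's collation latin1_swedish_ci yields latin1 under this assumption.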
10.1.10. Unicode Support
MySQL 5.0 supports two character sets for storing Unicode data:
• ucs2, the UCS-2 encoding of the Unicode character set using 16 bits per character
• utf8, a UTF-8 encoding of the Unicode character set using one to three bytes per character
These two character sets support the characters from the Basic Multilingual Plane (BMP) of Unicode Version 3.0. BMP characters have these characteristics:
• Their code values are between 0 and 65535 (or U+0000 .. U+FFFF)
• They can be encoded with a fixed 16-bit word, as in ucs2
• They can be encoded with 8, 16, or 24 bits, as in utf8
• They are sufficient for almost all characters in major languages
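The storage sizes in the list above can be checked outside the server. The following is a rough sketch in Python rather than MySQL itself: the helper encoded_sizes is hypothetical, and it uses Python's utf-16-be codec as a stand-in for UCS-2, which gives the same bytes for BMP characters.

```python
def encoded_sizes(ch):
    """Return (utf8_bytes, ucs2_bytes) for a single BMP character."""
    utf8_len = len(ch.encode("utf-8"))
    # Assumption: for BMP characters, UTF-16 (big-endian) and UCS-2
    # produce the same two-byte encoding.
    ucs2_len = len(ch.encode("utf-16-be"))
    return utf8_len, ucs2_len


# One sample character for each UTF-8 length available to BMP characters.
for ch in ("A", "\u00e9", "\u20ac"):
    utf8_len, ucs2_len = encoded_sizes(ch)
    print(f"U+{ord(ch):04X}: utf8={utf8_len} byte(s), ucs2={ucs2_len} bytes")
```

The three samples (U+0041, U+00E9, U+20AC) take one, two, and three bytes in UTF-8 respectively, and two bytes each in the fixed-width encoding.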
The ucs2 and utf8 character sets do not support supplementary characters that lie outside the BMP. Characters outside the BMP compare as REPLACEMENT CHARACTER and convert to '?' when converted to a Unicode character set.
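Because supplementary characters are not preserved, an application may want to detect them before sending text to the server. The following is a minimal client-side sketch under that assumption; the helper name supplementary_chars is hypothetical and is not a MySQL function.

```python
def supplementary_chars(text):
    """Return the characters in text that lie outside the BMP."""
    # BMP code values run from U+0000 to U+FFFF; anything above is supplementary.
    return [ch for ch in text if ord(ch) > 0xFFFF]


sample = "caf\u00e9 \U0001D11E"  # U+1D11E is a supplementary character
for ch in supplementary_chars(sample):
    # Supplementary characters need four bytes in UTF-8, more than the
    # three-byte maximum of the utf8 character set described above.
    print(f"U+{ord(ch):04X} needs {len(ch.encode('utf-8'))} bytes in UTF-8")
```

A check like this lets the application reject or substitute such characters itself instead of relying on the conversion to '?'.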
A similar set of collations is available for each Unicode character set. For example, each has a Danish collation, the names of which are ucs2_danish_ci and utf8_danish_ci. All Unicode collations are listed at Section 10.1.13.1, “Unicode Character Sets”.