• The multiple-level default system for character set assignment
• Syntax for specifying character sets and collations
• Affected functions and operations
• Unicode support
• The character sets and collations that are available, with notes
Character set issues affect not only data storage, but also communication between client programs and
the MySQL server. If you want the client program to communicate with the server using a character
set different from the default, you'll need to indicate which one. For example, to use the utf8 Unicode
character set, issue this statement after connecting to the server:
SET NAMES 'utf8';
For more information about configuring character sets for application use and character set-related
issues in client/server communication, see Section 10.1.5, “Configuring the Character Set and Collation
for Applications”, and Section 10.1.4, “Connection Character Sets and Collations”.
10.1.1. Character Sets and Collations in General
A character set is a set of symbols and encodings. A collation is a set of rules for comparing characters
in a character set. Let's make the distinction clear with an example of an imaginary character set.
Suppose that we have an alphabet with four letters: “A”, “B”, “a”, “b”. We give each letter a number:
“A” = 0, “B” = 1, “a” = 2, “b” = 3. The letter “A” is a symbol, the number 0 is the encoding for “A”, and the
combination of all four letters and their encodings is a character set.
Suppose that we want to compare two string values, “A” and “B”. The simplest way to do this is to look
at the encodings: 0 for “A” and 1 for “B”. Because 0 is less than 1, we say “A” is less than “B”. What
we've just done is apply a collation to our character set. The collation is a set of rules (only one rule in
this case): “compare the encodings.” We call this simplest of all possible collations a binary collation.
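The imaginary character set and its binary collation can be sketched in a few lines of Python. This is only an illustration of the idea, not how MySQL implements collations; the names ENCODING, encode, and binary_compare are made up for this example.

```python
# Toy character set from the text: each symbol is paired with its encoding.
ENCODING = {"A": 0, "B": 1, "a": 2, "b": 3}

def encode(text):
    """Return the list of encodings for a string in the toy character set."""
    return [ENCODING[ch] for ch in text]

def binary_compare(left, right):
    """Binary collation: compare the encodings and nothing else.

    Returns -1, 0, or 1 when left sorts before, equal to, or after right.
    """
    l, r = encode(left), encode(right)
    return (l > r) - (l < r)

print(binary_compare("A", "B"))  # -1, because 0 < 1
print(binary_compare("a", "A"))  # 1, because 2 > 0
```

Note that under this collation “a” and “A” are different values, and every uppercase letter sorts before every lowercase letter.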
But what if we want to say that the lowercase and uppercase letters are equivalent? Then we would
have at least two rules: (1) treat the lowercase letters “a” and “b” as equivalent to “A” and “B”; (2) then
compare the encodings. We call this a case-insensitive collation. It is a little more complex than a
binary collation.
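The two rules can be sketched as follows, again using the imaginary four-letter character set. The FOLD table and the function names are assumptions made for this illustration and do not correspond to anything in MySQL.

```python
# Toy character set from the text.
ENCODING = {"A": 0, "B": 1, "a": 2, "b": 3}

# Rule 1: treat each lowercase letter as equivalent to its uppercase form.
FOLD = {"a": "A", "b": "B"}

def case_insensitive_key(text):
    """Apply rule 1 (fold case), then rule 2 (take the encodings)."""
    return [ENCODING[FOLD.get(ch, ch)] for ch in text]

def case_insensitive_compare(left, right):
    """Return -1, 0, or 1 under the case-insensitive collation."""
    l, r = case_insensitive_key(left), case_insensitive_key(right)
    return (l > r) - (l < r)

print(case_insensitive_compare("a", "A"))  # 0: the two letters are equivalent
print(case_insensitive_compare("a", "B"))  # -1: "a" is treated as "A", and 0 < 1
```

The character set itself is unchanged; only the comparison rules differ from the binary collation.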
In real life, most character sets have many characters: not just “A” and “B” but whole alphabets,
sometimes multiple alphabets or eastern writing systems with thousands of characters, along with
many special symbols and punctuation marks. Also in real life, most collations have many rules, not
just for whether to distinguish lettercase, but also for whether to distinguish accents (an “accent” is a
mark attached to a character as in German “Ö”), and for multiple-character mappings (such as the rule
that “Ö” = “OE” in one of the two German collations).
MySQL can do these things for you:
• Store strings using a variety of character sets
• Compare strings using a variety of collations
• Mix strings with different character sets or collations in the same server, the same database, or even
the same table
• Enable specification of character set and collation at any level
In these respects, MySQL is far ahead of most other database management systems. However, to use
these features effectively, you need to know what character sets and collations are available, how to
change the defaults, and how they affect the behavior of string operators and functions.