MySQL 5.0 FAQ: MySQL Chinese, Japanese, and Korean Character Sets
| character_set_results | utf8 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /usr/local/mysql/share/mysql/charsets/ |
+--------------------------+----------------------------------------+
8 rows in set (0.01 sec)
Now stop the client, and then stop the server using mysqladmin. Then start the server again, but this
time tell it to skip the handshake like so:
mysqld --character-set-server=utf8 --skip-character-set-client-handshake
Start the client with utf8 once again as the default character set, then display the current settings:
mysql> SHOW VARIABLES LIKE 'char%';
+--------------------------+----------------------------------------+
| Variable_name | Value |
+--------------------------+----------------------------------------+
| character_set_client | latin1 |
| character_set_connection | latin1 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | latin1 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /usr/local/mysql/share/mysql/charsets/ |
+--------------------------+----------------------------------------+
8 rows in set (0.01 sec)
As you can see by comparing the differing results from SHOW VARIABLES, the server ignores the
client's initial settings if the --skip-character-set-client-handshake option is used.
B.11.12: Why do some LIKE and FULLTEXT searches with CJK characters fail?
There is a very simple problem with LIKE searches on BINARY and BLOB columns: we need to
know the end of a character. With multi-byte character sets, different characters might have different
octet lengths. For example, in utf8, A requires one byte but ペ requires three bytes, as shown here:
+-------------------------+---------------------------+
| OCTET_LENGTH(_utf8 'A') | OCTET_LENGTH(_utf8 'ペ') |
+-------------------------+---------------------------+
| 1 | 3 |
+-------------------------+---------------------------+
1 row in set (0.00 sec)
If we don't know where the first character ends, then we don't know where the second character
begins, in which case even very simple searches such as LIKE '_A%' fail. The solution is to use
a regular CJK character set in the first place, or to convert to a CJK character set before comparing.
This is one reason why MySQL cannot allow encodings of nonexistent characters. If it is not strict about
rejecting bad input, then it has no way of knowing where characters end.
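The boundary problem can be reproduced outside the server. The following Python sketch is only an illustration, not how MySQL implements LIKE: it assumes that a regular expression is an acceptable stand-in for the pattern '_A%', and the helper names are hypothetical. It matches the same value once as characters and once as raw bytes.

```python
import re

def byte_lengths(text, encoding="utf-8"):
    """Return (character, octet length) pairs for text in the given encoding."""
    return [(ch, len(ch.encode(encoding))) for ch in text]

def like_underscore_a_chars(text):
    """Character-wise stand-in for LIKE '_A%': '_' consumes one whole character."""
    return re.match(r".A", text, re.DOTALL) is not None

def like_underscore_a_bytes(data):
    """Byte-wise stand-in for LIKE '_A%': '_' consumes exactly one byte."""
    return re.match(rb".A", data, re.DOTALL) is not None

value = "ペA"
print(byte_lengths(value))                             # [('ペ', 3), ('A', 1)]
print(like_underscore_a_chars(value))                  # True
print(like_underscore_a_bytes(value.encode("utf-8")))  # False
```

In the byte-wise case the wildcard consumes only the first of the three octets of ペ, so the A that follows the character is never found at the expected position.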
For FULLTEXT searches, we need to know where words begin and end. With Western languages,
this is rarely a problem because most (if not all) of these use an easy-to-identify word boundary:
the space character. However, this is not usually the case with Asian writing. We could use arbitrary
halfway measures, like assuming that all Han characters represent words, or (for Japanese) depending
on changes from Katakana to Hiragana due to grammatical endings. However, the only sure solution
requires a comprehensive word list, which means that we would have to include a dictionary in the
server for each Asian language supported. This is simply not feasible.
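A small Python sketch makes the word-boundary point concrete. It is an illustration only and does not reflect the FULLTEXT parser; the helper name and sample sentences are assumptions chosen for the example.

```python
def naive_words(text):
    """Split on whitespace, the word boundary that works for most Western text."""
    return text.split()

english = "the quick brown fox"
japanese = "私は東京に住んでいます"  # "I live in Tokyo", written without spaces

print(naive_words(english))   # four separate words
print(naive_words(japanese))  # the whole sentence comes back as one "word"
```

Because the Japanese sentence contains no spaces, a space-based splitter treats it as a single token, so no individual word inside it can be indexed or matched on its own.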
B.11.13: How do I know whether character X is available in all character sets?
The majority of simplified Chinese and basic nonhalfwidth Japanese Kana characters appear in all
CJK character sets. This stored procedure accepts a UCS-2 Unicode character, converts it to all other
character sets, and displays the results in hexadecimal.
DELIMITER //
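The same idea can be sketched outside the server in Python. This is not the stored procedure: the codec list below uses Python codec names as rough stand-ins for MySQL character sets, which is an assumption, and the two implementations' mapping tables are not guaranteed to agree for every character. The function name is hypothetical.

```python
# Python codec names used as approximate stand-ins for MySQL character sets
# (an assumption made for this sketch).
CODECS = ["big5", "cp932", "euc_jp", "euc_kr", "gb2312", "gbk",
          "shift_jis", "utf_8", "utf_16_be"]

def hex_in_all_codecs(ch):
    """Map each codec name to the character's hex encoding, or None if unavailable."""
    result = {}
    for codec in CODECS:
        try:
            result[codec] = ch.encode(codec).hex().upper()
        except UnicodeEncodeError:
            # The character has no representation in this codec.
            result[codec] = None
    return result

for codec, value in hex_in_all_codecs("ペ").items():
    print(f"{codec:10} {value}")
```

A None entry marks a character set in which the character is not available, which is the case the question above is concerned with.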