MySQL字符集学习

MySQL
304
0
0
2023-05-02
  1. 将字符映射成二进制数据的过程叫编码,将二进制数据映射到字符的过程叫做解码
  2. ASCII字符集: 有128个字符。包括空格/标点符号/数字/大小写字母和不可见字符。

img

  1. ISO 8859-1 字符集合:有256个字符,在ASCII字符集基础上扩展了128个西欧常用字符(包括德法字符)。它可以使用一个字节来进行编码(它的别名称叫Latin1)
  2. GB2312字符集:包括汉子和拉丁字母/希腊字母/日文/俄文等。如果字符集包含在ASCII字符集中,则采用一个字节编码,否则采用两个字没编码。
  3. GBK字符集:对GB2312字符集进行了扩充。编码方式兼容GB2312.
  4. UTF-8字符集:收录了当今世界各个国家地区使用的字符,并且还在扩充。它兼容ASCII字符集。采用变长编码方式,编码一个字符时需要使用1到4字节。
  5. mysql 不区分字符集和编码方案的概念。
  6. mysql utf8mb3: "阉割"过的utf-8字符集,只使用1-3个字节表示字符。
  7. mysql utf8mb4: 正宗的utf-8字符集,使用1-4个字节表示字符。
  8. mysql 中utf8是 utf8mb3的别名。
  9. mysql 中如果要存放表情,则使用utf8mb4.
mysql> show charset
    -> ;
+----------+---------------------------------+---------------------+--------+
| Charset  | Description                     | Default collation   | Maxlen |
+----------+---------------------------------+---------------------+--------+
| armscii8 | ARMSCII-8 Armenian              | armscii8_general_ci |      1 |
| ascii    | US ASCII                        | ascii_general_ci    |      1 |
| big5     | Big5 Traditional Chinese        | big5_chinese_ci     |      2 |
| binary   | Binary pseudo charset           | binary              |      1 |
| cp1250   | Windows Central European        | cp1250_general_ci   |      1 |
| cp1251   | Windows Cyrillic                | cp1251_general_ci   |      1 |
| cp1256   | Windows Arabic                  | cp1256_general_ci   |      1 |
| cp1257   | Windows Baltic                  | cp1257_general_ci   |      1 |
| cp850    | DOS West European               | cp850_general_ci    |      1 |
| cp852    | DOS Central European            | cp852_general_ci    |      1 |
| cp866    | DOS Russian                     | cp866_general_ci    |      1 |
| cp932    | SJIS for Windows Japanese       | cp932_japanese_ci   |      2 |
| dec8     | DEC West European               | dec8_swedish_ci     |      1 |
| eucjpms  | UJIS for Windows Japanese       | eucjpms_japanese_ci |      3 |
| euckr    | EUC-KR Korean                   | euckr_korean_ci     |      2 |
| gb18030  | China National Standard GB18030 | gb18030_chinese_ci  |      4 |
| gb2312   | GB2312 Simplified Chinese       | gb2312_chinese_ci   |      2 |
| gbk      | GBK Simplified Chinese          | gbk_chinese_ci      |      2 |
| geostd8  | GEOSTD8 Georgian                | geostd8_general_ci  |      1 |
| greek    | ISO 8859-7 Greek                | greek_general_ci    |      1 |
| hebrew   | ISO 8859-8 Hebrew               | hebrew_general_ci   |      1 |
| hp8      | HP West European                | hp8_english_ci      |      1 |
| keybcs2  | DOS Kamenicky Czech-Slovak      | keybcs2_general_ci  |      1 |
| koi8r    | KOI8-R Relcom Russian           | koi8r_general_ci    |      1 |
| koi8u    | KOI8-U Ukrainian                | koi8u_general_ci    |      1 |
| latin1   | cp1252 West European            | latin1_swedish_ci   |      1 |
| latin2   | ISO 8859-2 Central European     | latin2_general_ci   |      1 |
| latin5   | ISO 8859-9 Turkish              | latin5_turkish_ci   |      1 |
| latin7   | ISO 8859-13 Baltic              | latin7_general_ci   |      1 |
| macce    | Mac Central European            | macce_general_ci    |      1 |
| macroman | Mac West European               | macroman_general_ci |      1 |
| sjis     | Shift-JIS Japanese              | sjis_japanese_ci    |      2 |
| swe7     | 7bit Swedish                    | swe7_swedish_ci     |      1 |
| tis620   | TIS620 Thai                     | tis620_thai_ci      |      1 |
| ucs2     | UCS-2 Unicode                   | ucs2_general_ci     |      2 |
| ujis     | EUC-JP Japanese                 | ujis_japanese_ci    |      3 |
| utf16    | UTF-16 Unicode                  | utf16_general_ci    |      4 |
| utf16le  | UTF-16LE Unicode                | utf16le_general_ci  |      4 |
| utf32    | UTF-32 Unicode                  | utf32_general_ci    |      4 |
| utf8     | UTF-8 Unicode                   | utf8_general_ci     |      3 |
| utf8mb4  | UTF-8 Unicode                   | utf8mb4_0900_ai_ci  |      4 |
+----------+---------------------------------+---------------------+--------+

字符集比价规则

mysql> SHOW COLLATION;
+----------------------------+----------+-----+---------+----------+---------+---------------+
| Collation                  | Charset  | Id  | Default | Compiled | Sortlen | Pad_attribute |
+----------------------------+----------+-----+---------+----------+---------+---------------+
| armscii8_bin               | armscii8 |  64 |         | Yes      |       1 | PAD SPACE     |
| armscii8_general_ci        | armscii8 |  32 | Yes     | Yes      |       1 | PAD SPACE     |
| ascii_bin                  | ascii    |  65 |         | Yes      |       1 | PAD SPACE     |
| ascii_general_ci           | ascii    |  11 | Yes     | Yes      |       1 | PAD SPACE     |
| big5_bin                   | big5     |  84 |         | Yes      |       1 | PAD SPACE     |
| big5_chinese_ci            | big5     |   1 | Yes     | Yes      |       1 | PAD SPACE     |
| binary                     | binary   |  63 | Yes     | Yes      |       1 | NO PAD        |
| cp1250_bin                 | cp1250   |  66 |         | Yes      |       1 | PAD SPACE     |
| cp1250_croatian_ci         | cp1250   |  44 |         | Yes      |       1 | PAD SPACE     |
| cp1250_czech_cs            | cp1250   |  34 |         | Yes      |       2 | PAD SPACE     |
| cp1250_general_ci          | cp1250   |  26 | Yes     | Yes      |       1 | PAD SPACE     |
| cp1250_polish_ci           | cp1250   |  99 |         | Yes      |       1 | PAD SPACE     |
| cp1251_bin                 | cp1251   |  50 |         | Yes      |       1 | PAD SPACE     |
| cp1251_bulgarian_ci        | cp1251   |  14 |         | Yes      |       1 | PAD SPACE     |
| cp1251_general_ci          | cp1251   |  51 | Yes     | Yes      |       1 | PAD SPACE     |
| cp1251_general_cs          | cp1251   |  52 |         | Yes      |       1 | PAD SPACE     |
| cp1251_ukrainian_ci        | cp1251   |  23 |         | Yes      |       1 | PAD SPACE     |
| cp1256_bin                 | cp1256   |  67 |         | Yes      |       1 | PAD SPACE     |
| cp1256_general_ci          | cp1256   |  57 | Yes     | Yes      |       1 | PAD SPACE     |
| cp1257_bin                 | cp1257   |  58 |         | Yes      |       1 | PAD SPACE     |
| cp1257_general_ci          | cp1257   |  59 | Yes     | Yes      |       1 | PAD SPACE     |
| cp1257_lithuanian_ci       | cp1257   |  29 |         | Yes      |       1 | PAD SPACE     |
| cp850_bin                  | cp850    |  80 |         | Yes      |       1 | PAD SPACE     |
| cp850_general_ci           | cp850    |   4 | Yes     | Yes      |       1 | PAD SPACE     |
| cp852_bin                  | cp852    |  81 |         | Yes      |       1 | PAD SPACE     |
| cp852_general_ci           | cp852    |  40 | Yes     | Yes      |       1 | PAD SPACE     |
| cp866_bin                  | cp866    |  68 |         | Yes      |       1 | PAD SPACE     |
| cp866_general_ci           | cp866    |  36 | Yes     | Yes      |       1 | PAD SPACE     |
| cp932_bin                  | cp932    |  96 |         | Yes      |       1 | PAD SPACE     |
| cp932_japanese_ci          | cp932    |  95 | Yes     | Yes      |       1 | PAD SPACE     |
| dec8_bin                   | dec8     |  69 |         | Yes      |       1 | PAD SPACE     |
| dec8_swedish_ci            | dec8     |   3 | Yes     | Yes      |       1 | PAD SPACE     |
| eucjpms_bin                | eucjpms  |  98 |         | Yes      |       1 | PAD SPACE     |
| eucjpms_japanese_ci        | eucjpms  |  97 | Yes     | Yes      |       1 | PAD SPACE     |
| euckr_bin                  | euckr    |  85 |         | Yes      |       1 | PAD SPACE     |
| euckr_korean_ci            | euckr    |  19 | Yes     | Yes      |       1 | PAD SPACE     |
| gb18030_bin                | gb18030  | 249 |         | Yes      |       1 | PAD SPACE     |
| gb18030_chinese_ci         | gb18030  | 248 | Yes     | Yes      |       2 | PAD SPACE     |
| gb18030_unicode_520_ci     | gb18030  | 250 |         | Yes      |       8 | PAD SPACE     |
| gb2312_bin                 | gb2312   |  86 |         | Yes      |       1 | PAD SPACE     |
| gb2312_chinese_ci          | gb2312   |  24 | Yes     | Yes      |       1 | PAD SPACE     |
| gbk_bin                    | gbk      |  87 |         | Yes      |       1 | PAD SPACE     |
| gbk_chinese_ci             | gbk      |  28 | Yes     | Yes      |       1 | PAD SPACE     |
| geostd8_bin                | geostd8  |  93 |         | Yes      |       1 | PAD SPACE     |
| geostd8_general_ci         | geostd8  |  92 | Yes     | Yes      |       1 | PAD SPACE     |
| greek_bin                  | greek    |  70 |         | Yes      |       1 | PAD SPACE     |
| greek_general_ci           | greek    |  25 | Yes     | Yes      |       1 | PAD SPACE     |
| hebrew_bin                 | hebrew   |  71 |         | Yes      |       1 | PAD SPACE     |
| hebrew_general_ci          | hebrew   |  16 | Yes     | Yes      |       1 | PAD SPACE     |
| hp8_bin                    | hp8      |  72 |         | Yes      |       1 | PAD SPACE     |
| hp8_english_ci             | hp8      |   6 | Yes     | Yes      |       1 | PAD SPACE     |
| keybcs2_bin                | keybcs2  |  73 |         | Yes      |       1 | PAD SPACE     |
| keybcs2_general_ci         | keybcs2  |  37 | Yes     | Yes      |       1 | PAD SPACE     |
| koi8r_bin                  | koi8r    |  74 |         | Yes      |       1 | PAD SPACE     |
| koi8r_general_ci           | koi8r    |   7 | Yes     | Yes      |       1 | PAD SPACE     |
| koi8u_bin                  | koi8u    |  75 |         | Yes      |       1 | PAD SPACE     |
| koi8u_general_ci           | koi8u    |  22 | Yes     | Yes      |       1 | PAD SPACE     |
| latin1_bin                 | latin1   |  47 |         | Yes      |       1 | PAD SPACE     |
| latin1_danish_ci           | latin1   |  15 |         | Yes      |       1 | PAD SPACE     |
| latin1_general_ci          | latin1   |  48 |         | Yes      |       1 | PAD SPACE     |
| latin1_general_cs          | latin1   |  49 |         | Yes      |       1 | PAD SPACE     |
| latin1_german1_ci          | latin1   |   5 |         | Yes      |       1 | PAD SPACE     |
| latin1_german2_ci          | latin1   |  31 |         | Yes      |       2 | PAD SPACE     |
| latin1_spanish_ci          | latin1   |  94 |         | Yes      |       1 | PAD SPACE     |
| latin1_swedish_ci          | latin1   |   8 | Yes     | Yes      |       1 | PAD SPACE     |
| latin2_bin                 | latin2   |  77 |         | Yes      |       1 | PAD SPACE     |
| latin2_croatian_ci         | latin2   |  27 |         | Yes      |       1 | PAD SPACE     |
| latin2_czech_cs            | latin2   |   2 |         | Yes      |       4 | PAD SPACE     |
| latin2_general_ci          | latin2   |   9 | Yes     | Yes      |       1 | PAD SPACE     |
| latin2_hungarian_ci        | latin2   |  21 |         | Yes      |       1 | PAD SPACE     |
| latin5_bin                 | latin5   |  78 |         | Yes      |       1 | PAD SPACE     |
| latin5_turkish_ci          | latin5   |  30 | Yes     | Yes      |       1 | PAD SPACE     |
| latin7_bin                 | latin7   |  79 |         | Yes      |       1 | PAD SPACE     |
| latin7_estonian_cs         | latin7   |  20 |         | Yes      |       1 | PAD SPACE     |
| latin7_general_ci          | latin7   |  41 | Yes     | Yes      |       1 | PAD SPACE     |
| latin7_general_cs          | latin7   |  42 |         | Yes      |       1 | PAD SPACE     |
| macce_bin                  | macce    |  43 |         | Yes      |       1 | PAD SPACE     |
| macce_general_ci           | macce    |  38 | Yes     | Yes      |       1 | PAD SPACE     |
| macroman_bin               | macroman |  53 |         | Yes      |       1 | PAD SPACE     |
| macroman_general_ci        | macroman |  39 | Yes     | Yes      |       1 | PAD SPACE     |
| sjis_bin                   | sjis     |  88 |         | Yes      |       1 | PAD SPACE     |
| sjis_japanese_ci           | sjis     |  13 | Yes     | Yes      |       1 | PAD SPACE     |
| swe7_bin                   | swe7     |  82 |         | Yes      |       1 | PAD SPACE     |
| swe7_swedish_ci            | swe7     |  10 | Yes     | Yes      |       1 | PAD SPACE     |
| tis620_bin                 | tis620   |  89 |         | Yes      |       1 | PAD SPACE     |
| tis620_thai_ci             | tis620   |  18 | Yes     | Yes      |       4 | PAD SPACE     |
| ucs2_bin                   | ucs2     |  90 |         | Yes      |       1 | PAD SPACE     |
| ucs2_croatian_ci           | ucs2     | 149 |         | Yes      |       8 | PAD SPACE     |
| ucs2_czech_ci              | ucs2     | 138 |         | Yes      |       8 | PAD SPACE     |
| ucs2_danish_ci             | ucs2     | 139 |         | Yes      |       8 | PAD SPACE     |
| ucs2_esperanto_ci          | ucs2     | 145 |         | Yes      |       8 | PAD SPACE     |
| ucs2_estonian_ci           | ucs2     | 134 |         | Yes      |       8 | PAD SPACE     |
| ucs2_general_ci            | ucs2     |  35 | Yes     | Yes      |       1 | PAD SPACE     |
| ucs2_general_mysql500_ci   | ucs2     | 159 |         | Yes      |       1 | PAD SPACE     |
| ucs2_german2_ci            | ucs2     | 148 |         | Yes      |       8 | PAD SPACE     |
| ucs2_hungarian_ci          | ucs2     | 146 |         | Yes      |       8 | PAD SPACE     |
| ucs2_icelandic_ci          | ucs2     | 129 |         | Yes      |       8 | PAD SPACE     |
| ucs2_latvian_ci            | ucs2     | 130 |         | Yes      |       8 | PAD SPACE     |
| ucs2_lithuanian_ci         | ucs2     | 140 |         | Yes      |       8 | PAD SPACE     |
| ucs2_persian_ci            | ucs2     | 144 |         | Yes      |       8 | PAD SPACE     |
| ucs2_polish_ci             | ucs2     | 133 |         | Yes      |       8 | PAD SPACE     |
| ucs2_romanian_ci           | ucs2     | 131 |         | Yes      |       8 | PAD SPACE     |
| ucs2_roman_ci              | ucs2     | 143 |         | Yes      |       8 | PAD SPACE     |
| ucs2_sinhala_ci            | ucs2     | 147 |         | Yes      |       8 | PAD SPACE     |
| ucs2_slovak_ci             | ucs2     | 141 |         | Yes      |       8 | PAD SPACE     |
| ucs2_slovenian_ci          | ucs2     | 132 |         | Yes      |       8 | PAD SPACE     |
| ucs2_spanish2_ci           | ucs2     | 142 |         | Yes      |       8 | PAD SPACE     |
| ucs2_spanish_ci            | ucs2     | 135 |         | Yes      |       8 | PAD SPACE     |
| ucs2_swedish_ci            | ucs2     | 136 |         | Yes      |       8 | PAD SPACE     |
| ucs2_turkish_ci            | ucs2     | 137 |         | Yes      |       8 | PAD SPACE     |
| ucs2_unicode_520_ci        | ucs2     | 150 |         | Yes      |       8 | PAD SPACE     |
| ucs2_unicode_ci            | ucs2     | 128 |         | Yes      |       8 | PAD SPACE     |
| ucs2_vietnamese_ci         | ucs2     | 151 |         | Yes      |       8 | PAD SPACE     |
| ujis_bin                   | ujis     |  91 |         | Yes      |       1 | PAD SPACE     |
| ujis_japanese_ci           | ujis     |  12 | Yes     | Yes      |       1 | PAD SPACE     |
| utf16le_bin                | utf16le  |  62 |         | Yes      |       1 | PAD SPACE     |
| utf16le_general_ci         | utf16le  |  56 | Yes     | Yes      |       1 | PAD SPACE     |
| utf16_bin                  | utf16    |  55 |         | Yes      |       1 | PAD SPACE     |
| utf16_croatian_ci          | utf16    | 122 |         | Yes      |       8 | PAD SPACE     |
| utf16_czech_ci             | utf16    | 111 |         | Yes      |       8 | PAD SPACE     |
| utf16_danish_ci            | utf16    | 112 |         | Yes      |       8 | PAD SPACE     |
| utf16_esperanto_ci         | utf16    | 118 |         | Yes      |       8 | PAD SPACE     |
| utf16_estonian_ci          | utf16    | 107 |         | Yes      |       8 | PAD SPACE     |
| utf16_general_ci           | utf16    |  54 | Yes     | Yes      |       1 | PAD SPACE     |
| utf16_german2_ci           | utf16    | 121 |         | Yes      |       8 | PAD SPACE     |
| utf16_hungarian_ci         | utf16    | 119 |         | Yes      |       8 | PAD SPACE     |
| utf16_icelandic_ci         | utf16    | 102 |         | Yes      |       8 | PAD SPACE     |
| utf16_latvian_ci           | utf16    | 103 |         | Yes      |       8 | PAD SPACE     |
| utf16_lithuanian_ci        | utf16    | 113 |         | Yes      |       8 | PAD SPACE     |
| utf16_persian_ci           | utf16    | 117 |         | Yes      |       8 | PAD SPACE     |
| utf16_polish_ci            | utf16    | 106 |         | Yes      |       8 | PAD SPACE     |
| utf16_romanian_ci          | utf16    | 104 |         | Yes      |       8 | PAD SPACE     |
| utf16_roman_ci             | utf16    | 116 |         | Yes      |       8 | PAD SPACE     |
| utf16_sinhala_ci           | utf16    | 120 |         | Yes      |       8 | PAD SPACE     |
| utf16_slovak_ci            | utf16    | 114 |         | Yes      |       8 | PAD SPACE     |
| utf16_slovenian_ci         | utf16    | 105 |         | Yes      |       8 | PAD SPACE     |
| utf16_spanish2_ci          | utf16    | 115 |         | Yes      |       8 | PAD SPACE     |
| utf16_spanish_ci           | utf16    | 108 |         | Yes      |       8 | PAD SPACE     |
| utf16_swedish_ci           | utf16    | 109 |         | Yes      |       8 | PAD SPACE     |
| utf16_turkish_ci           | utf16    | 110 |         | Yes      |       8 | PAD SPACE     |
| utf16_unicode_520_ci       | utf16    | 123 |         | Yes      |       8 | PAD SPACE     |
| utf16_unicode_ci           | utf16    | 101 |         | Yes      |       8 | PAD SPACE     |
| utf16_vietnamese_ci        | utf16    | 124 |         | Yes      |       8 | PAD SPACE     |
| utf32_bin                  | utf32    |  61 |         | Yes      |       1 | PAD SPACE     |
| utf32_croatian_ci          | utf32    | 181 |         | Yes      |       8 | PAD SPACE     |
| utf32_czech_ci             | utf32    | 170 |         | Yes      |       8 | PAD SPACE     |
| utf32_danish_ci            | utf32    | 171 |         | Yes      |       8 | PAD SPACE     |
| utf32_esperanto_ci         | utf32    | 177 |         | Yes      |       8 | PAD SPACE     |
| utf32_estonian_ci          | utf32    | 166 |         | Yes      |       8 | PAD SPACE     |
| utf32_general_ci           | utf32    |  60 | Yes     | Yes      |       1 | PAD SPACE     |
| utf32_german2_ci           | utf32    | 180 |         | Yes      |       8 | PAD SPACE     |
| utf32_hungarian_ci         | utf32    | 178 |         | Yes      |       8 | PAD SPACE     |
| utf32_icelandic_ci         | utf32    | 161 |         | Yes      |       8 | PAD SPACE     |
| utf32_latvian_ci           | utf32    | 162 |         | Yes      |       8 | PAD SPACE     |
| utf32_lithuanian_ci        | utf32    | 172 |         | Yes      |       8 | PAD SPACE     |
| utf32_persian_ci           | utf32    | 176 |         | Yes      |       8 | PAD SPACE     |
| utf32_polish_ci            | utf32    | 165 |         | Yes      |       8 | PAD SPACE     |
| utf32_romanian_ci          | utf32    | 163 |         | Yes      |       8 | PAD SPACE     |
| utf32_roman_ci             | utf32    | 175 |         | Yes      |       8 | PAD SPACE     |
| utf32_sinhala_ci           | utf32    | 179 |         | Yes      |       8 | PAD SPACE     |
| utf32_slovak_ci            | utf32    | 173 |         | Yes      |       8 | PAD SPACE     |
| utf32_slovenian_ci         | utf32    | 164 |         | Yes      |       8 | PAD SPACE     |
| utf32_spanish2_ci          | utf32    | 174 |         | Yes      |       8 | PAD SPACE     |
| utf32_spanish_ci           | utf32    | 167 |         | Yes      |       8 | PAD SPACE     |
| utf32_swedish_ci           | utf32    | 168 |         | Yes      |       8 | PAD SPACE     |
| utf32_turkish_ci           | utf32    | 169 |         | Yes      |       8 | PAD SPACE     |
| utf32_unicode_520_ci       | utf32    | 182 |         | Yes      |       8 | PAD SPACE     |
| utf32_unicode_ci           | utf32    | 160 |         | Yes      |       8 | PAD SPACE     |
| utf32_vietnamese_ci        | utf32    | 183 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_0900_ai_ci         | utf8mb4  | 255 | Yes     | Yes      |       0 | NO PAD        |
| utf8mb4_0900_as_ci         | utf8mb4  | 305 |         | Yes      |       0 | NO PAD        |
| utf8mb4_0900_as_cs         | utf8mb4  | 278 |         | Yes      |       0 | NO PAD        |
| utf8mb4_0900_bin           | utf8mb4  | 309 |         | Yes      |       1 | NO PAD        |
| utf8mb4_bin                | utf8mb4  |  46 |         | Yes      |       1 | PAD SPACE     |
| utf8mb4_croatian_ci        | utf8mb4  | 245 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_cs_0900_ai_ci      | utf8mb4  | 266 |         | Yes      |       0 | NO PAD        |
| utf8mb4_cs_0900_as_cs      | utf8mb4  | 289 |         | Yes      |       0 | NO PAD        |
| utf8mb4_czech_ci           | utf8mb4  | 234 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_danish_ci          | utf8mb4  | 235 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_da_0900_ai_ci      | utf8mb4  | 267 |         | Yes      |       0 | NO PAD        |
| utf8mb4_da_0900_as_cs      | utf8mb4  | 290 |         | Yes      |       0 | NO PAD        |
| utf8mb4_de_pb_0900_ai_ci   | utf8mb4  | 256 |         | Yes      |       0 | NO PAD        |
| utf8mb4_de_pb_0900_as_cs   | utf8mb4  | 279 |         | Yes      |       0 | NO PAD        |
| utf8mb4_eo_0900_ai_ci      | utf8mb4  | 273 |         | Yes      |       0 | NO PAD        |
| utf8mb4_eo_0900_as_cs      | utf8mb4  | 296 |         | Yes      |       0 | NO PAD        |
| utf8mb4_esperanto_ci       | utf8mb4  | 241 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_estonian_ci        | utf8mb4  | 230 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_es_0900_ai_ci      | utf8mb4  | 263 |         | Yes      |       0 | NO PAD        |
| utf8mb4_es_0900_as_cs      | utf8mb4  | 286 |         | Yes      |       0 | NO PAD        |
| utf8mb4_es_trad_0900_ai_ci | utf8mb4  | 270 |         | Yes      |       0 | NO PAD        |
| utf8mb4_es_trad_0900_as_cs | utf8mb4  | 293 |         | Yes      |       0 | NO PAD        |
| utf8mb4_et_0900_ai_ci      | utf8mb4  | 262 |         | Yes      |       0 | NO PAD        |
| utf8mb4_et_0900_as_cs      | utf8mb4  | 285 |         | Yes      |       0 | NO PAD        |
| utf8mb4_general_ci         | utf8mb4  |  45 |         | Yes      |       1 | PAD SPACE     |
| utf8mb4_german2_ci         | utf8mb4  | 244 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_hr_0900_ai_ci      | utf8mb4  | 275 |         | Yes      |       0 | NO PAD        |
| utf8mb4_hr_0900_as_cs      | utf8mb4  | 298 |         | Yes      |       0 | NO PAD        |
| utf8mb4_hungarian_ci       | utf8mb4  | 242 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_hu_0900_ai_ci      | utf8mb4  | 274 |         | Yes      |       0 | NO PAD        |
| utf8mb4_hu_0900_as_cs      | utf8mb4  | 297 |         | Yes      |       0 | NO PAD        |
| utf8mb4_icelandic_ci       | utf8mb4  | 225 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_is_0900_ai_ci      | utf8mb4  | 257 |         | Yes      |       0 | NO PAD        |
| utf8mb4_is_0900_as_cs      | utf8mb4  | 280 |         | Yes      |       0 | NO PAD        |
| utf8mb4_ja_0900_as_cs      | utf8mb4  | 303 |         | Yes      |       0 | NO PAD        |
| utf8mb4_ja_0900_as_cs_ks   | utf8mb4  | 304 |         | Yes      |      24 | NO PAD        |
| utf8mb4_latvian_ci         | utf8mb4  | 226 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_la_0900_ai_ci      | utf8mb4  | 271 |         | Yes      |       0 | NO PAD        |
| utf8mb4_la_0900_as_cs      | utf8mb4  | 294 |         | Yes      |       0 | NO PAD        |
| utf8mb4_lithuanian_ci      | utf8mb4  | 236 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_lt_0900_ai_ci      | utf8mb4  | 268 |         | Yes      |       0 | NO PAD        |
| utf8mb4_lt_0900_as_cs      | utf8mb4  | 291 |         | Yes      |       0 | NO PAD        |
| utf8mb4_lv_0900_ai_ci      | utf8mb4  | 258 |         | Yes      |       0 | NO PAD        |
| utf8mb4_lv_0900_as_cs      | utf8mb4  | 281 |         | Yes      |       0 | NO PAD        |
| utf8mb4_persian_ci         | utf8mb4  | 240 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_pl_0900_ai_ci      | utf8mb4  | 261 |         | Yes      |       0 | NO PAD        |
| utf8mb4_pl_0900_as_cs      | utf8mb4  | 284 |         | Yes      |       0 | NO PAD        |
| utf8mb4_polish_ci          | utf8mb4  | 229 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_romanian_ci        | utf8mb4  | 227 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_roman_ci           | utf8mb4  | 239 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_ro_0900_ai_ci      | utf8mb4  | 259 |         | Yes      |       0 | NO PAD        |
| utf8mb4_ro_0900_as_cs      | utf8mb4  | 282 |         | Yes      |       0 | NO PAD        |
| utf8mb4_ru_0900_ai_ci      | utf8mb4  | 306 |         | Yes      |       0 | NO PAD        |
| utf8mb4_ru_0900_as_cs      | utf8mb4  | 307 |         | Yes      |       0 | NO PAD        |
| utf8mb4_sinhala_ci         | utf8mb4  | 243 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_sk_0900_ai_ci      | utf8mb4  | 269 |         | Yes      |       0 | NO PAD        |
| utf8mb4_sk_0900_as_cs      | utf8mb4  | 292 |         | Yes      |       0 | NO PAD        |
| utf8mb4_slovak_ci          | utf8mb4  | 237 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_slovenian_ci       | utf8mb4  | 228 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_sl_0900_ai_ci      | utf8mb4  | 260 |         | Yes      |       0 | NO PAD        |
| utf8mb4_sl_0900_as_cs      | utf8mb4  | 283 |         | Yes      |       0 | NO PAD        |
| utf8mb4_spanish2_ci        | utf8mb4  | 238 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_spanish_ci         | utf8mb4  | 231 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_sv_0900_ai_ci      | utf8mb4  | 264 |         | Yes      |       0 | NO PAD        |
| utf8mb4_sv_0900_as_cs      | utf8mb4  | 287 |         | Yes      |       0 | NO PAD        |
| utf8mb4_swedish_ci         | utf8mb4  | 232 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_tr_0900_ai_ci      | utf8mb4  | 265 |         | Yes      |       0 | NO PAD        |
| utf8mb4_tr_0900_as_cs      | utf8mb4  | 288 |         | Yes      |       0 | NO PAD        |
| utf8mb4_turkish_ci         | utf8mb4  | 233 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_unicode_520_ci     | utf8mb4  | 246 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_unicode_ci         | utf8mb4  | 224 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_vietnamese_ci      | utf8mb4  | 247 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_vi_0900_ai_ci      | utf8mb4  | 277 |         | Yes      |       0 | NO PAD        |
| utf8mb4_vi_0900_as_cs      | utf8mb4  | 300 |         | Yes      |       0 | NO PAD        |
| utf8mb4_zh_0900_as_cs      | utf8mb4  | 308 |         | Yes      |       0 | NO PAD        |
| utf8_bin                   | utf8     |  83 |         | Yes      |       1 | PAD SPACE     |
| utf8_croatian_ci           | utf8     | 213 |         | Yes      |       8 | PAD SPACE     |
| utf8_czech_ci              | utf8     | 202 |         | Yes      |       8 | PAD SPACE     |
| utf8_danish_ci             | utf8     | 203 |         | Yes      |       8 | PAD SPACE     |
| utf8_esperanto_ci          | utf8     | 209 |         | Yes      |       8 | PAD SPACE     |
| utf8_estonian_ci           | utf8     | 198 |         | Yes      |       8 | PAD SPACE     |
| utf8_general_ci            | utf8     |  33 | Yes     | Yes      |       1 | PAD SPACE     |
| utf8_general_mysql500_ci   | utf8     | 223 |         | Yes      |       1 | PAD SPACE     |
| utf8_german2_ci            | utf8     | 212 |         | Yes      |       8 | PAD SPACE     |
| utf8_hungarian_ci          | utf8     | 210 |         | Yes      |       8 | PAD SPACE     |
| utf8_icelandic_ci          | utf8     | 193 |         | Yes      |       8 | PAD SPACE     |
| utf8_latvian_ci            | utf8     | 194 |         | Yes      |       8 | PAD SPACE     |
| utf8_lithuanian_ci         | utf8     | 204 |         | Yes      |       8 | PAD SPACE     |
| utf8_persian_ci            | utf8     | 208 |         | Yes      |       8 | PAD SPACE     |
| utf8_polish_ci             | utf8     | 197 |         | Yes      |       8 | PAD SPACE     |
| utf8_romanian_ci           | utf8     | 195 |         | Yes      |       8 | PAD SPACE     |
| utf8_roman_ci              | utf8     | 207 |         | Yes      |       8 | PAD SPACE     |
| utf8_sinhala_ci            | utf8     | 211 |         | Yes      |       8 | PAD SPACE     |
| utf8_slovak_ci             | utf8     | 205 |         | Yes      |       8 | PAD SPACE     |
| utf8_slovenian_ci          | utf8     | 196 |         | Yes      |       8 | PAD SPACE     |
| utf8_spanish2_ci           | utf8     | 206 |         | Yes      |       8 | PAD SPACE     |
| utf8_spanish_ci            | utf8     | 199 |         | Yes      |       8 | PAD SPACE     |
| utf8_swedish_ci            | utf8     | 200 |         | Yes      |       8 | PAD SPACE     |
| utf8_tolower_ci            | utf8     |  76 |         | Yes      |       1 | PAD SPACE     |
| utf8_turkish_ci            | utf8     | 201 |         | Yes      |       8 | PAD SPACE     |
| utf8_unicode_520_ci        | utf8     | 214 |         | Yes      |       8 | PAD SPACE     |
| utf8_unicode_ci            | utf8     | 192 |         | Yes      |       8 | PAD SPACE     |
| utf8_vietnamese_ci         | utf8     | 215 |         | Yes      |       8 | PAD SPACE     |
+----------------------------+----------+-----+---------+----------+---------+---------------+
  1. 比较规则名称以其关联的字符集的名称开头。
  2. 后面紧跟着该比较规则所对应的语言。

如:utf8_polish_ci

3. 后缀ci表示该比较规则是否区分中间语言中的重音,大小写等。

  • _ci: case insensitive 不区分大小写
  • _cs:case sensitive 区分大小写
  • _ai: accent insensitive 不区分重音
  • _as: accent sensitive 区分重音
  • _bin:binary 以二进制方式

4. 字符集与比较规则有四个级别:服务器级别/数据库级别/表级别/列级别

mysql> SHOW variables like "%character_set_server%";
+----------------------+---------+
| Variable_name        | Value   |
+----------------------+---------+
| character_set_server | utf8mb4 |
+----------------------+---------+

mysql> SHOW variables like "%character_set_database%";
+------------------------+---------+
| Variable_name          | Value   |
+------------------------+---------+
| character_set_database | utf8mb4 |
+------------------------+---------+

mysql> SHOW variables like "%collation_server%";
+------------------+--------------------+
| Variable_name    | Value              |
+------------------+--------------------+
| collation_server | utf8mb4_0900_ai_ci |
+------------------+--------------------+

mysql> SHOW variables like "%collation_database%";
+--------------------+--------------------+
| Variable_name      | Value              |
+--------------------+--------------------+
| collation_database | utf8mb4_0900_ai_ci |
+--------------------+--------------------+