Thus, xxx_unicode_520_ci collations are based on UCA 5.2.0 weight keys: https://

Collations without the "520" are based on the older UCA 4.0.0.

That is, there is no utf8 collation that folds case but keeps accents distinct. The document is inconsistent as to which it specifies.
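To make the distinction concrete, here is a minimal Python sketch of the two kinds of folding. It is not how MySQL implements collations; the helper names `fold_case` and `fold_case_and_accents` are made up for illustration, and Python's `casefold()` and Unicode NFD decomposition merely stand in for the idea.

```python
import unicodedata


def fold_case(text: str) -> str:
    """Fold case only; accented letters stay distinct from plain ones."""
    return text.casefold()


def fold_case_and_accents(text: str) -> str:
    """Fold case, then drop combining accent marks (hypothetical helper)."""
    decomposed = unicodedata.normalize("NFD", text.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Case folding alone treats upper and lower case as equal...
case_only_equal = fold_case("É") == fold_case("é")
# ...but still keeps the accented letter apart from the plain one.
accent_kept_distinct = fold_case("é") != fold_case("e")
# Folding both case and accents makes all three compare equal.
both_folded_equal = fold_case_and_accents("É") == fold_case_and_accents("e")
```

The behaviour of `fold_case` (case folded, accents kept) is the combination the text above says has no utf8 collation.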

If you are running MySQL before 5.5.3, you have only 'utf8'.
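MySQL's 'utf8' stores at most 3 bytes per character, so characters that need 4 bytes in UTF-8 (emoji, for example) do not fit; 'utf8mb4' arrived in 5.5.3 to cover them. The following Python sketch only counts encoded bytes and does not talk to MySQL. The function names are hypothetical.

```python
def max_utf8_bytes_per_char(text: str) -> int:
    """Return the largest UTF-8 byte length of any single character."""
    return max((len(ch.encode("utf-8")) for ch in text), default=0)


def fits_in_3_byte_utf8(text: str) -> bool:
    """True if every character needs 3 bytes or fewer in UTF-8."""
    return max_utf8_bytes_per_char(text) <= 3


accent_ok = fits_in_3_byte_utf8("é")   # 2 bytes in UTF-8
euro_ok = fits_in_3_byte_utf8("€")     # 3 bytes in UTF-8
emoji_ok = fits_in_3_byte_utf8("😀")   # 4 bytes in UTF-8
```

A check like this can flag text that would need utf8mb4 before you try to store it.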

Let's walk an acute-e (é) through the INSERT and SELECT.

("UTF" = "Unicode Transformation Format")

Meanwhile, MySQL is born, but has enough problems without worrying about character sets. You can put any kind of bytes, representing anything, into a VARCHAR. MySQL 4.1 introduced the concept of "character set" and "collation". If you had legacy data or legacy code, you probably did not notice that you were messing things up when you upgraded.

The problems being addressed
Basic Concepts
History
Best Practice
Conversions and Common Errors
When do bytes get translated?
Life Cycle of a Character
How Mangling Happens on INSERT
Disaster before INSERT
Diagnosing CHARSET issues
The manual
Viewing text
Fixes for various Cases
Entering accents in CMD
Example of Double-encoding
2-step ALTER
Fixing utf8 bytes in latin1 column
Fixing mismatch between CHARSET and data encoding
Fixing mix of charsets in a column
Fixing Microsoft thingies
Fixing "double encoding"
Fix definition and data, IF correctly encoded
Testing an in-place fix
Fixing while moving data with mysqldump
Fixing while moving data with LOAD DATA
Conversion Gotchas
4-byte utf8
Functions to know about
BOM * Byte-Order-Mark
German "sharp-s"
Circumflex
Quotes
When is ö not equal to ö?

A "byte" is an 8-bit thing; it is the unit of space in computers (today). 'Charset' ('character set'; 'encoding') refers to the bits used to represent 'characters'.
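As a rough illustration of how one character maps to different bytes under different charsets, here is a small Python sketch using the acute-e. It assumes Python's "latin-1" and "utf-8" codecs as stand-ins for MySQL's latin1 and utf8; for this character the bytes agree, but this is not a simulation of what the server does.

```python
E_ACUTE = "é"

# The same character is one byte in latin1 and two bytes in utf8.
latin1_hex = E_ACUTE.encode("latin-1").hex()   # e9
utf8_hex = E_ACUTE.encode("utf-8").hex()       # c3a9

# If utf8 bytes are wrongly interpreted as latin1, each byte becomes
# its own character: the familiar two-character mojibake.
mojibake = E_ACUTE.encode("utf-8").decode("latin-1")

# Encoding that mojibake as utf8 again yields "double encoding":
# four bytes for what should have been a single character.
double_encoded_hex = mojibake.encode("utf-8").hex()   # c383c2a9
```

Seeing hex C3A9 for é suggests correctly stored utf8; seeing C383C2A9 suggests the data was encoded twice.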

