[5.x]: Multibyte characters do not survive db/backup unless db charset is explicitly set to utf8mb4 in config #16753

@mmikkel

What happened?

Description

When backing up the database with the db/backup CLI command or Craft's "Database Backup" utility, multibyte characters (such as emojis) in native fields (titles, alternative text) or Plain Text custom fields are exported as a question mark (?) unless CRAFT_DB_CHARSET is explicitly set to utf8mb4.

Note: I have not set/configured the backupCommand config setting.

Steps to reproduce

  1. Make sure the database is using utf8mb4 charset and utf8mb4_0900_ai_ci collation (i.e. Craft 5's defaults).
  2. Also make sure that the charset isn't explicitly set via the CRAFT_DB_CHARSET env var or in a config/db.php file.
  3. Create an entry, and add a fire emoji (🔥) to its title.
  4. Back up the database using the db/backup CLI command or the "Database Backup" utility.
  5. Re-import the database dump produced by the db/backup command.
  6. Confirm that the 🔥 in the entry's title has been replaced by a question mark (?).
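
For anyone reproducing this, the dump file itself can be checked before re-importing it. The snippet below is only a rough sketch: `dump_has_fire_emoji` is a hypothetical helper name, not something Craft provides, and it assumes the dump is a plain (uncompressed) SQL file.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Craft): succeeds if the given SQL dump
# still contains the fire emoji added in step 3, fails otherwise.
dump_has_fire_emoji() {
  # $1: path to the .sql file produced by db/backup
  # -F treats the emoji as a fixed string; -q only sets the exit status.
  grep -qF '🔥' "$1"
}
```

Usage might look like `dump_has_fire_emoji path/to/backup.sql && echo "emoji intact" || echo "emoji missing or replaced"`, where `path/to/backup.sql` is a placeholder for the actual dump path.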

Repeat the process, but this time set the CRAFT_DB_CHARSET=utf8mb4 environment variable before backing up the database. The fire emoji will then be intact.
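
For reference, the setting used in that second run is the single line below. This assumes the project reads its environment from a `.env` file in the project root, as a DDEV setup typically would; it is the workaround described in this report, not a confirmed fix.

```shell
# .env (assumed location: project root)
# Explicitly tell Craft which charset to use for the database connection.
CRAFT_DB_CHARSET=utf8mb4
```

With that in place, the backup is run the same way as in step 4.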

Expected behavior

Multibyte characters in native and plaintext fields should be preserved when Craft dumps a database using the utf8mb4 charset and utf8mb4_0900_ai_ci collation.

Actual behavior

Multibyte characters in native and plaintext fields are exported as ? unless CRAFT_DB_CHARSET is set to utf8mb4.

Craft CMS version

5.6.9.1

PHP version

8.2.27

Operating system and version

DDEV v1.24.2

Database type and version

MySQL 8.0.40

Image driver and version

No response

Installed plugins and versions

None
