Zanata should check database for UTF-8 support at startup

Description

If the Zanata database is not created with a UTF-8 character set for whatever reason (e.g. a new installation or a restore from backup), MySQL/MariaDB will corrupt Unicode characters by silently converting them to question marks. This could lead to a lot of lost work if the problem isn't noticed immediately.

To reduce the danger, we should check at startup that Zanata can store and retrieve Unicode characters successfully. If a Unicode problem is detected, startup should abort. A system property could be used to continue anyway, or to skip the checks entirely.
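
The abort/continue/skip behaviour could be sketched roughly as below. This is only an illustration: the property name zanata.unicode.check and its values "warn" and "skip" are made up for this sketch and are not existing Zanata settings.

```java
import java.util.Locale;

/** Sketch of how a startup Unicode check might react to its result. */
final class UnicodeStartupPolicy {
    /** Hypothetical property name; the issue does not specify one. */
    public static final String PROPERTY = "zanata.unicode.check";

    public enum Mode { ENFORCE, WARN, SKIP }
    public enum Action { CONTINUE, CONTINUE_WITH_WARNING, ABORT }

    /** Unset or unrecognised values fall back to the safe default (ENFORCE). */
    public static Mode parseMode(String value) {
        if (value == null) {
            return Mode.ENFORCE;
        }
        switch (value.trim().toLowerCase(Locale.ROOT)) {
            case "warn": return Mode.WARN;   // hypothetical value: continue anyway
            case "skip": return Mode.SKIP;   // hypothetical value: do not check at all
            default:     return Mode.ENFORCE;
        }
    }

    /** Reads the mode from the (assumed) system property. */
    public static Mode fromSystemProperty() {
        return parseMode(System.getProperty(PROPERTY));
    }

    /** Decides what startup should do, given the mode and the check result. */
    public static Action decide(Mode mode, boolean unicodeOk) {
        if (mode == Mode.SKIP || unicodeOk) {
            return Action.CONTINUE;
        }
        return mode == Mode.WARN ? Action.CONTINUE_WITH_WARNING : Action.ABORT;
    }
}
```

With this shape, starting the server with -Dzanata.unicode.check=warn would log the problem and carry on, while leaving the property unset keeps the strict default.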

We should interrogate the INFORMATION_SCHEMA metadata (e.g. SCHEMATA, TABLES, COLUMNS, GLOBAL_VARIABLES, SESSION_VARIABLES) to check that the database, tables, columns and connection all use utf8 (or utf8mb4), and finally save and retrieve (then delete) some Unicode characters, perhaps in the HTextFlowTarget table.
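
A minimal sketch of two of these checks, using plain JDBC, might look like the following. The class and method names are invented for illustration. Only the schema-level query is shown (tables, columns and connection variables would need similar queries), and the actual save/retrieve/delete against a table is abstracted behind a function argument rather than tied to HTextFlowTarget. The sample text deliberately stays within the Basic Multilingual Plane, because MySQL's utf8 (as opposed to utf8mb4) cannot store 4-byte characters.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Locale;
import java.util.function.UnaryOperator;

/** Illustrative helpers only; not existing Zanata code. */
final class Utf8Checks {
    /** Default character set of one schema, from INFORMATION_SCHEMA.SCHEMATA. */
    public static final String SCHEMA_CHARSET_SQL =
        "SELECT DEFAULT_CHARACTER_SET_NAME FROM INFORMATION_SCHEMA.SCHEMATA"
        + " WHERE SCHEMA_NAME = ?";

    /** Latin, CJK and Arabic sample text, all 3 bytes or fewer in UTF-8. */
    public static final String SAMPLE =
        "\u00e9 \u65e5\u672c\u8a9e \u0639\u0631\u0628\u064a";

    /** True for the MySQL/MariaDB character set names that hold Unicode text. */
    public static boolean isUnicodeCharset(String name) {
        if (name == null) {
            return false;
        }
        String n = name.toLowerCase(Locale.ROOT);
        return n.equals("utf8") || n.equals("utf8mb3") || n.equals("utf8mb4");
    }

    /** Returns the schema's default character set, or null if not found. */
    public static String schemaCharset(Connection conn, String schema)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(SCHEMA_CHARSET_SQL)) {
            ps.setString(1, schema);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString(1) : null;
            }
        }
    }

    /**
     * Round-trip check. storeAndLoad is assumed to insert the text,
     * read it back, delete the row, and return what was read.
     */
    public static boolean survivesRoundTrip(String sample,
                                            UnaryOperator<String> storeAndLoad) {
        return sample.equals(storeAndLoad.apply(sample));
    }
}
```

Keeping the round trip behind a function makes it possible to exercise the comparison logic without a database, for instance by passing a function that replaces every non-ASCII character with a question mark to imitate the corruption described above.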

Status

Assignee

Unassigned

Reporter

Sean Flanigan

Tested Version/s

None

Components

Priority

unspecified