mariadb/sql/share
Monty c0bd9cdf13 MDEV-36290: Improved support of replication between tables of different structure
One can have data loss in multi-master setups when 1) both masters
update the same table, 2) ALTER TABLE is run on one master which
re-arranges the column ordering, and 3) transactions are binlogged
in ROW binlog_format.

This is because the slave assumes that all columns are in the same
order on the master and slave and that all columns on the master also
exist on the slave. The slave makes this assumption even if
binlog_row_metadata=FULL is used.  If the assumption does not hold,
the result is silent data loss.

A new option for slave_type_conversions bit field,
ERROR_IF_MISSING_FIELD, has been added, along with a new error,
ER_SLAVE_INCOMPATIBLE_TABLE_DEF. This allows the user to define if
the slave should abort replication if it is missing some field that
existed on the master. The option is off by default to keep things
compatible with earlier versions.
If a field is missing on the slave and log_warnings >= 1, a warning
will be logged to the error log.

This patch fixes the data loss, when binlog_row_metadata=FULL is used
on the master, by mapping fields with identical names on the master and
slave. If the slave has fields that do not exist in the row event, these
will be set to their default value.

The main idea is that we added two conversion tables:
m_tabledef.master_to_slave_map[master_column_index] -> slave_column_index
and m_tabledef.master_to_slave_error[master_column_index] which contains
an error number if the master_column does not exist on the slave or
it is not possible to convert the master data to the slave column.
master_to_slave_error[#] contains 0 if the column exists and is compatible.
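
As a rough illustration of the two tables (a Python sketch, not the
server's C++ code): the function name build_column_maps, the convertible
callback and the two error constants are hypothetical, and matching on
exact column names is an assumption of this sketch.

```python
# Illustrative sketch only: the real logic lives in the server's C++ code.
# All names below are hypothetical stand-ins.
ERR_MISSING = 1       # placeholder error number: column missing on the slave
ERR_INCOMPATIBLE = 2  # placeholder error number: value cannot be converted


def build_column_maps(master_cols, slave_cols, convertible=lambda m, s: True):
    """Return (master_to_slave_map, master_to_slave_error).

    master_cols / slave_cols are lists of (name, type) tuples in table order.
    Both results are indexed by master column index.
    """
    slave_index = {name: i for i, (name, _type) in enumerate(slave_cols)}
    slave_map, errors = [], []
    for name, m_type in master_cols:
        idx = slave_index.get(name)
        if idx is None:
            slave_map.append(None)          # no such column on the slave
            errors.append(ERR_MISSING)
        elif not convertible(m_type, slave_cols[idx][1]):
            slave_map.append(idx)
            errors.append(ERR_INCOMPATIBLE)
        else:
            slave_map.append(idx)
            errors.append(0)                # exists and is compatible
    return slave_map, errors


# The master has reordered columns and one column the slave lacks.
master = [("id", "int"), ("b", "text"), ("a", "int"), ("gone", "int")]
slave = [("id", "int"), ("a", "int"), ("b", "text")]
col_map, col_err = build_column_maps(master, slave)
```

Here col_map sends master column 1 ("b") to slave column 2 and master
column 2 ("a") to slave column 1, while "gone" gets a non-zero entry in
col_err.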

General code changes:
- Instead of looping over row fields in the order of the slave table,
  we now loop over fields in the order of the binary log.
- We are using table->write_set to know which fields should be updated
  on the slave. This is reflected in unpack_row
- We are calling TABLE::mark_columns_per_binlog_row_image() to ensure
  that rpl_write_set is properly set. This is needed if the slave also
  is doing binary logging.
- Previously, replication aborted if the master and slave tables were too
  different.  Now replication is only aborted if the row actually uses
  columns that do not exist on the slave (and ERROR_IF_MISSING_FIELD
  is set) or uses columns that cannot be converted.
  - Instead of giving errors in compatible_with(), which is called when
    the table is first accessed by the row event, we now give errors
    when we examine a row event and notice that it accesses a
    nonexistent or incompatible field.
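
The behaviour described in this list can be sketched roughly as follows.
This is a simplified Python model, not the server code: apply_row,
MissingColumnError and the error_if_missing_field flag are hypothetical
names, and warnings and type conversion are left out.

```python
# Illustrative sketch only; names are hypothetical, not the server's API.
class MissingColumnError(Exception):
    """Raised when the row uses a column that the slave does not have."""


def apply_row(event_cols, event_values, slave_cols, defaults,
              error_if_missing_field=False):
    """Build a slave row from one row event.

    event_cols:   column names in binary log order
    event_values: values in the same order
    slave_cols:   slave column names in slave table order
    defaults:     default value per slave column name
    """
    slave_pos = {name: i for i, name in enumerate(slave_cols)}
    row = [defaults[name] for name in slave_cols]  # start from defaults
    write_set = set()                              # slave columns set by the event
    for name, value in zip(event_cols, event_values):
        pos = slave_pos.get(name)
        if pos is None:
            if error_if_missing_field:
                raise MissingColumnError(name)
            continue  # master-only column: skip it
        row[pos] = value
        write_set.add(pos)
    return row, write_set


row, written = apply_row(
    ["id", "b", "extra"], [7, "x", 99],
    ["id", "a", "b"], {"id": 0, "a": -1, "b": ""},
)
```

In this example the slave-only column "a" keeps its default, and the
master-only column "extra" is skipped unless error_if_missing_field is
set, in which case the call raises.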

Other code changes:
- Removed conv_table argument from compatible_with() and stored it
  directly in RPL_TABLE_LIST->m_conv_table
- table_def::compatible_with() now returns 1 on error (not 0).
- Removed m_width and skip arguments from prepare_record() as we are
  now using table->write_set to check which elements need a default
  value.
- Moved DBUG_ENTER() to its proper place (after variable
  declarations) in a few functions.
- Some changes in unpack_row():
  - Replaced null_mask and null_ptr with an indexed bit check for
    simplicity.
  - Removed check of rgi == null and table_found which never worked.
  - Updated comments to reflect current code.
  - Indentation changes as the code now uses 'continue' instead of
    'if-else' in the main loop.
  - The code to throw away 'extra master fields' is not needed as we
    are now looping over fields in binary log, not over fields in
    slave table.
- Simplified get_table_data(TABLE *table_arg) by returning found
  table_list.
- Errors for row events are now initialized in compatible_with(),
  checked in check_wrong_column_usage() and reported in
  give_compatibility_error().

Note for Review:
 - MDEV-36892 is not addressed, so the clause and associated code from
   the 10.6 patch are removed:

   """
  - Store a table's original write_set in cond_set, so we can later
    cross-reference it when automatically populating fields (i.e. so we
    know not to override a replicated value).
   """

Co-authored-by: Brandon Nesterenko <brandon.nesterenko@mariadb.com>
2025-08-11 16:12:07 -06:00
charsets Fix remaining typos 2025-04-29 11:18:00 +10:00
CMakeLists.txt MDEV-32923: drop errmsg-utf8.txt from packaging 2023-12-18 14:15:15 +11:00
errmsg-utf8.txt MDEV-36290: Improved support of replication between tables of different structure 2025-08-11 16:12:07 -06:00
insert_translations_into_errmsg.py A procedure and script to speed up translation of MariaDB error messages to a new language 2023-07-20 11:16:21 +01:00
README.md Fix remaining typos 2025-04-29 11:18:00 +10:00

A quicker way for adding new language translations to the errmsg-utf8.txt file

Summary

To generate a new language translation of MariaDB use the following pull request (PR) as a template for your work:

As part of your translation work, you will have to add your language's translations to the file sql/share/errmsg-utf8.txt, which is found in the current directory. This file is long and has many sections, which can make the translation work tedious. In this README, we explain a procedure and provide a script, insert_translations_into_errmsg.py, that reduces the tedium of this task.

Procedure

  1. Start by grepping out all the English translations from errmsg-utf8.txt using the following grep command, and redirecting the output to a file:

    grep -P "^\s*eng\s" errmsg-utf8.txt > all_english_text_in_errmsg-utf8.txt

  2. Next use Google translate to obtain a translation of this file. Google translate provides the ability to upload whole files for translation. For example, this technique was used to obtain Swahili translations which yielded a file with output similar to the below (output is truncated for clarity):

    sw "hashchk"
    sw "isamchk"
    sw "LA"
    sw "NDIYO"
    sw "Haiwezi kuunda faili '% -.200s' (kosa: %iE)"
    sw "Haiwezi kuunda jedwali %s.%s (kosa: %iE)"
    sw "Haiwezi kuunda hifadhidata '% -.192s' (kosa: %iE)"
    sw "Haiwezi kuunda hifadhidata '% -.192s'; hifadhidata ipo"

Note that Google translate removes the leading whitespace in the translation file it generates. DO NOT add that leading whitespace back!

  3. Give the translated file an appropriate name (e.g. all_swahili_text_in_errmsg-utf8.txt) and store it in the same directory as errmsg-utf8.txt and all_english_text_in_errmsg-utf8.txt. These 3 files will be used by the script insert_translations_into_errmsg.py.

  4. Proofread the auto-translations in the file you downloaded from Google translate. Note that Google might omit formatting information, which would cause the compilation of MariaDB to fail, so pay attention to this.

  5. Reintegrate these translations into errmsg-utf8.txt by running the insert_translations_into_errmsg.py script as follows:

    chmod ugo+x insert_translations_into_errmsg.py # Make the script executable if it is not.

    ./insert_translations_into_errmsg.py <errmsg-utf8.txt file> <grepped English file> <translated file>

    For example, for the Swahili translation, we ran the following:

    ./insert_translations_into_errmsg.py errmsg-utf8.txt all_english_text_in_errmsg-utf8.txt all_swahili_text_in_errmsg-utf8.txt

    The script uses the errmsg-utf8.txt file and the grepped English file to keep track of each new translation. It then creates a file in the same directory as errmsg-utf8.txt with the name errmsg-utf8-with-new-language.txt.

  6. Check that the reintegration of the new translations into errmsg-utf8-with-new-language.txt went OK, and if it did, rename errmsg-utf8-with-new-language.txt to errmsg-utf8.txt:

    mv errmsg-utf8-with-new-language.txt errmsg-utf8.txt

  7. In the header of errmsg-utf8.txt make sure to add your language long form to short form mapping. E.g. for Swahili, add:

    swahili=sw
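
When proofreading the machine translation, a small helper can flag lines whose format specifiers no longer match the English text. The following is a minimal sketch and not part of insert_translations_into_errmsg.py: the function names, the regular expression and the sample lines are assumptions made for this illustration, and the pattern only approximates the specifiers used in errmsg-utf8.txt.

```python
import re

# Approximate pattern for printf-style specifiers such as %s, %d, %lu and
# %-.200s. It is a heuristic chosen for this sketch, not the grammar used
# by the MariaDB build. A space after '%' is deliberately not accepted, so
# a damaged specifier like '% -.200s' is not counted as a specifier.
SPEC = re.compile(r"%[-+0#]*\d*(?:\.\d+)?(?:ll|l|hh|h|z)?[a-zA-Z]")


def specifiers(text):
    """Return the format specifiers found in text, in order."""
    return SPEC.findall(text)


def find_mismatches(english_lines, translated_lines):
    """Compare both files line by line.

    Returns a list of (line_number, english_specs, translated_specs) for
    every pair of lines whose specifier lists differ.
    """
    if len(english_lines) != len(translated_lines):
        raise ValueError("the two files have a different number of lines")
    problems = []
    for n, (eng, new) in enumerate(zip(english_lines, translated_lines), 1):
        e, t = specifiers(eng), specifiers(new)
        if e != t:
            problems.append((n, e, t))
    return problems


# Sample lines made up for this sketch, modeled on the output shown above.
english = [
    "        eng \"Can't create file '%-.200s' (errno: %iE)\"",
    "        eng \"Can't create table %s.%s (errno: %iE)\"",
]
translated = [
    "sw \"Haiwezi kuunda faili '% -.200s' (kosa: %iE)\"",
    "sw \"Haiwezi kuunda jedwali %s.%s (kosa: %iE)\"",
]
for line_no, eng_specs, new_specs in find_mismatches(english, translated):
    print(f"line {line_no}: English has {eng_specs}, translation has {new_specs}")
```

With these sample lines, only the first pair is reported, because the space inserted after the % sign means the translated line no longer contains the first specifier. To check real files, read all_english_text_in_errmsg-utf8.txt and the translated file with splitlines() and pass both lists to find_mismatches.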