mariadb/sql/rpl_utility.h
Monty c0bd9cdf13 MDEV-36290: Improved support of replication between tables of different structure
One can have data loss in multi-master setups when 1) both masters
update the same table, 2) ALTER TABLE is run on one master which
re-arranges the column ordering, and 3) transactions are binlogged
in ROW binlog_format.

This is because the slave assumes that all columns are in the same
order on the master and slave and that all columns on the master also
exist on the slave, even if binlog_row_metadata=FULL is used. If
either assumption does not hold, the result is silent data loss.

A new option for slave_type_conversions bit field,
ERROR_IF_MISSING_FIELD, has been added, along with a new error,
ER_SLAVE_INCOMPATIBLE_TABLE_DEF. This allows the user to define if
the slave should abort replication if it is missing some field that
existed on the master. The option is off by default to keep things
compatible with earlier versions.
If a field is missing on the slave and log_warnings >= 1, a warning
will be logged to the error log.

This patch fixes the problem, when binlog_row_metadata=FULL is used on
the master, by mapping fields with identical names on the master and
slave. If the slave has fields that do not exist in the row event, these
will be set to their default values.

The main idea is that we added two conversion tables:
m_tabledef.master_to_slave_map[master_column_index] -> slave_column_index
and m_tabledef.master_to_slave_error[master_column_index] which contains
an error number if the master_column does not exist on the slave or
it is not possible to convert the master data to the slave column.
master_to_slave_error[#] contains 0 if the column exists and is compatible.
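
To illustrate the two tables, here is a minimal standalone sketch of name-based
mapping. It is not the server code: ColumnMapping and build_column_mapping()
are hypothetical names, std::vector and std::string stand in for the raw arrays
in table_def, and exact string equality is an assumption about how names are
compared. Only the value of SLAVE_FIELD_NAME_MISSING is taken from
rpl_utility.h.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Value taken from the defines in rpl_utility.h.
constexpr unsigned SLAVE_FIELD_NAME_MISSING= 1;

// Hypothetical stand-in for the two arrays kept in table_def.
struct ColumnMapping
{
  std::vector<unsigned> master_to_slave_map;    // master index -> slave index
  std::vector<unsigned> master_to_slave_error;  // 0 if the column is usable
};

// Match master columns to slave columns by name. Exact comparison is an
// assumption of this sketch; the server may compare names differently.
ColumnMapping build_column_mapping(const std::vector<std::string> &master_cols,
                                   const std::vector<std::string> &slave_cols)
{
  ColumnMapping m;
  m.master_to_slave_map.assign(master_cols.size(), 0);
  m.master_to_slave_error.assign(master_cols.size(), 0);
  for (size_t i= 0; i < master_cols.size(); i++)
  {
    bool found= false;
    for (size_t j= 0; j < slave_cols.size(); j++)
    {
      if (master_cols[i] == slave_cols[j])
      {
        m.master_to_slave_map[i]= static_cast<unsigned>(j);
        found= true;
        break;
      }
    }
    if (!found)
      m.master_to_slave_error[i]= SLAVE_FIELD_NAME_MISSING;
  }
  return m;
}
```

With master columns (id, b, a, gone) and slave columns (id, a, b, extra), the
sketch maps master column 1 to slave column 2 and master column 2 to slave
column 1, and marks master column 3 as missing.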

General code changes:
- Instead of looping over row fields in the order of the slave table,
  we now loop over fields in the order of the binary log.
- We are using table->write_set to know which fields should be updated
  on the slave. This is reflected in unpack_row
- We are calling TABLE::mark_columns_per_binlog_row_image() to ensure
  that rpl_write_set is properly set. This is needed if the slave also
  is doing binary logging.
- Previously, replication aborted if the master and slave tables were
  too different. Now replication is only aborted if the row actually
  uses columns that do not exist on the slave (and ERROR_IF_MISSING_FIELD
  is used) or uses columns that cannot be converted.
  - Instead of giving errors in compatible_with(), which is used when
    the table is first accessed by a row event, we now give errors
    when we examine a row event and notice that it is accessing
    a nonexistent or incompatible field.
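
The changes above can be sketched as a small standalone model. This is an
illustration under stated assumptions, not the code in unpack_row(): RowContext
and apply_row() are hypothetical names, ints stand in for packed field data,
a std::vector<bool> stands in for table->write_set, and the slave record is
assumed to be pre-filled with default values. The error code values are the
ones defined in rpl_utility.h.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Error codes as defined in rpl_utility.h.
constexpr unsigned SLAVE_FIELD_NAME_MISSING= 1;
constexpr unsigned SLAVE_FIELD_WRONG_TYPE= 4;

// Hypothetical, simplified state needed to apply one row.
struct RowContext
{
  std::vector<unsigned> master_to_slave_map;
  std::vector<unsigned> master_to_slave_error;
  bool error_if_missing_field;  // stands in for ERROR_IF_MISSING_FIELD
};

// Returns false on success and true on error.
bool apply_row(const RowContext &ctx,
               const std::vector<bool> &cols_in_event,
               const std::vector<int> &master_values,
               std::vector<int> &slave_record,
               std::vector<bool> &write_set)
{
  // First pass: only columns that the row actually uses can cause an error.
  for (size_t i= 0; i < cols_in_event.size(); i++)
  {
    if (!cols_in_event[i])
      continue;
    unsigned err= ctx.master_to_slave_error[i];
    if (err == SLAVE_FIELD_NAME_MISSING && !ctx.error_if_missing_field)
      continue;                 // Tolerated; the value is skipped below
    if (err)
      return true;              // Missing (with option) or not convertible
  }
  // Second pass: loop in binary log (master) order, not slave table order.
  for (size_t i= 0; i < cols_in_event.size(); i++)
  {
    if (!cols_in_event[i] || ctx.master_to_slave_error[i])
      continue;
    unsigned slave_col= ctx.master_to_slave_map[i];
    slave_record[slave_col]= master_values[i];
    write_set[slave_col]= true; // Slave columns not set here keep defaults
  }
  return false;
}
```

Slave columns whose write_set entry stays false are the ones that would keep
their default value in this model.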

Other code changes:
- Removed conv_table argument from compatible_with() and store it
  directly in RPL_TABLE_LIST->m_conv_table
- table_def::compatible_with() now returns 1 on error (not 0).
- Removed m_width and skip arguments from prepare_record() as we are
  now using table->write_set to check which elements need a default
  value.
- Moved DBUG_ENTER() to its proper place (after variable
  declarations) in a few functions.
- Some changes in unpack_row():
  - Replaced null_mask and null_ptr with an indexed bit check for
    simplicity.
  - Removed check of rgi == null and table_found which never worked.
  - Updated comments to reflect current code.
  - Indentation changes as the code now uses 'continue' instead of
    'if-else' in the main loop.
  - The code to throw away 'extra master fields' is not needed as we
    are now looping over fields in binary log, not over fields in
    slave table.
- Simplified get_table_data(TABLE *table_arg) by returning found
  table_list.
- Errors for row events are now initialized in compatible_with(),
  checked in check_wrong_column_usage() and reported in
  give_compatibility_error().
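
As an illustration of the "indexed bit check" mentioned for unpack_row(), the
following sketch tests one bit of a byte array by index instead of carrying a
moving mask and pointer. bit_is_set() is a hypothetical helper, not a function
from the server; the bit layout (bit 0 is the lowest bit of byte 0) is the one
used by table_def::maybe_null() in this header, and it is an assumption that
unpack_row() reads its null bits the same way.

```cpp
#include <cassert>

// Hypothetical helper: return true if bit `index` is set in `bits`.
// Byte index / 8 holds the bit; index % 8 selects it, lowest bit first.
inline bool bit_is_set(const unsigned char *bits, unsigned index)
{
  return (bits[index / 8] & (1U << (index % 8))) != 0;
}
```

Compared with a mask that is shifted on every iteration, the indexed form has
no state to keep in sync when the loop skips a column with 'continue'.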

Note for Review:
 - MDEV-36892 is not addressed, so the clause and associated code from
   the 10.6 patch are removed:

   """
  - Store a table's original write_set in cond_set, so we can later
    cross-reference it when automatically populating fields (i.e. so we
    know not to override a replicated value).
   """

Co-authored-by: Brandon Nesterenko <brandon.nesterenko@mariadb.com>
2025-08-11 16:12:07 -06:00


/*
Copyright (c) 2006, 2010, Oracle and/or its affiliates.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1335 USA */
#ifndef RPL_UTILITY_H
#define RPL_UTILITY_H
#ifndef __cplusplus
#error "Don't include this C++ header file from a non-C++ file!"
#endif
#include "sql_priv.h"
#include "m_string.h" /* bzero, memcpy */
#ifdef MYSQL_SERVER
#include "table.h" /* TABLE_LIST */
#endif
#include "mysql_com.h"
class Relay_log_info;
class Log_event;
class Rows_log_event;
struct rpl_group_info;
struct RPL_TABLE_LIST;
/**
A table definition from the master.
The responsibilities of this class are:
- Extract and decode table definition data from the table map event
- Check if table definition in table map is compatible with table
definition on slave
*/
class table_def
{
table_def(const table_def&) = default;
public:
/**
Constructor.
@param types Array of types, each stored as a byte
@param size Number of elements in array 'types'
@param field_metadata Array of extra information about fields
@param metadata_size Size of the field_metadata array
@param null_bitmap The bitmap of fields that can be null
@param optional_metadata Optional metadata logged into Table Map Event
when binlog_row_metadata=FULL on master
@param optional_metadata_len Length of optional_metadata
*/
table_def(unsigned char *types, ulong size, uchar *field_metadata,
int metadata_size, uchar *null_bitmap, uint16 flags,
const uchar *optional_metadata, uint optional_metadata_len);
/**
Move constructor
Since the destructor deallocates m_memory, the object cannot be safely
copied. Instead, we move it and set m_memory to NULL in the old object.
*/
table_def(table_def &&tabledef)
: table_def(tabledef)
{
tabledef.m_memory= NULL;
}
~table_def();
/**
Return the number of fields there is type data for.
@return The number of fields that there is type data for.
*/
inline ulong size() const { return m_size; }
/**
Returns internal binlog type code for one field,
without translation to real types.
*/
enum_field_types binlog_type(ulong index) const
{
return static_cast<enum_field_types>(m_type[index]);
}
/*
Return a representation of the type data for one field.
@param index Field index to return data for
@return Will return a representation of the type data for field
<code>index</code>. Currently, only the type identifier is
returned.
*/
enum_field_types type(ulong index) const
{
DBUG_ASSERT(index < m_size);
/*
If the source type is MYSQL_TYPE_STRING, it can in reality be
either MYSQL_TYPE_STRING, MYSQL_TYPE_ENUM, or MYSQL_TYPE_SET, so
we might need to modify the type to get the real type.
*/
enum_field_types source_type= binlog_type(index);
uint16 source_metadata= m_field_metadata[index];
switch (source_type)
{
case MYSQL_TYPE_STRING:
{
int real_type= source_metadata >> 8;
if (real_type == MYSQL_TYPE_ENUM || real_type == MYSQL_TYPE_SET)
source_type= static_cast<enum_field_types>(real_type);
break;
}
/*
This type has not been used since before row-based replication,
so we can safely assume that it really is MYSQL_TYPE_NEWDATE.
*/
case MYSQL_TYPE_DATE:
source_type= MYSQL_TYPE_NEWDATE;
break;
default:
/* Do nothing */
break;
}
return source_type;
}
#ifdef MYSQL_SERVER
const Type_handler *field_type_handler(uint index) const;
#endif
/*
This function allows callers to get the extra field data from the
table map for a given field. If there is no metadata for that field
or there is no extra metadata at all, the function returns 0.
The function returns the value for the field metadata for column at
position indicated by index. As mentioned, if the field was a type
that stores field metadata, that value is returned else zero (0) is
returned. This method is used in the unpack() methods of the
corresponding fields to properly extract the data from the binary log
in the event that the master's field is smaller than the slave.
*/
uint16 field_metadata(uint index) const
{
DBUG_ASSERT(index < m_size);
if (m_field_metadata_size)
return m_field_metadata[index];
else
return 0;
}
/*
This function returns whether the field on the master can be null.
This value is derived from field->maybe_null().
*/
inline my_bool maybe_null(uint index) const
{
DBUG_ASSERT(index < m_size);
return (m_null_bits[(index / 8)] & (1 << (index % 8))) != 0;
}
/*
This function returns the field size in raw bytes based on the type
and the encoded field data from the master's raw data. This method can
be used for situations where the slave needs to skip a column (e.g.,
WL#3915) or needs to advance the pointer for the fields in the raw
data from the master to a specific column.
*/
uint32 calc_field_size(uint col, uchar *master_data) const;
/**
Compare the master table with an existing table on the slave and
create a conversion map for fields that need to be converted, and
update the master_to_slave_error[] map with fields that do not
exist on the slave or are not compatible with the field with the
same name on the slave.
If any fields need to be converted, a temporary conversion table
is created with the fields that need conversion.
@param thd
@param rgi Pointer to relay log info
@param table Pointer to table to compare with.
@return 0 ok
@return 1 Something went wrong (OOM?)
*/
#ifndef MYSQL_CLIENT
bool compatible_with(THD *thd, rpl_group_info *rgi,
RPL_TABLE_LIST *table) const;
/**
Create a virtual in-memory temporary table structure.
@param thd Thread to allocate memory from.
@param rgi Replication group info, for error reporting.
@param target_table Target table for fields.
@return A pointer to a temporary table with memory allocated in the
thread's memroot, NULL if the table could not be created
The table structure has records and field array so that a row can
be unpacked into the record for further processing.
In the virtual table, each field that requires conversion will
have a non-NULL value, while fields that do not require
conversion will have a NULL value.
Some information that is missing in the events, such as the
character set for string types, is taken from the table that the
field is going to be pushed into, so the target table that the data
eventually needs to be pushed into must be supplied.
Note that the fields generated in the conversion table are not guaranteed to
align with the fields from this table_def. If the slave doesn't have the
target field, we don't generate a field in the conversion table, as it would
serve no purpose. If the conversion table is referenced while iterating
through this table_def, one needs a separate index to keep track of the
conv_table fields, which is only incremented when the slave has that
column. This can be checked using member variable master_to_slave_error:
when an element in that array is 0, it means that field exists on the slave.
See table_def::compatible_with() for an example of this.
*/
TABLE *create_conversion_table(THD *thd, rpl_group_info *rgi,
RPL_TABLE_LIST *target_table) const;
#endif
private:
unsigned char *m_type; // Array of type descriptors
uint m_size; // Number of elements in the types array
uint m_field_metadata_size;
uint16 *m_field_metadata;
uint16 m_flags; // Table flags
uchar *m_null_bits;
uchar *m_memory;
public:
LEX_CUSTRING optional_metadata;
uint *master_to_slave_map;
uint *master_to_slave_error;
char **master_column_name;
};
/* Different errors when converting a field from master to slave */
#define SLAVE_FIELD_NAME_MISSING 1
#define SLAVE_FIELD_NR_MISSING 2
#define SLAVE_FIELD_UNKNOWN_TYPE 3
#define SLAVE_FIELD_WRONG_TYPE 4
#ifndef MYSQL_CLIENT
/**
Extend the normal table list with a few new fields needed by the
slave thread, but nowhere else.
*/
struct RPL_TABLE_LIST : public TABLE_LIST
{
table_def m_tabledef;
TABLE *m_conv_table;
const Copy_field *m_online_alter_copy_fields;
const Copy_field *m_online_alter_copy_fields_end;
uint cached_key_nr; // [0..MAX_KEY] if set, ~0U if unset
uint cached_usable_key_parts;
bool m_tabledef_valid;
bool master_had_triggers;
LEX_CUSTRING optional_metadata;
RPL_TABLE_LIST(const LEX_CSTRING *db_arg, const LEX_CSTRING *table_name_arg,
thr_lock_type thr_lock_type,
table_def &&tabledef, bool master_had_triggers_arg)
: TABLE_LIST(db_arg, table_name_arg, NULL, thr_lock_type),
m_tabledef(std::move(tabledef)), m_conv_table(NULL),
m_online_alter_copy_fields(NULL), m_online_alter_copy_fields_end(NULL),
cached_key_nr(~0U), m_tabledef_valid(true),
master_had_triggers(master_had_triggers_arg)
{
optional_metadata.length= 0;
}
RPL_TABLE_LIST(TABLE *table, thr_lock_type lock_type, TABLE *conv_table,
table_def &&tabledef,
const Copy_field online_alter_copy_fields[],
const Copy_field *online_alter_copy_fields_end)
: TABLE_LIST(table, lock_type),
m_tabledef(std::move(tabledef)), m_conv_table(conv_table),
m_online_alter_copy_fields(online_alter_copy_fields),
m_online_alter_copy_fields_end(online_alter_copy_fields_end),
cached_key_nr(~0U), m_tabledef_valid(true), master_had_triggers(false)
{
optional_metadata.length= 0;
}
bool create_column_mapping(rpl_group_info *rgi);
bool give_compatibility_error(rpl_group_info *rgi, uint col);
bool check_wrong_column_usage(rpl_group_info *rgi, MY_BITMAP *m_cols);
};
/* Anonymous namespace for template functions/classes */
CPP_UNNAMED_NS_START
/*
Smart pointer that will automatically call my_afree (a macro) when
the pointer goes out of scope. This is used so that I do not have
to remember to call my_afree() before each return. There is no
overhead associated with this, since all functions are inline.
I (Matz) would prefer to use the free function as a template
parameter, but that is not possible when the "function" is a
macro.
*/
template <class Obj>
class auto_afree_ptr
{
Obj* m_ptr;
public:
auto_afree_ptr(Obj* ptr) : m_ptr(ptr) { }
~auto_afree_ptr() { if (m_ptr) my_afree(m_ptr); }
void assign(Obj* ptr) {
/* Only to be called if it hasn't been given a value before. */
DBUG_ASSERT(m_ptr == NULL);
m_ptr= ptr;
}
Obj* get() { return m_ptr; }
};
CPP_UNNAMED_NS_END
class Deferred_log_events
{
private:
DYNAMIC_ARRAY array;
Log_event *last_added;
public:
Deferred_log_events(Relay_log_info *rli);
~Deferred_log_events();
/* queue for execution at Query-log-event time prior to the Query */
int add(Log_event *ev);
bool is_empty();
bool execute(struct rpl_group_info *rgi);
void rewind();
bool is_last(Log_event *ev) { return ev == last_added; };
};
#endif
// NB. number of printed bit values is limited to sizeof(buf) - 1
#define DBUG_PRINT_BITSET(N,FRM,BS) \
do { \
char buf[256]; \
uint i; \
for (i = 0 ; i < MY_MIN(sizeof(buf) - 1, (BS)->n_bits) ; i++) \
buf[i] = bitmap_is_set((BS), i) ? '1' : '0'; \
buf[i] = '\0'; \
DBUG_PRINT((N), ((FRM), buf)); \
} while (0)
#endif /* RPL_UTILITY_H */