Problem
=======
When facing decoding of corrupt binary log files, server may misbehave
without detecting the events corruption.
This patch makes MySQL server more resilient to binary log decoding.
Fixes for events de-serialization and apply
===========================================
@sql/log_event.cc
Query_log_event::Query_log_event: added a check to ensure query length
is respecting event buffer limits.
Query_log_event::do_apply_event: extended a debug print, added a check
to character set to determine if it is "parseable" or not, verified if
database name is valid for system collation.
Start_log_event_v3::do_apply_event: report an error on applying a
non-supported binary log version.
Load_log_event::copy_log_event: added a check to table_name length.
User_var_log_event::User_var_log_event: added checks to avoid reading
out of buffer limits.
User_var_log_event::do_apply_event: reported an sanity check error
properly and added individual sanity checks for variable types that
expect fixed (or minimum) amount of bytes to be read.
Rows_log_event::Rows_log_event: added checks to avoid reading out of
buffer limits.
@sql/log_event_old.cc
Old_rows_log_event::Old_rows_log_event: added a sanity check to avoid
reading out of buffer limits.
@sql/sql_priv.h
Added a sanity check to available_buffer() function.
Main problem was that no log-event print function checked for disk
full error on the IO_CACHE.
All changes in this patch only affects mysqlbinlog, not the server!
- Changed all log-event print functions to return 1 on error
- Fixed memory usage when not using --flashback.
- Added printing of number of rows in row events. Can be disabled with
--print-row-count=0
- Print annotated rows when using mysqlbinlog --short-form
- Fixed that mysqlbinlog --debug works
- Fixed create_drop_binlog.test test failure
- Reorganized fields in PRINT_EVENT_INFO to be according to size to
optimize storage
- Don't change print_row_event_position or print_row_counts if set by user
- Remove some testing of argument to my_free is 0
- base64-output=never is now supported and works in all context
- Updated help information for --base64-output and --short-form
- print_row_count is now on by default. Reset automatically if --short-form
is used
- Removed obsolote warning for mysql 5.6.0
- More DBUG_PRINT for mysqltest.cc
- my_b_write_byte() now checks for flush failures. This fixed a memory
overrun on disk full
- my_b_printf() now returns 1 on failure, 0 on ok. This simplifies code
and no old code was using the old return value of my_b_printf().
- my_b_Write_backtick_quote() now returns 1 on failure and 0 on ok
- Fixed some error conditions in log printing that was not previously
handled.
- Slave_rows_error_report() can now handle longlong positions
- Write_on_release_cache() rewritten so that we can detect errors
on flush. Not depending on automatic release anymore.
- Changed types for Pos and End_log_pos to 64 bit in SHOW BINLOG EVENTS
- Fixed that copy_event_cache_to_string_and_reinit() works with strings
longer than 4G (Changed to use LEX_STRING instead of String)
- Restricted binlog_rows_event_max_size to UINT32_MAX-1 as String's are
anyway restricted to UINT32_MAX
- Fixed bug in rpl_binlog_state::write_to_iocache() which hide write
failures (duplicate variable name)
- Fixed bug in String::append if original string was not allocated
- Stop mysqlbinlog output at once if there is an error.
- Before printing error message, flush result file. This ensures that
the error message is printed last. (Easier to find)
- Fix win64 pointer truncation warnings
(usually coming from misusing 0x%lx and long cast in DBUG)
- Also fix printf-format warnings
Make the above mentioned warnings fatal.
- fix pthread_join on Windows to set return value.
- Added sql/mariadb.h file that should be included first by files in sql
directory, if sql_plugin.h is not used (sql_plugin.h adds SHOW variables
that must be done before my_global.h is included)
- Removed a lot of include my_global.h from include files
- Removed include's of some files that my_global.h automatically includes
- Removed duplicated include's of my_sys.h
- Replaced include my_config.h with my_global.h
Reason for this crash is that table->rpl_write_set is NULL. In
Rows_log_event::do_apply_event we set table->rpl_write_set equal to
table->write_set. But we do not set table->rpl_write_set in
Old_rows_log_event::do_apply_event.
REPLICATION
Problem: In RBR mode, merge table updates are not successfully applied on a cascading
replication.
Analysis & Fix: Every type of row event is preceded by one or more table_map_log_events
that gives the information about all the tables that are involved in the row
event. Server maintains the list in RPL_TABLE_LIST and it goes through all the
tables and checks for the compatibility between master and slave. Before
checking for the compatibility, it calls 'open_tables()' which takes the list
of all tables that needs to be locked and opened. In RBR, because of the
Table_map_log_event , we already have all the tables including base tables in
the list. But the open_tables() which is generic call takes care of appending
base tables if the list contains merge tables. There is an assumption in the
current replication layer logic that these tables (TABLE_LIST type objects) are always
added in the end of the list. Replication layer maintains the count of
tables(tables_to_lock_count) that needs to be verified for compatibility check
and runs through only those many tables from the list and rest of the objects
in linked list can be skipped. But this assumption is wrong.
open_tables()->..->add_children_to_list() adds base tables to the list immediately
after seeing the merge table in the list.
For eg: If the list passed to open_tables() is t1->t2->t3 where t3 is merge
table (and t1 and t2 are base tables), it adds t1'->t2' to the list after t3.
New table list looks like t1->t2->t3->t1'->t2'. It looks like it added at the
end of the list but that is not correct. If the list passed to open_tables()
is t3->t1->t2 where t3 is merge table (and t1 and t2 are base tables), the new
prepared list will be t3->t1'->t2'->t1->t2. Where t1' and t2' are of
TABLE_LIST objects which were added by add_children_to_list() call and replication
layer should not look into them. Here tables_to_lock_count will not help as the
objects are added in between the list.
Fix: After investigating add_children_list() logic (which is called from open_tables()),
there is no flag/logic in it to skip adding the children to the list even if the
children are already included in the table list. Hence to fix the issue, a
logic should be added in the replication layer to skip children in the list by
checking whether 'parent_l' is non-null or not. If it is children, we will skip 'compatibility'
check for that table.
Also this patch is not removing 'tables_to_lock_count' logic for the performance issues
if there are any children at the end of the list, those can be easily skipped directly by
stopping the loop with tables_to_lock_count check.
changes in query execution plans.
Fixed by introducing table->rpl_write_set which holds which columns should
be stored in the binary log.
Other things:
- Removed some not needed references to read_set and write_set to make
code really changing read_set and write_set easier to read
(in opt_range.cc)
- Added error handling of failed unpack_current_row()
- Added missing call to mark_columns_needed_for_insert() for DELAYED INSERT
- Removed not used functions in_read_set() and in_write_set()
- In rpl_record.cc, removed not used variable error
Introduce Log_event_writer() that encapsulates
writing data to an IO_CACHE with automatic checksum calculation.
Now all events properly checksum themselves as needed.
Use Log_event_writer in MYSQL_BIN_LOG::write_cache() instead
of copy-pasting its logic all over.
Later Log_event_writer will also do encryption.
1. After a period of wait (where last_master_timestamp=0)
do NOT restore the last_master_timestamp to the timestamp
of the last executed event (which would mean we've just
executed it, and we're that much behind the master).
2. Update last_master_timestamp before executing the event,
not after.
Take the approach from the this commit (but with a different test
case that actually makes sense):
commit 0c75ab453fb8c5439576af8fe5add7a1b89f1569
Author: Luis Soares <luis.soares@sun.com>
Date: Thu Apr 15 17:39:31 2010 +0100
BUG#52166: Seconds_Behind_Master spikes after long idle period
The reason for the failure was a bug in an include file on debian that causes 'struct stat'
to have different sized depending on the environment.
This patch fixes so that we always include my_global.h or my_config.h before we include any other files.
Other things:
- Removed #include <my_global.h> in some include files; Better to always do this at the top level to have as few
"always-include-this-file-first' files as possible.
- Removed usage of some include files that where already included by my_global.h or by other files.
client/mysql_plugin.c:
Use my_global.h first
client/mysqlslap.c:
Remove duplicated include files
extra/comp_err.c:
Remove duplicated include files
include/m_string.h:
Remove duplicated include files
include/maria.h:
Remove duplicated include files
libmysqld/emb_qcache.cc:
Use my_global.h first
plugin/semisync/semisync.h:
Use my_pthread.h first
sql/datadict.cc:
Use my_global.h first
sql/debug_sync.cc:
Use my_global.h first
sql/derror.cc:
Use my_global.h first
sql/des_key_file.cc:
Use my_global.h first
sql/discover.cc:
Use my_global.h first
sql/event_data_objects.cc:
Use my_global.h first
sql/event_db_repository.cc:
Use my_global.h first
sql/event_parse_data.cc:
Use my_global.h first
sql/event_queue.cc:
Use my_global.h first
sql/event_scheduler.cc:
Use my_global.h first
sql/events.cc:
Use my_global.h first
sql/field.cc:
Use my_global.h first
Remove duplicated include files
sql/field_conv.cc:
Use my_global.h first
sql/filesort.cc:
Use my_global.h first
Remove duplicated include files
sql/gstream.cc:
Use my_global.h first
sql/ha_ndbcluster.cc:
Use my_global.h first
sql/ha_ndbcluster_binlog.cc:
Use my_global.h first
sql/ha_ndbcluster_cond.cc:
Use my_global.h first
sql/ha_partition.cc:
Use my_global.h first
sql/handler.cc:
Use my_global.h first
sql/hash_filo.cc:
Use my_global.h first
sql/hostname.cc:
Use my_global.h first
sql/init.cc:
Use my_global.h first
sql/item.cc:
Use my_global.h first
sql/item_buff.cc:
Use my_global.h first
sql/item_cmpfunc.cc:
Use my_global.h first
sql/item_create.cc:
Use my_global.h first
sql/item_geofunc.cc:
Use my_global.h first
sql/item_inetfunc.cc:
Use my_global.h first
sql/item_row.cc:
Use my_global.h first
sql/item_strfunc.cc:
Use my_global.h first
sql/item_subselect.cc:
Use my_global.h first
sql/item_sum.cc:
Use my_global.h first
sql/item_timefunc.cc:
Use my_global.h first
sql/item_xmlfunc.cc:
Use my_global.h first
sql/key.cc:
Use my_global.h first
sql/lock.cc:
Use my_global.h first
sql/log.cc:
Use my_global.h first
sql/log_event.cc:
Use my_global.h first
sql/log_event_old.cc:
Use my_global.h first
sql/mf_iocache.cc:
Use my_global.h first
sql/mysql_install_db.cc:
Remove duplicated include files
sql/mysqld.cc:
Remove duplicated include files
sql/net_serv.cc:
Remove duplicated include files
sql/opt_range.cc:
Use my_global.h first
sql/opt_subselect.cc:
Use my_global.h first
sql/opt_sum.cc:
Use my_global.h first
sql/parse_file.cc:
Use my_global.h first
sql/partition_info.cc:
Use my_global.h first
sql/procedure.cc:
Use my_global.h first
sql/protocol.cc:
Use my_global.h first
sql/records.cc:
Use my_global.h first
sql/records.h:
Don't include my_global.h
Better to do this at the upper level
sql/repl_failsafe.cc:
Use my_global.h first
sql/rpl_filter.cc:
Use my_global.h first
sql/rpl_gtid.cc:
Use my_global.h first
sql/rpl_handler.cc:
Use my_global.h first
sql/rpl_injector.cc:
Use my_global.h first
sql/rpl_record.cc:
Use my_global.h first
sql/rpl_record_old.cc:
Use my_global.h first
sql/rpl_reporting.cc:
Use my_global.h first
sql/rpl_rli.cc:
Use my_global.h first
sql/rpl_tblmap.cc:
Use my_global.h first
sql/rpl_utility.cc:
Use my_global.h first
sql/set_var.cc:
Added comment
sql/slave.cc:
Use my_global.h first
sql/sp.cc:
Use my_global.h first
sql/sp_cache.cc:
Use my_global.h first
sql/sp_head.cc:
Use my_global.h first
sql/sp_pcontext.cc:
Use my_global.h first
sql/sp_rcontext.cc:
Use my_global.h first
sql/spatial.cc:
Use my_global.h first
sql/sql_acl.cc:
Use my_global.h first
sql/sql_admin.cc:
Use my_global.h first
sql/sql_analyse.cc:
Use my_global.h first
sql/sql_audit.cc:
Use my_global.h first
sql/sql_base.cc:
Use my_global.h first
sql/sql_binlog.cc:
Use my_global.h first
sql/sql_bootstrap.cc:
Use my_global.h first
Use my_global.h first
sql/sql_cache.cc:
Use my_global.h first
sql/sql_class.cc:
Use my_global.h first
sql/sql_client.cc:
Use my_global.h first
sql/sql_connect.cc:
Use my_global.h first
sql/sql_crypt.cc:
Use my_global.h first
sql/sql_cursor.cc:
Use my_global.h first
sql/sql_db.cc:
Use my_global.h first
sql/sql_delete.cc:
Use my_global.h first
sql/sql_derived.cc:
Use my_global.h first
sql/sql_do.cc:
Use my_global.h first
sql/sql_error.cc:
Use my_global.h first
sql/sql_explain.cc:
Use my_global.h first
sql/sql_expression_cache.cc:
Use my_global.h first
sql/sql_handler.cc:
Use my_global.h first
sql/sql_help.cc:
Use my_global.h first
sql/sql_insert.cc:
Use my_global.h first
sql/sql_lex.cc:
Use my_global.h first
sql/sql_load.cc:
Use my_global.h first
sql/sql_locale.cc:
Use my_global.h first
sql/sql_manager.cc:
Use my_global.h first
sql/sql_parse.cc:
Use my_global.h first
sql/sql_partition.cc:
Use my_global.h first
sql/sql_plugin.cc:
Added comment
sql/sql_prepare.cc:
Use my_global.h first
sql/sql_priv.h:
Added error if we use this before including my_global.h
This check is here becasue so many files includes sql_priv.h first.
sql/sql_profile.cc:
Use my_global.h first
sql/sql_reload.cc:
Use my_global.h first
sql/sql_rename.cc:
Use my_global.h first
sql/sql_repl.cc:
Use my_global.h first
sql/sql_select.cc:
Use my_global.h first
sql/sql_servers.cc:
Use my_global.h first
sql/sql_show.cc:
Added comment
sql/sql_signal.cc:
Use my_global.h first
sql/sql_statistics.cc:
Use my_global.h first
sql/sql_table.cc:
Use my_global.h first
sql/sql_tablespace.cc:
Use my_global.h first
sql/sql_test.cc:
Use my_global.h first
sql/sql_time.cc:
Use my_global.h first
sql/sql_trigger.cc:
Use my_global.h first
sql/sql_udf.cc:
Use my_global.h first
sql/sql_union.cc:
Use my_global.h first
sql/sql_update.cc:
Use my_global.h first
sql/sql_view.cc:
Use my_global.h first
sql/sys_vars.cc:
Added comment
sql/table.cc:
Use my_global.h first
sql/thr_malloc.cc:
Use my_global.h first
sql/transaction.cc:
Use my_global.h first
sql/uniques.cc:
Use my_global.h first
sql/unireg.cc:
Use my_global.h first
sql/unireg.h:
Removed inclusion of my_global.h
storage/archive/ha_archive.cc:
Added comment
storage/blackhole/ha_blackhole.cc:
Use my_global.h first
storage/csv/ha_tina.cc:
Use my_global.h first
storage/csv/transparent_file.cc:
Use my_global.h first
storage/federated/ha_federated.cc:
Use my_global.h first
storage/federatedx/federatedx_io.cc:
Use my_global.h first
storage/federatedx/federatedx_io_mysql.cc:
Use my_global.h first
storage/federatedx/federatedx_io_null.cc:
Use my_global.h first
storage/federatedx/federatedx_txn.cc:
Use my_global.h first
storage/heap/ha_heap.cc:
Use my_global.h first
storage/innobase/handler/handler0alter.cc:
Use my_global.h first
storage/maria/ha_maria.cc:
Use my_global.h first
storage/maria/unittest/ma_maria_log_cleanup.c:
Remove duplicated include files
storage/maria/unittest/test_file.c:
Added comment
storage/myisam/ha_myisam.cc:
Move sql_plugin.h first as this includes my_global.h
storage/myisammrg/ha_myisammrg.cc:
Use my_global.h first
storage/oqgraph/oqgraph_thunk.cc:
Use my_config.h and my_global.h first
One could not include my_global.h before oqgraph_thunk.h (don't know why)
storage/spider/ha_spider.cc:
Use my_global.h first
storage/spider/hs_client/config.cpp:
Use my_global.h first
storage/spider/hs_client/escape.cpp:
Use my_global.h first
storage/spider/hs_client/fatal.cpp:
Use my_global.h first
storage/spider/hs_client/hstcpcli.cpp:
Use my_global.h first
storage/spider/hs_client/socket.cpp:
Use my_global.h first
storage/spider/hs_client/string_util.cpp:
Use my_global.h first
storage/spider/spd_conn.cc:
Use my_global.h first
storage/spider/spd_copy_tables.cc:
Use my_global.h first
storage/spider/spd_db_conn.cc:
Use my_global.h first
storage/spider/spd_db_handlersocket.cc:
Use my_global.h first
storage/spider/spd_db_mysql.cc:
Use my_global.h first
storage/spider/spd_db_oracle.cc:
Use my_global.h first
storage/spider/spd_direct_sql.cc:
Use my_global.h first
storage/spider/spd_i_s.cc:
Use my_global.h first
storage/spider/spd_malloc.cc:
Use my_global.h first
storage/spider/spd_param.cc:
Use my_global.h first
storage/spider/spd_ping_table.cc:
Use my_global.h first
storage/spider/spd_sys_table.cc:
Use my_global.h first
storage/spider/spd_table.cc:
Use my_global.h first
storage/spider/spd_trx.cc:
Use my_global.h first
storage/xtradb/handler/handler0alter.cc:
Use my_global.h first
storage/xtradb/handler/i_s.cc:
Use my_global.h first
Follow-up patch. The original patch added an extra argument to the
rli->report() function, however it was forgotten to adjust the calls
accordingly in a few places.
This patch updates the remaining calls as needed. In files log_event_old.cc
and rpl_record_old.cc, it just adds NULL, since this is only for old event
formats from ancient master servers, which would not have any GTID information
to add to the error messages in any case.
If replication breaks in GTID mode, it is not trivial to determine the GTID of
the failing event group. This is a problem, as such GTID is needed eg. to
explicitly set @@gtid_slave_pos to skip to after that event group, or to
compare errors on different servers, etc.
Fix by ensuring that relevant slave errors logged to the error log include the
GTID of the event group containing the problem event.
NON-EXISTS RECORDS
Problem:
========
In RBR replication, master deletes a record but the record
don't exist on slave. when slave tries to apply the
Delete_row_log_event from master, it will result in an
assert on slave.
Analysis:
========
This problem exists not only with Delete_rows event but also
with Update_rows event as well. Trying to update a non
existing row on the slave from the master will cause the
same assert. This assert occurs only for the tables that
doesn't have primary keys and which basically require
sequential scan to be done to locate a record. This bug
occurs only with innodb engine not with myisam.
When update or delete rows is executed on a slave on a table
which doesn't have primary key the updated record is stored
in a buffer named table->record[0] and the same is copied to
table->record[1] so that during sequential scan
table->record[0] can reloaded with fetched data from the
table and compared against table->record[1]. In a special
case where there is no record on the slave side scan will
result in EOF in that case we reinit the scan and we try to
compare record[0] with record[1] which are basically the
same. This comparison is incorrect. Since they both are the
same record_compare() will report that record is found and
we try to go ahead and try to update/delete non existing
row. Ideally if the scan results in EOF means no data found
hence no need to do a record_compare() at all.
Fix:
===
Avoid comparision of records on EOF.
- Added MALLOC_LIBRARY variable to hold name of malloc library
- Back ported valgrind related fixes from jemalloc 3.4.1 to the included jemalloc 3.3.1
- Renamed bitmap_init() and bitmap_free() to my_bitmap_init() and my_bitmap_free() to avoid clash with jemalloc 3.4.1
- Use option --soname-synonyms=somalloc=NON to valgrind when using jemalloc
- Show version related variables in mysqld --help
-- Added SHOW_VALUE_IN_HELP marker
Increased back_log to 150 as the original value was a bit too small
CMakeLists.txt:
Added MALLOC_LIBRARY variable to hold name of malloc library
cmake/jemalloc.cmake:
Added MALLOC_LIBRARY variable to hold name of malloc library
config.h.cmake:
Added MALLOC_LIBRARY variable to hold name of malloc library
extra/jemalloc/ChangeLog:
Updates changelog
extra/jemalloc/include/jemalloc/internal/arena.h:
Backported valgrind fixes from jemalloc 3.4.1
extra/jemalloc/include/jemalloc/internal/jemalloc_internal.h.in:
Backported valgrind fixes from jemalloc 3.4.1
extra/jemalloc/include/jemalloc/internal/private_namespace.h:
Backported valgrind fixes from jemalloc 3.4.1
extra/jemalloc/include/jemalloc/internal/tcache.h:
Backported valgrind fixes from jemalloc 3.4.1
extra/jemalloc/src/arena.c:
Backported valgrind fixes from jemalloc 3.4.1
include/my_bitmap.h:
Renamed bitmap_init() and bitmap_free() to my_bitmap_init() and my_bitmap_free() to avoid clash with jemalloc 3.4.1
mysql-test/mysql-test-run.pl:
Use option --soname-synonyms=somalloc=NON to valgrind when using jemalloc
mysql-test/valgrind.supp:
Supression of memory leak in OpenSuse 12.3
mysys/my_bitmap.c:
Renamed bitmap_init() and bitmap_free() to my_bitmap_init() and my_bitmap_free()
sql/ha_ndbcluster_binlog.cc:
Renames
sql/ha_ndbcluster_cond.h:
Renames
sql/ha_partition.cc:
Renames
sql/handler.cc:
Renames
sql/item_subselect.cc:
Renames
sql/log_event.cc:
Renames
sql/log_event_old.cc:
Renames
sql/mysqld.cc:
Renames
Show version related variables in mysqld --help
sql/opt_range.cc:
Renames
sql/opt_table_elimination.cc:
Renames
sql/partition_info.cc:
Renames
sql/rpl_injector.h:
Renames
sql/set_var.h:
Renames
sql/slave.cc:
Renames
sql/sql_bitmap.h:
Renames
sql/sql_insert.cc:
Renames
sql/sql_lex.h:
Renames
sql/sql_parse.cc:
Renames
sql/sql_partition.cc:
Renames
sql/sql_select.cc:
Renames
sql/sql_show.cc:
Renames
sql/sql_update.cc:
Renames
sql/sys_vars.cc:
Show version related variables in mysqld --help
sql/sys_vars.h:
Added SHOW_VALUE_IN_HELP marker for variables that should be shown in --help
sql/table.cc:
Renames
sql/table.h:
Removed not used bitmap_init_value
storage/connect/ha_connect.cc:
Removed compiler warning
storage/maria/ma_open.c:
Renames
unittest/mysys/bitmap-t.c:
Renames
(and valgrind warnings)
* move thd userstat initialization to the same function
that was adding thd userstat to global counters.
* initialize thd->start_bytes_received in THD::init
(when thd->userstat_running is set)
The merge is still missing a few hunks related to temporary tables and
InnoDB log file size. The associated code did not seem to exist in
10.0, so the merge of that needs more work. Until this is fixed, there
are a number of test failures as a result.