Using a boolean flag for 'there is a RESET MASTER in progress' doesn't
work very well for multiple concurrent RESET MASTER statements.
Changed to a counter.
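A minimal sketch of why a counter works where a flag does not. The class and member names are hypothetical and are not the server's own; the only point is that with two overlapping RESET MASTER statements, the first one to finish must not clear the "in progress" state for the second.

```cpp
#include <mutex>

// Hypothetical tracker, for illustration only.
class ResetMasterTracker {
public:
  // Called when a RESET MASTER starts.
  void begin() {
    std::lock_guard<std::mutex> guard(lock_);
    ++in_progress_;
  }
  // Called when a RESET MASTER finishes.
  void end() {
    std::lock_guard<std::mutex> guard(lock_);
    --in_progress_;
  }
  // True while at least one RESET MASTER is still running.
  bool active() const {
    std::lock_guard<std::mutex> guard(lock_);
    return in_progress_ > 0;
  }
private:
  mutable std::mutex lock_;
  unsigned in_progress_ = 0;  // a bool here would be cleared by the first end()
};
```

With a boolean, the first end() would report "no RESET MASTER in progress" while the second statement is still running.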
The real problem here was inconsistent handling of entry->commit_errno in
MYSQL_BIN_LOG::write_transaction_or_stmt(). Some return paths were setting it
to the value of errno, some were not. And the setting was redundant anyway,
as it is set consistently by the caller.
Fix by consistently setting it in the caller, and not in each return path in
the function.
The test failure happened because a DBUG_EXECUTE_IF() used in the test case
set an entry->commit_errno that was immediately overwritten in the caller with
whatever happened to be the value of errno. This could lead to a different error
message in the .result file.
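The pattern can be sketched as follows. This is an illustration under assumptions, not the code in MYSQL_BIN_LOG: Entry, write_entry() and write_from_caller() are hypothetical names. The callee only reports success or failure, and the caller is the single place that records errno.

```cpp
#include <cerrno>

// Hypothetical stand-in for the group commit entry.
struct Entry {
  int commit_errno = 0;
};

// Callee: returns true on error and leaves entry.commit_errno untouched.
// simulate_failure is an illustrative knob, not a real server parameter.
bool write_entry(bool simulate_failure) {
  if (simulate_failure) {
    errno = EIO;
    return true;
  }
  return false;
}

// Caller: the only place where commit_errno is assigned.
bool write_from_caller(Entry &entry, bool simulate_failure) {
  if (write_entry(simulate_failure)) {
    entry.commit_errno = errno;
    return true;
  }
  return false;
}
```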
The code in binlog group commit around wait_for_commit that controls commit
order, did the wakeup of subsequent commits early, as soon as a following
transaction is put into the group commit queue, but before any such commit has
actually taken place. This causes problems with too early wakeup of
transactions that need to wait for a prior commit, but do not take part in
the binlog group commit for one reason or another.
This patch solves the problem, by moving the wakeup to happen only after the
binlog group commit is completed.
This requires a new solution to ensure that transactions that arrive later
than the leader are still able to participate in group commit. This patch
introduces a flag wait_for_commit::commit_started. When this is set, a waiter
can queue up itself in the group commit queue.
This way, effectively the wait_for_prior_commit() is skipped only for
transactions that participate in group commit, so that skipping the wait is
safe. Other transactions still wait as needed for correctness.
In parallel replication, the wait_for_commit facility is used to ensure that
events are written into the binlog in the correct order. This is handled in an
optimised way in the binlogging group commit code.
However, some statements, for example GRANT, are written directly into the
binlog, outside of the group commit code. There was a bug that this direct
write does not correctly wait for the prior transactions to have been written
first, which allows, for example, GRANT to be written ahead of earlier transactions.
This patch adds the missing wait_for_prior_commit() before writing directly to
the binlog.
However, the problem is still there, although the race is much less likely to
occur now. The problem is that the optimised group commit code does wakeup of
following transactions early, before the binlog write is actually done. A
woken-up following transaction is then allowed to run ahead and queue up for
the group commit, which will ensure that binlog write happens in correct order
in the end. However, the code for directly written events currently bypasses
this mechanism, so they get woken up and written too early.
This will be fixed properly in a later patch.
The reason for the failure was a bug in an include file on Debian that causes 'struct stat'
to have different sizes depending on the environment.
This patch fixes this so that we always include my_global.h or my_config.h before we include any other files.
Other things:
- Removed #include <my_global.h> in some include files; better to always do this at the top level to have as few
"always-include-this-file-first" files as possible.
- Removed usage of some include files that were already included by my_global.h or by other files.
client/mysql_plugin.c:
Use my_global.h first
client/mysqlslap.c:
Remove duplicated include files
extra/comp_err.c:
Remove duplicated include files
include/m_string.h:
Remove duplicated include files
include/maria.h:
Remove duplicated include files
libmysqld/emb_qcache.cc:
Use my_global.h first
plugin/semisync/semisync.h:
Use my_pthread.h first
sql/datadict.cc:
Use my_global.h first
sql/debug_sync.cc:
Use my_global.h first
sql/derror.cc:
Use my_global.h first
sql/des_key_file.cc:
Use my_global.h first
sql/discover.cc:
Use my_global.h first
sql/event_data_objects.cc:
Use my_global.h first
sql/event_db_repository.cc:
Use my_global.h first
sql/event_parse_data.cc:
Use my_global.h first
sql/event_queue.cc:
Use my_global.h first
sql/event_scheduler.cc:
Use my_global.h first
sql/events.cc:
Use my_global.h first
sql/field.cc:
Use my_global.h first
Remove duplicated include files
sql/field_conv.cc:
Use my_global.h first
sql/filesort.cc:
Use my_global.h first
Remove duplicated include files
sql/gstream.cc:
Use my_global.h first
sql/ha_ndbcluster.cc:
Use my_global.h first
sql/ha_ndbcluster_binlog.cc:
Use my_global.h first
sql/ha_ndbcluster_cond.cc:
Use my_global.h first
sql/ha_partition.cc:
Use my_global.h first
sql/handler.cc:
Use my_global.h first
sql/hash_filo.cc:
Use my_global.h first
sql/hostname.cc:
Use my_global.h first
sql/init.cc:
Use my_global.h first
sql/item.cc:
Use my_global.h first
sql/item_buff.cc:
Use my_global.h first
sql/item_cmpfunc.cc:
Use my_global.h first
sql/item_create.cc:
Use my_global.h first
sql/item_geofunc.cc:
Use my_global.h first
sql/item_inetfunc.cc:
Use my_global.h first
sql/item_row.cc:
Use my_global.h first
sql/item_strfunc.cc:
Use my_global.h first
sql/item_subselect.cc:
Use my_global.h first
sql/item_sum.cc:
Use my_global.h first
sql/item_timefunc.cc:
Use my_global.h first
sql/item_xmlfunc.cc:
Use my_global.h first
sql/key.cc:
Use my_global.h first
sql/lock.cc:
Use my_global.h first
sql/log.cc:
Use my_global.h first
sql/log_event.cc:
Use my_global.h first
sql/log_event_old.cc:
Use my_global.h first
sql/mf_iocache.cc:
Use my_global.h first
sql/mysql_install_db.cc:
Remove duplicated include files
sql/mysqld.cc:
Remove duplicated include files
sql/net_serv.cc:
Remove duplicated include files
sql/opt_range.cc:
Use my_global.h first
sql/opt_subselect.cc:
Use my_global.h first
sql/opt_sum.cc:
Use my_global.h first
sql/parse_file.cc:
Use my_global.h first
sql/partition_info.cc:
Use my_global.h first
sql/procedure.cc:
Use my_global.h first
sql/protocol.cc:
Use my_global.h first
sql/records.cc:
Use my_global.h first
sql/records.h:
Don't include my_global.h
Better to do this at the upper level
sql/repl_failsafe.cc:
Use my_global.h first
sql/rpl_filter.cc:
Use my_global.h first
sql/rpl_gtid.cc:
Use my_global.h first
sql/rpl_handler.cc:
Use my_global.h first
sql/rpl_injector.cc:
Use my_global.h first
sql/rpl_record.cc:
Use my_global.h first
sql/rpl_record_old.cc:
Use my_global.h first
sql/rpl_reporting.cc:
Use my_global.h first
sql/rpl_rli.cc:
Use my_global.h first
sql/rpl_tblmap.cc:
Use my_global.h first
sql/rpl_utility.cc:
Use my_global.h first
sql/set_var.cc:
Added comment
sql/slave.cc:
Use my_global.h first
sql/sp.cc:
Use my_global.h first
sql/sp_cache.cc:
Use my_global.h first
sql/sp_head.cc:
Use my_global.h first
sql/sp_pcontext.cc:
Use my_global.h first
sql/sp_rcontext.cc:
Use my_global.h first
sql/spatial.cc:
Use my_global.h first
sql/sql_acl.cc:
Use my_global.h first
sql/sql_admin.cc:
Use my_global.h first
sql/sql_analyse.cc:
Use my_global.h first
sql/sql_audit.cc:
Use my_global.h first
sql/sql_base.cc:
Use my_global.h first
sql/sql_binlog.cc:
Use my_global.h first
sql/sql_bootstrap.cc:
Use my_global.h first
sql/sql_cache.cc:
Use my_global.h first
sql/sql_class.cc:
Use my_global.h first
sql/sql_client.cc:
Use my_global.h first
sql/sql_connect.cc:
Use my_global.h first
sql/sql_crypt.cc:
Use my_global.h first
sql/sql_cursor.cc:
Use my_global.h first
sql/sql_db.cc:
Use my_global.h first
sql/sql_delete.cc:
Use my_global.h first
sql/sql_derived.cc:
Use my_global.h first
sql/sql_do.cc:
Use my_global.h first
sql/sql_error.cc:
Use my_global.h first
sql/sql_explain.cc:
Use my_global.h first
sql/sql_expression_cache.cc:
Use my_global.h first
sql/sql_handler.cc:
Use my_global.h first
sql/sql_help.cc:
Use my_global.h first
sql/sql_insert.cc:
Use my_global.h first
sql/sql_lex.cc:
Use my_global.h first
sql/sql_load.cc:
Use my_global.h first
sql/sql_locale.cc:
Use my_global.h first
sql/sql_manager.cc:
Use my_global.h first
sql/sql_parse.cc:
Use my_global.h first
sql/sql_partition.cc:
Use my_global.h first
sql/sql_plugin.cc:
Added comment
sql/sql_prepare.cc:
Use my_global.h first
sql/sql_priv.h:
Added error if we use this before including my_global.h
This check is here because so many files include sql_priv.h first.
sql/sql_profile.cc:
Use my_global.h first
sql/sql_reload.cc:
Use my_global.h first
sql/sql_rename.cc:
Use my_global.h first
sql/sql_repl.cc:
Use my_global.h first
sql/sql_select.cc:
Use my_global.h first
sql/sql_servers.cc:
Use my_global.h first
sql/sql_show.cc:
Added comment
sql/sql_signal.cc:
Use my_global.h first
sql/sql_statistics.cc:
Use my_global.h first
sql/sql_table.cc:
Use my_global.h first
sql/sql_tablespace.cc:
Use my_global.h first
sql/sql_test.cc:
Use my_global.h first
sql/sql_time.cc:
Use my_global.h first
sql/sql_trigger.cc:
Use my_global.h first
sql/sql_udf.cc:
Use my_global.h first
sql/sql_union.cc:
Use my_global.h first
sql/sql_update.cc:
Use my_global.h first
sql/sql_view.cc:
Use my_global.h first
sql/sys_vars.cc:
Added comment
sql/table.cc:
Use my_global.h first
sql/thr_malloc.cc:
Use my_global.h first
sql/transaction.cc:
Use my_global.h first
sql/uniques.cc:
Use my_global.h first
sql/unireg.cc:
Use my_global.h first
sql/unireg.h:
Removed inclusion of my_global.h
storage/archive/ha_archive.cc:
Added comment
storage/blackhole/ha_blackhole.cc:
Use my_global.h first
storage/csv/ha_tina.cc:
Use my_global.h first
storage/csv/transparent_file.cc:
Use my_global.h first
storage/federated/ha_federated.cc:
Use my_global.h first
storage/federatedx/federatedx_io.cc:
Use my_global.h first
storage/federatedx/federatedx_io_mysql.cc:
Use my_global.h first
storage/federatedx/federatedx_io_null.cc:
Use my_global.h first
storage/federatedx/federatedx_txn.cc:
Use my_global.h first
storage/heap/ha_heap.cc:
Use my_global.h first
storage/innobase/handler/handler0alter.cc:
Use my_global.h first
storage/maria/ha_maria.cc:
Use my_global.h first
storage/maria/unittest/ma_maria_log_cleanup.c:
Remove duplicated include files
storage/maria/unittest/test_file.c:
Added comment
storage/myisam/ha_myisam.cc:
Move sql_plugin.h first as this includes my_global.h
storage/myisammrg/ha_myisammrg.cc:
Use my_global.h first
storage/oqgraph/oqgraph_thunk.cc:
Use my_config.h and my_global.h first
One could not include my_global.h before oqgraph_thunk.h (don't know why)
storage/spider/ha_spider.cc:
Use my_global.h first
storage/spider/hs_client/config.cpp:
Use my_global.h first
storage/spider/hs_client/escape.cpp:
Use my_global.h first
storage/spider/hs_client/fatal.cpp:
Use my_global.h first
storage/spider/hs_client/hstcpcli.cpp:
Use my_global.h first
storage/spider/hs_client/socket.cpp:
Use my_global.h first
storage/spider/hs_client/string_util.cpp:
Use my_global.h first
storage/spider/spd_conn.cc:
Use my_global.h first
storage/spider/spd_copy_tables.cc:
Use my_global.h first
storage/spider/spd_db_conn.cc:
Use my_global.h first
storage/spider/spd_db_handlersocket.cc:
Use my_global.h first
storage/spider/spd_db_mysql.cc:
Use my_global.h first
storage/spider/spd_db_oracle.cc:
Use my_global.h first
storage/spider/spd_direct_sql.cc:
Use my_global.h first
storage/spider/spd_i_s.cc:
Use my_global.h first
storage/spider/spd_malloc.cc:
Use my_global.h first
storage/spider/spd_param.cc:
Use my_global.h first
storage/spider/spd_ping_table.cc:
Use my_global.h first
storage/spider/spd_sys_table.cc:
Use my_global.h first
storage/spider/spd_table.cc:
Use my_global.h first
storage/spider/spd_trx.cc:
Use my_global.h first
storage/xtradb/handler/handler0alter.cc:
Use my_global.h first
storage/xtradb/handler/i_s.cc:
Use my_global.h first
This was missing in my last commit for fixing possible lockups in SHOW STATUS.
sql/log.cc:
Fixed comment
sql/sql_show.cc:
Use LOCK_show_status when we add things to all_status_vars
sql/sql_test.cc:
Remove unneeded mutex_lock
If the slave gets a reconnect in the middle of a GTID event group, normally
it will re-fetch that event group, skipping the first part that was already
queued for the SQL thread.
However, if the master crashed while writing the event group, the group is
incomplete. This patch detects this case and makes sure that the
transaction is rolled back and nothing is skipped from any following
event groups.
Similarly, a network proxy might cause the reconnect to end up on a
different master server. Detect this by noticing a different server_id,
and similarly in this case roll back the partially received group.
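The server_id check described above might look roughly like this. All names here are hypothetical; the sketch models only the "reconnected to a different master" case, since the commit message does not say how an incomplete group from a crashed master is detected.

```cpp
#include <cstdint>

enum class ReconnectAction {
  NothingPending,       // no partially received event group
  SkipAlreadyQueued,    // same master: re-fetch and skip the queued part
  RollbackPartialGroup  // different master: discard the partial group
};

// Hypothetical record of a partially received GTID event group.
struct PartialGroup {
  uint32_t server_id;  // server_id of the master that sent the first part
  bool in_progress;    // true if the group was cut off by the reconnect
};

// Decide what to do when the first event arrives after a reconnect.
ReconnectAction on_reconnect(const PartialGroup &group,
                             uint32_t first_event_server_id) {
  if (!group.in_progress)
    return ReconnectAction::NothingPending;
  if (first_event_server_id != group.server_id)
    return ReconnectAction::RollbackPartialGroup;
  return ReconnectAction::SkipAlreadyQueued;
}
```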
GOES AWAY, MYSQL QUITS WORKING.
Analysis:
-----------------
The issue in this bug and in bug 11907705 is that a socket file or
FIFO file is set for the general log on the command line when starting
the server. Currently, only a regular file can be set for the
general log. Instead of reporting an error, the server opens the provided
file for writing and continues. This causes the issues
mentioned in the bug reports.
As mentioned, these issues are seen only when a non-regular file is set for
the general log on the command line when starting the server. If the general
log file is set to a non-regular file from the CLI using the system variable
general_log_file, an error is reported.
The same issues can also occur with the slow query log file, if it is
set to a non-regular file.
Fix:
-----------------
Currently, if we fail to open the log file while starting the server,
we report an error, disable logging to file and continue.
To fix the reported issue, the code is modified to check whether the file
is a regular file before opening it. If the file is not a
regular file, an error is written to the error log and logging to
file is disabled.
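A check of this kind can be sketched with POSIX stat(). The helper name usable_as_log_file() is hypothetical and this is not the server's implementation; treating a missing file as acceptable (because it would be created as a regular file) is an assumption of the sketch.

```cpp
#include <cerrno>
#include <sys/stat.h>

// Returns true if `path` may be used as a log file: either it is an
// existing regular file, or it does not exist yet (assumption: it will
// then be created as a regular file). Sockets, FIFOs, directories and
// devices are rejected.
bool usable_as_log_file(const char *path) {
  struct stat st;
  if (stat(path, &st) != 0)
    return errno == ENOENT;
  return S_ISREG(st.st_mode);
}
```

A caller would log an error and disable logging to file when this returns false, instead of opening the path.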
Merge the patches into MariaDB 10.0 main.
With this patch, parallel replication will now automatically retry a
transaction that fails due to deadlock or other temporary error, same as
single-threaded replication.
We catch deadlocks with InnoDB transactions due to enforced commit order. If
T1 must commit before T2 in parallel replication and T1 ends up waiting for T2
inside InnoDB, we kill T2 and retry it later to resolve the deadlock
automatically.
After-review changes.
For this patch in 10.0, we do not introduce a new public storage engine API,
we just fix the InnoDB/XtraDB issues. In 10.1, we will make a better public
API that can be used for all storage engines (MDEV-6429).
Eliminate the background thread that did deadlock kills asynchronously.
Instead, we ensure that the InnoDB/XtraDB code can handle doing the kill from
inside the deadlock detection code (when thd_report_wait_for() needs to kill a
later thread to resolve a deadlock).
(We preserve the part of the original patch that introduces dedicated mutex
and condition for the slave init thread, to remove the abuse of
LOCK_thread_count for start/stop synchronisation of the slave init thread).
- Make log_slow_verbosity print "Priority_queue: (Yes|No)" into the slow query log.
(but we do not add a corresponding column to P_S.*statement* tables).
This is MySQL Bug#59123. The message string stored in an INCIDENT event was
not zero-terminated. This caused any following checksum bytes (if enabled on
the master) to be output to the error log as trailing garbage when the message
was printed to the error log.
Backport the patch from MySQL 5.6:
revno: 2876.228.200
revision-id: zhenxing.he@sun.com-20110111051323-w2xnzvcjn46x6h6u
committer: He Zhenxing <zhenxing.he@sun.com>
timestamp: Tue 2011-01-11 13:13:23 +0800
message:
BUG#59123 rpl_stm_binlog_max_cache_size fails sporadically with found warnings
Also add a test case.
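The class of bug can be illustrated as follows. This is a sketch under assumptions, not the Incident event code: copy_message() is a hypothetical helper showing that a length-delimited message must get an explicit terminator before being printed as a C string, otherwise the bytes that follow it in the buffer (here, stand-ins for checksum bytes) are printed too.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Copy `len` message bytes out of an event buffer and zero-terminate
// the copy, so that it can safely be passed to printf("%s")-style code.
std::vector<char> copy_message(const char *buf, std::size_t len) {
  std::vector<char> msg(len + 1);
  if (len > 0)
    std::memcpy(msg.data(), buf, len);
  msg[len] = '\0';
  return msg;
}
```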
server initialization
ER() macro was used during server initialization. It refers to
current_thd, which is not available that early.
Print error to error log in "lc-messages" locale.
Avoid duplicate error message during server initialization.
replication causing replication to fail.
Remove the temporary fix for MDEV-5914, which used READ COMMITTED for parallel
replication worker threads. Replace it with a better, more selective solution.
The issue is with certain edge cases of InnoDB gap locks, for example between
INSERT and ranged DELETE. It is possible for the gap lock set by the DELETE to
block the INSERT, if the DELETE runs first, while the record lock set by
INSERT does not block the DELETE, if the INSERT runs first. This can cause a
conflict between the two in parallel replication on the slave even though they
ran without conflicts on the master.
With this patch, InnoDB will ask the server layer about the two involved
transactions before blocking on a gap lock. If the server layer tells InnoDB
that the transactions are already fixed wrt. commit order, as they are in
parallel replication, InnoDB will ignore the gap lock and allow the two
transactions to proceed in parallel, avoiding the conflict.
Improve the fix for MDEV-6020. When InnoDB itself detects a deadlock, it now
asks the server layer for any preferences about which transaction to roll
back. In case of parallel replication with two transactions T1 and T2 fixed to
commit T1 before T2, the server layer will ask InnoDB to roll back T2 as the
deadlock victim, not T1. This helps in some cases to avoid excessive deadlock
rollback, as T2 will in any case need to wait for T1 to complete before it can
itself commit.
Also some misc. fixes found during development and testing:
- Remove thd_rpl_is_parallel(), it is not used or needed.
- Use KILL_CONNECTION instead of KILL_QUERY when a parallel replication
worker thread is killed to resolve a deadlock with fixed commit
ordering. There are some cases, eg. in sql/sql_parse.cc, where a KILL_QUERY
can be ignored if the query otherwise completed successfully, and this
could cause the deadlock kill to be lost, so that the deadlock was not
correctly resolved.
- Fix random test failure due to missing wait_for_binlog_checkpoint.inc.
- Make sure that deadlock or other temporary errors during parallel
replication are not printed to the error log; there were some places
around the replication code with extra error logging. These conditions can
occur occasionally and are handled automatically without breaking
replication, so they should not pollute the error log.
- Fix handling of rgi->gtid_sub_id. We need to be able to access this also at
the end of a transaction, to be able to detect and resolve deadlocks due to
commit ordering. But this value was also used as a flag to mark whether
record_gtid() had been called, by being set to zero, losing the value. Now,
introduce a separate flag rgi->gtid_pending, so rgi->gtid_sub_id remains
valid for the entire duration of the transaction.
- Fix one place where the code to handle ignored errors called reset_killed()
unconditionally, even if no error was caught that should be ignored. This
could cause loss of a deadlock kill signal, breaking deadlock detection and
resolution.
- Fix a couple of missing mysql_reset_thd_for_next_command(). This could
cause a prior error condition to remain for the next event executed,
causing assertions about errors already being set and possibly giving
incorrect error handling for following event executions.
- Fix code that cleared thd->rgi_slave in the parallel replication worker
threads after each event execution; this caused the deadlock detection and
handling code to not be able to correctly process the associated
transactions as belonging to replication worker threads.
- Remove useless error code in slave_background_kill_request().
- Fix bug where wfc->wakeup_error was not cleared at
wait_for_commit::unregister_wait_for_prior_commit(). This could cause the
error condition to wrongly propagate to a later wait_for_prior_commit(),
causing spurious ER_PRIOR_COMMIT_FAILED errors.
- Do not put the binlog background thread into the processlist. It causes
too many result differences in mtr, but also it probably is not useful
for users to pollute the process list with a system thread that does not
really perform any user-visible tasks...
replication causing replication to fail.
In parallel replication, we run transactions from the master in parallel, but
force them to commit in the same order they did on the master. If we force T1
to commit before T2, but T2 holds eg. a row lock that is needed by T1, we get
a deadlock when T2 waits until T1 has committed.
Usually, we do not run T1 and T2 in parallel if there is a chance that they
can have conflicting locks like this, but there are certain edge cases where
it can occasionally happen (eg. MDEV-5914, MDEV-5941, MDEV-6020). The bug was
that this would cause replication to hang, eventually getting a lock timeout
and causing the slave to stop with error.
With this patch, InnoDB will report back to the upper layer whenever a
transaction T1 is about to do a lock wait on T2. If T1 and T2 are parallel
replication transactions, and T2 needs to commit later than T1, we can thus
detect the deadlock; we then kill T2, setting a flag that causes it to catch
the kill and convert it to a deadlock error; this error will then cause T2 to
roll back and release its locks (so that T1 can commit), and later T2 will be
re-tried and eventually also committed.
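The decision described above can be sketched as a small predicate. The struct and its fields are hypothetical and do not reflect the server's data structures; commit_order stands in for whatever the server uses to order the commits of parallel replication transactions.

```cpp
#include <cstdint>

// Hypothetical view of a transaction as seen by the server layer.
struct ReplTrx {
  bool is_parallel_worker;  // runs in a parallel replication worker thread
  uint64_t commit_order;    // lower value must commit first
};

// Called when `waiter` is about to wait for a lock held by `holder`.
// If the holder must commit after the waiter, the holder can never
// commit while the waiter is blocked on it: that is a deadlock, and the
// holder is the transaction to kill and retry.
bool must_kill_holder(const ReplTrx &waiter, const ReplTrx &holder) {
  return waiter.is_parallel_worker && holder.is_parallel_worker &&
         holder.commit_order > waiter.commit_order;
}
```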
The kill happens asynchronously in a slave background thread; this is
necessary, as the reporting from InnoDB about lock waits happens deep inside
the locking code, at a point where it is not possible to directly call
THD::awake() due to mutexes held.
Deadlock is assumed to be (very) rarely occurring, so this patch tries to
minimise the performance impact on the normal case where no deadlocks occur,
rather than optimise the handling of the occasional deadlock.
Also fix transaction retry due to deadlock when it happens after a transaction
already signalled to later transactions that it started to commit. In this
case we need to undo this signalling (and later redo it when we commit again
during retry), so following transactions will not start too early.
Also add a missing thd->send_kill_message() that got triggered during testing
(this corrects an incorrect fix for MySQL Bug#58933).
Handle retry of event groups that span multiple relay log files.
- If retry reaches the end of one relay log file, move on to the next.
- Handle refcounting of relay log files, and avoid purging relay log
files until all event groups have completed that might have needed
them for transaction retry.
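The refcounting idea in the list above can be sketched like this. RelayLogRefs is a hypothetical class, not the server's implementation: each event group that might still need a relay log file for retry holds a reference, and a file may be purged only when no references remain.

```cpp
#include <map>
#include <string>

// Hypothetical reference counter for relay log files.
class RelayLogRefs {
public:
  // An event group starts depending on `file` for a possible retry.
  void acquire(const std::string &file) { ++refs_[file]; }

  // An event group that used `file` has completed.
  void release(const std::string &file) {
    auto it = refs_.find(file);
    if (it != refs_.end() && --it->second == 0)
      refs_.erase(it);
  }

  // A file may be purged only when no event group still needs it.
  bool can_purge(const std::string &file) const {
    return refs_.count(file) == 0;
  }

private:
  std::map<std::string, unsigned> refs_;
};
```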
SHOW PROCESSLIST, SHOW BINLOGS
Problem: A deadlock was occurring when 4 threads were
involved in acquiring locks in the following way
Thread 1: Dump thread (Slave is reconnecting, so on
Master, a new dump thread is trying to kill
zombie dump threads. It acquired the thread's
LOCK_thd_data and it is about to acquire
mysys_var->current_mutex, which is LOCK_log)
Thread 2: Application thread is executing show binlogs and
acquired LOCK_log and it is about to acquire
LOCK_index.
Thread 3: Application thread is executing Purge binary logs
and acquired LOCK_index and it is about to
acquire LOCK_thread_count.
Thread 4: Application thread is executing show processlist
and acquired LOCK_thread_count and it is
about to acquire zombie dump thread's
LOCK_thd_data.
Deadlock Cycle:
Thread 1 -> Thread 2 -> Thread 3 -> Thread 4 -> Thread 1
The same deadlock was observed even when thread 4 is
executing 'SELECT * FROM information_schema.processlist' command and
acquired LOCK_thread_count and it is about to acquire zombie
dump thread's LOCK_thd_data.
Analysis:
There are four locks involved in the deadlock. LOCK_log,
LOCK_thread_count, LOCK_index and LOCK_thd_data.
LOCK_log, LOCK_thread_count, LOCK_index are global mutexes
whereas LOCK_thd_data is local to a thread.
We can divide these four locks in two groups.
Group 1 consists of LOCK_log and LOCK_index and the order
should be LOCK_log followed by LOCK_index.
Group 2 consists of other two mutexes
LOCK_thread_count, LOCK_thd_data and the order should
be LOCK_thread_count followed by LOCK_thd_data.
Unfortunately, there is no specific predefined lock order defined
to follow in the MySQL system when it comes to locks across these
two groups. In the above problematic example,
there is no problem in the way we are acquiring the locks
if you see each thread individually.
But if you combine all 4 threads, they end up in a deadlock.
Fix:
Since everything seems to be fine in the way threads are taking locks,
in this patch we are changing the duration of the locks in Thread 4
to break the deadlock. i.e., before the patch, Thread 4
('show processlist' command) mysqld_list_processes()
function acquires LOCK_thread_count for the complete duration
of the function and it also acquires/releases
each thread's LOCK_thd_data.
LOCK_thread_count is used to protect addition and
deletion of threads in the global threads list. While show
process list is looping through all the existing threads,
it is a problem if a thread exits, but there is no problem
if a new thread is added to the system. Hence a new mutex,
"LOCK_thd_remove", is introduced, which protects deletion
of a thread from the global threads list. All threads which are
exiting should acquire LOCK_thd_remove
followed by LOCK_thread_count. (They should take LOCK_thread_count
as well because other places in the code still assume that thread exit
is protected with LOCK_thread_count. In this fix, we are changing
only the 'show process list' query logic.)
(E.g. the unlink_thd logic will be protected with
LOCK_thd_remove).
The logic of mysqld_list_processes() (or fill_schema_processlist())
will now be protected with 'LOCK_thd_remove' instead of
'LOCK_thread_count'.
Now the new locking order after this patch is:
LOCK_thd_remove -> LOCK_thd_data -> LOCK_log ->
LOCK_index -> LOCK_thread_count
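The locking scheme can be sketched with standard C++ mutexes. This is an illustration only: the mutex names are taken from the description above, but Thd, add_thread(), remove_thread() and list_processes() are hypothetical, and the brief snapshot of the list under LOCK_thread_count is an assumption made to keep the sketch free of data races, not something stated in the patch description.

```cpp
#include <list>
#include <mutex>
#include <vector>

std::mutex LOCK_thd_remove;    // protects removal from the thread list
std::mutex LOCK_thread_count;  // protects the thread list itself

struct Thd {
  explicit Thd(int thread_id) : id(thread_id) {}
  int id;
  std::mutex LOCK_thd_data;    // per-thread lock
};

std::list<Thd *> threads;

void add_thread(Thd *thd) {
  std::lock_guard<std::mutex> count(LOCK_thread_count);
  threads.push_back(thd);
}

// Exiting threads take LOCK_thd_remove first, then LOCK_thread_count.
void remove_thread(Thd *thd) {
  std::lock_guard<std::mutex> remove(LOCK_thd_remove);
  std::lock_guard<std::mutex> count(LOCK_thread_count);
  threads.remove(thd);
}

// The scan holds LOCK_thd_remove, so no thread can disappear under it.
// LOCK_thread_count is not held while taking each LOCK_thd_data.
std::vector<int> list_processes() {
  std::lock_guard<std::mutex> remove(LOCK_thd_remove);
  std::vector<Thd *> snapshot;
  {
    std::lock_guard<std::mutex> count(LOCK_thread_count);
    snapshot.assign(threads.begin(), threads.end());
  }
  std::vector<int> ids;
  for (Thd *thd : snapshot) {
    std::lock_guard<std::mutex> data(thd->LOCK_thd_data);
    ids.push_back(thd->id);
  }
  return ids;
}
```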
Fixed use-copy option to mysql-test-run
mysql-test/mysql-test-run.pl:
Fixed use-copy and added comment
sql/log.cc:
Make copy_up_file_and_fill() safe for disk full
mysql-test/r/create_or_replace.result:
More tests for create or replace
mysql-test/t/create_or_replace.test:
More tests for create or replace
sql/log.cc:
Don't use binlog_hton if binlog is not enabled
sql/sql_base.cc:
We have to call restart_trans_for_tables also if tables were not locked with LOCK TABLES.
If not, we will get a crash in TokuDB
sql/sql_insert.cc:
Don't call binlog_reset_cache() if we don't have binary log open
sql/sql_table.cc:
Don't log to binary log if not open
Better test if we were using create or replace ... select
storage/tokudb/mysql-test/tokudb_mariadb/r/create_or_replace.result:
More tests for create or replace
storage/tokudb/mysql-test/tokudb_mariadb/t/create_or_replace.test:
More tests for create or replace
Bug #3329 Incomplete lower_case_table_names=2 implementation
The problem was that check_db_name() converted database names to lower case also in case of lower_case_table_names=2.
Fixed by removing the conversion in check_db_name for lower_case_table_names = 2 and instead converting db name to
lower case at same places as table names are converted.
Fixed bug that SHOW CREATE DATABASE FOO showed information for database 'foo'.
I also removed some checks of lower_case_table_names when it was enough to use table_alias_charset.
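The intent can be sketched as keeping two forms of the database name. DbName and make_db_name() are hypothetical, and the ASCII-only std::tolower() stands in for the server's charset-aware conversion: with lower_case_table_names=2 the name as given is kept for display, while a lower-cased form is used for comparisons.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical pair of name forms for one database.
struct DbName {
  std::string as_given;    // what is shown back to the user
  std::string for_lookup;  // what is used when comparing names
};

DbName make_db_name(const std::string &name, int lower_case_table_names) {
  DbName db{name, name};
  if (lower_case_table_names != 0) {
    std::transform(db.for_lookup.begin(), db.for_lookup.end(),
                   db.for_lookup.begin(), [](unsigned char c) {
                     return static_cast<char>(std::tolower(c));
                   });
  }
  // With lower_case_table_names=1 the lower-cased form is also the
  // stored and displayed form; with 2 the original case is preserved.
  if (lower_case_table_names == 1)
    db.as_given = db.for_lookup;
  return db;
}
```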
mysql-test/mysql-test-run.pl:
Added --use-copy argument to force mysql-test-run to copy files instead of doing symlinks. This is needed when you run
with test directory on another file system
mysql-test/r/lowercase_table.result:
Updated results
mysql-test/r/lowercase_table2.result:
Updated results
mysql-test/suite/parts/r/partition_mgm_lc2_innodb.result:
Updated results
mysql-test/suite/parts/r/partition_mgm_lc2_memory.result:
Updated results
mysql-test/suite/parts/r/partition_mgm_lc2_myisam.result:
Updated results
mysql-test/t/lowercase_table.test:
Added tests with mixed case databases
mysql-test/t/lowercase_table2.test:
Added tests with mixed case databases
sql/log.cc:
Don't check lower_case_table_names when we can use table_alias_charset
sql/sql_base.cc:
Don't check lower_case_table_names when we can use table_alias_charset
sql/sql_db.cc:
Use cmp_db_names() for checking if current database changed.
mysql_rm_db() now converts db to lower case if lower_case_table_names was used.
Changed database options cache to use table_alias_charset. This fixed a bug where SHOW CREATE DATABASE showed wrong information.
sql/sql_parse.cc:
Change also db name to lower case when file names are changed.
Don't need to store a copy of the database name anymore when lower_case_table_names == 2 as check_db_name() doesn't convert in this case.
Updated arguments to mysqld_show_create_db().
When adding table to TABLE_LIST also convert db name to lower case if needed (same way as we do with table names).
sql/sql_show.cc:
mysqld_show_create_db() now also takes original name as argument for output to user.
sql/sql_show.h:
Updated prototype for mysqld_show_create_db()
sql/sql_table.cc:
In mysql_rename_table(), do same conversions to database name as we do for the file name
Now if CREATE OR REPLACE fails but we have deleted a table already, we will generate a DROP TABLE in the binary log.
This fixes this issue.
In addition, for a failing CREATE OR REPLACE TABLE ... SELECT we don't generate a log of all the inserted rows, only the DROP TABLE.
I added code for not logging DROP TEMPORARY TABLE for tables where the CREATE TABLE was not logged. This code will be activated in 10.1
by removing the code protected by DONT_LOG_DROP_OF_TEMPORARY_TABLES.
mysql-test/suite/rpl/r/create_or_replace_mix.result:
More test cases
mysql-test/suite/rpl/r/create_or_replace_row.result:
More test cases
mysql-test/suite/rpl/r/create_or_replace_statement.result:
More test cases
mysql-test/suite/rpl/t/create_or_replace.inc:
More test cases
sql/log.cc:
Added binlog_reset_cache() to clear the binary log.
sql/log.h:
Added prototype
sql/sql_insert.cc:
If CREATE OR REPLACE TABLE ... SELECT fails:
- Don't log anything if nothing changed
- If table was deleted, log a DROP TABLE.
Remember whether table creation of temporary tables was logged.
sql/sql_table.cc:
Added log_drop_table()
Remember whether table creation of temporary tables was logged.
If CREATE OR REPLACE TABLE ... SELECT fails and a table was deleted, log a DROP TABLE.
sql/sql_table.h:
Added prototype
sql/sql_truncate.cc:
Remember whether table creation of temporary tables was logged.
sql/table.h:
Added table_creation_was_logged
Clean up and improve the parallel implementation code, mainly related to
scheduling of work to threads and handling of stop and errors.
Fix a lot of bugs in various corner cases that could lead to crashes or
corruption.
Fix that a single replication domain could easily grab all worker threads and
stall all other domains; now a configuration variable
--slave-domain-parallel-threads allows limiting the number of
workers.
Allow next event group to start as soon as previous group begins the commit
phase (as opposed to when it ends it); this allows multiple event groups on
the slave to participate in group commit, even when no other opportunities for
parallelism are available.
Various fixes:
- Fix some races in the rpl.rpl_parallel test case.
- Fix an old incorrect assertion in Log_event iocache read.
- Fix repeated malloc/free of wait_for_commit and rpl_group_info objects.
- Simplify wait_for_commit wakeup logic.
- Fix one case in queue_for_group_commit() where killing one thread would
fail to correctly signal the error to the next, causing loss of the
transaction after slave restart.
- Fix leaking of pthreads (and their allocated stack) due to missing
PTHREAD_CREATE_DETACHED attribute.
- Fix how one batch of group-committed transactions wait for the previous
batch before starting to execute themselves. The old code had a very
complex scheduling where the first transaction was handled differently,
with subtle bugs in corner cases. Now each event group is always scheduled
for a new worker (in a round-robin fashion amongst available workers).
Keep a count of how many transactions have started to commit, and wait for
that counter to reach the appropriate value.
- Fix slave stop to wait for all workers to actually complete processing;
before, the wait was for update of last_committed_sub_id, which happens a
bit earlier, and could leave worker threads potentially accessing bits of
the replication state that is no longer valid after slave stop.
- Fix a couple of places where the test suite would kill a thread waiting
inside enter_cond() in connection with debug_sync; debug_sync + kill can
crash in rare cases due to a race with mysys_var_current_mutex in this
case.
- Fix some corner cases where we had enter_cond() but no exit_cond().
- Fix that we could get failure in wait_for_prior_commit() but forget to flag
the error with my_error().
- Fix slave stop (both for normal stop and stop due to error). Now, at stop
we pick a specific safe point (in terms of event groups executed) and make
sure that all event groups before that point are executed to completion,
and that no event group after that point starts executing; this ensures a safe place to
restart replication, even for non-transactional stuff/DDL. In error stop,
make sure that all prior event groups are allowed to execute to completion,
and that any later event groups that have started are rolled back, if
possible. The old code could leave e.g. T1 and T3 committed but T2 not, or
it could even leave half a transaction not rolled back in some random
worker, which would cause big problems when that worker was later reused
after slave restart.
- Fix the accounting of amount of events queued for one worker. Before, the
amount was reduced immediately as soon as the events were dequeued (which
happens all at once); this allowed twice the amount of events to be queued
in memory for each single worker, which is not what users would expect.
- Fix that an error set during execution of one event was sometimes not
cleared before executing the next, causing problems with the error
reporting.
- Fix incorrect handling of thd->killed in worker threads.
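The "count of how many transactions have started to commit" item above can be
pictured with the following minimal sketch. It is not the server code: the
class CommitStartCounter and its method names are hypothetical, and it only
shows the idea of a later batch waiting until a counter reaches a target.

```cpp
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical model (not the server's class): counts how many transactions
// have entered the commit phase so that a later batch can wait for it.
class CommitStartCounter {
public:
  // Called by a worker when its transaction starts to commit.
  void mark_commit_started() {
    {
      std::lock_guard<std::mutex> guard(mutex_);
      ++started_;
    }
    cond_.notify_all();
  }

  // Block until at least `target` transactions have started to commit.
  void wait_until_started(std::uint64_t target) {
    std::unique_lock<std::mutex> lock(mutex_);
    cond_.wait(lock, [this, target] { return started_ >= target; });
  }

  std::uint64_t started() const {
    std::lock_guard<std::mutex> guard(mutex_);
    return started_;
  }

private:
  mutable std::mutex mutex_;
  std::condition_variable cond_;
  std::uint64_t started_ = 0;
};
```

A worker for the next batch would call wait_until_started() with the number of
transactions in the previous batch before executing its own event group.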
The problem is a deadlock between MYSQL_BIN_LOG::reset_logs() and
MYSQL_BIN_LOG::mark_xid_done(). The former takes LOCK_log and waits for the
latter to complete. But the latter also tries to take LOCK_log; this can lead
to a deadlock.
There was already code that tries to deal with this, with the flag
reset_master_pending. However, there was still a small opportunity for
deadlock, when a previous mark_xid_done() is still running when reset_logs()
is called and is at the precise point where it first releases LOCK_xid_list
and then re-acquires both LOCK_log and LOCK_xid_list.
Solve by setting reset_master_pending in reset_logs() before taking
LOCK_log. Also count how many invocations of mark_xid_done() are in the
process of releasing and re-acquiring locks, and in reset_logs() wait for that
number to drop to zero after setting reset_master_pending and before taking
LOCK_log.
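The locking protocol described above might be sketched as follows. This is a
simplified model under stated assumptions, not the code in sql/log.cc: the
class BinlogModel, the member mark_xid_done_waiting_ and the checkpoint
counter are hypothetical, and reset_master_pending is modelled as an integer
so that the sketch also tolerates concurrent resets.

```cpp
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for the binlog object. Lock order is always
// LOCK_log first, then LOCK_xid_list.
class BinlogModel {
public:
  // Returns false if the call was skipped because a reset is pending.
  bool mark_xid_done() {
    std::unique_lock<std::mutex> xid(lock_xid_list_);
    if (reset_master_pending_ > 0)
      return false;            // reset_logs() will clean up; skip LOCK_log
    ++mark_xid_done_waiting_;  // we are about to drop and re-take locks
    xid.unlock();              // release LOCK_xid_list ...
    {
      std::lock_guard<std::mutex> log(lock_log_);  // ... take LOCK_log ...
      xid.lock();                                  // ... then LOCK_xid_list
      --mark_xid_done_waiting_;
      ++checkpoints_;
    }
    cond_.notify_all();        // let a waiting reset_logs() re-check
    return true;
  }

  void reset_logs() {
    {
      std::unique_lock<std::mutex> xid(lock_xid_list_);
      ++reset_master_pending_;  // set BEFORE taking LOCK_log
      // Wait for in-flight mark_xid_done() calls to finish re-locking.
      cond_.wait(xid, [this] { return mark_xid_done_waiting_ == 0; });
    }
    std::lock_guard<std::mutex> log(lock_log_);
    std::lock_guard<std::mutex> xid(lock_xid_list_);
    checkpoints_ = 0;
    --reset_master_pending_;
  }

  int checkpoints() {
    std::lock_guard<std::mutex> xid(lock_xid_list_);
    return checkpoints_;
  }

private:
  std::mutex lock_log_;
  std::mutex lock_xid_list_;
  std::condition_variable cond_;
  int reset_master_pending_ = 0;
  int mark_xid_done_waiting_ = 0;
  int checkpoints_ = 0;
};
```

Because reset_logs() does not hold LOCK_log while it waits, an in-flight
mark_xid_done() can always finish, which is the property the fix relies on.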
MASTER_GTID_WAIT() is similar to MASTER_POS_WAIT(), but works with a
GTID position rather than an old-style filename/offset.
@@LAST_GTID gives the GTID assigned to the last transaction written
into the binlog.
Together, the two can be used by applications to obtain the GTID of
an update on the master, and then do a MASTER_GTID_WAIT() for that
position on any read slave where it is important to get results that
are caught up with the master at least to the point of the update.
MASTER_GTID_WAIT() is implemented in a way that tries to minimise the
performance impact on the SQL threads, even in the presence of many
waiters on single GTID positions (as from @@LAST_GTID).
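One way to keep the cost for the applying thread low with many waiters is to
order waiters by the sequence number they wait for, so that only the smallest
pending target needs to be inspected after each commit. The sketch below
illustrates that idea only; DomainWaiters is a hypothetical single-domain,
single-threaded model and is not claimed to match the server's data
structures.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical per-domain waiter queue (illustration only).
class DomainWaiters {
public:
  // Register a waiter for sequence number `wanted`. Returns false if that
  // position is already applied, so no wait is needed.
  bool add_waiter(std::uint64_t wanted) {
    if (wanted <= applied_) return false;
    pending_.push(wanted);
    return true;
  }

  // Called after `seq_no` has been applied; returns the number of waiters
  // woken. When nothing is ready only the heap top is examined.
  std::size_t advance(std::uint64_t seq_no) {
    if (seq_no > applied_) applied_ = seq_no;
    std::size_t woken = 0;
    while (!pending_.empty() && pending_.top() <= applied_) {
      pending_.pop();
      ++woken;
    }
    return woken;
  }

  std::size_t waiting() const { return pending_.size(); }

private:
  std::uint64_t applied_ = 0;
  // Min-heap: the smallest awaited sequence number is always on top.
  std::priority_queue<std::uint64_t, std::vector<std::uint64_t>,
                      std::greater<std::uint64_t>> pending_;
};
```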
- CREATE TABLE is by default executed on the slave as CREATE OR REPLACE
- DROP TABLE is by default executed on the slave as DROP TABLE IF EXISTS
This means that a slave will by default continue even if we try to create
a table that existed on the slave (the table will be deleted and re-created) or
if we try to drop a table that didn't exist on the slave.
This should be safe as instead of having the slave stop because of an inconsistency between
master and slave, it will fix the inconsistency.
Those that would prefer to get a stopped slave instead for the above cases can set slave_ddl_exec_mode to STRICT.
- Ensure that a CREATE OR REPLACE TABLE which dropped a table is replicated
- DROP TABLE that generated an error on master is handled as an identical DROP TABLE on the slave (IF EXISTS is not added in this case)
- Added slave_ddl_exec_mode variable to decide how DDL statements are replicated
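The rewrite rules listed above can be summarised as a small decision function.
This is only an illustration: the enum and function names are hypothetical,
and the non-strict mode is simply called Default here because the text above
names only STRICT.

```cpp
#include <string>

// Hypothetical names; only the STRICT mode is named in the text above.
enum class SlaveDdlExecMode { Strict, Default };
enum class DdlKind { CreateTable, DropTable };

// Returns the statement form the slave would execute for a replicated DDL.
// `master_had_error` is true if the statement failed on the master.
std::string slave_ddl_form(DdlKind kind, SlaveDdlExecMode mode,
                           bool master_had_error) {
  if (kind == DdlKind::CreateTable) {
    return mode == SlaveDdlExecMode::Strict ? "CREATE TABLE"
                                            : "CREATE OR REPLACE TABLE";
  }
  // DROP TABLE: keep it identical in strict mode or when the master's
  // statement produced an error; otherwise tolerate a missing table.
  if (mode == SlaveDdlExecMode::Strict || master_had_error)
    return "DROP TABLE";
  return "DROP TABLE IF EXISTS";
}
```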
New logic for handling BEGIN GTID ... COMMIT from the binary log:
- When we find a BEGIN GTID, we start a transaction and set OPTION_GTID_BEGIN
- When we find COMMIT, we reset OPTION_GTID_BEGIN and execute the normal COMMIT code.
- While OPTION_GTID_BEGIN is set:
- We don't generate implicit commits before or after statements
- All tables are regarded as transactional tables in the binary log (to ensure things are executed exactly as on the master)
- We reset OPTION_GTID_BEGIN also on rollback
This will help ensure that we don't get any sporadic commits (and thus new GTIDs) on the slave and will help keep the GTIDs between master and slave in sync.
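As a rough illustration of the BEGIN GTID ... COMMIT rules above, the flag can
be modelled as a tiny state machine. SessionModel and its methods are
hypothetical; the real flag is a bit in the session's option bits, which this
sketch reduces to a bool.

```cpp
// Hypothetical model of a session replaying a BEGIN GTID ... COMMIT group.
struct SessionModel {
  bool gtid_begin = false;  // stands in for OPTION_GTID_BEGIN

  void on_begin_gtid() { gtid_begin = true; }
  void on_commit() { gtid_begin = false; }    // then run normal COMMIT code
  void on_rollback() { gtid_begin = false; }  // the flag is reset here too

  // No implicit commits before or after statements while the flag is set.
  bool allow_implicit_commit() const { return !gtid_begin; }

  // While the flag is set every table is treated as transactional for
  // binary logging purposes.
  bool treat_as_transactional(bool table_is_transactional) const {
    return gtid_begin || table_is_transactional;
  }
};
```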
mysql-test/extra/rpl_tests/rpl_log.test:
Added testing of mode slave_ddl_exec_mode=STRICT
mysql-test/r/mysqld--help.result:
New help messages
mysql-test/suite/rpl/r/create_or_replace_mix.result:
Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/r/create_or_replace_row.result:
Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/r/create_or_replace_statement.result:
Testing replication of create or replace
mysql-test/suite/rpl/r/rpl_gtid_startpos.result:
Test must be run in slave_ddl_exec_mode=STRICT as part of the test depends on that DROP TABLE should fail on slave.
mysql-test/suite/rpl/r/rpl_row_log.result:
Updated result
mysql-test/suite/rpl/r/rpl_row_log_innodb.result:
Updated result
mysql-test/suite/rpl/r/rpl_row_show_relaylog_events.result:
Updated result
mysql-test/suite/rpl/r/rpl_stm_log.result:
Updated result
mysql-test/suite/rpl/r/rpl_stm_mix_show_relaylog_events.result:
Updated result
mysql-test/suite/rpl/r/rpl_temp_table_mix_row.result:
Updated result
mysql-test/suite/rpl/t/create_or_replace.inc:
Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/t/create_or_replace_mix.cnf:
Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/t/create_or_replace_mix.test:
Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/t/create_or_replace_row.cnf:
Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/t/create_or_replace_row.test:
Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/t/create_or_replace_statement.cnf:
Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/t/create_or_replace_statement.test:
Testing of CREATE OR REPLACE TABLE with replication
mysql-test/suite/rpl/t/rpl_gtid_startpos.test:
Test must be run in slave_ddl_exec_mode=STRICT as part of the test depends on that DROP TABLE should fail on slave.
mysql-test/suite/rpl/t/rpl_stm_log.test:
Removed some lines
mysql-test/suite/sys_vars/r/slave_ddl_exec_mode_basic.result:
Testing of slave_ddl_exec_mode
mysql-test/suite/sys_vars/t/slave_ddl_exec_mode_basic.test:
Testing of slave_ddl_exec_mode
sql/handler.cc:
Regard all tables as transactional in commit if OPTION_GTID_BEGIN is set.
This is to ensure that statements are not committed too early if non-transactional tables are used.
sql/log.cc:
Regard all tables as transactional in commit if OPTION_GTID_BEGIN is set.
Also treat 'direct' log events as transactional (to get them logged as they were on the master)
sql/log_event.cc:
Ensure that the new error from DROP TABLE when trying to drop a view is treated the same as the old one.
Store error code that slave expects in THD.
Set OPTION_GTID_BEGIN if we find a BEGIN.
Reset OPTION_GTID_BEGIN if we find a COMMIT.
sql/mysqld.cc:
Added slave_ddl_exec_mode_options
sql/mysqld.h:
Added slave_ddl_exec_mode_options
sql/rpl_gtid.cc:
Reset OPTION_GTID_BEGIN if we record a gtid (safety)
sql/sql_class.cc:
Regard all tables as transactional in commit if OPTION_GTID_BEGIN is set.
sql/sql_class.h:
Added to THD: log_current_statement and slave_expected_error
sql/sql_insert.cc:
Ensure that CREATE OR REPLACE is logged if table was deleted.
Don't do implicit commit for CREATE if we are under OPTION_GTID_BEGIN
sql/sql_parse.cc:
Change CREATE TABLE -> CREATE OR REPLACE TABLE for slaves
Change DROP TABLE -> DROP TABLE IF EXISTS for slaves
CREATE TABLE doesn't force implicit commit in case of OPTION_GTID_BEGIN
Don't do commits before or after any statement if OPTION_GTID_BEGIN was set.
sql/sql_priv.h:
Added OPTION_GTID_BEGIN
sql/sql_show.cc:
Enhanced store_create_info() to also be able to handle CREATE OR REPLACE
sql/sql_show.h:
Updated prototype
sql/sql_table.cc:
Ensure that CREATE OR REPLACE is logged if table was deleted.
sql/sys_vars.cc:
Added slave_ddl_exec_mode
sql/transaction.cc:
Added warning if we got a GTID under OPTION_GTID_BEGIN
Using CREATE OR REPLACE TABLE is identical to
DROP TABLE IF EXISTS table_name;
CREATE TABLE ...;
Except that:
* CREATE OR REPLACE is atomic (no one can create the same table between the drop and the create).
* Temporary tables will not shadow the table name for the DROP as the CREATE TABLE tells us already if we are using a temporary table or not.
* If the table was locked with LOCK TABLES, the new table will be locked with the same lock after it's created.
Implementation details:
- We no longer open the to-be-created table during CREATE TABLE, which the original code did.
- There is no need to open a table we are planning to create. It's enough to check if the table exists or not.
- Removed some of duplicated code for CREATE IF NOT EXISTS.
- Give an error when using CREATE OR REPLACE with IF NOT EXISTS (conflicting options).
- As a side effect of the code changes, we no longer have to internally re-prepare prepared statements with CREATE TABLE if the table exists.
- Made one code path for all tests of whether log tables are in use.
- Better error message if one tries to create/drop/alter a log table in use
- Added back disabled rpl_row_create_table test as it now seems to work and includes a lot of interesting tests.
- Added HA_LEX_CREATE_REPLACE to mark if we are using CREATE OR REPLACE
- Aligned CREATE OR REPLACE parsing code in sql_yacc.yy for TABLE and VIEW
- Changed interface for drop_temporary_table() to make it more reusable
- Changed Locked_tables_list::init_locked_tables() to work on the table object instead of the table list object. Before this it used a mix of both, which was not good.
- Locked_tables_list::unlock_locked_tables(THD *thd) now requires a valid thd argument. Old usage of calling this with 0 is changed to instead call Locked_tables_list::reset()
- Added functions Locked_tables_list::restore_lock() and Locked_tables_list::add_back_last_deleted_lock() to be able to easily add back a locked table to the lock list.
- Added restart_trans_for_tables() to be able to restart a transaction.
- DROP_ACL is required if one uses CREATE OR REPLACE TABLE.
- Added drop of normal and temporary tables in create_table_impl() if CREATE OR REPLACE was used.
- Added reacquiring of table locks in mysql_create_table() and mysql_create_like_table()
mysql-test/include/commit.inc:
With new code we get fewer status increments
mysql-test/r/commit_1innodb.result:
With new code we get fewer status increments
mysql-test/r/create.result:
Added testing of create or replace with timeout
mysql-test/r/create_or_replace.result:
Basic testing of CREATE OR REPLACE TABLE
mysql-test/r/partition_exchange.result:
New error message
mysql-test/r/ps_ddl.result:
Fewer reprepares with new code
mysql-test/suite/archive/discover.result:
Don't rediscover archive tables if the .frm file exists
(Sergei will look at this if there is a better way...)
mysql-test/suite/archive/discover.test:
Don't rediscover archive tables if the .frm file exists
(Sergei will look at this if there is a better way...)
mysql-test/suite/funcs_1/r/innodb_views.result:
New error message
mysql-test/suite/funcs_1/r/memory_views.result:
New error message
mysql-test/suite/rpl/disabled.def:
rpl_row_create_table should now be safe to use
mysql-test/suite/rpl/r/rpl_row_create_table.result:
Updated results after adding back disabled test
mysql-test/suite/rpl/t/rpl_create_if_not_exists.test:
Added comment
mysql-test/suite/rpl/t/rpl_row_create_table.test:
Added CREATE OR REPLACE TABLE test
mysql-test/t/create.test:
Added CREATE OR REPLACE TABLE test
mysql-test/t/create_or_replace-master.opt:
Create logs
mysql-test/t/create_or_replace.test:
Basic testing of CREATE OR REPLACE TABLE
mysql-test/t/partition_exchange.test:
Error number changed as we are now using same code for all log table change issues
mysql-test/t/ps_ddl.test:
Fewer reprepares with new code
sql/handler.h:
Moved things around a bit in a structure to get better alignment.
Added HA_LEX_CREATE_REPLACE to mark if we are using CREATE OR REPLACE
Added 3 elements to end of HA_CREATE_INFO to be able to store state to add backs locks in case of LOCK TABLES.
sql/log.cc:
Reimplemented check_if_log_table():
- Simpler and faster usage
- Can give error messages
This gives us one code path for almost all error messages if log tables are in use
sql/log.h:
New interface for check_if_log_table()
sql/slave.cc:
More logging
sql/sql_alter.cc:
New interface for check_if_log_table()
sql/sql_base.cc:
More documentation
Changed interface for drop_temporary_table() to make it more reusable
Changed Locked_tables_list::init_locked_tables() to work on the table object instead of the table list object. Before this it used a mix of both, which was not good.
Locked_tables_list::unlock_locked_tables(THD *thd) now requires a valid thd argument. Old usage of calling this with 0 is changed to instead call Locked_tables_list::reset()
Added functions Locked_tables_list::restore_lock() and Locked_tables_list::add_back_last_deleted_lock() to be able to easily add back a locked table to the lock list.
Check the command number instead of open_strategy to determine whether CREATE TABLE was used.
Added restart_trans_for_tables() to be able to restart a transaction. This was needed in "create or replace ... select" between the drop table and the select.
sql/sql_base.h:
Added and updated function prototypes
sql/sql_class.h:
Added new prototypes to Locked_tables_list class
Added extra argument to select_create to avoid double call to eof() or send_error()
- I needed this in some edge case where the table was not created, against expectations.
sql/sql_db.cc:
New interface for check_if_log_table()
sql/sql_insert.cc:
Remember position to lock information so that we can reacquire table lock for LOCK TABLES + CREATE OR REPLACE TABLE SELECT. Later add back the lock by calling restore_lock().
Removed one not needed indentation level in create_table_from_items()
Ensure we don't call send_eof() or abort_result_set() twice.
sql/sql_lex.h:
Removed variable that I temporarily added in an earlier changeset
sql/sql_parse.cc:
Removed old test code (marked with QQ)
Ensure that we have open_strategy set as TABLE_LIST::OPEN_STUB in CREATE TABLE
Removed some IF NOT EXISTS code as this is now handled in create_table_impl().
Set OPTION_KEEP_LOGS later. This code had to be moved as the test for IF EXISTS has changed place.
DROP_ACL is required if one uses CREATE OR REPLACE TABLE.
sql/sql_partition_admin.cc:
New interface for check_if_log_table()
sql/sql_rename.cc:
New interface for check_if_log_table()
sql/sql_table.cc:
New interface for check_if_log_table()
Moved some code in mysql_rm_table() under a common test.
- Safe as temporary tables don't have statistics.
- !is_temporary_table(table) test was moved out from drop_temporary_table() and merged with upper level code.
- Added drop of normal and temporary tables in create_table_impl() if CREATE OR REPLACE was used.
- Added reacquiring of table locks in mysql_create_table() and mysql_create_like_table()
- In mysql_create_like_table(), restore table->open_strategy() if it was changed.
- Re-test if table was a view after opening it.
sql/sql_table.h:
New prototype for mysql_create_table_no_lock()
sql/sql_yacc.yy:
Added syntax for CREATE OR REPLACE TABLE
Reuse new code for CREATE OR REPLACE VIEW
sql/table.h:
Added name for enum type
sql/table_cache.cc:
More DBUG
Add a test case for killing a waiting query in parallel replication.
Fix several bugs found:
- We should not wakeup_subsequent_commits() in ha_rollback_trans(), since we
do not know the right wakeup_error() to give.
- When a wait_for_prior_commit() is killed, we must unregister from the
waitee so we do not race and get an extra (non-kill) wakeup.
- We need to deal with error propagation correctly in queue_for_group_commit
when one thread is killed.
- Fix one locking issue in queue_for_group_commit(): we could unlock the
waitee lock too early and thus end up processing wakeup() with insufficient
locking.
- Fix Xid_log_event::do_apply_event; if commit fails it must not update the
in-memory @@gtid_slave_pos state.
- Fix and cleanup some things in the rpl_parallel.cc error handling.
- Add a missing check for killed in the slave sql driver thread, to avoid a
race.
Attempt to read the master-bin.state file always, even if the
binlog files (master-bin.index and master-bin.XXXXXX) have been
deleted.
This makes it easy to preserve the binlog state when provisioning
a new server from a copy of an old one, without needing to copy
over the binlog files themselves.
MDEV-4725: Incorrect binlog state recovery if crash while writing event group
The binlog state was not recovered correctly if XA is not used (eg. InnoDB
disabled), or if server crashed in the middle of writing an event group to the
binlog.
With this patch, we ensure that recovery of binlog state is done even if we do
not do the full XA binlog recovery, and we ensure that we only recover fully
written event groups into the binlog state.
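The "only fully written event groups" rule can be illustrated with a small
scan over a simplified event list. The event model below (EventType, Event and
the function name) is hypothetical and far simpler than the real binlog
format; it only shows that a GTID is recorded once the end of its group has
been seen.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified event model: a group is a Gtid event, any number
// of Row events, and a terminating Commit event.
enum class EventType { Gtid, Row, Commit };

struct Event {
  EventType type;
  std::uint64_t seq_no;  // only meaningful for Gtid events
};

// Returns the seq_no of the last event group that was completely written,
// or 0 if there is none. A group cut short by a crash is ignored.
std::uint64_t recover_last_complete_seq_no(const std::vector<Event>& log) {
  std::uint64_t recovered = 0;
  std::uint64_t open_seq_no = 0;
  bool in_group = false;
  for (const Event& ev : log) {
    switch (ev.type) {
      case EventType::Gtid:
        open_seq_no = ev.seq_no;
        in_group = true;
        break;
      case EventType::Row:
        break;
      case EventType::Commit:
        if (in_group) {
          recovered = open_seq_no;  // group is complete: record its GTID
          in_group = false;
        }
        break;
    }
  }
  return recovered;
}
```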
There were some places where insufficient locking between
parallel threads could cause invalid memory accesses and
possibly other grief.
This patch adds the missing locking, and moves the locking
into the struct rpl_binlog_state methods to make it easier
to see that proper locking is in place everywhere.
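Moving the locking into the methods themselves follows a common pattern,
sketched below. BinlogStateModel is a hypothetical stand-in, not struct
rpl_binlog_state: callers never touch the mutex, so every access path is
locked by construction.

```cpp
#include <cstdint>
#include <map>
#include <mutex>

// Hypothetical stand-in for a binlog state object whose methods do their
// own locking.
class BinlogStateModel {
public:
  // Record `seq_no` for `domain_id` if it is newer than what is stored.
  void update(std::uint32_t domain_id, std::uint64_t seq_no) {
    std::lock_guard<std::mutex> guard(mutex_);
    std::uint64_t& current = last_seq_no_[domain_id];
    if (seq_no > current) current = seq_no;
  }

  // Returns the last recorded seq_no for the domain, or 0 if unknown.
  std::uint64_t last_seq_no(std::uint32_t domain_id) const {
    std::lock_guard<std::mutex> guard(mutex_);
    std::map<std::uint32_t, std::uint64_t>::const_iterator it =
        last_seq_no_.find(domain_id);
    return it == last_seq_no_.end() ? 0 : it->second;
  }

private:
  mutable std::mutex mutex_;
  std::map<std::uint32_t, std::uint64_t> last_seq_no_;
};
```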
The merge is still missing a few hunks related to temporary tables and
InnoDB log file size. The associated code did not seem to exist in
10.0, so the merge of that needs more work. Until this is fixed, there
are a number of test failures as a result.