The real problem here was inconsistent handling of entry->commit_errno in
MYSQL_BIN_LOG::write_transaction_or_stmt(). Some return paths were setting it
to the value of errno, some where not. And the setting was redundant anyway,
as it is set consistently by the caller.
Fix by consistently setting it in the caller, and not in each return path in
the function.
The test failure happened because a DBUG_EXECUTE_IF() used in the test case
set an entry->commit_errno that was immediately overwritten in the caller with
whatever happened to be the value of errno. This could lead to different error
message in the .result file.
reset default fields not for every modified row, but only once,
at the beginning, as the set of modified fields doesn't change.
exception: INSERT ... ON DUPLICATE KEY UPDATE - the set of fields
does change per row and in that case we reset default fields per row.
This bug was seen when parallel replication experienced a deadlock between
transactions T1 and T2, where T2 has reached the commit phase and is waiting
for T1 to commit first. In this case, the deadlock is broken by sending a kill
to T2; that kill error is then later detected and converted to a deadlock
error, which causes T2 to be rolled back and retried.
The problem was that the kill caused ha_commit_trans() to errorneously call
wakeup_subsequent_commits() on T3, signalling it to abort because T2 failed
during commit. This is incorrect, because the error in T2 is only a temporary
error, which will be resolved by normal transaction retry. We should not
signal error to the next transaction until we have executed the code that
handles such temporary errors.
So this patch just removes the calls to wakeup_subsequent_commits() from
ha_commit_trans(). They are incorrect in this case, and they are not needed in
general, as wakeup_subsequent_commits() must in any case be called in
finish_event_group() to wakeup any transactions that may have started to wait
after ha_commit_trans(). And normally, wakeup will in fact have happened
earlier, either from the binlog group commit code, or (in case of no
binlogging) after the fast part of InnoDB/XtraDB group commit.
The symptom of this bug was that replication would break on some transaction
with "Commit failed due to failure of an earlier commit on which this one
depends", but with no such failure of an earlier commit visible anywhere.
The retry of an event group in parallel replication set the wrong value for
the end log position of the event that was retried
(qev->future_event_relay_log_pos). It was too large by the size of the event,
so it pointed into the middle of the following event.
If the retry happened in the very last event of the event group, _and_ the SQL
thread was stopped just after successfully retrying that event, then the SQL
threads's relay log position would be left incorrect. Restarting the SQL
thread could then try to read events from a garbage offset in the relay log,
usually leading to an error about not being able to read the event.
The code in binlog group commit around wait_for_commit that controls commit
order, did the wakeup of subsequent commits early, as soon as a following
transaction is put into the group commit queue, but before any such commit has
actually taken place. This causes problems with too early wakeup of
transactions that need to wait for prior to commit, but do not take part in
the binlog group commit for one reason or the other.
This patch solves the problem, by moving the wakeup to happen only after the
binlog group commit is completed.
This requires a new solution to ensure that transactions that arrive later
than the leader are still able to participate in group commit. This patch
introduces a flag wait_for_commit::commit_started. When this is set, a waiter
can queue up itself in the group commit queue.
This way, effectively the wait_for_prior_commit() is skipped only for
transactions that participate in group commit, so that skipping the wait is
safe. Other transactions still wait as needed for correctness.
The code that handles free lists of various objects passed to worker threads
in parallel replication handles freeing in batches, to avoid taking and
releasing LOCK_rpl_thread too often. However, it was possible for freeing to
be delayed to the point where one thread could stall the SQL driver thread due
to full queue, while other worker threads might be idle. This could
significantly degrade possible parallelism and thus performance.
Clean up the batch freeing code so that it is more robust and now able to
regularly free batches of object, so that normally the queue will not run full
unless the SQL driver thread is really far ahead of the worker threads.
The bug occured in parallel replication when re-trying transactions that
failed due to deadlock. In this case, the relay log file is re-opened and the
events are read out again. This reading requires a format description event of
the appropriate version. But the code was using a description event stored in
rli, which is not thread-safe. This could lead to various rare races if the
format description event was replaced by the SQL driver thread at the exact
moment where a worker thread was trying to use it.
The fix is to instead make the retry code create and maintain its own format
description event. When the relay log file is opened, we first read the format
description event from the start of the file, before seeking to the current
position. This now uses the same code as when the SQL driver threads starts
from a given relay log position. This also makes sure that the correct format
description event version will be used in cases where the version of the
binlog could change during replication.
In parallel replication, threads can do two different waits for a prior
transaction. One is for the prior transaction to start commit, the other is
for it to complete commit.
It turns out that the same PSI_stage_info message was errorneously used in
both cases (probably a merge error), causing SHOW PROCESSLIST to be
misleading.
Fix by using correct, distinct message in each case.
In parallel replication, the wait_for_commit facility is used to ensure that
events are written into the binlog in the correct order. This is handled in an
optimised way in the binlogging group commit code.
However, some statements, for example GRANT, are written directly into the
binlog, outside of the group commit code. There was a bug that this direct
write does not correctly wait for the prior transactions to have been written
first, which allows f.ex. GRANT to be written ahead of earlier transactions.
This patch adds the missing wait_for_prior_commit() before writing directly to
the binlog.
However, the problem is still there, although the race is much less likely to
occur now. The problem is that the optimised group commit code does wakeup of
following transactions early, before the binlog write is actually done. A
woken-up following transaction is then allowed to run ahead and queue up for
the group commit, which will ensure that binlog write happens in correct order
in the end. However, the code for directly written events currently bypass
this mechanism, so they get woken up and written too early.
This will be fixed properly in a later patch.
The real bug was that open_tables() returned error in case of
thd->killed() without properly calling thd->send_kill_message()
to set the correct error. This was fixed some time ago.
So remove the, now redundant, extra checks for thd->is_error(),
possibly allowing to catch in debug builds more incorrect
error handling cases.
In SAFE_MUTEX builds, reset the wait_for_commit mutex (destroy and
re-initialise), so that SAFE_MUTEX lock order check does not become
confused when the mutex is re-used for a different purpose.
When parsing a field declaration, grab type information from LEX before it's overwritten
by further rules. Pass type information through the parser stack to the rule that needs it.
Issue :
-------
This seems for some platform -(LONGLONG_MIN) is
not flagged as out of range.
Fix:
----
Fix is backported from mysql-5.6 bug 14314156.
Fixed by adding an explicit test for this value in
Item_func_neg::int_op().
sql/item_func.cc:
For some platforms we need special handling of
LONGLONG_MIN to guarantee overflow.
Issue :
-------
This seems for some platform -(LONGLONG_MIN) is
not flagged as out of range.
Fix:
----
Fix is backported from mysql-5.6 bug 14314156.
Fixed by adding an explicit test for this value in
Item_func_neg::int_op().
The bug is not very important per se, but it was helpful to move
Item_func_strcmp out of Item_bool_func2 (to Item_int_func),
for the purposes of "MDEV-4912 Add a plugin to field types (column types)".
(Backport to 5.3)
(Attempt #2)
- Don't attempt to use BKA for materialized derived tables. The
table is neither filled nor fully opened yet, so attempt to
call handler->multi_range_read_info() causes crash.
(Backport to 5.3)
(variant #2, with fixed coding style)
- Make Mrr_ordered_index_reader::resume_read() restore index position
only if it was saved before with Mrr_ordered_index_reader::interrupt_read().
- TABLE::create_key_part_by_field() should not set PART_KEY_FLAG in field->flags
= The reason is that it is used by hash join code which calls it to create a hash
table lookup structure. It doesn't create a real index.
= Another caller of the function is TABLE::add_tmp_key(). Made it to set the flag itself.
- The differences in join_cache.result could also be observed before this patch: one
could put "FLUSH TABLES" before the queries and get exactly the same difference.
(Attempt #2)
- Don't attempt to use BKA for materialized derived tables. The
table is neither filled nor fully opened yet, so attempt to
call handler->multi_range_read_info() causes crash.
1. Do not use NULL `info' field in processlist to select the thread of
interest. This can fail if the read of processlist ends up happening after
REAP succeeds, but before the `info' field is reset. Instead, select on the
CONNECTION_ID(), making sure we still scan the whole list to trigger the same
code as in the original test case.
2. Wait for the query to really complete before reading it in the
processlist. When REAP returns, it only means that ack has been sent to
client, the reset of query stage happens a bit later in the code.
- Don't attempt to use BKA for materialized derived tables. The
table is neither filled nor fully opened yet, so attempt to
call handler->multi_range_read_info() causes crash.
- test_if_skip_sort_order()/create_ref_for_key() may change table
access from EQ_REF(index1) to REF(index2).
- Doing so doesn't make much sense from optimization POV, but since
they are doing it, they should update tab->read_record.unlock_row
accordingly.
Bug#17959689: MAKE GCC AND CLANG GIVE CONSISTENT COMPILATION WARNINGS
Bug#18313717: ENABLE -WERROR IN MAINTANER MODE WHEN COMPILING WITH CLANG
Bug#18510941: REMOVE CMAKE WORKAROUNDS FOR OLDER VERSIONS OF OS X/XCODE
Backport from mysql-5.6 to mysql-5.5
Bug#17959689: MAKE GCC AND CLANG GIVE CONSISTENT COMPILATION WARNINGS
Bug#18313717: ENABLE -WERROR IN MAINTANER MODE WHEN COMPILING WITH CLANG
Bug#18510941: REMOVE CMAKE WORKAROUNDS FOR OLDER VERSIONS OF OS X/XCODE
Backport from mysql-5.6 to mysql-5.5
Problem: For every event read, mysqlbinlog calls localtime() which in turn
calls stat(/etc/localtime) which is causing kernel mutex contention.
Analysis and Fix:
localtime() calls stat(/etc/localtime) for every instance of the call
where as localtime_r() the reentrant version was optimized to store
the read only tz internal structure. Hence it will not call
stat(/etc/localtime). It will call only once at the beginning.
The mysql server is calling localtime_r() and mysqlbinlog tool is
one place where we are still using localtime().
Once the process (mysqlbinlog) is started if timezone is changed
it will be not picked up the the process and it will continue
with the same values as the beginning of the process. This
behavior is in-lined with mysql server.
Also adding localtime_r() and gmtime_r() support for windows.
Problem: For every event read, mysqlbinlog calls localtime() which in turn
calls stat(/etc/localtime) which is causing kernel mutex contention.
Analysis and Fix:
localtime() calls stat(/etc/localtime) for every instance of the call
where as localtime_r() the reentrant version was optimized to store
the read only tz internal structure. Hence it will not call
stat(/etc/localtime). It will call only once at the beginning.
The mysql server is calling localtime_r() and mysqlbinlog tool is
one place where we are still using localtime().
Once the process (mysqlbinlog) is started if timezone is changed
it will be not picked up the the process and it will continue
with the same values as the beginning of the process. This
behavior is in-lined with mysql server.
Also adding localtime_r() and gmtime_r() support for windows.
Problem:
========
In a master slave replication if a slave receives a
Start_log_event_v3 the payload is expected to be of fixed
size. If a payload which is smaller than the fixed size is
received it causes a read out of bounds issue.
Analysis:
========
According to documentation the fixed data part of
Start_log_event_v3 looks as shown below.
2 bytes: The binary log format version
50 bytes: The MySQL server's version
4 bytes: Timestamp in seconds when this event was created
Since the payload is expected to be of fixed size, therefore
ST_SERVER_VER_LEN (50) bytes are memcpy'ed into
server_version. But if a malicious master sends a shorter
payload it causes a read out of bounds issue.
Fix:
===
In Start_log_event_v3 event's constructor a check has been
added which expects the minimum payload length to be of size
common_header_len + ST_COMMON_HEADER_LEN_OFFSET bytes. If a
malicious packet of lesser length is received it will be
considered as an invalid event.
sql/log_event.cc:
Added code changes to check the minimum packet length
of Start_log_event_v3 should be > 56.
sql/log_event.h:
Moved server_version from stack to heap and modified
is_valid function for Start_log_event_v3.
Problem:
========
In a master slave replication if a slave receives a
Start_log_event_v3 the payload is expected to be of fixed
size. If a payload which is smaller than the fixed size is
received it causes a read out of bounds issue.
Analysis:
========
According to documentation the fixed data part of
Start_log_event_v3 looks as shown below.
2 bytes: The binary log format version
50 bytes: The MySQL server's version
4 bytes: Timestamp in seconds when this event was created
Since the payload is expected to be of fixed size, therefore
ST_SERVER_VER_LEN (50) bytes are memcpy'ed into
server_version. But if a malicious master sends a shorter
payload it causes a read out of bounds issue.
Fix:
===
In Start_log_event_v3 event's constructor a check has been
added which expects the minimum payload length to be of size
common_header_len + ST_COMMON_HEADER_LEN_OFFSET bytes. If a
malicious packet of lesser length is received it will be
considered as an invalid event.
Don't double-check privileges for a column in the GROUP BY that refers to
the same column in SELECT clause. Privileges were already checked for SELECT clause.
- Fix the crash by making get_column_range_cardinality()
to handle the special case where Column_stats objects
is an all-zeros object (the question of what is the point
of having Field::read_stats point to such object remains a
mystery)
- Added a few comments. Learning the code still.
- Restarting mysqld with --expire-log-days=1 triggers 'log_in_use()' to be called while current_thd is NULL.
- Check current_thd before calling DEBUG_SYNC() to avoid passing NULL pointer to DEBUG_SYNC()
- Wrap debug code construct inside #ifndef DBUG_OFF like in other parts of the file
- Restarting mysqld with --expire-log-days=1 triggers 'log_in_use()' to be called while current_thd is NULL.
- Check current_thd before calling DEBUG_SYNC() to avoid passing NULL pointer to DEBUG_SYNC()
- Wrap debug code construct inside #ifndef DBUG_OFF like in other parts of the file
SQLCOM_ASSIGN_TO_KEYCACHE should not be CF_AUTO_COMMIT_TRANS
SQLCOM_PRELOAD_KEYS should not be CF_AUTO_COMMIT_TRANS
SQLCOM_INSTALL_PLUGIN should need CF_AUTO_COMMIT_TRANS
SQLCOM_UNINSTALL_PLUGIN should need CF_AUTO_COMMIT_TRANS
merge from MySQL-5.6, revision:
revno: 3677.2.1
committer: Alfranio Correia <alfranio.correia@oracle.com>
timestamp: Tue 2012-02-28 16:26:37 +0000
message:
BUG#13627921 - MISSING FLAGS IN SQL_COMMAND_FLAGS MAY LEAD TO REPLICATION PROBLEMS
Flags in sql_command_flags[command] are not correctly set for the following
commands:
. SQLCOM_SET_OPTION is missing CF_CAN_GENERATE_ROW_EVENTS;
. SQLCOM_BINLOG_BASE64_EVENT is missing CF_CAN_GENERATE_ROW_EVENTS;
. SQLCOM_REVOKE_ALL is missing CF_CHANGES_DATA;
. SQLCOM_CREATE_FUNCTION is missing CF_AUTO_COMMIT_TRANS;
This may lead to a wrong sequence of events in the binary log. To fix
the problem, we correctly set the flags in sql_command_flags[command].
The reason for the failure was a bug in an include file on debian that causes 'struct stat'
to have different sized depending on the environment.
This patch fixes so that we always include my_global.h or my_config.h before we include any other files.
Other things:
- Removed #include <my_global.h> in some include files; Better to always do this at the top level to have as few
"always-include-this-file-first' files as possible.
- Removed usage of some include files that where already included by my_global.h or by other files.
client/mysql_plugin.c:
Use my_global.h first
client/mysqlslap.c:
Remove duplicated include files
extra/comp_err.c:
Remove duplicated include files
include/m_string.h:
Remove duplicated include files
include/maria.h:
Remove duplicated include files
libmysqld/emb_qcache.cc:
Use my_global.h first
plugin/semisync/semisync.h:
Use my_pthread.h first
sql/datadict.cc:
Use my_global.h first
sql/debug_sync.cc:
Use my_global.h first
sql/derror.cc:
Use my_global.h first
sql/des_key_file.cc:
Use my_global.h first
sql/discover.cc:
Use my_global.h first
sql/event_data_objects.cc:
Use my_global.h first
sql/event_db_repository.cc:
Use my_global.h first
sql/event_parse_data.cc:
Use my_global.h first
sql/event_queue.cc:
Use my_global.h first
sql/event_scheduler.cc:
Use my_global.h first
sql/events.cc:
Use my_global.h first
sql/field.cc:
Use my_global.h first
Remove duplicated include files
sql/field_conv.cc:
Use my_global.h first
sql/filesort.cc:
Use my_global.h first
Remove duplicated include files
sql/gstream.cc:
Use my_global.h first
sql/ha_ndbcluster.cc:
Use my_global.h first
sql/ha_ndbcluster_binlog.cc:
Use my_global.h first
sql/ha_ndbcluster_cond.cc:
Use my_global.h first
sql/ha_partition.cc:
Use my_global.h first
sql/handler.cc:
Use my_global.h first
sql/hash_filo.cc:
Use my_global.h first
sql/hostname.cc:
Use my_global.h first
sql/init.cc:
Use my_global.h first
sql/item.cc:
Use my_global.h first
sql/item_buff.cc:
Use my_global.h first
sql/item_cmpfunc.cc:
Use my_global.h first
sql/item_create.cc:
Use my_global.h first
sql/item_geofunc.cc:
Use my_global.h first
sql/item_inetfunc.cc:
Use my_global.h first
sql/item_row.cc:
Use my_global.h first
sql/item_strfunc.cc:
Use my_global.h first
sql/item_subselect.cc:
Use my_global.h first
sql/item_sum.cc:
Use my_global.h first
sql/item_timefunc.cc:
Use my_global.h first
sql/item_xmlfunc.cc:
Use my_global.h first
sql/key.cc:
Use my_global.h first
sql/lock.cc:
Use my_global.h first
sql/log.cc:
Use my_global.h first
sql/log_event.cc:
Use my_global.h first
sql/log_event_old.cc:
Use my_global.h first
sql/mf_iocache.cc:
Use my_global.h first
sql/mysql_install_db.cc:
Remove duplicated include files
sql/mysqld.cc:
Remove duplicated include files
sql/net_serv.cc:
Remove duplicated include files
sql/opt_range.cc:
Use my_global.h first
sql/opt_subselect.cc:
Use my_global.h first
sql/opt_sum.cc:
Use my_global.h first
sql/parse_file.cc:
Use my_global.h first
sql/partition_info.cc:
Use my_global.h first
sql/procedure.cc:
Use my_global.h first
sql/protocol.cc:
Use my_global.h first
sql/records.cc:
Use my_global.h first
sql/records.h:
Don't include my_global.h
Better to do this at the upper level
sql/repl_failsafe.cc:
Use my_global.h first
sql/rpl_filter.cc:
Use my_global.h first
sql/rpl_gtid.cc:
Use my_global.h first
sql/rpl_handler.cc:
Use my_global.h first
sql/rpl_injector.cc:
Use my_global.h first
sql/rpl_record.cc:
Use my_global.h first
sql/rpl_record_old.cc:
Use my_global.h first
sql/rpl_reporting.cc:
Use my_global.h first
sql/rpl_rli.cc:
Use my_global.h first
sql/rpl_tblmap.cc:
Use my_global.h first
sql/rpl_utility.cc:
Use my_global.h first
sql/set_var.cc:
Added comment
sql/slave.cc:
Use my_global.h first
sql/sp.cc:
Use my_global.h first
sql/sp_cache.cc:
Use my_global.h first
sql/sp_head.cc:
Use my_global.h first
sql/sp_pcontext.cc:
Use my_global.h first
sql/sp_rcontext.cc:
Use my_global.h first
sql/spatial.cc:
Use my_global.h first
sql/sql_acl.cc:
Use my_global.h first
sql/sql_admin.cc:
Use my_global.h first
sql/sql_analyse.cc:
Use my_global.h first
sql/sql_audit.cc:
Use my_global.h first
sql/sql_base.cc:
Use my_global.h first
sql/sql_binlog.cc:
Use my_global.h first
sql/sql_bootstrap.cc:
Use my_global.h first
Use my_global.h first
sql/sql_cache.cc:
Use my_global.h first
sql/sql_class.cc:
Use my_global.h first
sql/sql_client.cc:
Use my_global.h first
sql/sql_connect.cc:
Use my_global.h first
sql/sql_crypt.cc:
Use my_global.h first
sql/sql_cursor.cc:
Use my_global.h first
sql/sql_db.cc:
Use my_global.h first
sql/sql_delete.cc:
Use my_global.h first
sql/sql_derived.cc:
Use my_global.h first
sql/sql_do.cc:
Use my_global.h first
sql/sql_error.cc:
Use my_global.h first
sql/sql_explain.cc:
Use my_global.h first
sql/sql_expression_cache.cc:
Use my_global.h first
sql/sql_handler.cc:
Use my_global.h first
sql/sql_help.cc:
Use my_global.h first
sql/sql_insert.cc:
Use my_global.h first
sql/sql_lex.cc:
Use my_global.h first
sql/sql_load.cc:
Use my_global.h first
sql/sql_locale.cc:
Use my_global.h first
sql/sql_manager.cc:
Use my_global.h first
sql/sql_parse.cc:
Use my_global.h first
sql/sql_partition.cc:
Use my_global.h first
sql/sql_plugin.cc:
Added comment
sql/sql_prepare.cc:
Use my_global.h first
sql/sql_priv.h:
Added error if we use this before including my_global.h
This check is here becasue so many files includes sql_priv.h first.
sql/sql_profile.cc:
Use my_global.h first
sql/sql_reload.cc:
Use my_global.h first
sql/sql_rename.cc:
Use my_global.h first
sql/sql_repl.cc:
Use my_global.h first
sql/sql_select.cc:
Use my_global.h first
sql/sql_servers.cc:
Use my_global.h first
sql/sql_show.cc:
Added comment
sql/sql_signal.cc:
Use my_global.h first
sql/sql_statistics.cc:
Use my_global.h first
sql/sql_table.cc:
Use my_global.h first
sql/sql_tablespace.cc:
Use my_global.h first
sql/sql_test.cc:
Use my_global.h first
sql/sql_time.cc:
Use my_global.h first
sql/sql_trigger.cc:
Use my_global.h first
sql/sql_udf.cc:
Use my_global.h first
sql/sql_union.cc:
Use my_global.h first
sql/sql_update.cc:
Use my_global.h first
sql/sql_view.cc:
Use my_global.h first
sql/sys_vars.cc:
Added comment
sql/table.cc:
Use my_global.h first
sql/thr_malloc.cc:
Use my_global.h first
sql/transaction.cc:
Use my_global.h first
sql/uniques.cc:
Use my_global.h first
sql/unireg.cc:
Use my_global.h first
sql/unireg.h:
Removed inclusion of my_global.h
storage/archive/ha_archive.cc:
Added comment
storage/blackhole/ha_blackhole.cc:
Use my_global.h first
storage/csv/ha_tina.cc:
Use my_global.h first
storage/csv/transparent_file.cc:
Use my_global.h first
storage/federated/ha_federated.cc:
Use my_global.h first
storage/federatedx/federatedx_io.cc:
Use my_global.h first
storage/federatedx/federatedx_io_mysql.cc:
Use my_global.h first
storage/federatedx/federatedx_io_null.cc:
Use my_global.h first
storage/federatedx/federatedx_txn.cc:
Use my_global.h first
storage/heap/ha_heap.cc:
Use my_global.h first
storage/innobase/handler/handler0alter.cc:
Use my_global.h first
storage/maria/ha_maria.cc:
Use my_global.h first
storage/maria/unittest/ma_maria_log_cleanup.c:
Remove duplicated include files
storage/maria/unittest/test_file.c:
Added comment
storage/myisam/ha_myisam.cc:
Move sql_plugin.h first as this includes my_global.h
storage/myisammrg/ha_myisammrg.cc:
Use my_global.h first
storage/oqgraph/oqgraph_thunk.cc:
Use my_config.h and my_global.h first
One could not include my_global.h before oqgraph_thunk.h (don't know why)
storage/spider/ha_spider.cc:
Use my_global.h first
storage/spider/hs_client/config.cpp:
Use my_global.h first
storage/spider/hs_client/escape.cpp:
Use my_global.h first
storage/spider/hs_client/fatal.cpp:
Use my_global.h first
storage/spider/hs_client/hstcpcli.cpp:
Use my_global.h first
storage/spider/hs_client/socket.cpp:
Use my_global.h first
storage/spider/hs_client/string_util.cpp:
Use my_global.h first
storage/spider/spd_conn.cc:
Use my_global.h first
storage/spider/spd_copy_tables.cc:
Use my_global.h first
storage/spider/spd_db_conn.cc:
Use my_global.h first
storage/spider/spd_db_handlersocket.cc:
Use my_global.h first
storage/spider/spd_db_mysql.cc:
Use my_global.h first
storage/spider/spd_db_oracle.cc:
Use my_global.h first
storage/spider/spd_direct_sql.cc:
Use my_global.h first
storage/spider/spd_i_s.cc:
Use my_global.h first
storage/spider/spd_malloc.cc:
Use my_global.h first
storage/spider/spd_param.cc:
Use my_global.h first
storage/spider/spd_ping_table.cc:
Use my_global.h first
storage/spider/spd_sys_table.cc:
Use my_global.h first
storage/spider/spd_table.cc:
Use my_global.h first
storage/spider/spd_trx.cc:
Use my_global.h first
storage/xtradb/handler/handler0alter.cc:
Use my_global.h first
storage/xtradb/handler/i_s.cc:
Use my_global.h first
- Fix the crash, dont call from->type() at the start of the function
because it might be unsafe.
- Unfortunately there is no testcase
- And this is also the reason we can't fix it properly (it should be
safe to call from->type() here).
SET @@global.log_output
Deadlock chain:
rdlock(LOCK_logger) -> lock(LOCK_open) SELECT 1
lock(LOCK_open) -> lock(LOCK_status) DROP TABLE t1
lock(LOCK_status) -> lock(LOCK_g_s_v) SHOW STATUS
lock(LOCK_g_s_) -> wrlock(LOCK_logger) SET @@global.log_output=DEFAULT
Fixed by removing relationship between LOCK_status and
LOCK_global_system_variables during SHOW STATUS: we don't really need
LOCK_global_system_variables when accessing status vars.
Normally, SET SESSION SQL_LOG_BIN is used by DBAs to run a
non-conflicting command locally only, ensuring it does not
get replicated.
Setting GLOBAL SQL_LOG_BIN would not require all sessions to
disconnect. When SQL_LOG_BIN is changed globally, it does not
immediately take effect for any sessions. It takes effect by
becoming the session-level default inherited at the start of
each new session, and this setting is kept and cached for the
duration of that session. Setting it intentionally is unlikely
to have a useful effect under any circumstance; setting it
unintentionally, such as while intending to use SET [SESSION]
is potentially disastrous. Accidentally using SET GLOBAL
SQL_LOG_BIN will not show an immediate effect to the user,
instead not having the desired session-level effect, and thus
causing other potential problems with local-only maintenance
being binlogged and executed on slaves; And transactions from
new sessions (after SQL_LOG_BIN is changed globally) are not
binlogged and replicated, which would result in irrecoverable
or difficult data loss.
This is the regular GLOBAL variables way to work, but in
replication context it does not look right on a working server
(with connected sessions) 'set global sql_log_bin' and none of
that connections is affected. Unexperienced DBA after noticing
that the command did "nothing" will change the session var and
most probably won't unset the global var, causing new sessions
to not be binlog.
Setting GLOBAL SQL_LOG_BIN allows DBA to stop binlogging on all
new sessions, which can be used to make a server "replication
read-only" without restarting the server. But this has such big
requirements, stop all existing connections, that it is more
likely to make a mess, it is too risky to allow the GLOBAL variable.
The statement 'SET GLOBAL SQL_LOG_BIN=N' will produce an error
in 5.5, 5.6 and 5.7. Reading the GLOBAL SQL_LOG_BIN will produce
a deprecation warning in 5.7.
Normally, SET SESSION SQL_LOG_BIN is used by DBAs to run a
non-conflicting command locally only, ensuring it does not
get replicated.
Setting GLOBAL SQL_LOG_BIN would not require all sessions to
disconnect. When SQL_LOG_BIN is changed globally, it does not
immediately take effect for any sessions. It takes effect by
becoming the session-level default inherited at the start of
each new session, and this setting is kept and cached for the
duration of that session. Setting it intentionally is unlikely
to have a useful effect under any circumstance; setting it
unintentionally, such as while intending to use SET [SESSION]
is potentially disastrous. Accidentally using SET GLOBAL
SQL_LOG_BIN will not show an immediate effect to the user,
instead not having the desired session-level effect, and thus
causing other potential problems with local-only maintenance
being binlogged and executed on slaves; And transactions from
new sessions (after SQL_LOG_BIN is changed globally) are not
binlogged and replicated, which would result in irrecoverable
or difficult data loss.
This is the regular GLOBAL variables way to work, but in
replication context it does not look right on a working server
(with connected sessions) 'set global sql_log_bin' and none of
that connections is affected. Unexperienced DBA after noticing
that the command did "nothing" will change the session var and
most probably won't unset the global var, causing new sessions
to not be binlog.
Setting GLOBAL SQL_LOG_BIN allows DBA to stop binlogging on all
new sessions, which can be used to make a server "replication
read-only" without restarting the server. But this has such big
requirements, stop all existing connections, that it is more
likely to make a mess, it is too risky to allow the GLOBAL variable.
The statement 'SET GLOBAL SQL_LOG_BIN=N' will produce an error
in 5.5, 5.6 and 5.7. Reading the GLOBAL SQL_LOG_BIN will produce
a deprecation warning in 5.7.
Added comments
Ensure that tokudb test works even if jemalloc is not installed
Removed not referenced function Item::remove_fixed()
mysql-test/suite/rpl/t/rpl_gtid_reconnect.test:
Fixed race condition
sql/item.cc:
Indentation fix
sql/item.h:
Removed not used function
Added comment
sql/sql_select.cc:
Fixed indentation
storage/tokudb/mysql-test/rpl/include/have_tokudb.opt:
Ensure that tokudb test works even if jemalloc is not installed
storage/tokudb/mysql-test/tokudb/suite.opt:
Ensure that tokudb test works even if jemalloc is not installed
storage/tokudb/mysql-test/tokudb_add_index/suite.opt:
Ensure that tokudb test works even if jemalloc is not installed
storage/tokudb/mysql-test/tokudb_alter_table/suite.opt:
Ensure that tokudb test works even if jemalloc is not installed
storage/tokudb/mysql-test/tokudb_bugs/suite.opt:
Ensure that tokudb test works even if jemalloc is not installed
storage/tokudb/mysql-test/tokudb_mariadb/suite.opt:
Ensure that tokudb test works even if jemalloc is not installed
SET slow_query_log and failed connection attempt
A very subtle though valid deadlock. Deadlock chain:
wrlock(LOCK_grant) -> lock(acl_cache->lock) GRANT/REVOKE CREATE/DROP USER
lock(LOCK_open) -> rdlock(LOCK_grant) SELECT * FROM I_S.COLUMNS
wrlock(LOCK_logger) -> lock(LOCK_open) SET @@global.slow_query_log='ON'
lock(acl_cache->lock) -> rdlock(LOCK_logger) Failed connection
Fixed by removing relationship between acl_cache->lock and LOCK_logger
during failed connection attempt.
sql/item.cc:
don't forget to adjust the length of the string when removing leading spaces
sql/sql_acl.cc:
when updating the hostname of the ACL_USER, update the hostname_length too
sql/sql_parse.cc:
first compare the username string, then test the host pointer
(host pointer is undefined when the username string is one of the hard-coded values
set by the parser). This is not a bug, old code is perfectly safe as the undefined
host pointer is never dereferenced, but let's keep valgrind happy.
AND IS_USED_LOCK().
Analysis:
-----------
In functions Item_func_is_free_lock::val_int() and
Item_func_is_used_lock::val_int(), for the specified user lock
name, pointer to its "User_level_lock" object is obtained from hash
"hash_user_locks". Mutex "LOCK_user_locks" is acquired for this
and released immediately. And we are accessing members of
User_level_lock after releasing the mutex. If same user lock is
deleted(released) from concurrent thread then accessing members
results in invalid(freed) memory access issue.
Deleting of user lock is also protected from the mutex
"LOCK_user_locks". Since this mutex is released in "val_int"
functions mentioned above, delete operation proceeds while concurrent
thread tries to access its members.
With the test case, valgrind reports invalid read issues in val_int
functions.
Fix:
-----------
To fix this issue, in "val_int" function of classes
"Item_func_is_free_lock" and "Item_func_is_used_lock", now releasing
mutex "LOCK_user_locks" after accessing User_level_lock members.
AND IS_USED_LOCK().
Analysis:
-----------
In functions Item_func_is_free_lock::val_int() and
Item_func_is_used_lock::val_int(), for the specified user lock
name, pointer to its "User_level_lock" object is obtained from hash
"hash_user_locks". Mutex "LOCK_user_locks" is acquired for this
and released immediately. And we are accessing members of
User_level_lock after releasing the mutex. If same user lock is
deleted(released) from concurrent thread then accessing members
results in invalid(freed) memory access issue.
Deleting of user lock is also protected from the mutex
"LOCK_user_locks". Since this mutex is released in "val_int"
functions mentioned above, delete operation proceeds while concurrent
thread tries to access its members.
With the test case, valgrind reports invalid read issues in val_int
functions.
Fix:
-----------
To fix this issue, in "val_int" function of classes
"Item_func_is_free_lock" and "Item_func_is_used_lock", now releasing
mutex "LOCK_user_locks" after accessing User_level_lock members.
We should assume that the store engine will report the first duplicate key for this case.
Old code of suppression of unsafe logging error with LIMIT didn't work, because of wrong usage of my_interval_timer().
Suppress unsafe logging errors to the error log if we get too many unsafe logging errors in a short time.
This is to not overflow the error log with meaningless errors.
- Each error code is suppressed and counted separately.
- We do a 5 minute suppression of new errors if we get more than 10 errors in that time.
Only print unsafe logging errors if log_warnings > 1.
mysql-test/suite/binlog/r/binlog_stm_unsafe_warning.result:
Update test results as INSERT ... ON DUPLICATE KEY UPDATE doesn't get logged anymore
mysql-test/suite/binlog/r/binlog_unsafe.result:
Update test results as INSERT ... ON DUPLICATE KEY UPDATE doesn't get logged anymore
mysql-test/suite/engines/README:
Fixed typos
mysql-test/suite/rpl/r/rpl_known_bugs_detection.result:
Update test results as INSERT ... ON DUPLICATE KEY UPDATE doesn't get logged anymore
sql/sql_base.cc:
Don't log warning if there are two unique keys used with INSERT .. ON DUPLICATE KEY UPDATE.
We should assume that the store engine will report the first duplicate key for this case.
sql/sql_class.cc:
Suppress error in binary log if we get too many unsafe logging errors in a short time.
Only print unsafe logging errors if log_warnings > 1
This was missing in my last commit for fixing possible lockups in SHOW STATUS.
sql/log.cc:
Fixed comment
sql/sql_show.cc:
Use LOCK_show_status when we add things to all_status_vars
sql/sql_test.cc:
Remove not needed mutex_lock
Fixed the bug itself.
Also, added "SET NAMES swe7" which was forgotten in the previous commit,
so latin1 was actually tested lati1 instead of swe7 in a mistake.
Now it tests swe7.
Instead we use LOCK_status only to protect summary of thread statistics and use a new mutex, LOCK_show_status
to protect concurrent SHOW STATUS.
sql/mysqld.cc:
Add LOCK_show_status
Don't free LOCK_status while calculating status variables.
sql/mysqld.h:
Add LOCK_show_status
sql/sql_show.cc:
Use LOCK_show_status to protect SHOW STATUS instead of LOCK_status.
CHECK.
Analysis:
----------
Issue here is, while creating or altering the InnoDB table,
if the foreign key defined on the table references a parent
table on which the user has no access privileges then the
table is created without reporting any error.
Currently the privilege level REFERENCES_ACL is unused
and is not used for access evaluation while creating the
table with a foreign key constraint or adding the foreign
key constraint to a table. But when no privileges are granted
to user then also access evaluation on parent table is ignored.
Fix:
---------
For DMLs, irrelevant of the fact, support does not want any
changes to avoid permission checks on every operation.
So, as a fix, added a function "check_fk_parent_table_access"
to check whether any of the SELECT_ACL, INSERT_ACL, UDPATE_ACL,
DELETE_ACL or REFERENCE_ACL privileges are granted for user
at table level. If none of them is granted then error is reported.
This function is called during the table creation and alter
operation.
CHECK.
Analysis:
----------
Issue here is, while creating or altering the InnoDB table,
if the foreign key defined on the table references a parent
table on which the user has no access privileges then the
table is created without reporting any error.
Currently the privilege level REFERENCES_ACL is unused
and is not used for access evaluation while creating the
table with a foreign key constraint or adding the foreign
key constraint to a table. But when no privileges are granted
to user then also access evaluation on parent table is ignored.
Fix:
---------
For DMLs, irrelevant of the fact, support does not want any
changes to avoid permission checks on every operation.
So, as a fix, added a function "check_fk_parent_table_access"
to check whether any of the SELECT_ACL, INSERT_ACL, UDPATE_ACL,
DELETE_ACL or REFERENCE_ACL privileges are granted for user
at table level. If none of them is granted then error is reported.
This function is called during the table creation and alter
operation.
Avoided exponential recursive calls of JOIN_CACHE::join_records() in the case
of non-nested outer joins.
A different solution is required to resolve this performance problem for
nested outer joins.
Moved freeing of mutex earlier, as we don't need to have log_space_cond locked for doing rotate_relay_log()
sql/slave.cc:
Moved freeing of mutex earlier, as we don't need to have log_space_cond locked for doing rotate_relay_log()
MDEV-6560 Assertion `! is_set() ' failed in Diagnostics_area::set_ok_status on killing CREATE OR REPLACE
MDEV-6525 Assertion `table->pos_in_locked _tables == __null || table->pos_in_locked_tables->table = table' failed in mark_used_tables_as_free_for_reuse, locking problems and binlogging problems on CREATE OR REPLACE under lock.
mysql-test/r/create_or_replace.result:
Added test for MDEV-6560
mysql-test/t/create_or_replace.test:
Added test for MDEV-6560
mysql-test/valgrind.supp:
Added suppression for OpenSuse 12.3
sql/sql_base.cc:
More DBUG
sql/sql_class.cc:
Changed that thd_sqlcom_can_generate_row_events() does not report that CREATE OR REPLACE is generating row events.
This is safe as this function is only used by InnoDB/XtraDB to check if a query is generating row events as part of another transaction. As CREATE is always run as it's own transaction, this isn't a problem.
This fixed MDEV-6525.
sql/sql_table.cc:
Remember if reopen_tables() generates an error (which can only happen in case of KILL).
This fixed MDEV-6560
* handler::get_auto_increment() was not expecting any errors from the storage engine.
That was wrong, errors could happen.
* ha_partition::get_auto_increment() was doing index lookups in partition under a mutex.
This was redundant (engine transaction isolation was covering that anyway)
and harmful (causing deadlocks).
the bug was introduced by CREATE OR REPLACE implementation.
CREATE IF NOT EXISTS ... SELECT was returning an error status to the caller,
while sending an ok packet to the user. SP code was not prepared for that
and trusted that error status means an error.
The Item_string constructors called set_name() on the source string,
which was wrong because in case of UCS2/UTF16/UTF32 the source value
might be a not well formed string (e.g. have incomplete leftmost character).
Now set_name() is called on str_value after its copied
(with optionally left zero padding) from the source string.
- MDEV-6694 Illegal mix of collation with a PS parameter
Item_param::convert_str_value() did not set repertoire.
Introducing a new structure MY_STRING_METADATA to collect
character length and repertoire of a string in a single loop,
to avoid two separate loops. Adding a new class Item_basic_value::Metadata
as a convenience wrapper around MY_STRING_METADATA, to reuse the
code between Item_string and Item_param.
and moving set_value() from Item_string to Item_string_for_in_vector,
as set_value() updates the members incompletely
(e.g. does not update max_length),
so it was dangerous to have set_value() available in Item_string.
If the slave gets a reconnect in the middle of a GTID event group, normally
it will re-fetch that event group, skipping the first part that was already
queued for the SQL thread.
However, if the master crashed while writing the event group, the group is
incomplete. This patch detects this case and makes sure that the
transaction is rolled back and nothing is skipped from any following
event groups.
Similarly, a network proxy might cause the reconnect to end up on a
different master server. Detect this by noticing a different server_id,
and similarly in this case roll back the partially received group.
Item_string::eq() and Item_param::eq() in string context behaved differently.
Introducing a new class Item_basic_value to share the eq() code between
literals (Item_int, Item_double, Item_string, Item_null) and Item_param.
MDEV-6666 Malformed result for CONCAT(utf8_column, binary_string)
Item_static_string_func::safe_charset_converter() and
Item_hex_string::safe_charset_converter() did not
handle character sets with mbminlen>1 properly, as well as
did not handle conversion from binary to multi-byte well.
Introducing Item::const_charset_converter(), to reuse it in a number
of Item_*::safe_charset_converter().
Problem: A corrupted header length in FORMAT_DESCRIPTION_LOG_EVENT
can cause server to crash.
Analysis: FORMAT_DESCRIPTION_EVENT will be considered invalid if
header len is too small (i.e. below OLD_HEADER_LEN).
Format_description_log_event:: Format_description_log_event(...)
{
...
if ((common_header_len=buf[ST_COMMON_HEADER_LEN_OFFSET]) < OLD_HEADER_LEN)
DBUG_VOID_RETURN; /* sanity check */
...
post_header_len= my_memdup(...)
}
In that case Format_description_log_event constructor will return early,
without allocating any memory for post_header_len. Thence this variable is
left uninitialized and making server to crash when server is trying
to free the uninitialized value.
Fix: When Format_description_log_event constructor returns early, assign
NULL to post_header_len.
Problem: A corrupted header length in FORMAT_DESCRIPTION_LOG_EVENT
can cause server to crash.
Analysis: FORMAT_DESCRIPTION_EVENT will be considered invalid if
header len is too small (i.e. below OLD_HEADER_LEN).
Format_description_log_event:: Format_description_log_event(...)
{
...
if ((common_header_len=buf[ST_COMMON_HEADER_LEN_OFFSET]) < OLD_HEADER_LEN)
DBUG_VOID_RETURN; /* sanity check */
...
post_header_len= my_memdup(...)
}
In that case Format_description_log_event constructor will return early,
without allocating any memory for post_header_len. Thence this variable is
left uninitialized and making server to crash when server is trying
to free the uninitialized value.
Fix: When Format_description_log_event constructor returns early, assign
NULL to post_header_len.
Don't restore the whole of thd->server_status after a routine invocation,
only restore SERVER_STATUS_CURSOR_EXISTS and SERVER_STATUS_LAST_ROW_SENT,
as --ps --embedded needs.
In particular, don't restore SERVER_STATUS_IN_TRANS.
Several string functions have optimizations for constant
sub-expressions which lead to setting max_length == 0.
For subqueries, where we need a temporary table to holde the result,
we need to ensure that we use a VARCHAR(0) column rather than a
CHAR(0) column when such expressions take part in grouping.
With CHAR(0) end_update() may write garbage into the next field.
Several string functions have optimizations for constant
sub-expressions which lead to setting max_length == 0.
For subqueries, where we need a temporary table to holde the result,
we need to ensure that we use a VARCHAR(0) column rather than a
CHAR(0) column when such expressions take part in grouping.
With CHAR(0) end_update() may write garbage into the next field.
After-review fixes.
Mainly catching if the wait in wait_for_workers_idle() is aborted due to kill.
In this case, we should return an error and not proceed to execute the format
description event, as other threads might still be running for a bit until the
error is caught in all threads.
Follow-up patch, fixing a possible deadlock issue.
If the master crashes in the middle of an event group, there can be an active
transaction in a worker thread when we encounter the following master restart
format description event. In this case, we need to notify that worker thread
to abort and roll back the partial event group. Otherwise a deadlock occurs:
the worker thread waits for the commit that never arrives, and the SQL driver
thread waits for the worker thread to complete its event group, which it never
does.
Fix the bug properly (plugin cannot be unloaded as long as it's locked).
Enable and fix the test case.
Significantly reduce number of LOCK_plugin locks for semisync
(practically all locks were removed)
~40% bugfixed(*) applied
~40$ bugfixed reverted (incorrect or we're not buggy)
~20% bugfixed applied, despite us being not buggy
(*) only changes in the server code, e.g. not cmakefiles