LP bug #1035225 / MySQL bug #66301: INSERT ... ON DUPLICATE KEY UPDATE +
innodb_autoinc_lock_mode=1 is broken
The problem was that when certain INSERT ... ON DUPLICATE KEY UPDATE
were executed concurrently on a table containing an AUTO_INCREMENT
column as a primary key, InnoDB would correctly reserve non-overlapping
AUTO_INCREMENT intervals for each statement, but when the server
encountered the first duplicate key error on the secondary key in one of
the statements and performed an UPDATE, it also updated the internal
AUTO_INCREMENT value to the one from the existing row that caused a
duplicate key error, even though the AUTO_INCREMENT value was not
specified explicitly in the UPDATE clause. It would then proceed with
using AUTO_INCREMENT values the range reserved previously by another
statement, causing duplicate key errors on the AUTO_INCREMENT column.
Fixed by changing write_record() to ensure that in case of a duplicate
key error the internal AUTO_INCREMENT counter is only updated when the
AUTO_INCREMENT value was explicitly updated by the UPDATE
clause. Otherwise it is restored to what it was before the duplicate key
error, as that value is unused and can be reused for subsequent
successfully inserted rows.
sql/sql_insert.cc:
Don't update next_insert_id to the value of a row found during ON DUPLICATE KEY UPDATE.
sql/sql_parse.cc:
Added DBUG_SYNC
sql/table.h:
Added next_number_field_updated flag to detect changing of auto increment fields.
Moved fields a bit to get bool fields after each other (better alignment)
Increment long_query_count also if thd->variables.log_slow_rate_limit is used
Added new state "Writing to binlog"
sql/sql_class.h:
Added THD::utime_after_query to avoid calling current_utime() twice for every end-of-query
sql/sql_parse.cc:
Increment long_query_count also if thd->variables.log_slow_rate_limit is used
Removed extra calls to thd_proc_info(thd, "logging slow query") and thd->current_utime();
sql/sql_table.cc:
Added new state "Writing to binlog"
feature_dynamic_columns,feature_fulltext,feature_gis,feature_locale,feature_subquery,feature_timezone,feature_trigger,feature_xml
Opened_views, Executed_triggers, Executed_events
Added new process status 'updating status' as part of 'freeing items'
mysql-test/r/features.result:
Test of feature_xxx status variables
mysql-test/r/mysqld--help.result:
Removed duplicated 'language' variable.
mysql-test/r/view.result:
Test of opened_views
mysql-test/suite/rpl/t/rpl_start_stop_slave.test:
Write more information on failure
mysql-test/t/features.test:
Test of feature_xxx status variables
mysql-test/t/view.test:
Test of opened_views
sql/event_scheduler.cc:
Increment executed_events status variable
sql/field.cc:
Increment status variable
sql/item_func.cc:
Increment status variable
sql/item_strfunc.cc:
Increment status variable
sql/item_subselect.cc:
Increment status variable
sql/item_xmlfunc.cc:
Increment status variable
sql/mysqld.cc:
Add new status variables to 'show status'
sql/mysqld.h:
Added executed_events
sql/sql_base.cc:
Increment status variable
sql/sql_class.h:
Add new status variables
sql/sql_parse.cc:
Added new process status 'updating status' as part of 'freeing items'
sql/sql_trigger.cc:
Increment status variable
sql/sys_vars.cc:
Increment status variable
sql/tztime.cc:
Increment status variable
updates to system, statistics and admin tables logged to binary log.
- Removed special code used to temporarily change to statement based replication.
- Changed to a faster and smaller interface for temporarily switching to statement based replication.
sql/event_db_repository.cc:
Change to new interface to not use row based replication for system table changes.
sql/events.cc:
Change to new interface to not use row based replication for system table changes.
sql/sp.cc:
Removed temporarily switching to statement based replication (this is now done automaticly in mysql_execute_command())
sql/sql_acl.cc:
Change to new interface to not use row based replication for system table changes.
Removed temporarily switching to statement based replication (this is now done automaticly in mysql_execute_command())
sql/sql_class.h:
Added new interface for temporarily switching to statement based replication.
sql/sql_parse.cc:
Mark commands that needs original replication mode with CF_FORCE_ORIGINAL_BINLOG_FORMAT.
Switch automaticly to statement based replication for statements that can't generate row based events (and can't change replication mode)
sql/sql_udf.cc:
Removed temporarily switching to statement based replication (this is now done automaticly in mysql_execute_command())
two tests still fail:
main.innodb_icp and main.range_vs_index_merge_innodb
call records_in_range() with both range ends being open
(which triggers an assert)
- Fix the year in Monty Program Ab copyrights in the new files.
- Fix permissions handling so that SHOW EXPLAIN's handling is the
same as SHOW PROCESSLIST's.
The bug prevented acceptance of UNION queries whose non-first select
clauses contained join expressions with degenerated single-table nests
as valid queries.
The bug was introduced into mysql-5.5 code line by the patch for
bug 33204.
Analysis:
-------------
If server is started with limit of MAX_CONNECTIONS and
MAX_USER_CONNECTIONS then only MAX_USER_CONNECTIONS of any particular
users can be connected to server and total MAX_CONNECTIONS of client can
be connected to server.
Server maintains a counter for total CONNECTIONS and total CONNECTIONS
from particular user.
Here, MAX_CONNECTIONS of connections are created to server. Out of this
MAX_CONNECTIONS, connections from particular user (say USER1) are
also created. The connections from USER1 is lesser than
MAX_USER_CONNECTIONS. After that there was one more connection request from
USER1. Since USER1 can still create connections as he havent reached
MAX_USER_CONNECTIONS, server increments counter of CONNECTIONS per user.
As server already has MAX_CONNECTIONS of connections, next check to total
CONNECTION count fails. In this case control is returned WITHOUT
decrementing the CONNECTIONS per user. So the counter per user CONNECTIONS goes
on incrementing for each attempt until current connections are closed.
And because of this counter per CONNECTIONS reached MAX_USER_CONNECTIONS.
So, next connections form USER1 user always returns with MAX_USER_CONNECTION
limit error, even when total connection to sever are less than MAX_CONNECTIONS.
Fix:
-------------
This issue is occurred because of not handling counters properly in the
server. Changed the code to handle per user connection counters properly.
PROBLEM:
Threads end-up in deadlock due to locks acquired as described
below,
con1: Run Query on a table.
It is important that this SELECT must back-off while
trying to open the t1 and enter into wait_for_condition().
The SELECT then is blocked trying to lock mysys_var->mutex
which is held by con3. The very significant fact here is
that mysys_var->current_mutex will still point to LOCK_open,
even if LOCK_open is no longer held by con1 at this point.
con2: Try dropping table used in con1 or query some table.
It will hold LOCK_open and be blocked trying to lock
kernel_mutex held by con4.
con3: Try killing the query run by con1.
It will hold THD::LOCK_thd_data belonging to con1 while
trying to lock mysys_var->current_mutex belonging to con1.
But current_mutex will point to LOCK_open which is held
by con2.
con4: Get innodb engine status
It will hold kernel_mutex, trying to lock THD::LOCK_thd_data
belonging to con1 which is held by con3.
So while technically only con2, con3 and con4 participate in the
deadlock, con1's mysys_var->current_mutex pointing to LOCK_open
is a vital component of the deadlock.
CYCLE = (THD::LOCK_thd_data -> LOCK_open ->
kernel_mutex -> THD::LOCK_thd_data)
FIX:
LOCK_thd_data has responsibility of protecting,
1) thd->query, thd->query_length
2) VIO
3) thd->mysys_var (used by KILL statement and shutdown)
4) THD during thread delete.
Among above responsibilities, 1), 2)and (3,4) seems to be three
independent group of responsibility. If there is different LOCK
owning responsibility of (3,4), the above mentioned deadlock cycle
can be avoid. This fix introduces LOCK_thd_kill to handle
responsibility (3,4), which eliminates the deadlock issue.
Note: The problem is not found in 5.5. Introduction MDL subsystem
caused metadata locking responsibility to be moved from TDC/TC to
MDL subsystem. Due to this, responsibility of LOCK_open is reduced.
As the use of LOCK_open is removed in open_table() and
mysql_rm_table() the above mentioned CYCLE does not form.
Revision ID for changes,
open_table() = dlenev@mysql.com-20100727133458-m3ua9oslnx8fbbvz
mysql_rm_table() = jon.hauglid@oracle.com-20101116100012-kxep9txz2fxy3nmw
BUG#11761686 insert_id event is not filtered.
Two issues are covered.
INSERT into autoincrement field which is not the first part in the composed primary key
is unsafe by autoincrement logging design. The case is specific to MyISAM engine
because Innodb does not allow such table definition.
However no warnings and row-format logging in the MIXED mode was done, and
that is fixed.
Int-, Rand-, User-var log-events were not filtered along with their parent
query that made possible them to screw up execution context of the following
query.
Fixed with deferring their execution until the parent query.
******
Bug#11754117
Post review fixes.
mysql-test/suite/rpl/r/rpl_auto_increment_bug45679.result:
a new result file is added.
mysql-test/suite/rpl/r/rpl_filter_tables_not_exist.result:
results updated.
mysql-test/suite/rpl/t/rpl_auto_increment_bug45679.test:
regression test for BUG#11754117-45670 is added.
mysql-test/suite/rpl/t/rpl_filter_tables_not_exist.test:
regression test for filtering issue of BUG#11754117 - 45670 is added.
sql/log_event.cc:
Logics are added for deferring and executing events associated
with the Query event.
sql/log_event.h:
Interface to deferred events batch execution is added.
sql/rpl_rli.cc:
initialization for new RLI members is added.
sql/rpl_rli.h:
New members to RLI are added to facilitate deferred events gathering
and execution control;
two general character RLI cleanup methods are constructed.
sql/rpl_utility.cc:
Deferred_log_events methods are difined.
sql/rpl_utility.h:
A new class Deferred_log_events is defined to implement
IRU events gathering, execution and cleanup.
sql/slave.cc:
Necessary changes to initialize `rli->deferred_events' and prevent
deferred event deletion in the main read-exec branch.
sql/sql_base.cc:
A new safe-check function for multi-part pk with auto-increment is defined
and deployed in lock_tables().
sql/sql_class.cc:
Initialization for a new member and replication cleanups are added
to THD class.
sql/sql_class.h:
THD class receives a new member to hold a specific execution
context for slave applier.
sql/sql_parse.cc:
Execution of the deferred event in started prior to its parent query.
Currently SHOW MASTER LOGS and SHOW BINARY LOGS require the SUPER
privilege. Monitoring tools (such as MEM) often want to check this
output - for instance MEM generates the SUM of the sizes of the logs
reported here, and puts that in the Replication overview within the MEM
Dashboard.
However, because of the SUPER requirement, these tools often have an
account that holds open the connection whilst monitoring, and can lock
out administrators when the server gets overloaded and reaches
max_connections - there is already another SUPER privileged account
connected, the "monitor".
As SHOW MASTER STATUS, and all other replication related statements,
return with either REPLICATION CLIENT or SUPER privileges, this worklog
is to make SHOW MASTER LOGS and SHOW BINARY LOGS be consistent with this
as well, and allow both of these commands with either SUPER or
REPLICATION CLIENT.
This allows monitoring tools to not require a SUPER privilege any more,
so is safer in overloaded situations, as well as being more secure, as
lighter privileges can be given to users of such tools or scripts.
mysql-test/suite/innodb/t/group_commit_crash.test:
remove autoincrement to avoid rbr being used for insert ... select
mysql-test/suite/innodb/t/group_commit_crash_no_optimize_thread.test:
remove autoincrement to avoid rbr being used for insert ... select
mysys/my_addr_resolve.c:
a pointer to a buffer is returned to the caller -> the buffer cannot be on the stack
mysys/stacktrace.c:
my_vsnprintf() is ok here, in 5.5
https://mariadb.atlassian.net/browse/MDEV-28
This task implements a new clause LIMIT ROWS EXAMINED <num>
as an extention to the ANSI LIMIT clause. This extension
allows to limit the number of rows and/or keys a query
would access (read and/or write) during query execution.
If a query's end time is before before its start time, the system clock has been turn back
(daylight savings time etc.). When the system clock is changed, we can't tell for certain a
given query was actually slow. We did not protect against logging such a query with a bogus
execution time (resulting from end_time - start_time being negative), and possibly logging it
even though it did not really take long to run.
We now have a sanity check in place.
sql/sql_parse.cc:
Make sure end time is not before start time - otherwise, we can be SURE the system clock
was changed in between, but not by how much. In other words, when the clock is changed,
we don't know how long a query ran, and whether it was slow.
The code was accessing a pointer in a mem_root that might be freed by
another concurrent thread. Fix by moving the access to be done while the
LOCK_thd_data is held, preventing the memory from being freed too early.
Problem: Statements that write to tables with auto_increment columns
based on the selection from another table, may lead to master
and slave going out of sync, as the order in which the rows
are retrived from the table may differ on master and slave.
Solution: We mark writing to a table with auto_increment table
as unsafe. This will cause the execution of such statements to
throw a warning and forces the statement to be logged in ROW if
the logging format is mixed.
Changes:
1. All the statements that writes to a table with auto_increment
column(s) based on the rows fetched from another table, will now
be unsafe.
2. CREATE TABLE with SELECT will now be unsafe.
sql/share/errmsg-utf8.txt:
Added new Warning messages
sql/sql_base.cc:
created a new function that checks for select + write on a autoinc table
made all such statements to be unsafe.
sql/sql_parse.cc:
made create autoincremnet tabble + select unsafe
MEMORY LEAK.
Background:
- There are caches for stored functions and stored procedures (SP-cache);
- There is no similar cache for events;
- Triggers are cached together with TABLE objects;
- Those SP-caches are per-session (i.e. specific to each session);
- A stored routine is represented by a sp_head-instance internally;
- SP-cache basically contains sp_head-objects of stored routines, which
have been executed in a session;
- sp_head-object is added into the SP-cache before the corresponding
stored routine is executed;
- SP-cache is flushed in the end of the session.
The problem was that SP-cache might grow without any limit. Although this
was not a pure memory leak (the SP-cache is flushed when session is closed),
this is still a problem, because the user might take much memory by
executing many stored routines.
The patch fixes this problem in the least-intrusive way. A soft limit
(similar to the size of table definition cache) is introduced. To represent
such limit the new runtime configuration parameter 'stored_program_cache'
is introduced. The value of this parameter is stored in the new global
variable stored_program_cache_size that used to control the size of SP-cache
to overflow.
The parameter 'stored_program_cache' limits number of cached routines for
each thread. It has the following min/default/max values given from support:
min = 256, default = 256, max = 512 * 1024.
Also it should be noted that this parameter limits the size of
each cache (for stored procedures and for stored functions) separately.
The SP-cache size is checked after top-level statement is parsed.
If SP-cache size exceeds the limit specified by parameter
'stored_program_cache' then SP-cache is flushed and memory allocated for
cache objects is freed. Such approach allows to flush cache safely
when there are dependencies among stored routines.
sql/mysqld.cc:
Added global variable stored_program_cache_size to store value of
configuration parameter 'stored-program-cache'.
sql/mysqld.h:
Added declaration of global variable stored_program_cache_size.
sql/sp_cache.cc:
Extended interface for sp_cache by adding helper routine
sp_cache_enforce_limit to control size of stored routines cache for
overflow. Also added method enforce_limit into class sp_cache that
implements control of cache size for overflow.
sql/sp_cache.h:
Extended interface for sp_cache by adding standalone routine
sp_cache_enforce_limit to control size of stored routines cache
for overflow.
sql/sql_parse.cc:
Added flush of sp_cache after processing of next sql-statement
received from a client.
sql/sql_prepare.cc:
Added flush of sp_cache after preparation/execution of next prepared
sql-statement received from a client.
sql/sys_vars.cc:
Added support for configuration parameter stored-program-cache.