replication causing replication to fail.
Remove the temporary fix for MDEV-5914, which used READ COMMITTED for parallel
replication worker threads. Replace it with a better, more selective solution.
The issue is with certain edge cases of InnoDB gap locks, for example between
INSERT and ranged DELETE. It is possible for the gap lock set by the DELETE to
block the INSERT, if the DELETE runs first, while the record lock set by
INSERT does not block the DELETE, if the INSERT runs first. This can cause a
conflict between the two in parallel replication on the slave even though they
ran without conflicts on the master.
With this patch, InnoDB will ask the server layer about the two involved
transactions before blocking on a gap lock. If the server layer tells InnoDB
that the transactions are already fixed wrt. commit order, as they are in
parallel replication, InnoDB will ignore the gap lock and allow the two
transactions to proceed in parallel, avoiding the conflict.
Improve the fix for MDEV-6020. When InnoDB itself detects a deadlock, it now
asks the server layer for any preferences about which transaction to roll
back. In case of parallel replication with two transactions T1 and T2 fixed to
commit T1 before T2, the server layer will ask InnoDB to roll back T2 as the
deadlock victim, not T1. This helps in some cases to avoid excessive deadlock
rollback, as T2 will in any case need to wait for T1 to complete before it can
itself commit.
Also some misc. fixes found during development and testing:
- Remove thd_rpl_is_parallel(), it is not used or needed.
- Use KILL_CONNECTION instead of KILL_QUERY when a parallel replication
worker thread is killed to resolve a deadlock with fixed commit
ordering. There are some cases, eg. in sql/sql_parse.cc, where a KILL_QUERY
can be ignored if the query otherwise completed successfully, and this
could cause the deadlock kill to be lost, so that the deadlock was not
correctly resolved.
- Fix random test failure due to missing wait_for_binlog_checkpoint.inc.
- Make sure that deadlock or other temporary errors during parallel
replication are not printed to the the error log; there were some places
around the replication code with extra error logging. These conditions can
occur occasionally and are handled automatically without breaking
replication, so they should not pollute the error log.
- Fix handling of rgi->gtid_sub_id. We need to be able to access this also at
the end of a transaction, to be able to detect and resolve deadlocks due to
commit ordering. But this value was also used as a flag to mark whether
record_gtid() had been called, by being set to zero, losing the value. Now,
introduce a separate flag rgi->gtid_pending, so rgi->gtid_sub_id remains
valid for the entire duration of the transaction.
- Fix one place where the code to handle ignored errors called reset_killed()
unconditionally, even if no error was caught that should be ignored. This
could cause loss of a deadlock kill signal, breaking deadlock detection and
resolution.
- Fix a couple of missing mysql_reset_thd_for_next_command(). This could
cause a prior error condition to remain for the next event executed,
causing assertions about errors already being set and possibly giving
incorrect error handling for following event executions.
- Fix code that cleared thd->rgi_slave in the parallel replication worker
threads after each event execution; this caused the deadlock detection and
handling code to not be able to correctly process the associated
transactions as belonging to replication worker threads.
- Remove useless error code in slave_background_kill_request().
- Fix bug where wfc->wakeup_error was not cleared at
wait_for_commit::unregister_wait_for_prior_commit(). This could cause the
error condition to wrongly propagate to a later wait_for_prior_commit(),
causing spurious ER_PRIOR_COMMIT_FAILED errors.
- Do not put the binlog background thread into the processlist. It causes
too many result differences in mtr, but also it probably is not useful
for users to pollute the process list with a system thread that does not
really perform any user-visible tasks...
moved releasing of wsrep THD mutex and thd->awake later, so that wsrep->abort_pre_commit()
is guaranteed to run for a thread which is still in conflict state
Due to how gap locks work, two transactions could group commit together on the
master, but get lock conflicts and then deadlock due to different thread
scheduling order on slave.
For now, remove these deadlocks by running the parallel slave in READ
COMMITTED mode. And let InnoDB/XtraDB allow statement-based binlogging for the
parallel slave in READ COMMITTED.
We are also investigating a different solution long-term, which is based on
relaxing the gap locks only between the transactions running in parallel for
one slave, but not against possibly external transactions.
AUTO_INCREMENT_INCREMENT
Problem:
=======
When auto_increment_increment system variable decreases,
immediate next value of auto increment column is not affected.
Solution:
========
Get the previous inserted value of auto increment column by
subtracting the previous auto_increment_increment from next
auto increment value. After that calculate the current autoinc value
using newly changed auto_increment_increment variable.
Approved by Sunny [rb#4394]
AUTO_INCREMENT_INCREMENT
Problem:
=======
When auto_increment_increment system variable decreases,
immediate next value of auto increment column is not affected.
Solution:
========
Get the previous inserted value of auto increment column by
subtracting the previous auto_increment_increment from next
auto increment value. After that calculate the current autoinc value
using newly changed auto_increment_increment variable.
Approved by Sunny [rb#4394]
Syntax. Server support. Test cases.
InnoDB bugfixes:
* don't mess around with system sprintf's, always use my_error() for errors.
* don't use InnoDB internal error codes where OS error codes are expected.
* don't say "file not found", when it was.
Update InnoDB to 5.6.14
Apply MySQL-5.6 hack for MySQL Bug#16434374
Move Aria-only HA_RTREE_INDEX from my_base.h to maria_def.h (breaks an assert in InnoDB)
Fix InnoDB memory leak
The maximum value for innodb_thread_sleep_delay is 4294967295 (32-bit) or
18446744073709551615 (64-bit) microseconds. This is way too big, since
the max value of innodb_thread_sleep_delay is limited by
innodb_adaptive_max_sleep_delay if that value is set to non-zero value
(its default is 150,000).
Solution
The maximum value of innodb_thread_sleep_delay should be the same as
the maximum value of innodb_adaptive_max_sleep_delay, which is 1000000.
Approved by Jimmy, rb#4429
The maximum value for innodb_thread_sleep_delay is 4294967295 (32-bit) or
18446744073709551615 (64-bit) microseconds. This is way too big, since
the max value of innodb_thread_sleep_delay is limited by
innodb_adaptive_max_sleep_delay if that value is set to non-zero value
(its default is 150,000).
Solution
The maximum value of innodb_thread_sleep_delay should be the same as
the maximum value of innodb_adaptive_max_sleep_delay, which is 1000000.
Approved by Jimmy, rb#4429
IN DOCUMENTATION
Problem
-------
The documentation says that we support 'K' prefix
while specifiying size for innodb datafile in the
server variable for innodb_data_file_path ,but the
function srv_parse_megabytes() only handles only
'M' (megabytes) and 'G' (gigabytes) .
Fix
---
Modify srv_parse_megabytes() to handle Kilobytes.
Add in documentation that while specifying size
in KB it should be mentioned in multiples of 1024
other wise they will be rounded off to nearest
MB (megabyte) boundry .(eg if size mentioned
as 2313KB will be considered as 2 MB ).
[ Approved by Marko #rb 2387 ]
IN DOCUMENTATION
Problem
-------
The documentation says that we support 'K' prefix
while specifiying size for innodb datafile in the
server variable for innodb_data_file_path ,but the
function srv_parse_megabytes() only handles only
'M' (megabytes) and 'G' (gigabytes) .
Fix
---
Modify srv_parse_megabytes() to handle Kilobytes.
Add in documentation that while specifying size
in KB it should be mentioned in multiples of 1024
other wise they will be rounded off to nearest
MB (megabyte) boundry .(eg if size mentioned
as 2313KB will be considered as 2 MB ).
[ Approved by Marko #rb 2387 ]
ERRORS IN THE FK SECTION
ANALYSIS
--------
Any error during the renaming of the table was
incorrectly logged in the dict_foreign_err_file
and it showed up in foreign key section when
we give the query "show engine innodb status".
FIX
---
Prevent renaming error from being logged in
dict_foreign_err_file section.
[Aprooved by marko #rb 2501 ]
ERRORS IN THE FK SECTION
ANALYSIS
--------
Any error during the renaming of the table was
incorrectly logged in the dict_foreign_err_file
and it showed up in foreign key section when
we give the query "show engine innodb status".
FIX
---
Prevent renaming error from being logged in
dict_foreign_err_file section.
[Aprooved by marko #rb 2501 ]
support ha_innodb.so as a dynamic plugin.
* remove obsolete *,innodb_plugin.rdiff files
* s/--plugin-load=/--plugin-load-add=/
* MYSQL_PLUGIN_IMPORT glob_hostname[]
* use my_error instead of push_warning_printf(ER_DEFAULT)
* don't use tdc_size and tc_size in a module
update test cases (XtraDB is 5.6.14, InnoDB is 5.6.10)
* copy new tests over
* disable some tests for (old) InnoDB
* delete XtraDB tests that no longer apply
small compatibility changes:
* s/HTON_EXTENDED_KEYS/HTON_SUPPORTS_EXTENDED_KEYS/
* revert unnecessary InnoDB changes to make it a bit closer to the upstream
fix XtraDB to compile on Windows (both as a static and a dynamic plugin)
disable XtraDB on Windows (deadlocks) and where no atomic ops are available (e.g. CentOS 5)
storage/innobase/handler/ha_innodb.cc:
revert few unnecessary changes to make it a bit closer to the original InnoDB
storage/innobase/include/univ.i:
correct the version to match what it was merged from
This is now otherwise on level wsrep-25.9, but storage/innobase has not been fully merged
wsrep-5.5 is not good source for that, so we probably have to cherry pick innodb changes from wsrep-5.6
restore old innodb get_innobase_type_from_mysql_type() function,
record all mysql_type->innodb_type mapping
(as generated by mysql-5.6).
add safety code to disable online alter when internal types don't match
storage/innobase/dict/dict0stats.cc:
revert to 5.6 state
The merge is still missing a few hunks related to temporary tables and
InnoDB log file size. The associated code did not seem to exist in
10.0, so the merge of that needs more work. Until this is fixed, there
are a number of test failures as a result.
For compatibility purposes let InnoDB use DATA_INT for MYSQL_TYPE_ENUM and MYSQL_TYPE_SET.
Silence the warning for these types and let the index translation table to be built anyway.
Test case by Jeremy Cole.
--Implemented CHECK TABLE...QUICK.
Introduce CHECK TABLE...QUICK that would skip the btr_validate_index()
and btr_search_validate() call, and count the no. of records in each index.
Approved by Marko and Kevin. (rb#3567).
--Implemented CHECK TABLE...QUICK.
Introduce CHECK TABLE...QUICK that would skip the btr_validate_index()
and btr_search_validate() call, and count the no. of records in each index.
Approved by Marko and Kevin. (rb#3567).