Adjust the test case to try and avoid some sporadic failures on loaded test
hosts.
The wait for SQL thread to stop may complete before worker threads
have completed.
The code was using the wrong variable when comparing the binlog name
for the UNTIL position. This could cause the comparison to fail after
binlog rotation, in turn causing the UNTIL clause to not trigger slave
stop.
The assertion is there to catch cases where we rollback while
mark_start_commit() is active. This can allow following event groups
to be replicated too early, causing conflicts.
But in this case, we have an _explicit_ ROLLBACK event in the binlog,
which should not assert.
We fix this by delaying the mark_start_commit() in the explicit
ROLLBACK case. It seems safest to delay this in ROLLBACK case anyway,
and there should be no reason to try to optimise this corner case.
PROBLEMS
Description:- Server variable "--lower_case_tables_names"
when set to "0" on windows platform which does not support
case sensitive file operations leads to problems. A warning
message is printed in the error log while starting the
server with "--lower_case_tables_names=0". Also according to
the documentation, seting "lower_case_tables_names" to "0"
on a case-insensitive filesystem might lead to index
corruption.
Analysis:- The problem reported in the bug is:-
Creating an INNODB table 'a' and executing a query, "INSERT
INTO a SELECT a FROM A;" on a server started with
"--lower_case_tables_names=0" and running on a
case-insensitive filesystem leads innodb to flat spin.
Optimizer thinks that "a" and "A" are two different tables
as the variable "lower_case_table_names" is set to "0". As a
result, optimizer comes up with a plan which does not need a
temporary table. If the same table is used in select and
insert, a temporary table is needed. This incorrect
optimizer plan leads to infinite insertions.
Fix:- If the server is started with
"--lower_case_tables_names" set to 0 on a case-insensitive
filesystem, an error, "The server option
'lower_case_table_names'is configured to use case sensitive
table names but the data directory is on a case-insensitive
file system which is an unsupported combination. Please
consider either using a case sensitive file system for your
data directory or switching to a case-insensitive table name
mode.", is printed in the server error log and the server
exits.
This bug is essentially another variant of MDEV-7458.
If a transaction conflict caused a deadlock kill of T2 in record_gtid()
during commit, the code would do a rollback _before_ running
rgi->unmark_start_commit(). This creates a race where following transactions
could start too early (before T2 has completed its transaction retry). This
in turn could lead to replication failure, if there was a conflict that
caused eg. duplicate key error or similar.
The fix is to remove these rollbacks (in Query_log_event::do_apply_event()
and Xid_log_event::do_apply_event(). They seem out-of-place; code in
log_event.cc generally does not roll back on error, this is handled higher
up.
In addition, because of the extreme difficulty of reproducing bugs like
MDEV-7458 and MDEV-8302, this patch adds some extra precations to try to
detect (in debug builds) or prevent (in release builds) similar bugs.
ha_rollback_trans() will now call unmark_start_commit() if needed (and
assert in debug build when a caller does rollback without unmark first).
We also add an extra check for thd->killed() so that we avoid doing
mark_start_commit() if we already have a pending deadlock kill.
And we add a missing unmark_start_commit() call in the error case, found by
the above assertion.
There is several different ways to incorrectly define
foreign key constraint. In many cases earlier MariaDB
versions the error messages produced by these cases
are not very clear and helpful. This patch improves
the warning messages produced by foreign key parsing.
with plugin-load-add that are already registered at mysql.plugin
- issue just one error message, without this extra warning
- don't abuse ER_UDF_EXISTS, instead add a proper error message for plugins
- report started initialization for each plugin source
Fix was to add a test in Query_log_event::Query_log_event() if we are using
CREATE ... SELECT and in this case use trans cache, like we do on the master.
This avoid using (with doesn't have checksum)
Other things:
- Removed dummy call my_checksum(0L, NULL, 0)
- More DBUG_PRINT
- Cleaned up Log_event::need_checksum() to make it more readable (similar as in MySQL 5.6)
- Renamed variable that was hiding another one in create_table_imp()
Analysis: At check_trx_exists function InnoDB allocates
a new trx if no trx is found from thd but this newly
allocated trx is not registered to thd. This is unsafe,
because nothing prevents InnoDB plugin from being uninstalled
while there's active transaction. This can cause crashes, hang
and any other odd behavior. It may also corrupt stack, as
functions pointers are not available after dlclose.
Fix: The fix is to use thd_set_ha_data() when
manipulating per-connection handler data. It does appropriate
plugin locking.
Problem was that test just takes too long time in slow I/O and triggers
testcase timeout. Reduced the number of operations and inserts to make
test shorter.
The fix is that if the slave has a different integer size than
the master, then they will assume the master has the same signed/unsigned modifier
as the slave.
This means that one can safely change a coon the slave an int to a bigint
or an unsigned int to an unsigned int. Changing an unsigned int to an
signed bigint will cause replication failures when the high bit of the
unsigned int is set.
We can't give an error if the signess is different on the master and slave
as the binary log doesn't contain the signess of the column on the master.
Analysis; Problem is that InnoDB does not have support for generating
CURRENT_TIMESTAMP or constant default.
Fix: Add additional check if column has changed from NULL -> NOT NULL
and column default has changed. If this is is first column definition
whose SQL type is TIMESTAMP and it is defined as NOT NULL and
it has either constant default or function default we must use
"Copy" method for alter table.
The --gtid-ignore-duplicates option was not working correctly with row-based
replication. When a row event was completed, but before committing, there
was a small window where another multi-source SQL thread could wrongly try
to re-execute the same transaction, without properly ignoring the duplicate
GTID. This would lead to duplicate key error or out-of-order GTID error or
similar.
Thanks to Matt Neth for reporting this and giving an easy way to reproduce
the issue.
Problem:
If we add a referential integrity constraint with a duplicate
name, an error occurs. The foreign key object would not have
been added to the dictionary cache. In the error path, there
is an attempt to remove this foreign key object. Since this
object is not there, the search returns a NULL result.
De-referencing the null object results in this crash.
Solution:
If the search to the foreign key object failed, then don't
attempt to access it.
rb#9309 approved by Marko.
in ha_delete_table()
* only convert ENOENT and HA_ERR_NO_SUCH_TABLE to warnings
* only return real error codes (that is, not ENOENT and
not HA_ERR_NO_SUCH_TABLE)
* intercept HA_ERR_ROW_IS_REFERENCED to generate backward
compatible ER_ROW_IS_REFERENCED
in mysql_rm_table_no_locks()
* no special code to handle HA_ERR_ROW_IS_REFERENCED
* no special code to handle ENOENT and HA_ERR_NO_SUCH_TABLE
* return multi-table error ER_BAD_TABLE_ERROR <table list> only
when there were many errors, not when there were many
tables to drop (but only one table generated an error)
Follow-up patch to temporarily avoid a sporadic failure in the test
rpl.rpl_000011 due to MDEV-8301.
There is a window during thread exit where the global status is
counted incorrectly - the contribution for the exiting thread is
counted twice. The patch for MDEV-8294 made this window visible to the
test case rpl.rpl_000011, causing it to sporadically fail. Temporarily
silence this with a wait for the expected value; can be removed once
MDEV-8294 is fixed.