FOREIGN_KEY_CHECKS is disabled
- dict_foreign_find_index() can return NULL if InnoDB already dropped
the foreign index when FOREIGN_KEY_CHECKS is disabled.
The problem happened in these lines:
uval0= (ulonglong) (val0_negative ? -val0 : val0);
uval1= (ulonglong) (val1_negative ? -val1 : val1);
return check_integer_overflow(val0_negative ? -(longlong) res : res,
!val0_negative);
when unary minus was performed on -9223372036854775808.
This behavior is undefined in C/C++.
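For reference, a hedged sketch (not the actual patch) of a well-defined way
to take the unsigned magnitude of a signed 64-bit value:

/* Negation in unsigned arithmetic is well-defined modulo 2^64, so the
   magnitude of -9223372036854775808 comes out as 9223372036854775808
   without invoking undefined behaviour. */
static unsigned long long unsigned_abs(long long v)
{
  return v < 0 ? 0ULL - (unsigned long long) v
               : (unsigned long long) v;
}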
This bug could manifest itself in very rare cases when the optimizer
chose an execution plan by which a joined table was accessed by a table
scan and the optimizer was checking whether ranges checked for each record
could improve this plan. In such cases the optimizer evaluates range
conditions over a table that depend on other tables. For such conditions
the constructed SEL_ARG trees are marked as MAYBE_KEY. If a SEL_ARG object
constructed for a sargable condition marked as RANGE_KEY had the same
first key part as a MAYBE_KEY SEL_ARG object and the key_and() function
was called for this pair of SEL_ARG objects then an invalid SEL_ARG
object could be constructed that ultimately could lead to a crash before
the execution phase.
'index_merge_sort_union=off'
When index_merge_sort_union is set to 'off' and index_merge_union is set
to 'on' then any evaluated index merge scan must consist only of ROR scans.
The cheapest of such index merges must be chosen, even though it might not
be the cheapest index merge overall.
* `--defaults-file` option is shown in `--help --verbose` output only if it
was applied
* `--defaults-extra-file` is now shown correctly in `--help --verbose`;
previously it was treated as a directory with `my.cnf` appended
The only change is a change of the version number.
In MySQL 5.6.46, the copyright comments in a number of files were changed
in mysql/mysql-server@f1a006ece7
but there was no functional change to InnoDB code.
This was also reflected by XtraDB. We are not changing the copyright
comments in MariaDB Server for now.
Between MySQL 5.6.46 and 5.6.47, InnoDB was not changed at all.
Actually, we had forgotten to update the InnoDB version number to
5.6.46. With this change, we are updating InnoDB
from 5.6.45 to 5.6.47 and XtraDB from 5.6.45-86.1 to 5.6.46-86.2.
Remove the offending test case. This sort of error is hard to test in
all possible corner cases and thus makes the test less valuable. The
overflow error will be covered by warnings generated by the compiler,
which is much more reliable in the general case.
Problem:
========
SHOW BINLOG EVENTS FROM <pos> causes a variety of failures, some of which are
listed below. It is not a race condition issue, but there is some
non-determinism in it.
Analysis:
========
"show binlog events from <pos>" code considers the user given position as a
valid event start position. The code starts reading data from this event start
position onwards and tries to map it to a set of known events. Each event has
a specific event structure and asserts have been added to ensure that read
event data satisfies the event specific requirements. When a random position
is supplied to "show binlog events command" the event structure specific
checks will fail and they result in assert.
Fix:
====
The fix is split into different parts. Each part addresses either an ASAN
issue or an assert/crash.
**Part1: Checksum based position validation when checksum is enabled**
Using the checksum, validate the very first event read at the user-specified
position. If there is a checksum mismatch, report an appropriate error for
the invalid event.
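A hedged sketch of the idea (zlib's crc32() stands in for the server's
checksum routine; the binlog CRC-32 uses the same ISO-3309 polynomial):

#include <stddef.h>
#include <stdint.h>
#include <zlib.h>

/* An event's last 4 bytes hold a little-endian CRC-32 of everything
   before them. A mismatch on the very first event read means the user
   supplied an invalid start position. */
static int event_checksum_ok(const unsigned char *ev, size_t len)
{
  uint32_t stored, computed;
  if (len < 4)
    return 0;
  stored= (uint32_t) ev[len - 4] | ((uint32_t) ev[len - 3] << 8) |
          ((uint32_t) ev[len - 2] << 16) | ((uint32_t) ev[len - 1] << 24);
  computed= (uint32_t) crc32(0L, ev, (uInt) (len - 4));
  return stored == computed;
}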
Set the read_set bitmap for a view from the JOIN::all_fields list instead of
JOIN::fields_list, as split_sum_func would have added items to the
all_fields list.
Item_ref::val_(datetime|time)_packed() erroneously called
(*ref)->val_(datetime|time)_packed().
- Fixing to call (*ref)->val_(datetime|time)_packed_result().
- Backporting Item::val_(datetime|time)_packed_result() from 10.3.
- Fixing Item_field::get_date_result() to handle null_value in the same
way as Item_field::get_date() does.
Problem:
=======
Test "binlog.binlog_parallel_replication_marks_row" fails sporadically due to
result length mismatch.
Analysis:
=========
The test generates a binary log, looks for certain words within the binary
log file, and prints them, for example words like "GTID", "BEGIN", "COMMIT".
Binary log output contains base64-encoded characters. Occasionally the
encoded characters match the above words, which results in test failure,
e.g.:
+XwoFWxMBAAAALgAAAGEDAAAAAB8AAAAAAAEABHRlc3QAAnQxAAIDAwACFGTIDQ==
+AAAAAAAAAAAEEwQADQgICAoKCgGTIDw9
Fix:
===
Improve the regular expression to match exact words.
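A hedged C++ illustration of the idea (the actual fix lives in the test's
pattern, not in server code):

#include <regex>
#include <string>

/* Word boundaries make base64 runs such as "...GFWxGTIDw9..." fail to
   match, while a standalone word "GTID" still matches. */
static bool contains_word(const std::string &line, const std::string &word)
{
  return std::regex_search(line, std::regex("\\b" + word + "\\b"));
}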
Let MTR check for error existence after running a test and return it back to
the user. The error reporting itself could be much better, but first of all
we need to see that something went wrong.
$glob_mysql_test_dir was the wrong directory for strace output, as
it applied to in-tree builds only, so the following failed:
* out of tree builds
* --parallel; and
* --mem
strace output wasn't saved.
--strace-option never replaced existing arguments (so the documentation
was amended).
--strace-client didn't accept an argument as described. Specifying the
client strace program via this option was replaced with 'stracer', to be
consistent with the --debugger option.
For consistency with the debugger options, --client-strace was added to
run strace on mysqltest.
Example: Running one test
$ ./mtr --strace --client-strace funcs_1.is_table_constraints
Logging: ./mtr --strace --client-strace funcs_1.is_table_constraints
vardir: /home/anel/mariadb/5.5/mysql-test/var
Checking leftover processes...
Removing old var directory...
- WARNING: Using the 'mysql-test/var' symlink
Creating var directory '/home/anel/mariadb/5.5/mysql-test/var'...
Checking supported features...
MariaDB Version 5.5.67-MariaDB-debug
Installing system database...
- SSL connections supported
- binaries are debug compiled
Collecting tests...
==============================================================================
TEST RESULT TIME (ms) or COMMENT
--------------------------------------------------------------------------
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
funcs_1.is_table_constraints [ pass ] 1270
--------------------------------------------------------------------------
The servers were restarted 0 times
Spent 1.270 of 3 seconds executing testcases
Completed: All 1 tests were successful
$ find -L . -name \*strace -ls
653 56 -rw-r--r-- 1 anel anel 57147 Nov 29 15:08 ./var/log/mysqltest.strace
646 1768 -rw-r--r-- 1 anel anel 1809855 Nov 29 15:08 ./var/log/mysqld.1.strace
Example: Running test in parallel
$ mysql-test/mtr --strace --client-strace --mem --parallel=3 main.select
Logging: /home/dan/software_projects/mariadb-server/mysql-test/mysql-test-run.pl --strace --client-strace --mem --parallel=3 main.select
vardir: /home/dan/software_projects/build-mariadb-10.3/mysql-test/var
Checking leftover processes...
Removing old var directory...
Creating var directory '/home/dan/software_projects/build-mariadb-10.3/mysql-test/var'...
- symlinking 'var' to '/dev/shm/var_auto_0v2E'
Checking supported features...
MariaDB Version 5.5.67-MariaDB
- SSL connections supported
Collecting tests...
Installing system database...
==============================================================================
TEST WORKER RESULT TIME (ms) or COMMENT
--------------------------------------------------------------------------
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
worker[3] - 'localhost:16040' was not free
worker[2] Using MTR_BUILD_THREAD 301, with reserved ports 16020..16039
worker[3] Using MTR_BUILD_THREAD 303, with reserved ports 16060..16079
main.select w1 [ pass ] 7310
--------------------------------------------------------------------------
The servers were restarted 0 times
Spent 7.310 of 11 seconds executing testcases
Completed: All 1 tests were successful.
$ find mysql-test/var/ -name \*strace -ls
5213766 1212 -rw-r--r-- 1 dan dan 1237817 May 20 16:47 mysql-test/var/1/log/mysqltest.strace
5214733 13016 -rw-r--r-- 1 dan dan 13328335 May 20 16:47 mysql-test/var/1/log/mysqld.1.strace
$ mysql-test/mtr --strace --client-strace --strace-option='-e' --strace-option='trace=openat' --mem --parallel=3 main.select
...
$ find mysql-test/var/ -name \*strace -ls
5220790 8 -rw-r--r-- 1 dan dan 6291 May 20 17:02 mysql-test/var/3/log/mysqltest.strace
5224140 308 -rw-r--r-- 1 dan dan 314356 May 20 17:02 mysql-test/var/3/log/mysqld.1.strace
$ more mysql-test/var/3/mysqltest.strace
1692 openat(AT_FDCWD, "/home/dan/software_projects/mariadb-server/libmysql/.libs/tls/x86_64/x86_64/libpthread.so.0", O_RDONLY|O_CLOEXEC) =
-1 ENOENT (No such file or directory)
1692 openat(AT_FDCWD, "/home/dan/software_projects/mariadb-server/libmysql/.libs/tls/x86_64/libpthread.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOE
NT (No such file or directory)
Closes #600
The problem happens when a MariaDB master replicates writes for non-InnoDB
tables only (e.g. writes to MyISAM table(s)). An async slave node in a Galera
cluster can apply these writes successfully, but it will, in the end, write
the gtid position in the mysql.gtid_slave_pos table. mysql.gtid_slave_pos
uses the InnoDB engine, and this write makes the innodb handlerton part of
the replicated "transaction". Note that the wsrep patch identifies that the
write to gtid_slave_pos should not be replicated and skips appending wsrep
keys for these writes. However, as InnoDB was present in the transaction and
there are replication events (for the MyISAM table) in the transaction cache
but no appended keys, wsrep raises an error, and this makes the slave thread
stop.
The fix is simply to not treat it as an error if the async slave tries to
replicate a write set with binlog events but no keys. We just skip wsrep
replication and return success.
This commit also contains an mtr test which forces the mysql.gtid_slave_pos
table to be of InnoDB engine and executes a MyISAM-only write through async
replication.
There is an additional fix for declaring the IO and background slave threads
as non-wsrep. These threads should not write anything for wsrep replication;
this is just a safeguard to make sure nothing leaks into the cluster from
these slave threads.
This PR contains an mtr test for reproducing a failure when replicating a create table as select statement (CTAS) through asynchronous mariadb replication to a mariadb galera cluster.
The problem happens when CTAS replication contains both a create table statement followed by row events for populating the table. In such a situation, the galera node operating as mariadb replication slave will first replicate only the create table part into the cluster, and then perform another replication containing both the create table and the row events. This leads all other nodes to fail with a duplicate table creation attempt and crash due to this failure.
The PR also contains a fix, which identifies the situation when CTAS has been replicated and scans further in the async replication stream to see if row events follow. The slave node replicates either a single TOI in case the CTAS table is empty or, if the CTAS table contains rows, a single bundled write set with the create table and the row events to the galera cluster.
This fix should keep master server's GTID's for CTAS replication in sync with GTID's in galera cluster.
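A hedged sketch of the resulting decision (names are illustrative, not the
actual wsrep API):

/* After a CTAS Query event, the applier scans ahead in the replication
   stream for row events belonging to the same statement. */
enum ctas_replication_mode { CTAS_SINGLE_TOI, CTAS_BUNDLED_WRITESET };

static enum ctas_replication_mode
choose_ctas_replication(int has_following_row_events)
{
  /* empty CTAS: replicate the CREATE alone as one TOI; non-empty:
     one write set bundling the CREATE and the row events */
  return has_following_row_events ? CTAS_BUNDLED_WRITESET : CTAS_SINGLE_TOI;
}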
Make sure that the sort buffers can store at least one sort key.
This is needed to make sure that all merge buffers are read; otherwise,
with no sort keys, some merge buffers could be skipped because the code
concludes there is no data to be read.
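A hedged sketch of the invariant (names are illustrative):

#include <stddef.h>

/* Round the sort buffer up so it always holds at least one complete
   sort key; otherwise a merge buffer can look empty and be skipped. */
static size_t sort_buffer_size(size_t requested, size_t sort_key_length)
{
  return requested < sort_key_length ? sort_key_length : requested;
}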
Problem:
========
CURRENT_TEST: binlog_encryption.rpl_corruption
mysqltest: In included file "./include/wait_for_slave_io_error.inc":
...
At line 72: Slave stopped with wrong error code
**** Slave stopped with wrong error code: 1743 (expected 1595,1913) ****
Analysis:
========
The test emulates the corruption at the various stages of replication for
example in binlog file, in network and in relay log etc. It verifies that all
corruption cases are handled through appropriate error messages.
The test cases which emulate network failure expect following errors.
--ER_SLAVE_RELAY_LOG_WRITE_FAILURE (1595)
--ER_NETWORK_READ_EVENT_CHECKSUM_FAILURE (1743)
Ideally the test should expect the error codes 1595 and 1743,
but the test actually waits on the incorrect error codes 1595,1913.
Fix:
===
Added appropriate error code for 'ER_NETWORK_READ_EVENT_CHECKSUM_FAILURE'.
Replaced 1913 with 1743.
The assert indicates that the current transaction got caught uncleaned from
the semisync master's cache when it was signaled to proceed upon
receiving its ack.
The reason of missed cleanup turns out to be a flaw in the gtid
connect mode.
The binlog file *name* of the last received event, as submitted by the
connecting slave, was adopted into
{{Repl_semi_sync_master::m_reply_file_name}} as a part of semisync
initialization.
Notice that the initialization still refines the position part of the
submitted last received event's binlog coordinates.
The master-side binlog filename:pos refinement is
specific to the gtid connect mode, for the purpose of computing the latest
binlog file to resume slave feeding from.
Effectively, in the gtid connect mode the computed resumption filename:pos
may appear smaller, in which case a new transaction committing after connect
time may be logged with its filename:pos also less than the submitted
coordinates, and that triggers the assert.
Fixed by making the semisync initialization use the refined filename:pos.
It is guaranteed to be less than any newly generated transaction's
binlog:pos.
The issue here is the wrong estimate of the cardinality of a partial join,
the cardinality is too high because the function table_cond_selectivity()
returns an absurd number 100 while selectivity cannot be greater than 1.
When accessing table t by outer reference t1.a via index we do not perform any
range analysis for t. Yet we see that TABLE::quick_key_parts[key] and
TABLE::quick_rows[key] contain a non-zero value though these should have
remained untouched and equal to 0.
Thus the real cause of the problem is that TABLE::init does not clean the
arrays TABLE::quick_key_parts[] and TABLE::quick_rows[].
It should have done it because the TABLE structure created for any
instance of a table can be reused for many queries.
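A hedged sketch of the fix (the array shapes are illustrative):

#include <string.h>

enum { MAX_KEY_SKETCH= 64 };  /* illustrative bound, not the real MAX_KEY */

struct table_sketch
{
  unsigned quick_key_parts[MAX_KEY_SKETCH];
  unsigned long long quick_rows[MAX_KEY_SKETCH];
};

/* TABLE::init() must clear range-analysis results left over from a
   previous query, because TABLE objects are reused across queries. */
static void table_init(struct table_sketch *t)
{
  memset(t->quick_key_parts, 0, sizeof t->quick_key_parts);
  memset(t->quick_rows, 0, sizeof t->quick_rows);
}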
InnoDB: Assertion failure in file .../dict/dict0dict.cc line ...
InnoDB: Failing assertion: table->can_be_evicted
This fixes a regression that was caused by the fix of MDEV-20621
(commit a41d429765).
MySQL 5.6 (and MariaDB 10.0) introduced eviction of tables from
the InnoDB data dictionary cache. Tables that are connected to
FOREIGN KEY constraints or FULLTEXT INDEX are exempt of the eviction.
With the problematic change, a table that would already be exempt
from eviction due to FOREIGN KEY would cause the problem if there
also was a FULLTEXT INDEX defined on it.
dict_load_table(): Only prevent eviction if table->can_be_evicted holds.
The function prev_record_reads, which finds the number of different row
combinations for a subset of a partial join, did not take into account the
selectivity of the tables involved in this subset.
An unfortunate DROP TEMPORARY..IF EXISTS on a regular table may allow
subsequent CREATE TABLE statements to steal away the PFS_table_share
instance from the dropped table.
mysql_insert() first opens all affected tables (which implicitly
starts a transaction in InnoDB), then stat tables.
A failure to open a stat table caused open_tables() to abort
the current stmt transaction (trans_rollback_stmt()). So, from the
server point of view the following ha_write_row()-s happened outside
of a transactions, and the server didn't bother to commit them.
The server has a mechanism to prevent a transaction being
unexpectedly committed or rolled back in the middle of a statement -
if an operation takes place _in a sub-statement_ it cannot change
the transaction state. Operations on stat tables are exactly that -
they are not allowed to change a transaction state. Put them in
a sub-statement to make sure they don't.
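A hedged sketch of the approach (the guard is illustrative; the server has
its own sub-statement state save/restore):

/* While a sub-statement is active the transaction state may not change,
   so a failure in stat-table handling can no longer roll back the user
   statement's transaction. */
class substatement_guard
{
public:
  substatement_guard()  { /* save state, enter sub-statement */ }
  ~substatement_guard() { /* leave sub-statement, restore state */ }
};

static void update_stat_tables()
{
  substatement_guard guard;
  /* ... open and write the statistics tables here ... */
}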
Apply the changes to InnoDB and XtraDB that had been
inadvertently skipped in the merge
commit ae476868a5
That merge failure sabotaged part of MDEV-20127:
>Revert a problematic auto_increment_increment 'fix' from 2014.
>This involves replacing the MDEV-8827 fix and in 10.1,
>removing some WSREP instrumentation.
The code changes were re-merged manually by executing the following:
# Get the parent of the problematic merge.
git checkout ae476868a5394041a00e75a29c7d45917e8dfae8^
# Perform the merge again.
git merge ae476868a5394041a00e75a29c7d45917e8dfae8^2
# Get the conflict resolution from that merge.
git checkout ae476868a5 .
# Note: Any changes to these files were removed (empty diff)!
git diff HEAD storage/{innobase,xtradb}/handler/ha_innodb.cc
# Apply the code changes:
git diff cf40393471b10ca68cc1d2804c22ab9203900978^2..MERGE_HEAD \
storage/{innobase,xtradb}/handler/ha_innodb.cc|
patch -p1
InnoDB stores synced_doc_id + 1 value in FTS_CONFIG table. But
while reading the synced doc id from FTS_CONFIG table after restart,
InnoDB should read synced_doc_id - 1 to get the actual synced
doc id value.
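The off-by-one in a hedged sketch (the helper shapes are hypothetical):

#include <stdint.h>

static uint64_t fts_synced_doc_id_to_store(uint64_t synced_doc_id)
{
  return synced_doc_id + 1;   /* the value kept in FTS_CONFIG */
}

static uint64_t fts_synced_doc_id_from_store(uint64_t stored)
{
  return stored - 1;          /* the fix: undo the +1 when reading back */
}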
Using specially crafted strings one could overflow the `shift`
variable and cause a crash by dereferencing d10[-2147483648]
(on a sufficiently old gcc).
This is a correct fix and a test case for
Bug #29723340: MYSQL SERVER CRASH AFTER SQL QUERY WITH DATA ?AST
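A hedged sketch of the class of fix (names are illustrative): bound the
accumulated exponent so it cannot wrap into a huge negative index.

#include <limits.h>

static int accumulate_exponent(const char *s, int shift)
{
  for (; *s >= '0' && *s <= '9'; s++)
  {
    if (shift > (INT_MAX - (*s - '0')) / 10)
      return INT_MAX;          /* clamp instead of overflowing int */
    shift= shift * 10 + (*s - '0');
  }
  return shift;
}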
Apply the correct pattern for debug instrumentation:
SET @save_dbug=@@debug_dbug;
SET debug_dbug='+d,...';
...
SET debug_dbug=@save_dbug;
Numerous tests use statements of the form
SET debug_dbug='-d,...';
which will inadvertently enable all DBUG tracing output,
causing unnecessary waste of resources.
The test main.index_merge_innodb is taking a very long time,
especially on later versions (10.2 and 10.3).
Some of this could be attributed to the use of INSERT...SELECT,
which is time-consumingly creating explicit record locks in InnoDB
for the locking read in the SELECT part.
In 10.3 and later, some slowness can be attributed to MDEV-12288,
which makes the InnoDB purge thread spend time to reset transaction
identifiers in the inserted records. If we prevent purge from
running before all tables are dropped, the test seems to be
10% faster on an unoptimized debug build on 10.5. (A proper fix
would be to implement MDEV-515 and stop writing row-level undo log
records for inserts into an empty table or partition.)
At the same time, it should not hurt to make main.index_merge_myisam
use the sequence engine. Not only could it be a little faster,
but the test would also be slightly more readable.
Analysis:
========
In general, suppose there are three groups:
1 - Inserts 32, which fails due to a local entry '32' on the slave.
2 - Inserts 33
3 - Inserts 34
Each group considers itself a waiter, and it waits for the prior group, its
'waitee'. This is done in 'register_wait_for_prior_event_group_commit'. If
there is no other parallel group being scheduled then there will be no
waitee.
Let us assume 3 groups are being scheduled in parallel.
3 -> waits for 2 -> waits for 1
Upon completion, '1' checks whether there is any registered subsequent
waiter. If so, it wakes up the subsequent waiter with its execution status.
This execution status is stored in wakeup_error.
If '1' failed then it sends the corresponding wakeup_error to '2'. Then '2'
aborts and propagates the error to '3'. So all further commits are aborted.
This
mechanism works only when all transactions reach a stage where they are
waiting for their prior commit to complete.
In case of optimistic parallel replication, the following scenario occurs.
1,2,3 are scheduled in parallel.
3 - Reaches the group commit code and waits for 2 to complete.
1 - Errors out and sets stop_on_error_sub_id=1.
When a group execution results in error its corresponding sub_id is set to
'stop_on_error_sub_id'. Any new groups queued for execution will check if
their sub_id is > stop_on_error_sub_id. If it is true their execution will be
skipped as prior group execution failed. 'skip_event_group=1' will be set.
Since the execution of SQL thread is about to stop we just skip execution of
all the following event groups. We still do all the normal waiting and wakeup
processing between the event groups as a simple way to ensure that everything
is stopped and cleaned up correctly.
Upon error, transaction '1' checks for registered waiters. Since there are
none, it simply goes away.
2 - Starts the execution. It checks whether it has a waitee.
Since wait_commit_sub_id == entry->last_committed_sub_id, no waitee is set.
Secondly, 'entry->stop_on_error_sub_id' was set by '1's execution. Now
'handle_parallel_thread' code checks if the current group 'sub_id' is greater
than the 'sub_id' set within 'stop_on_error_sub_id'.
Since the above is true, 'skip_event_group=true' is set, and
'wait_for_prior_commit' is simply called to wake up all waiters. Group '2'
didn't have any waitee and its execution is skipped. Hence its
wakeup_error=0. It sends a positive wakeup signal to '3', which commits.
This results in a missed transaction, i.e. 33 is missed and 34 is
committed.
Fix:
===
When a worker learns that an earlier transaction execution has failed, and it
should not proceed for further execution, it should mark its own execution
status as failed so that it alerts its followers to abort as well.
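A hedged sketch of the fix (the structure members are illustrative):

struct rpl_group_info_sketch
{
  int skip_event_group;   /* set when sub_id > stop_on_error_sub_id */
  int wakeup_error;       /* status handed to the registered waiter */
};

static void finish_event_group(struct rpl_group_info_sketch *rgi)
{
  /* A skipped group must propagate failure, not success, to the
     group waiting behind it. */
  if (rgi->skip_event_group && rgi->wakeup_error == 0)
    rgi->wakeup_error= 1;
  /* ... wake up any registered subsequent waiter with wakeup_error ... */
}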
Eliminate one InnoDB table with 128*16384 rows, and use
the sequence engine instead. Also, run everything in a single
transaction, to prevent purge from running concurrently
unnecessarily. (Starting with MariaDB Server 10.3, purge would
reset the DB_TRX_ID after INSERT.)
Also fixes:
MDEV-20560 Assertion `precision > 0' failed in decimal_bin_size upon SELECT with MOD short unsigned decimal
Changing the way how Item_func_mod calculates its max_length.
It now uses decimal_precision(), decimal_scale() and unsigned_flag
of its arguments, like all other Item_num_op descendants do.
In the function test_if_cheaper_ordering we decide whether using an index is
better than using filesort for ordering. If we choose to do range access,
then test_quick_select should make sure that the cost of a table scan is set
to DBL_MAX so that it is not picked.
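A hedged sketch of the intent (not the optimizer's actual data structures):

#include <float.h>

/* Once range access has been chosen for the ordering, rule out the
   table scan alternative by giving it an infinite cost. */
static void disable_table_scan(double *table_scan_cost)
{
  *table_scan_cost= DBL_MAX;   /* can never be the cheapest plan now */
}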
Problem:
=======
During the dropping of an fts index, InnoDB waits for
fts_optimize_remove_table() and holds dict_sys->mutex and dict_operation_lock
even though the table id is not present in the queue. But fts_optimize_thread
does wait for dict_sys->mutex to process an unrelated table id from the
slot.
Solution:
========
Whenever a table is added to fts_optimize_wq, update the fts_status
of the in-memory fts subsystem to TABLE_IN_QUEUE. Whenever drop index
wants to remove the table from the queue, it can check the fts_status
to decide whether it should send MSG_DELETE_TABLE to the queue.
Removed the following functions because they are all dead code:
dict_table_wait_for_bg_threads_to_exit(),
fts_wait_for_background_thread_to_start(), fts_start_shutdown(),
fts_shutdown().
selectivity values fails
After having set the assertion that checks the validity of the selectivity
values returned by the function table_cond_selectivity(), a test case from
order_by.test failed. The failure occurred because the range optimizer could
return, as an estimate of the cardinality of the ranges built for an index,
a number exceeding the total number of records in the table.
The second bug is more subtle. It may happen when there are several
indexes with the same prefix defined on the first joined table t accessed by
a constant ref access. In this case the range optimizer estimates the
number of accessed records of t for each usable index and these
estimates can be different. Only the first of these estimates is taken
into account when the selectivity of the ref access is calculated.
However the optimizer later can choose a different index that provides
a different estimate. The function table_cond_selectivity() could use
this estimate to discount the selectivity of the ref access. This could
lead to a selectivity value returned by this function that was greater
than 1.
InnoDB intentionally (it's a documented behavior) ignores changing of
DATA DIRECTORY and INDEX DIRECTORY for partitions. Though we should
issue a warning when this happens.
Analysis:
========
As part of the BUG#28642318 fix, two new test cases were added. The first test
case tests a scenario where two sessions are present, in which the first
session has a regular table named 't1' and another session has a temporary
table named 't1'. Test executes a DELETE statement on regular table. These
statements are captured from binary log and replayed back on new client
connection to prove that DELETE statement is applied successfully. Note that
the binlog contains only CREATE TEMPORARY TABLE part hence a temporary table
gets created in new connection. This replaying logic is implemented by using
'--exec $MYSQL' command. If the new connection gets disconnected within the
scope of first test case the test passes, i.e the temporary table gets dropped
as part thread cleanup. But on slow platforms the connection gets closed at
the time of execution of test case 2. When the temporary table is dropped as
part thread cleanup a "DROP TEMPORARY TABLE t1" is written into the binary
log. In test case two the same sessions continue to exist and and table names
are reused to test a new bug scenario. The additional "DROP TEMPORARY TABLE"
command drops second test specific tables which results in "Unknown table"
error.
Fix:
====
Rename the second test case's table to 't2'. Even if the connection close
from test case one happens later, the drop command
'DROP /*!40005 TEMPORARY */ TABLE IF EXISTS `t1`' will not result in an error.
This patch corrects the fix of the patch for MDEV-19421 that resolved
the problem of parsing some embedded join expressions such as
t1 join t2 left join t3 on t2.a=t3.a on t1.a=t2.a.
Yet the patch contained a bug that prevented proper context analysis
of the queries where such expressions were used together with comma
separated table references in from clauses.
There were two problems:
(1) If the user wanted the same time zone information on all nodes of the
Galera cluster, updates were not replicated, as time zone information was
stored in MyISAM tables. This is fixed on Galera by altering the time zone
tables to InnoDB while they are modified.
(2) If the user wanted different time zone information on the nodes of the
Galera cluster, TRUNCATE TABLE for the time zone tables was replicated by
Galera, destroying the time zone information on the other nodes. This is
fixed on Galera by introducing a new option for the
mysql_tzinfo_to_sql_symlink tool, --skip-write-binlog, to disable Galera
replication while the time zone tables are modified.
Changes to be committed:
modified: mysql-test/r/mysql_tzinfo_to_sql_symlink.result
modified: mysql-test/suite/wsrep/r/mysql_tzinfo_to_sql_symlink.result
new file: mysql-test/suite/wsrep/r/mysql_tzinfo_to_sql_symlink_skip.result
new file: mysql-test/suite/wsrep/t/mysql_tzinfo_to_sql_symlink_skip.test
modified: sql/tztime.cc
When discounting selectivity of ref access, don't discount the
selectivity we've already discounted for range access.
The 10.1 version of the fix. Will need to adjust condition filtering
test results in 10.4
Skip the test on big-endian systems.
In MariaDB Server 10.0 and 10.1 (as well as MySQL 5.6),
the implementation of innodb_checksum_algorithm=crc32
wrongly assumes little-endian byte order.
Problem:-
When mysql executes INSERT ... ON DUPLICATE KEY UPDATE, the storage engine
checks if the inserted row would generate a duplicate key error. If yes, it
returns the existing row to mysql, mysql updates it and sends it back to the
storage engine. When the table has more than one unique or primary key, this
statement is sensitive to the order in which the storage engine checks the
keys. Depending on this order, the storage engine may return different rows
to mysql, and hence mysql can update different rows. The order in which the
storage engine checks keys is not deterministic. For example, InnoDB checks
keys in an order that depends on the order in which indexes were added to
the table. The first added index is checked first. So if master and slave
have added indexes in different orders, then the slave may go out of sync.
Solution:-
Make INSERT...ON DUPLICATE KEY UPDATE unsafe while using stmt or mixed format
when there is more than one unique key.
There are two exceptions:
1. The auto-increment key is not counted, because InnoDB will take a gap lock
for the failed insert and a concurrent insert will get the next increment
value. But if the user supplies an auto-inc value it can be unsafe.
2. Only unique keys for which insertion is performed are counted.
So this patch also addresses bug id #72921.
- The commit ab6dd77408 wrongly sets the
condition inside innobase_srv_conc_enter_innodb(). The problem is that
InnoDB makes the thread sleep indefinitely if it is a replication
slave thread.
Thanks to Sujatha Sivakumar for contributing the replication test case.
Problem:
=======
A failed CREATE OR REPLACE TEMPORARY TABLE statement which dropped the table
but failed at a later stage of the creation of the temporary table is not
written to the binary log in row based replication. This causes the slave to
diverge.
Analysis:
========
CREATE OR REPLACE statements work as shown below.
CREATE OR REPLACE TABLE table_name (a int);
is basically the same as:
DROP TABLE IF EXISTS table_name;
CREATE TABLE table_name (a int);
Hence every CREATE OR REPLACE TABLE command which dropped the table should be
written to the binary log, even when the following CREATE TABLE part fails.
In order to achieve this, during the execution of a CREATE OR REPLACE
command, when a table is dropped the 'thd->log_current_statement' flag is
set. When table creation
results in an error within 'mysql_create_table' code, the error handling part
looks for this flag. If it is set, the failed CREATE OR REPLACE statement is
written into the binary log in spite of the error. This ensures that the
slave doesn't diverge from the master. In case of row based replication the
error handling code returns very early if the table is of type temporary.
This is done based
on the assumption that temporary tables are not replicated in row based
replication.
It fails to handle the cases where a temporary table was created as part of
statement based replication at an earlier stage and the binary log format was
changed to row because of an unsafe statement. In this case, when a CREATE
OR REPLACE statement is executed on this temporary table, it will be dropped
but the query will not be written to the binary log. Hence the slave
diverges.
Fix:
===
In the error handling code, check the return status of the create table
operation. If it is successful, the replication mode is row based, and the
table is of type temporary, then return. Otherwise proceed further to the
code which checks for the thd->log_current_statement flag and does the
appropriate logging.
The MDEV-5589 commit set up a policy to skip DROP TEMPORARY TABLE binary
logging in case the target table has not been "CREATEed" in the binlog (no
CREATE Query-log-event was logged into the binary log).
It turns out that
1. the rule did not cover a non-existing table DROPped with the IF-EXISTS
clause. The logged-create knowledge for the non-existing one does not even
need the MDEV-5589 patch, and
2. connection close disobeys it and triggers automatic DROP-IF-EXISTS
binlogging.
Either 1 or 2 or even both is/are also responsible for unexpected binlog
records observed in MDEV-17863, actually rendering a referred
@@global.read_only irrelevant as far as the described stored procedure
definition *and* the ROW binlog-format are concerned.
Analysis
========
Point-in-time recovery using mysqlbinlog output containing queries
operating on temporary tables results in an error.
While writing the query log event to the binary log, the
thread id used for the execution of DROP TABLE and DELETE commands
was incorrect. The thread variable 'thread_specific_used'
is used to determine whether a specific thread id is to be used
while executing the statements, i.e. using 'SET
@@session.pseudo_thread_id'. This variable was not set
correctly for DROP TABLE query and was never set for DELETE
query. The thread id is important for temporary tables
since the tables are session specific. DROP TABLE and DELETE
queries executed using a wrong thread id resulted in errors
while applying the queries generated by the mysqlbinlog utility.
Fix
===
Set the 'thread_specific_used' THD variable for DROP TABLE and
DELETE queries.
ReviewBoard: 21833
check_valid_path() uses my_strcspn(), which cannot handle invalid characters
properly. This is fixed by a big refactoring in 10.2 (MDEV-6353).
For 5.5, let's simply swap tests, because check_string_char_length()
rejects invalid characters just fine.
Description:- During server startup, the server exits if
the 'mysql.plugin' system table has any rows with an empty
value for the field 'name' (plugin name).
There is one directly applicable change to InnoDB:
commit 739f5239f1 in the
5.5 branch will be merged before the next MariaDB releases.
Another potentially applicable change will be tracked
separately as MDEV-20126.
Thus, here we only update the InnoDB version number and do
not change anything else.
Problem: Clients running with different values of auto_increment_increment
and doing concurrent inserts lead to a "Duplicate key error" in one of them.
Analysis:
When the auto_increment_increment value is reduced in a session,
InnoDB uses the last auto_increment_increment value
to recalculate the autoinc value.
In case some other session has inserted a value
with a different auto_increment_increment, InnoDB recalculates
autoinc values based on the current session's previous
auto_increment_increment instead of considering the
auto_increment_increment used for the last insert across all sessions.
Fix:
revert 7acdf29cb4
a.k.a. 7c12a9e5c3
as it caused the bug.
Reviewed By:
Bin <bin.x.su@oracle.com>
Kevin <kevin.lewis@oracle.com>
RB#21777
Note: In MariaDB Server, earlier changes in
ae5bc05988
for MDEV-533 require that the original test in
mysql/mysql-server@1ccd472d63
be adjusted for MariaDB.
Also, ef47b62551 (MDEV-8827)
had to be reverted after the upstream fix had been backported.
Problem:
=======
The autoincrement value gives duplicate values for the following reasons.
(1) In the InnoDB handler function, the current autoincrement value is not
changed based on a newly set auto_increment_increment or
auto_increment_offset variable.
(2) The handler function does the rounding logic and changes the current
autoincrement value, and InnoDB isn't aware of the change in the current
autoincrement value.
Solution:
========
To fix problem (1), InnoDB now always respects the auto_increment_increment
and auto_increment_offset values when computing the current autoincrement
value. By fixing problem (2), the handler layer won't change any current
autoincrement value.
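A hedged sketch of the rounding InnoDB must now apply itself (the function
shape is illustrative): the next value is the smallest value greater than
the current one that is congruent to the offset modulo the increment.

#include <stdint.h>

static uint64_t next_autoinc(uint64_t current,
                             uint64_t increment, uint64_t offset)
{
  if (current < offset)
    return offset;
  /* smallest offset + k*increment strictly greater than current */
  return offset + ((current - offset) / increment + 1) * increment;
}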
Reviewed-by: Jimmy Yang <jimmy.yang@oracle.com>
RB: 13748
This is a regression due to MDEV-16515 that affects some versions in
the MariaDB 10.1 server series starting with 10.1.35, and possibly
all versions starting with 10.2.17, 10.3.8, and 10.4.0.
The idea of MDEV-16515 is to allow DROP TABLE to be interrupted,
in case it was stuck due to some concurrent activity. We already
made some cases of internal DROP TABLE immune to kill in MDEV-18237,
MDEV-16647, MDEV-17470. We must include the cleanup of
CREATE TABLE...SELECT in the list of such internal DROP TABLE.
ha_innobase::delete_table(): Pass create_failed=true if the current
SQL statement is CREATE, so that the table will be dropped.
row_drop_table_for_mysql(): If create_failed=true, do not allow
the operation to be interrupted.
Problem:
=======
Executing command, "mysqlbinlog --read-from-remote-server --host='xx.xx.xx.xx'
--port=3306 --user=xxx --password=xxx --database=mysql --to-last-log
mysql-bin.000001 --start-position=1098699 --stop-never |mysql -uxxx -pxxx", we
found that the last data read from the remote server could not be committed.
Analysis:
========
The purpose of 'Write_on_release_cache' is that the contents of the Cache will
automatically be written to a dedicated result file on destruction. Flush
operation on the result file is controlled by a flag 'FLUSH_F'. Events which
require force flush upon their destruction will have to enable this
'Write_on_release_cache::FLUSH_F'. At present the 'FLUSH_F' flag is defined as
an enum as shown below.
enum flag
{
FLUSH_F
};
Since 'FLUSH_F' is the first member and has no initializer, it gets the
default value '0'. Because of this, the following flush condition never
succeeds:
if (m_flags & FLUSH_F)
fflush(m_file);
At present the file gets flushed only during the my_fclose(result_file)
operation. When continuous streaming is enabled through the --stop-never
option, it never gets flushed, and hence events are not replicated.
Fix:
===
Initialize the enum value to a non-zero value.
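A hedged sketch of the described fix (the exact literal may differ in the
actual patch):

enum flag
{
  FLUSH_F= 1   /* was implicitly 0, so (m_flags & FLUSH_F) never held */
};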
The parser returned a syntax error message for queries with join
expressions like t1 JOIN t2 [LEFT | RIGHT] JOIN t3 ON ... ON ... when
the second operand of the outer JOIN operation with ON clause was another
join expression with ON clause. In this expression the JOIN operator is
right-associative, i.e. expression has to be parsed as the expression
t1 JOIN (t2 [LEFT | RIGHT] JOIN t3 ON ... ) ON ...
Such join expressions are hard to parse because the outer JOIN is
left-associative if there is no ON clause for the first outer JOIN operator.
The patch implements the solution when the JOIN operator is always parsed
as right-associative and builds first the right-associative tree. If it
happens that there is no corresponding ON clause for this operator the
tree is converted to left-associative.
The idea of the solution was taken from the patch by Martin Hansson
"WL#8083: Fixed the join_table rule" from MySQL-8.0 code line.
As the grammar rules related to join expressions in MySQL-8.0 and
MariaDB-5.5+ are quite different, the MariaDB solution could not borrow
any code from the MySQL-8.0 solution.
The problem was that sp_head::MULTI_RESULTS was not set correctly for the
ANALYZE statement with SELECT ... INTO variable.
This is a follow-up fix for MDEV-7023.
Problem:
=======
Executing test with following options will result in test failure.
./mtr rpl.kill_race_condition{,,,,,,,,,,} --repeat=10 --par 12 --mem
Fix:
====
The test simulates an applier thread kill scenario while applying a row
event, but it doesn't wait for the applier to catch the error and stop.
Added: wait_for_slave_sql_error.inc to catch the error.
The test uses START SLAVE as a final step and doesn't wait for both threads
to start.
Added: start_slave.inc
The test allowed non-deterministic execution due to the unresettable status
variable Slave_connections.
Fixed by expecting a correct value for Slaves_connected.
Since the purpose of the event is just to see on the second node whether it
is created or not, and we are not going to execute the event, instead of
setting GLOBAL event_scheduler=ON and then turning it off we can just
disable the warning.
and WHERE filter afterwards
This patch complements the patch fixing the bug MDEV-6892. The latter
properly handled queries that used mergeable views returning constant
columns as inner tables of outer joins and whose where clause contained
predicates referring to these columns if the predicates happened not
to be equality predicates. Otherwise the server could still return wrong
result sets for such queries. Besides, the fix for MDEV-6892 prevented
some possible conversions of outer joins to inner joins for such queries.
This patch corrected the function check_simple_equality() to handle
properly conjunctive equalities of the where clause that refer to the
constant columns of mergeable views used as inner tables of an outer join.
The patch also changed the code of Item_direct_view_ref::not_null_tables().
This change allowed to take into account predicates containing references
to constant columns of mergeable views when converting outer joins into
inner joins.
in where clause
The classes Item_func_isnottrue and Item_func_isnotfalse inherited the
implementation of the eval_not_null_tables method from the Item_func
class. As a result the not_null_tables_cache was set incorrectly for
the objects of these classes. It led to improper conversion of outer
joins to inner joins when the where clause of the processed query
contained IS NOT TRUE or IS NOT FALSE predicates. The converted query
in many cases produced a wrong result set.
Remove the test, because it easily fails with a result difference.
Analysis by Thirunarayanan Balathandayuthapani:
By default, innodb_encrypt_tables=0.
1) Test case creates 100 tables in innodb_encrypt_1.
2) creates another 100 unencrypted tables (encryption=off) in innodb_encrypt_2
3) creates another 100 encrypted tables (encryption=on) in innodb_encrypt_3
4) enabling innodb_encrypt_tables=1 and checking that only
100 encrypted tables exist. (already we have 100 in dictionary)
5) opening all tables again (no idea why)
6) After that, set innodb_encrypt_tables=0 and wait for 100 tables
to be decrypted (already we have 100 unencrypted tables)
7) dropping all databases
Sporadic failure happens because after step 4, it could encrypt the
normal table too, because innodb_encryption_threads=4.
This test was added in MDEV-9931, which was about InnoDB startup being
slow due to all .ibd files being opened. There have been a number of
later fixes to this problem. Currently the latest one is
commit cad56fbaba, in which some tests
(in particular the test innodb.alter_kill) could fail if all InnoDB
.ibd files are read during startup. That could make this test redundant.
Let us remove the test, because it is big, slow, unreliable, and
does not seem to reliably catch the problem that all files are being
read on InnoDB startup.
Handling of top level conjuncts in WHERE whose used_tables() contained
RAND_TABLE_BIT in the function make_join_select() was incorrect.
As a result, if such a conjunct referred to fields none of which belonged
to the last joined table it was pushed twice. (This could be seen
for a test case from subselect.test whose output was changed after this
patch had been applied. In 10.1 when running EXPLAIN FORMAT=JSON for
the query from this test case we clearly see that one of the conjuncts
is pushed twice.) This fact by itself was not good. Besides, if such a
conjunct was pushed to a table that was the result of materialization
of a semi-join the query could return a wrong result set. In particular
we could observe it for queries with semi-join subqueries whose left parts
used stored functions without the 'deterministic' specifier.
as well as
MDEV-19500 Update with join stopped worked if there is a call to a procedure in a trigger
MDEV-19521 Update Table Fails with Trigger and Stored Function
MDEV-19497 Replication stops because table not found
MDEV-19527 UPDATE + JOIN + TRIGGERS = table doesn't exists error
Reimplement the fix for (5d510fdbf0)
MDEV-18507 can't update temporary table when joined with table with triggers on read-only
instead of calling open_tables() twice, put the multi-update
prepare code inside the open_tables() loop.
Add a test for an MDL backoff-and-retry loop inside open_tables()
across the multi-update prepare code.
Problem:
=======
rpl_blackhole.test fails when executed with following options
mysqld=--binlog_annotate_row_events=1, mysqld=--replicate_annotate_row_events=1
Test output:
------------
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
rpl.rpl_blackhole_bug 'mix' [ pass ] 791
rpl.rpl_blackhole_bug 'row' [ fail ]
Replicate_Wild_Ignore_Table
Last_Errno 1032
Last_Error Could not execute Update_rows_v1 event on table test.t1; Can't find
record in 't1', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's
master log master-bin.000001, end_log_pos 1510
Analysis:
=========
Enabling "replicate_annotate_row_events" on slave, Tells the slave to write
annotate rows events received from the master to its own binary log. The
received annotate events are applied after the Gtid event as shown below.
thd->query() will be set to the actual query received from the master, through
annotate event. The Annotate_rows event should not be deleted after the event
is applied, as thd->query will be used to generate a new Annotate_rows event
while applying the subsequent Rows events. After the last Rows event has been
applied, the saved Annotate_rows event (if any) will be deleted.
In the blackhole engine all DML operations are no-ops as they do not store
any data. They simply return success without doing any operation. But the
existing code strictly expects thd->query() to be 'NULL' to identify that
row based
replication is in use. This assumption will fail when row annotations are
enabled as the query is not 'NULL'. Hence various row based operations like
'update', 'delete', 'index lookup' will fail when row annotations are enabled.
Fix:
===
Extend the row based replication check to include row annotations as well.
i.e. either thd->query() is NULL, or thd->query() points to a query and row
annotations are in use.
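A hedged sketch of the extended check (the shape is illustrative):

/* Row-based processing is now recognized either by a NULL query or by
   a query that is only present because row annotations are in use. */
static int row_based_replication(const char *thd_query,
                                 int annotate_rows_in_use)
{
  return thd_query == 0 || annotate_rows_in_use;
}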
This patch complements the patch that fixes bug MDEV-18479.
This patch takes care of possible overflow when calculating the
estimated number of rows in a materialized derived table / view.