INSERTS/UPDATES ON TEMPORARY TABLES
Bug#14294223: CHANGES NOT ALLOWED TO TEMPORARY TABLES ON
READ-ONLY SERVERS
Problem:
========
Running 5.5.14 in read only we can create temporary tables
but can not insert or update records in the table. When we
try we get Error 1290 : The MySQL server is running with the
--read-only option so it cannot execute this statement.
Analysis:
=========
This bug is very specific to binlog being enabled and
binlog-format being stmt/mixed. Standalone server without
binlog enabled or with row based binlog-mode works fine.
How standalone server and row based replication work:
=====================================================
Standalone server and row based replication mark the
transactions as read_write only when they are modifying
non temporary tables as part of their current transaction.
Because of this when code enters commit phase it checks
if a transaction is read_write or not. If the transaction
is read_write and global read only mode is enabled those
transaction will fail with 'server is read only mode'
error.
In the case of statement based mode at the time of writing
to binary log a binlog handler is created and it is always
marked as read_write. In case of temporary tables even
though the engine did not mark the transaction as read_write
but the new transaction that is started by binlog handler is
considered as read_write.
Hence in this case when code enters commit phase it finds
one handler which has a read_write transaction even when
we are modifying temporary table. This causes the server
to throw an error when global read-only mode is enabled.
Fix:
====
At the time of commit in "ha_commit_trans" if a read_write
transaction is found, we should check if this transaction is
coming from a handler other than binlog_handler. This will
ensure that there is a genuine read_write transaction being
sent by the engine apart from binlog_handler and only then
it should be blocked.
The check_user_can_set_role() used find_user_exact() to get the
permissions for the SET ROLE NONE command. Which returned NULL too often,
for instance when user authenticated as 'user'@'%'.
Now we use find_user_wild() instead.
Problem was that in-place online alter table was used on a table
that had mismatch between MySQL frm file and InnoDB data dictionary.
Fixed so that traditional "Copy" method is used if the MySQL frm
and InnoDB data dictionary is not consistent.
FAILURES
Analysis:
=========
Test script is not ensuring that "assert_grep.inc" should be
called only after 'Disk is full' error is written to the
error log.
Test checks for "Queueing master event to the relay log"
state. But this state is set before invoking 'queue_event'.
Actual 'Disk is full' error happens at a very lower level.
It can happen that we might even reset the debug point
before even the actual disk full simulation occurs and the
"Disk is full" message will never appear in the error log.
In order to guarentee that we must have some mechanism where
in after we write "Disk is full" error messge into the error
log we must signal the test to execute SSS and then reset
the debug point. So that test is deterministic.
Fix:
===
Added debug sync point to make script deterministic.
In some cases, MariaDB 10.0 could write a master.info file that was read
incorrectly by 10.1 and could cause server to fail to start after an upgrade.
(If writing a new master.info file that is shorter than the old, extra
junk may remain at the end of the file. This is handled properly in
10.1 with an END_MARKER line, but this line is not written by
10.0. The fix here is to make 10.1 robust at reading the master.info
files written by 10.0).
Fix several things around reading master.info and read_mi_key_from_file():
- read_mi_key_from_file() did not distinguish between a line with and
without an eqals '=' sign.
- If a line was empty, read_mi_key_from_file() would incorrectly return
the key from the previous call.
- An extra using_gtid=X line left-over by MariaDB 10.0 might incorrectly
be read and overwrite the correct value.
- Fix incorrect usage of strncmp() which should be strcmp().
- Add test cases.
Previous fix using wait condition did not work because of MDEV-9867, so
we have to use conditional sleep instead. Sleep will only happen if the test
is executed after another one which also ran buffer pool dump without
server restart between two tests
The cause of the issue is when DROP DATABASE takes
metadata lock and is in progress through it's
execution, a concurrently running CREATE FUNCTION checks
for the existence of database which it succeeds and then it
waits on the metadata lock. Once DROP DATABASE writes to
BINLOG and finally releases the metadata lock on schema
object, the CREATE FUNCTION waiting on metadata lock
gets in it's code path and succeeds and writes to binlog.
LOG-SLOW-SLAVE-STATEMENTS NOT DISPLAYED.
These parameters were moved from the command line options to
the system variables section. Treatment of the
opt_log_slow_slave_statements changed to let the
dynamic change of the variable.
delete deferred events after they're executed
(otherwise they can be executed again for a sub-statement)
See also
commit 0e78d1d
Author: Venkatesh Duggirala <venkatesh.duggirala@oracle.com>
Date: Wed Mar 20 11:20:47 2013 +0530
BUG#15850951-DUPLICATE ERROR IN REPLICATION WITH SLAVE
TRIGGERS AND AUTO INCREMENT
Some statements are always replicated in STATEMENT binlog format.
So upon their execution, the current binlog format is temporarily
switched to STATEMENT even though the session's format is different.
This state, stored in THD's current_stmt_binlog_format, was getting
incorrectly masked by wsrep_forced_binlog_format, causing assertions
and unintended generation of row events.
Backported galera.galera_forced_binlog_format and added a test
specific to this case.
In RHEL7/RHEL7.1 libcrack behavior seem to have been modified so that
"foobar" password is considered bad (due to descending "ba") earlier than
expected. For details google for cracklib-2.9.0-simplistic.patch.
Adjusted affected passwords not to have descending and ascending sequences.
Analysis:
-- InnoDB has n (>0) redo-log files.
-- In the first page of redo-log there is 2 checkpoint records on fixed location (checkpoint is not encrypted)
-- On every checkpoint record there is up to 5 crypt_keys containing the keys used for encryption/decryption
-- On crash recovery we read all checkpoints on every file
-- Recovery starts by reading from the latest checkpoint forward
-- Problem is that latest checkpoint might not always contain the key we need to decrypt all the
redo-log blocks (see MDEV-9422 for one example)
-- Furthermore, there is no way to identify is the log block corrupted or encrypted
For example checkpoint can contain following keys :
write chk: 4 [ chk key ]: [ 5 1 ] [ 4 1 ] [ 3 1 ] [ 2 1 ] [ 1 1 ]
so over time we could have a checkpoint
write chk: 13 [ chk key ]: [ 14 1 ] [ 13 1 ] [ 12 1 ] [ 11 1 ] [ 10 1 ]
killall -9 mysqld causes crash recovery and on crash recovery we read as
many checkpoints as there is log files, e.g.
read [ chk key ]: [ 13 1 ] [ 12 1 ] [ 11 1 ] [ 10 1 ] [ 9 1 ]
read [ chk key ]: [ 14 1 ] [ 13 1 ] [ 12 1 ] [ 11 1 ] [ 10 1 ] [ 9 1 ]
This is problematic, as we could still scan log blocks e.g. from checkpoint 4 and we do
not know anymore the correct key.
CRYPT INFO: for checkpoint 14 search 4
CRYPT INFO: for checkpoint 13 search 4
CRYPT INFO: for checkpoint 12 search 4
CRYPT INFO: for checkpoint 11 search 4
CRYPT INFO: for checkpoint 10 search 4
CRYPT INFO: for checkpoint 9 search 4 (NOTE: NOT FOUND)
For every checkpoint, code generated a new encrypted key based on key
from encryption plugin and random numbers. Only random numbers are
stored on checkpoint.
Fix: Generate only one key for every log file. If checkpoint contains only
one key, use that key to encrypt/decrypt all log blocks. If checkpoint
contains more than one key (this is case for databases created
using MariaDB server version 10.1.0 - 10.1.12 if log encryption was
used). If looked checkpoint_no is found from keys on checkpoint we use
that key to decrypt the log block. For encryption we use always the
first key. If the looked checkpoint_no is not found from keys on checkpoint
we use the first key.
Modified code also so that if log is not encrypted, we do not generate
any empty keys. If we have a log block and no keys is found from
checkpoint we assume that log block is unencrypted. Log corruption or
missing keys is found by comparing log block checksums. If we have
a keys but current log block checksum is correct we again assume
log block to be unencrypted. This is because current implementation
stores checksum only before encryption and new checksum after
encryption but before disk write is not stored anywhere.
It could have happened that one of previous tests already executed
buffer pool dump and set the status variable value, so when it's been
checked, the check passes too early, before the dump starts and
the dump file is created. See more detailed explanation in MDEV-9713.
Fixed by waiting for the current time to change in case it equals
to the timestamp in the status variable, and then checking that
the status variable not only matches the expected pattern, but also
differs from the previous value, whatever it was.
In row_search_for_mysql function on XtraDB there was a old logic
where null bytes were inited. This caused server to think that
key value is null and continue on incorrect path.
FULL
Bug#21753696: MAKE SHOW SLAVE STATUS NON BLOCKING IF IO
THREAD WAITS FOR DISK SPACE
Problem:
========
Currently SHOW SLAVE STATUS blocks if IO thread waits for
disk space. This makes automation tools verifying
server health block on taking relevant action. Finally this
will create SHOW SLAVE STATUS piles.
Analysis:
=========
SHOW SLAVE STATUS hangs on mi->data_lock if relay log write
is waiting for free disk space while holding mi->data_lock.
mi->data_lock is needed to protect the format description
event (mi->format_description_event) which is accessed by
the clients running FLUSH LOGS and slave IO thread. Note
relay log writes don't need to be protected by
mi->data_lock, LOCK_log is used to protect relay log between
IO and SQL thread (see MYSQL_BIN_LOG::append_event). The
code takes mi->data_lock to protect
mi->format_description_event during relay log rotate which
might get triggered right after relay log write.
Fix:
====
Release the data_lock just for the duration of writing into
relay log.
Made change to ensure the following lock order is maintained
to avoid deadlocks.
data_lock, LOCK_log
data_lock is held during relay log rotations to protect
the description event.
The main.merge test case was failing when tested using row based
binlog format.
While analyzing the issue it was found the following issues:
a) The server is calling binlog related code even when a statement will
not be binlogged;
b) The child table list was not present into table structure by the time
to generate the create table statement;
c) The tables in the child table list will not be opened yet when
generating table create info using row based replication;
d) CREATE TABLE LIKE TEMP_TABLE does not preserve original table storage
engine when using row based replication;
This patch addressed all above issues.
@ sql/sql_class.h
Added a function to determine if the binary log is disabled to
the current session. This is related with issue (a) above.
@ sql/sql_table.cc
Added code to skip binary logging related code if the statement
will not be binlogged. This is related with issue (a) above.
Added code to add the children to the query list of the table that
will have its CREATE TABLE generated. This is related with issue (b)
above.
Added code to force the storage engine to be generated into the
CREATE TABLE. This is related with issue (d) above.
@ storage/myisammrg/ha_myisammrg.cc
Added a test to skip a table getting info about a child table if the
child table is not opened. This is related to issue (c) above.
The test created a file in location relative to the datadir
(a few levels above datadir).
The file was created by MariaDB server (via INTO OUTFILE), and
later removed by mysqltest (via remove_file). The problem is that
when the vardir is a symlink, MariaDB server and mysqltest can
resolve such paths differently. MariaDB server would return back
to where the symlink is located, while mysqltest would go above
the real directory. For example, if the test is run with --mem,
and /bld/5.5/mysql-test/var points at /dev/shm/var_auto_X, then
SELECT INTO OUTFILE created a file in /bld/5.5/mysql-test , but
remove_file would look for it in /dev/shm/.
The test is re-written so that all paths are resolved in perl,
the logic itself hasn't changed.