Starting with commit 210855ce5d
Valgrind became aware that the unused tail of the buffer that
is returned by thd_get_xid() is actually uninitialized.
The problem should already exist in MySQL 5.0. I was able to
repeat it on MariaDB Server 5.5 with some additional instrumentation.
InnoDB is allocating 128+4+4 bytes for the XID and the lengths of
its components, even when the XID is shorter than 64+64 bytes.
In MariaDB Server 10.3, while running the test main.xa_binlog,
in the xid_t::set() that is called by sql_yacc.yy, the 128-byte data
buffer was only partially initialized according to Valgrind: only the
first bytes had been written. When xid_t::data was copied to
thd.transaction.xid_state.xid.data, the entire target buffer happened
to be considered initialized. With MariaDB Server 10.4, since the said
commit, Valgrind will correctly detect the tail of the buffer as
uninitialized.
The impact of this bug is as follows:
(1) InnoDB will write an unnecessarily large amount of redo log for XA PREPARE.
(2) InnoDB will write garbage bytes to the redo log and undo log pages.
(3) The garbage should be 'harmless', because on recovery, only the
actual payload of the XID will be used, based on the written length.
trx_rseg_write_wsrep_checkpoint(), trx_undo_write_xid(): Write only
the actually used length of xid->data to the data page, and
zero out the rest of the buffer by mlog_memset().
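
A minimal, self-contained sketch of the idea; the buffer layout and
names below are illustrative (in the actual patch the tail is zeroed
on the data page by mlog_memset() rather than std::memset()):

    #include <cstddef>
    #include <cstring>

    static const unsigned XIDDATASIZE = 128; /* as in xa.h */

    struct xid_like {          /* stand-in for the server's XID */
      long gtrid_length;
      long bqual_length;
      char data[XIDDATASIZE];
    };

    /* Copy only the used bytes of xid->data and zero-fill the rest,
       so that no uninitialized bytes reach the data page. */
    static void write_xid_data(char *ptr, const xid_like *xid) {
      std::size_t used = std::size_t(xid->gtrid_length)
                       + std::size_t(xid->bqual_length);
      std::memcpy(ptr, xid->data, used);
      std::memset(ptr + used, 0, XIDDATASIZE - used);
    }

    int main() {
      xid_like x = { 3, 4, { 'g', 't', 'r', 'b', 'q', 'u', 'a' } };
      char page[XIDDATASIZE];
      write_xid_data(page, &x);
      return page[XIDDATASIZE - 1]; /* 0: the tail is zero-filled */
    }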
This fixes the following test failures related to index cardinality:
main.join main.stat_tables main.partition main.stat_tables_innodb
innodb.innodb_bug57252
MDEV-17614 flags INSERT…ON DUPLICATE KEY UPDATE unsafe for statement-based
replication when there are multiple unique indexes. It correctly fixes
the problem that MySQL 5.7 attempted to address
in mysql/mysql-server@c93b0d9a97,
an attempt that caused lock conflicts. That change was reverted in
MySQL 5.7.26
in mysql/mysql-server@066b6fdd43
(along with a substantial amount of other changes).
In MDEV-17073 we already disabled the unfortunate MySQL change when
statement-based replication was not being used. Now, thanks to MDEV-17614,
we can actually remove the change altogether.
This reverts commit 8a346f31b9 (MDEV-17073)
and mysql/mysql-server@c93b0d9a97 while
keeping the test cases.
- mysqltest didn't free read_command_buf
- wait_for_slave_param wrote different things to the log if valgrind
  was used.
- Table open cache should not write the initial variable value, as it
  can depend on the configuration or on whether valgrind is used.
- A variable in GetResult was used uninitialized.
- Initialize variables that could be used uninitialized
- Added extra end space to DbugStringItemTypeValue to get rid of warnings
from c_ptr()
- Session_sysvars_tracker::update() accessed uninitialized memory if called
  with a NULL value.
- get_schema_stat_record() accessed uninitialized memory if HA_KEY_LONG_HASH
  was used.
- parse_vcol_defs() accessed random memory for tables without keys.
initialize_readline() is called with the full pathname of the executable,
which sets rl_readline_name to that value.
rl_readline_name is expected to be initialized with a static name that
does not depend on the file name at all. This is needed for setting
custom hotkeys in .inputrc.
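
A minimal sketch of the intended initialization, assuming the static
name "mysql" (the exact string chosen by the fix may differ):

    #include <stdio.h>
    #include <stdlib.h>
    #include <readline/readline.h>

    int main() {
      /* Use a static application name rather than argv[0], so that
         conditional key bindings in .inputrc, e.g.
           $if mysql ... $endif
         match regardless of the executable's path. */
      rl_readline_name = "mysql";
      char *line = readline("mysql> ");
      free(line);
      return 0;
    }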
1. Fix DBUG_ASSERT(!table->pos_in_locked_tables) in tc_release_table();
2. Fix access of prematurely freed MDL_ticket: don't close ticket if table was not closed;
3. Fix deadlock after erroneous ALTER.
mysql_alter_table() leaves a dirty table->m_needs_reopen in case of an
error exit, which is then incorrectly treated by mysql_lock_tables().
For MODE_SIMULTANEOUS_ASSIGNMENT it is required to move the field
offsets back from record[1] to record[0]. The 'continue' in the warning
branch skipped the rfield->move_field_offset() call.
... produces "bytes lost" warnings
When rocksdb_validate_update_cf_options() returns an error,
the update won't happen.
Free the copy of the string in this case.
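
A minimal sketch of the pattern, with stub names standing in for the
plugin's (validate_fn and validate_and_keep are hypothetical; only the
leak-on-error scenario comes from the actual fix):

    #include <stdlib.h>
    #include <string.h>

    /* If validation fails, the update callback never runs and never
       consumes the copy, so the copy must be freed here; otherwise
       the bytes are lost (leaked). */
    static int validate_and_keep(const char *value, char **save,
                                 int (*validate_fn)(const char *)) {
      char *copy = strdup(value);
      if (copy == NULL)
        return 1;
      if (validate_fn(copy) != 0) {
        free(copy);      /* the update won't happen: free the copy */
        return 1;
      }
      *save = copy;      /* ownership passes to the update path */
      return 0;
    }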
Problem:-
When mysql executes INSERT ... ON DUPLICATE KEY UPDATE, the storage engine
checks if the inserted row would generate a duplicate key error. If yes, it
returns the existing row to mysql, mysql updates it and sends it back to the
storage engine. When the table has more than one unique or primary key, this
statement is sensitive to the order in which the storage engine checks the
keys. Depending on this order, the storage engine may return different rows
to mysql, and hence mysql can update different rows. The order in which the
storage engine checks keys is not deterministic. For example, InnoDB checks
keys in an order that depends on the order in which indexes were added to
the table. The first added index is checked first. So if master and slave
have added indexes in different orders, then the slave may go out of sync.
Solution:-
Make INSERT...ON DUPLICATE KEY UPDATE unsafe when using statement or mixed
binlog format and the table has more than one unique key, with two
exceptions (a sketch of the resulting rule follows below):
1. An auto-increment key is not counted, because InnoDB takes a gap lock
for the failed insert and a concurrent insert will get the next increment
value. But if the user supplies an auto-inc value, it can be unsafe.
2. Only unique keys for which insertion is performed are counted.
This patch also addresses MySQL bug id #72921.
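
A minimal sketch of the resulting decision rule (the function and
parameter names are illustrative, not the actual server code):

    /* Unsafe under STATEMENT/MIXED binlog format when more than one
       unique key is effectively in play for the insert. */
    static bool iodku_unsafe(unsigned unique_keys_inserted_into,
                             bool autoinc_key_counted,
                             bool user_supplied_autoinc) {
      unsigned counted = unique_keys_inserted_into;
      /* exception 1: a pure auto-increment key is not counted */
      if (autoinc_key_counted && !user_supplied_autoinc)
        counted--;
      /* exception 2 is reflected in counting only the keys for which
         insertion is performed */
      return counted > 1;
    }

    int main() {
      /* two unique keys receiving values, no auto-inc: unsafe */
      return iodku_unsafe(2, false, false) ? 0 : 1;
    }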
MDEV-17717
Assertion `!table->pos_in_locked_tables' failed in tc_release_table on
flushing RocksDB table under SERIALIZABLE
MDEV-17998
Deadlock and eventual Assertion `!table->pos_in_locked_tables' failed
in tc_release_table on KILL_TIMEOUT
MDEV-19591
Assertion `!table->pos_in_locked_tables' failed in tc_release_table upon
altering table into S3 under lock.
The problem was that thd->open_tables->pos_in_locked_tables was not reset
when alter table failed to reopen a locked table.
- pcretest.c could use a macro with a side effect
- maria_chk could access freed memory
- Initialized some variables that could be accessed uninitialized
- Fixed compiler warning in my_atomic-t.c
The general reason why the InnoDB redo log file is limited to 512G is that
log_block_convert_lsn_to_no() returns a value limited to 1G. But there is
no need to have unique log block numbers in a log group. The fix removes
the 512G limit and limits the log group size to
(uint32_t maximum value) * (minimum page size), which, in turn, can be
removed if fil_io() is no longer used for InnoDB redo log I/O.
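For reference, assuming the minimum InnoDB page size of 4096 bytes, this
bound works out to 2^32 * 4096 bytes = 2^44 bytes = 16 TiB.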
- The commit ab6dd77408 wrongly sets the
  condition inside innobase_srv_conc_enter_innodb(). The problem is that
  InnoDB makes the thread sleep indefinitely if it is a replication
  slave thread.
Thanks to Sujatha Sivakumar for contributing the replication test case.
Non-owning reference to elements.
Use it as a function argument instead of a pointer+size pair or instead of
const std::vector<T>.
Do not use it for strings!
More info is here: http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines
or just google about it.
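
A minimal usage sketch, assuming a gsl-style span<T> (the exact class
and namespace names in the tree may differ):

    #include <cstddef>
    #include <vector>

    /* Non-owning view over a contiguous sequence: pointer + size,
       cheap to copy, does not manage the elements' lifetime. */
    template <typename T> struct span {
      T *ptr;
      std::size_t len;
      span(T *p, std::size_t n) : ptr(p), len(n) {}
      T *begin() const { return ptr; }
      T *end() const { return ptr + len; }
      std::size_t size() const { return len; }
    };

    /* Instead of f(const byte *data, size_t len) or
       f(const std::vector<byte> &): */
    static std::size_t checksum(span<const unsigned char> bytes) {
      std::size_t sum = 0;
      for (unsigned char b : bytes)
        sum += b;
      return sum;
    }

    int main() {
      std::vector<unsigned char> buf = {1, 2, 3};
      return int(checksum(span<const unsigned char>(buf.data(),
                                                    buf.size())));
    }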
Problem:
=======
A CREATE OR REPLACE TEMPORARY TABLE statement that dropped the table but
failed at a later stage of creating the temporary table is not written to
the binary log in row-based replication. This causes the slave to diverge.
Analysis:
========
CREATE OR REPLACE statements work as shown below.
CREATE OR REPLACE TABLE table_name (a int);
is basically the same as:
DROP TABLE IF EXISTS table_name;
CREATE TABLE table_name (a int);
Hence every CREATE OR REPLACE TABLE command which dropped the table should
be written to the binary log, even when the following CREATE TABLE part
fails. In order to achieve this, during the execution of a CREATE OR
REPLACE command, the 'thd->log_current_statement' flag is set when a table
is dropped. When table creation results in an error within
'mysql_create_table', the error handling part looks for this flag. If it
is set, the failed CREATE OR REPLACE statement is written to the binary
log in spite of the error. This ensures that the slave doesn't diverge
from the master. In the case of row-based replication the error handling
code returns very early if the table is temporary. This is done based on
the assumption that temporary tables are not replicated in row-based
replication.
It fails to handle the case where a temporary table was created as part of
statement-based replication at an earlier stage and the binary log format
was changed to row because of an unsafe statement. In this case, when a
CREATE OR REPLACE statement is executed on this temporary table, it will
be dropped but the query will not be written to the binary log. Hence the
slave diverges.
Fix:
===
In the error handling code, check the return status of the create table
operation. If it is successful, the replication mode is row-based and the
table is temporary, then return. Otherwise proceed further to the code
which checks the thd->log_current_statement flag and does the appropriate
logging.
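
A self-contained sketch of the adjusted control flow, with stub types
standing in for the server's (only the log_current_statement flag and
the order of the checks come from the actual code):

    #include <cstdio>

    struct THD { bool log_current_statement; };

    /* Error-handling tail of the CREATE (OR REPLACE) path. */
    static void binlog_on_create_end(THD *thd, int create_res,
                                     bool row_based, bool is_temporary) {
      /* Return early only when the CREATE part succeeded: a failed
         CREATE OR REPLACE may already have dropped the old table and
         must still be written to the binary log. */
      if (create_res == 0 && row_based && is_temporary)
        return;
      if (thd->log_current_statement)
        std::puts("write the failed statement to the binary log");
    }

    int main() {
      THD thd = { true };
      /* failed CREATE of a temporary table under row format: logged */
      binlog_on_create_end(&thd, 1, true, true);
      return 0;
    }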
A combination of:
* lots of include'd test files where each has "--source
include/have_rocksdb.inc"
* for each such occurrence, MTR adds the testsuite's arguments to the
  server arguments
* which hits a limit on the length of the argv array on Windows, causing
  the server to get garbage data in the last argument.
Work around this by commenting out one of the totally redundant
"source include/have_rocksdb.inc" lines.
- Fix the LooseScan code to support storage engines that return
  HA_ERR_END_OF_FILE if the index scan goes out of the provided range
  bounds (see the sketch after this list)
- Add a DBUG_EXECUTE_IF("force_group_by",...) to allow a test to
  force a LooseScan
- Adjust the rocksdb.group_min_max test not to use features not present
  in MariaDB 10.2 (e.g. optimizer_trace; in MariaDB 10.4 it is present
  but does not meet the assumptions that the test makes about it)
- Adjust the test result file:
  = MariaDB doesn't support the "Enhanced Loose Scan" that FB/MySQL has
  = MariaDB has different cost calculations.
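
A minimal sketch of the tolerated status codes (the numeric values
below match the handler error codes in my_base.h; the scan_done()
helper itself is illustrative):

    /* Some engines, e.g. MyRocks, return HA_ERR_END_OF_FILE rather
       than HA_ERR_KEY_NOT_FOUND when an index scan leaves the provided
       range bounds, so LooseScan must treat both as "no more rows". */
    static const int HA_ERR_KEY_NOT_FOUND = 120; /* as in my_base.h */
    static const int HA_ERR_END_OF_FILE = 137;   /* as in my_base.h */

    static bool scan_done(int err) {
      return err == HA_ERR_KEY_NOT_FOUND || err == HA_ERR_END_OF_FILE;
    }

    int main() { return scan_done(137) ? 0 : 1; }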