EVEN IF I LOG TO FILE.
Analysis:
----------
MYSQL_UPGRADE of the master breaks the replication when
the query logging is enabled with FILE/NONE 'log-output'
option on the slave.
mysql_upgrade modifies the 'general_log' and 'slow_log'
tables after the logging is disabled as below:
SET @old_log_state = @@global.general_log;
SET GLOBAL general_log = 'OFF';
ALTER TABLE general_log
MODIFY event_time TIMESTAMP NOT NULL,
( .... );
SET GLOBAL general_log = @old_log_state;
and
SET @old_log_state = @@global.slow_query_log;
SET GLOBAL slow_query_log = 'OFF';
ALTER TABLE slow_log
MODIFY start_time TIMESTAMP NOT NULL,
( .... );
SET GLOBAL slow_query_log = @old_log_state;
In the binary log, only the ALTER statements are logged
but not the SET statements which turns ON/OFF the logging.
So when the slave replays the binary log,the ALTER of LOG
tables throws an error since the logging is enabled. Also
the 'log-output' option is not checked to determine
whether to allow/disallow the ALTER operation.
Fix:
----
The 'log-output' option is included in the check while
determining whether the query logging happens using the
log tables.
Picked from mysql respository at 0daaf8aecd8f84ff1fb400029139222ea1f0d812
MDEV-21953 deadlock between BACKUP STAGE BLOCK_COMMIT and parallel
replication
Fixed by partly reverting MDEV-21953 to put back MDL_BACKUP_COMMIT locking
before log_and_order.
The original problem for MDEV-21953 was that while a thread was waiting in
for another threads to commit in 'log_and_order', it had the
MDL_BACKUP_COMMIT lock. The backup thread was waiting to get the
MDL_BACKUP_WAIT_COMMIT lock, which blocks all new MDL_BACKUP_COMMIT locks.
This causes a deadlock as the waited-for thread can never get past the
MDL_BACKUP_COMMIT lock in ha_commit_trans.
The main part of the bug fix is to release the MDL_BACKUP_COMMIT lock while
a thread is waiting for other 'previous' threads to commit. This ensures
that no transactional thread keeps MDL_BACKUP_COMMIT while waiting, which
ensures that there are no deadlocks anymore.
Analysis:
========
RESET MASTER TO # command deletes all binary log files listed in the index
file, resets the binary log index file to be empty, and creates a new binary
log with number #. When the user provided binary log number is greater than
the max allowed value '2147483647' server fails to generate a new binary log.
The RESET MASTER statement marks the binlog closure status as
'LOG_CLOSE_TO_BE_OPENED' and exits. Statements which follow RESET MASTER
try to write to binary log they find the log_state != LOG_CLOSED and
proceed to write to binary log cache and it results in crash.
Fix:
===
During MYSQL_BIN_LOG open, if generation of new binary log name fails then the
"log_state" needs to be marked as "LOG_CLOSED". With this further statements
will find binary log as closed and they will skip writing to the binary log.
This is a backport of the applicable part of
commit 93475aff8d and
commit 2c39f69d34
from 10.4.
Before 10.4 and Galera 4, WSREP_ON is a macro that points to
a global Boolean variable, so it is not that expensive to
evaluate, but we will add an unlikely() hint around it.
WSREP_ON_NEW: Remove. This macro was introduced in
commit c863159c32
when reverting WSREP_ON to its previous definition.
We replace some use of WSREP_ON with WSREP(thd), like it was done
in 93475aff8d. Note: the macro
WSREP() in 10.1 is equivalent to WSREP_NNULL() in 10.4.
Item_func_rand::seed_random(): Avoid invoking current_thd
when WSREP is not enabled.
When binlog is disabled, WSREP will not behave correctly when
SAVEPOINT ROLLBACK is executed since we don't register handlers for such case.
Fixed by registering WSREP handlerton for SAVEPOINT related commands.
* Remove dead code
* MDEV-21675 Data inconsistency after multirow insert rollback
This patch fixes data inconsistencies that happen after rollback of
multirow inserts, with binlog disabled.
For example, statements such as `INSERT INTO t1 VALUES (1,'a'),(1,'b')`
that fail with duplicate key error. In such cases the whole statement
is rolled back. However, with wsrep_emulate_binlog in effect, the
IO_CACHE would not be truncated, and the pending rows events would be
replicated to the rest of the cluster. In the above example, it would
result in row (1,'a') being replicated, whereas locally the statement
is rolled back entirely. Making the cluster inconsistent.
The patch changes the code so that prior to statement rollback,
pending rows event are removed and the stmt cache reset.
That patch also introduces MTR tests that excercise multirow insert
statements for regular, and streaming replication.
Problem:
-------
Accessing a member within 'xid_count_per_binlog' structure results in
following error when 'UBSAN' is enabled.
member access within address 0xXXX which does not point to an object of type
'xid_count_per_binlog'
Analysis:
---------
The problem appears to be that no constructor for 'xid_count_per_binlog' is
being called, and thus the vtable will not be initialized.
Fix:
---
Defined a parameterized constructor for 'xid_count_per_binlog' class.
Analysis:
========
'max_binlog_cache_size' is configured and a huge transaction is executed. When
the transaction specific events size exceeds 'max_binlog_cache_size' the event
cannot be written to the binary log cache and cache write error is raised.
Upon cache write error the statement is rolled back and the transaction cache
should be truncated to a previous statement specific position. The truncate
operation should reset the cache to earlier valid positions and flush the new
changes. Even though the flush is successful the cache write error is still in
marked state. The truncate code interprets the cache write error as cache flush
failure and returns abruptly without modifying the write cache parameters.
Hence cache is in a invalid state. When a COMMIT statement is executed in this
session it tries to flush the contents of transaction cache to binary log.
Since cache has partial events the cache write operation will report
'writer.remains' assert.
Fix:
===
Binlog truncate function resets the cache to a specified size. As a first step
of truncation, clear the cache write error flag that was raised during earlier
execution. With this new errors that surface during cache truncation can be
clearly identified.
revision-id: 673e253724979fd9fe43a4a22bd7e1b2c3a5269e
Author: Kristian Nielsen
Fix missing memory barrier in wait_for_commit.
The function wait_for_commit::wait_for_prior_commit() has a fast path where it
checks without locks if wakeup_subsequent_commits() has already been called.
This check was missing a memory barrier. The waitee thread does two writes to
variables `waitee' and `wakeup_error', and if the waiting thread sees the
first write it _must_ also see the second or incorrect behavior will occur.
This requires memory barriers between both the writes (release semantics) and
the reads (acquire semantics) of those two variables.
Other accesses to these variables are done under lock or where only one thread
will be accessing them, and can be done without barriers (relaxed semantics).
- Any temporary tables created under read-only mode will never be logged
to binary log. Any usage of these tables to update normal tables, even
after read-only has been disabled, will use row base logging (as the
temporary table will not be on the slave).
- Analyze, check and repair table will not be logged in read-only mode.
Other things:
- Removed not used varaibles in
MYSQL_BIN_LOG::flush_and_set_pending_rows_event.
- Set table_share->table_creation_was_logged for all normal tables.
- THD::binlog_query() now returns -1 if statement was not logged., This
is used to update table_share->table_creation_was_logged.
- Don't log admin statements in opt_readonly is set.
- Table's that doesn't have table_creation_was_logged will set binlog format to row
logging.
- Removed not needed/wrong setting of table->s->table_creation_was_logged
in create_table_from_items()
cmake -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_BUILD_TYPE=Debug
Maintainer mode makes all warnings errors. This patch fix warnings. Mostly about
deprecated `register` keyword.
Too much warnings came from Mroonga and I gave up on it.
Rather than parsing session_track_system_variables when thread starts, do
it when first trackable event occurs.
Benchmarked on a 2socket/20core/40threads Broadwell system using sysbench
connect brencmark @40 threads (with select 1 disabled):
101379.77 -> 143016.68 CPS, whereas 10.2 is currently at 137766.31 CPS.
Part of MDEV-14984 - regression in connect performance