LOAD DATA into partitioned MyISAM table
The problem was that both partitioning and MyISAM
used the same table_share->mutex for different protections
(auto_increment and repair).
Solved by adding a specific mutex for the partitioning
auto_increment.
Also added destruction of the ha_data structure in
free_table_share (which is to be propagated
into 5.5).
This is a 5.1 ONLY patch, already fixed in 5.5+.
detector" that doesn't introduce bug #56715 "Concurrent
transactions + FLUSH result in sporadical unwarranted
deadlock errors".
A deadlock could occur when a workload containing a mix
of DML, DDL and FLUSH TABLES statements affecting the same
set of tables was executed in a heavily concurrent environment.
This deadlock occurred when several connections tried to
perform deadlock detection in the metadata locking subsystem.
The first connection started traversing the wait-for graph,
encountered a sub-graph representing a wait for flush, acquired
LOCK_open and dived into sub-graph inspection. Then it
encountered a sub-graph corresponding to a wait for a metadata lock
and blocked while trying to acquire a rd-lock on
MDL_lock::m_rwlock, since some other thread had a wr-lock on it.
When this wr-lock was released it could happen (if there
was another pending wr-lock against this rwlock) that the rd-lock
request from the first connection was left unsatisfied while at the
same time a new rd-lock request from the second connection sneaked
in and was satisfied (for this to be possible the second
rd-request must arrive exactly after the wr-lock is released but
before the pending wr-lock manages to grab the rwlock, which is
possible both on Linux and in our own rwlock implementation).
If this second connection continued traversing the wait-for graph
and encountered a sub-graph representing a wait for flush, it tried
to acquire LOCK_open and thus the deadlock was created.
The previous patch tried to work around this problem by not
allowing the deadlock detector to lock the LOCK_open mutex if
some other thread doing deadlock detection already owned it
and the current search depth was greater than 0. Instead, a deadlock
was reported. As a result, it introduced bug #56715.
This patch solves this problem in a different way.
It introduces a new rw_pr_lock_t implementation to be used
by the MDL subsystem instead of one based on Linux rwlocks or
our own rwlock implementation. This new implementation
never allows a situation in which an rwlock is rd-locked and
there is a blocked pending rd-lock. Thus the situation which
caused this bug becomes impossible with this implementation.
Because this implementation is optimized for the
wr-lock/unlock scenario, which is the most common in the MDL
subsystem, it doesn't introduce noticeable performance
regressions in sysbench tests. Moreover, it significantly
improves the situation for the POINT_SELECT test when many
connections are used.
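One way to obtain the property described above (a reader never
blocks while the lock is rd-locked) is sketched below. This is an
illustration under assumptions, not the code from this patch: the
sketch_prlock type and all function names are hypothetical. Readers
only take the mutex briefly to bump a counter; a writer keeps the
mutex for the whole time it holds the wr-lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical type; a sketch, not the actual rw_pr_lock_t. */
typedef struct {
  pthread_mutex_t lock;              /* held by the writer while wr-locked */
  pthread_cond_t no_active_readers;  /* signalled when the last reader leaves */
  unsigned active_readers;
  unsigned writers_waiting_readers;
  bool active_writer;
} sketch_prlock;

int sketch_prlock_init(sketch_prlock *l)
{
  l->active_readers = 0;
  l->writers_waiting_readers = 0;
  l->active_writer = false;
  if (pthread_mutex_init(&l->lock, NULL))
    return 1;
  if (pthread_cond_init(&l->no_active_readers, NULL)) {
    pthread_mutex_destroy(&l->lock);
    return 1;
  }
  return 0;
}

void sketch_prlock_rdlock(sketch_prlock *l)
{
  /* Blocks only while a writer owns the mutex, never behind readers. */
  pthread_mutex_lock(&l->lock);
  l->active_readers++;
  pthread_mutex_unlock(&l->lock);
}

void sketch_prlock_wrlock(sketch_prlock *l)
{
  pthread_mutex_lock(&l->lock);
  l->writers_waiting_readers++;
  /* Waiting releases the mutex, so new readers can still get in. */
  while (l->active_readers != 0)
    pthread_cond_wait(&l->no_active_readers, &l->lock);
  l->writers_waiting_readers--;
  l->active_writer = true;  /* the mutex stays locked until unlock */
}

void sketch_prlock_unlock(sketch_prlock *l)
{
  if (l->active_writer) {
    /* Writer: we still own the mutex taken in wrlock. */
    l->active_writer = false;
    pthread_mutex_unlock(&l->lock);
  } else {
    /* Reader: drop the count and wake a writer if we were the last. */
    pthread_mutex_lock(&l->lock);
    assert(l->active_readers > 0);
    l->active_readers--;
    if (l->active_readers == 0 && l->writers_waiting_readers)
      pthread_cond_signal(&l->no_active_readers);
    pthread_mutex_unlock(&l->lock);
  }
}
```

With this shape the uncontended wr-lock/unlock path is a single mutex
lock and unlock, which is one plausible reading of "optimized for
wr-lock/unlock"; a try-lock variant does not fit naturally, since a
writer may have to wait on the condition variable.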
No test case is provided as this bug is very hard to repeat
in MTR environment but is repeatable with the help of RQG
tests.
This patch also doesn't include a test for bug #56715
"Concurrent transactions + FLUSH result in sporadical
unwarranted deadlock errors" as it takes too much time to
be run as part of normal test-suite runs.
config.h.cmake:
We no longer need to check for presence of
pthread_rwlockattr_setkind_np as we no longer
use Linux-specific implementation of rw_pr_lock_t
which uses this function.
configure.cmake:
We no longer need to check for presence of
pthread_rwlockattr_setkind_np as we no longer
use Linux-specific implementation of rw_pr_lock_t
which uses this function.
configure.in:
We no longer need to check for presence of
pthread_rwlockattr_setkind_np as we no longer
use Linux-specific implementation of rw_pr_lock_t
which uses this function.
include/my_pthread.h:
Introduced new implementation of rw_pr_lock_t.
Since it never allows situation in which rwlock is rd-locked
and there is a blocked pending rd-lock it is not affected by
bug #56405 "Deadlock in the MDL deadlock detector".
This implementation is also optimized for wr-lock/unlock
scenario which is most common in MDL subsystem. So it doesn't
introduce noticeable performance regressions in sysbench tests
(compared to old Linux-specific implementation). Moreover it
significantly improves situation for POINT_SELECT test when
many connections are used.
As part of this change removed try-lock part of API for
this type of lock. It is not used in our code and it would
be hard to implement correctly within constraints of new
implementation.
Finally, removed support of preferring readers from
my_rw_lock_t implementation as the only user of this
feature was old rw_pr_lock_t implementation.
include/mysql/psi/mysql_thread.h:
Removed try-lock part of prlock API.
It is not used in our code and it would be hard
to implement correctly within constraints of new
prlock implementation.
mysys/thr_rwlock.c:
Introduced new implementation of rw_pr_lock_t.
Since it never allows situation in which rwlock is rd-locked
and there is a blocked pending rd-lock it is not affected by
bug #56405 "Deadlock in the MDL deadlock detector".
This implementation is also optimized for wr-lock/unlock
scenario which is most common in MDL subsystem. So it doesn't
introduce noticeable performance regressions in sysbench tests
(compared to old Linux-specific implementation). Moreover it
significantly improves situation for POINT_SELECT test when
many connections are used.
Also removed support of preferring readers from
my_rw_lock_t implementation as the only user of this
feature was old rw_pr_lock_t implementation.
This crash occurred if the same debug trace file was closed twice,
leading to the same memory being freed twice. This could occur
if the "debug" server system variable referred to the same trace
file in both global and session scope.
Example of an order of events that would lead to a crash:
1) Enable debug tracing to a trace file (global scope)
2) Enable debug tracing to the same trace file (session scope)
3) Reset debug settings (global scope)
4) Reset debug settings (session scope)
This caused a crash because the trace file was, by mistake, closed
in 3), leading to the same memory being freed twice when the file
was closed again in 4).
Internally, the debug settings are stored in a stack, with session
settings (if any) on top and the global settings below. Each connection
has its own stack. When a set of settings is changed, it must be
determined if its debug trace file is to be closed. Previously, this was
done by checking only the settings below it on the stack. So if the global settings
were changed, an existing debug trace file reference in session settings
would be missed. This caused the file to be closed even if it was in use,
leading to a crash later when it was closed again.
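The flawed check described above can be modelled in a few lines.
This is a sketch of the root cause only, with hypothetical names
(sketch_settings, sketch_used_below, sketch_used_elsewhere); it does
not reproduce the real data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one entry on a connection's settings stack. */
typedef struct sketch_settings {
  const char *trace_file;        /* NULL when no trace file is set */
  struct sketch_settings *next;  /* entry below this one, or NULL */
} sketch_settings;

static int same_file(const sketch_settings *a, const sketch_settings *b)
{
  return a->trace_file && b->trace_file &&
         strcmp(a->trace_file, b->trace_file) == 0;
}

/* Old-style check: only the entries below s are inspected. */
int sketch_used_below(const sketch_settings *s)
{
  assert(s != NULL);
  for (const sketch_settings *p = s->next; p; p = p->next)
    if (same_file(s, p))
      return 1;
  return 0;
}

/* Complete check: every other entry, starting from the top. */
int sketch_used_elsewhere(const sketch_settings *top,
                          const sketch_settings *s)
{
  assert(top != NULL && s != NULL);
  for (const sketch_settings *p = top; p; p = p->next)
    if (p != s && same_file(s, p))
      return 1;
  return 0;
}
```

With session settings on top of global settings and both naming the
same file, sketch_used_below() on the global entry finds nothing,
while sketch_used_elsewhere() sees the session reference.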
This patch fixes the problem by preventing the trace file from being shared
between global and session settings. If session debug settings are set without
specifying a new trace file, stderr is used for output. This is a change
in behaviour and should be reflected in the documentation.
Test case added to variables.test.
Use UNINIT_VAR workaround instead of LINT_INIT. The former can
also be used to silence false-positives in non-debug builds as
it actually does not cause new code to be generated.
argument of inline_mysql_mutex_init in sql_base.cc.
When initializing LOCK_dd_owns_lock_open mutex pass
correct PSI key instead of NULL value.
mysql-test/suite/perfschema/r/dml_setup_instruments.result:
Updated test results after adding P_S instrumentation
for LOCK_dd_owns_lock_open.
sql/sql_base.cc:
When initializing LOCK_dd_owns_lock_open mutex pass
correct PSI key instead of NULL value.
Temporarily disable strict aliasing warnings in order to get
wider coverage for optimized builds. Once the violations are
fixed and false-positives silenced, this flag should be removed.
Update to previous patch according to reviewers' comments.
Removing parts.partition_alter4_innodb from default.experimental
(Also closed bug#45299 as a duplicate of bug#56659 as a result of this.)
Adding a run of the tests requiring the --big-test flag to default.weekly to keep the coverage.
mysql-test/collections/default.experimental:
Removed partition_alter4_innodb, since it is very time consuming and
now requires the --big-test flag to run.
mysql-test/collections/default.weekly:
Added a run of the tests that require the --big-test flag, to be run on a weekly basis.
After the patch for Bug#54579, multi inserts done with INSERT DELAYED
are binlogged as normal INSERT. During processing of the statement,
a new query string without the DELAYED keyword is made. The problem
was that this new string was incorrectly made when the INSERT DELAYED
was part of a prepared statement - data was read outside the allocated
buffer.
The reason for this bug was that a pointer to the position of the
DELAYED keyword inside the query string was stored when parsing the
statement. This pointer was then later (at runtime) used (via pointer
subtraction) to find the number of characters to skip when making a
new query string without DELAYED. But when the statement was re-executed
as part of a prepared statement, the original pointer would be invalid
and the pointer subtraction would give a wrong/random result.
This patch fixes the problem by instead storing the offsets from the
beginning of the query string to the start and end of the DELAYED
keyword. These values will not depend on the memory position
of the query string at runtime and therefore not give wrong results
when the statement is executed in a prepared statement.
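The offset-based approach can be sketched as follows. The function
name and signature are hypothetical and not taken from the patch; the
point is only that begin and end are relative to the start of the
query text, so they remain valid for any copy of that text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
  Copy query into out, skipping the bytes in [begin, end).
  Returns 0 on success, -1 if the range is invalid or out is too small.
*/
int sketch_strip_range(const char *query, size_t begin, size_t end,
                       char *out, size_t out_size)
{
  assert(query != NULL && out != NULL);
  size_t len = strlen(query);
  if (begin > end || end > len)
    return -1;
  size_t new_len = len - (end - begin);
  if (new_len + 1 > out_size)
    return -1;
  memcpy(out, query, begin);                    /* text before the keyword */
  memcpy(out + begin, query + end, len - end);  /* text after the keyword */
  out[new_len] = '\0';
  return 0;
}
```

A pointer saved at parse time would refer to the buffer used during
parsing; the pair of offsets gives the same result whether the query
text lives in the original buffer or in a copy made for re-execution.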
This bug was a regression introduced by the patch for Bug#54579.
No test case added as this bug is already covered by the existing
binlog.binlog_unsafe test case when running with valgrind.
but broken.
Before this patch, it was allowed to use stored functions in
HANDLER ... READ statements. The problem was that this functionality
was not really supported by the code. Proper locking would for example
not be performed, and it was also possible to break replication by
having stored functions that performed updates.
This patch disallows the use of stored functions in HANDLER ... READ.
Any such statement will now give an ER_NOT_SUPPORTED_YET error.
This is an incompatible change and should be reflected in the
documentation.
Test case added to handler_myisam/handler_innodb.test.
reports corruption along with timeout
This patch updates the result file for the
parts.partition_special_innodb test case which was, by mistake,
not updated in the original patch.
REPAIR of merge table
Bug #56422 CHECK TABLE run when the table is locked reports
corruption along with timeout
The crash happened if a table maintenance statement (ANALYZE TABLE,
REPAIR TABLE, etc.) was executed on a MERGE table and opening and
locking a child table failed. This could for example happen if a child
table did not exist or if a lock timeout happened while waiting for
a conflicting metadata lock to disappear.
Since opening and locking the MERGE table and its children failed,
the tables would be closed and the metadata locks released.
However, TABLE_LIST::table for the MERGE table would still be set,
with its value invalid since the tables had been closed.
This caused the table maintenance statement to try to continue
and upgrade the metadata lock on the MERGE table. But since the lock
had already been released, this caused a segfault.
This patch fixes the problem by setting TABLE_LIST::table to NULL
if open_and_lock_tables() fails. This prevents maintenance
statements from continuing and trying to upgrade the metadata lock.
The patch includes a 5.5 version of the fix for
Bug #46339 crash on REPAIR TABLE merge table USE_FRM.
This bug caused REPAIR TABLE ... USE_FRM to give an assert
when used on merge tables.
The patch also enables the CHECK TABLE statement for log tables.
Before, CHECK TABLE for log tables gave ER_CANT_LOCK_LOG_TABLE,
yet still counted the statement as successfully executed.
With the changes to table maintenance statement error handling
in this patch, CHECK TABLE would no longer be considered as
successful in this case. This would have caused upgrade scripts
to mistakenly think that the general and slow logs are corrupted
and have to be repaired. Enabling CHECK TABLE for log tables
prevents this from happening.
Finally, the patch changes the error message from "Corrupt" to
"Operation failed" for a number of issues not related to table
corruption, for example "Lock wait timeout exceeded" and
"Deadlock found trying to get lock".
Test cases added to merge.test and check.test.
The problem was that the x86 assembly based atomic CAS
(compare and swap) implementation could copy the wrong
value to the ebx register, where cmpxchg8b expects
to see the low half of the new value to be stored. Since the
original value of the ebx register is saved on the stack (that is,
the push instruction causes the stack pointer to change),
a wrong offset could be used if the compiler decides to
put the source of the new value on the stack.
The solution is to copy the new value directly from
memory. Since the value is 64 bits wide, it is
copied in two steps over to the ebx and ecx registers.
include/atomic/x86-gcc.h:
For reference, an excerpt from a faulty binary follows.
It is a disassembly of my_atomic-t, compiled at -O3 with
ICC 11.0. Most of the code deals with preparations for
an atomic cmpxchg8b operation. This instruction compares
the value in edx:eax with the destination operand. If the
values are equal, the value in ecx:ebx is stored in the
destination, otherwise the value in the destination operand
is copied into edx:eax.
In this case, my_atomic_add64 is implemented as a compare
and exchange. The addition is done over temporary storage
and loaded into the destination if the original term value
is still valid.
volatile int64 a64;
int64 b=0x1000200030004000LL;
a64=0;
mov 0xfffffda8(%ebx),%eax
xor %ebp,%ebp
mov %ebp,(%eax)
mov %ebp,0x4(%eax)
my_atomic_add64(&a64, b);
mov 0xfffffda8(%ebx),%ebp # Load address of a64
mov 0x0(%ebp),%edx # Copy value
mov 0x4(%ebp),%ecx
mov %edx,0xc(%esp) # Assign to tmp var in the stack
mov %ecx,0x10(%esp)
add $0x30004000,%edx # Sum values
adc $0x10002000,%ecx
mov %edx,0x8(%esp) # Save part of result for later
mov 0x0(%ebp),%esi # Copy value of a64 again
mov 0x4(%ebp),%edi
mov 0xc(%esp),%eax # Load the value of a64 used
mov 0x10(%esp),%edx # for comparison
mov %esi,(%esp)
mov %edi,0x4(%esp)
push %ebx # Push %ebx into stack. Changes esp.
mov 0x8(%esp),%ebx # Wrong restore of the result.
lock cmpxchg8b 0x0(%ebp)
sete %cl
pop %ebx
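For comparison, the compare-and-exchange loop behind an atomic
64-bit add can be written in portable C11 as sketched below. This
is not the my_atomic implementation (which uses inline assembly);
sketch_atomic_add64 is a hypothetical name and <stdatomic.h> is
assumed to be available.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Add v to *a with a CAS loop; returns the value seen before the add. */
int64_t sketch_atomic_add64(_Atomic int64_t *a, int64_t v)
{
  assert(a != NULL);
  int64_t expected = atomic_load(a);
  for (;;) {
    /* Compute the sum in temporary storage; unsigned math avoids
       undefined behaviour on signed overflow. */
    int64_t desired = (int64_t) ((uint64_t) expected + (uint64_t) v);
    /* Store desired only if *a still equals expected. On failure,
       expected is refreshed with the current value and we retry. */
    if (atomic_compare_exchange_weak(a, &expected, desired))
      return expected;
  }
}
```

Here expected plays the role of edx:eax and desired the role of
ecx:ebx in the cmpxchg8b description above.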
CHECKSUM TABLE for performance schema tables could cause uninitialized
memory reads.
The root cause is a design flaw in the implementation of
mysql_checksum_table(), which does not honor null fields.
However, fixing this bug in CHECKSUM TABLE is risky, as it can cause the
checksum value to change.
This fix implements a workaround: field values are systematically reset
even for null fields, so that the field memory representation is always
initialized with a known value.
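The idea of the workaround can be sketched as follows. The
sketch_field type, sketch_set_null() and the toy checksum are
hypothetical; they only model a checksum that reads field storage
regardless of the null flag, as described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a field with in-record storage. */
typedef struct {
  unsigned char data[8];
  int is_null;
} sketch_field;

/* Mark the field NULL and also reset its storage to a known value. */
void sketch_set_null(sketch_field *f)
{
  assert(f != NULL);
  memset(f->data, 0, sizeof f->data);
  f->is_null = 1;
}

/* Toy checksum that, like the flaw described, ignores is_null. */
uint32_t sketch_checksum(const sketch_field *fields, size_t n)
{
  uint32_t sum = 0;
  for (size_t i = 0; i < n; i++)
    for (size_t j = 0; j < sizeof fields[i].data; j++)
      sum = sum * 31u + fields[i].data[j];
  return sum;
}
```

Because the storage is reset whenever a field is set to NULL, the
checksum never reads uninitialized bytes and is the same for any two
NULL fields, whatever the buffer contained before.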