bug#41023 maria: Server asserts in sysbench OLTP_RO test
bug#40888 maria: Server crashes in sysbench OLTP_RW test at lf_alloc-pin.c:513
bug#40892 maria: Livelock in sysbench OLTP_RW test
bug#40895 maria: Server crashes in sysbench OLTP_RO test at lf_alloc-pin.c:367
bug#40890 maria: Server crashes in sysbench OLTP_RW test at ctype-bin.c:8
Yet another strict aliasing issue
storage/innobase/handler/ha_innodb.cc:
Fixing Valgrind error: deadlock detector was set to random. Making the recently added lines closer
to the InnoDB style (tabs not spaces)
This writes a warning to stderr if mutexes are locked in an inconsistent order,
for example if one code path locks them in the order A,B and another code path
locks them in the order B,A
This is inspired by and loosely based on the LOCKDEP patch by Jonas
Wrong mutex order is either fixed or the mutexes are marked with MYF_NO_DEADLOCK_DETECTION
if used inconsistently (need to be fixed by server team)
KNOWN_BUGS.txt:
Added information that one needs to dump and restore Maria tables
include/hash.h:
Added prototype function for walking over all elements in a hash
include/my_pthread.h:
Added my_pthread_mutex_init() and my_pthread_mutex_lock(); These should be used if one wants to disable mutex order checking.
Changed names of the nonposix mutex_init functions to not conflict with my_pthread_mutex_init()
Added and extended structures for mutex deadlock detection.
New arguments to safe_mutex_init() and safe_mutex_lock() to allow one to disable mutex order checking.
Added variable 'safe_mutex_deadlock_detector' to enable/disable deadlock detection for all pthread_mutex_init()
mysys/Makefile.am:
Added cleaning of test files
Added test_thr_mutex
mysys/hash.c:
Added hash_iterate() to iterate over all elements in a hash
More comments
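A minimal sketch of the walk-with-callback pattern that hash_iterate() provides. The names toy_hash, walk_action and toy_hash_iterate() are hypothetical stand-ins for illustration only; they are not the mysys HASH API and the real signature may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: return non-zero to stop the walk early. */
typedef int (*walk_action)(void *element, void *argument);

/* Hypothetical toy container standing in for a real hash table. */
struct toy_hash
{
  void *records[16];
  size_t count;
};

/* Call action() on every element; returns 1 if the walk was aborted. */
static int toy_hash_iterate(struct toy_hash *hash, walk_action action,
                            void *argument)
{
  assert(hash && action);
  for (size_t i = 0; i < hash->count; i++)
    if (action(hash->records[i], argument))
      return 1;
  return 0;
}

/* Example action: add each int element into the accumulator. */
static int sum_action(void *element, void *argument)
{
  *(int *) argument += *(int *) element;
  return 0;
}
```

A caller would pass sum_action together with a pointer to an int accumulator to visit every stored element once.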
mysys/my_init.c:
Added calls to destroy all mutexes used by mysys
Added waiting for threads to end before calling TERMINATE() to list not freed memory
mysys/my_pthread.c:
Changed names to free my_pthread_mutex_init() for mutex-lock-order-checking
mysys/my_sleep.c:
Fixed too long wait if using 1000000L as argument
mysys/my_thr_init.c:
Mark THR_LOCK_threads and THR_LOCK_malloc to not have mutex deadlock detection.
(We can't have it enabled for these, as they are internal mutexes used by the detector)
Call my_thread_init() early as we need thread specific variables enabled for the following pthread_mutex_init()
Move code to wait for threads to end to my_wait_for_other_threads_to_die()
Don't destroy mutex and conditions unless all threads have died
Added my_thread_destroy_mutex() to destroy all mutex used by the mysys thread system
Name the thread specific mutex as "mysys_var->mutex"
Added my_thread_var_mutex_in_use() to return pointer to mutex in use or 0 if thread variables are not initialized
mysys/mysys_priv.h:
Added prototypes for functions used internally with mutex-wrong-usage detection
mysys/thr_mutex.c:
Added runtime detection of mutex used in conflicting order
See WL#3262 or test_thr_mutex.c for examples
The basic idea is for each mutex to have two hashes:
- mutex->locked_mutex points to all mutexes used after this one
- mutex->used_mutex points to all mutexes which have this mutex in their mutex->locked_mutex
There is a wrong mutex order if any mutex currently locked before this mutex is in the mutex->locked_mutex hash
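The rule above can be sketched as follows. This is a simplified illustration, not the thr_mutex.c code: it assumes a single simulated thread, integer mutex ids, and a fixed boolean matrix in place of the per-mutex hashes. All names (order_lock, order_unlock, locked_after, MAX_MUTEX) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define MAX_MUTEX 8   /* hypothetical limit for this sketch */

/* locked_after[a][b] is true if mutex b was ever locked while a was held,
   i.e. the recorded order is "a before b". Stands in for a->locked_mutex. */
static bool locked_after[MAX_MUTEX][MAX_MUTEX];

/* Mutexes currently held by the simulated thread, in lock order. */
static int held[MAX_MUTEX];
static int held_count;

/* Record that mutex id is being locked. Returns false and warns on stderr
   if some currently held mutex was earlier locked after id. */
static bool order_lock(int id)
{
  bool ok = true;
  assert(id >= 0 && id < MAX_MUTEX && held_count < MAX_MUTEX);
  for (int i = 0; i < held_count; i++)
  {
    int prev = held[i];
    if (locked_after[id][prev])
    {
      fprintf(stderr, "warning: mutex %d and mutex %d used in conflicting order\n",
              prev, id);
      ok = false;
    }
    else
      locked_after[prev][id] = true;   /* remember "prev before id" */
  }
  held[held_count++] = id;
  return ok;
}

/* Remove mutex id from the list of held mutexes. */
static void order_unlock(int id)
{
  for (int i = 0; i < held_count; i++)
  {
    if (held[i] == id)
    {
      for (int j = i; j < held_count - 1; j++)
        held[j] = held[j + 1];
      held_count--;
      return;
    }
  }
}
```

Locking 0 then 1, releasing both, and then locking 1 then 0 makes the last order_lock() call report a conflict, which is the A,B versus B,A case.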
sql/event_queue.cc:
Mark mutex used inconsistently (need to be fixed by server team)
sql/event_scheduler.cc:
Declare the right order to take the mutex
sql/events.cc:
Mark mutex used inconsistently (need to be fixed by server team)
sql/ha_ndbcluster_binlog.cc:
Mark mutex used inconsistently (need to be fixed by server team)
sql/log.cc:
Mark mutex used inconsistently (need to be fixed by server team)
sql/mysqld.cc:
Use pthread_mutex_trylock instead of pthread_mutex_lock() when sending kill signal to thread
This is needed to avoid wrong mutex order as normally one takes 'current_mutex' before mysys_var->mutex.
Added call to free sp cache.
Add destruction of LOCK_server_started and COND_server_started.
Added register_mutex_order() function to register in which order mutex should be taken
(to initialize mutex_deadlock_detector).
Added option to turn off safe_mutex_deadlock_detector
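A minimal sketch of the try-lock approach used when sending a kill signal, as noted for this file. The idea is that the signalling thread never blocks on 'current_mutex', so it cannot deadlock against a thread that takes the mutexes in the normal order. The helper name toy_wake_waiter() and its return convention are hypothetical; the real code may retry or behave differently when the mutex is busy.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical helper: wake threads waiting on current_cond without ever
   blocking on current_mutex. Returns 1 if the broadcast was sent, 0 if the
   mutex was busy and the caller should retry later. */
static int toy_wake_waiter(pthread_mutex_t *current_mutex,
                           pthread_cond_t *current_cond)
{
  assert(current_mutex && current_cond);
  if (pthread_mutex_trylock(current_mutex) != 0)
    return 0;                     /* busy: do not wait for the mutex */
  pthread_cond_broadcast(current_cond);
  pthread_mutex_unlock(current_mutex);
  return 1;
}
```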
sql/protocol.cc:
Fixed wrong argument to DBUG_PRINT (found by valgrind)
sql/rpl_mi.cc:
Mark mutex used inconsistently (need to be fixed by server team)
sql/set_var.cc:
Remove wrong locking of LOCK_global_system_variables when reading and setting log variables
(would cause inconsistent mutex order).
Update global variables outside of logger.unlock() as LOCK_global_system_variables has to be taken before logger locks
Reviewed by gluh
sql/sp_cache.cc:
Added function to destroy mutex used by sp cache
sql/sp_cache.h:
Added function to destroy mutex used by sp cache
sql/sql_class.cc:
Use pthread_mutex_trylock instead of pthread_mutex_lock() when sending kill signal to thread
This is needed to avoid wrong mutex order as normally one takes 'current_mutex' before mysys_var->mutex.
Register order in which LOCK_delete and mysys_var->mutex is taken
sql/sql_insert.cc:
Give a name for Delayed_insert::mutex
Mark mutex used inconsistently (need to be fixed by server team)
Move closing of tables outside of di->mutex (to avoid wrong mutex order)
sql/sql_show.cc:
Don't keep LOCK_global_system_variables locked over value->show_type() as this leads to wrong mutex order
storage/innobase/handler/ha_innodb.cc:
Disable safe_mutex_deadlock_detector for innobase internal mutexes (to speed up page cache initialization)
storage/maria/ha_maria.cc:
Added flag to ha_maria::info() to signal if we need to lock table share or not.
This is needed to avoid locking mutex in wrong order
storage/maria/ha_maria.h:
Added flag to ha_maria::info() to signal if we need to lock table share or not.
storage/maria/ma_close.c:
Destroy key_del_lock
Simplify freeing ftparser_param
storage/maria/ma_key.c:
Better comment
storage/maria/ma_loghandler.c:
Mark mutex used inconsistently (need to be fixed by sanja)
storage/maria/ma_state.c:
More comments
storage/maria/ma_test1.c:
Ensure that safe_mutex_deadlock_detector is always on (should be, this is just for safety)
storage/maria/ma_test2.c:
Ensure that safe_mutex_deadlock_detector is always on (should be, this is just for safety)
mysqlslap: fix a crash when mysql_store_result() fails
client/mysqlslap.c:
fix a crash
dbug/dbug.c:
only do safemalloc checks if a function is selected
mysql-test/mysql-test-run.pl:
it's easier to add new gdb parameters this way
storage/maria/ma_open.c:
typo in a comment
(BUG#41127: Maria: assertion when SHOW ENGINE MARIA LOGS and missing logs)
mysql-test/suite/maria/r/maria_showlog_error.result:
test suite for the BUG#41127
mysql-test/suite/maria/t/maria_showlog_error.test:
test suite for the BUG#41127
storage/maria/ha_maria.cc:
Do not use MY_WME in the stat call whose errors we process at a higher level.
mysql-test/t/partition.test
sql/ha_partition.cc
Bug#40954: Crash in MyISAM index code with concurrency test using partitioned tables
Problem was usage of read_range_first with an empty key.
Solution was not to give a key if it was empty. (real author Mattias Jonsson)
storage/archive/archive_reader.c
client/mysqlslap.c
Aligned the copyright texts output from "--version" of tools, to
let internal tools be able to change them if needed.
storage/ndb/test/tools/connect.cpp
storage/ndb/test/run-test/atrt.hpp
Corrected a few GPL headers not restricted to GPL version 2
Makefile.am
Added missing --report-features to the 'test-bt-fast' target
support-files/mysql.spec.sh
Reversed the removal of the "%define license GPL" as internal
tools depended on it
Scenario of the BUG#40731 ("Maria: hang (probably in page cache) under concurrency"):
T1: Disable logging for the table
T1: Start inserting into the table
T2: Tries to lock the table, so it waits.
T2: Unlocks and relocks; in the process it sees that the table has logging disabled and re-enables it
T1: Hits a DBUG_ASSERT because it suddenly starts using the table with transactions switched on, which is not expected during bulk insert
storage/maria/ma_pagecache.c:
Page type print added for debugging purposes.
storage/maria/ma_recovery.c:
Check that it was this thread which switched off logging (transactional mode).
storage/maria/maria_def.h:
Added flag for tracking which thread switched off transactional mode for the table.
storage/maria/trnman.c:
During Maria's checkpoint, we walk the list of active transactions; in this list we may find a transaction with a short_id of 0 which means "uninitialized" (is being created right now) and want to ignore this transaction. Such short_id is set under trn->state_lock, so use this mutex to reliably read short_id during checkpoint.
storage/maria/trnman.c:
Store min used trid in a global variable and change trnman_get_min_trid() to return this variable without using a mutex.
This is safe as trnman_get_min_trid() is used for trid optimization and all algorithms will work even if it returns a slightly older trid.
Also ensure that LOCK_trn_list is unlocked in trnman_new_trn() in the very unlikely case that lf_hash_get_pins() fails
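A minimal sketch of the "publish under a mutex, read without one" pattern described for trnman_get_min_trid(). It assumes a C11 atomic in place of the plain global variable, and the way the minimum is computed here is invented for illustration; all toy_ names are hypothetical and not taken from trnman.c.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for LOCK_trn_list and the new global variable. */
static pthread_mutex_t toy_trn_list_lock = PTHREAD_MUTEX_INITIALIZER;
static _Atomic uint64_t toy_min_used_trid = 1;

/* Writer: recompute the minimum over the active transaction ids while
   holding the list mutex, then publish it. With no active transactions
   the next trid to be handed out is the minimum. */
static void toy_update_min_trid(const uint64_t *active, size_t count,
                                uint64_t next_trid)
{
  uint64_t min = next_trid;
  assert(active || count == 0);
  pthread_mutex_lock(&toy_trn_list_lock);
  for (size_t i = 0; i < count; i++)
    if (active[i] < min)
      min = active[i];
  atomic_store_explicit(&toy_min_used_trid, min, memory_order_relaxed);
  pthread_mutex_unlock(&toy_trn_list_lock);
}

/* Reader: no mutex. The value may be slightly stale (older), which is
   acceptable when it is only used as an optimization hint. */
static uint64_t toy_get_min_trid(void)
{
  return atomic_load_explicit(&toy_min_used_trid, memory_order_relaxed);
}
```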
The old way to store the length prefix was (256 - length); This is now changed to (length -249)
Also fixed some defines to have a MARIA_ prefix
storage/maria/ma_control_file.c:
Added comment
storage/maria/ma_key.c:
Added MARIA_ prefix to some defines
Changed how packed transid length was stored
storage/maria/ma_open.c:
Added MARIA_ prefix to some defines
storage/maria/maria_def.h:
Added MARIA_ prefix to some defines
Changed how packed transid length was stored
storage/maria/ha_maria.cc:
Use file->s->lock_key_trees instead of file->s->lock.get_status to detect if we are using versioning
storage/maria/ma_state.c:
Fixed function prototype
storage/maria/ma_state.h:
Fixed function prototype
storage/maria/ha_maria.cc:
Added ha_maria::is_changed()
storage/maria/ha_maria.h:
Added ha_maria::is_changed()
storage/maria/ma_delete.c:
Mark that table changed
storage/maria/ma_open.c:
Ensure that info->state->changed is always reset from thr_lock()
storage/maria/ma_state.c:
Reset handler->state->changed at first usage of transactional table
Reset handler->state->changed when taking lock for not transactional table
storage/maria/ma_state.h:
Added variable and function to track changes of table
storage/maria/ma_update.c:
Mark that table changed
storage/maria/ma_write.c:
Mark that table changed
(need a mutex when modifying bitmap->non_flushable), which I hit when running maria_bulk_insert.yy.
After fixing this, I hit an assertion in check_and_set_lsn() saying that the page was PAGECACHE_PLAIN_PAGE.
This could be caused by pages left by an operation which had transactions disabled (like a bulk insert with repair):
in this patch we remove those pages out of the cache when we re-enable transactions.
After fixing this, I get page cache deadlocks, pushbuild2 also has some, to be looked at.
No testcase, requires concurrency and running for 15 minutes, but automatically tested by pushbuild2.
storage/maria/ma_bitmap.c:
Doing bitmap->non_flushable++ without mutex was wrong. If this ++ happened while another ++ or -- was happening
in another thread, one ++ or -- could be missed and the bitmap code would behave wrongly. For example, if a ++
was missed, the DBUG_ASSERT(((int) (bitmap->non_flushable)) >= 0) in _ma_bitmap_release_unused() could fire.
I saw this assertion happen in practice in maria_bulk_insert.yy. Adding this mutex lock eliminated
the assertion problem.
The >=0 was wrong, should be >0 (or the variable could go negative).
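A minimal sketch of the fix described above: the counter is only changed while holding a mutex, and the decrement asserts > 0 beforehand so the value can never go negative. The struct and function names are hypothetical and do not match the ma_bitmap.c code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define TOY_MAX_THREADS 8
#define TOY_ITERATIONS 10000

/* Hypothetical stand-in for the bitmap structure. */
struct toy_bitmap
{
  pthread_mutex_t lock;
  unsigned int non_flushable;
};

static void toy_inc_non_flushable(struct toy_bitmap *bitmap)
{
  pthread_mutex_lock(&bitmap->lock);
  bitmap->non_flushable++;
  pthread_mutex_unlock(&bitmap->lock);
}

static void toy_dec_non_flushable(struct toy_bitmap *bitmap)
{
  pthread_mutex_lock(&bitmap->lock);
  assert(bitmap->non_flushable > 0);   /* > 0, not >= 0, before decrementing */
  bitmap->non_flushable--;
  pthread_mutex_unlock(&bitmap->lock);
}

/* Each worker pairs every increment with a decrement. */
static void *toy_worker(void *arg)
{
  struct toy_bitmap *bitmap = arg;
  for (int i = 0; i < TOY_ITERATIONS; i++)
  {
    toy_inc_non_flushable(bitmap);
    toy_dec_non_flushable(bitmap);
  }
  return NULL;
}

/* Run thread_count workers and return the final counter value. */
static unsigned int toy_run_workers(int thread_count)
{
  struct toy_bitmap bitmap;
  pthread_t threads[TOY_MAX_THREADS];

  assert(thread_count > 0 && thread_count <= TOY_MAX_THREADS);
  bitmap.non_flushable = 0;
  pthread_mutex_init(&bitmap.lock, NULL);
  for (int i = 0; i < thread_count; i++)
  {
    int rc = pthread_create(&threads[i], NULL, toy_worker, &bitmap);
    assert(rc == 0);
    (void) rc;
  }
  for (int i = 0; i < thread_count; i++)
    pthread_join(threads[i], NULL);
  pthread_mutex_destroy(&bitmap.lock);
  return bitmap.non_flushable;
}
```

With the mutex in place the counter returns to 0 after all workers finish; without it, a lost ++ or -- could leave a non-zero value or trip the assertion.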
storage/maria/ma_recovery.c:
When we re-enable transactionality, as we may have created pages of type PAGECACHE_PLAIN_PAGE before,
we need to remove them from the cache (FLUSH_RELEASE). Or they would stay this way, and later when we
maria_write() to them, we would try to tag them with a LSN (ma_unpin_all_pages()), which is incorrect
for a plain page (and causes assertion in the page cache at start of check_and_set_lsn()).
I saw the assertion fire with maria_bulk_insert.yy, and this seems to cure it.
page cache
No testcase, this requires concurrency and is automatically tested by
maria_bulk_insert.yy in pushbuild2.
storage/maria/ha_maria.cc:
The case of BUG#39710 is:
two threads want to INSERT SELECT into the same table.
Thread1 (T1) starts, does external_lock, thr_lock (store_lock sees 0 records so
upgrades to TL_WRITE), goes into bulk insert, starts writes
T2 starts, external_lock, thr_lock (store_lock sees 0 records so
upgrades to TL_WRITE), blocks on existing thr_lock of T1.
T1 ends writes, ends bulk insert, commits (ha_maria::implicit_commit()
at end of dispatch_command()), external_lock and thr_unlock
(close_thread_tables() at end of dispatch_command()).
T2 wakes up, gets thr_lock, goes into start_bulk_insert() where
file->state is out-of-date and still says that file->state->records==0,
so maria_disable_non_unique_index() is called, which asserts because
the actual number of records (share->state.state.records) is >0.
The solution, maybe temporary, is to also check share->state.state.records==0
when deciding to do bulk insert, with the idea that such operation cannot
rely on the view of the start of the transaction, as it uses repair,
and can safely read share->state as it has acquired the exclusive
TL_WRITE.
Question for reviewer: if we enter the if() branch, do we also need to do:
*(file->state)= share->state.state;
or even call some existing function which does that?
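The extra check can be sketched like this. The types and the function toy_may_bulk_insert() are hypothetical simplifications of file->state and share->state.state; the real decision in ha_maria.cc involves further conditions not shown here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced versions of the state structures. */
struct toy_state_counts { unsigned long long records; };
struct toy_share { struct toy_state_counts state; };

struct toy_handler
{
  const struct toy_state_counts *state;  /* view from transaction start, may be stale */
  const struct toy_share *share;         /* current shared state */
};

/* Bulk insert (which uses repair) is only chosen if both the handler's view
   and the shared state say that the table is empty. */
static bool toy_may_bulk_insert(const struct toy_handler *file)
{
  assert(file && file->state && file->share);
  return file->state->records == 0 && file->share->state.records == 0;
}
```

In the T2 scenario above, the stale view says 0 records while the share says more than 0, so the function returns false and the bulk insert path is skipped.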
storage/maria/ma_pagecache.c:
Fixed ability to read without read lock acquiring.
storage/maria/unittest/CMakeLists.txt:
New unit test which tests simple read and prolonged writes consistency added.
storage/maria/unittest/Makefile.am:
New unit test which tests simple read and prolonged writes consistency added.
storage/maria/unittest/ma_pagecache_rwconsist2.c:
New unit test which tests simple read and prolonged writes consistency added.
Don't fsync() index file when closing Maria table if not transactional.
mysql-test/suite/maria/r/maria.result:
piece moved
mysql-test/suite/maria/r/maria_partition.result:
result
mysql-test/suite/maria/t/maria.test:
- reset default storage engine at end of test, not in the middle
- move piece which requires partitioning, to maria_partition.test, otherwise test fails
on builds without partitioning compiled in
mysql-test/suite/maria/t/maria_partition.test:
new test for those Maria bugs which are specific to partitioning
mysys/my_uuid.c:
compiler warning fix (fix imported from latest 5.1-main)
storage/maria/ma_close.c:
don't fsync() index file when closing table if not transactional
(same test as in _ma_once_end_block_record() when fsync-ing data file)
storage/maria/ma_create.c:
compiler warning fix (char* assigned to uchar*)
storage/maria/ma_loghandler.c:
compiler warning fix (char* assigned to uchar*)
Fixed that mysql-test-run --skip-from starts from the given test
mysql-test/lib/mtr_cases.pl:
Moved testing of $opt_start_from to mysql-test-run.pl because tests are now run per suite and the old way would rerun unwanted tests
mysql-test/mysql-test-run.pl:
Fixed that mysql-test-run --skip-from starts from the given test
MARIA_MAX_MSG_BUF -> HA_MAX_MSG_BUF
include/maria.h:
Remove MARIA_MAX_MSG_BUF; We are now using HA_MAX_MSG_BUF
Added maria_test_invalid_symlink
storage/maria/ha_maria.cc:
MARIA_MAX_MSG_BUF -> HA_MAX_MSG_BUF
storage/maria/ma_check.c:
Removed tab in string constant
Add extra argument to ma_open_datafile()
storage/maria/ma_create.c:
Set error number if table is in use
storage/maria/ma_open.c:
Added name argument to open functions for security check if filename is linked to another file in database directory
storage/maria/ma_static.c:
Default functions for checking if wrong symlink
storage/maria/maria_chk.c:
Add extra argument to _ma_open_datafile()
storage/maria/maria_def.h:
Add extra argument to _ma_open_datafile()
in write_changed_bitmap(), and page cache forbids that. Here we make the page
cache more relaxed. Original patch by Sanja, simplified by me as limited to
not-locked. See comment of ma_bitmap.c.
With that, maria_stress.yy runs until hitting BUG 39665.
storage/maria/ma_bitmap.c:
A thread which unpins bitmap pages in _ma_bitmap_unpin_all() sometimes
hit an assertion in the page cache (info!=0 in remove_pin()) which states
that you can unpin/unlock only what *you* have pinned/locked.
Fixed by setting the new last parameter of pagecache_unlock_by_link()
to TRUE in _ma_bitmap_unpin_all().
storage/maria/ma_blockrec.c:
new prototype and splitting assertion in three (3rd one fires: BUG 39665)
storage/maria/ma_check.c:
new prototype
storage/maria/ma_key_recover.c:
new prototype
storage/maria/ma_loghandler.c:
new prototype
storage/maria/ma_pagecache.c:
Allow a thread to unpin, with pagecache_unlock_by_link(), a non-locked page pinned by others.
This is a hack for _ma_bitmap_unpin_all() which needs to unpin pages which were
pinned by other threads in write_changed_bitmap().
storage/maria/ma_pagecache.h:
new prototype
storage/maria/ma_preload.c:
new prototype
storage/maria/unittest/ma_pagecache_rwconsist.c:
new prototype
storage/maria/unittest/ma_pagecache_single.c:
new prototype
already supports pin-without-lock so implementation of this WL is instant and
done here. This could improve concurrency. No testcase, this requires
multiple threads and is automatically tested at push time by maria_stress.yy (pushbuild2).
storage/maria/ma_bitmap.c:
As the page cache supports pinning without write-locking, we don't take write lock
in write_changed_bitmap(), only a pin; this could improve concurrency (WL#4595).
which nobody woke up (see comment of ma_bitmap.c). No testcase, this requires
multiple threads and is automatically tested at push time by maria_stress.yy (pushbuild2).
storage/maria/ma_bitmap.c:
* _ma_bitmap_wait_or_flush() didn't publish that it was waiting for bitmap to not
be over-allocated (i.e. didn't modify bitmap->flush_all_requested) so nobody
(_ma_bitmap_flushable(), _ma_bitmap_release_unused()) knew it had to wake it up
=> it stalled (BUG#39210). In fact the wait in _ma_bitmap_wait_or_flush()
is not needed, it's ok if this function sends the over-allocated bitmap to page
cache and keeps pin on it (_ma_bitmap_unpin_all() will unpin it later, and
the one who added _ma_bitmap_wait_or_flush() didn't know it). Function
is thus deleted, as _ma_bitmap_flush() can do its job.
* After fixing that, test runs longer and BUG 39665 happens, which looks like
a separate page cache bug.
* Smaller changes: _ma_bitmap_flush_all() called write_changed_bitmap() even
though it might not be changed; added some DBUG calls in functions; split
assertions.
* In _ma_bitmap_release_unused(), it's more logical to test non_flushable_state
than now_transactional to know if we have to decrement non_flushable
(it's exactly per the definition of non_flushable_state).
storage/maria/ma_blockrec.c:
_ma_bitmap_wait_or_flush() is not needed.
******
new prototype and splitting assertion in three (3rd one fires: BUG 39665)
storage/maria/ma_blockrec.h:
_ma_bitmap_wait_or_flush() is not needed.
- The problem was that we didn't inform the handler that we are going to close tables that are locked and may (at least in Maria) be part of an active transaction.
Fix for Bug#39227 Maria: crash with ALTER TABLE PARTITION
Fix for Bug #39987 main.partition_not_windows fails under debug build
Fixed some compiler errors & warnings found by pushbuild
include/my_base.h:
Added HA_EXTRA_PREPARE_FOR_FORCED_CLOSE for signaling the handler that the file will be forced closed
include/my_global.h:
Removed 'register' from 'swap_variables' as this gives warnings when the variables are structs. Compilers should also now be smart enough to figure this out themselves
mysql-test/r/subselect_debug.result:
Reset value of the debug variable; Without setting this the subselect_innodb test will fail when run after this one
mysql-test/suite/maria/r/maria.result:
Merged test with myisam.test
Added tests for new fixed bugs
mysql-test/suite/maria/t/maria.test:
Merged test with myisam.test
Added tests for new fixed bugs
mysql-test/t/subselect_debug.test:
Reset value of the debug variable; Without setting this the subselect_innodb test will fail when run after this one
mysys/my_uuid.c:
Fixed compiler error on windows
sql/ha_partition.cc:
Added support for the new extra flag: HA_EXTRA_PREPARE_FOR_FORCED_CLOSE (Bug #39226)
Ensure that we call extra() for HA_EXTRA_PREPARE_FOR_DROP (Bug#39227)
sql/mysqld.cc:
Fix for Bug #39987 main.partition_not_windows fails under debug build
The problem was that when compiling for purify/valgrind realpath() is not used, which causes test_if_data_home_dir to fail when it shouldn't
sql/sql_base.cc:
Call HA_EXTRA_PREPARE_FOR_FORCED_CLOSE for tables that are locked but we are going to force close without doing a commit
sql/sql_parse.cc:
More DBUG_PRINT. Fixed comments
storage/maria/ma_extra.c:
If HA_EXTRA_PREPARE_FOR_FORCED_CLOSE is called and the table is part of a transaction, remove the table from being part of a transaction.
This is safe as this is only used as part of flush tables or when the table is not part of a transaction
storage/myisam/mi_open.c:
Indentation fix
unittest/mysys/waiting_threads-t.c:
Remove not needed 'volatile' to get rid of compiler warnings on windows
It was a forgotten rw_unlock(), due to the deadlock detector feature (so bug was only in 5.1-maria, not
6.0-maria).
mysql-test/suite/maria/r/maria3.result:
result, all fine
mysql-test/suite/maria/t/maria3.test:
Test of BUG#39697: two scenarios (transactional tables, and non-transactional table but dynamic row format so still taking the rwlock) where the hang happened.
t2 added by this test was masked by a temporary table created earlier in the test, which we forgot to drop.
storage/maria/ha_maria.cc:
use new macro
storage/maria/ma_blockrec.c:
use new macro
storage/maria/ma_commit.c:
use new macro
storage/maria/ma_init.c:
putting address of dummy_transaction_object in --debug trace can be useful
storage/maria/ma_open.c:
use new macro
storage/maria/ma_write.c:
if local_lock_tree is true, we have acquired keyinfo->root_lock so need to release it before "goto err".
A pair of assertions so that our usage of TrIDs is kept sensible.
storage/maria/maria_def.h:
A macro so that changes of MARIA_HA::trn can be tracked with --debug. It helped to understand in what cases,
in maria_write(), we could have !(info->dup_key_trid == info->trn->trid) && !share->now_transactional
(answer: ALTER TABLE adding UNIQUE index on transactional table).
case and then select
Problem was that the archive share was using a case insensitive
charset when comparing table names
Solution was to use a case sensitive char set when the table
names are case sensitive
mysql-test/suite/parts/r/partition_mgm_lc0_archive.result:
Bug#37719: Crash if rename Archive table to same name with different
case and then select
Updated to correct result.
storage/archive/ha_archive.cc:
Bug#37719: Crash if rename Archive table to same name with different
case and then select
system_charset_info is case insensitive, table_alias_charset depends
on the filesystem/lower_case_table_names variable.
Since two tables could end up using the same share, unpredictable
things could happen.
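A minimal sketch of why the comparison mode matters for share lookup. It uses plain strcmp()/strcasecmp() as stand-ins for charset-aware comparison with table_alias_charset and system_charset_info; the function toy_same_share() and the path strings are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* Decide whether two table names refer to the same share. When table names
   are case sensitive, names differing only in case must not match. */
static bool toy_same_share(const char *a, const char *b,
                           bool names_case_sensitive)
{
  assert(a && b);
  if (names_case_sensitive)
    return strcmp(a, b) == 0;
  return strcasecmp(a, b) == 0;
}
```

With a case-insensitive comparison, "./test/t1" and "./test/T1" resolve to one share even on a filesystem where they are two different tables, which is the situation the fix avoids.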
The Blackhole engine did not support row-based replication
since the delete_row(), update_row(), and the index and range
searching functions were not implemented.
This patch adds row-based replication support for the
Blackhole engine by implementing the two functions mentioned
above, and making the engine pretend that it has found the
correct row to delete or update when executed from the slave
SQL thread by implementing index and range searching functions.
It is necessary to only pretend this for the SQL thread, since
a SELECT executed on the Blackhole engine will otherwise never
return EOF, causing a livelock.
mysql-test/extra/binlog_tests/blackhole.test:
Blackhole now handles row-based replication.
mysql-test/extra/rpl_tests/rpl_blackhole.test:
Test helper file for testing that blackhole actually
writes something to the binary log on the slave.
mysql-test/suite/binlog/t/binlog_multi_engine.test:
Replication now handles row-based replication.
mysql-test/suite/rpl/t/rpl_blackhole.test:
Test that Blackhole works with primary key, index, or none.
sql/log_event.cc:
Correcting code to only touch filler bits and leave
all other bits alone. It is necessary since there is
no guarantee that the engine will be able to fill in
the bits correctly (e.g., the blackhole engine).
storage/blackhole/ha_blackhole.cc:
Adding definitions for update_row() and delete_row() to return OK
when executed from the slave SQL thread with thd->query == NULL
(indicating that row-based replication events are being processed).
Changing rnd_next(), index_read(), index_read_idx(), and
index_read_last() to return OK when executed from the slave SQL
thread (faking that the row has been found so that processing
proceeds to update/delete the row).
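The "pretend the row was found" behaviour of the search functions can be sketched in plain C as below. The struct toy_thd, the return codes and toy_blackhole_rnd_next() are hypothetical; the real engine is C++ and uses the handler API and server error codes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical return codes for this sketch. */
enum { TOY_OK = 0, TOY_END_OF_FILE = 1 };

/* Hypothetical, reduced thread descriptor. */
struct toy_thd
{
  bool slave_sql_thread;   /* true when running as the slave SQL thread */
};

/* Row lookup: the slave SQL thread is told that a row was found so that
   update/delete processing proceeds; every other reader gets end-of-file,
   so a SELECT still terminates. */
static int toy_blackhole_rnd_next(const struct toy_thd *thd)
{
  assert(thd);
  if (thd->slave_sql_thread)
    return TOY_OK;
  return TOY_END_OF_FILE;
}
```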
storage/blackhole/ha_blackhole.h:
Enabling row capabilities for engine.
Defining write_row(), update_row(), and delete_row().
Making write_row() private (as it should be).