Before this fix, the server could crash during shutdown
because of race conditions that occurred when killing the server.
In particular, the performance schema instrumentation handle,
PSI_server, and the performance schema itself would be cleaned up
too soon, causing race conditions with a running kill server thread.
The specifics of the race condition found are that:
the main thread executing "PSI_server= NULL" can cause crashes in
other threads still running, which are executing
"if (PSI_server != NULL) PSI_server->xxx()"
as part of the performance schema instrumentation.
While the bug was reported for the kill server thread,
in theory the same crash could happen with the signal thread,
as found by code analysis.
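The window described above can be sketched deterministically in a single thread. This is an illustration only: Instrumentation, g_psi, instrumented_call and the `between` callback are hypothetical names, not the real performance schema interface. The callback stands in for the main thread being scheduled between the NULL check and the call.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for the instrumentation interface (not the real PSI API).
struct Instrumentation {
  int calls = 0;
  void record() { ++calls; }
};

// Plays the role of PSI_server.
Instrumentation *g_psi = nullptr;

// The check-then-use pattern quoted above. `between` runs in the window
// between the NULL check and the call, standing in for the main thread
// being scheduled at that point. Returns true if the call would have
// dereferenced a NULL pointer.
bool instrumented_call(const std::function<void()> &between) {
  if (g_psi != nullptr) {
    between();                // main thread may run "PSI_server= NULL" here
    if (g_psi == nullptr)
      return true;            // real code would crash in g_psi->record()
    g_psi->record();
  }
  return false;
}
```

If nothing ever resets the pointer during shutdown, the window is harmless, because the second read always sees the same non-NULL value.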
The correct fix would be to shut down the performance schema
and set PSI_server to NULL only after every other thread is guaranteed
to have completed, including the kill_server_thread.
However, the existing mysqld server design does not provide this guarantee.
See in particular bug number 56666.
The workaround used to fix this race condition is to simply not
perform the call to shutdown_performance_schema() when the server exits,
and to keep the PSI_server pointer unchanged.
This will cause memory leaks to be reported by tools like valgrind,
but no memory leak actually happens because the process is about to exit().
As a result, the file mysql-test/valgrind.supp has been updated
to filter out these false positive messages.
This code has been tested by running the following tests in a loop
and in parallel; they have been known to fail with race conditions
in the past:
- rpl_change_master
- binlog_max_extension
- events_restart
- rpl_heartbeat_basic
and no crash or test failure has been seen with the changed code.
mysql-test/suite/funcs_1/r/is_columns_is.result:
update for monty changes
mysql-test/valgrind.supp:
more generic suppression for dlerror() internal allocations
scripts/make_win_bin_dist:
update for hakan changes - we don't generate .map files anymore
sql/CMakeLists.txt:
innodb_plugin is said to need .map files in sql/
innodb.innodb-zip [ fail ] Found warnings/errors in server log file!
Test ended at 2010-05-17 16:41:25
line
==31182== Thread 13:
==31182== Conditional jump or move depends on uninitialised value(s)
==31182== at 0xA9193F: longest_match (deflate.c:1143)
==31182== by 0xA92C19: deflate_slow (deflate.c:1595)
==31182== by 0xA90C6B: deflate (deflate.c:790)
==31182== by 0x928A07: btr_store_big_rec_extern_fields (btr0cur.c:4092)
==31182== by 0x9C9B90: row_ins_index_entry_low (row0ins.c:2119)
==31182== by 0x9C9DFB: row_ins_index_entry (row0ins.c:2167)
==31182== by 0x9CA057: row_ins_index_entry_step (row0ins.c:2252)
==31182== by 0x9CA0FD: row_ins (row0ins.c:2384)
==31182== by 0x9CA760: row_ins_step (row0ins.c:2494)
==31182== by 0x8CBF7E: row_insert_for_mysql (row0mysql.c:1138)
==31182== by 0x8BCF32: ha_innobase::write_row(unsigned char*) (ha_innodb.cc:4929)
==31182== by 0x736E03: handler::ha_write_row(unsigned char*) (handler.cc:4682)
==31182== by 0x5B0EEE: write_record(THD*, TABLE*, st_copy_info*) (sql_insert.cc:1670)
==31182== by 0x5B129D: select_insert::send_data(List<Item>&) (sql_insert.cc:3279)
==31182== by 0x5F31ED: end_send(JOIN*, st_join_table*, bool) (sql_select.cc:12428)
==31182== by 0x5F9B9B: evaluate_join_record(JOIN*, st_join_table*, int) (sql_select.cc:11632)
- Adjust timing in test case, to avoid test failures caused by high load
on machines and consequent race conditions in the test case.
- Add another variant of Valgrind suppressions for memory leak in system
libraries when unloading dynamic object files.
mysql-test/r/information_schema.result:
Adjust timing to avoid test failures due to races.
mysql-test/t/information_schema.test:
Adjust timing to avoid test failures due to races.
mysql-test/valgrind.supp:
Add another variant of valgrind suppression for leak in system libs.
mysql-test/r/innodb-timeout.result:
Make test more robust to scheduling delays on the host running the test suite.
mysql-test/suite/rpl/r/rpl_relayspace.result:
Apply patch from BUG#25228 and tweak timeout value in an attempt to fix random
failure of this test in Buildbot (could not repeat locally).
mysql-test/suite/rpl/t/rpl_relayspace.test:
Apply patch from BUG#25228 and tweak timeout value in an attempt to fix random
failure of this test in Buildbot (could not repeat locally).
mysql-test/t/innodb-timeout.test:
Make test more robust to scheduling delays on the host running the test suite.
mysql-test/valgrind.supp:
Add suppression for Glibc bug.
- Moved some code from innodb_plugin to xtradb, to ensure that all tests run
- Made changes in pbxt and maria storage engines because of changes in thd->query
- Reverted wrong code in sql_table.cc for how ROW_FORMAT is used.
This is a re-commit of Monty's merge to eliminate an extra commit from
MySQL-5.1.42 that was accidentally included in the merge.
This is a merge of the MySQL 5.1.41 clone-off (clone-5.1.41-build). In
case there are any extra changes done before final MySQL 5.1.41
release, these will need to be merged later before MariaDB 5.1.41
release.
* Finished Monty and Jani's merge
* Some InnoDB tests still fail (because old xtradb code is run against a
newer testsuite). They are expected to go away after merging with the latest
xtradb.
The server shutdown and start code triggered valgrind failures
within nptl_pthread_exit_hack_handler on Ubuntu 9.04, x86 (but not amd64)
in the rpl_trigger.test file.
To fix the bug, suppress valgrind failures within nptl_pthread_exit_hack_handler
on Ubuntu 9.04, x86 (but not amd64), because the server shutdown and start
code is heavily used in the mysql test suite.
mysql-test/valgrind.supp:
Add code for suppressing valgrind failures within nptl_pthread_exit_hack_handler on Ubuntu 9.04, x86 (but not amd64).
Use MY_MUTEX_INIT_FAST for pool mutex
mysql-test/mysql-test-run.pl:
Added option --staging-run
Added information about --parallel=# to help message
mysql-test/suite/federated/federated_server.test:
Slow test, don't run with --staging-run
mysql-test/suite/maria/t/maria-preload.test:
Slow test, don't run with --staging-run
mysql-test/suite/rpl/t/rpl_optimize.test:
Slow test, don't run with --staging-run
mysql-test/suite/rpl/t/rpl_relayrotate.test:
Slow test, don't run with --staging-run
mysql-test/suite/rpl/t/rpl_row_001.test:
Slow test, don't run with --staging-run
mysql-test/suite/rpl/t/rpl_row_mysqlbinlog.test:
Slow test, don't run with --staging-run
mysql-test/suite/rpl/t/rpl_row_sp003.test:
Slow test, don't run with --staging-run
mysql-test/suite/rpl/t/rpl_start_stop_slave.test:
Slow test, don't run with --staging-run
mysql-test/t/compress.test:
Slow test, don't run with --staging-run
mysql-test/t/count_distinct3.test:
Slow test, don't run with --staging-run
mysql-test/t/index_merge_innodb.test:
Slow test, don't run with --staging-run
mysql-test/t/information_schema_all_engines.test:
Slow test, don't run with --staging-run
mysql-test/t/innodb_mysql.test:
Slow test, don't run with --staging-run
mysql-test/t/pool_of_threads.test:
Slow test, don't run with --staging-run
mysql-test/t/preload.test:
Slow test, don't run with --staging-run
mysql-test/t/ssl.test:
Slow test, don't run with --staging-run
mysql-test/t/ssl_compress.test:
Slow test, don't run with --staging-run
mysql-test/valgrind.supp:
Suppress warnings from SuSE 11.1 on x86
sql/scheduler.cc:
Use MY_MUTEX_INIT_FAST for pool mutex
- Version number.
- Valgrind false alarms in libz.
- New variant of suppression for Valgrind warning in dlclose().
- Fix double free() in plugin init error case.
configure.in:
Fix version number. We should reset the maria variant back to `1' when the MySQL version
number increases.
include/my_sys.h:
Fix false alarms in Valgrind for zlib.
Apply same fix as for archive storage handler also to the cases of compression in the
client protocol, and to the compression SQL function.
mysql-test/valgrind.supp:
A new variant of the dlclose() suppression is needed now.
mysys/my_compress.c:
Fix false alarms in Valgrind for zlib.
Apply same fix as for archive storage handler also to the cases of compression in the
client protocol, and to the compression SQL function.
sql/handler.cc:
Fix a double free() in error case for plugin initialisation.
sql/item_strfunc.cc:
Fix false alarms in Valgrind for zlib.
Apply same fix as for archive storage handler also to the cases of compression in the
client protocol, and to the compression SQL function.
Fix mysql-test-run.pl to not terminate early when warnings in error logs are detected during
server shutdown. Instead, give a nice summary report at the end of the failures.
Fix code to make 100% sure no failures will go undetected.
Revert earlier wrong change.
Fix race with port allocation semaphore file permissions.
Adjust testsuite to cope with new PBXT engine now in the tree. The PBXT engine causes an
extra table to appear in the INFORMATION_SCHEMA. This causes different output for a few
test cases.
dbug/dbug.c:
If DbugParse() is called multiple times, the stack->keywords for the
top stack frame could be overwritten without being freed, causing a
memory leak reported by Valgrind.
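The leak pattern described for dbug.c can be sketched as follows. This is a simplified illustration, not the dbug code: Frame, copy_string, release and the two set_keywords_* functions are hypothetical names, and live_allocations is a crude counter added only to make the leak observable.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical, simplified stack frame; the real dbug structures differ.
struct Frame {
  char *keywords = nullptr;
};

static int live_allocations = 0;  // crude leak counter for the sketch

static char *copy_string(const char *s) {
  std::size_t n = std::strlen(s) + 1;
  char *p = static_cast<char *>(std::malloc(n));
  if (p == nullptr) std::abort();
  std::memcpy(p, s, n);
  ++live_allocations;
  return p;
}

static void release(char *p) {
  if (p != nullptr) {
    std::free(p);
    --live_allocations;
  }
}

// Leaky variant: a second call overwrites the pointer and loses the old block.
void set_keywords_leaky(Frame &f, const char *s) { f.keywords = copy_string(s); }

// Fixed variant: free whatever the frame already holds before overwriting.
void set_keywords_fixed(Frame &f, const char *s) {
  release(f.keywords);
  f.keywords = copy_string(s);
}
```

Calling the leaky variant twice on the same frame leaves two live allocations with only one reachable pointer; the fixed variant leaves exactly one.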
include/my_global.h:
Add a useful macro for selecting different values depending on whether the build is for Valgrind (HAVE_purify) or not.
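A macro of this kind might look like the sketch below. The name IF_VALGRIND_SKETCH, the kValgrindBuild constant and the iteration-count use case are assumptions made for illustration; only the HAVE_purify symbol comes from the message above.

```cpp
#include <cassert>

// Hypothetical macro name; selects A under a Valgrind build, B otherwise.
#ifdef HAVE_purify
#define IF_VALGRIND_SKETCH(A, B) (A)
constexpr bool kValgrindBuild = true;
#else
#define IF_VALGRIND_SKETCH(A, B) (B)
constexpr bool kValgrindBuild = false;
#endif

// Assumed example use: run fewer iterations when the build is slowed down
// by Valgrind instrumentation.
constexpr int kIterations = IF_VALGRIND_SKETCH(10, 1000);
```

Keeping the choice in one macro avoids scattering #ifdef blocks through the code that needs build-dependent values.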
mysql-test/extra/rpl_tests/rpl_auto_increment.test:
Omit pbxt variables from show variables output.
mysql-test/include/have_pbxt.inc:
Add facility to disable test if PBXT engine is not available.
mysql-test/lib/mtr_report.pm:
Give a nice summary report at the end of tests of any warnings seen in logs during
server shutdowns.
mysql-test/lib/mtr_unique.pm:
Move chmod 777 to greatly reduce the risk of leaving the port semaphore file inaccessible
to other users.
mysql-test/mysql-test-run.pl:
Don't abort in case of warnings detected, instead give a nice summary report.
Fix code to make 100% sure no failures will go undetected.
Revert earlier wrong change when master disconnects early.
mysql-test/r/information_schema.result:
Omit PBXT INFORMATION_SCHEMA table from output.
Move part of test to information_schema_all_engines.
mysql-test/r/information_schema_all_engines.result:
New file for information_schema tests that depend on which engines are available.
mysql-test/r/information_schema_db.result:
Move part of test to information_schema_all_engines.
mysql-test/r/innodb-autoinc.result:
Omit pbxt variables from show variables output.
mysql-test/r/mysqlshow.result:
Move part of test to information_schema_all_engines.
mysql-test/suite/rpl/r/rpl_auto_increment.result:
Omit pbxt variables from show variables output.
mysql-test/t/information_schema.test:
Omit PBXT INFORMATION_SCHEMA table from output.
Move part of test to information_schema_all_engines.
mysql-test/t/information_schema_all_engines.test:
New file for information_schema tests that depend on which engines are available.
mysql-test/t/information_schema_db.test:
Move part of test to information_schema_all_engines.
mysql-test/t/innodb-autoinc.test:
Omit pbxt variables from show variables output.
mysql-test/t/mysqlshow.test:
Move part of test to information_schema_all_engines.
mysql-test/valgrind.supp:
Add variant suppression (different system library versions).
Add suppression for problem with inet_ntoa().
sql/mysqld.cc:
Fix missing DBUG_RETURN.
Fix uninitialised thd->connect_utime, likely introduced by pool_of_threads.
sql/set_var.cc:
Fix one-byte buffer overflow in several places.
Fix unsafe use of String::c_ptr() of stack-allocated String buffer.
sql/sql_select.cc:
Silence valgrind warning due to GCC bug.
sql/sql_string.h:
Document potential problem with String::c_ptr() and String() constructor with caller-supplied buffer.
storage/archive/azio.c:
Silence Valgrind false warning for libz.
If a delayed insert failed to upgrade the lock, it did not
free the temporary memory storage used to keep
newly constructed blob values in memory.
Fixed by iterating over the remaining rows in the delayed
insert rowset and freeing the blob storage for each row.
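The shape of this fix can be sketched as below. This is an illustration under assumptions, not the sql_insert.cc code: DelayedRow, make_row, free_blob, process_rows and the lock_upgrade_ok flag are hypothetical, and live_blobs is a counter added only to make leaks visible.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <deque>

// Hypothetical queued row that owns temporary blob storage.
struct DelayedRow {
  void *blob = nullptr;
};

static int live_blobs = 0;  // crude leak counter for the sketch

DelayedRow make_row(std::size_t blob_size) {
  DelayedRow row;
  row.blob = std::malloc(blob_size);
  if (row.blob == nullptr) std::abort();
  ++live_blobs;
  return row;
}

void free_blob(DelayedRow &row) {
  if (row.blob != nullptr) {
    std::free(row.blob);
    row.blob = nullptr;
    --live_blobs;
  }
}

// Returns false on the error path. Before the fix, the error path returned
// without visiting the rows still in the queue, so their blobs leaked.
bool process_rows(std::deque<DelayedRow> &rows, bool lock_upgrade_ok) {
  if (!lock_upgrade_ok) {
    for (DelayedRow &row : rows) free_blob(row);  // free each remaining row's blob
    rows.clear();
    return false;
  }
  while (!rows.empty()) {
    // ... the row would be written to the table here ...
    free_blob(rows.front());
    rows.pop_front();
  }
  return true;
}
```

With three queued rows and a failed lock upgrade, the error path leaves the queue empty and no blob storage outstanding.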
No test suite because it involves concurrent delayed inserts
on a table and cannot easily be made deterministic.
Added a correct valgrind suppression for Fedora 9.
mysql-test/valgrind.supp:
Added a valgrind suppression for Fedora 9
sql/sql_insert.cc:
Bug #38693: free the blobs temp storage on error.
Reset history when we reenable logging for transactional tables (safety fix)
mysql-test/r/maria2.result:
New results
mysql-test/t/maria2.test:
Added test case for alter table on locked maria table
mysql-test/valgrind.supp:
Added suppression rules for warnings in libc / libld
storage/maria/ma_extra.c:
Remove table from trnman list if we are going to drop or rename it; we don't want non-existent shares in the list when we commit.
storage/maria/ma_recovery.c:
Ensure that info->state doesn't point to a history event when we disable logging for a table. This is needed because alter table will first commit and then unlock, which would cause us to access a non-existent object when we reenable logging.
Reset history when we reenable logging (safety fix)
storage/maria/ma_state.c:
Do less work when share->now_transactional is not set. (Safety fix)
Added function to remove shares to be deleted from trnman->used_tables
Added function to reset history to current context
storage/maria/ma_state.h:
Prototypes for new function