The problem is that a SELECT .. FOR UPDATE statement might open
a table and later wait for a impeding global read lock without
noticing whether it is holding a table that is being waited upon
the the flush phase of the process that took the global read
lock.
The same problem also affected the following statements:
LOCK TABLES .. WRITE
UPDATE .. SET (update and multi-table update)
TRUNCATE TABLE ..
LOAD DATA ..
The solution is to make the above statements wait for a impending
global read lock before opening the tables. If there is no
impending global read lock, the statement raises a temporary
protection against global read locks and progresses smoothly
towards completion.
Important notice: the patch does not try to address all possible
cases, only those which are common and can be fixed unintrusively
enough for 5.0.
Details for Bug#43015 main.lock_multi: Weak code (sleeps etc.)
-------------------------------------------------------------
- The fix for bug 42003 already removed a lot of the weaknesses mentioned.
- Tests showed that there are unfortunately no improvements of this tests
in MySQL 5.1 which could be ported back to 5.0.
- Remove a superfluous "--sleep 1" around line 195
Details for Bug#43065 main.lock_multi: This test is too big if the disk is slow
-------------------------------------------------------------------------------
- move the subtests for the bugs 38499 and 36691 into separate scripts
- runtime under excessive parallel I/O load after applying the fix
lock_multi [ pass ] 22887
lock_multi_bug38499 [ pass ] 536926
lock_multi_bug38691 [ pass ] 258498
+ Fix for Bug#43114 wait_until_count_sessions too restrictive, random PB failures
+ Removal of a lot of other weaknesses found
+ modifications according to review
derived table cause crash
When a multi-UPDATE command fails to lock some table, and
subsequently succeeds, the tables need to be reopened if
they were altered. But the reopening procedure failed for
derived tables.
Extra cleanup has been added.
``FLUSH TABLES WITH READ LOCK''
Concurrent execution of 1) multitable update with a
NATURAL/USING join and 2) a such query as "FLUSH TABLES
WITH READ LOCK" or "ALTER TABLE" of updating table led
to a server crash.
The mysql_multi_update_prepare() function call is optimized
to lock updating tables only, so it postpones locking to
the last, and if locking fails, it does cleanup of modified
syntax structures and repeats a query analysis. However,
that cleanup procedure was incomplete for NATURAL/USING join
syntax data: 1) some Field_item items pointed into freed
table structures, and 2) the TABLE_LIST::join_columns fields
was not reset.
Major change:
short-living Field *Natural_join_column::table_field has
been replaced with long-living Item*.
The problem is that the Table_locks_waited was incremented only
when the lock request succeed. If a thread waiting for the lock
gets killed or the lock request is aborted, the variable would
not be incremented, leading to inaccurate values in the variable.
The solution is to increment the Table_locks_waited whenever the
lock request is queued. This reflects better the intended behavior
of the variable -- show how many times a lock was waited.
The problem is that some DDL statements (ALTER TABLE, CREATE
TRIGGER, FLUSH TABLES, ...) when under LOCK TABLES need to
momentarily drop the lock, reopen the table and grab the write
lock again (using reopen_tables). When grabbing the lock again,
reopen_tables doesn't pass a flag to mysql_lock_tables in
order to ignore the impending global read lock, which causes a
assertion because LOCK_open is being hold. Also dropping the
lock must not signal to any threads that the table has been
relinquished (related to the locking/flushing protocol).
The solution is to correct the way the table is reopenned
and the locks grabbed. When reopening the table and under
LOCK TABLES, the table version should be set to 0 so other
threads have to wait for the table. When grabbing the lock,
any other flush should be ignored because it's theoretically
a atomic operation. The chosen solution also fixes a potential
discrepancy between binlog and GRL (global read lock) because
table placeholders were being ignored, now a FLUSH TABLES WITH
READ LOCK will properly for table with open placeholders.
It's also important to mention that this patch doesn't fix
a potential deadlock if one uses two GRLs under LOCK TABLES
concurrently.
Kill of a CREATE TABLE source_table LIKE statement waiting for a
name-lock on the source table causes a bad lock interaction.
The mysql_create_like_table() has a bug that if the connection is
killed while waiting for the name-lock on the source table, it will
jump to the wrong error path and try to unlock the source table and
LOCK_open, but both weren't locked.
The solution is to simple return when the name lock request is killed,
it's safe to do so because no lock was acquired and no cleanup is needed.
Original bug report also contains description of other problems
related to this scenario but they either already fixed in 5.1 or
will be addressed separately (see bug report for details).
mysql_ha_open calls mysql_ha_close on the error path (unsupported) to close the (opened) table before inserting it into the tables hash list handler_tables_hash) but mysql_ha_close only closes tables which are on the hash list, causing the table to be left open and locked.
This change moves the table close logic into a separate function that is always called on the error path of mysql_ha_open or on a normal handler close (mysql_ha_close).
The patch changes the test case only.
The fix is to replace all 'sleep's with wait_condition. This makes
the test deterministic and also ~300 times faster.
- Use mysql_system_tables.sql to create MySQL system tables in
all places where we create them(mysql_install_db, mysql-test-run-pl
and mysql_fix_privilege_tables.sql)
Addendum fixes after changing the condition variable
for the global read lock.
The stress test suite revealed some deadlocks. Some were
related to the new condition variable (COND_global_read_lock)
and some were general problems with the global read lock.
It is now necessary to signal COND_global_read_lock whenever
COND_refresh is signalled.
We need to wait for the release of a global read lock if one
is set before every operation that requires a write lock.
But we must not wait if we have locked tables by LOCK TABLES.
After setting a global read lock a thread waits until all
write locks are released.
Fixes bug#17264, for alter table on win32 for successfull operation completion
it is used TL_WRITE(=10) lock instead of TL_WRITE_ALLOW_READ(=6), however here
in innodb handler TL_WRTIE is lifted to TL_WRITE_ALLOW_WRITE, which causes
race condition when several clients do alter table simultaneously.
The order of acquiring LOCK_mysql_create_db
and wait_if_global_read_lock() was wrong. It could happen
that a thread held LOCK_mysql_create_db while waiting for
the global read lock to be released. The thread with the
global read lock could try to administrate a database too.
It would first try to lock LOCK_mysql_create_db and hang...
The check if the current thread has the global read lock
is done in wait_if_global_read_lock(), which could not be
reached because of the hang in LOCK_mysql_create_db.
Now I exchanged the order of acquiring LOCK_mysql_create_db
and wait_if_global_read_lock(). This makes
wait_if_global_read_lock() fail with an error message for
the thread with the global read lock. No deadlock happens.
Replaced COND_refresh with COND_global_read_lock becasue of a bug in NTPL threads when using different mutexes as arguments to pthread_cond_wait()
The original code caused a hang in FLUSH TABLES WITH READ LOCK in some circumstances because pthread_cond_broadcast() was not delivered to other threads.
This fixes:
Bug#16986: Deadlock condition with MyISAM tables
Bug#20048: FLUSH TABLES WITH READ LOCK causes a deadlock
Added missing DBUG_xxx_RETURN statements
Fixed some usage of not initialized variables (as found by valgrind)
Ensure that we don't remove locked tables used as name locks from open table cache until unlock_table_names() are called.
This was fixed by having drop_locked_name() returning any table used as a name lock so that we can free it in unlock_table_names()
This will allow Tomas to continue with his work to use namelocks to syncronize things.
Note: valgrind still produces a lot of warnings about using not initialized code and shows memory loss errors when running the ndb tests
Use open_normal_and_derived_tables instead of open_and_lock_tables when reading metadata for a table.
Add two test cases, one for "USE database" and one for "SHOW COLUMNS FROM table"