bug #57006 "Deadlock between HANDLER and FLUSH TABLES WITH READ
LOCK" and bug #54673 "It takes too long to get readlock for
'FLUSH TABLES WITH READ LOCK'".
The first bug manifested itself as a deadlock which occurred
when a connection, which had some table open through HANDLER
statement, tried to update some data through DML statement
while another connection tried to execute FLUSH TABLES WITH
READ LOCK concurrently.
What happened was that FTWRL in the second connection managed
to perform first step of GRL acquisition and thus blocked all
upcoming DML. After that it started to wait for table open
through HANDLER statement to be flushed. When the first connection
tried to execute DML it has started to wait for GRL/the second
connection creating deadlock.
The second bug manifested itself as starvation of FLUSH TABLES
WITH READ LOCK statements in cases when there was a constant
stream of concurrent DML statements (in two or more
connections).
This has happened because requests for protection against GRL
which were acquired by DML statements were ignoring presence of
pending GRL and thus the latter was starved.
This patch solves both these problems by re-implementing GRL
using metadata locks.
Similar to the old implementation acquisition of GRL in new
implementation is two-step. During the first step we block
all concurrent DML and DDL statements by acquiring global S
metadata lock (each DML and DDL statement acquires global IX
lock for its duration). During the second step we block commits
by acquiring global S lock in COMMIT namespace (commit code
acquires global IX lock in this namespace).
Note that unlike in old implementation acquisition of
protection against GRL in DML and DDL is semi-automatic.
We assume that any statement which should be blocked by GRL
will either open and acquires write-lock on tables or acquires
metadata locks on objects it is going to modify. For any such
statement global IX metadata lock is automatically acquired
for its duration.
The first problem is solved because waits for GRL become
visible to deadlock detector in metadata locking subsystem
and thus deadlocks like one in the first bug become impossible.
The second problem is solved because global S locks which
are used for GRL implementation are given preference over
IX locks which are acquired by concurrent DML (and we can
switch to fair scheduling in future if needed).
Important change:
FTWRL/GRL no longer blocks DML and DDL on temporary tables.
Before this patch behavior was not consistent in this respect:
in some cases DML/DDL statements on temporary tables were
blocked while in others they were not. Since the main use cases
for FTWRL are various forms of backups and temporary tables are
not preserved during backups we have opted for consistently
allowing DML/DDL on temporary tables during FTWRL/GRL.
Important change:
This patch changes thread state names which are used when
DML/DDL of FTWRL is waiting for global read lock. It is now
either "Waiting for global read lock" or "Waiting for commit
lock" depending on the stage on which FTWRL is.
Incompatible change:
To solve deadlock in events code which was exposed by this
patch we have to replace LOCK_event_metadata mutex with
metadata locks on events. As result we have to prohibit
DDL on events under LOCK TABLES.
This patch also adds extensive test coverage for interaction
of DML/DDL and FTWRL.
Performance of new and old global read lock implementations
in sysbench tests were compared. There were no significant
difference between new and old implementations.
REBUILD PARTITION under LOCK TABLE
Collapsed patch including updates from the reviews.
In case of failure in ALTER ... PARTITION under LOCK TABLE
the server could crash, due to it had modified the locked
table object, which was not reverted in case of failure,
resulting in a bad table definition used after the failed
command.
Solved by instead of altering the locked table object and
its partition_info struct, creating an internal temporary
intermediate table object used for altering,
just like the non partitioned mysql_alter_table.
So if an error occur before the alter operation is complete,
the original table is not modified at all.
But if the alter operation have succeeded so far that it
must be completed as whole,
the table is properly closed and reopened.
(The completion on failure is done by the ddl_log.)
adding new indexes
A fast alter table requires that the existing (old) table
and indices are unchanged (i.e only new indices can be
added). To verify this, the layout and flags of the old
table/indices are compared for equality with the new.
The PACK_KEYS option is a no-op in InnoDB, but the flag
exists, and is used in the table compare. We need to
check this (table) option flag before deciding whether an
index should be packed or not. If the table has
explicitly set PACK_KEYS to 0, the created indices should
not be marked as packed/packable.
The problem was that RENAME TABLE caused an assert if the system variable
lower_case_table_names was 2 (default on Mac OS X) and the old table name
was given in upper case. This caused lowercase_table2.test to fail.
The assert checks that an exclusive metadata lock is held by the connection
trying to do RENAME TABLE - specificially during updates of table triggers.
The assert was triggered since the check is case sensitive and the lock
was held on the normalized (lower case) version of the table name.
This patch fixes the problem by making sure a normalized version of the
table name is used for the metadata lock check, while using a non-normalized
version of the table name for the rename of trigger files. The same is done
for ALTER TABLE ... RENAME.
Regression testing for the bug itself is already covered by
lowercase_table2.test. Additional coverage added to lowercase_fs_off.test.
to allow temp table operations) -- prerequisite patch #3.
Rename open_temporary_table() to open_table_uncached().
open_temporary_table() will be introduced in following patches
to open temporary tables for a statement.
to allow temp table operations) -- prerequisite patch #2.
Introduce a new form of find_temporary_table() function:
find_temporary_table() by a table key. It will be used
in further patches.
Replace find_temporary_table(table_list->db, table_list->name)
by more appropiate find_temporary_table(table_list) across
the codebase.
The ALTER PARTITION and SELECT seemed to be deadlocked
when having innodb_thread_concurrency = 1.
Problem was that there was unreleased latches
in the ALTER PARTITION thread which was needed
by the SELECT thread to be able to continue.
Solution was to release the latches by commit
before requesting upgrade to exclusive MDL lock.
Updated according to reviewers comments (3).
temp table
This patch introduces two key changes in the replication's behavior.
Firstly, it reverts part of BUG#51894 which puts any update to temporary tables
into the trx-cache. Now, updates to temporary tables are handled according to
the type of their engines as a regular table.
Secondly, an unsafe mixed statement, (i.e. a statement that access transactional
table as well non-transactional or temporary table, and writes to any of them),
are written into the trx-cache in order to minimize errors in the execution when
the statement logging format is in use.
Such changes has a direct impact on which statements are classified as unsafe
statements and thus part of BUG#53259 is reverted.
'CREATE TABLE IF NOT EXISTS ... SELECT' behaviour
BUG#47132, BUG#47442, BUG49494, BUG#23992 and BUG#48814 will disappear
automatically after the this patch.
BUG#55617 is fixed by this patch too.
This is the 5.5 part.
It implements:
- 'CREATE TABLE IF NOT EXISTS ... SELECT' statement will not insert
anything and binlog anything if the table already exists.
It only generate a warning that table already exists.
- A couple of test cases for the behavior changing.
locks on the table
Fixing the partitioning specifics after TRUNCATE TABLE in
bug-42643 was fixed.
Reorganize of code to decrease the size of the giant switch
in mysql_execute_command, and to prepare for future parser
reengineering. Moved code into Sql_statement objects.
Updated patch according to davi's review comments.
corruption on ADD PARTITION and LOCK TABLE
Bug#53770: Server crash at handler.cc:2076 on
LOAD DATA after timed out COALESCE PARTITION
5.5 fix for:
Bug#51042: REORGANIZE PARTITION can leave table in an
inconsistent state in case of crash
Needs to be back-ported to 5.1
5.5 fix for:
Bug#50418: DROP PARTITION does not interact with
transactions
Main problem was non-persistent operations done
before meta-data lock was taken (53770+53676).
And 53676 needed to keep the table/partitions opened and locked
while copying the data to the new partitions.
Also added thorough tests to spot some additional bugs
in the ddl_log code, which could result in bad state
between the .frm and partitions.
Collapsed patch, includes all fixes required from the reviewers.
A label statement needs to be followed by at least
one primary expression. If built without
WITH_PARTITION_STORAGE_ENGINE set, the block would
be empty.
Added ';' as a dummy statement to fix it.
Remove acquisition of LOCK_open around file system operations,
since such operations are now protected by metadata locks.
Rework table discovery algorithm to not require LOCK_open.
No new tests added since all MDL locking operations are covered
in lock.test and mdl_sync.test, and as long as these tests
pass despite the increased concurrency, consistency must be
unaffected.
the precursor patch for Bug#52044.
When passing the TABLE instance for invalidation to the
query cache, we didn't always have a valid share
(in case of error).
Make sure we invalidate the table using TABLE_LIST, not
TABLE, object.
This patch also fixes Bug#55452 "SET PASSWORD is
replicated twice in RBR mode".
The goal of this patch is to remove the release of
metadata locks from close_thread_tables().
This is necessary to not mistakenly release
the locks in the course of a multi-step
operation that involves multiple close_thread_tables()
or close_tables_for_reopen().
On the same token, move statement commit outside
close_thread_tables().
Other cleanups:
Cleanup COM_FIELD_LIST.
Don't call close_thread_tables() in COM_SHUTDOWN -- there
are no open tables there that can be closed (we leave
the locked tables mode in THD destructor, and this
close_thread_tables() won't leave it anyway).
Make open_and_lock_tables() and open_and_lock_tables_derived()
call close_thread_tables() upon failure.
Remove the calls to close_thread_tables() that are now
unnecessary.
Simplify the back off condition in Open_table_context.
Streamline metadata lock handling in LOCK TABLES
implementation.
Add asserts to ensure correct life cycle of
statement transaction in a session.
Remove a piece of dead code that has also become redundant
after the fix for Bug 37521.
Fix warnings flagged by the new warning option -Wunused-but-set-variable
that was added to GCC 4.6 and that is enabled by -Wunused and -Wall. The
option causes a warning whenever a local variable is assigned to but is
later unused. It also warns about meaningless pointer dereferences.
table with active trx
Essentially, the problem is that InnoDB does a implicit commit
when a cursor (table handler) is unlocked/closed, creating
a dissonance between the transaction state within the server
layer and the storage engine layer. Theoretically, a statement
transaction can encompass several table instances in a similar
manner to a multiple statement transaction, hence it does not
make sense to limit a statement transaction to the lifetime of
the table instances (cursors) used within it.
Since this particular instance of the problem is only triggerable
on 5.1 and is masked on 5.5 due 2PC being skipped (assertion is in
the prepare phase of a 2PC), the solution (which is less risky) is
to explicitly end the transaction before the cached table is unlock
on rename table.
The patch is to be null merged into trunk.
Essentially, the problem is that safemalloc is excruciatingly
slow as it checks all allocated blocks for overrun at each
memory management primitive, yielding a almost exponential
slowdown for the memory management functions (malloc, realloc,
free). The overrun check basically consists of verifying some
bytes of a block for certain magic keys, which catches some
simple forms of overrun. Another minor problem is violation
of aliasing rules and that its own internal list of blocks
is prone to corruption.
Another issue with safemalloc is rather the maintenance cost
as the tool has a significant impact on the server code.
Given the magnitude of memory debuggers available nowadays,
especially those that are provided with the platform malloc
implementation, maintenance of a in-house and largely obsolete
memory debugger becomes a burden that is not worth the effort
due to its slowness and lack of support for detecting more
common forms of heap corruption.
Since there are third-party tools that can provide the same
functionality at a lower or comparable performance cost, the
solution is to simply remove safemalloc. Third-party tools
can provide the same functionality at a lower or comparable
performance cost.
The removal of safemalloc also allows a simplification of the
malloc wrappers, removing quite a bit of kludge: redefinition
of my_malloc, my_free and the removal of the unused second
argument of my_free. Since free() always check whether the
supplied pointer is null, redudant checks are also removed.
Also, this patch adds unit testing for my_malloc and moves
my_realloc implementation into the same file as the other
memory allocation primitives.
This crash occured after ALTER TABLE was used on a temporary
transactional table locked by LOCK TABLES. Any later attempts to
execute LOCK/UNLOCK TABLES, caused the server to crash.
The reason for the crash was the list of locked tables would
end up having a pointer to a free'd table instance. This happened
because ALTER TABLE deleted the table without also removing the
table reference from the locked tables list.
This patch fixes the problem by making sure ALTER TABLE also
removes the table from the locked tables list.
Test case added to innodb_mysql.test.
value and NO_ZERO_DATE
The problem was that a older version of the error path for a
failed admin statement relied upon a few error conditions being
met in order to access a table handler, the first one being that
the table object pointer was not NULL. Probably due to chance,
in all cases a table object was closed but the reference wasn't
reset, the other conditions didn't evaluate to true. With the
addition of a new check on the error path, the handler started
being dereferenced whenever it was not reset to NULL, causing
problems for code paths which closed the table but didn't reset
the reference.
The solution is to reset the reference whenever a admin statement
fails and the tables are closed.
DROP TEMP TABLE
Cset: alfranio.correia@sun.com-20100420091043-4i6ouzozb34hvzhb
introduced a change that made drop temporary table to be always
logged if current statement log format was set to row. This is
fine. However, logging operations, for a "DROP TABLE" statement
in mysql_rm_table_part2, are not protected by first checking if
the mysql_bin_log is open before proceeding to the actual
logging. They only check the dont_log_query variable. This was
actually uncovered by the aforementioned cset and not introduced
by it.
We fix this by extending the condition used in the "if" that
wraps logging operations in mysql_rm_table_part2.
DATABASE with open HANDLER"
Remove LOCK_create_db, database name locks, and use metadata locks instead.
This exposes CREATE/DROP/ALTER DATABASE statements to the graph-based
deadlock detector in MDL, and paves the way for a safe, deadlock-free
implementation of RENAME DATABASE.
Database DDL statements will now take exclusive metadata locks on
the database name, while table/view/routine DDL statements take
intention exclusive locks on the database name. This prevents race
conditions between database DDL and table/view/routine DDL.
(e.g. DROP DATABASE with concurrent CREATE/ALTER/DROP TABLE)
By adding database name locks, this patch implements
WL#4450 "DDL locking: CREATE/DROP DATABASE must use database locks" and
WL#4985 "DDL locking: namespace/hierarchical locks".
The patch also changes code to use init_one_table() where appropriate.
The new lock_table_names() function requires TABLE_LIST::db_length to
be set correctly, and this is taken care of by init_one_table().
This patch also adds a simple template to help work with
the mysys HASH data structure.
Most of the patch was written by Konstantin Osipov.
Bug#47633 - assert in ha_myisammrg::info during OPTIMIZE
The server crashed on an attempt to optimize a MERGE table with
non-existent child table.
mysql_admin_table() relied on the table to be successfully open
if a table object had been allocated.
Changed code to check return value of the open function before
calling a handler:: function on it.
------------------------------------------------------------
revno: 3672
committer: lars-erik.bjork@sun.com
branch nick: 48067-mysql-6.0-codebase-bugfixing
timestamp: Mon 2009-10-26 13:51:43 +0100
message:
This is a patch for bug#48067
"A temp table with the same name as an existing table, makes drop
database fail"
When dropping the database, mysql_rm_known_files() reads the contents
of the database directory, and creates a TABLE_LIST object, for each
.frm file encountered. Temporary tables, however, are not associated
with any .frm file.
The list of tables to drop are passed to mysql_rm_table_part2().
This method prefers temporary tables over regular tables, so if
there is a temporary table with the same name as a regular, the
temporary is removed, leaving the regular table intact.
Regular tables are only deleted if there are no temporary tables
with the same name.
This fix ensures, that for all TABLE_LIST objects that are created
by mysql_rm_known_files(), 'open_type' is set to 'OT_BASE_ONLY', to
indicate that this is a regular table. In all cases in
mysql_rm_table_part2() where we prefer a temporary table to a
non-temporary table, we chek if 'open_type' equals 'OT_BASE_ONLY'.
strict aliasing violations.
One somewhat major source of strict-aliasing violations and
related warnings is the SQL_LIST structure. For example,
consider its member function `link_in_list` which takes
a pointer to pointer of type T (any type) as a pointer to
pointer to unsigned char. Dereferencing this pointer, which
is done to reset the next field, violates strict-aliasing
rules and might cause problems for surrounding code that
uses the next field of the object being added to the list.
The solution is to use templates to parametrize the SQL_LIST
structure in order to deference the pointers with compatible
types. As a side bonus, it becomes possible to remove quite
a few casts related to acessing data members of SQL_LIST.
make tdc_refresh_version an atomic counter".
To avoid orphaned TABLE_SHARE objects left in the
cache, make sure that wherever we set table->s->version
we take care of removing all unused table share objects
from the table cache.
Always set table->s->version under LOCK_open, to make sure
that no other connection sees an old value of the
version and adds the table to unused_tables list.
Add an assert to table_def_unuse_table() that we never
'unuse' a talbe of a share that has an old version.
With this patch, only three places are left in the code
that manipulate with table->s->version:
- tdc_remove_table(). In most cases we have an X mdl lock
in tdc_remove_table(), the two remaining cases when we
don't are 'FLUSH TABLE' and mysql_admin_table().
- sql_view.cc - a crude hack that needs a separate fix
- initial assignment from refresh_version in table.cc.
thd->open_tables"
thd->open_tables list is not normally accessed concurrently
except for one case: when the connection has open SQL
HANDLER tables, and we want to perform a DDL on the table,
we want to abort waits on MyISAM thr_lock of those connections
that prevent the DDL from proceeding, and iterate
over thd->open_tables list to find out the tables on which
the thread is waiting.
In 5.5 we mostly use deadlock detection and soft deadlock
prevention, as opposed to "hard" deadlock prevention
of 5.1, which would abort any transaction that
may cause a deadlock. The only remaining case when
neither deadlock detection nor deadlock prevention
is implemented in 5.5 is HANDLER SQL, where we use
old good thr_lock_abort() technique form 5.1.
Thus, replace use of LOCK_open to protect thd->open_tables
with thd->LOCK_ha_data (a lock protecting various session
private data).
This is a port of the work done for 5.5.4 for review
and inclusion into 5.5.5.
Conflicts:
Text conflict in mysql-test/r/archive.result
Contents conflict in mysql-test/r/innodb_bug38231.result
Text conflict in mysql-test/r/mdl_sync.result
Text conflict in mysql-test/suite/binlog/t/disabled.def
Text conflict in mysql-test/suite/rpl_ndb/r/rpl_ndb_binlog_format_errors.result
Text conflict in mysql-test/t/archive.test
Contents conflict in mysql-test/t/innodb_bug38231.test
Text conflict in mysql-test/t/mdl_sync.test
Text conflict in sql/sp_head.cc
Text conflict in sql/sql_show.cc
Text conflict in sql/table.cc
Text conflict in sql/table.h
transactional SELECT and ALTER TABLE ... REBUILD PARTITION".
Make open flags part of Open_table_context.
This allows to simplify some code and (in future)
enforce the invariant that we don't, say, request a back
off on the table when there is MYSQL_OPEN_IGNORE_FLUSH
flag.
is allowed on views (not documented, broken)".
Remove support of ALTER TABLE RENAME for views as:
a) this feature was not documented,
c) does not add any compatibility with other databases,
b) its implementation doesn't follow metadata locking
protocol by accessing .FRM without holding any
metadata lock,
c) its implementation complicates ALTER TABLE's code
by introducing yet another separate branch to it.
After this patch one can rename a view by using the
documented way - RENAME TABLE statement.