Added 'from_end' as extra parameter to Field::unpack() to detect wrong from data.
Change ha_archive::unpack_row() to detect wrong field lengths.
Replication code changed to detect wrong field information in events.
mysql-test/r/archive.result:
dded test case for lp:917689
sql/field.cc:
Added 'from_end' as extra parameter to Field::unpack() to detect wrong from data.
Removed not used 'unpack_key' functions.
sql/field.h:
Added 'from_end' as extra parameter to Field::unpack() to detect wrong from data.
Removed not used 'unpack_key' functions.
Removed some not needed unpack() functions.
sql/filesort.cc:
Added buffer end parameter to unpack_addon_fields()
sql/log_event.h:
Added end of buffer argument to unpack_row()
sql/log_event_old.cc:
Added end of buffer argument to unpack_row()
sql/log_event_old.h:
Added end of buffer argument to unpack_row()
sql/records.cc:
Added buffer end parameter to unpack_addon_fields()
sql/rpl_record.cc:
Added end of buffer argument to unpack_row()
Added detection of wrong field information in events
sql/rpl_record.h:
Added end of buffer argument to unpack_row()
sql/rpl_record_old.cc:
Added end of buffer argument to unpack_row()
Added detection of wrong field information in events
sql/rpl_record_old.h:
Added end of buffer argument to unpack_row()
sql/table.h:
Added buffer end parameter to unpack()
storage/archive/ha_archive.cc:
Change ha_archive::unpack_row() to detect wrong field lengths.
This fixes lp:917689
The result of materialization of the right part of an IN subquery predicate
is placed into a temporary table. Each row of the materialized table is
distinct. A unique key over all fields of the temporary table is defined and
created. It allows to perform key look-ups into the table.
The table created for a materialized subquery can be accessed by key as
any other table. The function best_access-path search for the best access
to join a table to a given partial join. With some where conditions this
function considers a possibility of a ref_or_null access. If such access
employs the unique key on the temporary table then when estimating
the cost this access the function tries to use the array rec_per_key. Yet,
such array is not built for this unique key. This causes a crash of the server.
Rows returned by the subquery that contain nulls don't have to be placed
into temporary table, as they cannot be match any row produced by the
left part of the subquery predicate. So all fields of the temporary table
can be defined as non-nullable. In this case any ref_or_null access
to the temporary table does not make any sense and it does not make sense
to estimate such an access.
The fix makes sure that the temporary table for a materialized IN subquery
is defined with columns that are all non-nullable. The also ensures that
any row with nulls returned by the subquery is not placed into the
temporary table.
Problem was that now we can merge derived table (subquery in the FROM clause).
Fix: in case of detected conflict and presence of derived table "over" the table which cased the conflict - try materialization strategy.
For single table update/insert added deep check of single tables (single_table_updatable()).
For multi-table view insert added additional check of target table (check_view_single_update).
Multi-update was correct.
Test suite for all cases added.
Bug#13011410 CRASH IN FILESORT CODE WITH GROUP BY/ROLLUP
The assert in 13580775 is visible in 5.6 only,
but shows that all versions are vulnerable.
13011410 crashes in all versions.
filesort tries to re-use the sort buffer between invocations in order to save
malloc/free overhead.
The fix for Bug 11748783 - 37359: FILESORT CAN BE MORE EFFICIENT.
added an assert that buffer properties (num_records, record_length) are
consistent between invocations. Indeed, they are not necessarily consistent.
Fix: re-allocate the sort buffer if properties change.
mysql-test/r/partition.result:
New tests.
mysql-test/t/partition.test:
New tests.
sql/filesort.cc:
If we already have allocated a sort buffer in a previous execution,
then verify that it is big enough for the current one.
sql/table.h:
Add sort_keys_size; Number of bytes allocated for the sort_keys buffer.
Bug#13011410 CRASH IN FILESORT CODE WITH GROUP BY/ROLLUP
The assert in 13580775 is visible in 5.6 only,
but shows that all versions are vulnerable.
13011410 crashes in all versions.
filesort tries to re-use the sort buffer between invocations in order to save
malloc/free overhead.
The fix for Bug 11748783 - 37359: FILESORT CAN BE MORE EFFICIENT.
added an assert that buffer properties (num_records, record_length) are
consistent between invocations. Indeed, they are not necessarily consistent.
Fix: re-allocate the sort buffer if properties change.
- Part 1 of the fix: for semi-join merged subqueries, calling child_join->optimize() until we're done with all
PS-lifetime optimizations in the parent.
sql/sql_insert.cc:
CREATE ... IF NOT EXISTS may do nothing, but
it is still not a failure. don't forget to my_ok it.
******
CREATE ... IF NOT EXISTS may do nothing, but
it is still not a failure. don't forget to my_ok it.
sql/sql_table.cc:
small cleanup
******
small cleanup
The following were missing in the patch for mwl106:
- KEY_PART_INFO::fieldnr were not set for generated keys to access
tmp tables storing the rows of materialized derived tables/views
- TABLE_SHARE::column_bitmap_size was not set for tmp tables storing
the rows of materialized derived tables/views.
These could cause crashes or memory overwrite.
UPDATED TWICE
For multi update it is not allowed to update a column
of a table if that table is accessed through multiple aliases
and either
1) the updated column is used as partitioning key
2) the updated column is part of the primary key
and the primary key is clustered
This check is done in unsafe_key_update().
The bug was that for case 2), it was checked whether
updated_column_number == table_share->primary_key
However, the primary_key variable is the index number of the
primary key, not a column number.
Prior to this bugfix, the first column was wrongly believed to be
the primary key. The columns covered by an index is found in
table->key_info[idx_number]->key_part. The bugfix is to check if
any of the columns in the keyparts of the primary key are
updated.
The user-visible effect is that for storage engines with
clustered primary key (e.g. InnoDB but not MyISAM) queries
like
"UPDATE t1 AS A JOIN t2 AS B SET A.primkey=..."
will now error with
"ERROR HY000: Primary key/partition key update is not allowed
since the table is updated both as 'A' and 'B'."
instead of
"ERROR 1032 (HY000): Can't find record in 't1_tb'"
even if primkey is not the first column in the table. This
was the intended behavior of bugfix 11764529.
mysql-test/r/multi_update.result:
Add test for bug#11882110
mysql-test/r/multi_update_innodb.result:
Add test for bug#11882110
mysql-test/t/multi_update.test:
Add test for bug#11882110
mysql-test/t/multi_update_innodb.test:
Add test for bug#11882110
sql/sql_update.cc:
unsafe_key_update() wrongly checked if the primary key index
number was the same as updated column number. Now it is checked
whether any of the columns making up the primary key is updated.
sql/table.h:
Fix comment on TABLE_SHARE::primary_key. Incorrect comment
was introduced by an earlier merge conflict (as per dlenev)
way of processing prepared statements:
- conversion subquery_predicate -> TABLE_LIST is once per-statement
- However, the code must take into account that materialized temptable
is dropped and re-created on each execution (the tricky part is that
at start of n-th EXECUTE we have TABLE_LIST object but not its TABLE object)
- IN-equality is injected into WHERE on every execution.
A lot of small fixes and new test cases.
client/mysqlbinlog.cc:
Cast removed
client/mysqltest.cc:
Added missing DBUG_RETURN
include/my_pthread.h:
set_timespec_time_nsec() now only takes one argument
mysql-test/t/date_formats.test:
Remove --disable_ps_protocl as now also ps supports microseconds
mysys/my_uuid.c:
Changed to use my_interval_timer() instead of my_getsystime()
mysys/waiting_threads.c:
Changed to use my_hrtime()
sql/field.h:
Added bool special_const_compare() for fields that may convert values before compare (like year)
sql/field_conv.cc:
Added test to get optimal copying of identical temporal values.
sql/item.cc:
Return that item_int is equal if it's positive, even if unsigned flag is different.
Fixed Item_cache_str::save_in_field() to have identical null check as other similar functions
Added proper NULL check to Item_cache_int::save_in_field()
sql/item_cmpfunc.cc:
Don't call convert_constant_item() if there is nothing that is worth converting.
Simplified test when years should be converted
sql/item_sum.cc:
Mark cache values in Item_sum_hybrid as not constants to ensure they are not replaced by other cache values in compare_datetime()
sql/item_timefunc.cc:
Changed sec_to_time() to take a my_decimal argument to ensure we don't loose any sub seconds.
Added Item_temporal_func::get_time() (This simplifies some things)
sql/mysql_priv.h:
Added Lazy_string_decimal()
sql/mysqld.cc:
Added my_decimal constants max_seconds_for_time_type, time_second_part_factor
sql/table.cc:
Changed expr_arena to be of type CONVENTIONAL_EXECUTION to ensure that we don't loose any items that are created by fix_fields()
sql/tztime.cc:
TIME_to_gmt_sec() now sets *in_dst_time_gap in case of errors
This is needed to be able to detect if timestamp is 0
storage/maria/lockman.c:
Changed from my_getsystime() to set_timespec_time_nsec()
storage/maria/ma_loghandler.c:
Changed from my_getsystime() to my_hrtime()
storage/maria/ma_recovery.c:
Changed from my_getsystime() to mmicrosecond_interval_timer()
storage/maria/unittest/trnman-t.c:
Changed from my_getsystime() to mmicrosecond_interval_timer()
storage/xtradb/handler/ha_innodb.cc:
Added support for new time,datetime and timestamp
unittest/mysys/thr_template.c:
my_getsystime() -> my_interval_timer()
unittest/mysys/waiting_threads-t.c:
my_getsystime() -> my_interval_timer()
Resolved all conflicts, bad merges and fixed a few minor bugs in the code.
Commented out the queries from multi_update, view, subselect_sj, func_str,
derived_view, view_grant that failed either with crashes in ps-protocol or
with wrong results.
The failures are clear indications of some bugs in the code and these bugs
are to be fixed.
UPDATED TWICE
For multi update it is not allowed to update a column
of a table if that table is accessed through multiple aliases
and either
1) the updated column is used as partitioning key
2) the updated column is part of the primary key
and the primary key is clustered
This check is done in unsafe_key_update().
The bug was that for case 2), it was checked whether
updated_column_number == table_share->primary_key
However, the primary_key variable is the index number of the
primary key, not a column number.
Prior to this bugfix, the first column was wrongly believed to be
the primary key. The columns covered by an index is found in
table->key_info[idx_number]->key_part. The bugfix is to check if
any of the columns in the keyparts of the primary key are
updated.
The user-visible effect is that for storage engines with
clustered primary key (e.g. InnoDB but not MyISAM) queries
like
"UPDATE t1 AS A JOIN t2 AS B SET A.primkey=..."
will now error with
"ERROR HY000: Primary key/partition key update is not allowed
since the table is updated both as 'A' and 'B'."
instead of
"ERROR 1032 (HY000): Can't find record in 't1_tb'"
even if primkey is not the first column in the table. This
was the intended behavior of bugfix 11764529.
(aka BUG#11766883)
- fix review comments
- Rewrite last usage of handler::get_tablespace_name to use
table->s->tablespace directly
- Remove(revert) the addition of default implementation for
handler::get_tablespace_name
- Add comments describing the new TABLE_SHARE members default_storage_media
and tablespace
- Fix usage of incorrect mask for column_format bits, i.e COLUMN_FORMAT_MASK
(aka BUG#11766883)
- fix review comments
- Rewrite last usage of handler::get_tablespace_name to use
table->s->tablespace directly
- Remove(revert) the addition of default implementation for
handler::get_tablespace_name
- Add comments describing the new TABLE_SHARE members default_storage_media
and tablespace
- Fix usage of incorrect mask for column_format bits, i.e COLUMN_FORMAT_MASK
- Add new "format section" in extra data segment with additional table and
column properties. This was originally introduced in 5.1.20 based MySQL Cluster
- Remove hardcoded STORAGE DISK for table and instead
output the real storage format used. Keep both TABLESPACE
and STORAGE inside same version guard.
- Implement default version of handler::get_tablespace_name() since tablespace
is now available in share and it's unnecessary for each handler to implement.
(the function could actually be removed totally now).
- Add test for combinations of TABLESPACE and STORAGE with CREATE TABLE
and ALTER TABLE
- Add test to show that 5.5 now can read a .frm file created by MySQL Cluster
7.0.22. Although it does not yet show the column level attributes, they are read.
- Add new "format section" in extra data segment with additional table and
column properties. This was originally introduced in 5.1.20 based MySQL Cluster
- Remove hardcoded STORAGE DISK for table and instead
output the real storage format used. Keep both TABLESPACE
and STORAGE inside same version guard.
- Implement default version of handler::get_tablespace_name() since tablespace
is now available in share and it's unnecessary for each handler to implement.
(the function could actually be removed totally now).
- Add test for combinations of TABLESPACE and STORAGE with CREATE TABLE
and ALTER TABLE
- Add test to show that 5.5 now can read a .frm file created by MySQL Cluster
7.0.22. Although it does not yet show the column level attributes, they are read.
Changed some String.ptr() -> String.c_ptr() for String that are not guaranteed to end with \0
Removed some c_ptr() usage from parameters to functions that takes ptr & length
Use preallocate buffers to avoid calling malloc() for most operations.
sql/event_db_repository.cc:
alias is now a String
sql/event_scheduler.cc:
c_ptr -> c_ptr_safe() to avoid warnings from valgrind.
sql/events.cc:
c_ptr -> c_ptr_safe() to avoid warnings from valgrind.
c_ptr -> ptr() as function takes ptr & length
sql/field.cc:
alias is now a String
sql/field.h:
alias is now a String
sql/ha_partition.cc:
alias is now a String
sql/handler.cc:
alias is now a String
ptr() -> c_ptr() as string is not guaranteed to be \0 terminated
sql/item.cc:
Store error parameter in separarte buffer to ensure correct error message
sql/item_func.cc:
ptr() -> c_ptr_safe() as string is not guaranteed to be \0 terminated
sql/item_sum.h:
Use my_strtod() instead of my_atof() to not have to make string \0 terminated
sql/lock.cc:
alias is now a String
sql/log.cc:
c_ptr() -> ptr() as function takes ptr & length
sql/log_event.cc:
c_ptr_quick() -> ptr() as we only want to get the pointer to String buffer
sql/opt_range.cc:
ptr() -> c_ptr() as string is not guaranteed to be \0 terminated
sql/opt_table_elimination.cc:
alias is now a String
sql/set_var.cc:
ptr() -> c_ptr() as string is not guaranteed to be \0 terminated
c_ptr() -> c_ptr_safe() to avoid warnings from valgrind.
c_ptr() -> ptr() as function takes ptr & length
Simplify some code.
sql/sp.cc:
c_ptr() -> ptr() as function takes ptr & length
sql/sp_rcontext.cc:
alias is now a String
sql/sql_base.cc:
alias is now a String.
Here we win a realloc() for most alias usage.
sql/sql_class.cc:
Use size descriptor for printf() to avoid accessing bytes outside of buffer
sql/sql_insert.cc:
Change allocation of TABLE as it's now contains a String
_ptr() -> ptr() as function takes ptr & length
sql/sql_load.cc:
Use preallocate buffers to avoid calling malloc() for most operations.
sql/sql_parse.cc:
Use c_ptr_safe() to ensure string is \0 terminated.
sql/sql_plugin.cc:
c_ptr_quick() -> ptr() as function takes ptr & length
sql/sql_select.cc:
alias is now a String
sql/sql_show.cc:
alias is now a String
sql/sql_string.h:
Added move() function to change who owns the string (owner does the free)
sql/sql_table.cc:
alias is now a String
c_ptr() -> c_ptr_safe() to avoid warnings from valgrind.
sql/sql_test.cc:
c_ptr() -> c_ptr_safe() to avoid warnings from valgrind.
alias is now a String
sql/sql_trigger.cc:
c_ptr() -> c_ptr_safe() to avoid warnings from valgrind.
Use field->init() to setup pointers to alias.
sql/sql_update.cc:
alias is now a String
sql/sql_view.cc:
ptr() -> c_ptr_safe() as string is not guaranteed to be \0 terminated
sql/sql_yacc.yy:
r() -> c_ptr() as string is not guaranteed to be \0 terminated
sql/table.cc:
alias is now a String
sql/table.h:
alias is now a String
storage/federatedx/ha_federatedx.cc:
Remove extra 1 byte alloc that is automaticly done by strmake()
Ensure that error message ends with \0
storage/maria/ha_maria.cc:
alias is now a String
storage/myisam/ha_myisam.cc:
alias is now a String
- Fixed some issues with partitions and connection_string, which also fixed lp:716890 "Pre- and post-recovery crash in Aria"
- Fixed wrong assert in Aria
Now need to merge with latest xtradb before pushing
sql/ha_partition.cc:
Ensure that m_ordered_rec_buffer is not freed before close.
sql/mysqld.cc:
Changed to use opt_stack_trace instead of opt_pstack.
Removed references to pstack
sql/partition_element.h:
Ensure that connect_string is initialized
storage/maria/ma_key_recover.c:
Fixed wrong assert
- Fix for MySQL BUG#52357 added NESTED_JOIN::is_fully_covered() which would
not take into account that MariaDB's table elimination could eliminate tables
from join plan (and so, from join nest).
Fixed the check in the function to compare post-table-elimination numbers.
- Removed files specific to compiling on OS/2
- Removed files specific to SCO Unix packaging
- Removed "libmysqld/copyright", text is included in documentation
- Removed LaTeX headers for NDB Doxygen documentation
- Removed obsolete NDB files
- Removed "mkisofs" binaries
- Removed the "cvs2cl.pl" script
- Changed a few GPL texts to use "program" instead of "library"
- Removed files specific to compiling on OS/2
- Removed files specific to SCO Unix packaging
- Removed "libmysqld/copyright", text is included in documentation
- Removed LaTeX headers for NDB Doxygen documentation
- Removed obsolete NDB files
- Removed "mkisofs" binaries
- Removed the "cvs2cl.pl" script
- Changed a few GPL texts to use "program" instead of "library"
Original revid: alexey.kopytov@sun.com-20100723115254-jjwmhq97b9wl932l
> Bug #54476: crash when group_concat and 'with rollup' in
> prepared statements
>
> Using GROUP_CONCAT() together with the WITH ROLLUP modifier
> could crash the server.
>
> The reason was a combination of several facts:
>
> 1. The Item_func_group_concat class stores pointers to ORDER
> objects representing the columns in the ORDER BY clause of
> GROUP_CONCAT().
>
> 2. find_order_in_list() called from
> Item_func_group_concat::setup() modifies the ORDER objects so
> that their 'item' member points to the arguments list
> allocated in the Item_func_group_concat constructor.
>
> 3. In some cases (e.g. in JOIN::rollup_make_fields) a copy of
> the original Item_func_group_concat object could be created by
> using the Item_func_group_concat::Item_func_group_concat(THD
> *thd, Item_func_group_concat *item) copy constructor. The
> latter essentially creates a shallow copy of the source
> object. Memory for the arguments array is allocated on
> thd->mem_root, but the pointers for arguments and ORDER are
> copied verbatim.
>
> What happens in the test case is that when executing the query
> for the first time, after a copy of the original
> Item_func_group_concat object has been created by
> JOIN::rollup_make_fields(), find_order_in_list() is called for
> this new object. It then resolves ORDER BY by modifying the
> ORDER objects so that they point to elements of the arguments
> array which is local to the cloned object. When thd->mem_root
> is freed upon completing the execution, pointers in the ORDER
> objects become invalid. Those ORDER objects, however, are also
> shared with the original Item_func_group_concat object which is
> preserved between executions of a prepared statement. So the
> first call to find_order_in_list() for the original object on
> the second execution tries to dereference an invalid pointer.
>
> The solution is to create copies of the ORDER objects when
> copying Item_func_group_concat to not leave any stale pointers
> in other instances with different lifecycles.
mysql-test/r/func_gconcat.result:
Test case for bug #54476.
mysql-test/t/func_gconcat.test:
Test case for bug #54476.
sql/item_sum.cc:
Copy the ORDER objects pointed to by the elements of the
'order' array in the copy constructor of
Item_func_group_concat.
sql/table.h:
Removed the unused 'item_copy' member of the ORDER class.
Original revid: alexey.kopytov@sun.com-20100723115254-jjwmhq97b9wl932l
> Bug #54476: crash when group_concat and 'with rollup' in
> prepared statements
>
> Using GROUP_CONCAT() together with the WITH ROLLUP modifier
> could crash the server.
>
> The reason was a combination of several facts:
>
> 1. The Item_func_group_concat class stores pointers to ORDER
> objects representing the columns in the ORDER BY clause of
> GROUP_CONCAT().
>
> 2. find_order_in_list() called from
> Item_func_group_concat::setup() modifies the ORDER objects so
> that their 'item' member points to the arguments list
> allocated in the Item_func_group_concat constructor.
>
> 3. In some cases (e.g. in JOIN::rollup_make_fields) a copy of
> the original Item_func_group_concat object could be created by
> using the Item_func_group_concat::Item_func_group_concat(THD
> *thd, Item_func_group_concat *item) copy constructor. The
> latter essentially creates a shallow copy of the source
> object. Memory for the arguments array is allocated on
> thd->mem_root, but the pointers for arguments and ORDER are
> copied verbatim.
>
> What happens in the test case is that when executing the query
> for the first time, after a copy of the original
> Item_func_group_concat object has been created by
> JOIN::rollup_make_fields(), find_order_in_list() is called for
> this new object. It then resolves ORDER BY by modifying the
> ORDER objects so that they point to elements of the arguments
> array which is local to the cloned object. When thd->mem_root
> is freed upon completing the execution, pointers in the ORDER
> objects become invalid. Those ORDER objects, however, are also
> shared with the original Item_func_group_concat object which is
> preserved between executions of a prepared statement. So the
> first call to find_order_in_list() for the original object on
> the second execution tries to dereference an invalid pointer.
>
> The solution is to create copies of the ORDER objects when
> copying Item_func_group_concat to not leave any stale pointers
> in other instances with different lifecycles.
- Fixed problem with oqgraph and 'make dist'
Note that after this merge we have a problem show in join_outer where we examine too many rows in one specific case (related to BUG#57024).
This will be fixed when mwl#128 is merged into 5.3.
- Changed TABLE->alias to String to get fewer reallocs when alias are used.
- Preallocate some buffers
Changed some String->c_ptr() -> String->ptr() when \0 is not needed.
Fixed wrong usage of String->ptr() when we need a \0 terminated string.
Use my_strtod() instead of my_atof() to avoid having to add \0 to string.
c_ptr() -> c_ptr_safe() to avoid warnings from valgrind.
zr
sql/event_db_repository.cc:
Update usage of TABLE->alias
sql/event_scheduler.cc:
c_ptr() -> c_ptr_safe()
sql/events.cc:
c_ptr() -> ptr() as \0 was not needed
sql/field.cc:
Update usage of TABLE->alias
sql/field.h:
Update usage of TABLE->alias
sql/ha_partition.cc:
Update usage of TABLE->alias
sql/handler.cc:
Update usage of TABLE->alias
Fixed wrong usage of str.ptr()
sql/item.cc:
Fixed error where code wrongly assumed string was \0 terminated.
sql/item_func.cc:
c_ptr() -> c_ptr_safe()
Update usage of TABLE->alias
sql/item_sum.h:
Use my_strtod() instead of my_atof() to avoid having to add \0 to string
sql/lock.cc:
Update usage of TABLE->alias
sql/log.cc:
c_ptr() -> ptr() as \0 was not needed
sql/log_event.cc:
c_ptr_quick() -> ptr() as \0 was not needed
sql/opt_range.cc:
ptr() -> c_ptr() as \0 is needed
sql/opt_subselect.cc:
Update usage of TABLE->alias
sql/opt_table_elimination.cc:
Update usage of TABLE->alias
sql/set_var.cc:
ptr() -> c_ptr() as \0 is needed
c_ptr() -> c_ptr_safe()
sql/sp.cc:
c_ptr() -> ptr() as \0 was not needed
sql/sp_rcontext.cc:
Update usage of TABLE->alias
sql/sql_base.cc:
Preallocate buffers
Update usage of TABLE->alias
sql/sql_class.cc:
Fix arguments to sprintf() to work even if string is not \0 terminated
sql/sql_insert.cc:
Update usage of TABLE->alias
c_ptr() -> ptr() as \0 was not needed
sql/sql_load.cc:
Preallocate buffers
Trivial optimizations
sql/sql_parse.cc:
Trivial optimization
sql/sql_plugin.cc:
c_ptr() -> ptr() as \0 was not needed
sql/sql_select.cc:
Update usage of TABLE->alias
sql/sql_show.cc:
Update usage of TABLE->alias
sql/sql_string.h:
Added move() function to move allocated memory from one object to another.
sql/sql_table.cc:
Update usage of TABLE->alias
c_ptr() -> c_ptr_safe()
sql/sql_test.cc:
ptr() -> c_ptr_safe()
sql/sql_trigger.cc:
Update usage of TABLE->alias
c_ptr() -> c_ptr_safe()
sql/sql_update.cc:
Update usage of TABLE->alias
sql/sql_view.cc:
ptr() -> c_ptr_safe()
sql/sql_yacc.yy:
ptr() -> c_ptr()
sql/table.cc:
Update usage of TABLE->alias
sql/table.h:
Changed TABLE->alias to String to get fewer reallocs when alias are used.
storage/federatedx/ha_federatedx.cc:
Use c_ptr_safe() to ensure strings are \0 terminated.
storage/maria/ha_maria.cc:
Update usage of TABLE->alias
storage/myisam/ha_myisam.cc:
Update usage of TABLE->alias
storage/xtradb/row/row0sel.c:
Ensure that null bits in record are properly reset.
(Old code didn't work as row_search_for_mysql() can be called twice while reading fields from one row.
bug #57006 "Deadlock between HANDLER and FLUSH TABLES WITH READ
LOCK" and bug #54673 "It takes too long to get readlock for
'FLUSH TABLES WITH READ LOCK'".
The first bug manifested itself as a deadlock which occurred
when a connection, which had some table open through HANDLER
statement, tried to update some data through DML statement
while another connection tried to execute FLUSH TABLES WITH
READ LOCK concurrently.
What happened was that FTWRL in the second connection managed
to perform first step of GRL acquisition and thus blocked all
upcoming DML. After that it started to wait for table open
through HANDLER statement to be flushed. When the first connection
tried to execute DML it has started to wait for GRL/the second
connection creating deadlock.
The second bug manifested itself as starvation of FLUSH TABLES
WITH READ LOCK statements in cases when there was a constant
stream of concurrent DML statements (in two or more
connections).
This has happened because requests for protection against GRL
which were acquired by DML statements were ignoring presence of
pending GRL and thus the latter was starved.
This patch solves both these problems by re-implementing GRL
using metadata locks.
Similar to the old implementation acquisition of GRL in new
implementation is two-step. During the first step we block
all concurrent DML and DDL statements by acquiring global S
metadata lock (each DML and DDL statement acquires global IX
lock for its duration). During the second step we block commits
by acquiring global S lock in COMMIT namespace (commit code
acquires global IX lock in this namespace).
Note that unlike in old implementation acquisition of
protection against GRL in DML and DDL is semi-automatic.
We assume that any statement which should be blocked by GRL
will either open and acquires write-lock on tables or acquires
metadata locks on objects it is going to modify. For any such
statement global IX metadata lock is automatically acquired
for its duration.
The first problem is solved because waits for GRL become
visible to deadlock detector in metadata locking subsystem
and thus deadlocks like one in the first bug become impossible.
The second problem is solved because global S locks which
are used for GRL implementation are given preference over
IX locks which are acquired by concurrent DML (and we can
switch to fair scheduling in future if needed).
Important change:
FTWRL/GRL no longer blocks DML and DDL on temporary tables.
Before this patch behavior was not consistent in this respect:
in some cases DML/DDL statements on temporary tables were
blocked while in others they were not. Since the main use cases
for FTWRL are various forms of backups and temporary tables are
not preserved during backups we have opted for consistently
allowing DML/DDL on temporary tables during FTWRL/GRL.
Important change:
This patch changes thread state names which are used when
DML/DDL of FTWRL is waiting for global read lock. It is now
either "Waiting for global read lock" or "Waiting for commit
lock" depending on the stage on which FTWRL is.
Incompatible change:
To solve deadlock in events code which was exposed by this
patch we have to replace LOCK_event_metadata mutex with
metadata locks on events. As result we have to prohibit
DDL on events under LOCK TABLES.
This patch also adds extensive test coverage for interaction
of DML/DDL and FTWRL.
Performance of new and old global read lock implementations
in sysbench tests were compared. There were no significant
difference between new and old implementations.
mysql-test/include/check_ftwrl_compatible.inc:
Added helper script which allows to check that a statement is
compatible with FLUSH TABLES WITH READ LOCK.
mysql-test/include/check_ftwrl_incompatible.inc:
Added helper script which allows to check that a statement is
incompatible with FLUSH TABLES WITH READ LOCK.
mysql-test/include/handler.inc:
Adjusted test case to the fact that now DROP TABLE closes
open HANDLERs for the table to be dropped before checking
if there active FTWRL in this connection.
mysql-test/include/wait_show_condition.inc:
Fixed small error in the timeout message. The correct name
of variable used as parameter for this script is "$condition"
and not "$wait_condition".
mysql-test/r/delayed.result:
Added test coverage for scenario which triggered assert in
metadata locking subsystem.
mysql-test/r/events_2.result:
Updated test results after prohibiting event DDL operations
under LOCK TABLES.
mysql-test/r/flush.result:
Added test coverage for bug #57006 "Deadlock between HANDLER
and FLUSH TABLES WITH READ LOCK".
mysql-test/r/flush_read_lock.result:
Added test coverage for various aspects of FLUSH TABLES WITH
READ LOCK functionality.
mysql-test/r/flush_read_lock_kill.result:
Adjusted test case after replacing custom global read lock
implementation with one based on metadata locks. Use new
debug_sync point. Do not disable concurrent inserts as now
InnoDB we always use InnoDB table.
mysql-test/r/handler_innodb.result:
Adjusted test case to the fact that now DROP TABLE closes
open HANDLERs for the table to be dropped before checking
if there active FTWRL in this connection.
mysql-test/r/handler_myisam.result:
Adjusted test case to the fact that now DROP TABLE closes
open HANDLERs for the table to be dropped before checking
if there active FTWRL in this connection.
mysql-test/r/mdl_sync.result:
Adjusted test case after replacing custom global read lock
implementation with one based on metadata locks. Replaced
usage of GRL-specific debug_sync's with appropriate sync
points in MDL subsystem.
mysql-test/suite/perfschema/r/dml_setup_instruments.result:
Updated test results after removing global
COND_global_read_lock condition variable.
mysql-test/suite/perfschema/r/func_file_io.result:
Ensure that this test doesn't affect subsequent tests.
At the end of its execution enable back P_S instrumentation
which this test disables at some point.
mysql-test/suite/perfschema/r/func_mutex.result:
Ensure that this test doesn't affect subsequent tests.
At the end of its execution enable back P_S instrumentation
which this test disables at some point.
mysql-test/suite/perfschema/r/global_read_lock.result:
Adjusted test case to take into account that new GRL
implementation is based on MDL.
mysql-test/suite/perfschema/r/server_init.result:
Adjusted test case after replacing custom global read
lock implementation with one based on MDL and replacing
LOCK_event_metadata mutex with metadata lock.
mysql-test/suite/perfschema/t/func_file_io.test:
Ensure that this test doesn't affect subsequent tests.
At the end of its execution enable back P_S instrumentation
which this test disables at some point.
mysql-test/suite/perfschema/t/func_mutex.test:
Ensure that this test doesn't affect subsequent tests.
At the end of its execution enable back P_S instrumentation
which this test disables at some point.
mysql-test/suite/perfschema/t/global_read_lock.test:
Adjusted test case to take into account that new GRL
implementation is based on MDL.
mysql-test/suite/perfschema/t/server_init.test:
Adjusted test case after replacing custom global read
lock implementation with one based on MDL and replacing
LOCK_event_metadata mutex with metadata lock.
mysql-test/suite/rpl/r/rpl_tmp_table_and_DDL.result:
Updated test results after prohibiting event DDL under
LOCK TABLES.
mysql-test/t/delayed.test:
Added test coverage for scenario which triggered assert in
metadata locking subsystem.
mysql-test/t/events_2.test:
Updated test case after prohibiting event DDL operations
under LOCK TABLES.
mysql-test/t/flush.test:
Added test coverage for bug #57006 "Deadlock between HANDLER
and FLUSH TABLES WITH READ LOCK".
mysql-test/t/flush_block_commit.test:
Adjusted test case after changing thread state name which
is used when COMMIT waits for FLUSH TABLES WITH READ LOCK
from "Waiting for release of readlock" to "Waiting for commit
lock".
mysql-test/t/flush_block_commit_notembedded.test:
Adjusted test case after changing thread state name which is
used when DML waits for FLUSH TABLES WITH READ LOCK. Now we
use "Waiting for global read lock" in this case.
mysql-test/t/flush_read_lock.test:
Added test coverage for various aspects of FLUSH TABLES WITH
READ LOCK functionality.
mysql-test/t/flush_read_lock_kill-master.opt:
We no longer need to use make_global_read_lock_block_commit_loop
debug tag in this test. Instead we rely on an appropriate
debug_sync point in MDL code.
mysql-test/t/flush_read_lock_kill.test:
Adjusted test case after replacing custom global read lock
implementation with one based on metadata locks. Use new
debug_sync point. Do not disable concurrent inserts as now
InnoDB we always use InnoDB table.
mysql-test/t/lock_multi.test:
Adjusted test case after changing thread state names which
are used when DML or DDL waits for FLUSH TABLES WITH READ
LOCK to "Waiting for global read lock".
mysql-test/t/mdl_sync.test:
Adjusted test case after replacing custom global read lock
implementation with one based on metadata locks. Replaced
usage of GRL-specific debug_sync's with appropriate sync
points in MDL subsystem. Updated thread state names which
are used when DDL waits for FTWRL.
mysql-test/t/trigger_notembedded.test:
Adjusted test case after changing thread state names which
are used when DML or DDL waits for FLUSH TABLES WITH READ
LOCK to "Waiting for global read lock".
sql/event_data_objects.cc:
Removed Event_queue_element::status/last_executed_changed
members and Event_queue_element::update_timing_fields()
method. We no longer use this class for updating mysql.events
once event is chosen for execution. Accesses to instances of
this class in scheduler thread require protection by
Event_queue::LOCK_event_queue mutex and we try to avoid
updating table while holding this lock.
sql/event_data_objects.h:
Removed Event_queue_element::status/last_executed_changed
members and Event_queue_element::update_timing_fields()
method. We no longer use this class for updating mysql.events
once event is chosen for execution. Accesses to instances of
this class in scheduler thread require protection by
Event_queue::LOCK_event_queue mutex and we try to avoid
updating table while holding this lock.
sql/event_db_repository.cc:
- Changed Event_db_repository methods to not release all
metadata locks once they are done updating mysql.events
table. This allows to keep metadata lock protecting
against GRL and lock protecting particular event around
until corresponding DDL statement is written to the binary
log.
- Removed logic for conditional update of "status" and
"last_executed" fields from update_timing_fields_for_event()
method. In the only case when this method is called now
"last_executed" is always modified and tracking change
of "status" is too much hassle.
sql/event_db_repository.h:
Removed logic for conditional update of "status" and
"last_executed" fields from Event_db_repository::
update_timing_fields_for_event() method.
In the only case when this method is called now "last_executed"
is always modified and tracking change of "status" field is
too much hassle.
sql/event_queue.cc:
Changed event scheduler code not to update mysql.events
table while holding Event_queue::LOCK_event_queue mutex.
Doing so led to a deadlock with a new GRL implementation.
This deadlock didn't occur with old implementation due to
fact that code acquiring protection against GRL ignored
pending GRL requests (which lead to GRL starvation).
One of goals of new implementation is to disallow GRL
starvation and so we have to solve problem with this
deadlock in a different way.
sql/events.cc:
Changed methods of Events class to acquire protection
against GRL while perfoming DDL statement and keep it
until statement is written to the binary log.
Unfortunately this step together with new GRL implementation
exposed deadlock involving Events::LOCK_event_metadata
and GRL. To solve it Events::LOCK_event_metadata mutex was
replaced with a metadata lock on event. As a side-effect
events DDL has to be prohibited under LOCK TABLES even in
cases when mysql.events table was explicitly locked for
write.
sql/events.h:
Replaced Events::LOCK_event_metadata mutex with a metadata
lock on event.
sql/ha_ndbcluster.cc:
Updated code after replacing custom global read lock
implementation with one based on MDL. Since MDL subsystem
should now be able to detect deadlocks involving metadata
locks and GRL there is no need for special handling of
active GRL.
sql/handler.cc:
Replaced custom implementation of global read lock with
one based on metadata locks. Consequently when doing
commit instead of calling method of Global_read_lock
class to acquire protection against GRL we simply acquire
IX in COMMIT namespace.
sql/lock.cc:
Replaced custom implementation of global read lock with
one based on metadata locks. This step allows to expose
wait for GRL to deadlock detector of MDL subsystem and
thus succesfully resolve deadlocks similar to one behind
bug #57006 "Deadlock between HANDLER and FLUSH TABLES
WITH READ LOCK". It also solves problem with GRL starvation
described in bug #54673 "It takes too long to get readlock
for 'FLUSH TABLES WITH READ LOCK'" since metadata locks used
by GRL give preference to FTWRL statement instead of DML
statements (if needed in future this can be changed to
fair scheduling).
Similar to old implementation of acquisition of GRL is
two-step. During the first step we block all concurrent
DML and DDL statements by acquiring global S metadata lock
(each DML and DDL statement acquires global IX lock for
its duration). During the second step we block commits by
acquiring global S lock in COMMIT namespace (commit code
acquires global IX lock in this namespace).
Note that unlike in old implementation acquisition of
protection against GRL in DML and DDL is semi-automatic.
We assume that any statement which should be blocked by GRL
will either open and acquires write-lock on tables or acquires
metadata locks on objects it is going to modify. For any such
statement global IX metadata lock is automatically acquired
for its duration.
To support this change:
- Global_read_lock::lock/unlock_global_read_lock and
make_global_read_lock_block_commit methods were changed
accordingly.
- Global_read_lock::wait_if_global_read_lock() and
start_waiting_global_read_lock() methods were dropped.
It is now responsibility of code acquiring metadata locks
opening tables to acquire protection against GRL by
explicitly taking global IX lock with statement duration.
- Global variables, mutex and condition variable used by
old implementation was removed.
- lock_routine_name() was changed to use statement duration for
its global IX lock. It was also renamed to lock_object_name()
as it now also used to take metadata locks on events.
- Global_read_lock::set_explicit_lock_duration() was added which
allows not to release locks used for GRL when leaving prelocked
mode.
sql/lock.h:
- Renamed lock_routine_name() to lock_object_name() and changed
its signature to allow its usage for events.
- Removed broadcast_refresh() function. It is no longer needed
with new GRL implementation.
sql/log_event.cc:
Release metadata locks with statement duration at the end
of processing legacy event for LOAD DATA. This ensures that
replication thread processing such event properly releases
its protection against global read lock.
sql/mdl.cc:
Changed MDL subsystem to support new MDL-based implementation
of global read lock.
Added COMMIT and EVENTS namespaces for metadata locks. Changed
thread state name for GLOBAL namespace to "Waiting for global
read lock".
Optimized MDL_map::find_or_insert() method to avoid taking
m_mutex mutex when looking up MDL_lock objects for GLOBAL
or COMMIT namespaces. We keep pre-created MDL_lock objects
for these namespaces around and simply return pointers to
these global objects when needed.
Changed MDL_lock/MDL_scoped_lock to properly handle
notification of insert delayed handler threads when FTWRL
takes global S lock.
Introduced concept of lock duration. In addition to locks with
transaction duration which work in the way which is similar to
how locks worked before (i.e. they are released at the end of
transaction), locks with statement and explicit duration were
introduced.
Locks with statement duration are automatically released at the
end of statement. Locks with explicit duration require explicit
release and obsolete concept of transactional sentinel.
* Changed MDL_request and MDL_ticket classes to support notion
of duration.
* Changed MDL_context to keep locks with different duration in
different lists. Changed code handling ticket list to take
this into account.
* Changed methods responsible for releasing locks to take into
account duration of tickets. Particularly public
MDL_context::release_lock() method now only can release
tickets with explicit duration (there is still internal
method which allows to specify duration). To release locks
with statement or transaction duration one have to use
release_statement/transactional_locks() methods.
* Concept of savepoint for MDL subsystem now has to take into
account locks with statement duration. Consequently
MDL_savepoint class was introduced and methods working with
savepoints were updated accordingly.
* Added methods which allow to set duration for one or all
locks in the context.
sql/mdl.h:
Changed MDL subsystem to support new MDL-based implementation
of global read lock.
Added COMMIT and EVENTS namespaces for metadata locks.
Introduced concept of lock duration. In addition to locks with
transaction duration which work in the way which is similar to
how locks worked before (i.e. they are released at the end of
transaction), locks with statement and explicit duration were
introduced.
Locks with statement duration are automatically released at the
end of statement. Locks with explicit duration require explicit
release and obsolete concept of transactional sentinel.
* Changed MDL_request and MDL_ticket classes to support notion
of duration.
* Changed MDL_context to keep locks with different duration in
different lists. Changed code handling ticket list to take
this into account.
* Changed methods responsible for releasing locks to take into
account duration of tickets. Particularly public
MDL_context::release_lock() method now only can release
tickets with explicit duration (there is still internal
method which allows to specify duration). To release locks
with statement or transaction duration one have to use
release_statement/transactional_locks() methods.
* Concept of savepoint for MDL subsystem now has to take into
account locks with statement duration. Consequently
MDL_savepoint class was introduced and methods working with
savepoints were updated accordingly.
* Added methods which allow to set duration for one or all
locks in the context.
sql/mysqld.cc:
Removed global mutex and condition variables which were used
by old implementation of GRL.
Also we no longer need to initialize Events::LOCK_event_metadata
mutex as it was replaced with metadata locks on events.
sql/mysqld.h:
Removed global variable, mutex and condition variables which
were used by old implementation of GRL.
sql/rpl_rli.cc:
When slave thread closes tables which were open for handling
of RBR events ensure that it releases global IX lock which
was acquired as protection against GRL.
sql/sp.cc:
Adjusted code to the new signature of lock_object/routine_name(),
to the fact that one now needs specify duration of lock when
initializing MDL_request and to the fact that savepoints for MDL
subsystem are now represented by MDL_savepoint class.
sql/sp_head.cc:
Ensure that statements in stored procedures release statement
metadata locks and thus release their protectiong against GRL
in proper moment in time.
Adjusted code to the fact that one now needs specify duration
of lock when initializing MDL_request.
sql/sql_admin.cc:
Adjusted code to the fact that one now needs specify duration
of lock when initializing MDL_request.
sql/sql_base.cc:
- Implemented support for new approach to acquiring protection
against global read lock. We no longer acquire such protection
explicitly on the basis of statement flags. Instead we always
rely on code which is responsible for acquiring metadata locks
on object to be changed acquiring this protection. This is
achieved by acquiring global IX metadata lock with statement
duration. Code doing this also responsible for checking that
current connection has no active GRL by calling an
Global_read_lock::can_acquire_protection() method.
Changed code in open_table() and lock_table_names()
accordingly.
Note that as result of this change DDL and DML on temporary
tables is always compatible with GRL (before it was
incompatible in some cases and compatible in other cases).
- To speed-up code acquiring protection against GRL introduced
m_has_protection_against_grl member in Open_table_context
class. It indicates that protection was already acquired
sometime during open_tables() execution and new attempts
can be skipped.
- Thanks to new GRL implementation calls to broadcast_refresh()
became unnecessary and were removed.
- Adjusted code to the fact that one now needs specify duration
of lock when initializing MDL_request and to the fact that
savepoints for MDL subsystem are now represented by
MDL_savepoint class.
sql/sql_base.h:
Adjusted code to the fact that savepoints for MDL subsystem are
now represented by MDL_savepoint class.
Also introduced Open_table_context::m_has_protection_against_grl
member which allows to avoid acquiring protection against GRL
while opening tables if such protection was already acquired.
sql/sql_class.cc:
Changed THD::leave_locked_tables_mode() after transactional
sentinel for metadata locks was obsoleted by introduction of
locks with explicit duration.
sql/sql_class.h:
- Adjusted code to the fact that savepoints for MDL subsystem
are now represented by MDL_savepoint class.
- Changed Global_read_lock class according to changes in
global read lock implementation:
* wait_if_global_read_lock and start_waiting_global_read_lock
are now gone. Instead code needing protection against GRL
has to acquire global IX metadata lock with statement
duration itself. To help it new can_acquire_protection()
was introduced. Also as result of the above change
m_protection_count member is gone too.
* Added m_mdl_blocks_commits_lock member to store metadata
lock blocking commits.
* Adjusted code to the fact that concept of transactional
sentinel was obsoleted by concept of lock duration.
- Removed CF_PROTECT_AGAINST_GRL flag as it is no longer
necessary. New GRL implementation acquires protection
against global read lock automagically when statement
acquires metadata locks on tables or other objects it
is going to change.
sql/sql_db.cc:
Adjusted code to the fact that one now needs specify duration
of lock when initializing MDL_request.
sql/sql_handler.cc:
Removed call to broadcast_refresh() function. It is no longer
needed with new GRL implementation.
Adjusted code after introducing duration concept for metadata
locks. Particularly to the fact transactional sentinel was
replaced with explicit duration.
sql/sql_handler.h:
Renamed mysql_ha_move_tickets_after_trans_sentinel() to
mysql_ha_set_explicit_lock_duration() after transactional
sentinel was obsoleted by locks with explicit duration.
sql/sql_insert.cc:
Adjusted code handling delaying inserts after switching to
new GRL implementation. Now connection thread initiating
delayed insert has to acquire global IX lock in addition
to metadata lock on table being inserted into. This IX lock
protects against GRL and similarly to SW lock on table being
inserted into has to be passed to handler thread in order to
avoid deadlocks.
sql/sql_lex.cc:
LEX::protect_against_global_read_lock member is no longer
necessary since protection against GRL is automatically
taken by code acquiring metadata locks/opening tables.
sql/sql_lex.h:
LEX::protect_against_global_read_lock member is no longer
necessary since protection against GRL is automatically
taken by code acquiring metadata locks/opening tables.
sql/sql_parse.cc:
- Implemented support for new approach to acquiring protection
against global read lock. We no longer acquire such protection
explicitly on the basis of statement flags. Instead we always
rely on code which is responsible for acquiring metadata locks
on object to be changed acquiring this protection. This is
achieved by acquiring global IX metadata lock with statement
duration. This lock is automatically released at the end of
statement execution.
- Changed implementation of CREATE/DROP PROCEDURE/FUNCTION not
to release metadata locks and thus protection against of GRL
in the middle of statement execution.
- Adjusted code to the fact that one now needs specify duration
of lock when initializing MDL_request and to the fact that
savepoints for MDL subsystem are now represented by
MDL_savepoint class.
sql/sql_prepare.cc:
Adjusted code to the to the fact that savepoints for MDL
subsystem are now represented by MDL_savepoint class.
sql/sql_rename.cc:
With new GRL implementation there is no need to explicitly
acquire protection against GRL before renaming tables.
This happens automatically in code which acquires metadata
locks on tables being renamed.
sql/sql_show.cc:
Adjusted code to the fact that one now needs specify duration
of lock when initializing MDL_request and to the fact that
savepoints for MDL subsystem are now represented by
MDL_savepoint class.
sql/sql_table.cc:
- With new GRL implementation there is no need to explicitly
acquire protection against GRL before dropping tables.
This happens automatically in code which acquires metadata
locks on tables being dropped.
- Changed mysql_alter_table() not to release lock on new table
name explicitly and to rely on automatic release of locks
at the end of statement instead. This was necessary since
now MDL_context::release_lock() is supported only for locks
for explicit duration.
sql/sql_trigger.cc:
With new GRL implementation there is no need to explicitly
acquire protection against GRL before changing table triggers.
This happens automatically in code which acquires metadata
locks on tables which triggers are to be changed.
sql/sql_update.cc:
Fix bug exposed by GRL testing. During prepare phase acquire
only S metadata locks instead of SW locks to keep prepare of
multi-UPDATE compatible with concurrent LOCK TABLES WRITE
and global read lock.
sql/sql_view.cc:
With new GRL implementation there is no need to explicitly
acquire protection against GRL before creating view.
This happens automatically in code which acquires metadata
lock on view to be created.
sql/sql_yacc.yy:
LEX::protect_against_global_read_lock member is no longer
necessary since protection against GRL is automatically
taken by code acquiring metadata locks/opening tables.
sql/table.cc:
Adjusted code to the fact that one now needs specify duration
of lock when initializing MDL_request.
sql/table.h:
Adjusted code to the fact that one now needs specify duration
of lock when initializing MDL_request.
sql/transaction.cc:
Replaced custom implementation of global read lock with
one based on metadata locks. Consequently when doing
commit instead of calling method of Global_read_lock
class to acquire protection against GRL we simply acquire
IX in COMMIT namespace.
Also adjusted code to the fact that MDL savepoint is now
represented by MDL_savepoint class.
bug #57006 "Deadlock between HANDLER and FLUSH TABLES WITH READ
LOCK" and bug #54673 "It takes too long to get readlock for
'FLUSH TABLES WITH READ LOCK'".
The first bug manifested itself as a deadlock which occurred
when a connection, which had some table open through HANDLER
statement, tried to update some data through DML statement
while another connection tried to execute FLUSH TABLES WITH
READ LOCK concurrently.
What happened was that FTWRL in the second connection managed
to perform first step of GRL acquisition and thus blocked all
upcoming DML. After that it started to wait for table open
through HANDLER statement to be flushed. When the first connection
tried to execute DML it has started to wait for GRL/the second
connection creating deadlock.
The second bug manifested itself as starvation of FLUSH TABLES
WITH READ LOCK statements in cases when there was a constant
stream of concurrent DML statements (in two or more
connections).
This has happened because requests for protection against GRL
which were acquired by DML statements were ignoring presence of
pending GRL and thus the latter was starved.
This patch solves both these problems by re-implementing GRL
using metadata locks.
Similar to the old implementation acquisition of GRL in new
implementation is two-step. During the first step we block
all concurrent DML and DDL statements by acquiring global S
metadata lock (each DML and DDL statement acquires global IX
lock for its duration). During the second step we block commits
by acquiring global S lock in COMMIT namespace (commit code
acquires global IX lock in this namespace).
Note that unlike in old implementation acquisition of
protection against GRL in DML and DDL is semi-automatic.
We assume that any statement which should be blocked by GRL
will either open and acquires write-lock on tables or acquires
metadata locks on objects it is going to modify. For any such
statement global IX metadata lock is automatically acquired
for its duration.
The first problem is solved because waits for GRL become
visible to deadlock detector in metadata locking subsystem
and thus deadlocks like one in the first bug become impossible.
The second problem is solved because global S locks which
are used for GRL implementation are given preference over
IX locks which are acquired by concurrent DML (and we can
switch to fair scheduling in future if needed).
Important change:
FTWRL/GRL no longer blocks DML and DDL on temporary tables.
Before this patch behavior was not consistent in this respect:
in some cases DML/DDL statements on temporary tables were
blocked while in others they were not. Since the main use cases
for FTWRL are various forms of backups and temporary tables are
not preserved during backups we have opted for consistently
allowing DML/DDL on temporary tables during FTWRL/GRL.
Important change:
This patch changes thread state names which are used when
DML/DDL of FTWRL is waiting for global read lock. It is now
either "Waiting for global read lock" or "Waiting for commit
lock" depending on the stage on which FTWRL is.
Incompatible change:
To solve deadlock in events code which was exposed by this
patch we have to replace LOCK_event_metadata mutex with
metadata locks on events. As result we have to prohibit
DDL on events under LOCK TABLES.
This patch also adds extensive test coverage for interaction
of DML/DDL and FTWRL.
Performance of new and old global read lock implementations
in sysbench tests were compared. There were no significant
difference between new and old implementations.
Bug#54678: InnoDB, TRUNCATE, ALTER, I_S SELECT, crash or deadlock
- Incompatible change: truncate no longer resorts to a row by
row delete if the storage engine does not support the truncate
method. Consequently, the count of affected rows does not, in
any case, reflect the actual number of rows.
- Incompatible change: it is no longer possible to truncate a
table that participates as a parent in a foreign key constraint,
unless it is a self-referencing constraint (both parent and child
are in the same table). To work around this incompatible change
and still be able to truncate such tables, disable foreign checks
with SET foreign_key_checks=0 before truncate. Alternatively, if
foreign key checks are necessary, please use a DELETE statement
without a WHERE condition.
Problem description:
The problem was that for storage engines that do not support
truncate table via a external drop and recreate, such as InnoDB
which implements truncate via a internal drop and recreate, the
delete_all_rows method could be invoked with a shared metadata
lock, causing problems if the engine needed exclusive access
to some internal metadata. This problem originated with the
fact that there is no truncate specific handler method, which
ended up leading to a abuse of the delete_all_rows method that
is primarily used for delete operations without a condition.
Solution:
The solution is to introduce a truncate handler method that is
invoked when the engine does not support truncation via a table
drop and recreate. This method is invoked under a exclusive
metadata lock, so that there is only a single instance of the
table when the method is invoked.
Also, the method is not invoked and a error is thrown if
the table is a parent in a non-self-referencing foreign key
relationship. This was necessary to avoid inconsistency as
some integrity checks are bypassed. This is inline with the
fact that truncate is primarily a DDL operation that was
designed to quickly remove all data from a table.
mysql-test/suite/innodb/t/innodb-truncate.test:
Add test cases for truncate and foreign key checks.
Also test that InnoDB resets auto-increment on truncate.
mysql-test/suite/innodb/t/innodb.test:
FK is not necessary, test is related to auto-increment.
Update error number, truncate is no longer invoked if
table is parent in a FK relationship.
mysql-test/suite/innodb/t/innodb_mysql.test:
Update error number, truncate is no longer invoked if
table is parent in a FK relationship.
Use delete instead of truncate, test is used to check
the interaction of FKs, triggers and delete.
mysql-test/suite/parts/inc/partition_check.inc:
Fix typo.
mysql-test/suite/sys_vars/t/foreign_key_checks_func.test:
Update error number, truncate is no longer invoked if
table is parent in a FK relationship.
mysql-test/t/mdl_sync.test:
Modify test case to reflect and ensure that truncate takes
a exclusive metadata lock.
mysql-test/t/trigger-trans.test:
Update error number, truncate is no longer invoked if
table is parent in a FK relationship.
sql/ha_partition.cc:
Reorganize the various truncate methods. delete_all_rows is now
passed directly to the underlying engines, so as truncate. The
code responsible for truncating individual partitions is moved
to ha_partition::truncate_partition, which is invoked when a
ALTER TABLE t1 TRUNCATE PARTITION p statement is executed.
Since the partition truncate no longer can be invoked via
delete, the bitmap operations are not necessary anymore. The
explicit reset of the auto-increment value is also removed
as the underlying engines are now responsible for reseting
the value.
sql/handler.cc:
Wire up the handler truncate method.
sql/handler.h:
Introduce and document the truncate handler method. It assumes
certain use cases of delete_all_rows.
Add method to retrieve the list of foreign keys referencing a
table. Method is used to avoid truncating tables that are
parent in a foreign key relationship.
sql/share/errmsg-utf8.txt:
Add error message for truncate and FK.
sql/sql_lex.h:
Introduce a flag so that the partition engine can detect when
a partition is being truncated. Used to give a special error.
sql/sql_parse.cc:
Function mysql_truncate_table no longer exists.
sql/sql_partition_admin.cc:
Implement the TRUNCATE PARTITION statement.
sql/sql_truncate.cc:
Change the truncate table implementation to use the new truncate
handler method and to not rely on row-by-row delete anymore.
The truncate handler method is always invoked with a exclusive
metadata lock. Also, it is no longer possible to truncate a
table that is parent in some non-self-referencing foreign key.
storage/archive/ha_archive.cc:
Rename method as the description indicates that in the future
this could be a truncate operation.
storage/blackhole/ha_blackhole.cc:
Implement truncate as no operation for the blackhole engine in
order to remain compatible with older releases.
storage/federated/ha_federated.cc:
Introduce truncate method that invokes delete_all_rows.
This is required to support partition truncate as this
form of truncate does not implement the drop and recreate
protocol.
storage/heap/ha_heap.cc:
Introduce truncate method that invokes delete_all_rows.
This is required to support partition truncate as this
form of truncate does not implement the drop and recreate
protocol.
storage/ibmdb2i/ha_ibmdb2i.cc:
Introduce truncate method that invokes delete_all_rows.
This is required to support partition truncate as this
form of truncate does not implement the drop and recreate
protocol.
storage/innobase/handler/ha_innodb.cc:
Rename delete_all_rows to truncate. InnoDB now does truncate
under a exclusive metadata lock.
Introduce and reorganize methods used to retrieve the list
of foreign keys referenced by a or referencing a table.
storage/myisammrg/ha_myisammrg.cc:
Introduce truncate method that invokes delete_all_rows.
This is required in order to remain compatible with earlier
releases where truncate would resort to a row-by-row delete.
Bug#54678: InnoDB, TRUNCATE, ALTER, I_S SELECT, crash or deadlock
- Incompatible change: truncate no longer resorts to a row by
row delete if the storage engine does not support the truncate
method. Consequently, the count of affected rows does not, in
any case, reflect the actual number of rows.
- Incompatible change: it is no longer possible to truncate a
table that participates as a parent in a foreign key constraint,
unless it is a self-referencing constraint (both parent and child
are in the same table). To work around this incompatible change
and still be able to truncate such tables, disable foreign checks
with SET foreign_key_checks=0 before truncate. Alternatively, if
foreign key checks are necessary, please use a DELETE statement
without a WHERE condition.
Problem description:
The problem was that for storage engines that do not support
truncate table via a external drop and recreate, such as InnoDB
which implements truncate via a internal drop and recreate, the
delete_all_rows method could be invoked with a shared metadata
lock, causing problems if the engine needed exclusive access
to some internal metadata. This problem originated with the
fact that there is no truncate specific handler method, which
ended up leading to a abuse of the delete_all_rows method that
is primarily used for delete operations without a condition.
Solution:
The solution is to introduce a truncate handler method that is
invoked when the engine does not support truncation via a table
drop and recreate. This method is invoked under a exclusive
metadata lock, so that there is only a single instance of the
table when the method is invoked.
Also, the method is not invoked and a error is thrown if
the table is a parent in a non-self-referencing foreign key
relationship. This was necessary to avoid inconsistency as
some integrity checks are bypassed. This is inline with the
fact that truncate is primarily a DDL operation that was
designed to quickly remove all data from a table.
LOAD DATA into partitioned MyISAM table
Problem was that both partitioning and myisam
used the same table_share->mutex for different protections
(auto inc and repair).
Solved by adding a specific mutex for the partitioning
auto_increment.
Also adding destroying the ha_data structure in
free_table_share (which is to be propagated
into 5.5).
This is a 5.1 ONLY patch, already fixed in 5.5+.
LOAD DATA into partitioned MyISAM table
Problem was that both partitioning and myisam
used the same table_share->mutex for different protections
(auto inc and repair).
Solved by adding a specific mutex for the partitioning
auto_increment.
Also adding destroying the ha_data structure in
free_table_share (which is to be propagated
into 5.5).
This is a 5.1 ONLY patch, already fixed in 5.5+.
Fix some bugs where we stored values other than 0 or 1 in my_bool
Fixed some compiler warnings
client/mysql.cc:
Changed interrupted_query from my_bool to int, as we stored 2 in it.
client/mysqladmin.cc:
Changed return variable type to same type as function value type
client/mysqltest.cc:
Changed 'found' to int as we store other values than 0 or 1 into it
Changed type for parameter of set_reconnect() to match usage.
extra/libevent/evbuffer.c:
Added __attribute__((unused))
extra/libevent/event.c:
Added __attribute__((unused))
extra/libevent/signal.c:
Added __attribute__((unused))
sql/event_data_objects.h:
my_bool -> bool
sql/event_db_repository.cc:
my_bool -> bool
sql/event_db_repository.h:
my_bool -> bool
sql/event_parse_data.h:
my_bool -> bool
sql/events.cc:
my_bool -> bool
sql/events.h:
my_bool -> bool
sql/field.cc:
my_bool -> bool
sql/field.h:
my_bool -> bool
sql/hash_filo.h:
my_bool -> bool
sql/item.cc:
my_bool -> bool
sql/item.h:
my_bool -> bool
sql/item_cmpfunc.h:
my_bool -> bool
Changed result_for_null_param from my_bool to int as we stored -1 in it.
sql/item_func.cc:
my_bool -> bool
Modified udf wrapper functions so that the UDF functions would continue to use my_bool. (To keep compatibility with UDF:s)
sql/item_func.h:
my_bool -> bool
sql/item_subselect.h:
my_bool -> bool
sql/item_sum.cc:
Modified udf wrapper functions so that the UDF functions would continue to use my_bool. (To keep compatibility with UDF:s)
sql/parse_file.h:
my_bool -> bool
sql/rpl_mi.h:
my_bool -> bool
sql/sp_rcontext.h:
my_bool -> bool
sql/sql_analyse.h:
my_bool -> bool
sql/sql_base.cc:
Change some assignments so that we don't initialize bool variables with int's.
sql/sql_bitmap.h:
my_bool -> bool
sql/sql_cache.cc:
my_bool -> bool
sql/sql_cache.h:
my_bool -> bool
sql/sql_class.h:
my_bool -> bool
sql/sql_insert.cc:
Change some assignments so that we don't initialize bool variables with int's.
sql/sql_prepare.cc:
my_bool -> bool
sql/table.h:
my_bool -> bool
storage/maria/ma_check.c:
Removed duplicate assignment
strings/decimal.c:
Fixed wrong variable usage.
Don't do complex arithmetic on bool when simple works.
- Changed to still use bcmp() in certain cases becasue
- Faster for short unaligneed strings than memcmp()
- Bettern when using valgrind
- Changed to use my_sprintf() instead of sprintf() to get higher portability for old systems
- Changed code to use MariaDB version of select->skip_record()
- Removed -%::SCCS/s.% from Makefile.am:s to remove automake warnings
to allow temp table operations) -- prerequisite patch #1.
Move a piece of code that initialiazes TABLE instance
after it was successfully opened into a separate function.
This function will be reused in the following patches.
to allow temp table operations) -- prerequisite patch #1.
Move a piece of code that initialiazes TABLE instance
after it was successfully opened into a separate function.
This function will be reused in the following patches.
'CREATE TABLE IF NOT EXISTS ... SELECT' behaviour
BUG#47132, BUG#47442, BUG49494, BUG#23992 and BUG#48814 will disappear
automatically after the this patch.
BUG#55617 is fixed by this patch too.
This is the 5.5 part.
It implements:
- 'CREATE TABLE IF NOT EXISTS ... SELECT' statement will not insert
anything and binlog anything if the table already exists.
It only generate a warning that table already exists.
- A couple of test cases for the behavior changing.
'CREATE TABLE IF NOT EXISTS ... SELECT' behaviour
BUG#47132, BUG#47442, BUG49494, BUG#23992 and BUG#48814 will disappear
automatically after the this patch.
BUG#55617 is fixed by this patch too.
This is the 5.5 part.
It implements:
- 'CREATE TABLE IF NOT EXISTS ... SELECT' statement will not insert
anything and binlog anything if the table already exists.
It only generate a warning that table already exists.
- A couple of test cases for the behavior changing.
corruption on ADD PARTITION and LOCK TABLE
Bug#53770: Server crash at handler.cc:2076 on
LOAD DATA after timed out COALESCE PARTITION
5.5 fix for:
Bug#51042: REORGANIZE PARTITION can leave table in an
inconsistent state in case of crash
Needs to be back-ported to 5.1
5.5 fix for:
Bug#50418: DROP PARTITION does not interact with
transactions
Main problem was non-persistent operations done
before meta-data lock was taken (53770+53676).
And 53676 needed to keep the table/partitions opened and locked
while copying the data to the new partitions.
Also added thorough tests to spot some additional bugs
in the ddl_log code, which could result in bad state
between the .frm and partitions.
Collapsed patch, includes all fixes required from the reviewers.
mysql-test/r/partition_innodb.result:
updated result with new test
mysql-test/suite/parts/inc/partition_crash.inc:
crash test include file
mysql-test/suite/parts/inc/partition_crash_add.inc:
test all states in fast_alter_partition_table
ADD PARTITION branch
mysql-test/suite/parts/inc/partition_crash_change.inc:
test all states in fast_alter_partition_table
CHANGE PARTITION branch
mysql-test/suite/parts/inc/partition_crash_drop.inc:
test all states in fast_alter_partition_table
DROP PARTITION branch
mysql-test/suite/parts/inc/partition_fail.inc:
recovery test including injecting errors
mysql-test/suite/parts/inc/partition_fail_add.inc:
test all states in fast_alter_partition_table
ADD PARTITION branch
mysql-test/suite/parts/inc/partition_fail_change.inc:
test all states in fast_alter_partition_table
CHANGE PARTITION branch
mysql-test/suite/parts/inc/partition_fail_drop.inc:
test all states in fast_alter_partition_table
DROP PARTITION branch
mysql-test/suite/parts/inc/partition_mgm_crash.inc:
include file that runs all crash and failure injection tests.
mysql-test/suite/parts/r/partition_debug_innodb.result:
new test result file
mysql-test/suite/parts/r/partition_debug_myisam.result:
new test result file
mysql-test/suite/parts/r/partition_special_innodb.result:
updated result
mysql-test/suite/parts/r/partition_special_myisam.result:
updated result
mysql-test/suite/parts/t/partition_debug_innodb-master.opt:
opt file for using with crashing tests of partitioned innodb
mysql-test/suite/parts/t/partition_debug_innodb.test:
partitioned innodb test that require debug builds
mysql-test/suite/parts/t/partition_debug_myisam-master.opt:
opt file for using with crashing tests of partitioned myisam
mysql-test/suite/parts/t/partition_debug_myisam.test:
partitioned myisam test that require debug builds
mysql-test/suite/parts/t/partition_special_innodb-master.opt:
added innodb-file-per-table to easier verify partition status on disk
mysql-test/suite/parts/t/partition_special_innodb.test:
added test case
mysql-test/suite/parts/t/partition_special_myisam.test:
added test case
mysql-test/t/partition_innodb.test:
added test case
sql/sql_base.cc:
Moved alter_close_tables to sql_partition.cc
sql/sql_base.h:
removed some non existing and duplicated functions.
sql/sql_partition.cc:
fast_alter_partition_table:
Spletted abort_and_upgrad_lock_and_close_table
to its parts (wait_while_table_is_used and
alter_close_tables) and always have
wait_while_table_is_used before any persistent
operations (including logs, which will be executed
on failure) and alter_close_tables after
create/read/write operations and before
drop operations.
moved alter_close_tables here from sql_base.cc
Added error injections for better test coverage.
write_log_final_change_partition:
fixed a log_entry linking bug (delete_frm was not
linked to change/drop partition)
and drop partition must be executed before
change partition (change partition can rename a
partition to an old name, like REORG p1 INTO (p1,p2).
write_log_add_change_partition:
need to use drop_frm first, and relinking that entry
and reusing its execute entry.
sql/sql_table.cc:
added initialization of next_active_log_entry.
sql/table.h:
removed a duplicate declaration.
corruption on ADD PARTITION and LOCK TABLE
Bug#53770: Server crash at handler.cc:2076 on
LOAD DATA after timed out COALESCE PARTITION
5.5 fix for:
Bug#51042: REORGANIZE PARTITION can leave table in an
inconsistent state in case of crash
Needs to be back-ported to 5.1
5.5 fix for:
Bug#50418: DROP PARTITION does not interact with
transactions
Main problem was non-persistent operations done
before meta-data lock was taken (53770+53676).
And 53676 needed to keep the table/partitions opened and locked
while copying the data to the new partitions.
Also added thorough tests to spot some additional bugs
in the ddl_log code, which could result in bad state
between the .frm and partitions.
Collapsed patch, includes all fixes required from the reviewers.
This was triggered by innodb.innodb_multi_update, where we had a static length row without nulls and xtradb didn't fill in the delete-marker byte
include/my_bitmap.h:
Added prototype for bitmap_union_is_set_all()
mysys/my_bitmap.c:
Added function to check if union of two bit maps covers all bits.
sql/mysql_priv.h:
Updated protype for compare_record()
sql/sql_insert.cc:
Send to compare_record() flag if all fields are used.
sql/sql_select.cc:
Set share->null_bytes_for_compare.
sql/sql_update.cc:
In compare_record() don't use the fast cmp_record() (which is basically memcmp) if we don't know that all fields exists.
Don't compare the null_bytes if there is no data there.
sql/table.cc:
Store in share->null_bytes_for_compare the number of bytes that has null or bit fields (but not delete marker)
Store in can_cmp_whole_record if we can use memcmp() (assuming all rows are read) to compare rows in compare_record()
sql/table.h:
Added two elements in table->share to speed up checking how updated rows can be compared.
/*![:version:] Query Code */, where [:version:] is a sequence of 5
digits representing the mysql server version(e.g /*!50200 ... */),
is a special comment that the query in it can be executed on those
servers whose versions are larger than the version appearing in the
comment. It leads to a security issue when slave's version is larger
than master's. A malicious user can improve his privileges on slaves.
Because slave SQL thread is running with SUPER privileges, so it can
execute queries that he/she does not have privileges on master.
This bug is fixed with the logic below:
- To replace '!' with ' ' in the magic comments which are not applied on
master. So they become common comments and will not be applied on slave.
- Example:
'INSERT INTO t1 VALUES (1) /*!10000, (2)*/ /*!99999 ,(3)*/
will be binlogged as
'INSERT INTO t1 VALUES (1) /*!10000, (2)*/ /* 99999 ,(3)*/
mysql-test/suite/rpl/t/rpl_conditional_comments.test:
Test the patch for this bug.
sql/mysql_priv.h:
Rename inBuf as rawBuf and remove the const limitation.
sql/sql_lex.cc:
To replace '!' with ' ' in the magic comments which are not applied on
master.
sql/sql_lex.h:
Remove the const limitation on parameter buff, as it can be modified in the function since
this patch.
Add member function yyUnput for Lex_input_stream. It set a character back the query buff.
sql/sql_parse.cc:
Rename inBuf as rawBuf and remove the const limitation.
sql/sql_partition.cc:
Remove the const limitation on parameter part_buff, as it can be modified in the function since
this patch.
sql/sql_partition.h:
Remove the const limitation on parameter part_buff, as it can be modified in the function since
this patch.
sql/table.h:
Remove the const limitation on variable partition_info, as it can be modified since
this patch.
/*![:version:] Query Code */, where [:version:] is a sequence of 5
digits representing the mysql server version(e.g /*!50200 ... */),
is a special comment that the query in it can be executed on those
servers whose versions are larger than the version appearing in the
comment. It leads to a security issue when slave's version is larger
than master's. A malicious user can improve his privileges on slaves.
Because slave SQL thread is running with SUPER privileges, so it can
execute queries that he/she does not have privileges on master.
This bug is fixed with the logic below:
- To replace '!' with ' ' in the magic comments which are not applied on
master. So they become common comments and will not be applied on slave.
- Example:
'INSERT INTO t1 VALUES (1) /*!10000, (2)*/ /*!99999 ,(3)*/
will be binlogged as
'INSERT INTO t1 VALUES (1) /*!10000, (2)*/ /* 99999 ,(3)*/
TABLES <list> WITH READ LOCK are incompatible".
The problem was that FLUSH TABLES <list> WITH READ LOCK
which was issued when other connection has acquired global
read lock using FLUSH TABLES WITH READ LOCK was blocked
and has to wait until global read lock is released.
This issue stemmed from the fact that FLUSH TABLES <list>
WITH READ LOCK implementation has acquired X metadata locks
on tables to be flushed. Since these locks required acquiring
of global IX lock this statement was incompatible with global
read lock.
This patch addresses problem by using SNW metadata type of
lock for tables to be flushed by FLUSH TABLES <list> WITH
READ LOCK. It is OK to acquire them without global IX lock
as long as we won't try to upgrade those locks. Since SNW
locks allow concurrent statements using same table FLUSH
TABLE <list> WITH READ LOCK now has to wait until old
versions of tables to be flushed go away after acquiring
metadata locks. Since such waiting can lead to deadlock
MDL deadlock detector was extended to take into account
waits for flush and resolve such deadlocks.
As a bonus code in open_tables() which was responsible for
waiting old versions of tables to go away was refactored.
Now when we encounter old version of table in open_table()
we don't back-off and wait for all old version to go away,
but instead wait for this particular table to be flushed.
Such approach supported by deadlock detection should reduce
number of scenarios in which FLUSH TABLES aborts concurrent
multi-statement transactions.
Note that active FLUSH TABLES <list> WITH READ LOCK still
blocks concurrent FLUSH TABLES WITH READ LOCK statement
as the former keeps tables open and thus prevents the
latter statement from doing flush.
mysql-test/include/handler.inc:
Adjusted test case after changing status which is set
when FLUSH TABLES waits for tables to be flushed from
"Flushing tables" to "Waiting for table".
mysql-test/r/flush.result:
Added test which checks that "flush tables <list> with
read lock" is compatible with active "flush tables with
read lock" but not vice-versa. This test also covers
bug #52044 "FLUSH TABLES WITH READ LOCK and FLUSH TABLES
<list> WITH READ LOCK are incompatible".
mysql-test/r/mdl_sync.result:
Added scenarios in which wait for table to be flushed
causes deadlocks to the coverage of MDL deadlock detector.
mysql-test/suite/perfschema/r/dml_setup_instruments.result:
Adjusted test results after removal of COND_refresh
condition variable.
mysql-test/suite/perfschema/r/server_init.result:
Adjusted test and its results after removal of COND_refresh
condition variable.
mysql-test/suite/perfschema/t/server_init.test:
Adjusted test and its results after removal of COND_refresh
condition variable.
mysql-test/t/flush.test:
Added test which checks that "flush tables <list> with
read lock" is compatible with active "flush tables with
read lock" but not vice-versa. This test also covers
bug #52044 "FLUSH TABLES WITH READ LOCK and FLUSH TABLES
<list> WITH READ LOCK are incompatible".
mysql-test/t/kill.test:
Adjusted test case after changing status which is set
when FLUSH TABLES waits for tables to be flushed from
"Flushing tables" to "Waiting for table".
mysql-test/t/lock_multi.test:
Adjusted test case after changing status which is set
when FLUSH TABLES waits for tables to be flushed from
"Flushing tables" to "Waiting for table".
mysql-test/t/mdl_sync.test:
Added scenarios in which wait for table to be flushed
causes deadlocks to the coverage of MDL deadlock detector.
sql/ha_ndbcluster.cc:
Adjusted code after adding one more parameter for
close_cached_tables() call - timeout for waiting for
table to be flushed.
sql/ha_ndbcluster_binlog.cc:
Adjusted code after adding one more parameter for
close_cached_tables() call - timeout for waiting for
table to be flushed.
sql/lock.cc:
Removed COND_refresh condition variable. See comment
for sql_base.cc for details.
sql/mdl.cc:
Now MDL deadlock detector takes into account information
about waits for table flushes when searching for deadlock.
To implement this change:
- Declaration of enum_deadlock_weight and
Deadlock_detection_visitor were moved to mdl.h header
to make them available to the code in table.cc which
implements deadlock detector traversal through edges
of waiters graph representing waiting for flush.
- Since now MDL_context may wait not only for metadata
lock but also for table to be flushed an abstract
Wait_for_edge class was introduced. Its descendants
MDL_ticket and Flush_ticket incapsulate specifics
of inspecting waiters graph when following through
edge representing wait of particular type.
We no longer require global IX metadata lock when acquiring
SNW or SNRW locks. Such locks are needed only when metadata
locks of these types are upgraded to X locks. This allows
to use SNW locks in FLUSH TABLES <list> WITH READ LOCK
implementation and keep the latter compatible with global
read lock.
sql/mdl.h:
Now MDL deadlock detector takes into account information
about waits for table flushes when searching for deadlock.
To implement this change:
- Declaration of enum_deadlock_weight and
Deadlock_detection_visitor were moved to mdl.h header
to make them available to the code in table.cc which
implements deadlock detector traversal through edges
of waiters graph representing waiting for flush.
- Since now MDL_context may wait not only for metadata
lock but also for table to be flushed an abstract
Wait_for_edge class was introduced. Its descendants
MDL_ticket and Flush_ticket incapsulate specifics
of inspecting waiters graph when following through
edge representing wait of particular type.
- Deadlock_detection_visitor now has m_table_shares_visited
member which allows to support recursive locking for
LOCK_open. This is required when deadlock detector
inspects waiters graph which contains several edges
representing waits for flushes or needs to come through
the such edge more than once.
sql/mysqld.cc:
Removed COND_refresh condition variable. See comment
for sql_base.cc for details.
sql/mysqld.h:
Removed COND_refresh condition variable. See comment
for sql_base.cc for details.
sql/sql_base.cc:
Changed approach to how threads are waiting for table
to be flushed. Now thread that wants to wait for old
table to go away subscribes for notification by adding
Flush_ticket to table's share and waits using
MDL_context::m_wait object. Once table gets flushed
(i.e. all tables are closed and table share is ready
to be destroyed) all such waiters are notified
individually.
Thanks to this change MDL deadlock detector can take
such waits into account.
To implement this/as result of this change:
- tdc_wait_for_old_versions() was replaced with
tdc_wait_for_old_version() which waits for individual
old share to go away and which is called by open_table()
after finding out that share is outdated. We don't
need to perform back-off before such waiting thanks
to the fact that deadlock detector now sees such waits.
- As result Open_table_ctx::m_mdl_requests became
unnecessary and was removed. We no longer allocate
copies of MDL_request objects on MEM_ROOT when
MYSQL_OPEN_FORCE_SHARED/SHARED_HIGH_PRIO flags are
in effect.
- close_cached_tables() and tdc_wait_for_old_version()
share code which implements waiting for share to be
flushed - the both use TABLE_SHARE::wait_until_flush()
method. Thanks to this close_cached_tables() supports
timeouts and has extra parameter for this.
- Open_table_context::OT_MDL_CONFLICT enum element was
renamed to OT_CONFLICT as it is now also used in cases
when back-off is required to resolve deadlock caused
by waiting for flush and not metadata lock.
- In cases when we discover that current connection tries
to open tables from different generation we now simply
back-off and restart process of opening tables. To
support this Open_table_context::OT_REOPEN_TABLES enum
element was added.
- COND_refresh condition variable became unnecessary and
was removed.
- mysql_notify_thread_having_shared_lock() no longer wakes
up connections waiting for flush as all such connections
can be waken up by deadlock detector if necessary.
sql/sql_base.h:
- close_cached_tables() now has one more parameter -
timeout for waiting for table to be flushed.
- Open_table_context::OT_MDL_CONFLICT enum element was
renamed to OT_CONFLICT as it is now also used in cases
when back-off is required to resolve deadlock caused
by waiting for flush and not metadata lock.
Added new OT_REOPEN_TABLES enum element to be used in
cases when we need to restart open tables process even
in the middle of transaction.
- Open_table_ctx::m_mdl_requests became unnecessary and
was removed.
sql/sql_class.h:
Added assert ensuring that we won't use LOCK_open mutex
with THD::enter_cond(). Otherwise deadlocks can arise in
MDL deadlock detector.
sql/sql_parse.cc:
Changed FLUSH TABLES <list> WITH READ LOCK to take SNW
metadata locks instead of X locks on tables to be flushed.
Since we no longer require global IX lock to be taken
when SNW locks are taken this makes this statement
compatible with FLUSH TABLES WITH READ LOCK statement.
Since SNW locks allow other connections to have table
opened FLUSH TABLES <list> WITH READ LOCK now has to
wait during open_tables() for old version to go away.
Such waits can lead to deadlocks which will be detected
by MDL deadlock detector which now takes waits for table
to be flushed into account.
Also adjusted code after adding one more parameter for
close_cached_tables() call - timeout for waiting for
table to be flushed.
sql/sql_yacc.yy:
FLUSH TABLES <list> WITH READ LOCK now needs only SNW
metadata locks on tables.
sql/sys_vars.cc:
Adjusted code after adding one more parameter for
close_cached_tables() call - timeout for waiting for
table to be flushed.
sql/table.cc:
Implemented new approach to how threads are waiting for
table to be flushed. Now thread that wants to wait for
old table to go away subscribes for notification by
adding Flush_ticket to table's share and waits using
MDL_context::m_wait object. Once table gets flushed
(i.e. all tables are closed and table share is ready
to be destroyed) all such waiters are notified
individually. This change allows to make such waits
visible inside of MDL deadlock detector.
To do it:
- Added list of waiters/Flush_tickets to TABLE_SHARE
class.
- Changed free_table_share() to postpone freeing of
share memory until last waiter goes away and to
wake up subscribed waiters.
- Added TABLE_SHARE::wait_until_flushed() method which
implements subscription to the list of waiters for
table to be flushed and waiting for this event.
Implemented interface which allows to expose waits for
flushes to MDL deadlock detector:
- Introduced Flush_ticket class a descendant of
Wait_for_edge class.
- Added TABLE_SHARE::find_deadlock() method which allows
deadlock detector to find out what contexts are still
using old version of table in question (i.e. to find
out what contexts are waited for by owner of
Flush_ticket).
sql/table.h:
In order to support new strategy of waiting for table flush
(see comment for table.cc for details) added list of
waiters/Flush_tickets to TABLE_SHARE class.
Implemented interface which allows to expose waits for
flushes to MDL deadlock detector:
- Introduced Flush_ticket class a descendant of
Wait_for_edge class.
- Added TABLE_SHARE::find_deadlock() method which allows
deadlock detector to find out what contexts are still
using old version of table in question (i.e. to find
out what contexts are waited for by owner of
Flush_ticket).
TABLES <list> WITH READ LOCK are incompatible".
The problem was that FLUSH TABLES <list> WITH READ LOCK
which was issued when other connection has acquired global
read lock using FLUSH TABLES WITH READ LOCK was blocked
and has to wait until global read lock is released.
This issue stemmed from the fact that FLUSH TABLES <list>
WITH READ LOCK implementation has acquired X metadata locks
on tables to be flushed. Since these locks required acquiring
of global IX lock this statement was incompatible with global
read lock.
This patch addresses problem by using SNW metadata type of
lock for tables to be flushed by FLUSH TABLES <list> WITH
READ LOCK. It is OK to acquire them without global IX lock
as long as we won't try to upgrade those locks. Since SNW
locks allow concurrent statements using same table FLUSH
TABLE <list> WITH READ LOCK now has to wait until old
versions of tables to be flushed go away after acquiring
metadata locks. Since such waiting can lead to deadlock
MDL deadlock detector was extended to take into account
waits for flush and resolve such deadlocks.
As a bonus code in open_tables() which was responsible for
waiting old versions of tables to go away was refactored.
Now when we encounter old version of table in open_table()
we don't back-off and wait for all old version to go away,
but instead wait for this particular table to be flushed.
Such approach supported by deadlock detection should reduce
number of scenarios in which FLUSH TABLES aborts concurrent
multi-statement transactions.
Note that active FLUSH TABLES <list> WITH READ LOCK still
blocks concurrent FLUSH TABLES WITH READ LOCK statement
as the former keeps tables open and thus prevents the
latter statement from doing flush.
prepared statements
Using GROUP_CONCAT() together with the WITH ROLLUP modifier
could crash the server.
The reason was a combination of several facts:
1. The Item_func_group_concat class stores pointers to ORDER
objects representing the columns in the ORDER BY clause of
GROUP_CONCAT().
2. find_order_in_list() called from
Item_func_group_concat::setup() modifies the ORDER objects so
that their 'item' member points to the arguments list
allocated in the Item_func_group_concat constructor.
3. In some cases (e.g. in JOIN::rollup_make_fields) a copy of
the original Item_func_group_concat object could be created by
using the Item_func_group_concat::Item_func_group_concat(THD
*thd, Item_func_group_concat *item) copy constructor. The
latter essentially creates a shallow copy of the source
object. Memory for the arguments array is allocated on
thd->mem_root, but the pointers for arguments and ORDER are
copied verbatim.
What happens in the test case is that when executing the query
for the first time, after a copy of the original
Item_func_group_concat object has been created by
JOIN::rollup_make_fields(), find_order_in_list() is called for
this new object. It then resolves ORDER BY by modifying the
ORDER objects so that they point to elements of the arguments
array which is local to the cloned object. When thd->mem_root
is freed upon completing the execution, pointers in the ORDER
objects become invalid. Those ORDER objects, however, are also
shared with the original Item_func_group_concat object which is
preserved between executions of a prepared statement. So the
first call to find_order_in_list() for the original object on
the second execution tries to dereference an invalid pointer.
The solution is to create copies of the ORDER objects when
copying Item_func_group_concat to not leave any stale pointers
in other instances with different lifecycles.
mysql-test/r/func_gconcat.result:
Test case for bug #54476.
mysql-test/t/func_gconcat.test:
Test case for bug #54476.
sql/item_sum.cc:
Copy the ORDER objects pointed to by the elements of the
'order' array in the copy constructor of
Item_func_group_concat.
sql/table.h:
Removed the unused 'item_copy' member of the ORDER class.
prepared statements
Using GROUP_CONCAT() together with the WITH ROLLUP modifier
could crash the server.
The reason was a combination of several facts:
1. The Item_func_group_concat class stores pointers to ORDER
objects representing the columns in the ORDER BY clause of
GROUP_CONCAT().
2. find_order_in_list() called from
Item_func_group_concat::setup() modifies the ORDER objects so
that their 'item' member points to the arguments list
allocated in the Item_func_group_concat constructor.
3. In some cases (e.g. in JOIN::rollup_make_fields) a copy of
the original Item_func_group_concat object could be created by
using the Item_func_group_concat::Item_func_group_concat(THD
*thd, Item_func_group_concat *item) copy constructor. The
latter essentially creates a shallow copy of the source
object. Memory for the arguments array is allocated on
thd->mem_root, but the pointers for arguments and ORDER are
copied verbatim.
What happens in the test case is that when executing the query
for the first time, after a copy of the original
Item_func_group_concat object has been created by
JOIN::rollup_make_fields(), find_order_in_list() is called for
this new object. It then resolves ORDER BY by modifying the
ORDER objects so that they point to elements of the arguments
array which is local to the cloned object. When thd->mem_root
is freed upon completing the execution, pointers in the ORDER
objects become invalid. Those ORDER objects, however, are also
shared with the original Item_func_group_concat object which is
preserved between executions of a prepared statement. So the
first call to find_order_in_list() for the original object on
the second execution tries to dereference an invalid pointer.
The solution is to create copies of the ORDER objects when
copying Item_func_group_concat to not leave any stale pointers
in other instances with different lifecycles.
For queries with order by clauses that employed filesort usage of
virtual column references in select lists could trigger assertion
failures. It happened because a wrong vcol_set bitmap was used for
filesort. It turned out that filesort required its own vcol_set bitmap.
Made management of the vcol_set bitmaps similar to the management
of the read_set and write_set bitmaps.
If the expression for a virtual column of table contained datetime
comparison then the execution of the second query that used this
virtual column caused a crash. It happened because the execution
of the first query that used this virtual column inserted a cached
item into the expression tree. The cached tree was allocated in
the statement memory while the expression tree was allocated in
the table memory.
Now the cached items that are inserted into expressions for virtual
columns with datetime comparisons are always allocated in the same
mem_root as the expressions for virtual columns. So now the inserted
cached items are valid for any queries that use these virtual columns.
There were two problems that caused wrong results reported with this bug.
1. In some cases stored(persistent) virtual columns were not marked
in the write_set and in the vcol_set bitmaps.
2. If the list of fields in an insert command was empty then the values of
the stored virtual columns were set to default.
To fix the first problem the function st_table::mark_virtual_columns_for_write
was modified. Now the function has a parameter that says whether the virtual
columns are to be marked for insert or for update.
To fix the second problem a special handling of empty insert lists is
added in the function fill_record().
libmysqld/Makefile.am:
The new file added.
mysql-test/r/index_merge_myisam.result:
subquery_cache optimization option added.
mysql-test/r/myisam_mrr.result:
subquery_cache optimization option added.
mysql-test/r/subquery_cache.result:
The subquery cache tests added.
mysql-test/r/subselect3.result:
Subquery cache switched off to avoid changing read statistics.
mysql-test/r/subselect3_jcl6.result:
Subquery cache switched off to avoid changing read statistics.
mysql-test/r/subselect_no_mat.result:
subquery_cache optimization option added.
mysql-test/r/subselect_no_opts.result:
subquery_cache optimization option added.
mysql-test/r/subselect_no_semijoin.result:
subquery_cache optimization option added.
mysql-test/r/subselect_sj.result:
subquery_cache optimization option added.
mysql-test/r/subselect_sj_jcl6.result:
subquery_cache optimization option added.
mysql-test/t/subquery_cache.test:
The subquery cache tests added.
mysql-test/t/subselect3.test:
Subquery cache switched off to avoid changing read statistics.
sql/CMakeLists.txt:
The new file added.
sql/Makefile.am:
The new files added.
sql/item.cc:
Expression cache item (Item_cache_wrapper) added.
Item_ref and Item_field fixed for correct usage of result field and fast resolwing in SP.
sql/item.h:
Expression cache item (Item_cache_wrapper) added.
Item_ref and Item_field fixed for correct usage of result field and fast resolwing in SP.
sql/item_cmpfunc.cc:
Subquery cache added.
sql/item_cmpfunc.h:
Subquery cache added.
sql/item_subselect.cc:
Subquery cache added.
sql/item_subselect.h:
Subquery cache added.
sql/item_sum.cc:
Registration of subquery parameters added.
sql/mysql_priv.h:
subquery_cache optimization option added.
sql/mysqld.cc:
subquery_cache optimization option added.
sql/opt_range.cc:
Fix due to subquery cache.
sql/opt_subselect.cc:
Parameters of the function cahnged.
sql/procedure.h:
.h file guard added.
sql/sql_base.cc:
Registration of subquery parameters added.
sql/sql_class.cc:
Option to allow add indeces to temporary table.
sql/sql_class.h:
Item iterators added.
Option to allow add indeces to temporary table.
sql/sql_expression_cache.cc:
Expression cache for caching subqueries added.
sql/sql_expression_cache.h:
Expression cache for caching subqueries added.
sql/sql_lex.cc:
Registration of subquery parameters added.
sql/sql_lex.h:
Registration of subqueries and subquery parameters added.
sql/sql_select.cc:
Subquery cache added.
sql/sql_select.h:
Subquery cache added.
sql/sql_union.cc:
A new parameter to the function added.
sql/sql_update.cc:
A new parameter to the function added.
sql/table.cc:
Procedures to manage temporarty tables index added.
sql/table.h:
Procedures to manage temporarty tables index added.
storage/maria/ha_maria.cc:
Fix of handler to allow destoy a table in case of error during the table creation.
storage/maria/ha_maria.h:
.h file guard added.
storage/myisam/ha_myisam.cc:
Fix of handler to allow destoy a table in case of error during the table creation.
use limit efficiently
Bug #36569: UPDATE ... WHERE ... ORDER BY... always does a
filesort even if not required
Also two bugs reported after QA review (before the commit
of bugs above to public trees, no documentation needed):
Bug #53737: Performance regressions after applying patch
for bug 36569
Bug #53742: UPDATEs have no effect after applying patch
for bug 36569
Execution of single-table UPDATE and DELETE statements did not use the
same optimizer as was used in the compilation of SELECT statements.
Instead, it had an optimizer of its own that did not take into account
that you can omit sorting by retrieving rows using an index.
Extra optimization has been added: when applicable, single-table
UPDATE/DELETE statements use an existing index instead of filesort. A
corresponding SELECT query would do the former.
Also handling of the DESC ordering expression has been added when
reverse index scan is applicable.
From now on most single table UPDATE and DELETE statements show the
same disk access patterns as the corresponding SELECT query. We verify
this by comparing the result of SHOW STATUS LIKE 'Sort%
Currently the get_index_for_order function
a) checks quick select index (if any) for compatibility with the
ORDER expression list or
b) chooses the cheapest available compatible index, but only if
the index scan is cheaper than filesort.
Second way is implemented by the new test_if_cheaper_ordering
function (extracted part the test_if_skip_sort_order()).
mysql-test/r/log_state.result:
Updated result for optimized query, bug #36569.
mysql-test/r/single_delete_update.result:
Test case for bug #30584, bug #36569 and bug #53742.
mysql-test/r/update.result:
Updated result for optimized query, bug #30584.
Note:
"Handler_read_last 1" omitted, see bug 52312:
lost Handler_read_last status variable.
mysql-test/t/single_delete_update.test:
Test case for bug #30584, bug #36569 and bug #53742.
sql/opt_range.cc:
Bug #30584, bug #36569: UPDATE/DELETE ... WHERE ... ORDER BY...
always does a filesort even if not required
* get_index_for_order() has been rewritten entirely and moved
to sql_select.cc
New QUICK_RANGE_SELECT::make_reverse method has been added.
sql/opt_range.h:
Bug #30584, bug #36569: UPDATE/DELETE ... WHERE ... ORDER BY...
always does a filesort even if not required
* get_index_for_order() has been rewritten entirely and moved
to sql_select.cc
New functions:
* QUICK_SELECT_I::make_reverse()
* SQL_SELECT::set_quick()
sql/records.cc:
Bug #30584, bug #36569: UPDATE/DELETE ... WHERE ... ORDER BY...
always does a filesort even if not required
* init_read_record_idx() has been modified to allow reverse index scan
New functions:
* rr_index_last()
* rr_index_desc()
sql/records.h:
Bug #30584, bug #36569: UPDATE/DELETE ... WHERE ... ORDER BY...
always does a filesort even if not required
init_read_record_idx() has been modified to allow reverse index scan
sql/sql_delete.cc:
Bug #30584, bug #36569: UPDATE/DELETE ... WHERE ... ORDER BY...
always does a filesort even if not required
mysql_delete: an optimization has been added to skip
unnecessary sorting with ORDER BY clause where select
result ordering is acceptable.
sql/sql_select.cc:
Bug #30584, bug #36569, bug #53737, bug #53742:
UPDATE/DELETE ... WHERE ... ORDER BY... always does a filesort
even if not required
The const_expression_in_where function has been modified
to accept both Item and Field pointers.
New functions:
* get_index_for_order()
* test_if_cheaper_ordering() has been extracted from
test_if_skip_sort_order() to share with get_index_for_order()
* simple_remove_const()
sql/sql_select.h:
Bug #30584, bug #36569: UPDATE/DELETE ... WHERE ... ORDER BY...
always does a filesort even if not required
New functions:
* test_if_cheaper_ordering()
* simple_remove_const()
* get_index_for_order()
sql/sql_update.cc:
Bug #30584, bug #36569: UPDATE/DELETE ... WHERE ... ORDER BY...
always does a filesort even if not required
mysql_update: an optimization has been added to skip
unnecessary sorting with ORDER BY clause where a select
result ordering is acceptable.
sql/table.cc:
Bug #30584, bug #36569: UPDATE/DELETE ... WHERE ... ORDER BY...
always does a filesort even if not required
New functions:
* TABLE::update_const_key_parts()
* is_simple_order()
sql/table.h:
Bug #30584, bug #36569: UPDATE/DELETE ... WHERE ... ORDER BY...
always does a filesort even if not required
New functions:
* TABLE::update_const_key_parts()
* is_simple_order()
use limit efficiently
Bug #36569: UPDATE ... WHERE ... ORDER BY... always does a
filesort even if not required
Also two bugs reported after QA review (before the commit
of bugs above to public trees, no documentation needed):
Bug #53737: Performance regressions after applying patch
for bug 36569
Bug #53742: UPDATEs have no effect after applying patch
for bug 36569
Execution of single-table UPDATE and DELETE statements did not use the
same optimizer as was used in the compilation of SELECT statements.
Instead, it had an optimizer of its own that did not take into account
that you can omit sorting by retrieving rows using an index.
Extra optimization has been added: when applicable, single-table
UPDATE/DELETE statements use an existing index instead of filesort. A
corresponding SELECT query would do the former.
Also handling of the DESC ordering expression has been added when
reverse index scan is applicable.
From now on most single table UPDATE and DELETE statements show the
same disk access patterns as the corresponding SELECT query. We verify
this by comparing the result of SHOW STATUS LIKE 'Sort%
Currently the get_index_for_order function
a) checks quick select index (if any) for compatibility with the
ORDER expression list or
b) chooses the cheapest available compatible index, but only if
the index scan is cheaper than filesort.
Second way is implemented by the new test_if_cheaper_ordering
function (extracted part the test_if_skip_sort_order()).
an atomic counter"
Split the large LOCK_open section in open_table().
Do not call open_table_from_share() under LOCK_open.
Remove thd->version.
This fixes
Bug#50589 "Server hang on a query evaluated using a temporary
table"
Bug#51557 "LOCK_open and kernel_mutex are not happy together"
Bug#49463 "LOCK_table and innodb are not nice when handler
instances are created".
This patch has effect on storage engines that rely on
ha_open() PSEA method being called under LOCK_open.
In particular:
1) NDB is broken and left unfixed. NDB relies on LOCK_open
being kept as part of ha_open(), since it uses auto-discovery.
While previously the NDB open code was race-prone, now
it simply fails on asserts.
2) HEAP engine had a race in ha_heap::open() when
a share for the same table could be added twice
to the list of shares, or a dangling reference to a share
stored in HEAP handler. This patch aims to address this
problem by 'pinning' the newly created share in the
internal HEAP engine share list until at least one
handler instance is created using that share.
include/heap.h:
Add members to HP_CREATE_INFO.
Declare heap_release_share().
sql/lock.cc:
Remove thd->version, use thd->open_tables->s->version instead.
sql/repl_failsafe.cc:
Remove thd->version.
sql/sql_base.cc:
- close_thread_table(): move handler cleanup code outside the critical section protected by LOCK_open.
- remove thd->version
- split the large critical section in open_table() that
opens a new table from share and is protected by LOCK_open
into 2 critical sections, thus reducing the critical path.
- make check_if_table_exists() acquire LOCK_open internally.
- use thd->open_tables->s->version instead of thd->refresh_version to make sure that all tables in
thd->open_tables are in the same refresh series.
sql/sql_base.h:
Add declaration for check_if_table_exists().
sql/sql_class.cc:
Remove init_open_tables_state(), it's now equal to
reset_open_tables_state().
sql/sql_class.h:
Remove thd->version, THD::init_open_tables_state().
sql/sql_plugin.cc:
Use table->m_needs_reopen to mark the table as stale
rather than manipulate with thd->version, which is no more.
sql/sql_udf.cc:
Use table->m_needs_reopen to mark the table as stale
rather than manipulate with thd->version, which is no more.
sql/table.h:
Remove an unused variable.
sql/tztime.cc:
Use table->m_needs_reopen to mark the table as stale
rather than manipulate with thd->version, which is no more.
storage/heap/CMakeLists.txt:
Add heap tests to cmake build files.
storage/heap/ha_heap.cc:
Fix a race when ha_heap::ha_open() could insert two
HP_SHARE objects into the internal share list or store
a dangling reference to a share in ha_heap instance,
or wrongly set implicit_emptied.
storage/heap/hp_create.c:
Optionally pin a newly created share in the list of shares
by increasing its open_count. This is necessary to
make sure that a newly created share doesn't disappear while
a HP_INFO object is being created to reference it.
storage/heap/hp_open.c:
When adding a new HP_INFO object to the list of objects
in the heap share, make sure the open_count is not increased
twice.
storage/heap/hp_test1.c:
Adjust the test to new function signatures.
storage/heap/hp_test2.c:
Adjust the test to new function signatures.
an atomic counter"
Split the large LOCK_open section in open_table().
Do not call open_table_from_share() under LOCK_open.
Remove thd->version.
This fixes
Bug#50589 "Server hang on a query evaluated using a temporary
table"
Bug#51557 "LOCK_open and kernel_mutex are not happy together"
Bug#49463 "LOCK_table and innodb are not nice when handler
instances are created".
This patch has effect on storage engines that rely on
ha_open() PSEA method being called under LOCK_open.
In particular:
1) NDB is broken and left unfixed. NDB relies on LOCK_open
being kept as part of ha_open(), since it uses auto-discovery.
While previously the NDB open code was race-prone, now
it simply fails on asserts.
2) HEAP engine had a race in ha_heap::open() when
a share for the same table could be added twice
to the list of shares, or a dangling reference to a share
stored in HEAP handler. This patch aims to address this
problem by 'pinning' the newly created share in the
internal HEAP engine share list until at least one
handler instance is created using that share.
strict aliasing violations.
One somewhat major source of strict-aliasing violations and
related warnings is the SQL_LIST structure. For example,
consider its member function `link_in_list` which takes
a pointer to pointer of type T (any type) as a pointer to
pointer to unsigned char. Dereferencing this pointer, which
is done to reset the next field, violates strict-aliasing
rules and might cause problems for surrounding code that
uses the next field of the object being added to the list.
The solution is to use templates to parametrize the SQL_LIST
structure in order to deference the pointers with compatible
types. As a side bonus, it becomes possible to remove quite
a few casts related to acessing data members of SQL_LIST.
sql/handler.h:
Use the appropriate template type argument.
sql/item.cc:
Remove now-unnecessary cast.
sql/item_subselect.cc:
Remove now-unnecessary casts.
sql/item_sum.cc:
Use the appropriate template type argument.
Remove now-unnecessary cast.
sql/mysql_priv.h:
Move SQL_LIST structure to sql_list.h
Use the appropriate template type argument.
sql/sp.cc:
Remove now-unnecessary casts.
sql/sql_delete.cc:
Use the appropriate template type argument.
Remove now-unnecessary casts.
sql/sql_derived.cc:
Remove now-unnecessary casts.
sql/sql_lex.cc:
Remove now-unnecessary casts.
sql/sql_lex.h:
SQL_LIST now takes a template type argument which must
match the type of the elements of the list. Use forward
declaration when the type is not available, it is used
in pointers anyway.
sql/sql_list.h:
Rename SQL_LIST to SQL_I_List. The template parameter is
the type of object that is stored in the list.
sql/sql_olap.cc:
Remove now-unnecessary casts.
sql/sql_parse.cc:
Remove now-unnecessary casts.
sql/sql_prepare.cc:
Remove now-unnecessary casts.
sql/sql_select.cc:
Remove now-unnecessary casts.
sql/sql_show.cc:
Remove now-unnecessary casts.
sql/sql_table.cc:
Remove now-unnecessary casts.
sql/sql_trigger.cc:
Remove now-unnecessary casts.
sql/sql_union.cc:
Remove now-unnecessary casts.
sql/sql_update.cc:
Remove now-unnecessary casts.
sql/sql_view.cc:
Remove now-unnecessary casts.
sql/sql_yacc.yy:
Remove now-unnecessary casts.
storage/myisammrg/ha_myisammrg.cc:
Remove now-unnecessary casts.
strict aliasing violations.
One somewhat major source of strict-aliasing violations and
related warnings is the SQL_LIST structure. For example,
consider its member function `link_in_list` which takes
a pointer to pointer of type T (any type) as a pointer to
pointer to unsigned char. Dereferencing this pointer, which
is done to reset the next field, violates strict-aliasing
rules and might cause problems for surrounding code that
uses the next field of the object being added to the list.
The solution is to use templates to parametrize the SQL_LIST
structure in order to deference the pointers with compatible
types. As a side bonus, it becomes possible to remove quite
a few casts related to acessing data members of SQL_LIST.
Conflicts:
Text conflict in mysql-test/r/archive.result
Contents conflict in mysql-test/r/innodb_bug38231.result
Text conflict in mysql-test/r/mdl_sync.result
Text conflict in mysql-test/suite/binlog/t/disabled.def
Text conflict in mysql-test/suite/rpl_ndb/r/rpl_ndb_binlog_format_errors.result
Text conflict in mysql-test/t/archive.test
Contents conflict in mysql-test/t/innodb_bug38231.test
Text conflict in mysql-test/t/mdl_sync.test
Text conflict in sql/sp_head.cc
Text conflict in sql/sql_show.cc
Text conflict in sql/table.cc
Text conflict in sql/table.h
Conflicts:
Text conflict in mysql-test/r/archive.result
Contents conflict in mysql-test/r/innodb_bug38231.result
Text conflict in mysql-test/r/mdl_sync.result
Text conflict in mysql-test/suite/binlog/t/disabled.def
Text conflict in mysql-test/suite/rpl_ndb/r/rpl_ndb_binlog_format_errors.result
Text conflict in mysql-test/t/archive.test
Contents conflict in mysql-test/t/innodb_bug38231.test
Text conflict in mysql-test/t/mdl_sync.test
Text conflict in sql/sp_head.cc
Text conflict in sql/sql_show.cc
Text conflict in sql/table.cc
Text conflict in sql/table.h
The problem was that TRUNCATE TABLE didn't take a exclusive
lock on a table if it resorted to truncating via delete of
all rows in the table. Specifically for InnoDB tables, this
could break proper isolation as InnoDB ends up aborting some
granted locks when truncating a table.
The solution is to take a exclusive metadata lock before
TRUNCATE TABLE can proceed. This guarantees that no other
transaction is using the table.
Incompatible change: Truncate via delete no longer fails
if sql_safe_updates is activated (this was a undocumented
side effect).
libmysqld/CMakeLists.txt:
Add new files to the build list.
libmysqld/Makefile.am:
Add new files to the build list.
mysql-test/extra/binlog_tests/binlog_truncate.test:
Add test case for Bug#42643
mysql-test/include/mix1.inc:
Update test case as TRUNCATE TABLE now grabs a exclusive lock.
Ensure that TRUNCATE waits for granted locks on the table.
mysql-test/suite/binlog/t/binlog_truncate_innodb.test:
As with other data modifying statements, TRUNCATE is still not
possible in a transaction with isolation level READ COMMITTED
or READ UNCOMMITED. It would be possible to implement so, but
it is not worth the effort.
mysql-test/suite/binlog/t/binlog_truncate_myisam.test:
Test under different binlog formats.
mysql-test/suite/binlog/t/disabled.def:
Re-enable test case.
mysql-test/t/innodb_bug38231.test:
Truncate no longer works with row-level locks.
mysql-test/t/mdl_sync.test:
Ensure that a acquired lock is not given up due to a conflict.
mysql-test/t/partition_innodb_semi_consistent.test:
End transaction as to release metadata locks.
mysql-test/t/truncate.test:
A metadata lock is now taken before the object is verified.
sql/CMakeLists.txt:
Add new files to the build list.
sql/Makefile.am:
Add new files to the build list.
sql/datadict.cc:
Introduce a new file specific for data dictionary operations.
sql/datadict.h:
Add header file.
sql/sql_base.cc:
Rename data dictionary function.
sql/sql_bitmap.h:
Include dependency.
sql/sql_delete.cc:
Move away from relying on mysql_delete() to delete all rows of
a table. Thus, move any bits related to truncate to sql_truncate.cc
sql/sql_delete.h:
Remove parameter.
sql/sql_parse.cc:
Add protection against the global read lock -- a intention
exclusive lock can be acquired in the truncate path.
sql/sql_show.cc:
Add sync point for testing scenarios where a pending flush
is ignored.
sql/sql_truncate.cc:
Acquire a shared metadata lock before accessing table metadata.
Upgrade the lock to a exclusive one if the table can be re-created.
Rework binlog rules to better reflect the requirements.
sql/sql_yacc.yy:
Set appropriate lock types for table to be truncated.
sql/table.h:
Move to data dictionary header.
The problem was that TRUNCATE TABLE didn't take a exclusive
lock on a table if it resorted to truncating via delete of
all rows in the table. Specifically for InnoDB tables, this
could break proper isolation as InnoDB ends up aborting some
granted locks when truncating a table.
The solution is to take a exclusive metadata lock before
TRUNCATE TABLE can proceed. This guarantees that no other
transaction is using the table.
Incompatible change: Truncate via delete no longer fails
if sql_safe_updates is activated (this was a undocumented
side effect).
bitmap_is_set(table->read_set, field_index))
UPDATE on an InnoDB table modifying the same index that is used
to satisfy the WHERE condition could trigger a debug assertion
under some circumstances.
Since for engines with the HA_PRIMARY_KEY_IN_READ_INDEX flag
set results of an index scan on a secondary index are appended
by the primary key value, if a query involves only columns from
the primary key and a secondary index, the latter is considered
to be covering.
That tricks mysql_update() to mark for reading only columns
from the secondary index when it does an index scan to retrieve
rows to update in case a part of that key is also being
updated. However, there may be other columns in WHERE that are
part of the primary key, but not the secondary one.
What we actually want to do in this case is to add index
columns to the existing WHERE columns bitmap rather than
replace it.
mysql-test/r/innodb_mysql.result:
Test case for bug #53830.
mysql-test/t/innodb_mysql.test:
Test case for bug #53830.
sql/sql_update.cc:
Add index columns to the read_set bitmap, don't replace it.
sql/table.cc:
Added a new add_read_columns_used_by_index() function to
st_table.
sql/table.h:
Added a new add_read_columns_used_by_index() function to
st_table.
bitmap_is_set(table->read_set, field_index))
UPDATE on an InnoDB table modifying the same index that is used
to satisfy the WHERE condition could trigger a debug assertion
under some circumstances.
Since for engines with the HA_PRIMARY_KEY_IN_READ_INDEX flag
set results of an index scan on a secondary index are appended
by the primary key value, if a query involves only columns from
the primary key and a secondary index, the latter is considered
to be covering.
That tricks mysql_update() to mark for reading only columns
from the secondary index when it does an index scan to retrieve
rows to update in case a part of that key is also being
updated. However, there may be other columns in WHERE that are
part of the primary key, but not the secondary one.
What we actually want to do in this case is to add index
columns to the existing WHERE columns bitmap rather than
replace it.
transactional SELECT and ALTER TABLE ... REBUILD PARTITION".
The goal of this patch is to decouple type of metadata
lock acquired for table by open_tables() from type of
table-level lock to be acquired on it.
To achieve this we change approach to how we determine what
type of metadata lock should be acquired on table to be open.
Now instead of inferring it at open_tables() time from flags
and type of table-level lock we rely on that type of metadata
lock is properly set at parsing time and is not changed
further.
sql/ha_ndbcluster.cc:
Now one needs to properly initialize table list element's
MDL_request object before calling mysql_rm_table_part2().
sql/lock.cc:
lock_table_names() no longer initializes table list elements'
MDL_request objects. Now proper initialization of these
requests is a responsibility of the caller.
sql/lock.h:
Removed MYSQL_OPEN_TAKE_UPGRADABLE_MDL flag which became
unnecessary. Thanks to the fact that we don't reset type of
requests for metadata locks between re-executions we now can
figure out that upgradable locks are requested by simply
looking at their type which were set in the parser. As result
this flag became redundant.
sql/mdl.h:
Added version of new operator which simplifies allocation of
MDL_request objects on a MEM_ROOT.
sql/sp_head.cc:
Added comment explaining why it is OK to infer type of
metadata lock to request from type of table-level lock
for prelocking.
Added enum_mdl_type argument to sp_add_to_query_tables()
to simplify its usage in trigger implementation.
sql/sp_head.h:
Added enum_mdl_type argument to sp_add_to_query_tables()
to simplify its usage in trigger implementation.
sql/sql_base.cc:
- open_table_get_mdl_lock():
Preserve type of MDL_request for table list element which
was set in the parser by creating MDL_request objects on
memory root if MYSQL_OPEN_FORCE_SHARED_MDL or
MYSQL_OPEN_FORCE_SHARED_HIGH_PRIO_MDL flag were specified.
Thanks to this and to the fact that we no longer reset
type of requests for metadata locks between re-executions
we no longer need to acquire exclusive metadata lock on
table to be created in a special way. This lock is acquired
by code handling acquiring of upgradable locks.
Also changed signature/calling convention for this function
to simplify its usage.
- Accordingly special lock strategy for table list elements
which was used for such locks became unnecessary and was
removed. Other strategies were renamed.
- Since we no longer have guarantee that MDL_request object
which were not satisfied due to lock conflict belongs to
table list element Open_table_context class and its methods
were extended to remember pointer to MDL_request which has
caused problem at request_backoff_action() time and use it
in recover_from_failed_open(). Similar approach is used
for cases when problem from which we need to recover is
not related to MDL but to the table itself. In this case
we store pointer to the element of table list.
- Changed open_tables()/open_tables_check_upgradable_mdl()/
open_tables_acquire_upgradable_mdl() not to rely on
MYSQL_OPEN_TAKE_UPGRADABLE_MDL flag to understand when
upgradable metadata locks should be acquired and not to
infer type of MDL lock from type of table-level lock.
Instead we assume that type of MDL to be acquired was set
in the parser (we can do this as type of MDL_request is
no longer reset between re-executions).
sql/sql_class.h:
Since we no longer have guarantee that MDL_request object
which were not satisfied due to lock conflict belongs to
table list element Open_table_context class and its methods
were extended to remember pointer to MDL_request which has
caused problem at request_backoff_action() time and use it
in recover_from_failed_open(). Similar approach is used
for cases when problem from which we need to recover is
not related to MDL but to the table itself. In this case
we store pointer to the element of table list.
sql/sql_db.cc:
Now one needs to properly initialize table list element's
MDL_request object before calling mysql_rm_table_part2()
or mysql_rename_tables().
sql/sql_lex.cc:
st_select_lex/st_select_lex_node::add_table_to_list() method
now has argument which allows specify type of metadata lock
to be requested for table list element being added.
sql/sql_lex.h:
- st_select_lex/st_select_lex_node::add_table_to_list()
method now has argument which specifies type of metadata
lock to be requested for table list element being added.
This allows to explicitly set type of MDL lock to be
acquired for a DDL statement in parser. It is also more
future-proof than inferring type of MDL request from type
of table-level lock.
- Added Yacc_state::m_mdl_type member which specifies which
type of metadata lock should be requested for tables to be
added to table list by a grammar rule in cases when the same
rule is used in several statements requiring different kinds
of metadata locks.
sql/sql_parse.cc:
- st_select_lex::add_table_to_list() method now has argument
which specifies type of metadata lock to be requested for
table list element being added. This allows to explicitly
set type of MDL lock to be acquired for a DDL statement in
parser. It is also more future-proof than inferring type of
MDL request from type of table-level lock.
- EXCLUSIVE_DOWNGRADABLE_MDL lock strategy has a new name -
OTLS_DOWNGRADE_IF_EXISTS.
- Adjusted LOCK TABLES implementation to the fact that we no
longer infer type of metadata lock to be acquired from table
level lock and that type of MDL request is set at parsing.
And thus MYSQL_OPEN_TAKE_UPGRADABLE_MDL flag became
unnecessary.
sql/sql_prepare.cc:
TABLE_LIST's lock strategy SHARED_MDL was renamed to OTLS_NONE
as now it means that metadata lock should not be changed during
call to open_table() (if it has been already acquired) and is
also used for exclusive metadata lock.
sql/sql_show.cc:
st_select_lex::add_table_to_list() method now has argument
which specifies type of metadata lock to be requested for
table list element being added.
sql/sql_table.cc:
- Adjusted mysql_admin_table()'s code to the fact that
open_tables() no longer determines what kind of metadata
lock should be obtained basing on type of table-level
lock and flags. Instead type of metadata lock for table
to be open should be set before calling open_tables().
- Changed mysql_alter_table() code to the facts:
a) that now it is responsibility of caller to properly
initalize MDL_request in table list elements before calling
lock_table_names()
b) and that MYSQL_OPEN_TAKE_UPGRADABLE_MDL is no longer
necessary since type of metadata lock to be obtained
at open_tables() time is set during parsing.
- Changed code of mysql_recreate_table() to properly set
type of metadata and table-level lock to be obtained
by mysql_alter_table() which it calls.
sql/sql_trigger.cc:
Instead of relying on MYSQL_OPEN_TAKE_UPGRADABLE_MDL flag to
force open_tables() to take an upgradable lock we now specify
exact type of lock to be taken when constructing table list
element for table to be open for CREATE/DROP TRIGGER.
sql/sql_view.cc:
We no longer use TABLE_LIST::EXCLUSIVE_MDL strategy to force
open_tables() to take an exclusive metadata lock on view to
be created. Instead we rely on parser setting proper type of
metadata lock to request and open_tables() acquiring it.
This became possible thanks to the fact that we no longer
reset type of MDL_request between statement re-executions.
sql/sql_yacc.yy:
Instead of inferring type of MDL_request for table to be
open from type of table-level lock and flags passed to
open_tables() we now explicitly specify them at parsing.
This became possible thanks to the fact that we no longer
reset type of MDL_request between statement re-executions.
In future this should allow to decouple type of metadata
lock from type of table-level lock.
The only exception to this approach is statements implemented
through mysql_admin_table() which re-uses same table list
element several times with different types of table-level
and metadata locks.
We now also properly initialize MDL_request objects for table
list elements which are later passed to lock_table_names()
function.
sql/table.cc:
Do not reset type of MDL_request between statement
re-executions. This became unnecessesary as we no longer
change type of MDL_request residing in table list element.
In its turn this change allows to set type of MDL_request
only once - at parsing time.
sql/table.h:
Got rid of TABLE_LIST::EXCLUSIVE_MDL lock strategy.
Now we can specify that we need to acquire exclusive lock
on table to be processed by open_tables() through setting
an appropriate type of MDL_request at parsing time (this
became possible thanks to the fact that we no longer reset
types of MDL_request's belonging to table list elements
between statement re-execution).
Strategy SHARED_MDL was renamed to OTLS_NONE as now it
means that metadata lock should not be changed during call
to open_table() (if it has been already acquired) and is
also used for exclusive metadata lock.
Strategy EXCLUSIVE_DOWNGRADABLE_MDL was renamed to
OTLS_DOWNGRADE_IF_EXISTS.
transactional SELECT and ALTER TABLE ... REBUILD PARTITION".
The goal of this patch is to decouple type of metadata
lock acquired for table by open_tables() from type of
table-level lock to be acquired on it.
To achieve this we change approach to how we determine what
type of metadata lock should be acquired on table to be open.
Now instead of inferring it at open_tables() time from flags
and type of table-level lock we rely on that type of metadata
lock is properly set at parsing time and is not changed
further.
- INSERT with RAND() doesn't require row based logging again
- Some bugs fixed in opt_range() where we table->key_read was wrongly used
.bzrignore:
Ignore new xtstat binary
mysql-test/r/index_merge_myisam.result:
Update results (old result was wrong)
mysql-test/suite/binlog/r/binlog_stm_binlog.result:
Added drop table first
mysql-test/suite/binlog/r/binlog_stm_unsafe_warning.result:
Added test for when RAND() requires row based logging
mysql-test/suite/binlog/t/binlog_stm_binlog.test:
Added drop table first
mysql-test/suite/binlog/t/binlog_stm_unsafe_warning.test:
Added test for when RAND() requires row based logging
scripts/make_binary_distribution.sh:
Removed type from last commit
sql/item_create.cc:
Don't require row based logging when using RAND() with INSERT
sql/opt_range.cc:
Revert wrong patch from Oracle:
- As QUICK_RANGE_SELECT uses it's own 'file' handler to the tables, one can't use 'table->key_read' as a flag to detect if index only read (keyread) is used or not
- Don't set keyread if keyread is already enabled
- Don't disable key read, if we didn't enable it ourselves
- Simplify code (and ensure that we do proper cleanup of index only read)
sql/opt_range.h:
Added flags to detect if the range optimizer enabled index only read (key read) or not
sql/opt_sum.cc:
Use our more optimized macros
sql/sql_lex.h:
Added 'readable' function to check if we are in a sub query function or not (not normal query or sub query in FROM clause)
sql/sql_select.cc:
Use our more optimized keyread macros
Added ASSERTS early
Simplify code on eliminate_item_equal()
Fixed that substitute_for_best_equal_field() doesn't core dump in case of out of memory conditions.
Removed not needed test for 'field->maybe_null()'
Replaced master_unit()->item with is_subquery_function() (More readable)
sql/sql_update.cc:
Use our more optimized keyread macros
sql/table.cc:
Use our more optimized keyread macros
sql/table.h:
Use separate functions to enable/disable Index only reads
- Safer, more readable, better logging and faster.
Conflicts:
Text conflict in mysql-test/r/grant.result
Text conflict in mysql-test/t/grant.test
Text conflict in mysys/mf_loadpath.c
Text conflict in sql/slave.cc
Text conflict in sql/sql_priv.h
Conflicts:
Text conflict in mysql-test/r/grant.result
Text conflict in mysql-test/t/grant.test
Text conflict in mysys/mf_loadpath.c
Text conflict in sql/slave.cc
Text conflict in sql/sql_priv.h
MYSQL_BIN_LOG m_table_map_version member and it's associated
functions were not used in the logic of binlogging and replication,
this patch removed all related code.
sql/log.cc:
removed unused m_table_map_version variable and functions
sql/log.h:
removed unused m_table_map_version variable and functions
sql/log_event.h:
Removed unused LOG_EVENT_UPDATE_TABLE_MAP_VERSION_F flag
sql/sql_class.cc:
Removed unused LOG_EVENT_UPDATE_TABLE_MAP_VERSION_F flag
sql/sql_load.cc:
Removed unused LOG_EVENT_UPDATE_TABLE_MAP_VERSION_F flag
sql/table.cc:
removed unused table_map_version variable
sql/table.h:
removed unused table_map_version variable
MYSQL_BIN_LOG m_table_map_version member and it's associated
functions were not used in the logic of binlogging and replication,
this patch removed all related code.
greedy_search optimizer_search_depth=0
The algorithm inside restore_prev_nj_state failed to
properly update the counters within the NESTED_JOIN
tree. The counter was decremented each time a table in the
node was removed from the QEP, the correct thing to do being
only to decrement it when the last table in the child node
was removed from the plan. This lead to node counters
getting negative values and the plan thus appeared
impossible. An assertion caught this.
Fixed by not recursing up the tree unless the last table in
the join nest node is removed from the plan
greedy_search optimizer_search_depth=0
The algorithm inside restore_prev_nj_state failed to
properly update the counters within the NESTED_JOIN
tree. The counter was decremented each time a table in the
node was removed from the QEP, the correct thing to do being
only to decrement it when the last table in the child node
was removed from the plan. This lead to node counters
getting negative values and the plan thus appeared
impossible. An assertion caught this.
Fixed by not recursing up the tree unless the last table in
the join nest node is removed from the plan
Docs/sp-imp-spec.txt:
New sql_mode added.
include/my_base.h:
Flag in frm of create options.
libmysqld/CMakeLists.txt:
New files added.
libmysqld/Makefile.am:
New files added.
mysql-test/r/events_bugs.result:
New sql_mode added.
mysql-test/r/information_schema.result:
New sql_mode added.
mysql-test/r/sp.result:
New sql_mode added.
mysql-test/r/system_mysql_db.result:
New sql_mode added.
mysql-test/suite/funcs_1/r/is_columns_mysql.result:
New sql_mode added.
mysql-test/suite/funcs_1/r/is_columns_mysql_embedded.result:
New sql_mode added.
mysql-test/t/events_bugs.test:
New sql_mode added.
mysql-test/t/sp.test:
New sql_mode added.
scripts/mysql_system_tables.sql:
New sql_mode added.
scripts/mysql_system_tables_fix.sql:
New sql_mode added.
sql/CMakeLists.txt:
New files added.
sql/Makefile.am:
New files added.
sql/event_db_repository.cc:
New sql_mode added.
sql/field.cc:
Create options support added.
sql/field.h:
Create options support added.
sql/ha_partition.cc:
Create options support added.
sql/handler.cc:
Create options support added.
sql/handler.h:
Create options support added.
sql/log_event.h:
New sql_mode added.
sql/mysql_priv.h:
New sql_mode added.
sql/mysqld.cc:
New sql_mode added.
sql/share/errmsg.txt:
New error messages added.
sql/sp.cc:
New sql_mode added.
sql/sp_head.cc:
Create options support added.
sql/sql_class.cc:
Create options support added.
Debug added.
sql/sql_class.h:
Create options support added.
sql/sql_insert.cc:
my_safe_a* moved to mysqld_priv.h
sql/sql_lex.h:
Create options support added.
sql/sql_parse.cc:
Create options support added.
sql/sql_show.cc:
Create options support added.
sql/sql_table.cc:
Create options support added.
sql/sql_view.cc:
New sql_mode added.
sql/sql_yacc.yy:
Create options support added.
sql/structs.h:
Create options support added.
sql/table.cc:
Create options support added.
sql/table.h:
Create options support added.
sql/unireg.cc:
Create options support added.
storage/example/ha_example.cc:
Create options example.
storage/example/ha_example.h:
Create options example.
storage/pbxt/src/discover_xt.cc:
Create options support added.
Adding my_global.h first in all files using
NO_EMBEDDED_ACCESS_CHECKS.
Correcting a merge problem resulting from a changed definition
of check_some_access compared to the original patches.
Adding my_global.h first in all files using
NO_EMBEDDED_ACCESS_CHECKS.
Correcting a merge problem resulting from a changed definition
of check_some_access compared to the original patches.
This patch:
- Moves all definitions from the mysql_priv.h file into
header files for the component where the variable is
defined
- Creates header files if the component lacks one
- Eliminates all include directives from mysql_priv.h
- Eliminates all circular include cycles
- Rename time.cc to sql_time.cc
- Rename mysql_priv.h to sql_priv.h
This patch:
- Moves all definitions from the mysql_priv.h file into
header files for the component where the variable is
defined
- Creates header files if the component lacks one
- Eliminates all include directives from mysql_priv.h
- Eliminates all circular include cycles
- Rename time.cc to sql_time.cc
- Rename mysql_priv.h to sql_priv.h
into partitioned MyISAM table
Problem was that the ha_data structure was introduced in 5.1
and only used for partitioning first, but with the intention
of be of use for others engines as well, and when used by other
engines it would clash if it also was partitioned.
Solution is to move the partitioning specific data to a separate
structure, with its own mutex (which is used for auto_increment).
Also did rename PARTITION_INFO to PARTITION_STATS since there
already exist a class named partition_info, also cleaned up
some related variables.
mysql-test/r/partition_binlog_stmt.result:
Bug#51851: Server with SBR locks mutex twice on LOAD DATA
into partitioned MyISAM table
New result file
mysql-test/t/partition_binlog_stmt.test:
Bug#51851: Server with SBR locks mutex twice on LOAD DATA
into partitioned MyISAM table
New result file
sql/ha_ndbcluster.cc:
Bug#51851: Server with SBR locks mutex twice on LOAD DATA
into partitioned MyISAM table
Rename of PARTITION_INFO to PARTITION_STATS to better
match the use (and there is also a class named
partition_info...)
sql/ha_ndbcluster.h:
Bug#51851: Server with SBR locks mutex twice on LOAD DATA
into partitioned MyISAM table
Rename of PARTITION_INFO to PARTITION_STATS to better
match the use (and there is also a class named
partition_info...)
sql/ha_partition.cc:
Bug#51851: Server with SBR locks mutex twice on LOAD DATA
into partitioned MyISAM table
Removed the partitioning engines use of ha_data in
TABLE_SHARE and added ha_part_data instead, since
they collide if used in the same time.
Rename of PARTITION_INFO to PARTITION_STATS to better
match the use (and there is also a class named
partition_info...)
Removed some dead code.
sql/ha_partition.h:
Bug#51851: Server with SBR locks mutex twice on LOAD DATA
into partitioned MyISAM table
Removed some dead code.
Rename of PARTITION_INFO to PARTITION_STATS to better
match the use (and there is also a class named
partition_info...)
Removed the partitioning engines use of ha_data in
TABLE_SHARE and added ha_part_data instead, since
they collide if used in the same time.
sql/handler.cc:
Bug#51851: Server with SBR locks mutex twice on LOAD DATA
into partitioned MyISAM table
Rename of PARTITION_INFO to PARTITION_STATS to better
match the use (and there is also a class named
partition_info...)
sql/handler.h:
Bug#51851: Server with SBR locks mutex twice on LOAD DATA
into partitioned MyISAM table
Rename of PARTITION_INFO to PARTITION_STATS to better
match the use (and there is also a class named
partition_info...)
sql/mysql_priv.h:
Bug#51851: Server with SBR locks mutex twice on LOAD DATA
into partitioned MyISAM table
Removed the partitioning engines use of ha_data in
TABLE_SHARE and added ha_part_data instead, since
they collide if used in the same time.
Added key_PARTITION_LOCK_auto_inc for instrumentation.
sql/mysqld.cc:
Bug#51851: Server with SBR locks mutex twice on LOAD DATA
into partitioned MyISAM table
Removed the partitioning engines use of ha_data in
TABLE_SHARE and added ha_part_data instead, since
they collide if used in the same time.
Added key_PARTITION_LOCK_auto_inc for instrumentation.
sql/partition_info.h:
Bug#51851: Server with SBR locks mutex twice on LOAD DATA
into partitioned MyISAM table
Removed part_state* since it was not in use.
sql/sql_partition.cc:
Bug#51851: Server with SBR locks mutex twice on LOAD DATA
into partitioned MyISAM table
Removed part_state* since it was not in use.
sql/sql_partition.h:
Bug#51851: Server with SBR locks mutex twice on LOAD DATA
into partitioned MyISAM table
Cleaned up old commented out code.
Removed part_state* since it was not in use.
sql/sql_show.cc:
Bug#51851: Server with SBR locks mutex twice on LOAD DATA
into partitioned MyISAM table
Rename of PARTITION_INFO to PARTITION_STATS to better
match the use (and there is also a class named
partition_info...)
Renamed partition_info to partition_info_str, since
partition_info is a name of a class.
sql/sql_table.cc:
Bug#51851: Server with SBR locks mutex twice on LOAD DATA
into partitioned MyISAM table
Renamed partition_info to partition_info_str, since
partition_info is a name of a class.
sql/table.cc:
Bug#51851: Server with SBR locks mutex twice on LOAD DATA
into partitioned MyISAM table
Removed the partitioning engines use of ha_data in
TABLE_SHARE and added ha_part_data instead, since
they collide if used in the same time.
Renamed partition_info to partition_info_str, since
partition_info is a name of a class.
removed part_state* since it was not in use.
sql/table.h:
Bug#51851: Server with SBR locks mutex twice on LOAD DATA
into partitioned MyISAM table
Removed the partitioning engines use of ha_data in
TABLE_SHARE and added ha_part_data instead, since
they collide if used in the same time.
Renamed partition_info to partition_info_str, since
partition_info is a name of a class.
removed part_state* since it was not in use.
into partitioned MyISAM table
Problem was that the ha_data structure was introduced in 5.1
and only used for partitioning first, but with the intention
of be of use for others engines as well, and when used by other
engines it would clash if it also was partitioned.
Solution is to move the partitioning specific data to a separate
structure, with its own mutex (which is used for auto_increment).
Also did rename PARTITION_INFO to PARTITION_STATS since there
already exist a class named partition_info, also cleaned up
some related variables.
myisam-recover options changed from OFF to 'DEFAULT' to get less change of data loss when using MyISAM.
(The disadvantage is that changed MyISAM tables will be checked at access time; Use --myisam-recover=OFF for old behavior)
Don't call extra(HA_EXTRA_FORCE_REOPEN) in ALTER TABLE if table is locked as this will mark table as crashed!
Added assert to detect if we accidently would use MyISAM versioning in MySQL
include/my_base.h:
Mark NOT_USED as USED, as we now use this as a flag to not call extra()
mysql-test/mysql-test-run.pl:
Don't write all options when there is something wrong with the arguments
mysql-test/r/sp-destruct.result:
Add missing flush of mysql.proc (as the test copied live tables)
mysql-test/r/variables.result:
myisam-recover options changed to 'default'
mysql-test/r/view.result:
Don't show create time in result
mysql-test/suite/maria/t/maria-recovery2-master.opt:
Don't run test with myisam-recover (as this produces extra warnings during simulated death)
mysql-test/t/sp-destruct.test:
Add missing flush of mysql.proc (as the test copied live tables)
mysql-test/t/view.test:
Don't show create time in result
sql/lock.cc:
Added marker if table was deleted to argument list
sql/mysql_priv.h:
Added marker if table was deleted to argument list
sql/mysqld.cc:
myisam-recover options changed from OFF to 'DEFAULT' to get less change of data loss when using MyISAM
Allow one to specify OFF as argument to myisam-recover (was default before but one couldn't specify it)
sql/sql_base.cc:
Mark if table is going to be deleted
sql/sql_delete.cc:
Mark if table is going to be deleted
sql/sql_table.cc:
Mark if table is going to be deleted
Don't call extra(HA_EXTRA_FORCE_REOPEN) in ALTER TABLE if table is locked as this will mark table as crashed!
sql/table.cc:
Signal to handler if table is getting deleted as part of getting droped from table cache.
sql/table.h:
Added marker if table is going to be deleted.
storage/maria/ha_maria.cc:
Don't search for transaction handler if file is not transactional or outside of transaction
(Fixed possible core dump)
storage/maria/ma_blockrec.c:
Don't write changed information if table is going to be deleted.
storage/maria/ma_close.c:
Don't write changed information if table is going to be deleted.
storage/maria/ma_extra.c:
Mark tables that are deleted as crased, to ensure good behavior on restart if we suddenly crash.
storage/maria/ma_locking.c:
Cleanup
storage/maria/ma_recovery.c:
We need trnman to be inited during redo phase (to be able to open tables checked with maria_chk)
storage/maria/maria_def.h:
Added marker if table is going to be deleted.
storage/myisam/mi_close.c:
Don't write changed information if table is going to be deleted.
storage/myisam/mi_extra.c:
Mark tables that are deleted as crased, to ensure good behavior on restart if we suddenly crash.
storage/myisam/mi_open.c:
Added assert to detect if we accidently would use MyISAM versioning in MySQL
storage/myisam/myisamdef.h:
Added marker if table is going to be deleted.
Add a wait-for graph based deadlock detector to the
MDL subsystem.
Fixes bug #46272 "MySQL 5.4.4, new MDL: unnecessary deadlock" and
bug #37346 "innodb does not detect deadlock between update and
alter table".
The first bug manifested itself as an unwarranted abort of a
transaction with ER_LOCK_DEADLOCK error by a concurrent ALTER
statement, when this transaction tried to repeat use of a
table, which it has already used in a similar fashion before
ALTER started.
The second bug showed up as a deadlock between table-level
locks and InnoDB row locks, which was "detected" only after
innodb_lock_wait_timeout timeout.
A transaction would start using the table and modify a few
rows.
Then ALTER TABLE would come in, and start copying rows
into a temporary table. Eventually it would stumble on
the modified records and get blocked on a row lock.
The first transaction would try to do more updates, and get
blocked on thr_lock.c lock.
This situation of circular wait would only get resolved
by a timeout.
Both these bugs stemmed from inadequate solutions to the
problem of deadlocks occurring between different
locking subsystems.
In the first case we tried to avoid deadlocks between metadata
locking and table-level locking subsystems, when upgrading shared
metadata lock to exclusive one.
Transactions holding the shared lock on the table and waiting for
some table-level lock used to be aborted too aggressively.
We also allowed ALTER TABLE to start in presence of transactions
that modify the subject table. ALTER TABLE acquires
TL_WRITE_ALLOW_READ lock at start, and that block all writes
against the table (naturally, we don't want any writes to be lost
when switching the old and the new table). TL_WRITE_ALLOW_READ
lock, in turn, would block the started transaction on thr_lock.c
lock, should they do more updates. This, again, lead to the need
to abort such transactions.
The second bug occurred simply because we didn't have any
mechanism to detect deadlocks between the table-level locks
in thr_lock.c and row-level locks in InnoDB, other than
innodb_lock_wait_timeout.
This patch solves both these problems by moving lock conflicts
which are causing these deadlocks into the metadata locking
subsystem, thus making it possible to avoid or detect such
deadlocks inside MDL.
To do this we introduce new type-of-operation-aware metadata
locks, which allow MDL subsystem to know not only the fact that
transaction has used or is going to use some object but also what
kind of operation it has carried out or going to carry out on the
object.
This, along with the addition of a special kind of upgradable
metadata lock, allows ALTER TABLE to wait until all
transactions which has updated the table to go away.
This solves the second issue.
Another special type of upgradable metadata lock is acquired
by LOCK TABLE WRITE. This second lock type allows to solve the
first issue, since abortion of table-level locks in event of
DDL under LOCK TABLES becomes also unnecessary.
Below follows the list of incompatible changes introduced by
this patch:
- From now on, ALTER TABLE and CREATE/DROP TRIGGER SQL (i.e. those
statements that acquire TL_WRITE_ALLOW_READ lock)
wait for all transactions which has *updated* the table to
complete.
- From now on, LOCK TABLES ... WRITE, REPAIR/OPTIMIZE TABLE
(i.e. all statements which acquire TL_WRITE table-level lock) wait
for all transaction which *updated or read* from the table
to complete.
As a consequence, innodb_table_locks=0 option no longer applies
to LOCK TABLES ... WRITE.
- DROP DATABASE, DROP TABLE, RENAME TABLE no longer abort
statements or transactions which use tables being dropped or
renamed, and instead wait for these transactions to complete.
- Since LOCK TABLES WRITE now takes a special metadata lock,
not compatible with with reads or writes against the subject table
and transaction-wide, thr_lock.c deadlock avoidance algorithm
that used to ensure absence of deadlocks between LOCK TABLES
WRITE and other statements is no longer sufficient, even for
MyISAM. The wait-for graph based deadlock detector of MDL
subsystem may sometimes be necessary and is involved. This may
lead to ER_LOCK_DEADLOCK error produced for multi-statement
transactions even if these only use MyISAM:
session 1: session 2:
begin;
update t1 ... lock table t2 write, t1 write;
-- gets a lock on t2, blocks on t1
update t2 ...
(ER_LOCK_DEADLOCK)
- Finally, support of LOW_PRIORITY option for LOCK TABLES ... WRITE
was abandoned.
LOCK TABLE ... LOW_PRIORITY WRITE from now on has the same
priority as the usual LOCK TABLE ... WRITE.
SELECT HIGH PRIORITY no longer trumps LOCK TABLE ... WRITE in
the wait queue.
- We do not take upgradable metadata locks on implicitly
locked tables. So if one has, say, a view v1 that uses
table t1, and issues:
LOCK TABLE v1 WRITE;
FLUSH TABLE t1; -- (or just 'FLUSH TABLES'),
an error is produced.
In order to be able to perform DDL on a table under LOCK TABLES,
the table must be locked explicitly in the LOCK TABLES list.
mysql-test/include/handler.inc:
Adjusted test case to trigger an execution path on which bug 41110
"crash with handler command when used concurrently with alter
table" and bug 41112 "crash in mysql_ha_close_table/get_lock_data
with alter table" were originally discovered. Left old test case
which no longer triggers this execution path for the sake of
coverage.
Added test coverage for HANDLER SQL statements and type-aware
metadata locks.
Added a test for the global shared lock and HANDLER SQL.
Updated tests to take into account that the old simple deadlock
detection heuristics was replaced with a graph-based deadlock
detector.
mysql-test/r/debug_sync.result:
Updated results (see debug_sync.test).
mysql-test/r/handler_innodb.result:
Updated results (see handler.inc test).
mysql-test/r/handler_myisam.result:
Updated results (see handler.inc test).
mysql-test/r/innodb-lock.result:
Updated results (see innodb-lock.test).
mysql-test/r/innodb_mysql_lock.result:
Updated results (see innodb_mysql_lock.test).
mysql-test/r/lock.result:
Updated results (see lock.test).
mysql-test/r/lock_multi.result:
Updated results (see lock_multi.test).
mysql-test/r/lock_sync.result:
Updated results (see lock_sync.test).
mysql-test/r/mdl_sync.result:
Updated results (see mdl_sync.test).
mysql-test/r/sp-threads.result:
SHOW PROCESSLIST output has changed due to the fact that waiting
for LOCK TABLES WRITE now happens within metadata locking
subsystem.
mysql-test/r/truncate_coverage.result:
Updated results (see truncate_coverage.test).
mysql-test/suite/funcs_1/datadict/processlist_val.inc:
SELECT FROM I_S.PROCESSLIST output has changed due to fact that
waiting for LOCK TABLES WRITE now happens within metadata locking
subsystem.
mysql-test/suite/funcs_1/r/processlist_val_no_prot.result:
SELECT FROM I_S.PROCESSLIST output has changed due to fact that
waiting for LOCK TABLES WRITE now happens within metadata locking
subsystem.
mysql-test/suite/rpl/t/rpl_sp.test:
Updated to a new SHOW PROCESSLIST state name.
mysql-test/t/debug_sync.test:
Use LOCK TABLES READ instead of LOCK TABLES WRITE as the latter
no longer allows to trigger execution path involving waiting on
thr_lock.c lock and therefore reaching debug sync-point covered
by this test.
mysql-test/t/innodb-lock.test:
Adjusted test case to the fact that innodb_table_locks=0 option is
no longer supported, since LOCK TABLES WRITE handles all its
conflicts within MDL subsystem.
mysql-test/t/innodb_mysql_lock.test:
Added test for bug #37346 "innodb does not detect deadlock between
update and alter table".
mysql-test/t/lock.test:
Added test coverage which checks the fact that we no longer support
DDL under LOCK TABLES on tables which were locked implicitly.
Adjusted existing test cases accordingly.
mysql-test/t/lock_multi.test:
Added test for bug #46272 "MySQL 5.4.4, new MDL: unnecessary
deadlock". Adjusted other test cases to take into account the
fact that waiting for LOCK TABLES ... WRITE now happens within MDL
subsystem.
mysql-test/t/lock_sync.test:
Since LOCK TABLES ... WRITE now takes SNRW metadata lock for
tables locked explicitly we have to implicitly lock InnoDB tables
(through view) to trigger the table-level lock conflict between
TL_WRITE and TL_WRITE_ALLOW_WRITE.
mysql-test/t/mdl_sync.test:
Added basic test coverage for type-of-operation-aware metadata
locks. Also covered with tests some use cases involving HANDLER
statements in which a deadlock could arise.
Adjusted existing tests to take type-of-operation-aware MDL into
account.
mysql-test/t/multi_update.test:
Update to a new SHOW PROCESSLIST state name.
mysql-test/t/truncate_coverage.test:
Adjusted test case after making LOCK TABLES WRITE to wait until
transactions that use the table to be locked are completed.
Updated to the changed name of DEBUG_SYNC point.
sql/handler.cc:
Global read lock functionality has been
moved into a class.
sql/lock.cc:
Global read lock functionality has been
moved into a class.
Updated code to use the new MDL API.
sql/mdl.cc:
Introduced new type-of-operation aware metadata locks.
To do this:
- Changed MDL_lock to use one list for waiting requests and one
list for granted requests. For each list, added a bitmap
that holds information what lock types a list contains.
Added a helper class MDL_lock::List to manipulate with granted
and waited lists while keeping the bitmaps in sync
with list contents.
- Changed lock-compatibility functions to use bitmaps that
define compatibility.
- Introduced a graph based deadlock detector inspired by
waiting_threads.c from Maria implementation.
- Now that we have a deadlock detector, and no longer have
a global lock to protect individual lock objects, but rather
use an rw lock per object, removed redundant code for upgrade,
and the global read lock. Changed the MDL API to
no longer require the caller to acquire the global
intention exclusive lock by means of a separate method.
Removed a few more methods that became redundant.
- Removed deadlock detection heuristic, it has been made
obsolete by the deadlock detector.
- With operation-type-aware metadata locks, MDL subsystem has
become aware of potential conflicts between DDL and open
transactions. This made it possible to remove calls to
mysql_abort_transactions_with_shared_lock() from acquisition
paths for exclusive lock and lock upgrade. Now we can simply
wait for these transactions to complete without fear of
deadlock. Function mysql_lock_abort() has also become
unnecessary for all conflicting cases except when a DDL
conflicts with a connection that has an open HANDLER.
sql/mdl.h:
Introduced new type-of-operation aware metadata locks.
Introduced a graph based deadlock detector and supporting
methods.
Added comments.
God rid of redundant API calls.
Renamed m_lt_or_ha_sentinel to m_trans_sentinel,
since now it guards the global read lock as well as
LOCK TABLES and HANDLER locks.
sql/mysql_priv.h:
Moved the global read lock functionality into a
class.
Added MYSQL_OPEN_FORCE_SHARED_MDL flag which forces
open_tables() to take MDL_SHARED on tables instead of
metadata locks specified in the parser. We use this to
allow PREPARE run concurrently in presence of
LOCK TABLES ... WRITE.
Added signature for find_table_for_mdl_ugprade().
sql/set_var.cc:
Global read lock functionality has been
moved into a class.
sql/sp_head.cc:
When creating TABLE_LIST elements for prelocking or
system tables set the type of request for metadata
lock according to the operation that will be performed
on the table.
sql/sql_base.cc:
- Updated code to use the new MDL API.
- In order to avoid locks starvation we take upgradable
locks all at once. As result implicitly locked tables no
longer get an upgradable lock. Consequently DDL and FLUSH
TABLES for such tables is prohibited.
find_write_locked_table() was replaced by
find_table_for_mdl_upgrade() function.
open_table() was adjusted to return TABLE instance with
upgradable ticket when necessary.
- We no longer wait for all locks on OT_WAIT back off
action -- only on the lock that caused the wait
conflict. Moreover, now we distinguish cases when we
have to wait due to conflict in MDL and old version
of table in TDC.
- Upate mysql_notify_threads_having_share_locks()
to only abort thr_lock.c waits of threads that
have open HANDLERs, since lock conflicts with only
these threads now can lead to deadlocks not detectable
by the MDL deadlock detector.
- Remove mysql_abort_transactions_with_shared_locks()
which is no longer needed.
sql/sql_class.cc:
Global read lock functionality has been moved into a class.
Re-arranged code in THD::cleanup() to simplify assert.
sql/sql_class.h:
Introduced class to incapsulate global read lock
functionality.
Now sentinel in MDL subsystem guards the global read lock
as well as LOCK TABLES and HANDLER locks. Adjusted code
accordingly.
sql/sql_db.cc:
Global read lock functionality has been moved into a class.
sql/sql_delete.cc:
We no longer acquire upgradable metadata locks on tables
which are locked by LOCK TABLES implicitly. As result
TRUNCATE TABLE is no longer allowed for such tables.
Updated code to use the new MDL API.
sql/sql_handler.cc:
Inform MDL_context about presence of open HANDLERs.
Since HANLDERs break MDL protocol by acquiring table-level
lock while holding only S metadata lock on a table MDL
subsystem should take special care about such contexts (Now
this is the only case when mysql_lock_abort() is used).
sql/sql_parse.cc:
Global read lock functionality has been moved into a class.
Do not take upgradable metadata locks when opening tables
for CREATE TABLE SELECT as it is not necessary and limits
concurrency.
When initializing TABLE_LIST objects before adding them
to the table list set the type of request for metadata lock
according to the operation that will be performed on the
table.
We no longer acquire upgradable metadata locks on tables
which are locked by LOCK TABLES implicitly. As result FLUSH
TABLES is no longer allowed for such tables.
sql/sql_prepare.cc:
Use MYSQL_OPEN_FORCE_SHARED_MDL flag when opening
tables during PREPARE. This allows PREPARE to run
concurrently in presence of LOCK TABLES ... WRITE.
sql/sql_rename.cc:
Global read lock functionality has been moved into a class.
sql/sql_show.cc:
Updated code to use the new MDL API.
sql/sql_table.cc:
Global read lock functionality has been moved into a class.
We no longer acquire upgradable metadata locks on tables
which are locked by LOCK TABLES implicitly. As result DROP
TABLE is no longer allowed for such tables.
Updated code to use the new MDL API.
sql/sql_trigger.cc:
Global read lock functionality has been moved into a class.
We no longer acquire upgradable metadata locks on tables
which are locked by LOCK TABLES implicitly. As result
CREATE/DROP TRIGGER is no longer allowed for such tables.
Updated code to use the new MDL API.
sql/sql_view.cc:
Global read lock functionality has been moved into a class.
Fixed results of wrong merge that led to misuse of GLR API.
CREATE VIEW statement is not a commit statement.
sql/table.cc:
When resetting TABLE_LIST objects for PS or SP re-execution
set the type of request for metadata lock according to the
operation that will be performed on the table. Do the same
in auxiliary function initializing metadata lock requests
in a table list.
sql/table.h:
When initializing TABLE_LIST objects set the type of request
for metadata lock according to the operation that will be
performed on the table.
sql/transaction.cc:
Global read lock functionality has been moved into a class.
Add a wait-for graph based deadlock detector to the
MDL subsystem.
Fixes bug #46272 "MySQL 5.4.4, new MDL: unnecessary deadlock" and
bug #37346 "innodb does not detect deadlock between update and
alter table".
The first bug manifested itself as an unwarranted abort of a
transaction with ER_LOCK_DEADLOCK error by a concurrent ALTER
statement, when this transaction tried to repeat use of a
table, which it has already used in a similar fashion before
ALTER started.
The second bug showed up as a deadlock between table-level
locks and InnoDB row locks, which was "detected" only after
innodb_lock_wait_timeout timeout.
A transaction would start using the table and modify a few
rows.
Then ALTER TABLE would come in, and start copying rows
into a temporary table. Eventually it would stumble on
the modified records and get blocked on a row lock.
The first transaction would try to do more updates, and get
blocked on thr_lock.c lock.
This situation of circular wait would only get resolved
by a timeout.
Both these bugs stemmed from inadequate solutions to the
problem of deadlocks occurring between different
locking subsystems.
In the first case we tried to avoid deadlocks between metadata
locking and table-level locking subsystems, when upgrading shared
metadata lock to exclusive one.
Transactions holding the shared lock on the table and waiting for
some table-level lock used to be aborted too aggressively.
We also allowed ALTER TABLE to start in presence of transactions
that modify the subject table. ALTER TABLE acquires
TL_WRITE_ALLOW_READ lock at start, and that block all writes
against the table (naturally, we don't want any writes to be lost
when switching the old and the new table). TL_WRITE_ALLOW_READ
lock, in turn, would block the started transaction on thr_lock.c
lock, should they do more updates. This, again, lead to the need
to abort such transactions.
The second bug occurred simply because we didn't have any
mechanism to detect deadlocks between the table-level locks
in thr_lock.c and row-level locks in InnoDB, other than
innodb_lock_wait_timeout.
This patch solves both these problems by moving lock conflicts
which are causing these deadlocks into the metadata locking
subsystem, thus making it possible to avoid or detect such
deadlocks inside MDL.
To do this we introduce new type-of-operation-aware metadata
locks, which allow MDL subsystem to know not only the fact that
transaction has used or is going to use some object but also what
kind of operation it has carried out or going to carry out on the
object.
This, along with the addition of a special kind of upgradable
metadata lock, allows ALTER TABLE to wait until all
transactions which has updated the table to go away.
This solves the second issue.
Another special type of upgradable metadata lock is acquired
by LOCK TABLE WRITE. This second lock type allows to solve the
first issue, since abortion of table-level locks in event of
DDL under LOCK TABLES becomes also unnecessary.
Below follows the list of incompatible changes introduced by
this patch:
- From now on, ALTER TABLE and CREATE/DROP TRIGGER SQL (i.e. those
statements that acquire TL_WRITE_ALLOW_READ lock)
wait for all transactions which has *updated* the table to
complete.
- From now on, LOCK TABLES ... WRITE, REPAIR/OPTIMIZE TABLE
(i.e. all statements which acquire TL_WRITE table-level lock) wait
for all transaction which *updated or read* from the table
to complete.
As a consequence, innodb_table_locks=0 option no longer applies
to LOCK TABLES ... WRITE.
- DROP DATABASE, DROP TABLE, RENAME TABLE no longer abort
statements or transactions which use tables being dropped or
renamed, and instead wait for these transactions to complete.
- Since LOCK TABLES WRITE now takes a special metadata lock,
not compatible with with reads or writes against the subject table
and transaction-wide, thr_lock.c deadlock avoidance algorithm
that used to ensure absence of deadlocks between LOCK TABLES
WRITE and other statements is no longer sufficient, even for
MyISAM. The wait-for graph based deadlock detector of MDL
subsystem may sometimes be necessary and is involved. This may
lead to ER_LOCK_DEADLOCK error produced for multi-statement
transactions even if these only use MyISAM:
session 1: session 2:
begin;
update t1 ... lock table t2 write, t1 write;
-- gets a lock on t2, blocks on t1
update t2 ...
(ER_LOCK_DEADLOCK)
- Finally, support of LOW_PRIORITY option for LOCK TABLES ... WRITE
was abandoned.
LOCK TABLE ... LOW_PRIORITY WRITE from now on has the same
priority as the usual LOCK TABLE ... WRITE.
SELECT HIGH PRIORITY no longer trumps LOCK TABLE ... WRITE in
the wait queue.
- We do not take upgradable metadata locks on implicitly
locked tables. So if one has, say, a view v1 that uses
table t1, and issues:
LOCK TABLE v1 WRITE;
FLUSH TABLE t1; -- (or just 'FLUSH TABLES'),
an error is produced.
In order to be able to perform DDL on a table under LOCK TABLES,
the table must be locked explicitly in the LOCK TABLES list.
Queries optimized with GROUP_MIN_MAX didn't cleanup KEYREAD
optimization properly. As a result subsequent queries may
return incomplete rows (fields are initialized to default
values).
mysql-test/r/group_min_max.result:
A test case for BUG#49902.
mysql-test/t/group_min_max.test:
A test case for BUG#49902.
sql/opt_range.cc:
Refactor of KEYREAD optimization switch so that KEYREAD
handler state is in sync with st_table::key_read flag.
All SQL code is supposed to switch KEYREAD optimization
via st_table::set_keyread().
sql/opt_sum.cc:
Refactor of KEYREAD optimization switch so that KEYREAD
handler state is in sync with st_table::key_read flag.
All SQL code is supposed to switch KEYREAD optimization
via st_table::set_keyread().
sql/sql_select.cc:
Refactor of KEYREAD optimization switch so that KEYREAD
handler state is in sync with st_table::key_read flag.
All SQL code is supposed to switch KEYREAD optimization
via st_table::set_keyread().
sql/sql_update.cc:
Refactor of KEYREAD optimization switch so that KEYREAD
handler state is in sync with st_table::key_read flag.
All SQL code is supposed to switch KEYREAD optimization
via st_table::set_keyread().
sql/table.cc:
Refactor of KEYREAD optimization switch so that KEYREAD
handler state is in sync with st_table::key_read flag.
All SQL code is supposed to switch KEYREAD optimization
via st_table::set_keyread().
sql/table.h:
Refactor of KEYREAD optimization switch so that KEYREAD
handler state is in sync with st_table::key_read flag.
All SQL code is supposed to switch KEYREAD optimization
via st_table::set_keyread().
Queries optimized with GROUP_MIN_MAX didn't cleanup KEYREAD
optimization properly. As a result subsequent queries may
return incomplete rows (fields are initialized to default
values).
- Marked a couple of tests with --big
- Fixed xtradb/handler/ha_innodb.cc to call explain_filename()
storage/xtradb/handler/ha_innodb.cc:
Call explain_filename() to get proper names for partitioned tables
Bug #46654 False deadlock on concurrent DML/DDL with partitions,
inconsistent behavior
The problem was that if one connection is running a multi-statement
transaction which involves a single partitioned table, and another
connection attempts to alter the table, the first connection gets
ER_LOCK_DEADLOCK and cannot proceed anymore, even when the ALTER TABLE
statement in another connection has timed out or failed.
The reason for this was that the prepare phase for ALTER TABLE for
partitioned tables removed all instances of the table from the table
definition cache before it started waiting on the lock. The transaction
running in the first connection would notice this and report ER_LOCK_DEADLOCK.
This patch changes the prep_alter_part_table() ALTER TABLE code so that
tdc_remove_table() is no longer called. Instead, only the TABLE instance
changed by prep_alter_part_table() is marked as needing reopen.
The patch also removes an unnecessary call to tdc_remove_table() from
mysql_unpack_partition() as the changed TABLE object is destroyed by the
caller at a later point.
Test case added in partition_sync.test.
Bug #46654 False deadlock on concurrent DML/DDL with partitions,
inconsistent behavior
The problem was that if one connection is running a multi-statement
transaction which involves a single partitioned table, and another
connection attempts to alter the table, the first connection gets
ER_LOCK_DEADLOCK and cannot proceed anymore, even when the ALTER TABLE
statement in another connection has timed out or failed.
The reason for this was that the prepare phase for ALTER TABLE for
partitioned tables removed all instances of the table from the table
definition cache before it started waiting on the lock. The transaction
running in the first connection would notice this and report ER_LOCK_DEADLOCK.
This patch changes the prep_alter_part_table() ALTER TABLE code so that
tdc_remove_table() is no longer called. Instead, only the TABLE instance
changed by prep_alter_part_table() is marked as needing reopen.
The patch also removes an unnecessary call to tdc_remove_table() from
mysql_unpack_partition() as the changed TABLE object is destroyed by the
caller at a later point.
Test case added in partition_sync.test.
Bug#42546 Backup: RESTORE fails, thinking it finds an existing table
The problem occured when a MDL locking conflict happened for a non-existent
table between a CREATE and a INSERT statement. The code for CREATE
interpreted this lock conflict to mean that the table existed,
which meant that the statement failed when it should not have.
The problem could occur for CREATE TABLE, CREATE TABLE LIKE and
ALTER TABLE RENAME.
This patch fixes the problem for CREATE TABLE and CREATE TABLE LIKE.
It is based on code backported from the mysql-6.1-fk tree written
by Dmitry Lenev. CREATE now uses normal open_and_lock_tables() code
to acquire exclusive locks. This means that for the test case in the bug
description, CREATE will wait until INSERT completes so that it can
get the exclusive lock. This resolves the reported bug.
The patch also prohibits CREATE TABLE and CREATE TABLE LIKE under
LOCK TABLES. Note that this is an incompatible change and must
be reflected in the documentation. Affected test cases have been
updated.
mdl_sync.test contains tests for CREATE TABLE and CREATE TABLE LIKE.
Fixing the issue for ALTER TABLE RENAME is beyond the scope of this
patch. ALTER TABLE cannot be prohibited from working under LOCK TABLES
as this could seriously impact customers and a proper fix would require
a significant rewrite.
Bug#42546 Backup: RESTORE fails, thinking it finds an existing table
The problem occured when a MDL locking conflict happened for a non-existent
table between a CREATE and a INSERT statement. The code for CREATE
interpreted this lock conflict to mean that the table existed,
which meant that the statement failed when it should not have.
The problem could occur for CREATE TABLE, CREATE TABLE LIKE and
ALTER TABLE RENAME.
This patch fixes the problem for CREATE TABLE and CREATE TABLE LIKE.
It is based on code backported from the mysql-6.1-fk tree written
by Dmitry Lenev. CREATE now uses normal open_and_lock_tables() code
to acquire exclusive locks. This means that for the test case in the bug
description, CREATE will wait until INSERT completes so that it can
get the exclusive lock. This resolves the reported bug.
The patch also prohibits CREATE TABLE and CREATE TABLE LIKE under
LOCK TABLES. Note that this is an incompatible change and must
be reflected in the documentation. Affected test cases have been
updated.
mdl_sync.test contains tests for CREATE TABLE and CREATE TABLE LIKE.
Fixing the issue for ALTER TABLE RENAME is beyond the scope of this
patch. ALTER TABLE cannot be prohibited from working under LOCK TABLES
as this could seriously impact customers and a proper fix would require
a significant rewrite.
------------------------------------------------------------
revno: 2617.68.25
committer: Dmitry Lenev <dlenev@mysql.com>
branch nick: mysql-next-bg-pre2-2
timestamp: Wed 2009-09-16 18:26:50 +0400
message:
Follow-up for one of pre-requisite patches for fixing bug #30977
"Concurrent statement using stored function and DROP FUNCTION
breaks SBR".
Made enum_mdl_namespace enum part of MDL_key class and removed MDL_
prefix from the names of enum members. In order to do the latter
changed name of PROCEDURE symbol to PROCEDURE_SYM (otherwise macro
which was automatically generated for this symbol conflicted with
MDL_key::PROCEDURE enum member).
------------------------------------------------------------
revno: 2617.68.25
committer: Dmitry Lenev <dlenev@mysql.com>
branch nick: mysql-next-bg-pre2-2
timestamp: Wed 2009-09-16 18:26:50 +0400
message:
Follow-up for one of pre-requisite patches for fixing bug #30977
"Concurrent statement using stored function and DROP FUNCTION
breaks SBR".
Made enum_mdl_namespace enum part of MDL_key class and removed MDL_
prefix from the names of enum members. In order to do the latter
changed name of PROCEDURE symbol to PROCEDURE_SYM (otherwise macro
which was automatically generated for this symbol conflicted with
MDL_key::PROCEDURE enum member).
A pre-requisite patch for Bug#30977 "Concurrent statement using
stored function and DROP FUNCTION breaks SBR".
This patch changes the MDL API by introducing a namespace for
lock keys: MDL_TABLE for tables and views and MDL_PROCEDURE
for stored procedures and functions. The latter is needed for
the fix for Bug#30977.
A pre-requisite patch for Bug#30977 "Concurrent statement using
stored function and DROP FUNCTION breaks SBR".
This patch changes the MDL API by introducing a namespace for
lock keys: MDL_TABLE for tables and views and MDL_PROCEDURE
for stored procedures and functions. The latter is needed for
the fix for Bug#30977.
----------------------------------------------------------
revno: 2617.69.21
committer: Konstantin Osipov <kostja@sun.com>
branch nick: 5.4-4284-1-assert
timestamp: Thu 2009-08-13 20:13:55 +0400
message:
A fix and a test case for Bug#46610 "MySQL 5.4.4: MyISAM MRG engine crash
on auto-repair of child".
Also fixes Bug#42862 "Crash on failed attempt to open a children of a
merge table".
MERGE engine needs to extend the global table list
with TABLE_LIST elements for child tables,
so that they are opened and locked.
Previously these table list elements were allocated
in memory of ha_myisammrg object (MERGE engine handler).
That would lead to access to freed memory in
recover_from_failed_open_table_attempt(), which would
try to recover a MERGE table child (MyISAM table)
and use for that TABLE_LIST of that child.
But by the time recover_from_failed_open_table_attempt()
is invoked, ha_myisammrg object that owns this
TABLE_LIST may be destroyed, and thus TABLE_LIST
memory freed.
The fix is to ensure that TABLE_LIST elements
that are added to the global table list (lex->query_tables)
are always allocated in thd->mem_root, which is not
destroyed until end of execution.
If previously TABLE_LIST elements were allocated
at ha_myisammrg::open() (i.e. when the TABLE
object was created and added to the table cache),
now they are allocated in ha_myisammrg::add_chidlren_list()
(i.e. right after "open" of the merge parent in
open_tables()).
We still create a list of children names
at ha_myisammrg::open() to use as a basis
for creation of TABLE_LISTs, that allows
to avoid reading the merge handler data
file on every execution.
mysql-test/r/merge_recover.result:
Test results for Bug#46610.
mysql-test/t/merge_recover-master.opt:
Option file for Bug#46610 test (need a new test because
of that option, which is not tested anywhere else).
mysql-test/t/merge_recover.test:
Add a test case for Bug#46610.
sql/table.h:
MERGE child child_def_version is now moved from TABLE_LIST
to MERGE engine specific data structure.
storage/myisammrg/ha_myisammrg.cc:
Introduce an auxiliary structure to keep MERGE child name
and definition version. A list of Mrg_child_def is created
in ha_myisammrg::open() and reused in ha_myisammrg::add_children_list().
----------------------------------------------------------
revno: 2617.69.21
committer: Konstantin Osipov <kostja@sun.com>
branch nick: 5.4-4284-1-assert
timestamp: Thu 2009-08-13 20:13:55 +0400
message:
A fix and a test case for Bug#46610 "MySQL 5.4.4: MyISAM MRG engine crash
on auto-repair of child".
Also fixes Bug#42862 "Crash on failed attempt to open a children of a
merge table".
MERGE engine needs to extend the global table list
with TABLE_LIST elements for child tables,
so that they are opened and locked.
Previously these table list elements were allocated
in memory of ha_myisammrg object (MERGE engine handler).
That would lead to access to freed memory in
recover_from_failed_open_table_attempt(), which would
try to recover a MERGE table child (MyISAM table)
and use for that TABLE_LIST of that child.
But by the time recover_from_failed_open_table_attempt()
is invoked, ha_myisammrg object that owns this
TABLE_LIST may be destroyed, and thus TABLE_LIST
memory freed.
The fix is to ensure that TABLE_LIST elements
that are added to the global table list (lex->query_tables)
are always allocated in thd->mem_root, which is not
destroyed until end of execution.
If previously TABLE_LIST elements were allocated
at ha_myisammrg::open() (i.e. when the TABLE
object was created and added to the table cache),
now they are allocated in ha_myisammrg::add_chidlren_list()
(i.e. right after "open" of the merge parent in
open_tables()).
We still create a list of children names
at ha_myisammrg::open() to use as a basis
for creation of TABLE_LISTs, that allows
to avoid reading the merge handler data
file on every execution.
----------------------------------------------------------
revno: 2617.69.20
committer: Konstantin Osipov <kostja@sun.com>
branch nick: 5.4-4284-1-assert
timestamp: Thu 2009-08-13 18:29:55 +0400
message:
WL#4284 "Transactional DDL locking"
A review fix.
Since WL#4284 implementation separated MDL_request and MDL_ticket,
MDL_request becamse a utility object necessary only to get a ticket.
Store it by-value in TABLE_LIST with the intent to merge
MDL_request::key with table_list->table_name and table_list->db
in future.
Change the MDL subsystem to not require MDL_requests to
stay around till close_thread_tables().
Remove the list of requests from the MDL context.
Requests for shared metadata locks acquired in open_tables()
are only used as a list in recover_from_failed_open_table_attempt(),
which calls mdl_context.wait_for_locks() for this list.
To keep such list for recover_from_failed_open_table_attempt(),
introduce a context class (Open_table_context), that collects
all requests.
A lot of minor cleanups and simplications that became possible
with this change.
sql/event_db_repository.cc:
Remove alloc_mdl_requests(). Now MDL_request instance is a member
of TABLE_LIST, and init_one_table() initializes it.
sql/ha_ndbcluster_binlog.cc:
Remove now unnecessary declaration and initialization
of binlog_mdl_request.
sql/lock.cc:
No need to allocate MDL requests in lock_table_names() now.
sql/log.cc:
Use init_one_table() method, remove alloc_mdl_requests(),
which is now unnecessary.
sql/log_event.cc:
No need to allocate mdl_request separately now.
Use init_one_table() method.
sql/log_event_old.cc:
Update to the new signature of close_tables_for_reopen().
sql/mdl.cc:
Update try_acquire_exclusive_lock() to be more easy to use.
Function lock_table_name_if_not_cached() has been removed.
Make acquire_shared_lock() signature consistent with
try_acquire_exclusive_lock() signature.
Remove methods that are no longer used.
Update comments.
sql/mdl.h:
Implement an assignment operator that doesn't
copy MDL_key (MDL_key::operator= is private and
should remain private).
This is a hack to work-around assignment of TABLE_LIST
by value in several places. Such assignments violate
encapsulation, since only perform a shallow copy.
In most cases these assignments are a hack on their own.
sql/mysql_priv.h:
Update signatures of close_thread_tables() and close_tables_for_reopen().
sql/sp.cc:
Allocate TABLE_LIST in thd->mem_root.
Use init_one_table().
sql/sp_head.cc:
Use init_one_table().
Remove thd->locked_tables_root, it's no longer needed.
sql/sql_acl.cc:
Use init_mdl_requests() and init_one_table().
sql/sql_base.cc:
Update to new signatures of try_acquire_shared_lock() and
try_acquire_exclusive_lock().
Remove lock_table_name_if_not_cached().
Fix a bug in open_ltable() that would not return ER_LOCK_DEADLOCK
in case of a failed lock_tables() and a multi-statement
transaction.
Fix a bug in open_and_lock_tables_derived() that would
not return ER_LOCK_DEADLOCK in case of a multi-statement
transaction and a failure of lock_tables().
Move assignment of enum_open_table_action to a method of Open_table_context, a new class that maintains information
for backoff actions.
Minor rearrangements of the code.
Remove alloc_mdl_requests() in functions that work with system
tables: instead the patch ensures that callers always initialize
TABLE_LIST argument.
sql/sql_class.cc:
THD::locked_tables_root is no more.
sql/sql_class.h:
THD::locked_tables_root is no more.
Add a declaration for Open_table_context class.
sql/sql_delete.cc:
Update to use the simplified MDL API.
sql/sql_handler.cc:
TABLE_LIST::mdl_request is stored by-value now.
Ensure that mdl_request.ticket is NULL for every request
that is passed into MDL, to satisfy MDL asserts.
@ sql/sql_help.cc
Function open_system_tables_for_read() no longer initializes
mdl_requests.
Move TABLE_LIST::mdl_request initialization closer to
TABLE_LIST initialization.
sql/sql_help.cc:
Function open_system_tables_for_read() no longer initializes
mdl_requests.
Move TABLE_LIST::mdl_request initialization closer to
TABLE_LIST initialization.
sql/sql_insert.cc:
Remove assignment by-value of TABLE_LIST in
TABLEOP_HOOKS. We can't carry over a granted
MDL ticket from one table list to another.
sql/sql_parse.cc:
Change alloc_mdl_requests() -> init_mdl_requests().
@todo We can remove init_mdl_requests() altogether
in some places: all places that call add_table_to_list()
already have mdl requests initialized.
sql/sql_plugin.cc:
Use init_one_table().
THD::locked_tables_root is no more.
sql/sql_servers.cc:
Use init_one_table().
sql/sql_show.cc:
Update acquire_high_priority_shared_lock() to use
TABLE_LIST::mdl_request rather than allocate an own.
Fix get_trigger_table_impl() to use init_one_table(),
check for out of memory, follow the coding style.
sql/sql_table.cc:
Update to work with TABLE_LIST::mdl_request by-value.
Remove lock_table_name_if_not_cached().
The code that used to delegate to it is quite simple and
concise without it now.
sql/sql_udf.cc:
Use init_one_table().
sql/sql_update.cc:
Update to use the new signature of close_tables_for_reopen().
sql/table.cc:
Move re-setting of mdl_requests for prepared statements
and stored procedures from close_thread_tables() to
reinit_stmt_before_use().
Change alloc_mdl_requests() to init_mdl_requests().
init_mdl_requests() is a hack that can't be deleted
until we don't have a list-aware TABLE_LIST constructor.
Hopefully its use will be minimal
sql/table.h:
Change alloc_mdl_requests() to init_mdl_requests()
TABLE_LIST::mdl_request is stored by value.
sql/tztime.cc:
We no longer initialize mdl requests in open_system_tables_for*()
functions. Move this initialization closer to initialization
of the rest of TABLE_LIST members.
storage/myisammrg/ha_myisammrg.cc:
Simplify mdl_request initialization.
----------------------------------------------------------
revno: 2617.69.20
committer: Konstantin Osipov <kostja@sun.com>
branch nick: 5.4-4284-1-assert
timestamp: Thu 2009-08-13 18:29:55 +0400
message:
WL#4284 "Transactional DDL locking"
A review fix.
Since WL#4284 implementation separated MDL_request and MDL_ticket,
MDL_request becamse a utility object necessary only to get a ticket.
Store it by-value in TABLE_LIST with the intent to merge
MDL_request::key with table_list->table_name and table_list->db
in future.
Change the MDL subsystem to not require MDL_requests to
stay around till close_thread_tables().
Remove the list of requests from the MDL context.
Requests for shared metadata locks acquired in open_tables()
are only used as a list in recover_from_failed_open_table_attempt(),
which calls mdl_context.wait_for_locks() for this list.
To keep such list for recover_from_failed_open_table_attempt(),
introduce a context class (Open_table_context), that collects
all requests.
A lot of minor cleanups and simplications that became possible
with this change.
2617.31.12, 2617.31.15, 2617.31.15, 2617.31.16, 2617.43.1
- initial changeset that introduced the fix for
Bug#989 and follow up fixes for all test suite failures
introduced in the initial changeset.
------------------------------------------------------------
revno: 2617.31.1
committer: Davi Arnaut <Davi.Arnaut@Sun.COM>
branch nick: 4284-6.0
timestamp: Fri 2009-03-06 19:17:00 -0300
message:
Bug#989: If DROP TABLE while there's an active transaction, wrong binlog order
WL#4284: Transactional DDL locking
Currently the MySQL server does not keep metadata locks on
schema objects for the duration of a transaction, thus failing
to guarantee the integrity of the schema objects being used
during the transaction and to protect then from concurrent
DDL operations. This also poses a problem for replication as
a DDL operation might be replicated even thought there are
active transactions using the object being modified.
The solution is to defer the release of metadata locks until
a active transaction is either committed or rolled back. This
prevents other statements from modifying the table for the
entire duration of the transaction. This provides commitment
ordering for guaranteeing serializability across multiple
transactions.
- Incompatible change:
If MySQL's metadata locking system encounters a lock conflict,
the usual schema is to use the try and back-off technique to
avoid deadlocks -- this schema consists in releasing all locks
and trying to acquire them all in one go.
But in a transactional context this algorithm can't be utilized
as its not possible to release locks acquired during the course
of the transaction without breaking the transaction commitments.
To avoid deadlocks in this case, the ER_LOCK_DEADLOCK will be
returned if a lock conflict is encountered during a transaction.
Let's consider an example:
A transaction has two statements that modify table t1, then table
t2, and then commits. The first statement of the transaction will
acquire a shared metadata lock on table t1, and it will be kept
utill COMMIT to ensure serializability.
At the moment when the second statement attempts to acquire a
shared metadata lock on t2, a concurrent ALTER or DROP statement
might have locked t2 exclusively. The prescription of the current
locking protocol is that the acquirer of the shared lock backs off
-- gives up all his current locks and retries. This implies that
the entire multi-statement transaction has to be rolled back.
- Incompatible change:
FLUSH commands such as FLUSH PRIVILEGES and FLUSH TABLES WITH READ
LOCK won't cause locked tables to be implicitly unlocked anymore.
mysql-test/extra/binlog_tests/drop_table.test:
Add test case for Bug#989.
mysql-test/extra/binlog_tests/mix_innodb_myisam_binlog.test:
Fix test case to reflect the fact that transactions now hold
metadata locks for the duration of a transaction.
mysql-test/include/mix1.inc:
Fix test case to reflect the fact that transactions now hold
metadata locks for the duration of a transaction.
mysql-test/include/mix2.inc:
Fix test case to reflect the fact that transactions now hold
metadata locks for the duration of a transaction.
mysql-test/r/flush_block_commit.result:
Update test case result (WL#4284).
mysql-test/r/flush_block_commit_notembedded.result:
Update test case result (WL#4284).
mysql-test/r/innodb.result:
Update test case result (WL#4284).
mysql-test/r/innodb_mysql.result:
Update test case result (WL#4284).
mysql-test/r/lock.result:
Add test case result for an effect of WL#4284/Bug#989
(all locks should be released when a connection terminates).
mysql-test/r/mix2_myisam.result:
Update test case result (effects of WL#4284/Bug#989).
mysql-test/r/not_embedded_server.result:
Update test case result (effects of WL#4284/Bug#989).
Add a test case for interaction of WL#4284 and FLUSH PRIVILEGES.
mysql-test/r/partition_innodb_semi_consistent.result:
Update test case result (effects of WL#4284/Bug#989).
mysql-test/r/partition_sync.result:
Temporarily disable the test case for Bug#43867,
which will be fixed by a subsequent backport.
mysql-test/r/ps.result:
Add a test case for effect of PREPARE on transactional
locks: we take a savepoint at beginning of PREAPRE
and release it at the end. Thus PREPARE does not
accumulate metadata locks (Bug#989/WL#4284).
mysql-test/r/read_only_innodb.result:
Update test case result (effects of WL#4284/Bug#989).
mysql-test/suite/binlog/r/binlog_row_drop_tbl.result:
Add a test case result (WL#4284/Bug#989).
mysql-test/suite/binlog/r/binlog_row_mix_innodb_myisam.result:
Update test case result (effects of WL#4284/Bug#989).
mysql-test/suite/binlog/r/binlog_stm_drop_tbl.result:
Add a test case result (WL#4284/Bug#989).
mysql-test/suite/binlog/r/binlog_stm_mix_innodb_myisam.result:
Update test case result (effects of WL#4284/Bug#989).
mysql-test/suite/binlog/r/binlog_unsafe.result:
A side effect of Bug#989 -- slightly different table map ids.
mysql-test/suite/binlog/t/binlog_row_drop_tbl.test:
Add a test case for WL#4284/Bug#989.
mysql-test/suite/binlog/t/binlog_stm_drop_tbl.test:
Add a test case for WL#4284/Bug#989.
mysql-test/suite/binlog/t/binlog_stm_row.test:
Update to the new state name. This
is actually a follow up to another patch for WL#4284,
that changes Locked thread state to Table lock.
mysql-test/suite/ndb/r/ndb_index_ordered.result:
Remove result for disabled part of the test case.
mysql-test/suite/ndb/t/disabled.def:
Temporarily disable a test case (Bug#45621).
mysql-test/suite/ndb/t/ndb_index_ordered.test:
Disable a part of a test case (needs update to
reflect semantics of Bug#989).
mysql-test/suite/rpl/t/disabled.def:
Disable tests made meaningless by transactional metadata
locking.
mysql-test/suite/sys_vars/r/autocommit_func.result:
Add a commit (Bug#989).
mysql-test/suite/sys_vars/t/autocommit_func.test:
Add a commit (Bug#989).
mysql-test/t/flush_block_commit.test:
Fix test case to reflect the fact that transactions now hold
metadata locks for the duration of a transaction.
mysql-test/t/flush_block_commit_notembedded.test:
Fix test case to reflect the fact that transactions now hold
metadata locks for the duration of a transaction.
Add a test case for transaction-scope locks and the global
read lock (Bug#989/WL#4284).
mysql-test/t/innodb.test:
Fix test case to reflect the fact that transactions now hold
metadata locks for the duration of a transaction
(effects of Bug#989/WL#4284).
mysql-test/t/lock.test:
Add a test case for Bug#989/WL#4284.
mysql-test/t/not_embedded_server.test:
Add a test case for Bug#989/WL#4284.
mysql-test/t/partition_innodb_semi_consistent.test:
Replace TRUNCATE with DELETE, to not issue
an implicit commit of a transaction, and not depend
on metadata locks.
mysql-test/t/partition_sync.test:
Temporarily disable the test case for Bug#43867,
which needs a fix to be backported from 6.0.
mysql-test/t/ps.test:
Add a test case for semantics of PREPARE and transaction-scope
locks: metadata locks on tables used in PREPARE are enclosed into a
temporary savepoint, taken at the beginning of PREPARE,
and released at the end. Thus PREPARE does not effect
what locks a transaction owns.
mysql-test/t/read_only_innodb.test:
Fix test case to reflect the fact that transactions now hold
metadata locks for the duration of a transaction
(Bug#989/WL#4284).
Wait for the read_only statement to actually flush tables before
sending other concurrent statements that depend on its state.
mysql-test/t/xa.test:
Fix test case to reflect the fact that transactions now hold
metadata locks for the duration of a transaction
(Bug#989/WL#4284).
sql/ha_ndbcluster_binlog.cc:
Backport bits of changes of ha_ndbcluster_binlog.cc
from 6.0, to fix the failing binlog test suite with
WL#4284. WL#4284 implementation does not work
with 5.1 implementation of ndbcluster binlog index.
sql/log_event.cc:
Release metadata locks after issuing a commit.
sql/mdl.cc:
Style changes (WL#4284).
sql/mysql_priv.h:
Rename parameter to match the name used in the definition (WL#4284).
sql/rpl_injector.cc:
Release metadata locks on commit (WL#4284).
sql/rpl_rli.cc:
Remove assert made meaningless, metadata locks are released
at the end of the transaction.
sql/set_var.cc:
Close tables and release locks if autocommit mode is set.
sql/slave.cc:
Release metadata locks after a rollback.
sql/sql_acl.cc:
Don't implicitly unlock locked tables. Issue a implicit commit
at the end and unlock tables.
sql/sql_base.cc:
Defer the release of metadata locks when closing tables
if not required to.
Issue a deadlock error if the locking protocol requires
that a transaction re-acquire its locks.
Release metadata locks when closing tables for reopen.
sql/sql_class.cc:
Release metadata locks if the thread is killed.
sql/sql_parse.cc:
Release metadata locks after implicitly committing a active
transaction, or after explicit commits or rollbacks.
sql/sql_plugin.cc:
Allocate MDL request on the stack as the use of the table
is contained within the function. It will be removed from
the context once close_thread_tables is called at the end
of the function.
sql/sql_prepare.cc:
The problem is that the prepare phase of the CREATE TABLE
statement takes a exclusive metadata lock lock and this can
cause a self-deadlock the thread already holds a shared lock
on the table being that should be created.
The solution is to make the prepare phase take a shared
metadata lock when preparing a CREATE TABLE statement. The
execution of the statement will still acquire a exclusive
lock, but won't cause any problem as it issues a implicit
commit.
After some discussions with stakeholders it has been decided that
metadata locks acquired during a PREPARE statement must be released
once the statement is prepared even if it is prepared within a multi
statement transaction.
sql/sql_servers.cc:
Don't implicitly unlock locked tables. Issue a implicit commit
at the end and unlock tables.
sql/sql_table.cc:
Close table and release metadata locks after a admin operation.
sql/table.h:
The problem is that the prepare phase of the CREATE TABLE
statement takes a exclusive metadata lock lock and this can
cause a self-deadlock the thread already holds a shared lock
on the table being that should be created.
The solution is to make the prepare phase take a shared
metadata lock when preparing a CREATE TABLE statement. The
execution of the statement will still acquire a exclusive
lock, but won't cause any problem as it issues a implicit
commit.
sql/transaction.cc:
Release metadata locks after the implicitly committed due
to a new transaction being started. Also, release metadata
locks acquired after a savepoint if the transaction is rolled
back to the save point.
The problem is that in some cases transaction-long metadata locks
could be released before the transaction was committed. This could
happen when a active transaction was ended by a "START TRANSACTION"
or "BEGIN" statement, in which case the metadata locks would be
released before the actual commit of the active transaction.
The solution is to defer the release of metadata locks to after the
transaction has been implicitly committed. No test case is provided
as the effort to provide one is too disproportional to the size of
the fix.
2617.31.12, 2617.31.15, 2617.31.15, 2617.31.16, 2617.43.1
- initial changeset that introduced the fix for
Bug#989 and follow up fixes for all test suite failures
introduced in the initial changeset.
------------------------------------------------------------
revno: 2617.31.1
committer: Davi Arnaut <Davi.Arnaut@Sun.COM>
branch nick: 4284-6.0
timestamp: Fri 2009-03-06 19:17:00 -0300
message:
Bug#989: If DROP TABLE while there's an active transaction, wrong binlog order
WL#4284: Transactional DDL locking
Currently the MySQL server does not keep metadata locks on
schema objects for the duration of a transaction, thus failing
to guarantee the integrity of the schema objects being used
during the transaction and to protect then from concurrent
DDL operations. This also poses a problem for replication as
a DDL operation might be replicated even thought there are
active transactions using the object being modified.
The solution is to defer the release of metadata locks until
a active transaction is either committed or rolled back. This
prevents other statements from modifying the table for the
entire duration of the transaction. This provides commitment
ordering for guaranteeing serializability across multiple
transactions.
- Incompatible change:
If MySQL's metadata locking system encounters a lock conflict,
the usual schema is to use the try and back-off technique to
avoid deadlocks -- this schema consists in releasing all locks
and trying to acquire them all in one go.
But in a transactional context this algorithm can't be utilized
as its not possible to release locks acquired during the course
of the transaction without breaking the transaction commitments.
To avoid deadlocks in this case, the ER_LOCK_DEADLOCK will be
returned if a lock conflict is encountered during a transaction.
Let's consider an example:
A transaction has two statements that modify table t1, then table
t2, and then commits. The first statement of the transaction will
acquire a shared metadata lock on table t1, and it will be kept
utill COMMIT to ensure serializability.
At the moment when the second statement attempts to acquire a
shared metadata lock on t2, a concurrent ALTER or DROP statement
might have locked t2 exclusively. The prescription of the current
locking protocol is that the acquirer of the shared lock backs off
-- gives up all his current locks and retries. This implies that
the entire multi-statement transaction has to be rolled back.
- Incompatible change:
FLUSH commands such as FLUSH PRIVILEGES and FLUSH TABLES WITH READ
LOCK won't cause locked tables to be implicitly unlocked anymore.
----------------------------------------------------------
revno: 2617.23.20
committer: Konstantin Osipov <kostja@sun.com>
branch nick: mysql-6.0-runtime
timestamp: Wed 2009-03-04 16:31:31 +0300
message:
WL#4284 "Transactional DDL locking"
Review comments: "Objectify" the MDL API.
MDL_request and MDL_context still need manual construction and
destruction, since they are used in environment that is averse
to constructors/destructors.
sql/mdl.cc:
Improve comments.
Add asserts to backup()/restore_from_backup()/merge() methods.
Fix an order bug in the error path of mdl_acquire_exclusive_locks():
we used to first free a ticket object, and only then exclude
it from the list of tickets.
----------------------------------------------------------
revno: 2617.23.20
committer: Konstantin Osipov <kostja@sun.com>
branch nick: mysql-6.0-runtime
timestamp: Wed 2009-03-04 16:31:31 +0300
message:
WL#4284 "Transactional DDL locking"
Review comments: "Objectify" the MDL API.
MDL_request and MDL_context still need manual construction and
destruction, since they are used in environment that is averse
to constructors/destructors.
------------------------------------------------------------
revno: 2617.23.18
committer: Davi Arnaut <Davi.Arnaut@Sun.COM>
branch nick: 4284-6.0
timestamp: Mon 2009-03-02 18:18:26 -0300
message:
Bug#989: If DROP TABLE while there's an active transaction, wrong binlog order
WL#4284: Transactional DDL locking
This is a prerequisite patch:
These changes are intended to split lock requests from granted
locks and to allow the memory and lifetime of granted locks to
be managed within the MDL subsystem. Furthermore, tickets can
now be shared and therefore are used to satisfy multiple lock
requests, but only shared locks can be recursive.
The problem is that the MDL subsystem morphs lock requests into
granted locks locks but does not manage the memory and lifetime
of lock requests, and hence, does not manage the memory of
granted locks either. This can be problematic because it puts the
burden of tracking references on the users of the subsystem and
it can't be easily done in transactional contexts where the locks
have to be kept around for the duration of a transaction.
Another issue is that recursive locks (when the context trying to
acquire a lock already holds a lock on the same object) requires
that each time the lock is granted, a unique lock request/granted
lock structure structure must be kept around until the lock is
released. This can lead to memory leaks in transactional contexts
as locks taken during the transaction should only be released at
the end of the transaction. This also leads to unnecessary wake
ups (broadcasts) in the MDL subsystem if the context still holds
a equivalent of the lock being released.
These issues are exacerbated due to the fact that WL#4284 low-level
design says that the implementation should "2) Store metadata locks
in transaction memory root, rather than statement memory root" but
this is not possible because a memory root, as implemented in mysys,
requires all objects allocated from it to be freed all at once.
This patch combines review input and significant code contributions
from Konstantin Osipov (kostja) and Dmitri Lenev (dlenev).
mysql-test/r/mdl_sync.result:
Add test case result.
mysql-test/t/mdl_sync.test:
Add test case for shared lock upgrade case.
sql/event_db_repository.cc:
Rename mdl_alloc_lock to mdl_request_alloc.
sql/ha_ndbcluster_binlog.cc:
Use new function names to initialize MDL lock requests.
sql/lock.cc:
Rename MDL functions.
sql/log_event.cc:
The MDL request now holds the table and database name data (MDL_KEY).
sql/mdl.cc:
Move the MDL key to the MDL_LOCK structure in order to make the
object suitable for allocation from a fixed-size allocator. This
allows the simplification of the lists in the MDL_LOCK object,
which now are just two, one for granted tickets and other for
waiting (upgraders) tickets.
Recursive requests for a shared lock on the same object can now
be granted using the same lock ticket. This schema is only used
for shared locks because that the only case that matters. This
is used to avoid waste of resources in case a context (connection)
already holds a shared lock on a object.
sql/mdl.h:
Introduce a metadata lock object key which is used to uniquely
identify lock objects.
Separate the structure used to represent pending lock requests
from the structure used to represent granted metadata locks.
Rename functions used to manipulate locks requests in order to
have a more consistent function naming schema.
sql/sp_head.cc:
Rename mdl_alloc_lock to mdl_request_alloc.
sql/sql_acl.cc:
Rename alloc_mdl_locks to alloc_mdl_requests.
sql/sql_base.cc:
Various changes to accommodate that lock requests are separated
from lock tickets (granted locks).
sql/sql_class.h:
Last acquired lock before the savepoint was set.
sql/sql_delete.cc:
Various changes to accommodate that lock requests are separated
from lock tickets (granted locks).
sql/sql_handler.cc:
Various changes to accommodate that lock requests are separated
from lock tickets (granted locks).
sql/sql_insert.cc:
Rename alloc_mdl_locks to alloc_mdl_requests.
sql/sql_parse.cc:
Rename alloc_mdl_locks to alloc_mdl_requests.
sql/sql_plist.h:
Typedef for iterator type.
sql/sql_plugin.cc:
Rename alloc_mdl_locks to alloc_mdl_requests.
sql/sql_servers.cc:
Rename alloc_mdl_locks to alloc_mdl_requests.
sql/sql_show.cc:
Various changes to accommodate that lock requests are separated
from lock tickets (granted locks).
sql/sql_table.cc:
Various changes to accommodate that lock requests are separated
from lock tickets (granted locks).
sql/sql_trigger.cc:
Save reference to the lock ticket so it can be downgraded later.
sql/sql_udf.cc:
Rename alloc_mdl_locks to alloc_mdl_requests.
sql/table.cc:
Rename mdl_alloc_lock to mdl_request_alloc.
sql/table.h:
Separate MDL lock requests from lock tickets (granted locks).
storage/myisammrg/ha_myisammrg.cc:
Rename alloc_mdl_locks to alloc_mdl_requests.