Fall back to use ALTER TABLE for engines that doesn't support REPAIR when doing repair for upgrade.
Nicer output from mysql_upgrade and mysql_check
Updated all arrays that used NAME_LEN to use SAFE_NAME_LEN to ensure that we don't break things accidently as names can now have a #mysql50# prefix.
client/mysql_upgrade.c:
If we are using verbose, also run mysqlcheck in verbose mode.
client/mysqlcheck.c:
Add more information if running in verbose mode
Print 'Needs upgrade' instead of complex error if table needs to be upgraded
Don't write connect information if verbose is not 2 or above
mysql-test/r/drop.result:
Updated test and results as we now support full table names
mysql-test/r/grant.result:
Now you get a correct error message if using #mysql with paths
mysql-test/r/show_check.result:
Update results as table names can temporarly be bigger than NAME_LEN (during upgrade)
mysql-test/r/upgrade.result:
Test upgrade for long table names.
mysql-test/suite/funcs_1/r/is_tables_is.result:
Updated old test result (had note been updated in a while)
mysql-test/t/drop.test:
Updated test and results as we now support full table names
mysql-test/t/grant.test:
Now you get a correct error message if using #mysql with paths
mysql-test/t/upgrade.test:
Test upgrade for long table names.
sql/ha_partition.cc:
NAME_LEN -> SAFE_NAME_LEN
sql/item.cc:
NAME_LEN -> SAFE_NAME_LEN
sql/log_event.cc:
NAME_LEN -> SAFE_NAME_LEN
sql/mysql_priv.h:
Added SAFE_NAME_LEN
sql/rpl_filter.cc:
NAME_LEN -> SAFE_NAME_LEN
sql/sp.cc:
NAME_LEN -> SAFE_NAME_LEN
sql/sp_head.cc:
NAME_LEN -> SAFE_NAME_LEN
sql/sql_acl.cc:
NAME_LEN -> SAFE_NAME_LEN
sql/sql_base.cc:
NAME_LEN -> SAFE_NAME_LEN
sql/sql_connect.cc:
NAME_LEN -> SAFE_NAME_LEN
sql/sql_parse.cc:
NAME_LEN -> SAFE_NAME_LEN
sql/sql_prepare.cc:
NAME_LEN -> SAFE_NAME_LEN
sql/sql_select.cc:
NAME_LEN -> SAFE_NAME_LEN
sql/sql_show.cc:
NAME_LEN -> SAFE_NAME_LEN
Enlarge table names for SHOW TABLES to also include optional #mysql50#
sql/sql_table.cc:
Fall back to use ALTER TABLE for engines that doesn't support REPAIR when doing repair for upgrade.
sql/sql_trigger.cc:
NAME_LEN -> SAFE_NAME_LEN
sql/sql_udf.cc:
NAME_LEN -> SAFE_NAME_LEN
sql/sql_view.cc:
NAME_LEN -> SAFE_NAME_LEN
sql/table.cc:
Fixed check_table_name() to not count #mysql50# as part of name
If #mysql50# is part of the name, don't allow path characters in name.
- Changed to still use bcmp() in certain cases becasue
- Faster for short unaligneed strings than memcmp()
- Bettern when using valgrind
- Changed to use my_sprintf() instead of sprintf() to get higher portability for old systems
- Changed code to use MariaDB version of select->skip_record()
- Removed -%::SCCS/s.% from Makefile.am:s to remove automake warnings
As we check for the impossible partitions earlier, it's possible that we don't find any
suitable partitions at all. So this assertion just has to be corrected for this case.
per-file comments:
mysql-test/r/partition_innodb.result
Bug#55146 Assertion `m_part_spec.start_part == m_part_spec.end_part' in index_read_idx_map
test result updated.
mysql-test/t/partition_innodb.test
Bug#55146 Assertion `m_part_spec.start_part == m_part_spec.end_part' in index_read_idx_map
test case added.
sql/ha_partition.cc
Bug#55146 Assertion `m_part_spec.start_part == m_part_spec.end_part' in index_read_idx_map
Assertion changed to '>=' as the prune_partition_set() in the get_partition_set() can
do start_part= end_part+1 if no possible partitions were found.
As we check for the impossible partitions earlier, it's possible that we don't find any
suitable partitions at all. So this assertion just has to be corrected for this case.
per-file comments:
mysql-test/r/partition_innodb.result
Bug#55146 Assertion `m_part_spec.start_part == m_part_spec.end_part' in index_read_idx_map
test result updated.
mysql-test/t/partition_innodb.test
Bug#55146 Assertion `m_part_spec.start_part == m_part_spec.end_part' in index_read_idx_map
test case added.
sql/ha_partition.cc
Bug#55146 Assertion `m_part_spec.start_part == m_part_spec.end_part' in index_read_idx_map
Assertion changed to '>=' as the prune_partition_set() in the get_partition_set() can
do start_part= end_part+1 if no possible partitions were found.
Problem was that the handler call ::extra(HA_EXTRA_CACHE) was cached
but the ::extra(HA_EXTRA_PREPARE_FOR_UPDATE) was not.
Solution was to also cache the other call and forward it when moving
to a new partition to scan.
mysql-test/r/partition.result:
test result
mysql-test/t/partition.test:
New test from bug report.
sql/ha_partition.cc:
cache the HA_EXTRA_PREPARE_FOR_UPDATE just like HA_EXTRA_CACHE.
sql/ha_partition.h:
Added cache flag for HA_EXTRA_PREPARE_FOR_UPDATE
Problem was that the handler call ::extra(HA_EXTRA_CACHE) was cached
but the ::extra(HA_EXTRA_PREPARE_FOR_UPDATE) was not.
Solution was to also cache the other call and forward it when moving
to a new partition to scan.
The issue was that we didn't always check result of ha_rnd_init() which caused a problem for handlers that returned an error in this code.
- Changed prototype of ha_rnd_init() to ensure that we get a compile warning if result is not checked.
- Added ha_rnd_init_with_error() that prints error on failure.
- Checked all usage of ha_rnd_init() and ensure we generate an error message on failures.
- Changed init_read_record() to return 1 on failure.
sql/create_options.cc:
Fixed wrong printf
sql/event_db_repository.cc:
Check result from init_read_record()
sql/events.cc:
Check result from init_read_record()
sql/filesort.cc:
Check result from ha_rnd_init()
sql/ha_partition.cc:
Check result from ha_rnd_init()
sql/ha_partition.h:
Fixed compiler warning
sql/handler.cc:
Added ha_rnd_init_with_error()
Check result from ha_rnd_init()
sql/handler.h:
Added ha_rnd_init_with_error()
Changed prototype of ha_rnd_init() to ensure that we get a compile warning if result is not checked
sql/item_subselect.cc:
Check result from ha_rnd_init()
sql/log.cc:
Check result from ha_rnd_init()
sql/log_event.cc:
Check result from ha_rnd_init()
sql/log_event_old.cc:
Check result from ha_rnd_init()
sql/mysql_priv.h:
init_read_record() now returns error code on failure
sql/opt_range.cc:
Check result from ha_rnd_init()
sql/records.cc:
init_read_record() now returns error code on failure
Check result from ha_rnd_init()
sql/sql_acl.cc:
Check result from init_read_record()
sql/sql_cursor.cc:
Print error if ha_rnd_init() fails
sql/sql_delete.cc:
Check result from init_read_record()
sql/sql_help.cc:
Check result from init_read_record()
sql/sql_plugin.cc:
Check result from init_read_record()
sql/sql_select.cc:
Check result from ha_rnd_init()
Print error if ha_rnd_init() fails.
sql/sql_servers.cc:
Check result from init_read_record()
sql/sql_table.cc:
Check result from init_read_record()
sql/sql_udf.cc:
Check result from init_read_record()
sql/sql_update.cc:
Check result from init_read_record()
storage/example/ha_example.cc:
Don't return error on rnd_init()
storage/ibmdb2i/ha_ibmdb2i.cc:
Removed not relevant comment
Fixed compiler warnings.
queues.h and queues.c are now based on the UNIREG code and thus made BSD.
Fix code to use new queue() interface. This mostly affects how you access elements in the queue.
If USE_NET_CLEAR is not set, don't clear connection from unexpected characters. This should give a speed up when doing a lot of fast queries.
Fixed some code in ma_ft_boolean_search.c that had not made it from myisam/ft_boolean_search.c
include/queues.h:
Use UNIREG code base (BSD)
Changed init_queue() to take all initialization arguments.
New interface to access elements in queue
include/thr_alarm.h:
Changed to use time_t instead of ulong (portability)
Added index_in_queue, to be able to remove random element from queue in O(1)
mysys/queues.c:
Use UNIREG code base (BSD)
init_queue() and reinit_queue() now takes more initialization arguments. (No need for init_queue_ex() anymore)
Now one can tell queue_insert() to store in the element a pointer to where element is in queue. This allows one to remove elements from queue in O(1) instead of O(N)
mysys/thr_alarm.c:
Use new option in queue() to allow fast removal of elements.
Do less inside LOCK_alarm mutex.
This should give a major speed up of thr_alarm usage when there is many threads
sql/create_options.cc:
Fixed wrong printf
sql/event_queue.cc:
Use new queue interface()
sql/filesort.cc:
Use new queue interface()
sql/ha_partition.cc:
Use new queue interface()
sql/ha_partition.h:
Fixed compiler warning
sql/item_cmpfunc.cc:
Fixed compiler warning
sql/item_subselect.cc:
Use new queue interface()
Removed not used variable
sql/net_serv.cc:
If USE_NET_CLEAR is not set, don't clear connection from unexpected characters.
This should give a speed up when doing a lot of fast queries at the disadvantage that if there is a bug in the client protocol the connection will be dropped instead of being unnoticed.
sql/opt_range.cc:
Use new queue interface()
Fixed compiler warnings
sql/uniques.cc:
Use new queue interface()
storage/maria/ma_ft_boolean_search.c:
Copy code from myisam/ft_boolean_search.c
Use new queue interface()
storage/maria/ma_ft_nlq_search.c:
Use new queue interface()
storage/maria/ma_sort.c:
Use new queue interface()
storage/maria/maria_pack.c:
Use new queue interface()
Use queue_fix() instead of own loop to fix queue.
storage/myisam/ft_boolean_search.c:
Use new queue interface()
storage/myisam/ft_nlq_search.c:
Use new queue interface()
storage/myisam/mi_test_all.sh:
Remove temporary file from last run
storage/myisam/myisampack.c:
Use new queue interface()
Use queue_fix() instead of own loop to fix queue.
storage/myisam/sort.c:
Use new queue interface()
storage/myisammrg/myrg_queue.c:
Use new queue interface()
storage/myisammrg/myrg_rnext.c:
Use new queue interface()
storage/myisammrg/myrg_rnext_same.c:
Use new queue interface()
storage/myisammrg/myrg_rprev.c:
Use new queue interface()
The handler function for reading one row from a specific index
was not optimized in the partitioning handler since it
used the default implementation.
No test case since it is performance only, verified by hand.
sql/ha_partition.cc:
Implemented a optimized version of index_read_idx_map
for the case when find flag == HA_READ_KEY_EXACT,
which is the common case.
sql/ha_partition.h:
Declared ha_partition::index_read_idx_map
The handler function for reading one row from a specific index
was not optimized in the partitioning handler since it
used the default implementation.
No test case since it is performance only, verified by hand.
Essentially, the problem is that safemalloc is excruciatingly
slow as it checks all allocated blocks for overrun at each
memory management primitive, yielding a almost exponential
slowdown for the memory management functions (malloc, realloc,
free). The overrun check basically consists of verifying some
bytes of a block for certain magic keys, which catches some
simple forms of overrun. Another minor problem is violation
of aliasing rules and that its own internal list of blocks
is prone to corruption.
Another issue with safemalloc is rather the maintenance cost
as the tool has a significant impact on the server code.
Given the magnitude of memory debuggers available nowadays,
especially those that are provided with the platform malloc
implementation, maintenance of a in-house and largely obsolete
memory debugger becomes a burden that is not worth the effort
due to its slowness and lack of support for detecting more
common forms of heap corruption.
Since there are third-party tools that can provide the same
functionality at a lower or comparable performance cost, the
solution is to simply remove safemalloc. Third-party tools
can provide the same functionality at a lower or comparable
performance cost.
The removal of safemalloc also allows a simplification of the
malloc wrappers, removing quite a bit of kludge: redefinition
of my_malloc, my_free and the removal of the unused second
argument of my_free. Since free() always check whether the
supplied pointer is null, redudant checks are also removed.
Also, this patch adds unit testing for my_malloc and moves
my_realloc implementation into the same file as the other
memory allocation primitives.
client/mysqldump.c:
Pass my_free directly as its signature is compatible with the
callback type -- which wasn't the case for free_table_ent.
Essentially, the problem is that safemalloc is excruciatingly
slow as it checks all allocated blocks for overrun at each
memory management primitive, yielding a almost exponential
slowdown for the memory management functions (malloc, realloc,
free). The overrun check basically consists of verifying some
bytes of a block for certain magic keys, which catches some
simple forms of overrun. Another minor problem is violation
of aliasing rules and that its own internal list of blocks
is prone to corruption.
Another issue with safemalloc is rather the maintenance cost
as the tool has a significant impact on the server code.
Given the magnitude of memory debuggers available nowadays,
especially those that are provided with the platform malloc
implementation, maintenance of a in-house and largely obsolete
memory debugger becomes a burden that is not worth the effort
due to its slowness and lack of support for detecting more
common forms of heap corruption.
Since there are third-party tools that can provide the same
functionality at a lower or comparable performance cost, the
solution is to simply remove safemalloc. Third-party tools
can provide the same functionality at a lower or comparable
performance cost.
The removal of safemalloc also allows a simplification of the
malloc wrappers, removing quite a bit of kludge: redefinition
of my_malloc, my_free and the removal of the unused second
argument of my_free. Since free() always check whether the
supplied pointer is null, redudant checks are also removed.
Also, this patch adds unit testing for my_malloc and moves
my_realloc implementation into the same file as the other
memory allocation primitives.
and the original engine is disabled
Missing check that engine is available.
mysql-test/include/not_blackhole.inc:
new include file
mysql-test/r/partition_not_blackhole.result:
new result file
mysql-test/std_data/parts/t1_blackhole.frm:
blackhole partitioned table .frm file:
create table `t1` (`id` int primary key) engine=blackhole
partition by key () partitions 1;
mysql-test/std_data/parts/t1_blackhole.par:
.par file matching blackhole partitioned .frm
mysql-test/t/partition_not_blackhole-master.opt:
new master-opt to disable blackhole if compiled in.
mysql-test/t/partition_not_blackhole.test:
New test
sql/ha_partition.cc:
Added check that engine is available.
Conflicts:
Text conflict in mysql-test/r/archive.result
Contents conflict in mysql-test/r/innodb_bug38231.result
Text conflict in mysql-test/r/mdl_sync.result
Text conflict in mysql-test/suite/binlog/t/disabled.def
Text conflict in mysql-test/suite/rpl_ndb/r/rpl_ndb_binlog_format_errors.result
Text conflict in mysql-test/t/archive.test
Contents conflict in mysql-test/t/innodb_bug38231.test
Text conflict in mysql-test/t/mdl_sync.test
Text conflict in sql/sp_head.cc
Text conflict in sql/sql_show.cc
Text conflict in sql/table.cc
Text conflict in sql/table.h
Conflicts:
Text conflict in mysql-test/r/archive.result
Contents conflict in mysql-test/r/innodb_bug38231.result
Text conflict in mysql-test/r/mdl_sync.result
Text conflict in mysql-test/suite/binlog/t/disabled.def
Text conflict in mysql-test/suite/rpl_ndb/r/rpl_ndb_binlog_format_errors.result
Text conflict in mysql-test/t/archive.test
Contents conflict in mysql-test/t/innodb_bug38231.test
Text conflict in mysql-test/t/mdl_sync.test
Text conflict in sql/sp_head.cc
Text conflict in sql/sql_show.cc
Text conflict in sql/table.cc
Text conflict in sql/table.h
* remove handler::index_read_last()
* create handler::keyread_read_time() (was get_index_only_read_time() in opt_range.cc)
* ha_show_status() allows engine's show_status() to fail
* remove HTON_FLUSH_AFTER_RENAME
* fix key_cmp_if_same() to work for floats and doubles
* set table->status in the server, don't force engines to do it
* increment status vars in the server, don't force engines to do it
mysql-test/r/status_user.result:
correct test results - innodb was wrongly counting internal
index searches as handler_read_* calls.
sql/ha_partition.cc:
compensate for handler incrementing status counters -
we want to count only calls to underlying engines
sql/handler.h:
inline methods moved to sql_class.h
sql/key.cc:
simplify the check
sql/opt_range.cc:
move get_index_only_read_time to the handler class
sql/sp.cc:
don't use a key that's stored in the record buffer -
the engine can overwrite the buffer with anything, destroying the key
sql/sql_class.h:
inline handler methods that need to see THD and TABLE definitions
sql/sql_select.cc:
no ha_index_read_last_map anymore
sql/sql_table.cc:
remove HTON_FLUSH_AFTER_RENAME
sql/table.cc:
set HA_CAN_MEMCMP as appropriate
sql/tztime.cc:
don't use a key that's stored in the record buffer -
the engine can overwrite the buffer with anything, destroying the key
storage/myisam/ha_myisam.cc:
engines don't need to update table->status or use ha_statistic_increment anymore
storage/myisam/ha_myisam.h:
index_read_last_map is no more
Add -Wall to gcc/g++
Fix most warnings reported in dbg and opt mode.
cmd-line-utils/libedit/filecomplete.c:
Remove unused auto variables.
configure.cmake:
Add -Wall to gcc.
extra/comp_err.c:
Cast to correct type.
extra/perror.c:
Fix segfault (but warnings about deprecated features remain)
extra/yassl/taocrypt/include/runtime.hpp:
Comparing two literals was reported as undefined behaviour.
include/my_global.h:
Add a template for aligning character buffers.
mysys/lf_alloc-pin.c:
Initialize pointer.
sql/mysqld.cc:
Use UNINIT_VAR rather than LINT_INIT.
sql/partition_info.cc:
Use UNINIT_VAR rather than LINT_INIT.
sql/rpl_handler.cc:
Use char[] rather than unsigned long[] array for placement buffer.
sql/spatial.cc:
Use char[] rather than unsigned void*[] array for placement buffer.
sql/spatial.h:
Use char[] rather than unsigned void*[] array for placement buffer.
sql/sql_partition.cc:
Initialize auto variable.
sql/sql_table.cc:
Initialize auto variables.
Add parens around assignment within if()
sql/sys_vars.cc:
Use UNINIT_VAR.
storage/innobase/os/os0file.c:
Init first slot in auto variable.
storage/myisam/mi_create.c:
Use UNINIT_VAR rather than LINT_INIT.
storage/myisam/mi_open.c:
Remove (wrong) casting.
storage/myisam/mi_page.c:
Remove (wrong) casting.
storage/myisam/mi_search.c:
Cast to uchar* rather than char*.
strings/ctype-ucs2.c:
Use UNINIT_VAR rather than LINT_INIT.
Add (uchar*) casting.
SELECT and ALTER TABLE ... REBUILD PARTITION".
ALTER TABLE on InnoDB table (including partitioned tables)
acquired exclusive locks on rows of table being altered.
In cases when there was concurrent transaction which did
locking reads from this table this sometimes led to a
deadlock which was not detected by MDL subsystem nor by
InnoDB engine (and was reported only after exceeding
innodb_lock_wait_timeout).
This problem stemmed from the fact that ALTER TABLE acquired
TL_WRITE_ALLOW_READ lock on table being altered. This lock
was interpreted as a write lock and thus for table being
altered handler::external_lock() method was called with
F_WRLCK as an argument. As result InnoDB engine treated
ALTER TABLE as an operation which is going to change data
and acquired LOCK_X locks on rows being read from old
version of table.
In case when there was a transaction which already acquired
SR metadata lock on table and some LOCK_S locks on its rows
(e.g. by using it in subquery of DML statement) concurrent
ALTER TABLE was blocked at the moment when it tried to
acquire LOCK_X lock before reading one of these rows.
The transaction's attempt to acquire SW metadata lock on
table being altered led to deadlock, since it had to wait
for ALTER TABLE to release SNW lock. This deadlock was not
detected and got resolved only after timeout expiring
because waiting were happening in two different subsystems.
Similar deadlocks could have occured in other situations.
This patch tries to solve the problem by changing ALTER TABLE
implementation to use TL_READ_NO_INSERT lock instead of
TL_WRITE_ALLOW_READ. After this step handler::external_lock()
is called with F_RDLCK as an argument and InnoDB engine
correctly interprets ALTER TABLE as operation which only
reads data from original version of table. Thanks to this
ALTER TABLE acquires only LOCK_S locks on rows it reads.
This, in its turn, causes inter-subsystem deadlocks to go
away, as all potential lock conflicts and thus deadlocks will
be limited to metadata locking subsystem:
- When ALTER TABLE reads rows from table being altered it
can't encounter any locks which conflict with LOCK_S row
locks. There should be no concurrent transactions holding
LOCK_X row locks. Such a transaction should have been
acquired SW metadata lock on table first which would have
conflicted with ALTER's SNW lock.
- Vice versa, when DML which runs concurrently with ALTER
TABLE tries to lock row it should be requesting only LOCK_S
lock which is compatible with locks acquired by ALTER,
as otherwise such DML must own an SW metadata lock on table
which would be incompatible with ALTER's SNW lock.
mysql-test/r/innodb_mysql_lock2.result:
Added test for bug #51263 "Deadlock between transactional
SELECT and ALTER TABLE ... REBUILD PARTITION".
mysql-test/suite/rpl_ndb/r/rpl_ndb_binlog_format_errors.result:
Since CREATE TRIGGER no longer acquires write lock on table
it is no longer interpreted as an operation which modifies
table data and therefore no longer fails if invoked for
SBR-only engine in ROW mode.
mysql-test/suite/rpl_ndb/t/rpl_ndb_binlog_format_errors.test:
Since CREATE TRIGGER no longer acquires write lock on table
it is no longer interpreted as an operation which modifies
table data and therefore no longer fails if invoked for
SBR-only engine in ROW mode.
mysql-test/t/innodb_mysql_lock2.test:
Added test for bug #51263 "Deadlock between transactional
SELECT and ALTER TABLE ... REBUILD PARTITION".
sql/ha_partition.cc:
When ALTER TABLE creates a new partition to be filled from
other partition lock it in F_WRLCK mode instead of using
mode which was used for locking the whole table (it is
F_RDLCK now).
sql/lock.cc:
Replaced conditions which used TL_WRITE_ALLOW_READ
lock type with equivalent conditions using
TL_WRITE_ALLOW_WRITE. This should allow to get rid
of TL_WRITE_ALLOW_READ lock type eventually.
sql/mdl.cc:
Updated outdated comment to reflect current situation.
sql/sql_base.cc:
Replaced conditions which used TL_WRITE_ALLOW_READ
lock type with equivalent conditions using
TL_WRITE_ALLOW_WRITE. This should allow to get rid
of TL_WRITE_ALLOW_READ lock type eventually.
sql/sql_table.cc:
mysql_admin_table():
Use TL_WRITE_ALLOW_WRITE lock type instead of
TL_WRITE_ALLOW_READ to determine that we need to acquire
upgradable metadata lock. This should allow to completely
get rid of TL_WRITE_ALLOW_READ in long term.
mysql_recreate_table():
ALTER TABLE now requires TL_READ_NO_INSERT thr_lock.c lock
instead of TL_WRITE_ALLOW_READ.
sql/sql_trigger.cc:
Changed CREATE/DROP TRIGGER implementation to use
TL_READ_NO_INSERT lock instead of TL_WRITE_ALLOW_READ lock.
The latter is no longer necessary since:
a) We now can rely on metadata locks to achieve proper
isolation between two DDL statements or DDL and DML
statements.
b) This statement does not change any data in table so there
is no need to inform storage engine about it.
sql/sql_yacc.yy:
Changed implementation of ALTER TABLE (and CREATE/DROP INDEX
as a consequence) to use TL_READ_NO_INSERT lock instead of
TL_WRITE_ALLOW_READ lock. This is possible since:
a) We now can rely on metadata locks to achieve proper
isolation between two DDL statements or DDL and DML
statements.
b) This statement only reads data in table being open.
We write data only to the new version of table and
then replace with it old version of table under
X metadata lock.
Thanks to this change InnoDB will no longer acquire LOCK_X
locks on rows being read by ALTER TABLE (instead LOCK_S
locks will be acquired) and thus cause of bug #51263
"Deadlock between transactional SELECT and ALTER TABLE ...
REBUILD PARTITION" is removed.
Did the similar change for CREATE TRIGGER (see comments
for sql_trigger.cc for details).
SELECT and ALTER TABLE ... REBUILD PARTITION".
ALTER TABLE on InnoDB table (including partitioned tables)
acquired exclusive locks on rows of table being altered.
In cases when there was concurrent transaction which did
locking reads from this table this sometimes led to a
deadlock which was not detected by MDL subsystem nor by
InnoDB engine (and was reported only after exceeding
innodb_lock_wait_timeout).
This problem stemmed from the fact that ALTER TABLE acquired
TL_WRITE_ALLOW_READ lock on table being altered. This lock
was interpreted as a write lock and thus for table being
altered handler::external_lock() method was called with
F_WRLCK as an argument. As result InnoDB engine treated
ALTER TABLE as an operation which is going to change data
and acquired LOCK_X locks on rows being read from old
version of table.
In case when there was a transaction which already acquired
SR metadata lock on table and some LOCK_S locks on its rows
(e.g. by using it in subquery of DML statement) concurrent
ALTER TABLE was blocked at the moment when it tried to
acquire LOCK_X lock before reading one of these rows.
The transaction's attempt to acquire SW metadata lock on
table being altered led to deadlock, since it had to wait
for ALTER TABLE to release SNW lock. This deadlock was not
detected and got resolved only after timeout expiring
because waiting were happening in two different subsystems.
Similar deadlocks could have occured in other situations.
This patch tries to solve the problem by changing ALTER TABLE
implementation to use TL_READ_NO_INSERT lock instead of
TL_WRITE_ALLOW_READ. After this step handler::external_lock()
is called with F_RDLCK as an argument and InnoDB engine
correctly interprets ALTER TABLE as operation which only
reads data from original version of table. Thanks to this
ALTER TABLE acquires only LOCK_S locks on rows it reads.
This, in its turn, causes inter-subsystem deadlocks to go
away, as all potential lock conflicts and thus deadlocks will
be limited to metadata locking subsystem:
- When ALTER TABLE reads rows from table being altered it
can't encounter any locks which conflict with LOCK_S row
locks. There should be no concurrent transactions holding
LOCK_X row locks. Such a transaction should have been
acquired SW metadata lock on table first which would have
conflicted with ALTER's SNW lock.
- Vice versa, when DML which runs concurrently with ALTER
TABLE tries to lock row it should be requesting only LOCK_S
lock which is compatible with locks acquired by ALTER,
as otherwise such DML must own an SW metadata lock on table
which would be incompatible with ALTER's SNW lock.
Problem was reporting wrong error
Fixed by adding a new error which better explain the problem.
mysql-test/r/partition_error.result:
Bug#49161: Out of memory; restart server and try again (needed 2 bytes)
Updated test result
mysql-test/t/partition_error.test:
Bug#49161: Out of memory; restart server and try again (needed 2 bytes)
Added test case
sql/ha_partition.cc:
Bug#49161: Out of memory; restart server and try again (needed 2 bytes)
Better error message. (used ER_UNKNOWN_ERROR to avoid merge
problems in mysql-trunk+)