When the transaction isolation level is SERIALIZABLE, or when
a locking read is performed in the REPEATABLE READ isolation level,
InnoDB must lock delete-marked records in order to prevent another
transaction from inserting a record in their place.
However, at the READ UNCOMMITTED or READ COMMITTED isolation levels, or
when the parameter innodb_locks_unsafe_for_binlog is set, the
repeatability of the reads does not matter, and there is no need
to lock any records.
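For illustration (table, column and data hypothetical), a locking read
such as

  SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
  BEGIN;
  SELECT * FROM t WHERE a = 10 FOR UPDATE;

no longer needs to lock the delete-marked committed records that it
scans past, whereas the same SELECT ... FOR UPDATE under REPEATABLE READ
or SERIALIZABLE still must lock them.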
row_search_mvcc(): Skip locks on delete-marked committed records upfront,
instead of invoking row_unlock_for_mysql() afterwards. The unlocking
never worked for secondary index records.
LOCK_thd_data was used both to protect THD data and to
ensure that the THD is not deleted while it is in use.
This patch moves the THD delete protection to LOCK_thd_kill,
which already protects the THD for kill.
The benefits are:
- It is better defined what LOCK_thd_data protects
- LOCK_thd_data usage is now much simpler and easier to verify
- Less chance of deadlocks in SHOW PROCESSLIST, as there is less
chance of interaction between mutexes
- Removed the unneeded LOCK_thread_count from
thd_get_error_context_description()
- Fewer mutexes taken for thd->awake()
Other things:
- Don't take the mysys_var mutex in SHOW PROCESSLIST to check whether
a thread is marked as killed
- thd->awake() now automatically takes the LOCK_thd_kill mutex
(simplifies the code)
- APC now uses LOCK_thd_kill instead of LOCK_thd_data
Allow DROP TABLE `#mysql50##sql-...._.` to drop tables that were
being rebuilt by ALGORITHM=INPLACE
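For example, an orphan table left over from an interrupted rebuild could
be removed with a statement like the following (the temporary name shown
here is hypothetical; the actual name varies):

  DROP TABLE `#mysql50##sql-1a2b_3`;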
NOTE: If the server is killed after the table-rebuilding ALGORITHM=INPLACE
commits inside InnoDB but before the .frm file has been replaced, then
recovery will involve something other than DROP TABLE.
NOTE: If the server is killed after a true in-place ALTER TABLE commits
inside InnoDB but before the .frm file has been replaced, then we
are really out of luck. To properly handle that situation, we would
need a transactional mysql.ddl_fixup table that directs recovery to
rename or remove files.
prepare_inplace_alter_table_dict(): Use the altered_table->s->table_name
for generating the new_table_name.
table_name_t::part_suffix: The start of the partition name suffix.
table_name_t::dbend(): Return the end of the schema name.
table_name_t::dblen(): Return the length of the schema name, in bytes.
table_name_t::basename(): Return the name without the schema name.
table_name_t::part(): Return the partition name, or NULL if none.
row_drop_table_for_mysql(): Assert for #sql, not #sql-ib.
This is a bug fix that was missing from MySQL wsrep, i.e. Galera.
The problem was that if a stored procedure declares a handler that
catches a deadlock error, then the error may have been
cleared in the method sp_rcontext::handle_sql_condition().
Use wsrep_conflict_state correctly to determine whether the
error has already been sent to the client.
Add a test case for both this bug and MDEV-12837 (WSREP: BF
lock wait long). The test requires both fixes to pass.
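A minimal sketch of the problematic pattern (procedure and table names
are hypothetical; creating this in a client also needs a DELIMITER
change):

  CREATE PROCEDURE p1()
  BEGIN
    -- 1213 is ER_LOCK_DEADLOCK; such a handler could swallow the error
    -- that a Galera BF abort is supposed to deliver to the client
    DECLARE CONTINUE HANDLER FOR 1213
      SET @retries = @retries + 1;
    UPDATE t1 SET b = b + 1 WHERE a = 1;
  END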
Introduce the debug flag trx_t::persistent_stats to suppress the
assertion for the updates of persistent statistics during fast
shutdown.
dict_stats_exec_sql(): Do execute the statement even though shutdown
has been initiated.
dict_stats_exec_sql(): Expect the caller to always provide a transaction.
Remove some redundant assertions. The caller must hold dict_sys->mutex,
but holding dict_operation_lock is only necessary for accessing
data dictionary tables, which we are not accessing.
dict_stats_save_index_stat(): Acquire dict_sys->mutex
for invoking dict_stats_exec_sql().
dict_stats_save(), dict_stats_update_for_index(), dict_stats_update(),
dict_stats_drop_index(), dict_stats_delete_from_table_stats(),
dict_stats_delete_from_index_stats(), dict_stats_drop_table(),
dict_stats_rename_in_table_stats(), dict_stats_rename_in_index_stats(),
dict_stats_rename_table(): Use a single caller-provided
transaction that is started and committed or rolled back by the caller.
dict_stats_process_entry_from_recalc_pool(): Let the caller provide
a transaction object.
ha_innobase::open(): Pass a transaction to dict_stats_init().
ha_innobase::create(), ha_innobase::discard_or_import_tablespace():
Pass a transaction to dict_stats_update().
ha_innobase::rename_table(): Pass a transaction to
dict_stats_rename_table(). We do not use the same transaction
as the one that updated the data dictionary tables, because
we already released the dict_operation_lock. (FIXME: there is
a race condition; a lock wait on SYS_* tables could occur
in another DDL transaction until the data dictionary transaction
is committed.)
ha_innobase::info_low(): Pass a transaction to dict_stats_update()
when calculating persistent statistics.
alter_stats_norebuild(), alter_stats_rebuild(): Update the
persistent statistics as well. In this way, a single transaction
will be used for updating the statistics of a whole table, even
for partitioned tables.
ha_innobase::commit_inplace_alter_table(): Drop statistics for
all partitions when adding or dropping virtual columns, so that
the statistics will be recalculated on the next handler::open().
This is a refactored version of the Oracle Bug#22469660 fix.
RecLock::add_to_waitq(), lock_table_enqueue_waiting():
Do not allow a lock wait to occur for updating statistics
in a data dictionary transaction, such as DROP TABLE. Instead,
return the previously unused error code DB_QUE_THR_SUSPENDED.
row_merge_lock_table(), row_mysql_lock_table(): Remove dead code
for handling DB_QUE_THR_SUSPENDED.
row_drop_table_for_mysql(), row_truncate_table_for_mysql():
Drop the statistics as part of the data dictionary transaction.
After TRUNCATE TABLE, the statistics will be recalculated on
subsequent ha_innobase::open(), similar to how the logic after
the above-mentioned Oracle Bug#22469660 fix in
ha_innobase::commit_inplace_alter_table() works.
btr_defragment_thread(): Use a single transaction object for
updating defragmentation statistics.
dict_stats_save_defrag_stats(), dict_stats_save_defrag_summary(),
dict_stats_process_entry_from_defrag_pool(),
dict_defrag_process_entries_from_defrag_pool():
Add a parameter for the transaction.
dict_stats_empty_table(): Make public. This will be called by
row_truncate_table_for_mysql() after dropping persistent statistics,
to clear the memory-based statistics as well.
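For reference, the persistent statistics maintained by these functions
are stored in ordinary SQL tables and can be inspected directly, e.g.
(table name t1 hypothetical):

  SELECT * FROM mysql.innodb_table_stats WHERE table_name = 't1';
  SELECT * FROM mysql.innodb_index_stats WHERE table_name = 't1';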
The problem was that a Binlog_checkpoint event can happen at random times.
Fixed by not writing a binlog checkpoint for the rpl_log test.
Other things:
- Removed the unused variable "$keep_gtid_events"
- Added an option for show_binlog_events to skip the binlog checkpoint event
1. Removing data type specific constants from enum_item_param_state,
adding SHORT_DATA_VALUE instead.
2. Replacing tests of Item_param::state against the removed constants with
tests of Type_handler::cmp_type() against {INT|REAL|TIME|DECIMAL}_RESULT.
Deriving Item_param::PValue from Type_handler_hybrid_field_type,
to store the data type handler of the current value of the parameter.
3. Moving Item_param::decimal_value and Item_param::str_value_ptr
to Item_param::PValue. Adding Item_param::PValue::m_string
and changing Item_param to use it to store string values,
instead of Item::str_value. The intent is to replace Item_param::value
with an st_value based implementation in the future, to avoid duplicate code.
Adding a sub-class Item::PValue_simple, to make it easier to
implement Item_param::PValue::swap().
Renaming Item_basic_value::fix_charset_and_length_from_str_value()
to fix_charset_and_length() and adding the "CHARSET_INFO" pointer
parameter, instead of getting it directly from item->str_value.charset().
Changing Item_param to pass value.m_string.charset() instead
of str_value.charset().
Adding a String argument to the overloaded
fix_charset_and_length_from_str_value() and changing Item_param
to pass value.m_string instead of str_value.
4. Replacing the case statement in Item_param::save_in_field() with a
call to Type_handler::Item_save_in_field().
5. Adding new methods into Item_param::PValue:
val_real(), val_int(), val_decimal(), val_str().
Changing the corresponding Item_param methods
to use these new Item_param::PValue methods
internally. Adding a helper method
Item_param::can_return_value() and removing
duplicate code in Item_param::val_xxx().
6. Removing value.set_handler() from Item_param::set_conversion()
and Type_handler_xxx::Item_param_set_from_value().
It's now done inside Item_param::set_param_func(),
Item_param::set_value() and Item_param::set_limit_clause_param().
7. Changing Type_handler_int_result::Item_param_set_from_value()
to set max_length using attr->max_length instead of
MY_INT64_NUM_DECIMAL_DIGITS, to preserve the data type
of the assigned expression more precisely.
8. Adding Type_handler_hybrid_field_type::swap(),
using it in Item_param::PValue::swap().
9. Moving the data-type specific code from
Item_param::query_val_str(), Item_param::eq(),
Item_param::clone_item() to
Item_param::value_query_type_str(),
Item_param::value_eq(), Item_param::value_clone_item(),
to split the "state" dependent code and
the data type dependent code.
Later we'll split the data type related code further
and add new methods to Type_handler. This will be done
after we replace Item_param::PValue with st_value.
10. Adding asserts into set_int(), set_double(), set_decimal(),
set_time(), set_str(), set_longdata() to make sure that
the value set to Item_param corresponds to the previously
set data type handler.
11. Adding tests into t/ps.test and suite/binlog/t/binlog_stm_ps.test,
to cover Item_param::print() and Item_param::append_for_log()
for LIMIT clause parameters.
Note: the patch does not change the behavior covered by the new
tests; they are added for better code coverage. See the sketch below.
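A sketch of such a LIMIT clause parameter (table name hypothetical):

  PREPARE s FROM 'SELECT a FROM t1 LIMIT ?';
  SET @n = 2;
  EXECUTE s USING @n;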
12. Adding tests for more precise integer data type in queries like this:
EXECUTE IMMEDIATE
'CREATE OR REPLACE TABLE t1 AS SELECT 999999999 AS a,? AS b'
USING 999999999;
The explicit integer literal and the same integer literal
passed as a PS parameter now produce columns of the same data type.
Re-recording old results in ps.result, gis.result, func_hybrid_type.result
accordingly.
Fix extension of the system tablespace for a multi-file
innodb_data_file_path.
Use fil_extend_space_to_desired_size() to correctly extend the system
tablespace. Make sure to get the tablespace size from the first
tablespace part.
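An affected configuration would be a system tablespace split over
several files, e.g. (file names and sizes illustrative):

  [mysqld]
  innodb_data_file_path=ibdata1:12M;ibdata2:12M:autoextend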
Fixing a test failure in "mtr --ps compat/oracle.ps" caused by "SELECT ?"
returning different errors:
- CR_PARAMS_NOT_BOUND in prepared execution
- ER_PARSE_ERROR in direct execution
Disabling PS protocol for this test chunk.
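In mysqltest terms, the chunk can be wrapped as in this sketch (the
statement shown is illustrative):

  --disable_ps_protocol
  --error ER_PARSE_ERROR
  SELECT ?;
  --enable_ps_protocol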
dict_stats_exec_sql(): Refuse the operation if shutdown has been
initiated.
The real fix would be to update the persistent statistics as part
of the data dictionary transactions. To do this, we should move the
storage of InnoDB persistent statistics to the InnoDB data files,
and maybe also remove the InnoDB data dictionary.
Imported a missing test case from MySQL 5.7 for
commit 25781c154396dbbc21023786aa3be070057d6999
Author: Annamalai Gurusami <annamalai.gurusami@oracle.com>
Date: Mon Feb 24 14:00:03 2014 +0530
Bug #17604730 ASSERTION: *CURSOR->INDEX->NAME == TEMP_INDEX_PREFIX
This is caused by the following change:
commit 95d29c99f01882ffcc2259f62b3163f9b0e80c75
Author: Marko Mäkelä <marko.makela@oracle.com>
Date: Tue Nov 27 11:12:13 2012 +0200
Bug#15920445 INNODB REPORTS ER_DUP_KEY BEFORE CREATE UNIQUE INDEX COMPLETED
There is a phase during online secondary index creation where the index has
been internally completed inside InnoDB, but does not 'officially' exist yet.
We used to report ER_DUP_KEY in these situations, like this:
ERROR 23000: Can't write; duplicate key in table 't1'
What we should do is to let the 'offending' operation complete, but report an
error to the
ALTER TABLE t1 ADD UNIQUE KEY (c2):
ERROR HY000: Index c2 is corrupted
(This misleading error message should be fixed separately:
Bug#15920713 CREATE UNIQUE INDEX REPORTS ER_INDEX_CORRUPT INSTEAD OF DUPLICATE)
row_ins_sec_index_entry_low(): flag the index corrupted instead of
reporting a duplicate, in case the index has not been published yet.
rb:1614 approved by Jimmy Yang
The problem is that after we have found a duplicate key on the primary
key, we continue to acquire the necessary gap locks in secondary indexes
to block concurrent transactions from inserting the searched records.
However, a search from a unique index used in a foreign key constraint
could return DB_NO_REFERENCED_ROW if INSERT .. ON DUPLICATE KEY UPDATE
does not contain a value for the foreign key column. In this case
we should return the original DB_DUPLICATE_KEY error instead
of DB_NO_REFERENCED_ROW.
Consider the following example:
create table child(a int not null primary key,
b int not null,
c int,
unique key (b),
foreign key (b) references
parent (id)) engine=innodb;
insert into child values (1,1,2);
insert into child(a) values (1) on duplicate key update c = 3;
Now the primary key value 1 naturally causes a duplicate key error that
will be stored in node->duplicate. If there were no duplicate key error,
we should return the actual no-referenced-row error. As the value for
column b, used in both the unique key and the foreign key, is not
provided, the server uses 0 as the search value. This is naturally not
found, leading to DB_NO_REFERENCED_ROW. But we should update the row
with primary key value 1 anyway, as requested by the ON DUPLICATE KEY
UPDATE clause.
With the combination of --log-bin and Galera, the server may crash
reporting one of two characteristic stacks:
/usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG13mark_xid_doneEmb+0xc7)[0x7f182a8e2cb7]
/usr/sbin/mysqld(binlog_background_thread+0x2b5)[0x7f182a8e3275]
or
/usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG21do_checkpoint_requestEm+0x9d)[0x7ff395b2dafd]
/usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG20checkpoint_and_purgeEm+0x11)[0x7ff395b2db91]
/usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG16rotate_and_purgeEb+0xc2)[0x7ff395b300b2]
The reason for the failure appears to be mismatched decrements of
`xid_count_per_binlog::xid_count`,
which can occur when a transaction is executed on a connection that has
issued `SET @@sql_log_bin=0`. In such a case the xid count is not
incremented, but its decrement still runs, turning
`binlog_xid_count_list` into an improper state, which a subsequent
FLUSH BINARY LOGS exposes through the crash.
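A hypothetical reproduction sketch on a Galera node running with
--log-bin (table name illustrative):

  SET SESSION sql_log_bin = 0;
  INSERT INTO t1 VALUES (1); -- xid count not incremented for this trx
  SET SESSION sql_log_bin = 1;
  FLUSH BINARY LOGS;         -- the unmatched decrement can surface here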
*Note_1*: the regression test reuses the existing galera.sql_log_bin
test, which does not run stably (even in its base form) under mtr with
--log-bin.
*Note_2*: the 10.0-galera branch is free of this issue, having missed
the MDEV-7205 fixes.
As reported in MDEV-11969, "there's no way to ditch knowledge" about some
domain that is no longer updated on a server. Besides being an annoyance
that clutters the output in the DBA console, stale domains can prevent
a slave from connecting to the master, as MDEV-12012 witnesses.
Which domains are obsolete must be evaluated by the user (DBA), according
to whether the domain info is still relevant and whether the domain will
ever receive any update.
This patch introduces a method to discard obsolete gtid domains from
the server binlog state. The removal requires that no event group from
such a domain be present in the existing binlog files. If there are any,
the containing logs must first be PURGEd in order for
FLUSH BINARY LOGS DELETE_DOMAIN_ID=(list-of-domains)
to succeed. Otherwise the command returns an error.
The list of obsolete domains can be computed by intersecting two
sets - the earliest (first) binlog's Gtid_list and the current value
of @@global.gtid_binlog_state - and extracting the domain id
components from the items in the intersection.
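For example (domain ids and state values are illustrative):

  SELECT @@global.gtid_binlog_state;  -- e.g. '1-1-100,2-2-5'
  -- if domain 2 also appears in the first binlog's Gtid_list and will
  -- never be updated again, it can be deleted:
  FLUSH BINARY LOGS DELETE_DOMAIN_ID=(2);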
The new DELETE_DOMAIN_ID variant of FLUSH continues to rotate the
binlog, omitting the deleted domains from the active binlog file's
Gtid_list. Note though that when the command is ineffective - when none
of the domains requested for deletion exists in the binlog state -
rotation does not occur.
Obsolete domain deletion is not harmful to connected slaves as long
as the master-side binlog file *purge* is synchronized with
FLUSH-DELETE_DOMAIN_ID. The slaves must have processed the last event
from the purged files as usual, in order not to bump later into
requesting a gtid from a file that is already gone.
While the command is not replicated (as an ordinary FLUSH BINARY LOGS
is), slaves, even though they have extra domains, won't suffer from
reconnection errors, thanks to the master-slave gtid connection
protocol, which allows the master to be ignorant of a gtid domain.
Should such a slave be promoted to the master role at failover, it may
run the ex-master's
FLUSH BINARY LOGS DELETE_DOMAIN_ID=(list-of-domains)
to clean its own binlog state.
NOTES.
suite/perfschema/r/start_server_low_digest.result
is re-recorded as a consequence of internal parser code changes.