The assertion failed in handler::ha_reset upon SELECT under
READ UNCOMMITTED from table with index on virtual column.
This was the debug-only failure, though the problem is mush wider:
* MY_BITMAP is a structure containing my_bitmap_map, the latter is a raw
bitmap.
* read_set, write_set and vcol_set of TABLE are the pointers to MY_BITMAP
* The rest of MY_BITMAPs are stored in TABLE and TABLE_SHARE
* The pointers to the stored MY_BITMAPs, like orig_read_set etc, and
sometimes all_set and tmp_set, are assigned to the pointers.
* Sometimes tmp_use_all_columns is used to substitute the raw bitmap
directly with all_set.bitmap
* Sometimes even bitmaps are directly modified, like in
TABLE::update_virtual_field(): bitmap_clear_all(&tmp_set) is called.
The last three bullets in the list, when used together (which is mostly
always) make the program flow cumbersome and impossible to follow,
notwithstanding the errors they cause, like this MDEV-17556, where tmp_set
pointer was assigned to read_set, write_set and vcol_set, then its bitmap
was substituted with all_set.bitmap by dbug_tmp_use_all_columns() call,
and then bitmap_clear_all(&tmp_set) was applied to all this.
To untangle this knot, the rule should be applied:
* Never substitute bitmaps! This patch is about this.
orig_*, all_set bitmaps are never substituted already.
This patch changes the following function prototypes:
* tmp_use_all_columns, dbug_tmp_use_all_columns
to accept MY_BITMAP** and to return MY_BITMAP * instead of my_bitmap_map*
* tmp_restore_column_map, dbug_tmp_restore_column_maps to accept
MY_BITMAP* instead of my_bitmap_map*
These functions now will substitute read_set/write_set/vcol_set directly,
and won't touch underlying bitmaps.
This reverts commit 6f1f911497.
because it doesn't do anything now (the server doesn't check
my_disable_leak_check) and it never did anything before
(because without `extern` it simply created a local instance of
my_disable_leak_check, did not affect server's my_disable_leak_check).
- Include the valgrind suppressions from the FB upstream
- Use HAVE_Valgrind, not HAVE_Purify (like the rest of MariaDB code does)
The call to DisownData() is now actually disabled under Valgrind
... produces "bytes lost" warnings
When rocksdb_validate_update_cf_options() returns an error,
the update won't happen.
Free the copy of the string in this case.
Use RocksDB debug sync points to introduce a sync delay. This
commits to get grouped even when the datadir is on ramdisk.
For some unclear reason the effect is visible on write_prepared
but not write_committed, so run the test only with write_prepared.
MariaDB doesn't support Read-Free replication, so showing them only causes
confusion.
Removed variables:
- @@rocksdb_read_free_rpl
- @@rocksdb_read_free_rpl_tables
Copy of
commit dcd9379eb5707bc7514a2ff4d9127790356505cb
Author: Manuel Ung <mung@fb.com>
Date: Fri Jun 14 10:38:17 2019 -0700
Skip valgrind for rocksdb.force_shutdown
Summary:
This test does unclean shutdown, and leaks memory.
Squash with: D15749084
Reviewed By: hermanlee
Differential Revision: D15828957
fbshipit-source-id: 30541455d74
Use thd_get_ha_data()/thd_set_ha_data() which protect against plugin
removal until it has THD ha_data.
Do not reset THD ha_data in rocksdb_close_connection(), cleaner approach
is to let ha_close_connection() do it.
Removed transaction objects cleanup from rocksdb_done_func(). As we lock
plugin properly, there must be no transaction objects during RocksDB
shutdown.
The cause for this was fix MDEV-15372, which was trying to speed up
the parallel slave.
Part#1: Do not attempt the "optimization" for transactions that are not
replication slave workers.
Removed redundant plugin_thdvar_cleanup() from end_connection(): called by
THD::free_connection(), which always follows end_connection().
Saves at least one lock(LOCK_plugin) and one
rdlock(LOCK_system_variables_hash).
Benchmarked on a 2socket/20core/40threads Broadwell system using sysbench
connect brencmark @40 threads (with select 1 disabled).
10.2 shows moderate improvement: 136219.93 -> 137766.31 CPS.
10.3 is improvement is somewhat better: 93018.29 -> 101379.77 CPS.
Also backported MyRocks memory leak fix from 10.4, which turned out to
be unrelated.
- Use the correct range bounds when doing a reverse-ordered range scan
(this was already done for HA_READ_PREFIX_LAST_OR_PREV but not for
HA_READ_BEFORE_KEY).
An error in "group commit with MariaDB's binlog" code: we would flush
the WAL even when the transaction did not do any writes (and so the logic
in myrocks::Rdb_transaction::commit caused it to rollback).
Move up-to this revision in the upstream:
commit de1e8c7bfe7c875ea284b55040e8f3cd3a56fcc2
Author: Abhinav Sharma <abhinavsharma@fb.com>
Date: Thu Aug 23 14:34:39 2018 -0700
Log updates to semi-sync whitelist in the error log
Summary:
Plugin variable changes are not logged in the error log even when
log_global_var_changes is enabled. Logging updates to whitelist will help in
debugging.
Reviewed By: guokeno0
Differential Revision: D9483807
fbshipit-source-id: e111cda773d
commit de1e8c7bfe7c875ea284b55040e8f3cd3a56fcc2
Author: Abhinav Sharma <abhinavsharma@fb.com>
Date: Thu Aug 23 14:34:39 2018 -0700
Log updates to semi-sync whitelist in the error log
Summary:
Plugin variable changes are not logged in the error log even when
log_global_var_changes is enabled. Logging updates to whitelist will help in
debugging.
Reviewed By: guokeno0
Differential Revision: D9483807
fbshipit-source-id: e111cda773d
Currently RocksDB engine doesn't update AUTO_INCREMENT in the UPDATE statement.
For example,
CREATE TABLE t1 (pk INT AUTO_INCREMENT, a INT, PRIMARY KEY(pk)) ENGINE=RocksDB;
INSERT INTO t1 (a) VALUES (1);
UPDATE t1 SET pk = 3; ==> AUTO_INCREMENT should be updated to 4.
Without this fix, it hits the Assertion `dd_val >= last_val' failed in
myrocks::ha_rocksdb::load_auto_incr_value_from_index.
(cherry picked from commit f7154242b8)
Add variable rocksdb_remove_mariabackup_checkpoint.
If set, it will remove $rocksdb_datadir/mariabackup-checkpoint directory.
The variable is to be used by exclusively by mariabackup,
to remove temporary checkpoints.
Fix two issues:
1. Rdb_ddl_manager::rename() loses the value of m_hidden_pk_val. new
object used to get 0, which means "not loaded from the db yet".
2. ha_rocksdb::load_hidden_pk_value() uses current transaction (and its
snapshot) when loading hidden PK value from disk. This may cause it to
load an out-of-date value.
RocksDB now supports "iterator bounds" which are min and max keys
that an iterator is interested in.
Iterator initialization function doesn't copy the keys, though, it keeps
pointers to them.
So if the buffer space for the keys is used for another iterator (the one
for checking for UNIUQE constraint violation in ha_rocksdb::ha_update_row)
then one can get incorrect query result.
Fixed by using a separate buffer for iterator bounds in the unique constraint
violation check.
Backport the fix from the upstream and add our testcase.
Backported cset:
commit 997a979bf5e2f75ab88781d9d3fd22dddc1fc21f
Author: Manuel Ung <mung@fb.com>
Date: Thu Feb 15 08:38:12 2018 -0800
Fix crashes in autoincrement code paths
Summary:
There are two issues related to autoincrement that can lead to crashes:
1. The max value for double/float type for autoincrement was not implemented in MyRocks, and can lead to assertions. The fix is to add them in.
2. If we try to set auto_increment via alter table on a table without an auto_increment column defined, we segfault because there is no index from which to read the last value. The fix is to perform a check to see if autoincrement exists before reading from index (similar to code ha_rocksdb::open).
Fixes https://github.com/facebook/mysql-5.6/issues/792
Closes https://github.com/facebook/mysql-5.6/pull/794
Differential Revision: D6995096
Pulled By: lth
fbshipit-source-id: 1130ce1
In reverse-ordered column families, if one wants to start reading at the
logical end of the index, they should Seek() to a key value that is not
covered by the index. This may (and typically does) prevent use of a bloom
filter.
The calls to setup_scan_iterator() that are made for index and table scan
didn't take this into account and passed eq_cond_len=INDEX_NUMBER_SIZE.
Fixed them to compute and pass correct eq_cond_len.
Also, removed an incorrect assert in ha_rocksdb::setup_iterator_bounds.
"The Store binlog position inside RocksDB" feature is only needed for
obtaining binlog position after having restored a MyRocks backup.
This is not yet supported in MariaDB, so properly disable it in both
places where it is done.
MyRocks internally will print non-critical messages to
sql_print_verbose_info() which will do what InnoDB does in similar cases:
check if (global_system_variables.log_warnings > 2).
Cherry-pick this fix from the upstream:
commit 6ddedd8f1e0ddcbc24e8f9a005636c5463799ab7
Author: Sergei Petrunia <psergey@askmonty.org>
Date: Tue Apr 10 11:43:01 2018 -0700
[mysql-5.6][PR] Issue #802: MyRocks: Statement rollback doesnt work correctly for nesâ¦
Summary:
â¦ted statements
Variant #1: When the statement fails, we should roll back to the latest
savepoint taken at the top level.
Closes https://github.com/facebook/mysql-5.6/pull/804
Differential Revision: D7509380
Pulled By: hermanlee
fbshipit-source-id: 9a6f414
MariaDB differs from the upstream for "DDL-like" command. For these,
it sets binlog_format=STATEMENT for the duration of the statement.
This doesn't play well with MyRocks, which tries to prevent DML
commands with binlog_format!=ROW.
Also, if Locked_tables_list::reopen_tables() returned an error, then
close_cached_tables should propagate the error condition and not silently
consume it (it's difficult to have test coverage for this because this
error condition is rare)
- Disallow loading of MyRocks (or any auxilary) plugins after it has been
unloaded.
- Do it carefully - Plugin's system variables may be accesssed (e.g. default
value is set) after the first rocksdb_done_func() call but before
the secon rocksdb_init_func() call.
When the plugin is unloaded, walk the s_trx_list and delete the left over
Rdb_transaction objects.
It is responsibility of the SQL layer to make sure that the storage engine
has no open tables when the plugin is being unloaded.
commit 445e518bc7
Author: Sergei Petrunia <psergey@askmonty.org>
Date: Sat Jan 27 10:18:20 2018 +0000
Copy of
commit f8f364b47f2784f16b401f27658f1c16eaf348ec
Author: Jay Edgar <jkedgar@fb.com>
Date: Tue Oct 17 15:19:31 2017 -0700
Add a hashed, hierarchical, wheel timer implementation
Summary:
In order to implement idle timeouts on detached sessions we need something inside MySQL that is lightweight and can handle calling events in the future wi
By default the timers are grouped into 10ms buckets (the 'hashed' part), though the size of the buckets is configurable at the creation of the timer. Eac
Reviewed By: djwatson
Differential Revision: D6199806
fbshipit-source-id: 5e1590f
commit f8f364b47f2784f16b401f27658f1c16eaf348ec
Author: Jay Edgar <jkedgar@fb.com>
Date: Tue Oct 17 15:19:31 2017 -0700
Add a hashed, hierarchical, wheel timer implementation
Summary:
In order to implement idle timeouts on detached sessions we need something inside MySQL that is lightweight and can handle calling events in the future with very little cost for cancelling or resetting the event. A hashed, hi
By default the timers are grouped into 10ms buckets (the 'hashed' part), though the size of the buckets is configurable at the creation of the timer. Each wheel (the 'wheel' part) maintains 256 buckets and cascades to the whe
Reviewed By: djwatson
Differential Revision: D6199806
fbshipit-source-id: 5e1590f