149 commits

Author SHA1 Message Date
Nikita Malyavin
e25623e78a MDEV-17556 Assertion `bitmap_is_set_all(&table->s->all_set)' failed
The assertion failed in handler::ha_reset upon SELECT under
READ UNCOMMITTED from table with index on virtual column.

This was a debug-only failure, though the problem is much wider:
* MY_BITMAP is a structure containing my_bitmap_map, the latter is a raw
 bitmap.
* read_set, write_set and vcol_set of TABLE are the pointers to MY_BITMAP
* The rest of MY_BITMAPs are stored in TABLE and TABLE_SHARE
* The pointers to the stored MY_BITMAPs, like orig_read_set etc, and
 sometimes all_set and tmp_set, are assigned to the pointers.
* Sometimes tmp_use_all_columns is used to substitute the raw bitmap
 directly with all_set.bitmap
* Sometimes even bitmaps are directly modified, like in
TABLE::update_virtual_field(): bitmap_clear_all(&tmp_set) is called.

The last three bullets in the list, when used together (which is almost
always) make the program flow cumbersome and impossible to follow,
notwithstanding the errors they cause, like this MDEV-17556, where tmp_set
pointer was assigned to read_set, write_set and vcol_set, then its bitmap
was substituted with all_set.bitmap by dbug_tmp_use_all_columns() call,
and then bitmap_clear_all(&tmp_set) was applied to all this.

To untangle this knot, the rule should be applied:
* Never substitute bitmaps! This patch is about this.
 orig_*, all_set bitmaps are never substituted already.

This patch changes the following function prototypes:
* tmp_use_all_columns, dbug_tmp_use_all_columns
 to accept MY_BITMAP** and to return MY_BITMAP * instead of my_bitmap_map*
* tmp_restore_column_map, dbug_tmp_restore_column_maps to accept
 MY_BITMAP* instead of my_bitmap_map*

These functions now will substitute read_set/write_set/vcol_set directly,
and won't touch underlying bitmaps.
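
The rule above (swap the pointer, never the bitmap behind it) can be illustrated with a minimal sketch. The types Bitmap and Table and the helpers use_all_columns(), restore_column_map() and clear_all() are hypothetical stand-ins, not the real MY_BITMAP/TABLE definitions or the actual patched functions.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins; the real MY_BITMAP and TABLE carry more state.
struct Bitmap {
  std::vector<std::uint32_t> bits;  // plays the role of the raw my_bitmap_map
};

struct Table {
  Bitmap all_set;              // owned bitmap with every column marked
  Bitmap tmp_set;              // owned scratch bitmap
  Bitmap *read_set = nullptr;  // points at one of the owned bitmaps
};

// Redirect *set to all_set and return the previous pointer.
// No bitmap contents are touched.
inline Bitmap *use_all_columns(Table &t, Bitmap **set) {
  Bitmap *old = *set;
  *set = &t.all_set;
  return old;
}

// Put the saved pointer back.
inline void restore_column_map(Bitmap **set, Bitmap *old) { *set = old; }

// Zero every word of a bitmap.
inline void clear_all(Bitmap &b) {
  for (auto &word : b.bits) word = 0;
}

// Clears tmp_set while read_set is redirected and reports whether
// all_set stayed intact and the pointer was restored afterwards.
inline bool all_set_survives_tmp_clear() {
  Table t;
  t.all_set.bits = {0xFFFFFFFFu};
  t.tmp_set.bits = {0x5u};
  t.read_set = &t.tmp_set;

  Bitmap *old = use_all_columns(t, &t.read_set);
  clear_all(t.tmp_set);  // only the scratch bitmap is affected
  bool intact = t.all_set.bits[0] == 0xFFFFFFFFu && t.read_set == &t.all_set;
  restore_column_map(&t.read_set, old);
  return intact && t.read_set == &t.tmp_set;
}
```

Because only the pointer moves, clearing the scratch bitmap cannot reach all_set, which is the aliasing that the assertion in this report guarded against.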
2021-01-08 16:04:29 +10:00
Sergei Golubchik
cceb965a79 Revert "MDEV-12445 : Rocksdb does not shutdown worker threads and aborts in memleak check on server shutdown"
This reverts commit 6f1f911497.

because it doesn't do anything now (the server doesn't check
my_disable_leak_check) and it never did anything before
(because without `extern` it simply created a local instance of
my_disable_leak_check, did not affect server's my_disable_leak_check).
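
One way a missing `extern` can have this effect is sketched below. The flag leak_check_disabled and both functions are hypothetical names for illustration; the reverted code may have declared its variable at a different scope.

```cpp
#include <cassert>

// Hypothetical stand-in for a flag that the server defines.
bool leak_check_disabled = false;

// Without `extern`, this declares a new local variable that shadows
// the global one; the global flag is never written.
void disable_without_extern() {
  bool leak_check_disabled = true;
  (void)leak_check_disabled;  // silence the unused-variable warning
}

// With `extern`, the declaration refers to the existing global object,
// so the assignment is visible to everyone reading the flag.
void disable_with_extern() {
  extern bool leak_check_disabled;
  leak_check_disabled = true;
}
```

Calling disable_without_extern() leaves the global flag false; only disable_with_extern() changes it.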
2020-05-27 15:56:40 +02:00
Sergei Petrunia
a18d1cc777 MDEV-20315: MyRocks tests produce valgrind failures (Backport to 10.2)
- Include the valgrind suppressions from the FB upstream

- Use HAVE_Valgrind, not HAVE_Purify (like the rest of MariaDB code does)
  The call to DisownData() is now actually disabled under Valgrind
2019-08-13 16:26:17 +03:00
Sergei Petrunia
3b234104ae MDEV-16955: rocksdb_sys_vars.rocksdb_update_cf_options_basic
... produces "bytes lost" warnings

When rocksdb_validate_update_cf_options() returns an error,
the update won't happen.
Free the copy of the string in this case.
2019-08-10 01:46:50 +03:00
Sergei Petrunia
b3bd51c992 Fix rocksdb.autoinc_vars_thread test 2019-07-15 00:50:46 +03:00
Sergei Petrunia
fbbc2354c8 MDEV-14455: rocksdb.2pc_group_commit failed in buildbot
Use RocksDB debug sync points to introduce a sync delay. This causes
commits to get grouped even when the datadir is on ramdisk.

For some unclear reason the effect is visible on write_prepared
but not write_committed, so run the test only with write_prepared.
2019-07-12 21:41:01 +03:00
Marko Mäkelä
24403da91a Remove unused const TABLE_HASH_SIZE 2019-07-03 14:42:04 +03:00
Robert Bindar
666730ee52 Fix gcc-8 warning in rocksdb 2019-07-03 14:32:24 +03:00
Sergei Petrunia
a2e9e3fbd4 MyRocks: don't show Read-Free replication variables
MariaDB doesn't support Read-Free replication, so showing them only causes
confusion.
Removed variables:
- @@rocksdb_read_free_rpl
- @@rocksdb_read_free_rpl_tables
2019-06-20 15:29:17 +03:00
Sergei Petrunia
27e05d92be Post-merge fixes cont'd 2019-06-16 20:41:53 +03:00
Sergei Petrunia
ba2f20cc33 Post-merge fix: fix compilation on Windows 2019-06-16 12:53:13 +03:00
Sergei Petrunia
9c75b3d283 Post-merge fixes 2019-06-16 12:44:04 +03:00
Sergei Petrunia
9ab0d7b4e9 Merge from MyRocks upstream:
Copy of
    commit dcd9379eb5707bc7514a2ff4d9127790356505cb
    Author: Manuel Ung <mung@fb.com>
    Date:   Fri Jun 14 10:38:17 2019 -0700

        Skip valgrind for rocksdb.force_shutdown

        Summary:
        This test does unclean shutdown, and leaks memory.

        Squash with: D15749084

        Reviewed By: hermanlee

        Differential Revision: D15828957

        fbshipit-source-id: 30541455d74
2019-06-16 00:28:33 +03:00
Sergei Petrunia
5173e396ff Copy of
commit dcd9379eb5707bc7514a2ff4d9127790356505cb
Author: Manuel Ung <mung@fb.com>
Date:   Fri Jun 14 10:38:17 2019 -0700

    Skip valgrind for rocksdb.force_shutdown

    Summary:
    This test does unclean shutdown, and leaks memory.

    Squash with: D15749084

    Reviewed By: hermanlee

    Differential Revision: D15828957

    fbshipit-source-id: 30541455d74
2019-06-15 21:29:46 +03:00
Sergei Petrunia
91be2212c6 MDEV-17045: MyRocks tables cannot be updated when binlog_format=MIXED. 2019-06-15 19:55:57 +03:00
Sergey Vojtovich
a24dffdba3 Fixed RocksDB to follow THD ha_data protocol
Use thd_get_ha_data()/thd_set_ha_data() which protect against plugin
removal until it has THD ha_data.

Do not reset THD ha_data in rocksdb_close_connection(); a cleaner approach
is to let ha_close_connection() do it.

Removed transaction objects cleanup from rocksdb_done_func(). As we lock
plugin properly, there must be no transaction objects during RocksDB
shutdown.
2019-05-16 16:28:16 +04:00
Sergei Petrunia
8fcd9478cc MDEV-18080, part#1: MyRocks is slow with log-bin=off
The cause for this was the fix for MDEV-15372, which was trying to speed up
the parallel slave.

Part#1: Do not attempt the "optimization" for transactions that are not
replication slave workers.
2019-03-28 19:51:40 +03:00
Sergei Golubchik
f1134d5676 post-merge: gcc 8 warnings
note: Inherit String from Sql_alloc,
to get operators new and new[] in sync

in rocksdb gcc was complaining that non-lvalue was cast to const.
2019-03-15 21:00:50 +01:00
Sergey Vojtovich
20928e2e96 MDEV-14984 - regression in connect performance
Removed redundant plugin_thdvar_cleanup() from end_connection(): called by
THD::free_connection(), which always follows end_connection().

Saves at least one lock(LOCK_plugin) and one
rdlock(LOCK_system_variables_hash).

Benchmarked on a 2socket/20core/40threads Broadwell system using sysbench
connect benchmark @40 threads (with select 1 disabled).

10.2 shows moderate improvement: 136219.93 -> 137766.31 CPS.
10.3 improvement is somewhat better: 93018.29 -> 101379.77 CPS.

Also backported MyRocks memory leak fix from 10.4, which turned out to
be unrelated.
2019-03-13 10:13:14 +04:00
Sergei Petrunia
734029fa79 Fix a trivial (and harmless) merge error 2018-12-26 00:00:49 +03:00
Daniel Black
3859273d04 MDEV-14267: correct FSF address 2018-10-30 19:45:09 +08:00
Sergei Petrunia
f8604ed9ff MDEV-17414: MyROCKS order desc limit 1 fails : Backport to 10.2
- Use the correct range bounds when doing a reverse-ordered range scan
  (this was already done for HA_READ_PREFIX_LAST_OR_PREV but not for
   HA_READ_BEFORE_KEY).
2018-10-29 13:49:44 +03:00
Sergei Petrunia
2b45eb77f7 MDEV-17261: sysbench oltp read only too slow for MyRocks
An error in "group commit with MariaDB's binlog" code: we would flush
the WAL even when the transaction did not do any writes (and so the logic
in myrocks::Rdb_transaction::commit caused it to rollback).
2018-09-23 13:41:08 +03:00
Sergei Petrunia
a55309d926 MyRocks: post-merge fixes: Make it compile on Windows. 2018-08-31 13:21:46 +03:00
Sergei Petrunia
c930afd47e Merge branch 'merge-myrocks' of github.com:MariaDB/mergetrees into bb-10.2-mariarocks-merge
Move up-to this revision in the upstream:

  commit de1e8c7bfe7c875ea284b55040e8f3cd3a56fcc2
  Author: Abhinav Sharma <abhinavsharma@fb.com>
  Date:   Thu Aug 23 14:34:39 2018 -0700

      Log updates to semi-sync whitelist in the error log

      Summary:
      Plugin variable changes are not logged in the error log even when
      log_global_var_changes is enabled. Logging updates to whitelist will help in
      debugging.

      Reviewed By: guokeno0

      Differential Revision: D9483807

      fbshipit-source-id: e111cda773d
2018-08-28 14:09:04 +03:00
Sergei Petrunia
faa4d8f8c6 Copy of
commit de1e8c7bfe7c875ea284b55040e8f3cd3a56fcc2
Author: Abhinav Sharma <abhinavsharma@fb.com>
Date:   Thu Aug 23 14:34:39 2018 -0700

    Log updates to semi-sync whitelist in the error log

    Summary:
    Plugin variable changes are not logged in the error log even when
    log_global_var_changes is enabled. Logging updates to whitelist will help in
    debugging.

    Reviewed By: guokeno0

    Differential Revision: D9483807

    fbshipit-source-id: e111cda773d
2018-08-28 08:23:44 +00:00
Ming Lin
2b76f6f61d MDEV-16703: Update AUTO_INCREMENT in the UPDATE statement
Currently RocksDB engine doesn't update AUTO_INCREMENT in the UPDATE statement.
For example,

CREATE TABLE t1 (pk INT AUTO_INCREMENT, a INT, PRIMARY KEY(pk)) ENGINE=RocksDB;
INSERT INTO t1 (a) VALUES (1);
UPDATE t1 SET pk = 3; ==> AUTO_INCREMENT should be updated to 4.

Without this fix, it hits the Assertion `dd_val >= last_val' failed in
myrocks::ha_rocksdb::load_auto_incr_value_from_index.

(cherry picked from commit f7154242b8)
2018-08-26 15:10:32 +03:00
Sergei Petrunia
8662015c90 MDEV-15304: Server crash in print_keydup_error / key_unpack or unexpected ER_DUP_KEY
Adjust the patch to match the variant accepted into the upstream:
undo the changes in ha_rocksdb::load_hidden_pk_value().
2018-06-13 15:26:50 +03:00
Vladislav Vaintroub
ea70586282 MDEV-16300 : remove rocksdb checkpoint created by mariabackup.
Add variable rocksdb_remove_mariabackup_checkpoint.
If set, it will remove $rocksdb_datadir/mariabackup-checkpoint directory.
The variable is to be used exclusively by mariabackup,
to remove temporary checkpoints.
2018-06-07 15:12:26 +01:00
Sergei Petrunia
727d0d4f9b MDEV-15304: Server crash in print_keydup_error / key_unpack or unexpected ER_DUP_KEY
Fix two issues:
1. Rdb_ddl_manager::rename() loses the value of m_hidden_pk_val. The new
object used to get 0, which means "not loaded from the db yet".

2. ha_rocksdb::load_hidden_pk_value() uses current transaction (and its
snapshot) when loading hidden PK value from disk. This may cause it to
load an out-of-date value.
2018-05-18 17:41:56 +03:00
Sergei Petrunia
21bcfeb996 MDEV-16155: UPDATE on RocksDB table with unique constraint does not work
RocksDB now supports "iterator bounds" which are min and max keys
that an iterator is interested in.

Iterator initialization function doesn't copy the keys, though, it keeps
pointers to them.
So if the buffer space for the keys is used for another iterator (the one
for checking for UNIQUE constraint violation in ha_rocksdb::ha_update_row)
then one can get incorrect query result.

Fixed by using a separate buffer for iterator bounds in the unique constraint
violation check.
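
The hazard described above can be sketched as follows. KeyRef and BoundedIter are hypothetical types that only mimic "keeps a pointer, does not copy"; they are not the RocksDB API, and the buffer names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical non-owning view of key bytes.
struct KeyRef {
  const char *data;
  std::size_t size;
  std::string str() const { return std::string(data, size); }
};

// Hypothetical iterator that remembers its lower bound by pointer only.
struct BoundedIter {
  KeyRef lower;
};

// Copy a NUL-terminated key into a caller-provided buffer.
inline void fill(char *buf, const char *key) { std::strcpy(buf, key); }

// Both iterators share one buffer: the second key overwrites the bytes
// that the first iterator's bound still points at.
inline std::string lower_bound_seen_with_shared_buffer() {
  char buf[8];
  fill(buf, "aaa");
  BoundedIter scan{{buf, 3}};
  fill(buf, "zzz");  // buffer reused for the uniqueness check
  return scan.lower.str();
}

// Each iterator gets its own buffer, so the scan's bound is unaffected.
inline std::string lower_bound_seen_with_separate_buffer() {
  char scan_buf[8];
  char check_buf[8];
  fill(scan_buf, "aaa");
  BoundedIter scan{{scan_buf, 3}};
  fill(check_buf, "zzz");
  BoundedIter check{{check_buf, 3}};
  (void)check;
  return scan.lower.str();
}
```

With the shared buffer the scan iterator ends up bounded by the wrong key, which is how an incorrect query result can arise; the separate buffer keeps the two bounds independent.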
2018-05-15 12:34:10 +03:00
Sergei Petrunia
7e7592ade5 MDEV-16154: Server crashes in myrocks::ha_rocksdb::load_auto_incr_value_from_index
Backport the fix from the upstream and add our testcase.
Backported cset:
commit 997a979bf5e2f75ab88781d9d3fd22dddc1fc21f
Author: Manuel Ung <mung@fb.com>
Date:   Thu Feb 15 08:38:12 2018 -0800

    Fix crashes in autoincrement code paths

    Summary:
    There are two issues related to autoincrement that can lead to crashes:

    1. The max value for double/float type for autoincrement was not implemented in MyRocks, and can lead to assertions. The fix is to add them in.
    2. If we try to set auto_increment via alter table on a table without an auto_increment column defined, we segfault because there is no index from which to read the last value. The fix is to perform a check to see if autoincrement exists before reading from index (similar to code ha_rocksdb::open).

    Fixes https://github.com/facebook/mysql-5.6/issues/792
    Closes https://github.com/facebook/mysql-5.6/pull/794

    Differential Revision: D6995096

    Pulled By: lth

    fbshipit-source-id: 1130ce1
2018-05-14 15:59:51 +03:00
Sergei Petrunia
b0269816a5 MyRocks: post-merge fixes for Windows: take into account FN_LIBCHAR2
Table name may be passed either as "./db/table" or as ".\\db\\table".
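
A minimal sketch of accepting both spellings is shown below. The helpers is_sep() and split_table_path() are hypothetical and are not the functions changed by this commit; the sketch only assumes that either '/' or '\' may act as the separator.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Treat both directory separator characters as equivalent.
inline bool is_sep(char c) { return c == '/' || c == '\\'; }

// Split "./db/table" or ".\db\table" into {db, table}.
// Returns a pair of empty strings if the input does not have that shape.
inline std::pair<std::string, std::string>
split_table_path(const std::string &path) {
  if (path.size() < 2 || path[0] != '.' || !is_sep(path[1])) return {};
  const std::size_t start = 2;
  std::size_t mid = start;
  while (mid < path.size() && !is_sep(path[mid])) ++mid;
  if (mid == path.size()) return {};  // no separator after the db name
  return {path.substr(start, mid - start), path.substr(mid + 1)};
}
```

Both input forms yield the same database and table names, so later code does not need to care which separator the caller used.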
2018-05-10 19:05:13 +03:00
Sergei Petrunia
4d51009a77 MyRocks: fix rocksdb.rocksdb_range test attempt 3 2018-05-08 13:00:26 +03:00
Sergei Petrunia
dbe73588cd Merge branch 'bb-10.2-mariarocks-merge' of github.com:MariaDB/server into 10.2
Manually resolved the conflicts
2018-05-07 21:38:18 +03:00
Sergei Petrunia
e3661b9f7c Cherry-picked from MyRocks upstream: Issue #809: Wrong query result with bloom filters
In reverse-ordered column families, if one wants to start reading at the
  logical end of the index, they should Seek() to a key value that is not
  covered by the index. This may (and typically does) prevent use of a bloom
  filter.
  The calls to setup_scan_iterator() that are made for index and table scan
  didn't take this into account and passed eq_cond_len=INDEX_NUMBER_SIZE.
  Fixed them to compute and pass correct eq_cond_len.

  Also, removed an incorrect assert in ha_rocksdb::setup_iterator_bounds.
2018-05-07 20:21:35 +03:00
Sergei Petrunia
955233256e MyRocks: fix rocksdb.information_schema testcase.
"The Store binlog position inside RocksDB" feature is only needed for
obtaining binlog position after having restored a MyRocks backup.

This is not yet supported in MariaDB, so properly disable it in both
places where it is done.
2018-04-19 15:41:13 +03:00
Sergei Petrunia
0c02c91bc1 MyRocks: MDEV-15911: Reduce debug logging on default levels in error log
MyRocks internally will print non-critical messages to
sql_print_verbose_info() which will do what InnoDB does in similar cases:
check if (global_system_variables.log_warnings > 2).
2018-04-19 14:13:28 +03:00
Sergei Petrunia
d13e3547e4 MDEV-14460: Memory leak with only SELECT statements
Cherry-pick this fix from the upstream:
commit 6ddedd8f1e0ddcbc24e8f9a005636c5463799ab7
Author: Sergei Petrunia <psergey@askmonty.org>
Date:   Tue Apr 10 11:43:01 2018 -0700

    [mysql-5.6][PR] Issue #802: MyRocks: Statement rollback doesnt work correctly for nes…

    Summary:
    …ted statements

    Variant #1: When the statement fails, we should roll back to the latest
    savepoint taken at the top level.
    Closes https://github.com/facebook/mysql-5.6/pull/804

    Differential Revision: D7509380

    Pulled By: hermanlee

    fbshipit-source-id: 9a6f414
2018-04-13 01:56:01 +03:00
Sergei Petrunia
7e700bd2a8 Fix ha_rocksdb::calc_eq_cond_len() to handle HA_READ_PREFIX_LAST_OR_PREV correctly
This is Variant#2.
- Undo Variant#1
- Instead, swap the range bounds if we are doing a reverse-ordered scan.
2018-04-09 19:12:23 +03:00
Sergei Petrunia
8628c589f6 Fix ha_rocksdb::calc_eq_cond_len() to handle HA_READ_PREFIX_LAST_OR_PREV correctly 2018-04-09 15:27:35 +03:00
Sergei Petrunia
b922741074 MDEV-15472: Assertion `!is_set() || (m_status == DA_OK_BULK && is_bulk_op())' failure
MariaDB differs from the upstream for "DDL-like" commands. For these,
it sets binlog_format=STATEMENT for the duration of the statement.
This doesn't play well with MyRocks, which tries to prevent DML
commands with binlog_format!=ROW.

Also, if Locked_tables_list::reopen_tables() returned an error, then
close_cached_tables should propagate the error condition and not silently
consume it (it's difficult to have test coverage for this because this
error condition is rare)
2018-03-29 14:43:08 +03:00
Sergei Petrunia
011586c04d MDEV-15686: Loading MyRocks plugin back after it has been unloaded causes a crash
- Disallow loading of MyRocks (or any auxiliary) plugins after it has been
  unloaded.
- Do it carefully - Plugin's system variables may be accessed (e.g. default
  value is set) after the first rocksdb_done_func() call but before
  the second rocksdb_init_func() call.
2018-03-28 14:30:37 +03:00
Sergei Petrunia
60438451c3 MDEV-14843: Assertion `s_tx_list.size() == 0' failed in myrocks::Rdb_transaction::term_mutex
When the plugin is unloaded, walk the s_trx_list and delete the left over
Rdb_transaction objects.
It is responsibility of the SQL layer to make sure that the storage engine
has no open tables when the plugin is being unloaded.
2018-03-26 21:25:40 +03:00
Sergei Golubchik
e119799a92 fix compilation with -DPLUGIN_PARTITION=NO
rocksdb and spider
2018-02-22 08:40:54 +01:00
Sergei Petrunia
00a556c0c2 MDEV-15372: Parallel slave speedup very limited when log_slave_updates=OFF
Part #2: some transactions have m_rocksdb_tx==NULL (and most functions of
Rdb_transaction_impl handle this case. Do like they do)
2018-02-21 17:00:03 +03:00
Sergei Petrunia
01e89d6a86 MDEV-15372: Parallel slave speedup very limited when log_slave_updates=OFF
Make MyRocks' non-XA commit path to first do the commit without syncing
and then sync.
2018-02-21 15:42:34 +03:00
Varun Gupta
34ff2967c5 Post merge fix
rocksdb_set_update_cf_options function was freeing a pointer which it should not.
2018-02-08 19:25:19 +05:30
Sergei Petrunia
e3a03da2bc Merge from merge-myrocks:
commit 445e518bc7
Author: Sergei Petrunia <psergey@askmonty.org>
Date:   Sat Jan 27 10:18:20 2018 +0000

    Copy of
    commit f8f364b47f2784f16b401f27658f1c16eaf348ec
    Author: Jay Edgar <jkedgar@fb.com>
    Date:   Tue Oct 17 15:19:31 2017 -0700

        Add a hashed, hierarchical, wheel timer implementation

        Summary:
        In order to implement idle timeouts on detached sessions we need something inside MySQL that is lightweight and can handle calling events in the future wi

        By default the timers are grouped into 10ms buckets (the 'hashed' part), though the size of the buckets is configurable at the creation of the timer.  Eac

        Reviewed By: djwatson

        Differential Revision: D6199806

        fbshipit-source-id: 5e1590f
2018-01-27 11:52:34 +00:00
Sergei Petrunia
445e518bc7 Copy of
commit f8f364b47f2784f16b401f27658f1c16eaf348ec
Author: Jay Edgar <jkedgar@fb.com>
Date:   Tue Oct 17 15:19:31 2017 -0700

    Add a hashed, hierarchical, wheel timer implementation

    Summary:
    In order to implement idle timeouts on detached sessions we need something inside MySQL that is lightweight and can handle calling events in the future with very little cost for cancelling or resetting the event.  A hashed, hi

    By default the timers are grouped into 10ms buckets (the 'hashed' part), though the size of the buckets is configurable at the creation of the timer.  Each wheel (the 'wheel' part) maintains 256 buckets and cascades to the whe

    Reviewed By: djwatson

    Differential Revision: D6199806

    fbshipit-source-id: 5e1590f
2018-01-27 10:18:20 +00:00