mariadb/storage/rocksdb
Marko Mäkelä 1bd681c8b3 MDEV-25506 (3 of 3): Do not delete .ibd files before commit
This is a complete rewrite of DROP TABLE, also as part of other DDL,
such as ALTER TABLE, CREATE TABLE...SELECT, TRUNCATE TABLE.

The background DROP TABLE queue hack is removed.
If a transaction needs to drop and create a table by the same name
(like TRUNCATE TABLE does), it must first rename the table to an
internal #sql-ib name. No committed version of the data dictionary
will include any #sql-ib tables, because whenever a transaction
renames a table to a #sql-ib name, it will also drop that table.
Either the rename will be rolled back, or the drop will be committed.

Data files will be unlinked after the transaction has been committed
and a FILE_RENAME record has been durably written. The file will
actually be deleted when the detached file handle returned by
fil_delete_tablespace() is closed, after the latches have been
released. It is possible that a purge of the delete of the SYS_INDEXES
record for the clustered index will execute fil_delete_tablespace()
concurrently with the DDL transaction. In that case, the thread that
arrives later will wait for the other thread to finish.

HTON_TRUNCATE_REQUIRES_EXCLUSIVE_USE: A new handler flag.
ha_innobase::truncate() now requires that all other references to
the table be released in advance. This was implemented by Monty.

ha_innobase::delete_table(): If CREATE TABLE...SELECT is detected,
we will "hijack" the current transaction, drop the table in
the current transaction and commit the current transaction.
This essentially fixes MDEV-21602. There is a FIXME comment about
making the check less failure-prone.

ha_innobase::truncate(), ha_innobase::delete_table():
Implement a fast path for temporary tables. We will no longer allow
temporary tables to use the adaptive hash index.

dict_table_t::mdl_name: The original table name for the purpose of
acquiring MDL in purge, to prevent a race condition between a
DDL transaction that is dropping a table, and purge processing
undo log records of DML that had executed before the DDL operation.
For #sql-backup- tables during ALTER TABLE...ALGORITHM=COPY, the
dict_table_t::mdl_name will differ from dict_table_t::name.

dict_table_t::parse_name(): Use mdl_name instead of name.

dict_table_rename_in_cache(): Update mdl_name.

For the internal FTS_ tables of FULLTEXT INDEX, purge would
acquire MDL on the FTS_ table name, but not on the main table,
and therefore it would be able to run concurrently with a
DDL transaction that is dropping the table. Previously, the
DROP TABLE queue hack prevented a race between purge and DDL.
For now, we introduce purge_sys.stop_FTS() to prevent purge from
opening any table, while a DDL transaction that may drop FTS_
tables is in progress. The function fts_lock_table(), which will
be invoked before the dictionary is locked, will wait for
purge to release any table handles.

trx_t::drop_table_statistics(): Drop statistics for the table.
This replaces dict_stats_drop_index(). We will drop or rename
persistent statistics atomically as part of DDL transactions.
On lock conflict for dropping statistics, we will fail instantly
with DB_LOCK_WAIT_TIMEOUT, because we will be holding the
exclusive data dictionary latch.

trx_t::commit_cleanup(): Separated from trx_t::commit_in_memory().
Relax an assertion around fts_commit() and allow DB_LOCK_WAIT_TIMEOUT
in addition to DB_DUPLICATE_KEY. The call to fts_commit() is
entirely misplaced here and may obviously break the consistency
of transactions that affect FULLTEXT INDEX. It needs to be fixed
separately.

dict_table_t::n_foreign_key_checks_running: Remove (MDEV-21175).
The counter was a work-around for missing meta-data locking (MDL)
on the SQL layer, and not really needed in MariaDB.

ER_TABLE_IN_FK_CHECK: Replaced with ER_UNUSED_28.

HA_ERR_TABLE_IN_FK_CHECK: Remove.

row_ins_check_foreign_constraints(): Do not acquire
dict_sys.latch either. The SQL-layer MDL will protect us.

This was reviewed by Thirunarayanan Balathandayuthapani
and tested by Matthias Leich.
2021-06-09 17:06:07 +03:00
mysql-test MDEV-25506 (3 of 3): Do not delete .ibd files before commit 2021-06-09 17:06:07 +03:00
rocksdb@bba5e7bc21 MDEV-21930 RocksDB does not compile anymore, with Visual Studio 2020-03-23 11:25:01 +01:00
tools
unittest MDEV-14267: correct FSF address 2018-10-30 19:45:09 +08:00
.clang-format Copy of 2019-06-15 21:29:46 +03:00
.gitignore
atomic_stat.h
build_rocksdb.cmake Merge 10.5 into 10.6 2021-01-25 12:56:30 +02:00
CMakeLists.txt MDEV-25870 Windows - fix ARM64 cross-compilation 2021-06-07 23:15:36 +02:00
event_listener.cc Merge 10.2 into 10.3 2019-07-02 17:46:22 +03:00
event_listener.h Merge from MyRocks upstream: 2019-06-16 00:28:33 +03:00
get_rocksdb_files.sh
ha_rocksdb.cc MDEV-22189: Change error messages inside code to have mariadb instead of 2021-05-24 11:38:13 +05:30
ha_rocksdb.h MDEV-25180 Atomic ALTER TABLE 2021-05-19 22:54:13 +02:00
ha_rocksdb_proto.h Merge from MyRocks upstream: 2019-06-16 00:28:33 +03:00
logger.h Merge from MyRocks upstream: 2019-06-16 00:28:33 +03:00
myrocks_hotbackup.py Merge 10.2 into 10.3 2019-07-02 17:46:22 +03:00
nosql_access.cc Merge 10.2 into 10.3 2019-07-02 17:46:22 +03:00
nosql_access.h Merge from MyRocks upstream: 2019-06-16 00:28:33 +03:00
properties_collector.cc Merge 10.2 into 10.3 2019-07-02 17:46:22 +03:00
properties_collector.h Merge from MyRocks upstream: 2019-06-16 00:28:33 +03:00
rdb_buff.h Merge 10.2 into 10.3 2019-07-02 17:46:22 +03:00
rdb_cf_manager.cc Merge 10.2 into 10.3 2019-07-02 17:46:22 +03:00
rdb_cf_manager.h Merge from MyRocks upstream: 2019-06-16 00:28:33 +03:00
rdb_cf_options.cc Merge 10.2 into 10.3 2019-07-02 17:46:22 +03:00
rdb_cf_options.h Merge from MyRocks upstream: 2019-06-16 00:28:33 +03:00
rdb_compact_filter.h Merge from MyRocks upstream: 2019-06-16 00:28:33 +03:00
rdb_comparator.h Merge from MyRocks upstream: 2019-06-16 00:28:33 +03:00
rdb_converter.cc MDEV-22641: Provide SIMD optimized wrapper for zlib crc32() (#1558) 2020-06-01 11:34:06 +03:00
rdb_converter.h Copy of 2019-06-15 21:29:46 +03:00
rdb_datadic.cc MDEV-18465 Logging of DDL statements during backup 2021-05-19 22:54:13 +02:00
rdb_datadic.h MDEV-25180 Atomic ALTER TABLE 2021-05-19 22:54:13 +02:00
rdb_global.h MDEV-17171: RocksDB Tables do not have "Creation Date" 2019-10-31 19:45:25 +03:00
rdb_i_s.cc Merge 10.4 into 10.5 2019-08-13 18:57:00 +03:00
rdb_i_s.h Merge from MyRocks upstream: 2019-06-16 00:28:33 +03:00
rdb_index_merge.cc Merge 10.2 into 10.3 2019-07-02 17:46:22 +03:00
rdb_index_merge.h Merge from MyRocks upstream: 2019-06-16 00:28:33 +03:00
rdb_io_watchdog.cc Merge from MyRocks upstream: 2019-06-16 00:28:33 +03:00
rdb_io_watchdog.h Merge from MyRocks upstream: 2019-06-16 00:28:33 +03:00
rdb_mariadb_port.h
rdb_mariadb_server_port.cc Merge 10.2 into 10.3 2019-07-02 17:46:22 +03:00
rdb_mariadb_server_port.h
rdb_mutex_wrapper.cc Merge 10.2 into 10.3 2019-07-02 17:46:22 +03:00
rdb_mutex_wrapper.h Merge from MyRocks upstream: 2019-06-16 00:28:33 +03:00
rdb_perf_context.cc Merge 10.2 into 10.3 2019-07-02 17:46:22 +03:00
rdb_perf_context.h Merge from MyRocks upstream: 2019-06-16 00:28:33 +03:00
rdb_psi.cc Copy of 2019-06-15 21:29:46 +03:00
rdb_psi.h Copy of 2019-06-15 21:29:46 +03:00
rdb_source_revision.h.in
rdb_sst_info.cc Merge 10.2 into 10.3 2020-03-30 11:12:56 +03:00
rdb_sst_info.h Merge from MyRocks upstream: 2019-06-16 00:28:33 +03:00
rdb_threads.cc Merge 10.2 into 10.3 2019-07-02 17:46:22 +03:00
rdb_threads.h Merge from MyRocks upstream: 2019-06-16 00:28:33 +03:00
rdb_utils.cc Merge 10.2 into 10.3 2019-07-02 17:46:22 +03:00
rdb_utils.h Merge from MyRocks upstream: 2019-06-16 00:28:33 +03:00
README
rocksdb-range-access.txt
ut0counter.h MDEV-25602 get rid of __WIN__ in favor of standard _WIN32 2021-06-06 13:21:03 +02:00

== Summary ==
This directory contains the RocksDB-based Storage Engine (RDBSE) for MySQL,
also known as "MyRocks".

== Resources ==
https://github.com/facebook/mysql-5.6/wiki/Getting-Started-with-MyRocks
https://www.facebook.com/groups/MyRocks/

== Coding Conventions ==
The MyRocks coding conventions for the code in storage/rocksdb/ are based on
the default clang-format style with a few minor changes.  The file
storage/rocksdb/.clang-format describes these conventions and can be integrated
with Vim or Emacs as described here:
http://releases.llvm.org/3.6.0/tools/clang/docs/ClangFormat.html#vim-integration

All code outside of storage/rocksdb/ should conform to the MySQL coding
conventions:
http://dev.mysql.com/doc/internals/en/coding-guidelines.html

Several refinements:
  0. There is an umbrella C++ namespace named "myrocks" for all MyRocks code.
  1. We introduced "RDB" as the super-short abbreviation for "RocksDB". We will
     use it as a name prefix, with different capitalization (see below), to ease
     code navigation with ctags and grep.
     N.B. For ease of matching, we'll keep the names of variables and functions
          dealing with sysvars as close as possible to the externally visible
          names of the sysvars, which start with the "rocksdb_" prefix, the
          outward storage engine name.
  2. The names for classes, interfaces, and C++ structures (which act as
     classes) start with the prefix "Rdb_".
     N.B. For historical reasons, we'll keep the "ha_<storage_engine_name>"
          class name for the ha_rocksdb class, which is an exception to the
          rule.
  3. The names for global objects and functions start with the prefix "rdb_".
  4. The names for macros and constants start with the prefix "RDB_".
  5. Regular class member names start with "m_".
  6. Static class member names start with "s_".
  7. Given the limit of 80 characters per line, we'll not always use full
     English words in names when a well-known or easily recognizable
     abbreviation exists (like "tx" for "transaction" or "param" for
     "parameter", etc.).
  8. When disambiguation is needed, we use suffixes such as "_arg" for a
     function argument/parameter, "_arr" for a C-style array, and "_vect" for
     a std::vector.

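The refinements above can be illustrated with a minimal sketch that applies
each prefix and suffix rule in one place. Every identifier in it
(Rdb_demo_registry, RDB_DEMO_MAX_NAMES, rdb_demo_fill, and so on) is
hypothetical and does not exist in the MyRocks sources; only the naming
patterns are taken from the list.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Refinement 0: all code lives in the umbrella namespace "myrocks".
namespace myrocks {

// Refinement 4: macros and constants start with "RDB_".
// (Hypothetical constant, used only for this illustration.)
constexpr std::size_t RDB_DEMO_MAX_NAMES = 4;

// Refinement 2: classes and class-like structures start with "Rdb_".
// (Hypothetical class, used only for this illustration.)
class Rdb_demo_registry {
 public:
  // Refinement 8: "_arg" marks a function argument.
  // Returns false once RDB_DEMO_MAX_NAMES names have been stored.
  bool add_name(const std::string &name_arg) {
    if (m_name_vect.size() >= RDB_DEMO_MAX_NAMES) return false;
    m_name_vect.push_back(name_arg);
    ++s_total_added;
    return true;
  }

  std::size_t size() const { return m_name_vect.size(); }

  static std::size_t total_added() { return s_total_added; }

 private:
  // Refinements 5 and 8: "m_" for a regular member, "_vect" for a std::vector.
  std::vector<std::string> m_name_vect;
  // Refinement 6: "s_" for a static member.
  static std::size_t s_total_added;
};

std::size_t Rdb_demo_registry::s_total_added = 0;

// Refinement 3: global functions start with "rdb_".
// Refinement 7: "tx" is used as the abbreviation for "transaction".
// Refinement 8: "_arr" marks a C-style array.
// Returns the number of names the registry accepted.
std::size_t rdb_demo_fill(Rdb_demo_registry *const registry_arg,
                          const char *const tx_name_arr[],
                          const std::size_t count_arg) {
  std::size_t accepted = 0;
  for (std::size_t i = 0; i < count_arg; ++i) {
    if (registry_arg->add_name(tx_name_arr[i])) ++accepted;
  }
  return accepted;
}

}  // namespace myrocks
```

With these names, grepping for "Rdb_", "rdb_" or "RDB_" separates types,
global functions and constants, which is the code navigation benefit that
refinement 1 describes.
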
== Running Tests ==
To run tests from rocksdb, rocksdb_rpl or other rocksdb_* packages, use the
following parameters:
  --default-storage-engine=rocksdb
  --skip-innodb
  --default-tmp-storage-engine=MyISAM
  --rocksdb