mirror of
https://github.com/MariaDB/server.git
synced 2025-02-01 03:21:53 +01:00
1bd681c8b3
This is a complete rewrite of DROP TABLE, also as part of other DDL, such as ALTER TABLE, CREATE TABLE...SELECT, TRUNCATE TABLE.

The background DROP TABLE queue hack is removed. If a transaction needs to drop and create a table by the same name (like TRUNCATE TABLE does), it must first rename the table to an internal #sql-ib name. No committed version of the data dictionary will include any #sql-ib tables, because whenever a transaction renames a table to a #sql-ib name, it will also drop that table. Either the rename will be rolled back, or the drop will be committed.

Data files will be unlinked after the transaction has been committed and a FILE_RENAME record has been durably written. The file will actually be deleted when the detached file handle returned by fil_delete_tablespace() will be closed, after the latches have been released. It is possible that a purge of the delete of the SYS_INDEXES record for the clustered index will execute fil_delete_tablespace() concurrently with the DDL transaction. In that case, the thread that arrives later will wait for the other thread to finish.

HTON_TRUNCATE_REQUIRES_EXCLUSIVE_USE: A new handler flag. ha_innobase::truncate() now requires that all other references to the table be released in advance. This was implemented by Monty.

ha_innobase::delete_table(): If CREATE TABLE..SELECT is detected, we will "hijack" the current transaction, drop the table in the current transaction and commit the current transaction. This essentially fixes MDEV-21602. There is a FIXME comment about making the check less failure-prone.

ha_innobase::truncate(), ha_innobase::delete_table(): Implement a fast path for temporary tables. We will no longer allow temporary tables to use the adaptive hash index.

dict_table_t::mdl_name: The original table name for the purpose of acquiring MDL in purge, to prevent a race condition between a DDL transaction that is dropping a table, and purge processing undo log records of DML that had executed before the DDL operation. For #sql-backup- tables during ALTER TABLE...ALGORITHM=COPY, the dict_table_t::mdl_name will differ from dict_table_t::name.

dict_table_t::parse_name(): Use mdl_name instead of name.

dict_table_rename_in_cache(): Update mdl_name.

For the internal FTS_ tables of FULLTEXT INDEX, purge would acquire MDL on the FTS_ table name, but not on the main table, and therefore it would be able to run concurrently with a DDL transaction that is dropping the table. Previously, the DROP TABLE queue hack prevented a race between purge and DDL. For now, we introduce purge_sys.stop_FTS() to prevent purge from opening any table, while a DDL transaction that may drop FTS_ tables is in progress. The function fts_lock_table(), which will be invoked before the dictionary is locked, will wait for purge to release any table handles.

trx_t::drop_table_statistics(): Drop statistics for the table. This replaces dict_stats_drop_index(). We will drop or rename persistent statistics atomically as part of DDL transactions. On lock conflict for dropping statistics, we will fail instantly with DB_LOCK_WAIT_TIMEOUT, because we will be holding the exclusive data dictionary latch.

trx_t::commit_cleanup(): Separated from trx_t::commit_in_memory(). Relax an assertion around fts_commit() and allow DB_LOCK_WAIT_TIMEOUT in addition to DB_DUPLICATE_KEY. The call to fts_commit() is entirely misplaced here and may obviously break the consistency of transactions that affect FULLTEXT INDEX. It needs to be fixed separately.

dict_table_t::n_foreign_key_checks_running: Remove (MDEV-21175). The counter was a work-around for missing meta-data locking (MDL) on the SQL layer, and not really needed in MariaDB.

ER_TABLE_IN_FK_CHECK: Replaced with ER_UNUSED_28.

HA_ERR_TABLE_IN_FK_CHECK: Remove.

row_ins_check_foreign_constraints(): Do not acquire dict_sys.latch either. The SQL-layer MDL will protect us.

This was reviewed by Thirunarayanan Balathandayuthapani and tested by Matthias Leich.

mysql-test
rocksdb@bba5e7bc21
tools
unittest
.clang-format
.gitignore
atomic_stat.h
build_rocksdb.cmake
CMakeLists.txt
event_listener.cc
event_listener.h
get_rocksdb_files.sh
ha_rocksdb.cc
ha_rocksdb.h
ha_rocksdb_proto.h
logger.h
myrocks_hotbackup.py
nosql_access.cc
nosql_access.h
properties_collector.cc
properties_collector.h
rdb_buff.h
rdb_cf_manager.cc
rdb_cf_manager.h
rdb_cf_options.cc
rdb_cf_options.h
rdb_compact_filter.h
rdb_comparator.h
rdb_converter.cc
rdb_converter.h
rdb_datadic.cc
rdb_datadic.h
rdb_global.h
rdb_i_s.cc
rdb_i_s.h
rdb_index_merge.cc
rdb_index_merge.h
rdb_io_watchdog.cc
rdb_io_watchdog.h
rdb_mariadb_port.h
rdb_mariadb_server_port.cc
rdb_mariadb_server_port.h
rdb_mutex_wrapper.cc
rdb_mutex_wrapper.h
rdb_perf_context.cc
rdb_perf_context.h
rdb_psi.cc
rdb_psi.h
rdb_source_revision.h.in
rdb_sst_info.cc
rdb_sst_info.h
rdb_threads.cc
rdb_threads.h
rdb_utils.cc
rdb_utils.h
README
rocksdb-range-access.txt
ut0counter.h

== Summary ==
This directory contains the RocksDB-based Storage Engine (RDBSE) for MySQL,
also known as "MyRocks".

== Resources ==
https://github.com/facebook/mysql-5.6/wiki/Getting-Started-with-MyRocks
https://www.facebook.com/groups/MyRocks/

== Coding Conventions ==
The MyRocks coding conventions for the code in storage/rocksdb/ are based
on the default clang format with a few minor changes. The file
storage/rocksdb/.clang-format describes the conventions and can be
integrated with Vim or Emacs as described here:
http://releases.llvm.org/3.6.0/tools/clang/docs/ClangFormat.html#vim-integration

All code outside of storage/rocksdb/ should conform to the MySQL coding
conventions:
http://dev.mysql.com/doc/internals/en/coding-guidelines.html

Several refinements:
  0. There is an umbrella C++ namespace named "myrocks" for all MyRocks code.
  1. We introduced "RDB" as the super-short abbreviation for "RocksDB". We
     will use it as a name prefix, with different capitalization (see below),
     to ease code navigation with ctags and grep.
     N.B. For ease of matching, we'll keep the variables and functions dealing
     with sysvars as close as possible to the externally visible names of the
     sysvars, which start with the "rocksdb_" prefix, the outward storage
     engine name.
  2. The names for classes, interfaces, and C++ structures (which act as
     classes) start with the prefix "Rdb_".
     N.B. For historical reasons, we'll keep the "ha_<storage_engine_name>"
     class name for the ha_rocksdb class, which is an exception to the rule.
  3. The names for global objects and functions start with the prefix "rdb_".
  4. The names for macros and constants start with the prefix "RDB_".
  5. Regular class member names start with "m_".
  6. Static class member names start with "s_".
  7. Given the 80 character per line limit, we'll not always use full English
     words in names when a well-known or easily recognizable abbreviation
     exists (like "tx" for "transaction" or "param" for "parameter").
  8. When needing to disambiguate, we use different suffixes, like "_arg" for
     a function argument/parameter, "_arr" for a C style array, and "_vect"
     for a std::vector.

== Running Tests ==
To run tests from rocksdb, rocksdb_rpl or other rocksdb_* packages, use the
following parameters:
  --default-storage-engine=rocksdb --skip-innodb --default-tmp-storage-engine=MyISAM --rocksdb
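
As an illustration of the naming refinements listed under Coding Conventions, here is a minimal C++ sketch. Every identifier in it (RDB_EXAMPLE_MAX_KEYS, Rdb_example_tx, rdb_example_describe and the member names) is hypothetical and invented for this example; none of them is taken from the MyRocks sources.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// All names below are hypothetical; they only illustrate the conventions.

// Rule 4: macros and constants start with "RDB_".
#define RDB_EXAMPLE_MAX_KEYS 4

// Rule 0: all MyRocks code lives in the "myrocks" namespace.
namespace myrocks {

// Rule 2: class names start with "Rdb_".
// Rule 7: "tx" is used as the abbreviation for "transaction".
class Rdb_example_tx {
 public:
  // Rule 8: "_arg" disambiguates a parameter from the member it sets.
  explicit Rdb_example_tx(const std::string &name_arg) : m_name(name_arg) {
    ++s_tx_count;
  }

  // Rule 8: "_arr" marks a C style array. Keys beyond the limit are ignored.
  void add_keys(const uint32_t *const key_arr, const size_t key_cnt) {
    for (size_t i = 0;
         i < key_cnt && m_key_vect.size() < RDB_EXAMPLE_MAX_KEYS; i++) {
      m_key_vect.push_back(key_arr[i]);
    }
  }

  size_t key_count() const { return m_key_vect.size(); }
  const std::string &name() const { return m_name; }
  static int tx_count() { return s_tx_count; }

 private:
  std::string m_name;                // Rule 5: regular members use "m_".
  std::vector<uint32_t> m_key_vect;  // Rule 8: "_vect" marks a std::vector.
  static int s_tx_count;             // Rule 6: static members use "s_".
};

int Rdb_example_tx::s_tx_count = 0;

// Rule 3: global functions start with "rdb_".
inline std::string rdb_example_describe(const Rdb_example_tx &tx) {
  return tx.name() + ":" + std::to_string(tx.key_count());
}

}  // namespace myrocks
```

With these prefixes, a search such as grep for "Rdb_" finds only class-like types and a search for "rdb_" finds only global objects and functions, which is the navigation benefit that refinement 1 describes.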