mariadb

mirror of https://github.com/MariaDB/server.git synced 2026-05-16 20:07:13 +02:00

Author	SHA1	Message	Date
Marko Mäkelä	101da87228	Merge 10.5 into 10.6	2021-06-23 19:36:45 +03:00
Marko Mäkelä	22b62edaed	MDEV-25113: Make page flushing faster buf_page_write_complete(): Reduce the buf_pool.mutex hold time, and do not acquire buf_pool.flush_list_mutex at all. Instead, mark blocks clean by setting oldest_modification to 1. Dirty pages of temporary tables will be identified by the special value 2 instead of the previous special value 1. (By design of the ib_logfile0 format, actual LSN values smaller than 2048 are not possible.) buf_LRU_free_page(), buf_pool_t::get_oldest_modification() and many other functions will remove the garbage (clean blocks) from buf_pool.flush_list while holding buf_pool.flush_list_mutex. buf_pool_t::n_flush_LRU, buf_pool_t::n_flush_list: Replaced with non-atomic variables, protected by buf_pool.mutex, to avoid unnecessary synchronization when modifying the counts. export_vars: Remove unnecessary indirection for innodb_pages_created, innodb_pages_read, innodb_pages_written.	2021-06-23 19:06:52 +03:00
Marko Mäkelä	6ba938af62	MDEV-25905: Assertion table2==NULL in dict_sys_t::add() In commit `49e2c8f0a6` (MDEV-25743) we made dict_sys_t::find() incompatible with the rest of the table name hash table operations in case the table name contains non-ASCII octets (using a compatibility mode that facilitates the upgrade into the MySQL 5.0 filename-safe encoding) and the target platform implements signed char. ut_fold_string(): Remove; replace with my_crc32c(). This also makes table name hash value calculations independent on whether char is unsigned or signed.	2021-06-14 12:38:56 +03:00
Marko Mäkelä	1bd681c8b3	MDEV-25506 (3 of 3): Do not delete .ibd files before commit This is a complete rewrite of DROP TABLE, also as part of other DDL, such as ALTER TABLE, CREATE TABLE...SELECT, TRUNCATE TABLE. The background DROP TABLE queue hack is removed. If a transaction needs to drop and create a table by the same name (like TRUNCATE TABLE does), it must first rename the table to an internal #sql-ib name. No committed version of the data dictionary will include any #sql-ib tables, because whenever a transaction renames a table to a #sql-ib name, it will also drop that table. Either the rename will be rolled back, or the drop will be committed. Data files will be unlinked after the transaction has been committed and a FILE_RENAME record has been durably written. The file will actually be deleted when the detached file handle returned by fil_delete_tablespace() will be closed, after the latches have been released. It is possible that a purge of the delete of the SYS_INDEXES record for the clustered index will execute fil_delete_tablespace() concurrently with the DDL transaction. In that case, the thread that arrives later will wait for the other thread to finish. HTON_TRUNCATE_REQUIRES_EXCLUSIVE_USE: A new handler flag. ha_innobase::truncate() now requires that all other references to the table be released in advance. This was implemented by Monty. ha_innobase::delete_table(): If CREATE TABLE..SELECT is detected, we will "hijack" the current transaction, drop the table in the current transaction and commit the current transaction. This essentially fixes MDEV-21602. There is a FIXME comment about making the check less failure-prone. ha_innobase::truncate(), ha_innobase::delete_table(): Implement a fast path for temporary tables. We will no longer allow temporary tables to use the adaptive hash index. dict_table_t::mdl_name: The original table name for the purpose of acquiring MDL in purge, to prevent a race condition between a DDL transaction that is dropping a table, and purge processing undo log records of DML that had executed before the DDL operation. For #sql-backup- tables during ALTER TABLE...ALGORITHM=COPY, the dict_table_t::mdl_name will differ from dict_table_t::name. dict_table_t::parse_name(): Use mdl_name instead of name. dict_table_rename_in_cache(): Update mdl_name. For the internal FTS_ tables of FULLTEXT INDEX, purge would acquire MDL on the FTS_ table name, but not on the main table, and therefore it would be able to run concurrently with a DDL transaction that is dropping the table. Previously, the DROP TABLE queue hack prevented a race between purge and DDL. For now, we introduce purge_sys.stop_FTS() to prevent purge from opening any table, while a DDL transaction that may drop FTS_ tables is in progress. The function fts_lock_table(), which will be invoked before the dictionary is locked, will wait for purge to release any table handles. trx_t::drop_table_statistics(): Drop statistics for the table. This replaces dict_stats_drop_index(). We will drop or rename persistent statistics atomically as part of DDL transactions. On lock conflict for dropping statistics, we will fail instantly with DB_LOCK_WAIT_TIMEOUT, because we will be holding the exclusive data dictionary latch. trx_t::commit_cleanup(): Separated from trx_t::commit_in_memory(). Relax an assertion around fts_commit() and allow DB_LOCK_WAIT_TIMEOUT in addition to DB_DUPLICATE_KEY. The call to fts_commit() is entirely misplaced here and may obviously break the consistency of transactions that affect FULLTEXT INDEX. It needs to be fixed separately. dict_table_t::n_foreign_key_checks_running: Remove (MDEV-21175). The counter was a work-around for missing meta-data locking (MDL) on the SQL layer, and not really needed in MariaDB. ER_TABLE_IN_FK_CHECK: Replaced with ER_UNUSED_28. HA_ERR_TABLE_IN_FK_CHECK: Remove. row_ins_check_foreign_constraints(): Do not acquire dict_sys.latch either. The SQL-layer MDL will protect us. This was reviewed by Thirunarayanan Balathandayuthapani and tested by Matthias Leich.	2021-06-09 17:06:07 +03:00
Marko Mäkelä	65f1a42788	Merge 10.5 into 10.6	2021-06-09 16:50:58 +03:00
Marko Mäkelä	b25d2a4578	Merge 10.4 into 10.5	2021-06-09 14:29:58 +03:00
Marko Mäkelä	c7ee039d36	Merge 10.3 to 10.4	2021-06-09 14:28:57 +03:00
Marko Mäkelä	75a65d3201	MDEV-25886 CHECK TABLE crash with DB_MISSING_HISTORY if innodb_read_only Occasionally, the test innodb.alter_copy would fail in MariaDB 10.6.1, reporting DB_MISSING_HISTORY during CHECK TABLE. It started to occur during the development of MDEV-25180, which introduced purge_sys.stop_SYS(). If we delay purge more during DDL operations, then the test would almost always fail. The reason is that during startup we will restore a purge view, and CHECK TABLE would still use REPEATABLE READ even though innodb_read_only is set and other isolation levels than READ UNCOMMITTED are not guaranteed to work. ha_innobase::check(): Use READ UNCOMMITTED isolation level if innodb_read_only is set or innodb_force_recovery exceeds 3. dict_set_corrupted(): Do not update the persistent data dictionary if innodb_force_recovery exceeds 3.	2021-06-09 14:23:20 +03:00
Marko Mäkelä	f4425d3a3d	Merge 10.4 into 10.5	2021-06-08 16:03:53 +03:00
Marko Mäkelä	b8d38c5e39	dict_index_build_internal_clust(): Catch MDEV-20131 on CREATE TABLE	2021-06-08 15:03:50 +03:00
Marko Mäkelä	49e2c8f0a6	MDEV-25743: Unnecessary copying of table names in InnoDB dictionary Many InnoDB data dictionary cache operations require that the table name be copied so that it will be NUL terminated. (For example, SYS_TABLES.NAME is not guaranteed to be NUL-terminated.) dict_table_t::is_garbage_name(): Check if a name belongs to the background drop table queue. dict_check_if_system_table_exists(): Remove. dict_sys_t::load_sys_tables(): Load the non-hard-coded system tables SYS_FOREIGN, SYS_FOREIGN_COLS, SYS_VIRTUAL on startup. dict_sys_t::create_or_check_sys_tables(): Replaces dict_create_or_check_foreign_constraint_tables() and dict_create_or_check_sys_virtual(). dict_sys_t::load_table(): Replaces dict_table_get_low() and dict_load_table(). dict_sys_t::find_table(): Renamed from get_table(). dict_sys_t::sys_tables_exist(): Check whether all the non-hard-coded tables SYS_FOREIGN, SYS_FOREIGN_COLS, SYS_VIRTUAL exist. trx_t::has_stats_table_lock(): Moved to dict0stats.cc. Some error messages will now report table names in the internal databasename/tablename format, instead of `databasename`.`tablename`.	2021-05-21 18:03:40 +03:00
Marko Mäkelä	9eb4ad57de	Cleanup: Access lower_case_table_names, tdc_size directly dict_sys_t::evict_table_LRU(): Replaces dict_make_room_in_cache() and srv_master_evict_from_table_cache(). innobase_get_table_cache_size(): Replaced with direct read of tdc_size, in dict_sys_t::evict_table_LRU(). innobase_get_lower_case_table_names(): Replaced with direct reads of lower_case_table_names.	2021-05-21 18:03:40 +03:00
Marko Mäkelä	383f77cd84	Cleanup: Simplify dict_table_schema_check()	2021-05-21 18:03:39 +03:00
Monty	7762ee5dbe	MDEV-25180 Atomic ALTER TABLE MDEV-25604 Atomic DDL: Binlog event written upon recovery does not have default database The purpose of this task is to ensure that ALTER TABLE is atomic even if the MariaDB server would be killed at any point of the alter table. This means that either the ALTER TABLE succeeds (including that triggers, the status tables and the binary log are updated) or things should be reverted to their original state. If the server crashes before the new version is fully up to date and commited, it will revert to the original table and remove all temporary files and tables. If the new version is commited, crash recovery will use the new version, and update triggers, the status tables and the binary log. The one execption is ALTER TABLE .. RENAME .. where no changes are done to table definition. This one will work as RENAME and roll back unless the whole statement completed, including updating the binary log (if enabled). Other changes: - Added handlerton->check_version() function to allow the ddl recovery code to check, in case of inplace alter table, if the table in the storage engine is of the new or old version. - Added handler->table_version() so that an engine can report the current version of the table. This should be changed each time the table definition changes. - Added ha_signal_ddl_recovery_done() and handlerton::signal_ddl_recovery_done() to inform all handlers when ddl recovery has been done. (Needed by InnoDB). - Added handlerton call inplace_alter_table_committed, to signal engine that ddl_log has been closed for the alter table query. - Added new handerton flag HTON_REQUIRES_NOTIFY_TABLEDEF_CHANGED_AFTER_COMMIT to signal when we should call hton->notify_tabledef_changed() during mysql_inplace_alter_table. This was required as MyRocks and InnoDB needed the call at different times. - Added function server_uuid_value() to be able to generate a temporary xid when ddl recovery writes the query to the binary log. This is needed to be able to handle crashes during ddl log recovery. - Moved freeing of the frm definition to end of mysql_alter_table() to remove duplicate code and have a common exit strategy. ------- InnoDB part of atomic ALTER TABLE (Implemented by Marko Mäkelä) innodb_check_version(): Compare the saved dict_table_t::def_trx_id to determine whether an ALTER TABLE operation was committed. We must correctly recover dict_table_t::def_trx_id for this to work. Before purge removes any trace of DB_TRX_ID from system tables, it will make an effort to load the user table into the cache, so that the dict_table_t::def_trx_id can be recovered. ha_innobase::table_version(): return garbage, or the trx_id that would be used for committing an ALTER TABLE operation. In InnoDB, table names starting with #sql-ib will remain special: they will be dropped on startup. This may be revisited later in MDEV-18518 when we implement proper undo logging and rollback for creating or dropping multiple tables in a transaction. Table names starting with #sql will retain some special meaning: dict_table_t::parse_name() will not consider such names for MDL acquisition, and dict_table_rename_in_cache() will treat such names specially when handling FOREIGN KEY constraints. Simplify InnoDB DROP INDEX. Prevent purge wakeup To ensure that dict_table_t::def_trx_id will be recovered correctly in case the server is killed before ddl_log_complete(), we will block the purge of any history in SYS_TABLES, SYS_INDEXES, SYS_COLUMNS between ha_innobase::commit_inplace_alter_table(commit=true) (purge_sys.stop_SYS()) and purge_sys.resume_SYS(). The completion callback purge_sys.resume_SYS() must be between ddl_log_complete() and MDL release. -------- MyRocks support for atomic ALTER TABLE (Implemented by Sergui Petrunia) Implement these SE API functions: - ha_rocksdb::table_version() - hton->check_version = rocksdb_check_versionMyRocks data dictionary now stores table version for each table. (Absence of table version record is interpreted as table_version=0, that is, which means no upgrade changes are needed) - For inplace alter table of a partitioned table, call the underlying handlerton when checking if the table is ok. This assumes that the partition engine commits all changes at once.	2021-05-19 22:54:13 +02:00
Marko Mäkelä	1b1882c627	MDEV-25689: Remove MONITOR_TABLE_REFERENCE, MONITOR_TABLE_CLOSE The counter metadata_table_reference_count was not updated consistently ever since mysql-server@65c0af9a1dedae43b63797134aff6b32304ced52 or commit `2e814d4702` introduced dict_table_t::release(). The counter metadata_table_handles_closed was being incremented unconditionally in ha_innobase::close(), while the corresponding counter metadata_table_handles_opened would be incremented in ha_innobase::open() if the function returned early due to an error.	2021-05-17 09:56:04 +03:00
Marko Mäkelä	916b237b3f	Merge 10.5 into 10.6	2021-05-07 15:00:27 +03:00
Nikita Malyavin	3f55c56951	Merge branch bb-10.4-release into bb-10.5-release	2021-05-05 23:57:11 +03:00
Nikita Malyavin	509e4990af	Merge branch bb-10.3-release into bb-10.4-release	2021-05-05 23:03:01 +03:00
Nikita Malyavin	a8a925dd22	Merge branch bb-10.2-release into bb-10.3-release	2021-05-04 14:49:31 +03:00
Marko Mäkelä	52aac131e3	MDEV-18518 Multi-table CREATE and DROP transactions for InnoDB InnoDB used to support at most one CREATE TABLE or DROP TABLE per transaction. This caused complications for DDL operations on partitioned tables (where each partition is treated as a separate table by InnoDB) and FULLTEXT INDEX (where each index is maintained in a number of internal InnoDB tables). dict_drop_index_tree(): Extend the MDEV-24589 logic and treat the purge or rollback of SYS_INDEXES records of clustered indexes specially: by dropping the tablespace if it exists. This is the only form of recovery that we will need. trx_undo_ddl_type: Document the DDL undo log record types better. trx_t::dict_operation: Change the type to bool. trx_t::ddl: Remove. trx_t::table_id, trx_undo_t::table_id: Remove. dict_build_table_def_step(): Remove trx_t::table_id logging. dict_table_close_and_drop(), row_merge_drop_table(): Remove. row_merge_lock_table(): Merged to the only callers, which can call lock_table_for_trx() directly. fts_aux_table_t, fts_aux_id, fts_space_set_t: Remove. fts_drop_orphaned_tables(): Remove. row_merge_rename_index_to_drop(): Remove. Thanks to MDEV-24589, we can simply delete the to-be-dropped indexes from SYS_INDEXES, while still being able to roll back the operation. ha_innobase_inplace_ctx: Make a few data members const. Preallocate trx. prepare_inplace_alter_table_dict(): Simplify the logic. Let the normal rollback take care of some cleanup. row_undo_ins_remove_clust_rec(): Simplify the parsing of SYS_COLUMNS. trx_rollback_active(): Remove the special DROP TABLE logic. trx_undo_mem_create_at_db_start(), trx_undo_reuse_cached(): Always write TRX_UNDO_TABLE_ID as 0.	2021-05-04 13:48:55 +03:00
Marko Mäkelä	54e2e70194	MDEV-25524 heap-use-after-free in fil_space_t::rename() In commit `91599701d0` (MDEV-25312) some recovery code for TRUNCATE TABLE was broken causing a regression in a case where undo log for a RENAME TABLE operation had been durably written but the tablespace had not been renamed yet. row_rename_table_for_mysql(): Add a DEBUG_SYNC point for the test case, and simplify the logic and trim the error messages. fil_space_t::rename(): Simplify the operation. Merge the necessary part of fil_rename_tablespace_check(). If there is no change to the file name, do nothing. dict_table_t::rename_tablespace(): Refactored from dict_table_rename_in_cache(). row_undo_ins_parse_undo_rec(): On rolling back TRX_UNDO_RENAME_TABLE, invoke dict_table_t::rename_tablespace() even if the table name matches. os_file_rename_func(): Temporarily relax an assertion that would fail during the recovery in the test innodb.truncate_crash.	2021-04-29 18:31:59 +03:00
Thirunarayanan Balathandayuthapani	b862377c3e	MDEV-25503 InnoDB hangs on startup during recovery InnoDB startup hangs if a DDL transaction needs to be rolled back and a recovered transaction on statistics tables exists. In that case, InnoDB should rollback the transaction which holds locks on innodb_table_stats or innodb_index_stats during trx_rollback_or_clean_recovered().	2021-04-27 17:07:37 +05:30
Marko Mäkelä	d2e2d32933	Merge 10.5 into 10.6	2021-04-14 12:32:27 +03:00
Marko Mäkelä	6c3e860cbf	Merge 10.4 into 10.5	2021-04-14 11:35:39 +03:00
Marko Mäkelä	5008171b05	Merge 10.3 into 10.4	2021-04-14 10:33:59 +03:00
Marko Mäkelä	b8c8692fd9	MDEV-24620 ASAN heap-buffer-overflow in btr_pcur_restore_position() Between btr_pcur_store_position() and btr_pcur_restore_position() it is possible that purge empties a table and enlarges index->n_core_fields and index->n_core_null_bytes. Therefore, we must cache index->n_core_fields in btr_pcur_t::old_n_core_fields so that btr_pcur_t::old_rec can be parsed correctly. Unfortunately, this is a huge change, because we will replace "bool leaf" parameters with "ulint n_core" (passing index->n_core_fields, or 0 for non-leaf pages). For special cases where we know that index->is_instant() cannot hold, we may also pass index->n_fields.	2021-04-13 10:28:13 +03:00
Marko Mäkelä	6e6318b29b	Merge 10.2 into 10.3	2021-04-13 10:26:01 +03:00
Thirunarayanan Balathandayuthapani	cf2c6b7f8d	MDEV-24971 InnoDB access freed virtual column after rollback of secondary index Problem: ======== InnoDB fails to clean the index stub if it fails to add the virtual index which contains new virtual column. But it clears the newly virtual column from index in clear_added_indexes() during inplace_alter_table. On commit, InnoDB evicts and reload the table. In case of rollback, it doesn't happen. InnoDB clears the ABORTED index while opening the table or doing the DDL. In the mean time, InnoDB can access the dropped virtual index columns while creating prebuilt or rollback of concurrent DML. Solution: ========== (1) InnoDB should maintain newly added virtual column while rollbacking the newly added virtual index. (2) InnoDB must not defer the index removal if the alter table is executed with LOCK=EXCLUSIVE. (3) For LOCK=SHARED, InnoDB should check whether the table has any other transaction lock other than alter transaction before deferring the index stub. Replaced has_new_v_col with dict_add_vcol_info in dict_index_t to indicate whether the index has any new virtual column. dict_index_t::has_new_v_col(): Returns whether the index has newly added virtual column, it doesn't say which columns are newly added virtual column ha_innobase_inplace_ctx::is_new_vcol(): Return whether the given column is added as a part of the current alter. ha_innobase_inplace_ctx::clean_new_vcol_index(): Copy the newly added virtual column to new_vcol_info in dict_index_t. Replace the column in the index fields with virtual column stored in new_vcol_info. dict_index_t::assign_new_v_col(): Store the number of virtual column added in index as a part of alter table. dict_index_t::get_n_new_vcol(): Get the number of newly added virtual column dict_index_t::assign_drop_v_col(): Allocate the memory for adding new virtual column in new_vcol_info. dict_index_t::add_drop_v_col(): Add the newly added virtual column in new_vcol_info. dict_table_t::has_lock_for_other_trx(): Whether the table has any other transaction lock than given transaction. row_merge_drop_indexes(): Add parameter alter_trx and check whether the table has any other lock than alter transaction.	2021-04-12 16:06:06 +05:30
Marko Mäkelä	cf552f5886	MDEV-25312 Replace fil_space_t::name with fil_space_t::name() A consistency check for fil_space_t::name is causing recovery failures in MDEV-25180 (Atomic ALTER TABLE). So, we'd better remove that field altogether. fil_space_t::name was more or less a copy of dict_table_t::name (except for some special cases), and it was not being used for anything useful. There used to be a name_hash, but it had been removed already in commit `a75dbfd718` (MDEV-12266). We will also remove os_normalize_path(), OS_PATH_SEPARATOR, OS_PATH_SEPATOR_ALT. On Microsoft Windows, we will treat \ and / roughly in the same way. The intention is that for per-table tablespaces, the filenames will always follow the pattern prefix/databasename/tablename.ibd. (Any \ in the prefix must not be converted.) ut_basename_noext(): Remove (unused function). read_link_file(): Replaces RemoteDatafile::read_link_file(). We will ensure that the last two path component separators are forward slashes (converting up to 2 trailing backslashes on Microsoft Windows), so that everywhere else we can assume that data file names end in "/databasename/tablename.ibd". Note: On Microsoft Windows, path names that start with \\?\ must not contain / as path component separators. Previously, such paths did work in the DATA DIRECTORY argument of InnoDB tables. Reviewed by: Vladislav Vaintroub	2021-04-07 18:01:13 +03:00
Krunal Bauskar	33cf577ad8	MDEV-25029: Reduce dict_sys mutex contention for read-only workload In theory, read-only workload shouldn't have anything to do with dict_sys mutex. dict_sys mutex is meant to protect database metadata. But then why does dict_sys mutex shows up as part of a read-only workload? This workload needs to fetch stats for query processing and while reading these stats dict_sys mutex is taken. Is this really needed? No. For the traditional reasons, it was the default global mutex used. Based on 10.6 changes, flow can now use table->lock_mutex to protect update/access of these stats. table mutexes being table specific global contention arising out of dict_sys is reduced. Thanks to Marko Makela for his early suggestion around proposed alternative and review of the draft patch.	2021-03-05 14:26:14 +02:00
Marko Mäkelä	b08448de64	MDEV-20612: Partition lock_sys.latch We replace the old lock_sys.mutex (which was renamed to lock_sys.latch) with a combination of a global lock_sys.latch and table or page hash lock mutexes. The global lock_sys.latch can be acquired in exclusive mode, or it can be acquired in shared mode and another mutex will be acquired to protect the locks for a particular page or a table. This is inspired by mysql/mysql-server@1d259b87a6 but the optimization of lock_release() will be done in the next commit. Also, we will interleave mutexes with the hash table elements, similar to how buf_pool.page_hash was optimized in commit `5155a300fa` (MDEV-22871). dict_table_t::autoinc_trx: Use Atomic_relaxed. dict_table_t::autoinc_mutex: Use srw_mutex in order to reduce the memory footprint. On 64-bit Linux or OpenBSD, both this and the new dict_table_t::lock_mutex should be 32 bits and be stored in the same 64-bit word. On Microsoft Windows, the underlying SRWLOCK is 32 or 64 bits, and on other systems, sizeof(pthread_mutex_t) can be much larger. ib_lock_t::trx_locks, trx_lock_t::trx_locks: Document the new rules. Writers must assert lock_sys.is_writer() \|\| trx->mutex_is_owner(). LockGuard: A RAII wrapper for acquiring a page hash table lock. LockGGuard: Like LockGuard, but when Galera Write-Set Replication is enabled, we must acquire all shards, for updating arbitrary trx_locks. LockMultiGuard: A RAII wrapper for acquiring two page hash table locks. lock_rec_create_wsrep(), lock_table_create_wsrep(): Special Galera conflict resolution in non-inlined functions in order to keep the common code paths shorter. lock_sys_t::prdt_page_free_from_discard(): Refactored from lock_prdt_page_free_from_discard() and lock_rec_free_all_from_discard_page(). trx_t::commit_tables(): Replaces trx_update_mod_tables_timestamp(). lock_release(): Let trx_t::commit_tables() invalidate the query cache for those tables that were actually modified by the transaction. Merge lock_check_dict_lock() to lock_release(). We must never release lock_sys.latch while holding any lock_sys_t::hash_latch. Failure to do that could lead to memory corruption if the buffer pool is resized between the time lock_sys.latch is released and the hash_latch is released.	2021-02-12 17:44:32 +02:00
Marko Mäkelä	ff5d306e29	MDEV-21452: Replace ib_mutex_t with mysql_mutex_t SHOW ENGINE INNODB MUTEX functionality is completely removed, as are the InnoDB latching order checks. We will enforce innodb_fatal_semaphore_wait_threshold only for dict_sys.mutex and lock_sys.mutex. dict_sys_t::mutex_lock(): A single entry point for dict_sys.mutex. lock_sys_t::mutex_lock(): A single entry point for lock_sys.mutex. FIXME: srv_sys should be removed altogether; it is duplicating tpool functionality. fil_crypt_threads_init(): To prevent SAFE_MUTEX warnings, we must not hold fil_system.mutex. fil_close_all_files(): To prevent SAFE_MUTEX warnings for fil_space_destroy_crypt_data(), we must not hold fil_system.mutex while invoking fil_space_free_low() on a detached tablespace.	2020-12-15 17:56:18 +02:00
Marko Mäkelä	9702be2c73	MDEV-24142: Remove __FILE__,__LINE__ related to buf_block_t::lock	2020-12-03 15:28:53 +02:00
Marko Mäkelä	03ca6495df	MDEV-24142: Replace InnoDB rw_lock_t with sux_lock InnoDB buffer pool block and index tree latches depend on a special kind of read-update-write lock that allows reentrant (recursive) acquisition of the 'update' and 'write' locks as well as an upgrade from 'update' lock to 'write' lock. The 'update' lock allows any number of reader locks from other threads, but no concurrent 'update' or 'write' lock. If there were no requirement to support an upgrade from 'update' to 'write', we could compose the lock out of two srw_lock (implemented as any type of native rw-lock, such as SRWLOCK on Microsoft Windows). Removing this requirement is very difficult, so in commit f7e7f487d4b06695f91f6fbeb0396b9d87fc7bbf we implemented an 'update' mode to our srw_lock. Re-entrant or recursive locking is mostly needed when writing or freeing BLOB pages, but also in crash recovery or when merging buffered changes to an index page. The re-entrancy allows us to attach a previously acquired page to a sub-mini-transaction that will be committed before whatever else is holding the page latch. The SUX lock supports Shared ('read'), Update, and eXclusive ('write') locking modes. The S latches are not re-entrant, but a single S latch may be acquired even if the thread already holds an U latch. The idea of the U latch is to allow a write of something that concurrent readers do not care about (such as the contents of BTR_SEG_LEAF, BTR_SEG_TOP and other page allocation metadata structures, or the MDEV-6076 PAGE_ROOT_AUTO_INC). (The PAGE_ROOT_AUTO_INC field is only updated when a dict_table_t for the table exists, and only read when a dict_table_t for the table is being added to dict_sys.) block_lock::u_lock_try(bool for_io=true) is used in buf_flush_page() to allow concurrent readers but no concurrent modifications while the page is being written to the data file. That latch will be released by buf_page_write_complete() in a different thread. Hence, we use the special lock owner value FOR_IO. The index_lock::u_lock() improves concurrency on operations that involve non-leaf index pages. The interface has been cleaned up a little. We will use x_lock_recursive() instead of x_lock() when we know that a lock is already held by the current thread. Similarly, a lock upgrade from U to X is only allowed via u_x_upgrade() or x_lock_upgraded() but not via x_lock(). We will disable the LatchDebug and sync_array interfaces to InnoDB rw-locks. The SEMAPHORES section of SHOW ENGINE INNODB STATUS output will no longer include any information about InnoDB rw-locks, only TTASEventMutex (cmake -DMUTEXTYPE=event) waits. This will make a part of the 'innotop' script dead code. The block_lock buf_block_t::lock will not be covered by any PERFORMANCE_SCHEMA instrumentation. SHOW ENGINE INNODB MUTEX and INFORMATION_SCHEMA.INNODB_MUTEXES will no longer output source code file names or line numbers. The dict_index_t::lock will be identified by index and table names, which should be much more useful. PERFORMANCE_SCHEMA is lumping information about all dict_index_t::lock together as event_name='wait/synch/sxlock/innodb/index_tree_rw_lock'. buf_page_free(): Remove the file,line parameters. The sux_lock will not store such diagnostic information. buf_block_dbg_add_level(): Define as empty macro, to be removed in a subsequent commit. Unless the build was configured with cmake -DPLUGIN_PERFSCHEMA=NO the index_lock dict_index_t::lock will be instrumented via PERFORMANCE_SCHEMA. Similar to commit `1669c8890c` we will distinguish lock waits by registering shared_lock,exclusive_lock events instead of try_shared_lock,try_exclusive_lock. Actual 'try' operations will not be instrumented at all. rw_lock_list: Remove. After MDEV-24167, this only covered buf_block_t::lock and dict_index_t::lock. We will output their information by traversing buf_pool or dict_sys.	2020-12-03 15:19:49 +02:00
Marko Mäkelä	a13fac9eee	Merge 10.5 into 10.6	2020-12-03 08:12:47 +02:00
Marko Mäkelä	6a1e655cb0	Merge 10.4 into 10.5	2020-12-02 18:29:49 +02:00
Marko Mäkelä	589cf8dbf3	Merge 10.3 into 10.4	2020-12-01 19:51:14 +02:00
Marko Mäkelä	81ab9ea63f	Merge 10.2 into 10.3	2020-12-01 14:55:46 +02:00
Marko Mäkelä	1c9833c511	Cleanup: row_log_free() The nonnull attribute is not applicable to parameters that are passed by reference, at least not in the Intel compiler. Let us remove the reference indirection, which was only there so that the pointer could be assigned to NULL, and let the callers perform that task. row_log_allocate(): Fix a bug in out-of-memory error handling that would leave a pointer to freed memory.	2020-11-25 10:54:38 +02:00
Marko Mäkelä	581aebe29f	MDEV-24167: Fix -DPLUGIN_PERFSCHEMA=NO and Windows debug builds	2020-11-24 21:01:22 +02:00
Marko Mäkelä	bdd88cfa34	MDEV-24167: Replace fts_cache_rw_lock, fts_cache_init_rw_lock with mutex fts_cache_t::init_lock: Replace with mutex. This was only acquired in exclusive mode. fts_cache_t:🔒 Replace with mutex. The only read-lock user was i_s_fts_index_cache_fill() for producing content for the view INFORMATION_SCHEMA.INNODB_FT_INDEX_CACHE.	2020-11-24 15:43:12 +02:00
Marko Mäkelä	1a1b7a6f16	MDEV-24167: Replace dict_operation_lock (dict_sys.latch)	2020-11-24 15:43:11 +02:00
Marko Mäkelä	d7a5824899	Merge 10.4 into 10.5	2020-11-13 21:54:21 +02:00
Marko Mäkelä	972dc6ee98	Merge 10.3 into 10.4	2020-11-12 11:18:04 +02:00
Marko Mäkelä	150f447af1	Merge 10.2 into 10.3	2020-11-12 10:37:21 +02:00
Marko Mäkelä	d6ee28582a	Cleanup: Remove dict_space_is_empty(), dict_space_get_id() As noted in commit `0b66d3f70d`, MariaDB does not support CREATE TABLESPACE for InnoDB. Hence, some code that was added in commit `fec844aca8` and originally in mysql/mysql-server@c71dd213bd is unused in MariaDB and should be removed.	2020-11-11 15:48:43 +02:00
Marko Mäkelä	5407117a59	MDEV-22343 Remove SYS_TABLESPACES and SYS_DATAFILES The InnoDB internal tables SYS_TABLESPACES and SYS_DATAFILES as well as the INFORMATION_SCHEMA views INNODB_SYS_TABLESPACES and INNODB_SYS_DATAFILES were introduced in MySQL 5.6 for no good reason in mysql/mysql-server/commit/e9255a22ef16d612a8076bc0b34002bc5a784627 when the InnoDB support for the DATA DIRECTORY attribute was introduced. The file system should be the authoritative source of information on files. Storing information about file system paths in the file system (symlinks, or even the .isl files that were unfortunately chosen as the solution) is sufficient. If information is additionally stored in some hidden tables inside the InnoDB system tablespace, everything unnecessarily becomes more complicated, because more copies of data mean more opportunity for the copies to be out of sync, and because modifying the data in the system tablespace in the desired way might not be possible at all without modifying the InnoDB source code. So, the copy in the system tablespace basically is a redundant, non-authoritative source of information. We will stop creating or accessing the system tables SYS_TABLESPACES and SYS_DATAFILES. We will also remove the view INFORMATION_SCHEMA.INNODB_SYS_DATAFILES along with SYS_DATAFILES. The view INFORMATION_SCHEMA.INNODB_SYS_TABLESPACES will be repurposed to directly reflect fil_system.space_list. The column PAGE_SIZE, which would always contain the value of the GLOBAL read-only variable innodb_page_size, is removed. The column ZIP_PAGE_SIZE, which would actually contain the physical page size of a page, is renamed to PAGE_SIZE. Finally, a new column FILENAME is added, as a replacement of SYS_DATAFILES.PATH. This will also address MDEV-21801 (files that were created before upgrading to MySQL 5.6 or MariaDB 10.0 or later were never registered in SYS_TABLESPACES or SYS_DATAFILES) and MDEV-21801 (information about the system tablespace is not stored in SYS_TABLESPACES or SYS_DATAFILES).	2020-11-11 11:15:11 +02:00
Marko Mäkelä	898521e2dd	Merge 10.4 into 10.5	2020-10-30 11:15:30 +02:00
Marko Mäkelä	7b2bb67113	Merge 10.3 into 10.4	2020-10-29 13:38:38 +02:00
Marko Mäkelä	2b6f804490	Merge 10.2 into 10.3	2020-10-28 10:44:40 +02:00

1 2 3 4 5 ...

400 commits