mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-31 02:51:44 +01:00

Author	SHA1	Message	Date
Sergei Golubchik	ef781162ff	Merge branch '10.4' into 10.5	2022-05-09 22:04:06 +02:00
Marko Mäkelä	0806592ac8	MDEV-28422 Page split breaks a gap lock btr_insert_into_right_sibling(): Inherit any gap lock from the left sibling to the right sibling before inserting the record to the right sibling and updating the node pointer(s). lock_update_node_pointer(): Update locks in case a node pointer will move. Based on mysql/mysql-server@c7d93c274f	2022-04-27 13:38:08 +03:00
Vlad Lesin	0b92c7b0e0	Merge 10.4 into 10.5	2022-03-07 17:16:11 +03:00
Vlad Lesin	86c1bf118a	MDEV-27992 DELETE fails to delete record after blocking is released MDEV-27025 allows to insert records before the record on which DELETE is locked, as a result the DELETE misses those records, what causes serious ACID violation. Revert MDEV-27025, MDEV-27550. The test which shows the scenario of ACID violation is added.	2022-03-07 16:42:05 +03:00
Vlad Lesin	5f001bd7b8	MDEV-27025 insert-intention lock conflicts with waiting ORDINARY lock The code was backported from 10.5 `be8113861c` commit. See that commit message for details.	2022-02-21 12:49:54 +03:00
Oleksandr Byelkin	cf63eecef4	Merge branch '10.4' into 10.5	2022-02-01 20:33:04 +01:00
Oleksandr Byelkin	41a163ac5c	Merge branch '10.2' into 10.3	2022-01-29 15:41:05 +01:00
Vlad Lesin	be8113861c	MDEV-27025 insert-intention lock conflicts with waiting ORDINARY lock The code was backported from 10.6 `bd03c0e516` commit. See that commit message for details. Apart from the above commit trx_lock_t::wait_trx was also backported from MDEV-24738. trx_lock_t::wait_trx is protected with lock_sys.wait_mutex in 10.6, but that mutex was implemented only in MDEV-24789. As there is no need to backport MDEV-24789 for MDEV-27025, trx_lock_t::wait_trx is protected with the same mutexes as trx_lock_t::wait_lock. This fix should not break innodb-lock-schedule-algorithm=VATS. This algorithm uses an Eldest-Transaction-First (ETF) heuristic, which prefers older transactions over new ones. In this fix we just insert granted lock just before the last granted lock of the same transaction, what does not change transactions execution order. The changes in lock_rec_create_low() should not break Galera Cluster, there is a big "if" branch for WSREP. This branch is necessary to provide the correct transactions execution order, and should not be changed for the current bug fix.	2022-01-18 18:15:10 +03:00
Vladislav Vaintroub	47e18af906	MDEV-27494 Rename .ic files to .inl	2022-01-17 16:41:51 +01:00
Marko Mäkelä	18eab4a832	MDEV-26682 Replication timeouts with XA PREPARE The purpose of non-exclusive locks in a transaction is to guarantee that the records covered by those locks must remain in that way until the transaction is committed. (The purpose of gap locks is to ensure that a record that was nonexistent will remain that way.) Once a transaction has reached the XA PREPARE state, the only allowed further actions are XA ROLLBACK or XA COMMIT. Therefore, it can be argued that only the exclusive locks that the XA PREPARE transaction is holding are essential. Furthermore, InnoDB never preserved explicit locks across server restart. For XA PREPARE transations, we will only recover implicit exclusive locks for records that had been modified. Because of the fact that XA PREPARE followed by a server restart will cause some locks to be lost, we might as well always release all non-exclusive locks during the execution of an XA PREPARE statement. lock_release_on_prepare(): Release non-exclusive locks on XA PREPARE. trx_prepare(): Invoke lock_release_on_prepare() unless the isolation level is SERIALIZABLE or this is an internal distributed transaction with the binlog (not actual XA PREPARE statement). This has been discussed with Sergei Golubchik and Andrei Elkin. Reviewed by: Sergei Golubchik	2021-10-18 12:49:10 +03:00
Marko Mäkelä	a9550c47e4	MDEV-16264 fixup: Remove unused code and data LATCH_ID_OS_AIO_READ_MUTEX, LATCH_ID_OS_AIO_WRITE_MUTEX, LATCH_ID_OS_AIO_LOG_MUTEX, LATCH_ID_OS_AIO_IBUF_MUTEX, LATCH_ID_OS_AIO_SYNC_MUTEX: Remove. The tpool is not instrumented. lock_set_timeout_event(): Remove. srv_sys_mutex_key, srv_sys_t::mutex, SYNC_THREADS: Remove. srv_slot_t::suspended: Remove. We only ever assigned this data member true, so it is redundant. ib_wqueue_wait(), ib_wqueue_timedwait(): Remove. os_thread_join(): Remove. os_thread_create(), os_thread_exit(): Remove redundant parameters. These were missed in commit `5e62b6a5e0`.	2020-09-30 14:28:11 +03:00
Marko Mäkelä	00cd53d39a	MDEV-23719: Make lock_sys use page_id_t Since commit `8ccb3caafb` it should be more efficient to use page_id_t rather than two separate variables for tablespace identifier and page number. lock_rec_fold(): Replaced with page_id_t::fold(). lock_rec_hash(): Replaced with lock_sys.hash(page_id). lock_rec_expl_exist_on_page(), lock_rec_get_first_on_page_addr(), lock_rec_get_first_on_page(): Replaced with lock_sys.get_first().	2020-09-17 14:08:41 +03:00
Marko Mäkelä	cfd3d70ccb	MDEV-22871: Remove pointer indirection for InnoDB hash_table_t hash_get_n_cells(): Remove. Access n_cells directly. hash_get_nth_cell(): Remove. Access array directly. hash_table_clear(): Replaced with hash_table_t::clear(). hash_table_create(), hash_table_free(): Remove. hash0hash.cc: Remove.	2020-06-18 14:16:01 +03:00
Marko Mäkelä	6877ef9a7c	Merge 10.4 into 10.5	2020-06-05 20:36:43 +03:00
Marko Mäkelä	680463a8d9	Merge 10.2 into 10.3	2020-06-05 16:51:26 +03:00
Marko Mäkelä	eba2d10ac5	MDEV-22721 Remove bloat caused by InnoDB logger class Introduce a new ATTRIBUTE_NOINLINE to ib::logger member functions, and add UNIV_UNLIKELY hints to callers. Also, remove some crash reporting output. If needed, the information will be available using debugging tools. Furthermore, remove some fts_enable_diag_print output that included indexed words in raw form. The code seemed to assume that words are NUL-terminated byte strings. It is not clear whether a NUL terminator is always guaranteed to be present. Also, UCS2 or UTF-16 strings would typically contain many NUL bytes.	2020-06-04 10:24:10 +03:00
Marko Mäkelä	7bcaa541aa	Merge 10.4 into 10.5	2020-05-05 21:16:22 +03:00
Oleksandr Byelkin	7fb73ed143	Merge branch '10.2' into 10.3	2020-05-04 16:47:11 +02:00
Marko Mäkelä	496d0372ef	Merge 10.4 into 10.5	2020-04-29 15:40:51 +03:00
Daniel Black	ba2061da52	MDEV-21595: innodb offset_t rename to rec_offs thanks to: perl -i -pe 's/\boffset_t\b/rec_offs/g' $(git grep -lw offset_t storage/innobase)	2020-04-29 12:02:47 +03:00
Marko Mäkelä	2e12d471ea	Merge 10.2 into 10.3	2020-04-27 14:24:41 +03:00
Marko Mäkelä	c06845d6f0	Merge 10.1 into 10.2	2020-04-27 13:28:13 +03:00
Marko Mäkelä	edbdfc2f99	MDEV-7962 wsrep_on() takes 0.14% in OLTP RO The function wsrep_on() was being called rather frequently in InnoDB and XtraDB. Let us cache it in trx_t and invoke trx_t::is_wsrep() instead. innobase_trx_init(): Cache trx->wsrep = wsrep_on(thd). ha_innobase::write_row(): Replace many repeated calls to current_thd, and test the cheapest condition first.	2020-04-27 11:18:11 +03:00
Marko Mäkelä	fbe2712705	Merge 10.4 into 10.5 The functional changes of commit `5836191c8f` (MDEV-21168) are omitted due to MDEV-742 having addressed the issue.	2020-04-25 21:57:52 +03:00
Marko Mäkelä	2c39f69d34	MDEV-22203: WSREP_ON is unnecessarily expensive WITH_WSREP=OFF If the server is compiled WITH_WSREP=OFF, we should avoid evaluating conditions on a global variable that is constant. WSREP_ON_: Renamed from WSREP_ON. Defined only WITH_WSREP=ON. WSREP_ON: Defined as unlikely(WSREP_ON_). wsrep_on(): Defined as WSREP_ON && wsrep_service->wsrep_on_func(). The reason why we have wsrep_on() at all is that the macro WSREP(thd) depends on the definition of THD, and that is intentionally an opaque data type for InnoDB. So, we cannot avoid invoking wsrep_on(), but we can evaluate the less expensive condition WSREP_ON before calling the function.	2020-04-24 15:25:39 +03:00
Marko Mäkelä	574d8b2940	MDEV-21907: Fix most clang -Wconversion in InnoDB Declare innodb_purge_threads as 4-byte integer (UINT) instead of 4-or-8-byte (ULONG) and adjust the documentation string.	2020-03-11 08:29:48 +02:00
Marko Mäkelä	28c89b7151	Merge 10.4 into 10.5	2019-12-16 07:47:17 +02:00
Marko Mäkelä	3466b47b0d	Merge 10.2 into 10.3	2019-12-13 10:08:57 +02:00
Eugene Kosov	f0aa073f2b	MDEV-20950 Reduce size of record offsets offset_t: this is a type which represents one record offset. It's unsigned short int. a lot of functions: replace ulint with offset_t btr_pcur_restore_position_func(), page_validate(), row_ins_scan_sec_index_for_duplicate(), row_upd_clust_rec_by_insert_inherit_func(), row_vers_impl_x_locked_low(), trx_undo_prev_version_build(): allocate record offsets on the stack instead of waiting for rec_get_offsets() to allocate it from mem_heap_t. So, reducing memory allocations. RECORD_OFFSET, INDEX_OFFSET: now it's less convenient to store pointers in offset_t* array. One pointer occupies now several offset_t. And those constant are start indexes into array to places where to store pointer values REC_OFFS_HEADER_SIZE: adjusted for the new reality REC_OFFS_NORMAL_SIZE: increase size from 100 to 300 which means less heap allocations. And sizeof(offset_t[REC_OFFS_NORMAL_SIZE]) now is 600 bytes which is smaller than previous 800 bytes. REC_OFFS_SEC_INDEX_SIZE: adjusted for the new reality rem0rec.h, rem0rec.ic, rem0rec.cc: various arguments, return values and local variables types were changed to fix numerous integer conversions issues. enum field_type_t: offset types concept was introduces which replaces old offset flags stuff. Like in earlier version, 2 upper bits are used to store offset type. And this enum represents those types. REC_OFFS_SQL_NULL, REC_OFFS_MASK: removed get_type(), set_type(), get_value(), combine(): these are convenience functions to work with offsets and it's types rec_offs_base()[0]: still uses an old scheme with flags REC_OFFS_COMPACT and REC_OFFS_EXTERNAL rec_offs_base()[i]: these have type offset_t now. Two upper bits contains type.	2019-12-13 00:26:50 +07:00
Vladislav Vaintroub	a808c18b73	Make .clang-format work with clang-8 Remove keywords that are too new.	2019-11-15 18:09:30 +01:00
Vladislav Vaintroub	5e62b6a5e0	MDEV-16264 Use threadpool for Innodb background work. Almost all threads have gone - the "ticking" threads, that sleep a while then do some work) (srv_monitor_thread, srv_error_monitor_thread, srv_master_thread) were replaced with timers. Some timers are periodic, e.g the "master" timer. - The btr_defragment_thread is also replaced by a timer , which reschedules it self when current defragment "item" needs throttling - the buf_resize_thread and buf_dump_threads are substitutes with tasks Ditto with page cleaner workers. - purge workers threads are not tasks as well, and purge cleaner coordinator is a combination of a task and timer. - All AIO is outsourced to tpool, Innodb just calls thread_pool::submit_io() and provides the callback. - The srv_slot_t was removed, and innodb_debug_sync used in purge is currently not working, and needs reimplementation.	2019-11-15 18:09:30 +01:00
Marko Mäkelä	4081b7b27a	Merge 10.4 into 10.5	2019-09-06 17:16:40 +03:00
Marko Mäkelä	2c9e75ccfe	MDEV-15326 after-merge fixes trx_t::is_recovered: Revert most of the changes that were made by the merge of MDEV-15326 from 10.2. The trx_sys.rw_trx_hash and the recovery of transactions at startup is quite different in 10.3. trx_free_at_shutdown(): Avoid excessive mutex protection. Reading fields that can only be modified by the current thread (owning the transaction) can be done outside mutex. trx_t::commit_state(): Restore a tighter assertion. trx_rollback_recovered(): Clarify why there is no potential race condition with other transactions. lock_trx_release_locks(): Merge with trx_t::release_locks(), and avoid holding lock_sys.mutex unnecessarily long. rw_trx_hash_t::find(): Remove redundant code, and avoid starving the committer by checking trx_t::state before trx_t::reference().	2019-09-05 15:58:31 +03:00
Marko Mäkelä	537f8594a6	Merge 10.2 into 10.3	2019-09-04 17:52:04 +03:00
Marko Mäkelä	b07beff894	MDEV-15326: InnoDB: Failing assertion: !other_lock MySQL 5.7.9 (and MariaDB 10.2.2) introduced a race condition between InnoDB transaction commit and the conversion of implicit locks into explicit ones. The assertion failure can be triggered with a test that runs 3 concurrent single-statement transactions in a loop on a simple table: CREATE TABLE t (a INT PRIMARY KEY) ENGINE=InnoDB; thread1: INSERT INTO t SET a=1; thread2: DELETE FROM t; thread3: SELECT * FROM t FOR UPDATE; -- or DELETE FROM t; The failure scenarios are like the following: (1) The INSERT statement is being committed, waiting for lock_sys->mutex. (2) At the time of the failure, both the DELETE and SELECT transactions are active but have not logged any changes yet. (3) The transaction where the !other_lock assertion fails started lock_rec_convert_impl_to_expl(). (4) After this point, the commit of the INSERT removed the transaction from trx_sys->rw_trx_set, in trx_erase_lists(). (5) The other transaction consulted trx_sys->rw_trx_set and determined that there is no implicit lock. Hence, it grabbed the lock. (6) The !other_lock assertion fails in lock_rec_add_to_queue() for the lock_rec_convert_impl_to_expl(), because the lock was 'stolen'. This assertion failure looks genuine, because the INSERT transaction is still active (trx->state=TRX_STATE_ACTIVE). The problematic step (4) was introduced in mysql/mysql-server@e27e0e0bb7 which fixed something related to MVCC (covered by the test innodb.innodb-read-view). Basically, it reintroduced an error that had been mentioned in an earlier commit mysql/mysql-server@a17be6963f: "The active transaction was removed from trx_sys->rw_trx_set prematurely." Our fix goes along the following lines: (a) Implicit locks will released by assigning trx->state=TRX_STATE_COMMITTED_IN_MEMORY as the first step. This transition will no longer be protected by lock_sys_t::mutex, only by trx->mutex. This idea is by Sergey Vojtovich. (b) We detach the transaction from trx_sys before starting to release explicit locks. (c) All callers of trx_rw_is_active() and trx_rw_is_active_low() must recheck trx->state after acquiring trx->mutex. (d) Before releasing any explicit locks, we will ensure that any activity by other threads to convert implicit locks into explicit will have ceased, by checking !trx_is_referenced(trx). There was a glitch in this check when it was part of lock_trx_release_locks(); at the end we would release trx->mutex and acquire lock_sys->mutex and trx->mutex, and fail to recheck (trx_is_referenced() is protected by trx_t::mutex). (e) Explicit locks can be released in batches (LOCK_RELEASE_INTERVAL=1000) just like we did before. trx_t::state: Document that the transition to COMMITTED is only protected by trx_t::mutex, no longer by lock_sys_t::mutex. trx_rw_is_active_low(), trx_rw_is_active(): Document that the transaction state should be rechecked after acquiring trx_t::mutex. trx_t::commit_state(): New function to change a transaction to committed state, to release implicit locks. trx_t::release_locks(): New function to release the explicit locks after commit_state(). lock_trx_release_locks(): Move much of the logic to the caller (which must invoke trx_t::commit_state() and trx_t::release_locks() as needed), and assert that the transaction will have locks. trx_get_trx_by_xid(): Make the parameter a pointer to const. lock_rec_other_trx_holds_expl(): Recheck trx->state after acquiring trx->mutex, and avoid a redundant lookup of the transaction. lock_rec_queue_validate(): Recheck impl_trx->state while holding impl_trx->mutex. row_vers_impl_x_locked(), row_vers_impl_x_locked_low(): Document that the transaction state must be rechecked after trx_mutex_enter(). trx_free_prepared(): Adjust for the changes to lock_trx_release_locks().	2019-09-04 09:42:38 +03:00
Marko Mäkelä	624dd71b94	Merge 10.4 into 10.5	2019-08-13 18:57:00 +03:00
Marko Mäkelä	fdef9f9b89	Merge 10.2 into 10.3	2019-07-25 15:31:11 +03:00
Marko Mäkelä	b6ac67389d	Merge 10.1 into 10.2	2019-07-25 12:14:27 +03:00
Marko Mäkelä	9e5df96751	Reduce the amount of time(NULL) calls for lock processing lock_t::requested_time: Document what the field is used for. lock_t::wait_time: Document that the field is only used for diagnostics and may be garbage if the system time is being adjusted. srv_slot_t::suspend_time: Document that this is duplicating trx_lock_t::wait_started. lock_table_print(), lock_rec_print(): Declare in static scope. Add a parameter for the current time. lock_deadlock_check_and_resolve(), lock_deadlock_lock_print(), lock_deadlock_joining_trx_print(): Add a parameter for the current time.	2019-07-24 21:59:26 +03:00
Marko Mäkelä	d09aec7a15	MDEV-19940 Clean up INFORMATION_SCHEMA.INNODB_ tables Shorten some VARCHAR attributes to a more reasonable length. INNODB_METRICS: Rename the column STATUS to ENABLED, and make it Boolean. Replace with INT(1) many Boolean attributes that were declared as VARCHAR containing 'NO','YES','disabled','enabled','Uninitialized','Initialized'. Replace some VARCHAR attributes with ENUM. Replace some BIGINT with INT when 32 bits are sufficient. Remove INNODB_SYS_TABLESPACES.SPACE_TYPE. The type of a tablespace can be derived from the tablespace ID. A fixed number is used for the system tablespace and the temporary tablespace. All other tablespaces are single-table or single-partition tablespaces. i_s_locks_row_t::lock_type, lock_get_type_str(): Remove. This is a redundant field. Table and record locks can be distinguished by whether i_s_locks_row_t::lock_index is NULL. fill_trx_row(): Do not unnecessarily copy the constant strings that trx->op_info is pointing to. i_s_locks_row_t::lock_mode: Replace string with integer. lock_get_mode_str(), lock_get_trx_id(), lock_get_trx(): Remove. field_store_ulint(): Remove.	2019-07-04 00:09:16 +03:00
Marko Mäkelä	be85d3e61b	Merge 10.2 into 10.3	2019-05-14 17:18:46 +03:00
Marko Mäkelä	26a14ee130	Merge 10.1 into 10.2	2019-05-13 17:54:04 +03:00
Vicențiu Ciorbaru	c0ac0b8860	Update FSF address	2019-05-11 19:25:02 +03:00
Marko Mäkelä	0abd2766b1	Merge 10.2 into 10.3 Also, related to MDEV-15522, MDEV-17304, MDEV-17835, remove the Galera xtrabackup tests, because xtrabackup never worked with MariaDB Server 10.3 due to InnoDB redo log format changes.	2018-11-30 09:38:56 +02:00
Marko Mäkelä	447e493179	Remove some unnecessary InnoDB #include	2018-11-29 12:53:44 +02:00
Marko Mäkelä	055a3334ad	MDEV-13564 Mariabackup does not work with TRUNCATE Implement undo tablespace truncation via normal redo logging. Implement TRUNCATE TABLE as a combination of RENAME to #sql-ib name, CREATE, and DROP. Note: Orphan #sql-ib.ibd may be left behind if MariaDB Server 10.2 is killed before the DROP operation is committed. If MariaDB Server 10.2 is killed during TRUNCATE, it is also possible that the old table was renamed to #sql-ib.ibd but the data dictionary will refer to the table using the original name. In MariaDB Server 10.3, RENAME inside InnoDB is transactional, and #sql-* tables will be dropped on startup. So, this new TRUNCATE will be fully crash-safe in 10.3. ha_mroonga::wrapper_truncate(): Pass table options to the underlying storage engine, now that ha_innobase::truncate() will need them. rpl_slave_state::truncate_state_table(): Before truncating mysql.gtid_slave_pos, evict any cached table handles from the table definition cache, so that there will be no stale references to the old table after truncating. == TRUNCATE TABLE == WL#6501 in MySQL 5.7 introduced separate log files for implementing atomic and crash-safe TRUNCATE TABLE, instead of using the InnoDB undo and redo log. Some convoluted logic was added to the InnoDB crash recovery, and some extra synchronization (including a redo log checkpoint) was introduced to make this work. This synchronization has caused performance problems and race conditions, and the extra log files cannot be copied or applied by external backup programs. In order to support crash-upgrade from MariaDB 10.2, we will keep the logic for parsing and applying the extra log files, but we will no longer generate those files in TRUNCATE TABLE. A prerequisite for crash-safe TRUNCATE is a crash-safe RENAME TABLE (with full redo and undo logging and proper rollback). This will be implemented in MDEV-14717. ha_innobase::truncate(): Invoke RENAME, create(), delete_table(). Because RENAME cannot be fully rolled back before MariaDB 10.3 due to missing undo logging, add some explicit rename-back in case the operation fails. ha_innobase::delete(): Introduce a variant that takes sqlcom as a parameter. In TRUNCATE TABLE, we do not want to touch any FOREIGN KEY constraints. ha_innobase::create(): Add the parameters file_per_table, trx. In TRUNCATE, the new table must be created in the same transaction that renames the old table. create_table_info_t::create_table_info_t(): Add the parameters file_per_table, trx. row_drop_table_for_mysql(): Replace a bool parameter with sqlcom. row_drop_table_after_create_fail(): New function, wrapping row_drop_table_for_mysql(). dict_truncate_index_tree_in_mem(), fil_truncate_tablespace(), fil_prepare_for_truncate(), fil_reinit_space_header_for_table(), row_truncate_table_for_mysql(), TruncateLogger, row_truncate_prepare(), row_truncate_rollback(), row_truncate_complete(), row_truncate_fts(), row_truncate_update_system_tables(), row_truncate_foreign_key_checks(), row_truncate_sanity_checks(): Remove. row_upd_check_references_constraints(): Remove a check for TRUNCATE, now that the table is no longer truncated in place. The new test innodb.truncate_foreign uses DEBUG_SYNC to cover some race-condition like scenarios. The test innodb-innodb.truncate does not use any synchronization. We add a redo log subformat to indicate backup-friendly format. MariaDB 10.4 will remove support for the old TRUNCATE logging, so crash-upgrade from old 10.2 or 10.3 to 10.4 will involve limitations. == Undo tablespace truncation == MySQL 5.7 implements undo tablespace truncation. It is only possible when innodb_undo_tablespaces is set to at least 2. The logging is implemented similar to the WL#6501 TRUNCATE, that is, using separate log files and a redo log checkpoint. We can simply implement undo tablespace truncation within a single mini-transaction that reinitializes the undo log tablespace file. Unfortunately, due to the redo log format of some operations, currently, the total redo log written by undo tablespace truncation will be more than the combined size of the truncated undo tablespace. It should be acceptable to have a little more than 1 megabyte of log in a single mini-transaction. This will be fixed in MDEV-17138 in MariaDB Server 10.4. recv_sys_t: Add truncated_undo_spaces[] to remember for which undo tablespaces a MLOG_FILE_CREATE2 record was seen. namespace undo: Remove some unnecessary declarations. fil_space_t::is_being_truncated: Document that this flag now only applies to undo tablespaces. Remove some references. fil_space_t::is_stopping(): Do not refer to is_being_truncated. This check is for tablespaces of tables. Potentially used tablespaces are never truncated any more. buf_dblwr_process(): Suppress the out-of-bounds warning for undo tablespaces. fil_truncate_log(): Write a MLOG_FILE_CREATE2 with a nonzero page number (new size of the tablespace in pages) to inform crash recovery that the undo tablespace size has been reduced. fil_op_write_log(): Relax assertions, so that MLOG_FILE_CREATE2 can be written for undo tablespaces (without .ibd file suffix) for a nonzero page number. os_file_truncate(): Add the parameter allow_shrink=false so that undo tablespaces can actually be shrunk using this function. fil_name_parse(): For undo tablespace truncation, buffer MLOG_FILE_CREATE2 in truncated_undo_spaces[]. recv_read_in_area(): Avoid reading pages for which no redo log records remain buffered, after recv_addr_trim() removed them. trx_rseg_header_create(): Add a FIXME comment that we could write much less redo log. trx_undo_truncate_tablespace(): Reinitialize the undo tablespace in a single mini-transaction, which will be flushed to the redo log before the file size is trimmed. recv_addr_trim(): Discard any redo logs for pages that were logged after the new end of a file, before the truncation LSN. If the rec_list becomes empty, reduce n_addrs. After removing any affected records, actually truncate the file. recv_apply_hashed_log_recs(): Invoke recv_addr_trim() right before applying any log records. The undo tablespace files must be open at this point. buf_flush_or_remove_pages(), buf_flush_dirty_pages(), buf_LRU_flush_or_remove_pages(): Add a parameter for specifying the number of the first page to flush or remove (default 0). trx_purge_initiate_truncate(): Remove the log checkpoints, the extra logging, and some unnecessary crash points. Merge the code from trx_undo_truncate_tablespace(). First, flush all to-be-discarded pages (beyond the new end of the file), then trim the space->size to make the page allocation deterministic. At the only remaining crash injection point, flush the redo log, so that the recovery can be tested.	2018-09-07 22:10:02 +03:00
Marko Mäkelä	1eb2d8f6e8	Merge 10.2 into 10.3	2018-08-16 08:54:58 +03:00
Marko Mäkelä	3ce8a0fc49	MDEV-16136: Simplify trx_lock_t memory management Allocate trx->lock.rec_pool and trx->lock.table_pool directly from trx_t. Remove unnecessary use of std::vector. In order to do this, move some definitions from lock0priv.h to lock0types.h, so that ib_lock_t will not be an opaque type.	2018-08-16 05:53:38 +03:00
Marko Mäkelä	1748a31ae8	MDEV-16675 Unnecessary explicit lock acquisition during UPDATE or DELETE In InnoDB, an INSERT will not create an explicit lock object. Instead, the inserted record is initially implicitly locked by the transaction that wrote its trx_t::id to the hidden system column DB_TRX_ID. (Other transactions would check if DB_TRX_ID is referring to a transaction that has not been committed.) If a record was inserted in the current transaction, it would be implicitly locked by that transaction. Only if some other transaction is requesting access to the record, the implicit lock should be converted to an explicit one, so that the waits-for graph can be constructed for detecting deadlocks and lock wait timeouts. Before this fix, InnoDB would convert implicit locks to explicit ones, even if no conflict exists. lock_rec_convert_impl_to_expl(): Return whether caller_trx already holds an explicit lock that covers the record. row_vers_impl_x_locked_low(): Avoid a lookup if the record matches caller_trx->id. lock_trx_has_expl_x_lock(): Renamed from lock_trx_has_rec_x_lock(). row_upd_clust_step(): In a debug assertion, check for implicit lock before invoking lock_trx_has_expl_x_lock(). rw_trx_hash_t::find(): Make do_ref_count a mandatory parameter. Assert that trx_id is not 0 (the caller should check it). trx_sys_t::is_registered(): Only invoke find() if id != 0. trx_sys_t::find(): Add the optional parameter do_ref_count. lock_rec_queue_validate(): Avoid lookup for trx_id == 0.	2018-07-03 15:10:06 +03:00
Sergei Golubchik	b1818dccf7	Merge branch '10.2' into 10.3	2018-03-28 17:31:57 +02:00

1 2 3

129 commits