mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-16 20:12:31 +01:00

Author	SHA1	Message	Date
Marko Mäkelä	cd04673a17	MDEV-32050 fixup: innodb.instant_alter_crash (take 2) We must disable persistent statistics, because a transaction commit from dict_stats_save() would occasionally interfere with this test.	2023-11-20 16:57:57 +02:00
Marko Mäkelä	3ba041f9f5	MDEV-31953 fixup: Clean up the test Let us tolerate multiple "Memory pressure event freed" in case a real memory pressure event occurred in addition to the one that this test simulates. Also, clean up some SET variables.	2023-11-20 13:44:47 +02:00
Marko Mäkelä	90d968dab9	Merge 10.6 into 10.11	2023-11-20 10:08:19 +02:00
Marko Mäkelä	2323483528	MDEV-31953 madvise(..., MADV_FREE) is causing a performance regression buf_page_t::set_os_unused(): Remove the system call that had been added in commit `16c9718758` and revised in commit `c1fd082e9c` for Microsoft Windows. buf_pool_t::garbage_collect(): A new function to collect any garbage from the InnoDB buffer pool that can be removed without writing any log or data files. This will also invoke madvise() for all of buf_pool.free. To trigger this the following MDEV is implemented: MDEV-24670 avoid OOM by linux kernel co-operative memory management To avoid frequent triggers that caused the MDEV-31953 regression, while still preserving the 10.11 functionality of non-greedy kernel memory usage, memory triggers are used. On the triggering of memory pressure, if supported in the Linux kernel, trigger the garbage collection of the innodb buffer pool. The hard coded triggers occur where there is: * some memory pressure in 5 of the last 10 seconds * a full stall on memory pressure for 10ms in the last 2 seconds The kernel will trigger only once in each of these time windows. To avoid mariadb being in a constant state of memory garbage collection, this has been limited to once per minute. For a small set of kernels in 2023 (6.5, 6.6), there was a limit requiring CAP_SYS_RESOURCE that was lifted[1] to support the use case of user memory pressure. It is not currently possible to set CAP_SYS_RESOURCE in a systemd service, as it is setting a capability inside a user namespace. Running under systemd v254+ requires the default MemoryPressureWatch=auto (or alternately "on"). Functionality was tested in a 6.4 kernel Fedora successfully under a systemd service. Running in a container requires that (unmask=)/sys/fs/cgroup be writable by the mariadbd process. To aid testing, the buf_pool_resize was a convenient trigger point on which to trigger garbage collection. 
ref [1]: https://lore.kernel.org/all/CAMw=ZnQ56cm4Txgy5EhGYvR+Jt4s-KVgoA9_65HKWVMOXp7a9A@mail.gmail.com/T/#m3bd2a73c5ee49965cb73a830b1ccaa37ccf4e427 Co-Author: Daniel Black (on memory pressure trigger) Reviewed by: Marko Mäkelä, Vladislav Vaintroub, Vladislav Lesin, Thirunarayanan Balathandayuthapani Tested by: Matthias Leich	2023-11-18 20:12:33 +11:00
Marko Mäkelä	eb1f8b2919	MDEV-32027 Opening all .ibd files on InnoDB startup can be slow dict_find_max_space_id(): Return SELECT MAX(SPACE) FROM SYS_TABLES. dict_check_tablespaces_and_store_max_id(): In the normal case (no encryption plugin has been loaded and the change buffer is empty), invoke dict_find_max_space_id() and do not open any .ibd files. If a std::set<uint32_t> has been specified, open the files whose tablespace ID is mentioned. Else, open all data files that are identified by SYS_TABLES records. fil_ibd_open(): Remove a call to os_file_get_last_error() that can report a misleading error, such as EINVAL inside my_realpath() that is not an actual error. This could be invoked when a data file is found but the FSP_SPACE_FLAGS are incorrect, such as is the case for table test.td in ./mtr --mysqld=--innodb-buffer-pool-dump-at-shutdown=0 innodb.table_flags buf_load(): If any tablespaces could not be found, invoke dict_check_tablespaces_and_store_max_id() on the missing tablespaces. dict_load_tablespace(): Try to load the tablespace unless it was found to be futile. This fixes failures related to FTS_*.ibd files for FULLTEXT INDEX. btr_cur_t::search_leaf(): Prevent a crash when the tablespace does not exist. This was caught by the test innodb_fts.fts_concurrent_insert when the change to dict_load_tablespace() was not present. We modify a few tests to ensure that tables will not be loaded at startup. For some fault injection tests this means that the corrupted tables will not be loaded, because dict_load_tablespace() would perform stricter checks than dict_check_tablespaces_and_store_max_id(). Tested by: Matthias Leich Reviewed by: Thirunarayanan Balathandayuthapani	2023-11-17 15:07:51 +02:00
Marko Mäkelä	5a1f821b93	MDEV-31861 Empty INSERT crashes with innodb_force_recovery=6 or innodb_read_only=ON ha_innobase::extra(): Do not invoke log_buffer_flush_to_disk() if high_level_read_only holds. log_buffer_flush_to_disk(): Remove an assertion that duplicates one at the start of log_write_up_to().	2023-11-16 16:57:42 +02:00
Marko Mäkelä	55a96c055a	MDEV-32050 fixup: innodb.instant_alter_crash This test occasionally fails with a failure to purge history. Let us try to purge everything before starting the interesting part, to make that occasional failure go away.	2023-11-16 16:39:02 +02:00
Marko Mäkelä	52ca2e65af	Merge 10.5 into 10.6	2023-11-15 14:10:21 +02:00
Oleksandr Byelkin	0427c4739e	MariaDB 11.1.3 release Merge tag '11.1' into 11.2 MariaDB 11.1.3 release	2023-11-14 18:28:37 +01:00
Oleksandr Byelkin	9f83a8822f	Merge branch '10.5' into mariadb-10.5.23	2023-11-14 08:41:23 +01:00
Alexander Barkov	1710b6454b	MDEV-26743 InnoDB: CHAR+nopad does not work well The patch for "MDEV-25440: Indexed CHAR ... broken with NO_PAD collations" fixed these scenarios from MDEV-26743: - Basic latin letter vs equal accented letter - Two letters vs equal (but space padded) expansion However, this scenario was still broken: - Basic latin letter (but followed by an ignorable character) vs equal accented letter Fix: When processing for a NOPAD collation a string with trailing ignorable characters, like: '<non-ignorable><ignorable><ignorable>' the string gets virtually converted to: '<non-ignorable><ignorable><ignorable><space><space><space>...' After the fix the code works differently in these two cases: 1. <space> fits into the "nchars" limit 2. <space> does not fit into the "nchars" limit Details: 1. If "nchars" is large enough (4+ in this example), return weights as follows: '[weight-for-non-ignorable, 1 char] [weight-for-space-character, 3 chars]' i.e. the weight for the virtual trailing space character now indicates that it corresponds to a total of 3 characters: - two ignorable characters - one virtual trailing space character 2. If "nchars" is small (3), then the virtual trailing space character does not fit into the "nchars" limit, so return 0x00 as weight, e.g.: '[weight-for-non-ignorable, 1 char] [0x00, 2 chars]' Adding corresponding MTR tests and unit tests.	2023-11-10 06:17:23 +04:00
Marko Mäkelä	e0c65784aa	MDEV-32737 innodb.log_file_name fails on Assertion `after_apply || !(blocks).end in recv_sys_t::clear recv_group_scan_log_recs(): Set the debug flag recv_sys.after_apply after actually completing the log scan. In the test, suppress some errors that may be reported when the crash recovery of RENAME TABLE t1 TO t2 is preceded by copying t2.ibd to t1.ibd.	2023-11-09 11:06:17 +02:00
Oleksandr Byelkin	0f5613a25f	Merge branch '11.0' into 11.1	2023-11-08 18:03:08 +01:00
Oleksandr Byelkin	48af85db21	Merge branch '10.11' into 11.0	2023-11-08 17:09:44 +01:00
Oleksandr Byelkin	fecd78b837	Merge branch '10.10' into 10.11	2023-11-08 16:46:47 +01:00
Oleksandr Byelkin	04d9a46c41	Merge branch '10.6' into 10.10	2023-11-08 16:23:30 +01:00
Oleksandr Byelkin	b83c379420	Merge branch '10.5' into 10.6	2023-11-08 15:57:05 +01:00
Oleksandr Byelkin	6cfd2ba397	Merge branch '10.4' into 10.5	2023-11-08 12:59:00 +01:00
Marko Mäkelä	228b7e4db5	MDEV-13626 Merge InnoDB test cases from MySQL 5.7 This imports and adapts a number of MySQL 5.7 test cases that are applicable to MariaDB. Some tests for old bug fixes are not that relevant because the code has been refactored since then (especially starting with MariaDB Server 10.6), and the tests would not reproduce the original bug if the fix was reverted. In the test innodb_fts.opt, there are many duplicate MATCH ranks, which would make the results nondeterministic. The test was stabilized by changing some LIMIT clauses or by adding sorted_result in those cases where the purpose of a test was to show that no sorting took place in the server. In the test innodb_fts.phrase, MySQL 5.7 would generate FTS_DOC_ID that are 1 larger than in MariaDB. In innodb_fts.index_table the difference is 2. This is because in MariaDB, fts_get_next_doc_id() post-increments cache->next_doc_id, while MySQL 5.7 pre-increments it. Reviewed by: Thirunarayanan Balathandayuthapani	2023-11-08 12:17:14 +02:00
Kristian Nielsen	9fa718b1a1	Fix mariabackup InnoDB recovered binlog position on server upgrade Before MariaDB 10.3.5, the binlog position was stored in the TRX_SYS page, while after it is stored in rollback segments. There is code to read the legacy position from TRX_SYS to handle upgrades. The problem was if the legacy position happens to compare larger than the position found in rollback segments; in this case, the old TRX_SYS position would incorrectly be preferred over the newer position from rollback segments. Fixed by always preferring a position from rollback segments over a legacy position. Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2023-11-03 09:13:51 +01:00
Thirunarayanan Balathandayuthapani	b4de67da45	MDEV-32638 MariaDB crashes with foreign_key_checks=0 when changing a column and adding a foreign key at the same time Problem: ======= - InnoDB fails to find the foreign key index for the newly added foreign key relation. This is caused by commit `5f09b53bdb` (MDEV-31086). FIX: === In check_col_is_in_fk_indexes(), while iterating through the newly added foreign key relationship, InnoDB should consider that foreign key relation may not have foreign index when foreign key check is disabled.	2023-11-02 14:33:05 +05:30
Yuchen Pei	d0f8dfbcf0	Merge branch '11.1' into 11.2	2023-10-27 18:11:56 +11:00
Marko Mäkelä	88733282fb	MDEV-32050: Look up tables in the purge coordinator The InnoDB table lookup in purge worker threads is a bottleneck that can degrade a slow shutdown to utilize less than 2 threads. Let us fix that bottleneck by constructing a local lookup table that does not require any synchronization while the undo log records of the current batch are being processed. TRX_PURGE_TABLE_BUCKETS: The initial number of std::unordered_map hash buckets used during a purge batch. This could avoid some resizing and rehashing in trx_purge_attach_undo_recs(). purge_node_t::tables: A lookup table from table ID to an already looked up and locked table. Replaces many fields. trx_purge_attach_undo_recs(): Look up each table in the purge batch only once. trx_purge(): Close all tables and release MDL at the end of the batch. trx_purge_table_open(), trx_purge_table_acquire(): Open a table in purge and acquire a metadata lock on it. This replaces dict_table_open_on_id<true>() and dict_acquire_mdl_shared(). purge_sys_t::close_and_reopen(): In case of an MDL conflict, close and reopen all tables that are covered by the current purge batch. It may be that some of the tables have been dropped meanwhile and can be ignored. This replaces wait_SYS() and wait_FTS(). row_purge_parse_undo_rec(): Make purge_coordinator_task issue a MDL warrant to any purge_worker_task which might need it when innodb_purge_threads>1. purge_node_t::end(): Clear the MDL warrant. Reviewed by: Vladislav Lesin and Vladislav Vaintroub	2023-10-25 10:08:20 +03:00
Marko Mäkelä	14685b10df	MDEV-32050: Deprecate&ignore innodb_purge_rseg_truncate_frequency The motivation of introducing the parameter innodb_purge_rseg_truncate_frequency in mysql/mysql-server@28bbd66ea5 and mysql/mysql-server@8fc2120fed seems to have been to avoid stalls due to freeing undo log pages or truncating undo log tablespaces. In MariaDB Server, innodb_undo_log_truncate=ON should be a much lighter operation than in MySQL, because it will not involve any log checkpoint. Another source of performance stalls should be trx_purge_truncate_rseg_history(), which is shrinking the history list by freeing the undo log pages whose undo records have been purged. To alleviate that, we will introduce a purge_truncation_task that will offload this from the purge_coordinator_task. In that way, the next innodb_purge_batch_size pages may be parsed and purged while the pages from the previous batch are being freed and the history list being shrunk. The processing of innodb_undo_log_truncate=ON will still remain the responsibility of the purge_coordinator_task. purge_coordinator_state::count: Remove. We will ignore innodb_purge_rseg_truncate_frequency, and act as if it had been set to 1 (the maximum shrinking frequency). purge_coordinator_state::do_purge(): Invoke an asynchronous task purge_truncation_callback() to free the undo log pages. purge_sys_t::iterator::free_history(): Free those undo log pages that have been processed. This used to be a part of trx_purge_truncate_history(). purge_sys_t::clone_end_view(): Take a new value of purge_sys.head as a parameter, so that it will be updated while holding exclusive purge_sys.latch. This is needed for race-free access to the field in purge_truncation_callback(). Reviewed by: Vladislav Lesin	2023-10-25 09:11:58 +03:00
Alexander Barkov	df72c57d6f	MDEV-30048 Prefix keys for CHAR work differently for MyISAM vs InnoDB Also fixes: MDEV-30050 Inconsistent results of DISTINCT with NOPAD Problem: Key segments for CHAR columns were compared using strnncollsp() for engines MyISAM and Aria. This did not work correctly if the engine applied trailing space compression. Fix: Replacing ha_compare_text() calls with new functions: - ha_compare_char_varying() - ha_compare_char_fixed() - ha_compare_word() - ha_compare_word_prefix() - ha_compare_word_or_prefix() The code branch corresponding to comparison of CHAR column keys (HA_KEYTYPE_TEXT segment type) now uses ha_compare_char_fixed() which calls strnncollsp_nchars(). This patch does not change the behavior for the rest of the code: - comparison of VARCHAR/TEXT column keys (HA_KEYTYPE_VARTEXT1, HA_KEYTYPE_VARTEXT2 segment types) - comparison in the fulltext code	2023-10-24 03:35:48 +04:00
Thirunarayanan Balathandayuthapani	7d89dcf1ae	MDEV-32527 Server aborts during alter operation when table doesn't have foreign index Problem: ======== InnoDB fails to find the foreign key index for the foreign key relation in the table while iterating the foreign key constraints during alter operation. This is caused by commit `5f09b53bdb` (MDEV-31086). Fix: ==== In check_col_is_in_fk_indexes(), while iterating through the foreign key relationship, InnoDB should consider that foreign key relation may not have foreign index when foreign key check is disabled.	2023-10-20 15:23:22 +05:30
Daniel Black	1182451af1	MDEV-32018 Allow the setting of Auto_increment on FK referenced columns In MDEV-31086, SET FOREIGN_KEY_CHECKS=0 cannot bypass checks that make column types of foreign keys incompatible. An unfortunate consequence is that adding an AUTO_INCREMENT is considered incompatible in Field_{num,decimal}::is_equal and for the purpose of FK checks this isn't relevant. innodb.foreign_key - pragmatically left wait_until_count_sessions.inc at end of test to match the second line of test. Reporter: horrockss@github - https://github.com/MariaDB/mariadb-docker/issues/528 Co-Author: Marko Mäkelä <marko.makela@mariadb.com> Reviewer: Nikita Malyavin For the future reader this was attempted: Removing AUTO_INCREMENT checks from Field_{num,decimal}::is_equal failed in the following locations (noted for future fixing): * MyISAM and Aria (not InnoDB) don't adjust AUTO_INCREMENT next number correctly, hence added a test to main.auto_increment to catch the next person that attempts this fix. * InnoDB must perform an ALGORITHM=COPY to populate NULL values of an original table (MDEV-19190 mtr test period.copy), this requires ALTER_STORED_COLUMN_TYPE to be set in fill_alter_inplace_info which doesn't get hit because field->is_equal is true. * InnoDB must not perform the change inplace (below patch) * innodb.innodb-alter-timestamp main.partition_innodb test would also need further investigation. InnoDB ha_innobase::check_if_supported_inplace_alter to support the removal of Field_{num,decimal}::is_equal AUTO_INCREMENT checks would need the following change diff --git a/storage/innobase/handler/handler0alter.cc b/storage/innobase/handler/handler0alter.cc index a5ccb1957f3..9d778e2d39a 100644 --- a/storage/innobase/handler/handler0alter.cc +++ b/storage/innobase/handler/handler0alter.cc @@ -2455,10 +2455,15 @@ ha_innobase::check_if_supported_inplace_alter( /* An AUTO_INCREMENT attribute can only be added to an existing column by ALGORITHM=COPY, but we can remove the attribute. 
*/ - ut_ad((MTYP_TYPENR((*af)->unireg_check) - != Field::NEXT_NUMBER) - || (MTYP_TYPENR(f->unireg_check) - == Field::NEXT_NUMBER)); + if ((MTYP_TYPENR((*af)->unireg_check) + == Field::NEXT_NUMBER) + && (MTYP_TYPENR(f->unireg_check) + != Field::NEXT_NUMBER)) + { + ha_alter_info->unsupported_reason = my_get_err_msg( + ER_ALTER_OPERATION_NOT_SUPPORTED_REASON_AUTOINC); + DBUG_RETURN(HA_ALTER_INPLACE_NOT_SUPPORTED); + } With this change the main.auto_increment test for bug #14573, under innodb, will pass without the 2 --error ER_DUP_ENTRY entries. The function header comment was updated to reflect the MDEV-31086 changes.	2023-10-20 17:32:46 +11:00
Marko Mäkelä	65700edb26	Merge 10.10 into 10.11	2023-10-19 14:50:42 +03:00
Marko Mäkelä	c92d06748a	Merge 10.6 into 10.10	2023-10-19 14:35:31 +03:00
Marko Mäkelä	6991b1c47c	Merge 10.5 into 10.6	2023-10-19 13:50:00 +03:00
Marko Mäkelä	9b2a65e41a	Merge 11.0 into 11.1	2023-10-19 08:26:16 +03:00
Marko Mäkelä	be24e75229	Merge 10.11 into 11.0	2023-10-19 08:12:16 +03:00
Daniel Black	e467e8d8c2	MDEV-30825 innodb_compression_algorithm=0 (none) increments Innodb_num_pages_page_compression_error fil_page_compress_low returns 0 for both innodb_compression_algorithm=0 and where there is compression errors. On the two callers to this function, don't increment the compression errors if the algorithm was none. Reviewed by: Marko Mäkelä	2023-10-18 19:18:50 +11:00
Thirunarayanan Balathandayuthapani	3da5d047b8	MDEV-31851 After crash recovery, undo tablespace fails to open Problem: ======== - InnoDB fails to open undo tablespace when page0 is corrupted and fails to throw error. Solution: ========= - InnoDB throws DB_CORRUPTION error when InnoDB encounters page0 corruption of undo tablespace. - InnoDB restores the page0 of undo tablespace from doublewrite buffer if it encounters page corruption - Moved Datafile::restore_from_doublewrite() to recv_dblwr_t::restore_first_page(). So that undo tablespace and system tablespace can use this function instead of duplicating the code srv_undo_tablespace_open(): Returns 0 if file doesn't exist or ULINT_UNDEFINED if page0 is corrupted.	2023-10-17 18:41:21 +05:30
Marko Mäkelä	2ecc0443ec	Merge 10.10 into 10.11	2023-10-17 16:04:21 +03:00
Marko Mäkelä	0563106b1a	Merge 10.6 into 10.10	2023-10-17 13:02:57 +03:00
Thirunarayanan Balathandayuthapani	ee5cadd5c8	MDEV-28122 Optimize table crash while applying online log - InnoDB fails to check the overflow buffer while applying the operation to the table that was rebuilt. This is caused by commit `3cef4f8f0f` (MDEV-515).	2023-10-16 20:17:09 +05:30
Marko Mäkelä	d5e15424d8	Merge 10.6 into 10.10 The MDEV-29693 conflict resolution is from Monty, as well as is a bug fix where ANALYZE TABLE wrongly built histograms for single-column PRIMARY KEY. Also includes a fix for safe_malloc error reporting. Other things: - Copied main.log_slow from 10.4 to avoid mtr issue Disabled test: - spider/bugfix.mdev_27239 because we started to get +Error 1429 Unable to connect to foreign data source: localhost -Error 1158 Got an error reading communication packets - main.delayed - Bug#54332 Deadlock with two connections doing LOCK TABLE+INSERT DELAYED This part is disabled for now as it fails randomly with different warnings/errors (no corruption).	2023-10-14 13:36:11 +03:00
Vlad Lesin	18fa00a54c	MDEV-32272 lock_release_on_prepare_try() does not release lock if supremum bit is set along with other bits set in lock's bitmap The error is caused by the MDEV-30165 fix with the following commit: `d13a57ae81` There is a logical error in lock_release_on_prepare_try(): if (supremum_bit) lock_rec_unlock_supremum(*cell, lock); else lock_rec_dequeue_from_page(lock, false); Because there can be other bits set in the lock's bitmap, and the lock type can be suitable for releasing criteria, but the above logic releases only supremum bit of the lock. The fix is to release the lock if it meets the releasing criteria, and otherwise to unlock supremum if supremum is locked. There is also a test for the case, which was reported by the QA team. I placed it in a separate file, because it requires a debug build. Reviewed by: Marko Mäkelä	2023-10-13 16:29:04 +03:00
Thirunarayanan Balathandayuthapani	4045ead9db	MDEV-32337 Assertion `pos < table->n_def' failed in dict_table_get_nth_col While checking for altered column in foreign key constraints, InnoDB fails to ignore virtual columns. This issue is caused by commit 5f09b53bdb4e973e7c7ec2c53a24c98321223f98 (MDEV-31086).	2023-10-12 14:49:27 +05:30
Marko Mäkelä	625a150a86	Merge 10.5 into 10.6	2023-10-06 14:34:01 +03:00
Marko Mäkelä	6e9b421f77	MDEV-32364 Server crashes when starting server with high innodb_log_buffer_size log_t::create(): Return whether the initialisation succeeded. It may fail if too large an innodb_log_buffer_size is specified.	2023-10-06 14:16:01 +03:00
Vlad Lesin	96ae37abc5	MDEV-30658 lock_row_lock_current_waits counter in information_schema.innodb_metrics may become negative MONITOR_OVLD_ROW_LOCK_CURRENT_WAIT monitor should have the MONITOR_DISPLAY_CURRENT flag set in its definition, as it shows the current state and does not accumulate anything. Reviewed by: Marko Mäkelä	2023-10-05 18:27:54 +03:00
Sergei Golubchik	37e854f34a	Merge branch '11.1' into 11.2	2023-09-29 16:01:59 +02:00
Sergei Golubchik	05d850d4b3	Merge branch '11.0' into 11.1	2023-09-29 13:58:47 +02:00
Sergei Golubchik	3f6bccb888	Merge branch '10.11' into 11.0	2023-09-29 12:24:54 +02:00
Sergei Golubchik	034848c6c2	Merge branch '10.10' into 10.11	2023-09-24 19:41:43 +02:00
Vlad Lesin	d13a57ae81	Merge 10.5 into 10.6.	2023-09-22 15:21:15 +03:00
Vlad Lesin	95730372bd	MDEV-30165 X-lock on supremum for prepared transaction for RR trx_t::set_skip_lock_inheritance() must be invoked at the very beginning of lock_release_on_prepare(). Currently trx_t::set_skip_lock_inheritance() is invoked at the end of lock_release_on_prepare() when lock_sys and trx are released, and there can be a case when locks on prepare are released, but "not inherit gap locks" bit has not yet been set, and page split inherits lock to supremum. Also reset supremum bit and rebuild waiting queue when XA is prepared. Reviewed by: Marko Mäkelä	2023-09-21 20:07:53 +03:00
Marko Mäkelä	6a470db552	Merge 10.5 into 10.6	2023-09-14 15:25:53 +03:00
Marko Mäkelä	81e60f1a0a	MDEV-32163 Crash recovery fails after DROP TABLE in system tablespace fseg_free_extent(): After fsp_free_extent() succeeded, properly mark the affected pages as freed. We failed to write FREE_PAGE records. This bug was revealed or caused by commit `e938d7c18f` (MDEV-32028).	2023-09-14 15:17:27 +03:00
Thirunarayanan Balathandayuthapani	eece7f135f	- Rename the DBUG_EXECUTE_IF from sys_shrink_buffer_pool_full to sys_shrink_buffer_pool to give it a more generic name.	2023-09-11 18:11:45 +05:30
Marko Mäkelä	4a8291fc5f	MDEV-30531 Corrupt index(es) on busy table when using FOREIGN KEY lock_wait(): Never return the transient error code DB_LOCK_WAIT. In commit `78a04a4c22` (MDEV-29869) some assignments trx->error_state = DB_SUCCESS were removed, and it was possible that the field was left at its initial value DB_LOCK_WAIT. The test case for this is nondeterministic; without this fix, it would only occasionally fail. Reviewed by: Vladislav Lesin	2023-09-11 14:52:05 +03:00
Marko Mäkelä	0dd25f28f7	Merge 10.5 into 10.6	2023-09-11 14:46:39 +03:00
Marko Mäkelä	f8f7d9de2c	Merge 10.4 into 10.5	2023-09-11 11:29:31 +03:00
Marko Mäkelä	65c99207e0	MDEV-23841: Memory leak in innodb_monitor_validate() innodb_monitor_validate(): Let item_val_str() allocate the memory in THD, so that it will be available to innodb_monitor_update(). In this way, there is no need to allocate another buffer, and no problem if the call to innodb_monitor_update() is skipped due to an invalid value that is passed to another configuration parameter. There are some other callers to st_mysql_sys_var::val_str() that validate configuration parameters that are related to FULLTEXT INDEX, but they will allocate memory by invoking thd_strmake().	2023-09-11 10:27:21 +03:00
Sergei Golubchik	fba4abf3b9	MDEV-32128 wrong table name in innodb's "row too big" errors	2023-09-08 19:15:33 +02:00
Monty	b08474435f	Fix compression tests for s390x The problem is that s390x is not using the default bzip library we use on other platforms, which causes compressed string lengths to be different from what the mtr tests expect. Fixed by: - Added have_normal_bzip.inc, which checks if compress() returns the expected length. - Adjust the results to match the expected one - main.func_compress.test & archive.archive - Don't print lengths that depend on compression library - mysqlbinlog compress tests & connect.zip - Don't print DATA_LENGTH for SET column_compression_zlib_level=1 - main.column_compression	2023-09-05 12:34:39 +03:00
Marko Mäkelä	b0a43818b4	Merge 10.5 into 10.6	2023-09-04 10:15:02 +03:00
Marko Mäkelä	59952b2625	Merge 10.4 into 10.5	2023-09-04 09:40:26 +03:00
Thirunarayanan Balathandayuthapani	d1fca0baab	MDEV-32060 Server aborts when table doesn't have referenced index - Server aborts when table doesn't have referenced index. This is caused by `5f09b53bdb` (MDEV-31086). While iterating the foreign key constraints, we fail to consider that InnoDB doesn't have referenced index for it when foreign key check is disabled.	2023-09-01 17:54:07 +05:30
Marko Mäkelä	2325f8f339	Merge 10.5 into 10.6	2023-08-31 13:01:42 +03:00
Thirunarayanan Balathandayuthapani	cb384d0d04	MDEV-32008 auto_increment value on table increments by one after restart - This issue is caused by commit 4700f2ac70f8c79f2ac1968b6b59d18716f492bf (MDEV-30796). During bulk insert operation, InnoDB wrongly stores the next autoincrement value as current autoincrement value. So update the current autoincrement value rather than next auto increment value.	2023-08-29 10:37:08 +05:30
Thirunarayanan Balathandayuthapani	e938d7c18f	MDEV-32028 InnoDB scrubbing doesn't write zero while freeing the extent Problem: ======== InnoDB fails to mark the page status as FREED during freeing of an extent of a segment. This behaviour affects scrubbing and doesn't write all zeroes in file even though pages are freed. Solution: ======== InnoDB should mark the page status as FREED before reinitializing the extent descriptor entry.	2023-08-28 20:27:19 +05:30
Thirunarayanan Balathandayuthapani	bf3b787e02	MDEV-31835 Remove unnecessary extra HA_EXTRA_IGNORE_INSERT call - This commit is different from 10.6 commit `c438284863`. Due to commit `045757af4c` (MDEV-24621), InnoDB does buffer and pre-sort the records for each index, and build the indexes one page at a time. Multiple large INSERT IGNORE statements abort the server during bulk insert operation. The problem is that the InnoDB merge record exceeds the page size. To avoid this scenario, InnoDB should catch a too big record while buffering the insert operation itself. row_merge_buf_encode(): returns length of the encoded index record row_merge_buf_write(): Catches the DB_TOO_BIG_RECORD earlier and returns error	2023-08-25 23:13:05 +05:30
Marko Mäkelä	f7780a8eb8	MDEV-30100: Assertion purge_sys.tail.trx_no <= purge_sys.rseg->last_trx_no() trx_t::commit_empty(): A special case of transaction "commit" when the transaction was actually rolled back or the persistent undo log is empty. In this case, we need to change the undo log header state to TRX_UNDO_CACHED and move the undo log from rseg->undo_list to rseg->undo_cached for fast reuse. Furthermore, unless this is the only undo log record in the page, we will remove the record and rewind TRX_UNDO_PAGE_START, TRX_UNDO_PAGE_FREE, TRX_UNDO_LAST_LOG. We must also ensure that the system-wide transaction identifier will be persisted up to this->id, so that there will not be warnings or errors due to a PAGE_MAX_TRX_ID being too large. We might have modified secondary index pages before being rolled back, and any changes of PAGE_MAX_TRX_ID are never rolled back. Even though it is not going to be written persistently anywhere, we will invoke trx_sys.assign_new_trx_no(this), so that in the test innodb.instant_alter everything will be purged as expected. trx_t::write_serialisation_history(): Renamed from trx_write_serialisation_history(). If there is no undo log, invoke commit_empty(). trx_purge_add_undo_to_history(): Simplify an assertion and remove a comment. This function will not be invoked on an empty undo log anymore. trx_undo_header_create(): Add a debug assertion. trx_undo_mem_create_at_db_start(): Remove a duplicated assignment. Reviewed by: Vladislav Lesin Tested by: Matthias Leich	2023-08-25 13:41:54 +03:00
Marko Mäkelä	eda75cadea	Merge 10.5 into 10.6	2023-08-24 10:16:24 +03:00
Marko Mäkelä	aeb8eae5c8	Merge 10.4 into 10.5	2023-08-24 10:12:13 +03:00
Marko Mäkelä	02878f128e	MDEV-31813 SET GLOBAL innodb_max_purge_lag_wait hangs if innodb_read_only innodb_max_purge_lag_wait_update(): Return immediately if we are in high_level_read_only mode. srv_wake_purge_thread_if_not_active(): Relax a debug assertion. If srv_read_only_mode holds, purge_sys.enabled() will not hold and this function will do nothing. trx_t::commit_in_memory(): Remove a redundant condition before invoking srv_wake_purge_thread_if_not_active().	2023-08-24 10:08:51 +03:00
Marko Mäkelä	07494006dd	Merge 10.5 into 10.6	2023-08-22 09:36:35 +03:00
Marko Mäkelä	f9cc29824b	Merge 10.4 into 10.5	2023-08-22 09:01:34 +03:00
Marko Mäkelä	ff682eada8	MDEV-20194 test adjustment for s390x The test innodb.row_size_error_log_warnings_3 that was added in commit `372b0e6355` (MDEV-20194) failed to take into account the earlier adjustment in commit `cf574cf53b` (MDEV-27634) that is specific to many GNU/Linux distributions for the s390x.	2023-08-22 09:00:51 +03:00
Marko Mäkelä	448c2077fb	Merge 10.5 into 10.6	2023-08-21 15:50:31 +03:00
Sergei Golubchik	18ddde4826	Merge branch '11.1' into 11.2	2023-08-18 00:59:16 +02:00
Marko Mäkelä	5895a3622b	Merge 10.4 into 10.5	2023-08-17 10:33:36 +03:00
Marko Mäkelä	5a8a8fc953	MDEV-31928 Assertion xid ... < 128 failed in trx_undo_write_xid() trx_undo_write_xid(): Correct an off-by-one error in a debug assertion.	2023-08-17 10:31:55 +03:00
Marko Mäkelä	b4ace139a1	Remove the often-hanging test innodb.alter_rename_files The test innodb.alter_rename_files rather frequently hangs in checkpoint_set_now. The test was removed in MariaDB Server 10.5 commit `37e7bde12a` when the code that it aimed to cover was simplified. Starting with MariaDB Server 10.5 the page flushing and log checkpointing is much simpler, handled by the single buf_flush_page_cleaner() thread. Let us remove the test to avoid occasional failures. We are not going to fix the cause of the failure in MariaDB Server 10.4.	2023-08-15 12:14:31 +03:00
Nikita Malyavin	43cb98b420	fix main.mysql57_virtual, main.alter_table, innodb.alter_algorithm The correct (best) algorithm is now chosen for ALGORITHM=DEFAULT and alter_algorithm=DEFAULT See also MDEV-30906	2023-08-15 10:16:13 +02:00
Sergei Golubchik	a8a22b7af2	support 'alter online table t1 page_checksum=0'	2023-08-15 10:16:11 +02:00
Oleksandr Byelkin	f5fae75652	Merge branch '11.0' into 11.1	2023-08-09 08:25:14 +02:00
Oleksandr Byelkin	51f9d62005	Merge branch '10.11' into 11.0	2023-08-09 07:53:48 +02:00
Oleksandr Byelkin	036df5f970	Merge branch '10.10' into 10.11	2023-08-08 14:57:31 +02:00
Oleksandr Byelkin	d2fdba94cf	Merge branch '10.9' into 10.10	2023-08-08 14:47:16 +02:00
Oleksandr Byelkin	27dc4cd1fc	Merge branch '10.6' into 10.9	2023-08-08 13:28:26 +02:00
Oleksandr Byelkin	d28d636f57	Merge branch '10.5' into 10.6	2023-08-08 13:20:58 +02:00
Oleksandr Byelkin	8852afe317	Merge branch '10.4' into 10.5	2023-08-08 11:24:42 +02:00
Thirunarayanan Balathandayuthapani	0ede90dd31	MDEV-31869 Server aborts when table does drop column - InnoDB aborts when table is dropping the column. This is caused by `5f09b53bdb` (MDEV-31086). While iterating the altered table fields, we fail to consider the dropped columns.	2023-08-08 13:24:23 +05:30
Oleksandr Byelkin	ced243a099	Merge branch '10.9' into 10.10	2023-08-05 20:34:09 +02:00
Oleksandr Byelkin	34a8e78581	Merge branch '10.6' into 10.9	2023-08-04 08:01:06 +02:00
Oleksandr Byelkin	5ea5291d97	Merge branch '10.5' into 10.6	2023-08-04 07:52:54 +02:00
Sergei Golubchik	da09ae05a9	MDEV-18114 Foreign Key Constraint actions don't affect Virtual Column * invoke check_expression() for all vcol_info's in mysql_prepare_create_table() to check for FK CASCADE * also check for SET NULL and SET DEFAULT * to check against existing FKs when a vcol is added in ALTER TABLE, old FKs must be added to the new_key_list just like other indexes are * check columns recursively, if vcol1 references vcol2, flags of vcol2 must be taken into account * remove check_table_name_processor(), put that logic under check_vcol_func_processor() to avoid walking the tree twice	2023-08-02 14:45:31 +02:00
Sergei Golubchik	75f5cc478f	MDEV-30905 Remove old_alter_table variable	2023-08-02 13:29:48 +02:00
Thirunarayanan Balathandayuthapani	f9003c73a1	MDEV-14795 InnoDB system tablespace cannot be shrunk - Introduce the :autoshrink attribute, which can be added to the innodb_data_file_path variable to allow shrinking the system tablespace during the startup process. Steps for shrinking the system tablespace: 1) Find the last used extent in the system tablespace by iterating through the BITMAP in the extent descriptor pages 2) If the last used extent is less than the user-specified size, then set the desired target size to the user-specified size. 3) Store the page contents of the "to be modified" extent descriptor pages, latch the "to be modified" extent descriptor pages and check for buffer pool memory availability 4) Make a checkpoint to flush all pages in the buffer pool, so that pages in the flush list don't have to use the doublewrite buffer, and disable the doublewrite buffer during the shrinking process 5) Update the FSP_SIZE and FSP_FREE_LIMIT in the header page 6) Remove the "to be truncated" pages from the FSP_FREE and FSP_FREE_FRAG lists 7) Reset the bitmap in the last descriptor pages for the "to be truncated" pages. 8) In case of multiple files, calculate the truncated last file size and do the truncation in the last file 9) Check whether mini-transaction log size doesn't exceed the minimum value of innodb_log_buffer_size which is 2MB. In that case, replace the modified buffer pool pages with the old page contents. 11) Commit the mini-transaction for shrinking the tablespace and enable/disable the doublewrite buffer depending on the user-specified value. recv_sys_t::apply(): Handle the truncation of the system tablespace only if the recovered tablespace size is less than the actual existing size.	2023-08-01 19:43:04 +05:30
Oleksandr Byelkin	6bf8483cac	Merge branch '10.5' into 10.6	2023-08-01 15:08:52 +02:00
Oleksandr Byelkin	f291c3df2c	Merge branch '10.4' into 10.5	2023-07-27 15:43:21 +02:00
Oleksandr Byelkin	7564be1352	Merge branch '10.4' into 10.5	2023-07-26 16:02:57 +02:00
Marko Mäkelä	e81fa34502	Merge 11.1 into 11.2	2023-07-26 15:49:24 +03:00
Marko Mäkelä	c6ac1e39b6	Merge 11.0 into 11.1	2023-07-26 15:13:43 +03:00
Marko Mäkelä	f2b4972bd4	Merge 10.11 into 11.0	2023-07-26 15:13:06 +03:00
Marko Mäkelä	bce3ee704f	Merge 10.10 into 10.11	2023-07-26 14:44:43 +03:00
Marko Mäkelä	b1b47264d2	Merge 10.9 into 10.10	2023-07-26 14:17:36 +03:00
Thirunarayanan Balathandayuthapani	4700f2ac70	MDEV-30796 Auto_increment values not updated after bulk insert operation - InnoDB fails to update the autoinc persistently after bulk insert operation. row_merge_bulk_t::write_to_index(): Update the autoinc value persistently	2023-07-26 16:24:20 +05:30
Marko Mäkelä	864bbd4d09	Merge 10.6 into 10.9	2023-07-26 13:42:23 +03:00
Lena Startseva	9854fb6fa7	MDEV-31003: Second execution for ps-protocol This patch adds a second execution of "SELECT" queries for "--ps-protocol". It also adds the ability to disable/enable (--disable_ps2_protocol/--enable_ps2_protocol) the second execution for "--ps-protocol" in test cases.	2023-07-26 17:15:00 +07:00
Oleksandr Byelkin	f52954ef42	Merge commit '10.4' into 10.5	2023-07-20 11:54:52 +02:00
Vlad Lesin	090a84366a	MDEV-29311 Server Status Innodb_row_lock_time% is reported in seconds Before MDEV-24671, the wait time was derived from my_interval_timer() / 1000 (nanoseconds converted to microseconds, and not microseconds to milliseconds like I must have assumed). The lock_sys.wait_time and lock_sys.wait_time_max are already in milliseconds; we should not divide them by 1000. In MDEV-24738 the millisecond counts lock_sys.wait_time and lock_sys.wait_time_max were changed to a 32-bit type. That would overflow in 49.7 days. Keep using a 64-bit type for those millisecond counters. Reviewed by: Marko Mäkelä	2023-07-10 12:42:46 +03:00
Marko Mäkelä	c358e216d9	MDEV-31642: Upgrade may crash if innodb_log_file_buffering=OFF recv_log_recover_10_5(): Make reads aligned by 4096 bytes, to avoid any trouble in case the file was opened in O_DIRECT mode and the physical block size is larger than 512 bytes. Because innodb_log_file_size used to be defined in whole megabytes, reading multiples of 4096 bytes instead of 512 should not be an issue.	2023-07-10 11:14:54 +03:00
Vlad Lesin	1bfd3cc457	MDEV-10962 Deadlock with 3 concurrent DELETEs by unique key PROBLEM: A deadlock was possible when a transaction tried to "upgrade" an already held Record Lock to Next Key Lock. SOLUTION: This patch is based on observations that: (1) a Next Key Lock is equivalent to Record Lock combined with Gap Lock (2) a GAP Lock never has to wait for any other lock In case we request a Next Key Lock, we check if we already own a Record Lock of equal or stronger mode, and if so, then we change the requested lock type to GAP Lock, which we either already have, or can be granted immediately, as GAP locks don't conflict with any other lock types. (We don't consider Insert Intention Locks a Gap Lock in above statements). The reason of why we don't upgrage Record Lock to Next Key Lock is the following. Imagine a transaction which does something like this: for each row { request lock in LOCK_X\|LOCK_REC_NOT_GAP mode request lock in LOCK_S mode } If we upgraded lock from Record Lock to Next Key lock, there would be created only two lock_t structs for each page, one for LOCK_X\|LOCK_REC_NOT_GAP mode and one for LOCK_S mode, and then used their bitmaps to mark all records from the same page. The situation would look like this: request lock in LOCK_X\|LOCK_REC_NOT_GAP mode on row 1: // -> creates new lock_t for LOCK_X\|LOCK_REC_NOT_GAP mode and sets bit for // 1 request lock in LOCK_S mode on row 1: // -> notices that we already have LOCK_X\|LOCK_REC_NOT_GAP on the row 1, // so it upgrades it to X request lock in LOCK_X\|LOCK_REC_NOT_GAP mode on row 2: // -> creates a new lock_t for LOCK_X\|LOCK_REC_NOT_GAP mode (because we // don't have any after we've upgraded!) and sets bit for 2 request lock in LOCK_S mode on row 2: // -> notices that we already have LOCK_X\|LOCK_REC_NOT_GAP on the row 2, // so it upgrades it to X ...etc...etc.. Each iteration of the loop creates a new lock_t struct, and in the end we have a lot (one for each record!) 
of LOCK_X locks, each with single bit set in the bitmap. Soon we run out of space for lock_t structs. If we create LOCK_GAP instead of lock upgrading, the above scenario works like the following: // -> creates new lock_t for LOCK_X\|LOCK_REC_NOT_GAP mode and sets bit for // 1 request lock in LOCK_S mode on row 1: // -> notices that we already have LOCK_X\|LOCK_REC_NOT_GAP on the row 1, // so it creates LOCK_S\|LOCK_GAP only and sets bit for 1 request lock in LOCK_X\|LOCK_REC_NOT_GAP mode on row 2: // -> reuses the lock_t for LOCK_X\|LOCK_REC_NOT_GAP by setting bit for 2 request lock in LOCK_S mode on row 2: // -> notices that we already have LOCK_X\|LOCK_REC_NOT_GAP on the row 2, // so it reuses LOCK_S\|LOCK_GAP setting bit for 2 In the end we have just two locks per page, one for each mode: LOCK_X\|LOCK_REC_NOT_GAP and LOCK_S\|LOCK_GAP. Another benefit of this solution is that it avoids not-entirely const-correct, (and otherwise looking risky) "upgrading". The fix was ported from mysql/mysql-server@bfba840dfa mysql/mysql-server@75cefdb1f7 Reviewed by: Marko Mäkelä	2023-07-06 15:06:10 +03:00
Yuchen Pei	9b431d714f	MDEV-26137 Improve import tablespace workflow. Allow ALTER TABLE ... IMPORT TABLESPACE without creating the table followed by discarding the tablespace. That is, assuming we want to import table t1 to t2, instead of CREATE TABLE t2 LIKE t1; ALTER TABLE t2 DISCARD TABLESPACE; FLUSH TABLES t1 FOR EXPORT; --copy_file $MYSQLD_DATADIR/test/t1.cfg $MYSQLD_DATADIR/test/t2.cfg --copy_file $MYSQLD_DATADIR/test/t1.ibd $MYSQLD_DATADIR/test/t2.ibd UNLOCK TABLES; ALTER TABLE t2 IMPORT TABLESPACE; We can simply do FLUSH TABLES t1 FOR EXPORT; --copy_file $MYSQLD_DATADIR/test/t1.cfg $MYSQLD_DATADIR/test/t2.cfg --copy_file $MYSQLD_DATADIR/test/t1.frm $MYSQLD_DATADIR/test/t2.frm --copy_file $MYSQLD_DATADIR/test/t1.ibd $MYSQLD_DATADIR/test/t2.ibd UNLOCK TABLES; ALTER TABLE t2 IMPORT TABLESPACE; We achieve this by creating a "stub" table in the second scenario while opening the table, where t2 does not exist but needs to import from t1. The "stub" table is similar to a table that is created but then instructed to discard its tablespace. We include tests with various row formats, encryption, with indexes and auto-increment.	2023-07-04 17:56:27 +10:00
Marko Mäkelä	cee9b3b850	Merge 11.0 into 11.1	2023-07-04 08:20:55 +03:00
Marko Mäkelä	a906046f1f	Merge 10.11 into 11.0	2023-07-04 08:20:20 +03:00
Marko Mäkelä	3430767e00	Merge 10.10 into 10.11	2023-07-04 08:19:48 +03:00
Marko Mäkelä	c2d5523545	Merge 10.9 into 10.10	2023-07-04 08:18:30 +03:00
Marko Mäkelä	26fc07b162	Merge 10.6 into 10.9	2023-07-03 16:49:55 +03:00
Marko Mäkelä	b8088487e4	MDEV-19216 Assertion ...SYS_FOREIGN failed in btr_node_ptr_max_size btr_node_ptr_max_size(): Handle BINARY(0) and VARBINARY(0) as special cases, similar to CHAR(0) and VARCHAR(0).	2023-07-03 16:09:18 +03:00
Marko Mäkelä	0105220e3b	Remove tests that duplicate innodb.max_record_size	2023-07-03 16:06:10 +03:00
Marko Mäkelä	dc1bd1802a	MDEV-31386 InnoDB: Failing assertion: page_type == i_s_page_type[page_type].type_value i_s_innodb_buffer_page_get_info(): Correct a condition. After crash recovery, there may be some buffer pool pages in FREED state, containing garbage (invalid data page contents). Let us ignore such pages in the INFORMATION_SCHEMA output. The test innodb.innodb_defragment_fill_factor will be removed, because the queries that it is invoking on information_schema.innodb_buffer_page would start to fail. The defragmentation feature was removed in commit `7ca89af6f8` in MariaDB Server 11.1. Tested by: Matthias Leich	2023-07-03 14:39:29 +03:00
Marko Mäkelä	d04de1aa13	Merge 10.6 into 10.9	2023-06-30 13:42:52 +03:00
Vlad Lesin	3e89b4fcc6	MDEV-31570 gap_lock_split.test hangs sporadically The fix replaces waiting for the whole purge to finish with waiting only for the purge of delete-marked records to finish. Reviewed by: Marko Mäkelä	2023-06-28 14:22:40 +03:00
Thirunarayanan Balathandayuthapani	73f78fb3b0	MDEV-31537 Bulk insert operation aborts the server for redundant table - InnoDB bulk insert operation aborts the server for redundant table. InnoDB miscalculates the record size in the temporary file for the redundant table. CHAR in a redundant row format table is always fixed-length, but in the temporary file, it is variable-length for variable-length character sets.	2023-06-28 15:26:22 +05:30
Marko Mäkelä	8290a46d50	Merge 11.0 into 11.1	2023-06-28 09:38:59 +03:00
Marko Mäkelä	1fe4bcbe05	Merge 10.11 into 11.0	2023-06-28 09:19:19 +03:00
Marko Mäkelä	71a1a28a49	Merge 10.10 into 10.11	2023-06-27 17:45:06 +03:00
Marko Mäkelä	135e976696	Merge 10.9 into 10.10	2023-06-27 17:43:31 +03:00
Thirunarayanan Balathandayuthapani	5f09b53bdb	MDEV-31086 MODIFY COLUMN can break FK constraints, and lead to unrestorable dumps - When foreign_key_checks is disabled, allowing modification of a column that is part of a foreign key constraint can lead to refusal of TRUNCATE TABLE, OPTIMIZE TABLE later. So it makes sense to block the column modify operation when a foreign key is involved, irrespective of the foreign_key_checks variable. Correct way to modify the charset of the column when fk is involved: SET foreign_key_checks=OFF; ALTER TABLE child DROP FOREIGN KEY fk, MODIFY m VARCHAR(200) CHARSET utf8mb4; ALTER TABLE parent MODIFY m VARCHAR(200) CHARSET utf8mb4; ALTER TABLE child ADD CONSTRAINT FOREIGN KEY (m) REFERENCES PARENT(m); SET foreign_key_checks=ON; fk_check_column_changes(): Remove the FOREIGN_KEY_CHECKS while checking the column change for foreign key constraint. This is a partial revert of commit `5f1f2fc0e4` and it changes the behaviour of the copy alter algorithm. ha_innobase::prepare_inplace_alter_table(): Find the modified column and check whether it is part of an existing or newly added foreign key constraint.	2023-06-27 16:58:22 +05:30
Marko Mäkelä	eb6b521f1b	Merge 10.6 into 10.9	2023-06-27 13:48:46 +03:00
Monty	582d0cf5b0	Added not_as_root.inc to some test scripts that fails if run as root	2023-06-10 11:14:15 +03:00
Marko Mäkelä	3883eb63dc	Merge 11.0 into 11.1	2023-06-08 14:09:21 +03:00
Thirunarayanan Balathandayuthapani	bf0a54df34	MDEV-31416 ASAN errors in dict_v_col_t::detach upon adding key to virtual column - InnoDB throws ASAN error while adding the index on virtual column of system versioned table. InnoDB wrongly assumes that virtual column collation type changes, creates new column with different character set. This leads to failure while detaching the column from indexes.	2023-06-08 16:34:45 +05:30
Marko Mäkelä	5fb2c031f7	Merge 10.11 into 11.0	2023-06-08 13:49:48 +03:00
Marko Mäkelä	5d7b957eb0	Merge 10.10 into 10.11	2023-06-08 11:23:08 +03:00
Marko Mäkelä	e704a13b32	Merge 10.9 into 10.10	2023-06-08 11:22:12 +03:00
Marko Mäkelä	223c2c5b9d	Merge 10.6 into 10.9	2023-06-08 10:46:19 +03:00
Marko Mäkelä	80585c9d6f	Merge 10.5 into 10.6	2023-06-08 10:42:56 +03:00
Marko Mäkelä	21031b24fc	Suppress an occasional buffer pool warning	2023-06-08 09:38:03 +03:00
Marko Mäkelä	3e40f9a7f3	MDEV-31355 innodb_undo_log_truncate=ON fails to wait for purge of enough transaction history purge_sys_t::sees(): Wrapper for view.sees(). trx_purge_truncate_history(): Invoke purge_sys.sees() instead of comparing to head.trx_no, to determine if undo pages can be safely freed. The test innodb.cursor-restore-locking was adjusted by Vladislav Lesin, as was the debug instrumentation in row_purge_del_mark(). Reviewed by: Vladislav Lesin	2023-06-08 09:17:52 +03:00
Marko Mäkelä	c04284e747	Merge 10.10 into 10.11	2023-06-07 15:01:43 +03:00
Marko Mäkelä	82230aa423	Merge 10.9 into 10.10	2023-06-07 14:48:37 +03:00
Marko Mäkelä	878a86f276	Merge 10.6 into 10.9	2023-06-07 14:32:46 +03:00
Sergei Golubchik	cbabb95915	Merge branch '11.0' into 11.1	2023-06-05 20:15:15 +02:00
Sergei Golubchik	0005f2f06c	Merge branch 'bb-10.11-release' into bb-11.0-release	2023-06-05 19:27:00 +02:00
Sergei Golubchik	4e2b93dffe	Merge branch 'bb-10.10-release' into bb-10.11-release	2023-06-05 19:04:58 +02:00
Sergei Golubchik	30bba8e275	Merge branch 'github/bb-10.9-release' into bb-10.10-release	2023-06-05 18:59:43 +02:00
Sergei Golubchik	33fd519ca7	Merge branch 'github/bb-10.6-release' into bb-10.9-release	2023-06-05 18:55:26 +02:00
Marko Mäkelä	89eb6fa8a7	MDEV-31308 InnoDB monitor trx_rseg_history_len was accidentally disabled by default innodb_counter_info[]: Revert a change that was accidentally made in commit `204e7225dc`	2023-06-03 11:12:21 +02:00
Alexander Barkov	03a9366c73	Extra tests for MDEV-30483 After upgrade to 10.6 from Mysql 5.7 seeing "InnoDB: Column last_update in table mysql.innodb_table_stats is BINARY(4) NOT NULL but should be INT UNSIGNED NOT NULL" Adding tests demonstrating that columns: - mysql.innodb_table_stats.last_update - mysql.innodb_index_stats.last_update contain sane values close to NOW() rather than garbage. Tests cover these three underlying TIMESTAMP data formats: - MariaDB Field_timestamp0 - UINT4 based Like in a MariaDB native installation running with mysql56_temporal_format=0 - MariaDB Field_timestampf - BINARY(4) based, with UNSIGNED_FLAG Like in a MariaDB native installation running with mysql56_temporal_format=1 - MySQL-alike Field_timestampf - BINARY(4) based, without UNSIGNED_FLAG Like with a MariaDB server running over a MySQL-5.6 directory (e.g. during a migration).	2023-05-26 16:47:16 +04:00
Marko Mäkelä	0796b7ad5e	Merge 10.6 into 10.9	2023-05-22 09:13:51 +03:00
Vlad Lesin	b54e7b0cea	MDEV-31185 rw_trx_hash_t::find() unpins pins too early rw_trx_hash_t::find() acquires element->mutex, then unpins pins, used for lf_hash element search. After that the "element" can be deallocated and reused by some other thread. If we take a look at the rw_trx_hash_t::insert()->lf_hash_insert()->lf_alloc_new() calls, we will not find any element->mutex acquisition, as the mutex is not yet initialized before its allocation. rw_trx_hash_t::insert() can reuse the chunk, unpinned in rw_trx_hash_t::find(). The scenario is the following: 1. Thread 1 has just executed lf_hash_search() in rw_trx_hash_t::find(), but has not acquired element->mutex yet. 2. Thread 2 has removed the element from the hash table with a rw_trx_hash_t::erase() call. 3. Thread 1 acquired element->mutex and unpinned pin 2 pin with lf_hash_search_unpin(pins) call. 4. Some thread purged the memory of the element. 5. Thread 3 reused the memory for the element, filled element->id, element->trx. 6. Thread 1 crashes with failed "DBUG_ASSERT(trx_id == trx->id)" assertion. Note that trx_t objects are also reused, see the code around trx_pools for details. The fix is to invoke "lf_hash_search_unpin(pins);" after element->trx is stored in a local variable in rw_trx_hash_t::find(). Reviewed by: Nikita Malyavin, Marko Mäkelä.	2023-05-19 15:50:20 +03:00
Marko Mäkelä	df524dc06f	MDEV-31308 InnoDB monitor trx_rseg_history_len was accidentally disabled by default innodb_counter_info[]: Revert a change that was accidentally made in commit `204e7225dc`	2023-05-19 15:29:26 +03:00
Marko Mäkelä	54819192fe	Merge 10.11 into 11.0	2023-04-26 18:50:15 +03:00
Marko Mäkelä	52f6f364d9	Merge 10.10 into 10.11	2023-04-26 18:31:50 +03:00
Marko Mäkelä	ce6616aa28	Merge 10.9 into 10.10	2023-04-26 18:31:03 +03:00
Marko Mäkelä	e3f6e1c92e	Merge 10.8 into 10.9	2023-04-26 17:48:13 +03:00
Marko Mäkelä	c15c8ef3e3	Merge 10.6 into 10.8	2023-04-26 13:58:40 +03:00
Marko Mäkelä	818d5e4814	Merge 10.5 into 10.6	2023-04-25 13:10:33 +03:00
Marko Mäkelä	3c25077899	Merge 10.6 into 10.8	2023-04-24 15:59:23 +03:00
Oleksandr Byelkin	1d74927c58	Merge branch '10.4' into 10.5	2023-04-24 12:43:47 +02:00
Marko Mäkelä	204e7225dc	Cleanup: MONITOR_EXISTING trx_undo_slots_used, trx_undo_slots_cached Let us remove explicit updates of MONITOR_NUM_UNDO_SLOT_USED and MONITOR_NUM_UNDO_SLOT_CACHED, and let us compute the rough values from trx_sys.rseg_array[] on demand.	2023-04-21 17:58:18 +03:00
Thirunarayanan Balathandayuthapani	2bfd04e314	MDEV-31025 Redundant table alter fails when fixed column stored externally row_merge_buf_add(): Has strict assert that fixed length mismatch shouldn't happen while rebuilding the redundant row format table btr_index_rec_validate(): Fixed size column can be stored externally. So sum of inline stored length and external stored length of the column should be equal to total column length	2023-04-19 17:11:14 +05:30
Sergei Petrunia	c7fe8e51de	Merge 10.11 into 11.0	2023-04-17 16:50:01 +03:00
Marko Mäkelä	656c2e18b1	Merge 10.10 into 10.11	2023-04-14 13:08:28 +03:00
Marko Mäkelä	a009280e60	Merge 10.9 into 10.10	2023-04-14 12:24:14 +03:00
Marko Mäkelä	44281b88f3	Merge 10.8 into 10.9	2023-04-14 11:32:36 +03:00
Marko Mäkelä	1d1e0ab2cc	Merge 10.6 into 10.8	2023-04-12 15:50:08 +03:00
Junqi Xie	d20a96f9c1	MDEV-21921 Make transaction_isolation and transaction_read_only into system variables In MariaDB, we have a confusing problem where: * The transaction_isolation option can be set in a configuration file, but it cannot be set dynamically. * The tx_isolation system variable can be set dynamically, but it cannot be set in a configuration file. Therefore, we have two different names for the same thing in different contexts. This is needlessly confusing, and it complicates the documentation. The same thing applies to transaction_read_only. MySQL 5.7 solved this problem by making them into system variables. https://dev.mysql.com/doc/relnotes/mysql/5.7/en/news-5-7-20.html This commit takes a similar approach by adding new system variables and marking the original ones as deprecated. This commit also resolves some legacy problems related to SET STATEMENT and transaction_isolation.	2023-04-12 11:04:29 +10:00
Marko Mäkelä	5bada1246d	Merge 10.5 into 10.6	2023-04-11 16:15:19 +03:00
Alexander Barkov	ed2adc8c6f	MDEV-28190 sql_mode makes MDEV-371 virtual column expressions nondeterministic This problem was fixed earlier by MDEV-27653. Adding MTR tests only.	2023-04-06 16:17:50 +04:00
Alexander Barkov	62e137d4d7	Merge remote-tracking branch 'origin/10.4' into 10.5	2023-04-05 16:16:19 +04:00
Alexander Barkov	8020b1bd73	MDEV-30034 UNIQUE USING HASH accepts duplicate entries for tricky collations - Adding a new argument "flag" to MY_COLLATION_HANDLER::strnncollsp_nchars() and a flag MY_STRNNCOLLSP_NCHARS_EMULATE_TRIMMED_TRAILING_SPACES. The flag defines if strnncollsp_nchars() should emulate trailing spaces which were possibly trimmed earlier (e.g. in InnoDB CHAR compression). This is important for NOPAD collations. For example, with this input: - str1= 'a ' (Latin letter a followed by one space) - str2= 'a ' (Latin letter a followed by two spaces) - nchars= 3 if the flag is given, strnncollsp_nchars() will virtually restore one trailing space to str1 up to nchars (3) characters and compare two strings as equal: - str1= 'a ' (one extra trailing space emulated) - str2= 'a ' (as is) If the flag is not given, strnncollsp_nchars() does not add trailing virtual spaces, so in case of a NOPAD collation, str1 will be compared as less than str2 because it is shorter. - Field_string::cmp_prefix() now passes the new flag. Field_varstring::cmp_prefix() and Field_blob::cmp_prefix() do not pass the new flag. - The branch in cmp_whole_field() in storage/innobase/rem/rem0cmp.cc (which handles the CHAR data type) now also passed the new flag. - Fixing UCA collations to respect the new flag. Other collations are possibly also affected, however I had no success in making an SQL script demonstrating the problem. Other collations will be extended to respect this flags in a separate patch later. - Changing the meaning of the last parameter of Field::cmp_prefix() from "number of bytes" (internal length) to "number of characters" (user visible length). The code calling cmp_prefix() from handler.cc was wrong. After this change, the call in handler.cc became correct. The code calling cmp_prefix() from key_rec_cmp() in key.cc was adjusted according to this change. - Old strnncollsp_nchar() related tests in unittest/strings/strings-t.c now pass the new flag. 
A few new tests also were added, without the flag.	2023-04-04 12:30:50 +04:00
Lorna Luo	0cc1694e9c	Make 'move_file' command more reliable in 3 innodb tests The tests innodb.import_tablespace_race, innodb.restart, and innodb.innodb-wl5522 move the tablespace file between the data directory and the tmp directory specified by global environment variables. However, this is risky because it is not unusual for the configured tmp directory (often under /tmp) to be mounted on another disk partition or device, in which case the 'move_file' command may fail with "Errcode: 18 'Invalid cross-device link.'" For innodb.import_tablespace_race and innodb.innodb-wl5522, moving files across directories is not necessary. Modify the tests so they rename files under the same directory. For innodb.restart, instead of moving between datadir and MYSQL_TMPDIR, move the files under MYSQLTEST_VARDIR. All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc.	2023-04-03 14:36:11 +02:00
Oleksandr Byelkin	ac5a534a4c	Merge remote-tracking branch '10.4' into 10.5	2023-03-31 21:32:41 +02:00
Marko Mäkelä	2b61ff8f22	Merge 11.0 into 11.1	2023-03-29 17:23:21 +03:00
Marko Mäkelä	5e01255732	Merge 10.11 into 11.0	2023-03-29 17:20:42 +03:00
Marko Mäkelä	d84a282629	Merge 10.10 into 10.11	2023-03-29 16:53:37 +03:00
Marko Mäkelä	a6780df49b	MDEV-30453 Setting innodb_buffer_pool_filename to an empty string attempts to delete the data directory on shutdown Let us make innodb_buffer_pool_filename a read-only variable so that a malicious user cannot cause an important file to be deleted on InnoDB shutdown. An attempt to delete a directory will fail because it is not a regular file, but what if the variable pointed to (say) ibdata1, ib_logfile0 or some *.ibd file? It does not seem to make much sense for this parameter to be configurable in the first place, but we will not change that in order to avoid breaking compatibility.	2023-03-29 16:49:10 +03:00
Marko Mäkelä	191821f7df	Merge 10.9 into 10.10	2023-03-29 15:29:02 +03:00
Marko Mäkelä	55e78ebf41	Merge 10.8 into 10.9	2023-03-29 15:28:13 +03:00
Marko Mäkelä	dd2fe81122	Merge 10.6 into 10.8	2023-03-29 15:16:42 +03:00
Thirunarayanan Balathandayuthapani	e06c6046d2	MDEV-29545 InnoDB: Can't find record during replace stmt Problem: ======== - InnoDB replace statement returns can't find record as result during bulk insert operation. InnoDB returns DB_END_OF_INDEX blindly when bulk transaction is visible to current transaction even though the search tuple is inserted as a part of current replace statement. Solution: ========= row_search_mvcc(): InnoDB should allow the transaction to read all the rows when innodb intends to do any locking on the record even though bulk insert transaction changes are visible to the current transaction	2023-03-24 15:20:21 +05:30
Yuchen Pei	7c91082e39	MDEV-27912 Fixing inconsistency w.r.t. expect files in tests. mtr uses group suffix, but some existing inc and test files use server_id for expect files. This patch aims to fix that. For spider: With this change we will not have to maintain a separate version of restart_mysqld.inc for spider, that duplicates code, just because spider tests use different names for expect files, and shutdown_mysqld requires magical names for them. With this change spider tests will also be able to use other features provided by restart_mysqld.inc without code duplication, like the parameter $restart_parameters (see e.g. the testcase mdev_29904.test in commit ef1161e5d4f). Tests run after this change: default, spider, rocksdb, galera, using the following command mtr --parallel=auto --force --max-test-fail=0 --skip-core-file mtr --suite spider,spider/,spider//* \ --skip-test="spider/oracle.\|./t\..*" --parallel=auto --big-test \ --force --max-test-fail=0 --skip-core-file mtr --suite galera --parallel=auto mtr --suite rocksdb --parallel=auto	2023-03-22 11:55:57 +11:00
Marko Mäkelä	6e58d5ab6a	Merge 11.0 into 11.1	2023-03-17 15:04:38 +02:00
Marko Mäkelä	4c355d4e81	Merge 10.11 into 11.0	2023-03-17 15:03:17 +02:00
Marko Mäkelä	7343a2ceb6	Merge 10.10 into 10.11	2023-03-17 14:23:03 +02:00
Marko Mäkelä	df08731b58	Merge 10.9 into 10.10	2023-03-17 14:22:35 +02:00
Marko Mäkelä	1147e32688	Merge 10.8 into 10.9	2023-03-17 14:22:10 +02:00
Marko Mäkelä	fa56adff75	Merge 10.6 into 10.8	2023-03-17 14:19:17 +02:00
Thirunarayanan Balathandayuthapani	e8e0559ed2	MDEV-30870 Undo tablespace name displays wrongly for I_S queries - INNODB_SYS_TABLESPACES in information schema should display innodb_undo001, innodb_undo002 etc as tablespace name for undo tablespaces	2023-03-17 17:17:35 +05:30
Thirunarayanan Balathandayuthapani	18e4978edc	MDEV-29975 InnoDB fails to release savepoint during bulk insert - InnoDB rolls back the whole transaction and discards the savepoint when a failure happens during a bulk insert operation. When the server requests to release the savepoint, InnoDB should return DB_SUCCESS when it deals with a bulk insert operation	2023-03-17 16:41:27 +05:30
Marko Mäkelä	c50f849d64	Merge 10.10 into 10.11	2023-03-17 07:00:03 +02:00
Marko Mäkelä	3dd33789c1	Merge 10.9 into 10.10	2023-03-17 06:59:46 +02:00
Marko Mäkelä	fffa4b28a1	Merge 10.8 into 10.9	2023-03-17 06:58:33 +02:00
Marko Mäkelä	acf46b7b36	Merge 10.6 into 10.8	2023-03-16 18:11:37 +02:00
Marko Mäkelä	a55b951e60	MDEV-26827 Make page flushing even faster For more convenient monitoring of something that could greatly affect the volume of page writes, we add the status variable Innodb_buffer_pool_pages_split that was previously only available via information_schema.innodb_metrics as "innodb_page_splits". This was suggested by Axel Schwenke. buf_flush_page_count: Replaced with buf_pool.stat.n_pages_written. We protect buf_pool.stat (except n_page_gets) with buf_pool.mutex and remove unnecessary export_vars indirection. buf_pool.flush_list_bytes: Moved from buf_pool.stat.flush_list_bytes. Protected by buf_pool.flush_list_mutex. buf_pool_t::page_cleaner_status: Replaces buf_pool_t::n_flush_LRU_, buf_pool_t::n_flush_list_, and buf_pool_t::page_cleaner_is_idle. Protected by buf_pool.flush_list_mutex. We will exclusively broadcast buf_pool.done_flush_list by the buf_flush_page_cleaner thread, and only wait for it when communicating with buf_flush_page_cleaner. There is no need to keep a count of pending writes by the buf_pool.flush_list processing. A single flag suffices for that. Waits for page write completion can be performed by simply waiting on block->page.lock, or by invoking buf_dblwr.wait_for_page_writes(). buf_LRU_block_free_non_file_page(): Broadcast buf_pool.done_free and set buf_pool.try_LRU_scan when freeing a page. This would be executed also as part of buf_page_write_complete(). buf_page_write_complete(): Do not broadcast buf_pool.done_flush_list, and do not acquire buf_pool.mutex unless buf_pool.LRU eviction is needed. Let buf_dblwr count all writes to persistent pages and broadcast a condition variable when no outstanding writes remain. buf_flush_page_cleaner(): Prioritize LRU flushing and eviction right after "furious flushing" (lsn_limit). Simplify the conditions and reduce the hold time of buf_pool.flush_list_mutex. Refuse to shut down or sleep if buf_pool.ran_out(), that is, LRU eviction is needed. 
buf_pool_t::page_cleaner_wakeup(): Add the optional parameter for_LRU. buf_LRU_get_free_block(): Protect buf_lru_free_blocks_error_printed with buf_pool.mutex. Invoke buf_pool.page_cleaner_wakeup(true) to ensure that buf_flush_page_cleaner() will process the LRU flush request. buf_do_LRU_batch(), buf_flush_list(), buf_flush_list_space(): Update buf_pool.stat.n_pages_written when submitting writes (while holding buf_pool.mutex), not when completing them. buf_page_t::flush(), buf_flush_discard_page(): Require that the page U-latch be acquired upfront, and remove buf_page_t::ready_for_flush(). buf_pool_t::delete_from_flush_list(): Remove the parameter "bool clear". buf_flush_page(): Count pending page writes via buf_dblwr. buf_flush_try_neighbors(): Take the block of page_id as a parameter. If the tablespace is dropped before our page has been written out, release the page U-latch. buf_pool_invalidate(): Let the caller ensure that there are no outstanding writes. buf_flush_wait_batch_end(false), buf_flush_wait_batch_end_acquiring_mutex(false): Replaced with buf_dblwr.wait_for_page_writes(). buf_flush_wait_LRU_batch_end(): Replaces buf_flush_wait_batch_end(true). buf_flush_list(): Remove some broadcast of buf_pool.done_flush_list. buf_flush_buffer_pool(): Invoke also buf_dblwr.wait_for_page_writes(). buf_pool_t::io_pending(), buf_pool_t::n_flush_list(): Remove. Outstanding writes are reflected by buf_dblwr.pending_writes(). buf_dblwr_t::init(): New function, to initialize the mutex and the condition variables, but not the backing store. buf_dblwr_t::is_created(): Replaces buf_dblwr_t::is_initialised(). buf_dblwr_t::pending_writes(), buf_dblwr_t::writes_pending: Keeps track of writes of persistent data pages. buf_flush_LRU(): Allow calls while LRU flushing may be in progress in another thread. Tested by Matthias Leich (correctness) and Axel Schwenke (performance)	2023-03-16 17:19:58 +02:00
Thirunarayanan Balathandayuthapani	dfdcd7ffab	MDEV-26198 Assertion `0' failed in row_log_table_apply_op during redundant table rebuild - InnoDB alter fails to apply the online log during redundant table rebuild. Problem is that InnoDB wrongly reads the length flags of the record while applying the temporary log record. rec_init_offsets_comp_ordinary(): For finding the n_core_null_bytes, InnoDB should use the same logic as rec_convert_dtuple_to_rec_comp().	2023-03-14 13:34:23 +05:30
Marko Mäkelä	7ca89af6f8	MDEV-30545 Remove innodb_defragment and related parameters The deprecated parameters will be removed: innodb_defragment innodb_defragment_n_pages innodb_defragment_stats_accuracy innodb_defragment_fill_factor_n_recs innodb_defragment_fill_factor innodb_defragment_frequency The mysql.innodb_index_stats.stat_name values 'n_page_split' and 'n_pages_freed' will lose their special meaning. The related changes to OPTIMIZE TABLE in InnoDB will be removed as well. The parameter innodb_optimize_fulltext_only will retain its special meaning in OPTIMIZE TABLE. Tested by: Matthias Leich	2023-03-11 10:45:35 +02:00
Thirunarayanan Balathandayuthapani	062ba0bd4a	MDEV-30183 Assertion `!memcmp(rec_trx_id, old_pk_trx_id->data, 6 + 7)' failed in row_log_table_apply_update - This failure is caused by commit `358921ce32`. row_ins_duplicate_online() should consider whether the record is an exact match of the tuple when the number of matching fields equals the number of unique fields + DB_TRX_ID + DB_ROLL_PTR	2023-03-06 23:40:13 +05:30
Marko Mäkelä	7a834d6248	Merge 10.11 into 11.0	2023-02-28 13:14:08 +02:00
Marko Mäkelä	95d51369c9	Merge 10.10 into 10.11	2023-02-28 10:52:42 +02:00
Marko Mäkelä	f14d9fa09a	Merge 10.9 into 10.10	2023-02-28 10:43:29 +02:00
Marko Mäkelä	c3246e4bf0	Merge 10.8 into 10.9	2023-02-28 10:37:11 +02:00
Marko Mäkelä	6ac44ac3ab	Merge 10.6 into 10.8	2023-02-28 10:36:17 +02:00
Marko Mäkelä	3e2ad0e918	Merge 10.5 into 10.6	2023-02-27 13:17:35 +02:00
Marko Mäkelä	0de3be8cfd	MDEV-30671 InnoDB undo log truncation fails to wait for purge of history It is not safe to invoke trx_purge_free_segment() or execute innodb_undo_log_truncate=ON before all undo log records in the rollback segment have been processed. A prominent failure that would occur due to premature freeing of undo log pages is that trx_undo_get_undo_rec() would crash when trying to copy an undo log record to fetch the previous version of a record. If trx_undo_get_undo_rec() was not invoked in the unlucky time frame, then the symptom would be that some committed transaction history is never removed. This would be detected by CHECK TABLE...EXTENDED that was implemented in commit `ab0190101b`. Such a garbage collection leak should be possible even when using innodb_undo_log_truncate=OFF, just involving trx_purge_free_segment(). trx_rseg_t::needs_purge: Change the type from Boolean to a transaction identifier, noting the most recent non-purged transaction, or 0 if everything has been purged. On transaction start, we initialize this to 1 more than the transaction start ID. On recovery, the field may be adjusted to the transaction end ID (TRX_UNDO_TRX_NO) if it is larger. The field TRX_UNDO_NEEDS_PURGE becomes write-only; only some debug assertions would validate the value. The field reflects the old inaccurate Boolean field trx_rseg_t::needs_purge. trx_undo_mem_create_at_db_start(), trx_undo_lists_init(), trx_rseg_mem_restore(): Remove the parameter max_trx_id. Instead, store the maximum in trx_rseg_t::needs_purge, where trx_rseg_array_init() will find it. trx_purge_free_segment(): Continuously hold a lock on trx_rseg_t to prevent any concurrent allocation of undo log. trx_purge_truncate_rseg_history(): Only invoke trx_purge_free_segment() if the rollback segment is empty and there are no pending transactions associated with it. 
trx_purge_truncate_history(): Only proceed with innodb_undo_log_truncate=ON if trx_rseg_t::needs_purge indicates that all history has been purged. Tested by: Matthias Leich	2023-02-24 14:24:44 +02:00
Marko Mäkelä	d5d7c8ba96	MDEV-30544 Deprecate innodb_defragment and related parameters There is a little used option innodb_defragment that would make OPTIMIZE TABLE not rebuild the table as usual for InnoDB, but instead cause the index B-trees to be optimized in place. This option uses excessive locking (exclusively locking index trees). It never covered SPATIAL INDEX or FULLTEXT INDEX. Storage space was never reclaimed. Because this option is not particularly useful and causes a maintenance burden (most recently in commit `de4030e4d4`), it is best to deprecate it, to prepare for its removal.	2023-02-21 13:33:47 +02:00
Vlad Lesin	a474e3278c	MDEV-27701 Race on trx->lock.wait_lock between lock_rec_move() and lock_sys_t::cancel() The initial issue was an assertion failure, which checked the equality of the lock to cancel with trx->lock.wait_lock in lock_sys_t::cancel(). If we analyze the lock_sys_t::cancel() code from the perspective of trx->lock.wait_lock racing, we won't find the error there, except in the cases when we need to reload it after acquiring the corresponding latches. So the fix is just to remove the assertion and reload trx->lock.wait_lock after acquiring the necessary latches. Reviewed by: Marko Mäkelä <marko.makela@mariadb.com>	2023-02-20 20:31:24 +03:00
Marko Mäkelä	2e431ff7e6	Merge 10.11 into 11.0	2023-02-16 13:34:45 +02:00
Thirunarayanan Balathandayuthapani	702d1af32c	MDEV-30615 Can't read from I_S.INNODB_SYS_INDEXES when having a discarded tablespace - The MY_I_S_MAYBE_NULL field attribute is added to PAGE_NO and SPACE in the innodb_sys_index table. By doing this, InnoDB can set null for these fields when it encounters a discarded tablespace	2023-02-16 16:04:46 +05:30
Marko Mäkelä	1fd0099839	Merge 10.10 into 10.11	2023-02-16 11:41:18 +02:00
Marko Mäkelä	345356b868	Merge 10.9 into 10.10	2023-02-16 11:36:38 +02:00
Marko Mäkelä	0d55914d96	Merge 10.8 into 10.9	2023-02-16 10:25:34 +02:00
Marko Mäkelä	b12cd88ce1	Merge 10.6 into 10.8	2023-02-16 10:24:23 +02:00
Marko Mäkelä	67a6ad0a4a	Merge 10.5 into 10.6	2023-02-16 10:17:58 +02:00
Marko Mäkelä	d3f35aa47b	MDEV-30552 fixup: Fix the test for non-debug	2023-02-16 10:16:38 +02:00
Marko Mäkelä	5abbe092e6	Merge 10.6 into 10.8	2023-02-16 09:17:06 +02:00
Marko Mäkelä	96a3b11d13	Merge 10.5 into 10.6	2023-02-14 15:23:23 +02:00
Thirunarayanan Balathandayuthapani	951d81d92e	MDEV-30426 Assertion !rec_offs_nth_extern(offsets2, n) during bulk insert - cmp_rec_rec_simple() fails to detect duplicate key error for bulk insert operation	2023-02-14 15:43:33 +05:30
Thirunarayanan Balathandayuthapani	1a5c7552ea	MDEV-30552 InnoDB recovery crashes when error handling scenario - InnoDB fails to reset the after_apply variable before applying the redo log in last batch during multi-batch recovery.	2023-02-14 14:36:17 +05:30
Thirunarayanan Balathandayuthapani	3eea2e8e10	MDEV-30551 InnoDB recovery hangs when buffer pool ran out of memory - During a non-last batch of multi-batch recovery, InnoDB holds log_sys.mutex and preallocates the block, which may initiate a page flush, which may initiate a log flush, which requires acquiring log_sys.mutex again. This leads to an assertion failure. So InnoDB recovery should release log_sys.mutex before preallocating the block.	2023-02-14 14:35:35 +05:30
Marko Mäkelä	dbab3e8d90	Merge 10.6 into 10.8	2023-02-10 13:43:53 +02:00
Monty	00704aff98	Fixed bug in extended key handling when there is no primary key Extended keys work by first checking if the engine supports extended keys. If yes, it extends secondary keys with primary key components and marks the secondary keys as HA_EXT_NOSAME (unique). If we later notice that there was no primary key, the extended key information for secondary keys in share->key_info is reset. However the key_info->flag HA_EXT_NOSAME was not reset! This causes some strange things to happen: - Tables that have no primary key or secondary index that contained the primary key would be wrongly optimized as the secondary key could be thought to be unique when it was not and not unique when it was. - The problem was not shown in EXPLAIN because of a bug in create_ref_for_key() that caused EQ_REF to be displayed by EXPLAIN as REF when extended keys were used and the secondary key contained the primary key. This is fixed with: - Removed wrong test in make_join_select() which did not detect that keys were unique when a secondary key contains the primary. - Moved initialization of extended keys from create_key_infos() to init_from_binary_frm_image() after we know if there is a usable primary key or not. One disadvantage with this approach is that key_info->key_parts may have unused slots (for keys we thought could be extended but could not). Fixed by adding a check for unused key_parts to copy_keys_from_share(). Other things: - Simplified copying of first key part in create_key_infos(). - Added a lot of code comments in code that I had to check as part of finding the issue. - Fixed some indentation. - Replaced a couple of loops using references to pointers in C context where the reference does not give any benefit. - Updated Aria and Maria to not assume that all key_info->rec_per_key are in one memory block (this could happen when using derived tables with many keys). 
- Fixed a bug where key_info->rec_per_key was not allocated - Optimized TABLE::add_tmp_key() to only call alloc() once. (No logic changes) Test case changes: - innodb_mysql.test changed index as an index the optimizer thought was unique, was not. (Table had no primary key) TODO: - Move code that checks for partial or too long keys to the primary loop earlier that initially decides if we should add extended key fields. This is needed to ensure that HA_EXT_NOSAME is not set for partial or too long keys. It will also shorten the current code notably.	2023-02-10 13:35:31 +02:00
Monty	01c82173dd	Removed /2 of InnoDB ref_per_key[] estimates The original code was there to favor index search over table scan. This is not needed anymore as the cost calculations for table scans and index lookups are now more exact.	2023-02-10 12:59:36 +02:00
Monty	3fa99f0c0e	Change cost for REF to take into account cost for 1 extra key read_next The main difference in code path between EQ_REF and REF is that for REF we have to do an extra read_next on the index to check that there are no more matching rows. Before this patch we added a preference of EQ_REF by ensuring that REF would always estimate to find at least 2 rows. This patch adds the cost of the extra key read_next to REF access and removes the code that limited REF to at least 2 rows. For some queries this can have a big effect as the total estimated rows will be halved for each REF table with 1 rows. multi_range cost calculations are also changed to take into account the difference between EQ_REF and REF. The effect of the patch to the test suite: - About 80 test cases changed - Almost all changes were for EXPLAIN, where estimated rows for REF were changed from 2 to 1. - A few test cases using explain extended had a change of 'filtered'. This is because the estimated rows are now closer to the calculated selectivity. - A very few tests had a change of table order. This is because of the change of estimated rows from 2 to 1 or the small cost change for REF (main.subselect_sj_jcl6, main.group_by, main.derived_cond_pushdown, main.distinct, main.join_nested, main.order_by, main.join_cache) - No key statistics and the estimated rows are now smaller which caused estimated filtering to be lower. (main.subselect_sj_mat) - The number of total rows are halved. (main.derived_cond_pushdown) - Plans with 1 row changed to use RANGE instead of REF. (main.group_min_max) - ALL changed to REF (main.key_diff) - Key changed from ref + index_only to PRIMARY key for InnoDB, as OPTIMIZER_ROW_LOOKUP_COST + OPTIMIZER_ROW_NEXT_FIND_COST is smaller than OPTIMIZER_KEY_LOOKUP_COST + OPTIMIZER_KEY_NEXT_FIND_COST. (main.join_outer_innodb) - Cost changes printouts (main.opt_trace*) - Result order change (innodb_gis.rtree)	2023-02-10 12:58:50 +02:00
Oleksandr Byelkin	70a515df43	Merge branch '10.6.12' into 10.6	2023-02-06 20:18:44 +01:00
Vicențiu Ciorbaru	8885225de6	Implement multiple-signal debug_sync The patch is inspired from MySQL. Instead of using a single String to hold the current active debug_sync signal, use a Hash_set to store LEX_STRINGS. This patch ensures that a signal can not be lost, by being overwritten by another thread via set DEBUG_SYNC = '... SIGNAL ...'; All signals are kept "alive" until they are consumed by a wait event. This requires updating test cases that assume the GLOBAL signal is never consumed. Follow-up work needed: Port the additional syntax that allows one to set multiple signals and also conditionally deactivate signals when waiting.	2023-02-03 16:27:16 +02:00
Monty	1f4a9f086a	Removed "<select expression> INTO <destination>" deprecation. This was done after discussions with Igor, Sanja and Bar. The main reason for removing the deprecation was to ensure that MariaDB is always backward compatible whenever possible. Other things: - Added statistics counters, mainly for the feedback plugin. - INTO OUTFILE - INTO variable - If INTO is using the old syntax (end of query)	2023-02-03 11:57:50 +03:00
Monty	b74d2623eb	Removed diff dates from rdiff files	2023-02-03 11:57:45 +03:00
Monty	0dd9ec97d0	Changed a rule to be cost based in test_if_cheaper_ordering - Simplified test by setting read_time=DBL_MAX at start of loop if FORCE INDEX is used - No need to test for 'group by' as the cost compare should handle it. - Only one test change where index scan was replaced with table scan (correct)	2023-02-03 10:57:02 +03:00
Monty	727491b72a	Added test cases for preceding test This includes all test changes from "Changing all cost calculation to be given in milliseconds" and forwards. Some of the things that caused changes in the result files: - As part of fixing tests, I added 'echo' to some comments to more easily find out where things were wrong. - MATERIALIZED now has a higher cost compared to X than before. Because of this some MATERIALIZED types have changed to DEPENDENT SUBQUERY. - Some test cases that required MATERIALIZED to repeat a bug were changed by adding more rows to force MATERIALIZED to happen. - 'Filtered' in SHOW EXPLAIN has in many cases changed from 100.00 to something smaller. This is because now filtered also takes into account the smallest possible ref access and filters, even if they were not used. Another reason for 'Filtered' being smaller is that we now also take into account implicit filtering done for subqueries using FIRSTMATCH. (main.subselect_no_exists_to_in) This is calculated in best_access_path() and stored in records_out. - Table orders have changed because of more accurate costs. - 'index' and 'ALL' for small tables have changed to use 'range' or 'ref' because of optimizer_scan_setup_cost. - index can be changed to 'range' as 'range' optimizer assumes we don't have to read the blocks from disk that range optimizer has already read. This can be confusing in the case where there is no obvious where clause but instead there is a hidden 'key_column > NULL' added by the optimizer. (main.subselect_no_exists_to_in) - Scan on primary clustered key does not report 'Using Index' anymore (It's a table scan, not an index scan). - For derived tables, the number of rows is now 100 instead of 2, which can be seen in EXPLAIN. - More tests have "Using index for group by" as the cost of this optimization is now more correct (lower). 
- A primary key could be preferred for a normal key, even if it would access more rows, as it's faster to do 1 lookup and 3 'index_next' on a clustered primary key than one lookup through a secondary. (main.stat_tables_innodb) Notes: - There were 4.7% more calls to best_extension_by_limited_search() in the main.greedy_optimizer test. However examining the test results it looked that the plans were slightly better (eq_ref were more chained together) so I assume this is ok. - I have verified a few test cases where there were notable/unexpected changes in the plan and in all cases the new optimizer plans were faster. (main.greedy_optimizer and some others)	2023-02-03 00:00:35 +03:00
Monty	013ba37ae2	Fix cost calculation in test_if_cheaper_ordering() to be cost based The original code was mostly rule based and preferred clustered or covering indexes independent of cost. There were a few test changes: - Some tests changed from using filesort to index or table scan. This happened when most of the rows had to be sorted and the ORDER BY could use covering or a clustered index (innodb_mysql, create_spatial_index). - Some tests changed range to filesort. This was mainly because the range was scanning most of the rows or using index scan + row lookup and filesort with table scan is cheaper. (order_by). - Change in join_cache was because sorting 2 rows is faster than retrieving 10 rows. - In selectivity_innodb.test one test changed to use a cheaper index.	2023-02-02 23:08:23 +03:00
Monty	b6215b9b20	Update row and key fetch cost models to take into account data copy costs Before this patch, when calculating the cost of fetching and using a row/key from the engine, we took into account the cost of finding a row or key from the engine, but did not consistently take into account index only accesses, clustered keys or covered keys for all access paths. The cost of the WHERE clause (TIME_FOR_COMPARE) was not consistently considered in best_access_path(). TIME_FOR_COMPARE was used in calculation in other places, like greedy_search(), but was in some cases (like scans) done on a different number of rows than was accessed. The cost calculation of row and index scans didn't take into account the number of rows that were accessed, only the number of accepted rows. When using a filter, the cost of index_only_reads and cost of accessing and disregarding 'filtered rows' were not taken into account, which made filters cost less than they actually were. To remedy the above, the following key & row fetch related costs have been added: - The cost of fetching and using a row is now split into different costs: - key + Row fetch cost (as before) but multiplied with the variable 'optimizer_cache_cost' (default to 0.5). This allows the user to tell the optimizer the likelihood of finding the key and row in the engine cache. - ROW_COPY_COST, the cost of copying a row from the engine to the sql layer or creating a row from the join_cache to the record buffer. Mostly affects table scan costs. - ROW_LOOKUP_COST, the cost of fetching a row by rowid. - KEY_COPY_COST the cost of finding the next key and copying it from the engine to the SQL layer. This is used when we calculate the cost of index only reads. It makes index scans more expensive than before if they cover a lot of rows. (main.index_merge_myisam) - KEY_LOOKUP_COST, the cost of finding the first key in a range. This replaces the old define IDX_LOOKUP_COST, but with a higher cost. 
- KEY_NEXT_FIND_COST, the cost of finding the next key (and rowid). when doing an index scan and comparing the rowid to the filter. Before this cost was assumed to be 0. All of the above constants/variables are now tuned to be somewhat in proportion of executing complexity to each other. There is tuning needed for these in the future, but that can wait until the above are made user variables as that will make tuning much easier. To make the usage of the above easy, there are new (not virtual) cost calculation functions in handler: - ha_read_time(), like read_time(), but take optimizer_cache_cost into account. - ha_read_and_copy_time(), like ha_read_time() but take into account ROW_COPY_TIME - ha_read_and_compare_time(), like ha_read_and_copy_time() but take TIME_FOR_COMPARE into account. - ha_rnd_pos_time(). Read row with row id, taking ROW_COPY_COST into account. This is used with filesort where we don't need to execute the WHERE clause again. - ha_keyread_time(), like keyread_time() but take optimizer_cache_cost into account. - ha_keyread_and_copy_time(), like ha_keyread_time(), but add KEY_COPY_COST. - ha_key_scan_time(), like key_scan_time() but take optimizer_cache_cost into account. - ha_key_scan_and_compare_time(), like ha_key_scan_time(), but add KEY_COPY_COST & TIME_FOR_COMPARE. I also added some setup costs for doing different types of scans and creating temporary tables (on disk and in memory). This encourages the optimizer to not use these for simple 'a few row' lookups if there are adequate key lookup strategies. - TABLE_SCAN_SETUP_COST, cost of starting a table scan. - INDEX_SCAN_SETUP_COST, cost of starting an index scan. - HEAP_TEMPTABLE_CREATE_COST, cost of creating in memory temporary table. - DISK_TEMPTABLE_CREATE_COST, cost of creating an on disk temporary table. When calculating cost of fetching ranges, we had a cost of IDX_LOOKUP_COST (0.125) for doing a key div for a new range. 
This is now replaced with 'io_cost * KEY_LOOKUP_COST (1.0) * optimizer_cache_cost', which matches the cost we use for 'ref' and other key lookups. The effect is that the cost is now a bit higher when we have many ranges for a key. Almost all calculation with TIME_FOR_COMPARE is now done in best_access_path(). 'JOIN::read_time' now includes the full cost for finding the rows in the table. In the result files, many of the changes are now again close to what they were before the "Update cost for hash and cached joins" commit, as that commit didn't fix the filter cost (too complex to do everything in one commit). The above changes showed a lot of inconsistencies in optimizer cost calculation. The main objective with the other changes was to do calculation as similar (and accurate) as possible and to make different plans more comparable. Detailed list of changes: - Calculate index_only_cost consistently and correctly for all scan and ref accesses. The row fetch_cost and index_only_cost now take into account clustered keys, covered keys and index only accesses. - cost_for_index_read now returns both full cost and index_only_cost - Fixed cost calculation of get_sweep_read_cost() to match other similar costs. This is based on the assumption that data is more often stored on SSD than a hard disk. - Replaced constant 2.0 with new define TABLE_SCAN_SETUP_COST. - Some scan cost estimates did not take into account TIME_FOR_COMPARE. Now all scan costs take this into account. (main.show_explain) - Added session variable optimizer_cache_hit_ratio (default 50%). By adjusting this one can reduce or increase the cost of index or direct record lookups. The effect of the default is that key lookups are now a bit cheaper than before. See usage of 'optimizer_cache_cost' in handler.h. - JOIN_TAB::scan_time() did not take into account index only scans, which produced a wrong cost when index scan was used. 
Changed JOIN_TAB::scan_time() to take into consideration clustered and covered keys. The values are now cached and we only have to call this function once. Other calls are changed to use the cached values. Function renamed to JOIN_TAB::estimate_scan_time(). - Fixed that most index cost calculations are done the same way and more close to 'range' calculations. The cost is now lower than before for small data sets and higher for large data sets as we take into account how many keys are read (main.opt_trace_selectivity, main.limit_rows_examined). - Ensured that index_scan_cost() == range(scan_of_all_rows_in_table_using_one_range) + MULTI_RANGE_READ_INFO_CONST. One effect of this is that if there is choice of doing a full index scan and a range-index scan over almost the whole table then index scan will be preferred (no range-read setup cost). (innodb.innodb, main.show_explain, main.range) - Fixed that EQ_REF and REF take into account clustered and covered keys. This changes some plans to use covered or clustered indexes as these are much cheaper. (main.subselect_mat_cost, main.state_tables_innodb, main.limit_rows_examined) - Rowid filter setup cost and filter compare cost now take into account fetching and checking the rowid (KEY_NEXT_FIND_COST). (main.partition_pruning heap.heap_btree main.log_state) - Added KEY_NEXT_FIND_COST to Range_rowid_filter_cost_info::lookup_cost to account for the time to find and check the next key value against the container - Introduced ha_keyread_time(rows) that takes into account finding the next row and copying the key value to 'record' (KEY_COPY_COST). - Introduced ha_key_scan_time() for calculating an index scan over all rows. - Added IDX_LOOKUP_COST to keyread_time() as a startup cost. - Added index_only_fetch_cost() as a convenience function to OPT_RANGE. - keyread_time() cost is slightly reduced to prefer shorter keys. 
(main.index_merge_myisam) - All of the above caused some index_merge combinations to be rejected because of cost (main.index_intersect). In some cases 'ref' was replaced with index_merge because of the low cost calculation of get_sweep_read_cost(). - Some index usage moved from PRIMARY to a covering index. (main.subselect_innodb) - Changed cost calculation of filter to take KEY_LOOKUP_COST and TIME_FOR_COMPARE into account. See sql_select.cc::apply_filter(). filter parameters and costs are now written to optimizer_trace. - Don't use matchings_records_in_range() to try to estimate the number of filtered rows for ranges. The reason is that we want to ensure that 'range' is calculated similar to 'ref'. There is also more work needed to calculate the selectivity when using ranges and filtering. This causes filtering column in EXPLAIN EXTENDED to be 100.00 for some cases where range cannot use filtering. (main.rowid_filter) - Introduced ha_scan_time() that takes into account the CPU cost of finding the next row and copying the row from the engine to 'record'. This causes costs of table scan to slightly increase and some tests to change their plan from ALL to RANGE or ALL to ref. (innodb.innodb_mysql, main.select_pkeycache) In a few cases where scan time of very small tables have lower cost than a ref or range, things changed from ref/range to ALL. (main.myisam, main.func_group, main.limit_rows_examined, main.subselect2) - Introduced ha_scan_and_compare_time() which is like ha_scan_time() but also adds the cost of the where clause (TIME_FOR_COMPARE). - Added small cost for creating temporary table for materialization. This causes some very small tables to use scan instead of materialization. 
- Added checking of the WHERE clause (TIME_FOR_COMPARE) of the accepted rows to ROR costs in get_best_ror_intersect() - Removed '- 0.001' from 'join->best_read' and optimize_straight_join() to ensure that the 'Last_query_cost' status variable contains the same value as the one that was calculated by the optimizer. - Take avg_io_cost() into account in handler::keyread_time() and handler::read_time(). This should have no effect as it's 1.0 by default, except for heap that overrides these functions. - Some 'ref_or_null' accesses changed to 'range' because of cost adjustments (main.order_by) - Added scan type "scan_with_join_cache" for optimizer_trace. This is just to show in the trace what kind of scan was used. - When using 'scan_with_join_cache' take into account number of preceding tables (as have to restore all fields for all previous table combinations when checking the where clause) The new cost added is: (row_combinations * ROW_COPY_COST * number_of_cached_tables). This increases the cost of join buffering in proportion of the number of tables in the join buffer. One effect is that full scans are now done earlier as the cost is then smaller. (main.join_outer_innodb, main.greedy_optimizer) - Removed the usage of 'worst_seeks' in cost_for_index_read as it caused wrong plans to be created; It preferred JT_EQ_REF even if it would be much more expensive than a full table scan. A related issue was that worst_seeks only applied to full lookup, not to clustered or index only lookups, which is not consistent. This caused some plans to use index scan instead of eq_ref (main.union) - Changed federated block size from 4096 to 1500, which is the typical size of an IO packet. - Added costs for reading rows to Federated. Needed as there is no caching of rows in the federated engine. - Added ha_innobase::rnd_pos_time() cost function. - A lot of extra things added to optimizer trace - More costs, especially for materialization and index_merge. 
- Make lables more uniform - Fixed a lot of minor bugs - Added 'trace_started()' around a lot of trace blocks. - When calculating ORDER BY with LIMIT cost for using an index the cost did not take into account the number of row retrivals that has to be done or the cost of comparing the rows with the WHERE clause. The cost calculated would be just a fraction of the real cost. Now we calculate the cost as we do for ranges and 'ref'. - 'Using index for group-by' is used a bit more than before as now take into account the WHERE clause cost when comparing with 'ref' and prefer the method with fewer row combinations. (main.group_min_max). Bugs fixed: - Fixed that we don't calculate TIME_FOR_COMPARE twice for some plans, like in optimize_straight_join() and greedy_search() - Fixed bug in save_explain_data where we could test for the wrong index when displaying 'Using index'. This caused some old plans to show 'Using index'. (main.subselect_innodb, main.subselect2) - Fixed bug in get_best_ror_intersect() where 'min_cost' was not updated, and the cost we compared with was not the one that was used. - Fixed very wrong cost calculation for priority queues in check_if_pq_applicable(). (main.order_by now correctly uses priority queue) - When calculating cost of EQ_REF or REF, we added the cost of comparing the WHERE clause with the found rows, not all row combinations. This made ref and eq_ref to be regarded way to cheap compared to other access methods. - FORCE INDEX cost calculation didn't take into account clustered or covered indexes. - JT_EQ_REF cost was estimated as avg_io_cost(), which is half the cost of a JT_REF key. This may be true for InnoDB primary key, but not for other unique keys or other engines. Now we use handler function to calculate the cost, which allows us to handle consistently clustered, covered keys and not covered keys. - ha_start_keyread() didn't call extra_opt() if keyread was already enabled but still changed the 'keyread' variable (which is wrong). 
Fixed by not doing anything if keyread is already enabled. - multi_range_read_info_cost() didn't take into account io_cost when calculating the cost of ranges. - fix_semijoin_strategies_for_picked_join_order() used the wrong record_count when calling best_access_path() for SJ_OPT_FIRST_MATCH and SJ_OPT_LOOSE_SCAN. - Hash joins didn't provide correct best_cost to the upper level, which means that the cost for hash_joins was more expensive than calculated in best_access_path (a difference of 10x * TIME_OF_COMPARE). This is fixed in the new code because we now include TIME_OF_COMPARE cost in 'read_time'. Other things: - Added some 'if (thd->trace_started())' to speed up code - Removed not used function Cost_estimate::is_zero() - Simplified testing of HA_POS_ERROR in get_best_ror_intersect(). (No cost changes) - Moved ha_start_keyread() from join_read_const_table() to join_read_const() to enable keyread for all types of JT_CONST tables. - Made a few very short functions inline in handler.h Notes: - In main.rowid_filter the join order of order and lineitem is swapped. This is because the cost of doing a range fetch of lineitem(98 rows) is almost as big as the whole join of order,lineitem. The filtering will also ensure that we only have to do very small key fetches of the rows in lineitem. - main.index_merge_myisam had a few changes where we are now using fewer keys for index_merge. This is because index scans are now more expensive than before. - handler->optimizer_cache_cost is updated in ha_external_lock(). This ensures that it is up to date per statement. Not an optimal solution (for locked tables), but should be ok for now. - 'DELETE FROM t1 WHERE t1.a > 0 ORDER BY t1.a' does not take cost of filesort into consideration when table scan is chosen. (main.myisam_explain_non_select_all) - perfschema.table_aggregate_global_* has changed because an update on a table with 1 row will now use table scan instead of key lookup. 
TODO in upcoming commits: - Fix selectivity calculation for ranges with and without filtering and when there is a ref access but scan is chosen. For this we have to store the lowest known value for 'accepted_records' in the OPT_RANGE structure. - Change that records_read does not include filtered rows. - test_if_cheaper_ordering() needs to be updated to properly calculate costs. This will fix tests like main.order_by_innodb, main.single_delete_update - Extend get_range_limit_read_cost() to take into consideration cost_for_index_read() if there were no quick keys. This will reduce the computed cost for ORDER BY with LIMIT in some cases. (main.innodb_ext_key) - Fix that we take into account selectivity when counting the number of rows we have to read when considering using an index table scan to resolve ORDER BY. - Add new calculation for rnd_pos_time() where we take into account the benefit of reading multiple rows from the same page.	2023-02-02 21:43:30 +03:00
Monty	956980971f	Update cost for hash and cached joins The old code did not correctly add TIME_FOR_COMPARE to rows that are part of the scan that will be compared with the attached where clause. Now the cost calculation for hash join and full join cache join are identical except for HASH_FANOUT (10%) The cost for a join with keys is now also uniform. The total cost for using a key for lookup is calculated in one place as: (cost_of_finding_rows_through_key(records) + records/TIME_FOR_COMPARE)* record_count_of_previous_row_combinations + startup_cost startup_cost is the cost of creating a temporary table (if needed) Best_cost now includes the cost of comparing all WHERE clauses and also cost of joining with previous row combinations. Other things: - Optimizer trace is now printing the total costs, including testing the WHERE clause (TIME_FOR_COMPARE) and comparing with all previous rows. - In optimizer trace, include also total cost of query together with the final join order. This makes it easier to find out where the cost was calculated. - Old code used filter even if the cost for it was higher than not using a filter. This is not corrected. - When rebasing on 10.11, I noticed some changes to access_cost_factor calculation. These changes were not picked as the coming changes to filtering will make that code obsolete.	2023-02-02 20:49:35 +03:00
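The key-lookup cost formula quoted in commit 956980971f can be worked through with a small sketch. This is not the server's code: the struct, the function name, and the idea of passing TIME_FOR_COMPARE as a parameter are all hypothetical, chosen only to mirror the terms of the formula in the commit message.

```cpp
#include <cassert>

// Hypothetical container for the terms of the formula in the commit message.
struct KeyLookupTerms {
  double find_rows_cost;         // cost_of_finding_rows_through_key(records)
  double records;                // rows found per lookup
  double prev_row_combinations;  // record_count_of_previous_row_combinations
  double startup_cost;           // e.g. creating a temporary table, if needed
};

// (find_rows_cost + records / time_for_compare) * prev_row_combinations
//   + startup_cost
// time_for_compare stands in for TIME_FOR_COMPARE; its value is supplied by
// the caller here because this sketch does not assume the server's constant.
double total_key_lookup_cost(const KeyLookupTerms &t, double time_for_compare) {
  assert(time_for_compare > 0);
  return (t.find_rows_cost + t.records / time_for_compare) *
             t.prev_row_combinations +
         t.startup_cost;
}
```

With find_rows_cost=2, records=10, 100 previous row combinations, startup_cost=3 and an assumed divisor of 5, the result is (2 + 10/5) * 100 + 3 = 403. The WHERE comparison term is multiplied by the previous row combinations along with the lookup cost, which is the point the commit message makes about Best_cost.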
Oleksandr Byelkin	cafba8761a	Merge branch '10.10' into 10.11	2023-02-01 18:28:03 +01:00
Oleksandr Byelkin	cc8b9bcee3	Merge branch '10.9' into 10.10	2023-02-01 17:53:45 +01:00
Julius Goryavsky	e3e72644cf	MDEV-30452: ssl error: unexpected EOF while reading This commit contains fixes for error codes, which are needed because OpenSSL 3.x and recent versions of GnuTLS have changed the indication of error codes when the peer does not send close_notify before closing the connection.	2023-02-01 17:50:29 +01:00
Oleksandr Byelkin	260f1fe7c3	Merge branch '10.8' into 10.9	2023-02-01 17:21:42 +01:00
Oleksandr Byelkin	d4310eb96a	Merge branch '10.7' into 10.8	2023-02-01 17:19:48 +01:00
Oleksandr Byelkin	bc656c4fa5	Merge branch '10.6' into 10.7	2023-02-01 16:29:16 +01:00
Marko Mäkelä	1c926b6263	MDEV-30527 Assertion !m_freed_pages in mtr_t::start() on DROP TEMPORARY TABLE mtr_t::commit(): Add special handling of innodb_immediate_scrub_data_uncompressed for TEMPORARY TABLE. This fixes a regression that was caused by commit `de4030e4d4` (MDEV-30400).	2023-02-01 10:55:49 +02:00
Monty	c443dbff0e	Ensure that test_quick_select doesn't return more rows than in the table Other changes: - In test_quick_select(), assume that if table->used_stats_records is 0 then the table has 0 rows. - Fixed prepare_simple_select() to populate table->used_stat_records - Ensure that set_statistics_for_tables() doesn't cause used_stats_records to be 0 when using stat_tables. - To get blackhole to work with replication, set stats.records to 2 so that test_quick_select() doesn't assume the table is empty.	2023-01-30 15:22:20 +02:00
Marko Mäkelä	75c78316d6	Merge 10.11 into 11.0	2023-01-25 10:17:54 +02:00
Marko Mäkelä	10635c2833	Merge 10.10 into 10.11	2023-01-24 15:17:39 +02:00
Marko Mäkelä	51fc6b91d2	Merge 10.9 into 10.10	2023-01-24 15:17:10 +02:00
Marko Mäkelä	4d9fe4032b	Merge 10.8 into 10.9	2023-01-24 14:59:42 +02:00
Marko Mäkelä	fa543a0f62	Merge 10.7 into 10.8	2023-01-24 14:52:25 +02:00
Marko Mäkelä	cea50896d2	Merge 10.6 into 10.7	2023-01-24 14:35:36 +02:00
Marko Mäkelä	de4030e4d4	MDEV-30400 Assertion height == btr_page_get_level(...) on INSERT This also fixes part of MDEV-29835 Partial server freeze which is caused by violations of the latching order that was defined in https://dev.mysql.com/worklog/task/?id=6326 (WL#6326: InnoDB: fix index->lock contention). Unless the current thread is holding an exclusive dict_index_t::lock, it must acquire page latches in a strict parent-to-child, left-to-right order. Not all cases of MDEV-29835 are fixed yet. Failure to follow the correct latching order will cause deadlocks of threads due to lock order inversion. As part of these changes, the BTR_MODIFY_TREE mode is modified so that an Update latch (U a.k.a. SX) will be acquired on the root page, and eXclusive latches (X) will be acquired on all pages leading to the leaf page, as well as any left and right siblings of the pages along the path. The DEBUG_SYNC test innodb.innodb_wl6326 will be removed, because at the time the DEBUG_SYNC point is hit, the thread is actually holding several page latches that will be blocking a concurrent SELECT statement. We also remove double bookkeeping that was caused due to excessive information hiding in mtr_t::m_memo. We simply let mtr_t::m_memo store information of latched pages, and ensure that mtr_memo_slot_t::object is never a null pointer. The tree_blocks[] and tree_savepoints[] were redundant. buf_page_get_low(): If innodb_change_buffering_debug=1, to avoid a hang, do not try to evict blocks if we are holding a latch on a modified page. The test innodb.innodb-change-buffer-recovery will be removed, because change buffering may no longer be forced by debug injection when the change buffer comprises multiple pages. Remove a debug assertion that could fail when innodb_change_buffering_debug=1 fails to evict a page. For other cases, the assertion is redundant, because we already checked that right after the got_block: label. 
The test innodb.innodb-change-buffering-recovery will be removed, because due to this change, we will be unable to evict the desired page. mtr_t::lock_register(): Register a change of a page latch on an unmodified buffer-fixed block. mtr_t::x_latch_at_savepoint(), mtr_t::sx_latch_at_savepoint(): Replaced by the use of mtr_t::upgrade_buffer_fix(), which now also handles RW_S_LATCH. mtr_t::set_modified(): For temporary tables, invoke buf_page_t::set_modified() here and not in mtr_t::commit(). We will never set the MTR_MEMO_MODIFY flag on other than persistent data pages, nor set mtr_t::m_modifications when temporary data pages are modified. mtr_t::commit(): Only invoke the buf_flush_note_modification() loop if persistent data pages were modified. mtr_t::get_already_latched(): Look up a latched page in mtr_t::m_memo. This avoids many redundant entries in mtr_t::m_memo, as well as redundant calls to buf_page_get_gen() for blocks that had already been looked up in a mini-transaction. btr_get_latched_root(): Return a pointer to an already latched root page. This replaces btr_root_block_get() in cases where the mini-transaction has already latched the root page. btr_page_get_parent(): Fetch a parent page that was already latched in BTR_MODIFY_TREE, by invoking mtr_t::get_already_latched(). If needed, upgrade the root page U latch to X. This avoids bloating mtr_t::m_memo as well as performing redundant buf_pool.page_hash lookups. For non-QUICK CHECK TABLE as well as for B-tree defragmentation, we will invoke btr_cur_search_to_nth_level(). btr_cur_search_to_nth_level(): This will only be used for non-leaf (level>0) B-tree searches that were formerly named BTR_CONT_SEARCH_TREE or BTR_CONT_MODIFY_TREE. In MDEV-29835, this function could be removed altogether, or retained for the case of CHECK TABLE without QUICK. btr_cur_t::left_block: Remove. btr_pcur_move_backward_from_page() can retrieve the left sibling from the end of mtr_t::m_memo. btr_cur_t::open_leaf(): Some clean-up. 
btr_cur_t::search_leaf(): Replaces btr_cur_search_to_nth_level() for searches to level=0 (the leaf level). We will never release parent page latches before acquiring leaf page latches. If we need to temporarily release the level=1 page latch in the BTR_SEARCH_PREV or BTR_MODIFY_PREV latch_mode, we will reposition the cursor on the child node pointer so that we will land on the correct leaf page. btr_cur_t::pessimistic_search_leaf(): Implement new BTR_MODIFY_TREE latching logic in the case that page splits or merges will be needed. The parent pages (and their siblings) should already be latched on the first dive to the leaf and be present in mtr_t::m_memo; there should be no need for BTR_CONT_MODIFY_TREE. This pre-latching almost suffices; it must be revised in MDEV-29835 and work-arounds removed for cases where mtr_t::get_already_latched() fails to find a block. rtr_search_to_nth_level(): A SPATIAL INDEX version of btr_search_to_nth_level() that can search to any level (including the leaf level). rtr_search_leaf(), rtr_insert_leaf(): Wrappers for rtr_search_to_nth_level(). rtr_search(): Replaces rtr_pcur_open(). rtr_latch_leaves(): Replaces btr_cur_latch_leaves(). Note that unlike in the B-tree code, there is no error handling in case the sibling pages are corrupted. rtr_cur_restore_position(): Remove an unused constant parameter. btr_pcur_open_on_user_rec(): Remove the constant parameter mode=PAGE_CUR_GE. row_ins_clust_index_entry_low(): Use a new mode=BTR_MODIFY_ROOT_AND_LEAF to gain access to the root page when mode!=BTR_MODIFY_TREE, to write the PAGE_ROOT_AUTO_INC. BTR_SEARCH_TREE, BTR_CONT_SEARCH_TREE: Remove. BTR_CONT_MODIFY_TREE: Note that this is only used by rtr_search_to_nth_level(). btr_pcur_optimistic_latch_leaves(): Replaces btr_cur_optimistic_latch_leaves(). ibuf_delete_rec(): Acquire exclusive ibuf.index->lock in order to avoid a deadlock with ibuf_insert_low(BTR_MODIFY_PREV). 
btr_blob_log_check_t(): Acquire a U latch on the root page, so that btr_page_alloc() in btr_store_big_rec_extern_fields() will avoid a deadlock. btr_store_big_rec_extern_fields(): Assert that the root page latch is being held. Tested by: Matthias Leich Reviewed by: Vladislav Lesin	2023-01-24 14:09:21 +02:00
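The latching rule stated in commit de4030e4d4 (strict parent-to-child, left-to-right order unless the exclusive dict_index_t::lock is held) can be illustrated with a toy checker. This is only a sketch under simplifying assumptions: LatchedPage and follows_latching_order() are hypothetical names, a page is modelled by just its B-tree level (0 = leaf) and its left-to-right position within that level, and nothing here reflects how mtr_t::m_memo actually records latches.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of one latch acquisition.
struct LatchedPage {
  unsigned level;     // B-tree level; 0 is the leaf level, the root is highest
  unsigned position;  // left-to-right position among pages of the same level
};

// Returns true if the acquisitions go parent-to-child and, within a single
// level, strictly left-to-right.
bool follows_latching_order(const std::vector<LatchedPage> &acquired) {
  for (std::size_t i = 1; i < acquired.size(); i++) {
    const LatchedPage &prev = acquired[i - 1];
    const LatchedPage &cur = acquired[i];
    if (cur.level > prev.level)
      return false;  // a higher-level (parent) page latched after a lower one
    if (cur.level == prev.level && cur.position <= prev.position)
      return false;  // right-to-left or repeated latch within one level
  }
  return true;
}
```

Two threads that both pass such a check acquire any shared pair of pages in the same relative order, which is why a violation is described in the commit message as a lock order inversion that can deadlock.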
Marko Mäkelä	e41fb3697c	Revert "MDEV-30400 Assertion height == btr_page_get_level(...) on INSERT" This reverts commit `f9cac8d2cb` which was accidentally pushed prematurely.	2023-01-23 14:52:49 +02:00
Marko Mäkelä	851c56771e	Merge 10.5 into 10.6	2023-01-23 13:15:41 +02:00
Marko Mäkelä	1bbf37e0db	MDEV-515: Improve test coverage Cover dict_index_t::clear() for TEMPORARY TABLE	2023-01-23 13:05:52 +02:00
Thirunarayanan Balathandayuthapani	647a7232ff	MDEV-30438 innodb.undo_truncate,4k fails when innodb-immediate-scrub-data-uncompressed is enabled - InnoDB fails to clear the freed ranges during truncation of innodb undo log tablespace. During shutdown, InnoDB flushes the freed page ranges and throws the out of bound error. mtr_t::commit_shrink(): clear the freed ranges while doing undo tablespace truncation	2023-01-23 09:55:49 +05:30
Marko Mäkelä	f9cac8d2cb	MDEV-30400 Assertion height == btr_page_get_level(...) on INSERT This also fixes part of MDEV-29835 Partial server freeze which is caused by violations of the latching order that was defined in https://dev.mysql.com/worklog/task/?id=6326 (WL#6326: InnoDB: fix index->lock contention). Unless the current thread is holding an exclusive dict_index_t::lock, it must acquire page latches in a strict parent-to-child, left-to-right order. Not all cases are fixed yet. Failure to follow the correct latching order will cause deadlocks of threads due to lock order inversion. As part of these changes, the BTR_MODIFY_TREE mode is modified so that an Update latch (U a.k.a. SX) will be acquired on the root page, and eXclusive latches (X) will be acquired on all pages leading to the leaf page, as well as any left and right siblings of the pages along the path. The test innodb.innodb_wl6326 will be removed, because at the time the DEBUG_SYNC point is hit, the thread is actually holding several page latches that will be blocking a concurrent SELECT statement. We also remove double bookkeeping that was caused due to excessive information hiding in mtr_t::m_memo. We simply let mtr_t::m_memo store information of latched pages, and ensure that mtr_memo_slot_t::object is never a null pointer. The tree_blocks[] and tree_savepoints[] were redundant. mtr_t::get_already_latched(): Look up a latched page in mtr_t::m_memo. This avoids many redundant entries in mtr_t::m_memo, as well as redundant calls to buf_page_get_gen() for blocks that had already been looked up in a mini-transaction. btr_get_latched_root(): Return a pointer to an already latched root page. This replaces btr_root_block_get() in cases where the mini-transaction has already latched the root page. btr_page_get_parent(): Fetch a parent page that was already latched in BTR_MODIFY_TREE, by invoking mtr_t::get_already_latched(). If needed, upgrade the root page U latch to X. 
This avoids bloating mtr_t::m_memo as well as redundant buf_pool.page_hash lookups. For non-QUICK CHECK TABLE as well as for B-tree defragmentation, we will invoke btr_cur_search_to_nth_level(). btr_cur_search_to_nth_level(): This will only be used for non-leaf (level>0) B-tree searches that were formerly named BTR_CONT_SEARCH_TREE or BTR_CONT_MODIFY_TREE. In MDEV-29835, this function could be removed altogether, or retained for the case of CHECK TABLE without QUICK. btr_cur_t::search_leaf(): Replaces btr_cur_search_to_nth_level() for searches to level=0 (the leaf level). btr_cur_t::pessimistic_search_leaf(): Implement the new BTR_MODIFY_TREE latching logic in the case that page splits or merges will be needed. The parent pages (and their siblings) should already be latched on the first dive to the leaf and be present in mtr_t::m_memo; there should be no need for BTR_CONT_MODIFY_TREE. This pre-latching almost suffices; MDEV-29835 will have to revise it and remove work-arounds where mtr_t::get_already_latched() fails to find a block. rtr_search_to_nth_level(): A SPATIAL INDEX version of btr_search_to_nth_level() that can search to any level (including the leaf level). rtr_search_leaf(), rtr_insert_leaf(): Wrappers for rtr_search_to_nth_level(). rtr_search(): Replaces rtr_pcur_open(). rtr_cur_restore_position(): Remove an unused constant parameter. btr_pcur_open_on_user_rec(): Remove the constant parameter mode=PAGE_CUR_GE. btr_cur_latch_leaves(): Update a pre-existing mtr_t::m_memo entry for the current leaf page. row_ins_clust_index_entry_low(): Use a new mode=BTR_MODIFY_ROOT_AND_LEAF to gain access to the root page when mode!=BTR_MODIFY_TREE, to write the PAGE_ROOT_AUTO_INC. btr_cur_t::open_leaf(): Some clean-up. mtr_t::lock_register(): Register a page latch on a buffer-fixed block. BTR_SEARCH_TREE, BTR_CONT_SEARCH_TREE: Remove. BTR_CONT_MODIFY_TREE: Note that this is only used by rtr_search_to_nth_level(). 
btr_pcur_optimistic_latch_leaves(): Replaces btr_cur_optimistic_latch_leaves(). ibuf_delete_rec(): Acquire ibuf.index->lock.u_lock() in order to avoid a deadlock with ibuf_insert_low(BTR_MODIFY_PREV). Tested by: Matthias Leich	2023-01-19 17:19:18 +02:00
Oleksandr Byelkin	66bd8cd6c3	Merge branch '10.10' into 10.11	2023-01-18 16:58:28 +01:00
Oleksandr Byelkin	45087dd0b3	Merge branch '10.9' into 10.10	2023-01-18 16:45:59 +01:00
Oleksandr Byelkin	08d4968404	Merge branch '10.8' into 10.9	2023-01-18 16:39:11 +01:00
Oleksandr Byelkin	26d8485244	Merge branch '10.7' into 10.8	2023-01-18 16:37:40 +01:00
Oleksandr Byelkin	795ff0daf0	Merge branch '10.6' into 10.7	2023-01-18 16:36:13 +01:00
Marko Mäkelä	a8c5635cf1	Merge 10.5 into 10.6	2023-01-17 20:02:29 +02:00
Sergei Golubchik	a5eff044cb	MDEV-22602 Disable UPDATE CASCADE for SQL constraints fix it for named constraints too	2023-01-17 15:28:56 +01:00
Marko Mäkelä	44dce3b207	MDEV-29986 Set innodb_undo_tablespaces=3 by default Starting with commit `baf276e6d4` (MDEV-19229) the parameter innodb_undo_tablespaces can be increased from its previous default value 0 while allowing an upgrade from old databases. We will change the default setting to innodb_undo_tablespaces=3 so that the space occupied by possible bursts of undo log records can be reclaimed after SET GLOBAL innodb_undo_log_truncate=ON. We will not enable innodb_undo_log_truncate by default, because it causes some observable performance degradation. Special thanks to Thirunarayanan Balathandayuthapani for diagnosing and fixing a number of bugs related to this new default setting. Tested by: Matthias Leich, Axel Schwenke, Vladislav Vaintroub (with both values of innodb_undo_log_truncate)	2023-01-13 12:46:30 +02:00
Marko Mäkelä	f27e9c8947	MDEV-29694 Remove the InnoDB change buffer The purpose of the change buffer was to reduce random disk access, which could be useful on rotational storage, but maybe less so on solid-state storage. When we wished to (1) insert a record into a non-unique secondary index, (2) delete-mark a secondary index record, (3) delete a secondary index record as part of purge (but not ROLLBACK), and the B-tree leaf page where the record belongs to is not in the buffer pool, we inserted a record into the change buffer B-tree, indexed by the page identifier. When the page was eventually read into the buffer pool, we looked up the change buffer B-tree for any modifications to the page, applied these upon the completion of the read operation. This was called the insert buffer merge. We remove the change buffer, because it has been the source of various hard-to-reproduce corruption bugs, including those fixed in commit `5b9ee8d819` and commit `165564d3c3` but not limited to them. A downgrade will fail with a clear message starting with commit `db14eb16f9` (MDEV-30106). buf_page_t::state: Merge IBUF_EXIST to UNFIXED and WRITE_FIX_IBUF to WRITE_FIX. buf_pool_t::watch[]: Remove. trx_t: Move isolation_level, check_foreigns, check_unique_secondary, bulk_insert into the same bit-field. The only purpose of trx_t::check_unique_secondary is to enable bulk insert into an empty table. It no longer enables insert buffering for UNIQUE INDEX. btr_cur_t::thr: Remove. This field was originally needed for change buffering. Later, its use was extended to cover SPATIAL INDEX. Much of the time, rtr_info::thr holds this field. When it does not, we will add parameters to SPATIAL INDEX specific functions. ibuf_upgrade_needed(): Check if the change buffer needs to be updated. ibuf_upgrade(): Merge and upgrade the change buffer after all redo log has been applied. 
Free any pages consumed by the change buffer, and zero out the change buffer root page to mark the upgrade completed, and to prevent a downgrade to an earlier version. dict_load_tablespaces(): Renamed from dict_check_tablespaces_and_store_max_id(). This needs to be invoked before ibuf_upgrade(). btr_cur_open_at_rnd_pos(): Specialize for use in persistent statistics. The change buffer merge does not need this function anymore. btr_page_alloc(): Renamed from btr_page_alloc_low(). We no longer allocate any change buffer pages. btr_cur_open_at_rnd_pos(): Specialize for use in persistent statistics. The change buffer merge does not need this function anymore. row_search_index_entry(), btr_lift_page_up(): Add a parameter thr for the SPATIAL INDEX case. rtr_page_split_and_insert(): Specialized from btr_page_split_and_insert(). rtr_root_raise_and_insert(): Specialized from btr_root_raise_and_insert(). Note: The support for upgrading from the MySQL 3.23 or MySQL 4.0 change buffer format that predates the MySQL 4.1 introduction of the option innodb_file_per_table was removed in MySQL 5.6.5 as part of mysql/mysql-server@69b6241a79 and MariaDB 10.0.11 as part of `1d0f70c2f8`. In the tests innodb.log_upgrade and innodb.log_corruption, we create valid (upgraded) change buffer pages. Tested by: Matthias Leich	2023-01-11 17:59:36 +02:00
Marko Mäkelä	e581396b7a	MDEV-29983 Deprecate innodb_file_per_table Before commit `6112853cda` in MySQL 4.1.1 introduced the parameter innodb_file_per_table, all InnoDB data was written to the InnoDB system tablespace (often named ibdata1). A serious design problem is that once the system tablespace has grown to some size, it cannot shrink even if the data inside it has been deleted. There are also other design problems, such as the server hang MDEV-29930 that should only be possible when using innodb_file_per_table=0 and innodb_undo_tablespaces=0 (storing both tables and undo logs in the InnoDB system tablespace). The parameter innodb_change_buffering was deprecated in commit `b5852ffbee`. Starting with commit `baf276e6d4` (MDEV-19229) the number of innodb_undo_tablespaces can be increased, so that the undo logs can be moved out of the system tablespace of an existing installation. If all these things (tables, undo logs, and the change buffer) are removed from the InnoDB system tablespace, the only variable-size data structure inside it is the InnoDB data dictionary. DDL operations on .ibd files were optimized in commit `86dc7b4d4c` (MDEV-24626). That should have removed any thinkable performance advantage of using innodb_file_per_table=0. Since there should be no benefit of setting innodb_file_per_table=0, the parameter should be deprecated. Starting with MySQL 5.6 and MariaDB Server 10.0, the default value is innodb_file_per_table=1.	2023-01-11 17:55:56 +02:00
Marko Mäkelä	3a237f7666	Merge 10.10 into 10.11	2023-01-11 11:13:56 +02:00
Marko Mäkelä	cae5a0328b	Merge 10.9 into 10.10	2023-01-10 15:06:25 +02:00
Marko Mäkelä	820ebcec86	Merge 10.8 into 10.9	2023-01-10 14:50:58 +02:00
Marko Mäkelä	92c8d6f168	Merge 10.7 into 10.8 The MDEV-25004 test innodb_fts.versioning is omitted because ever since commit `685d958e38` InnoDB would not allow writes to a database where the redo log file ib_logfile0 is missing.	2023-01-10 14:42:50 +02:00
Marko Mäkelä	8356fb68c3	Merge 10.6 into 10.7	2023-01-04 14:52:25 +02:00
Michael Roosz	b5a54e8a93	MDEV-30321: blob data corrupted by row_merge_write_blob_to_tmp_file()	2023-01-04 16:21:07 +05:30
Marko Mäkelä	fe38d7cad4	Remove redundant statements from a test	2023-01-04 10:04:58 +02:00
Marko Mäkelä	e441c32a0b	Merge 10.5 into 10.6	2023-01-03 18:13:11 +02:00
Marko Mäkelä	8b9b4ab3f5	Merge 10.4 into 10.5	2023-01-03 17:08:42 +02:00
Marko Mäkelä	fb0808c450	Merge 10.3 into 10.4	2023-01-03 16:10:02 +02:00
Vlad Lesin	3ddc00dc3b	MDEV-30225 RR isolation violation with locking unique search Before the fix, a next-key lock was requested only if a record was delete-marked for a locking unique search in RR isolation level. There can be several delete-marked records for the same unique key, which is why InnoDB scans the records until either a non-delete-marked record is reached or all delete-marked records with the same unique key are scanned. For a range scan, next-key locks are used for RR to protect the scanned range from the insertion of new records by other transactions. This is the reason why next-key locks are used for delete-marked records for unique searches. If a record is not delete-marked, the requested lock type was "not-gap". When a record is not delete-marked during the lock request by trx 1, and some other transaction holds a conflicting lock, trx 1 creates a waiting not-gap lock on the record and suspends. While trx 1 is suspended, the record can be delete-marked. When the lock is granted on the conflicting transaction's commit or rollback, its type is still "not-gap". So we have a "not-gap" lock on a delete-marked record for RR. This lets some other transaction insert a record with the same unique key while trx 1 is not committed, which can cause an isolation level violation. The fix is to set next-key locks for both delete-marked and non-delete-marked records for unique search in RR.	2022-12-20 11:31:49 +03:00
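The change described in commit 3ddc00dc3b comes down to which lock type a locking unique search requests under RR. A minimal sketch of the before and after decision follows; the enum and function names are hypothetical and do not correspond to InnoDB's actual lock flags or functions.

```cpp
// Hypothetical lock types, named after the terms used in the commit message.
enum class RecLock { NotGap, NextKey };

// Before the fix: a next-key lock only when the record was delete-marked at
// the time of the request. A record that became delete-marked while the
// requester was waiting would keep its not-gap lock.
RecLock unique_search_lock_before_fix(bool delete_marked_at_request) {
  return delete_marked_at_request ? RecLock::NextKey : RecLock::NotGap;
}

// After the fix: always a next-key lock, so the outcome no longer depends on
// the delete-mark state observed when the lock was requested.
RecLock unique_search_lock_after_fix(bool /*delete_marked_at_request*/) {
  return RecLock::NextKey;
}
```

Making the decision independent of the delete-mark flag removes the window in which the flag changes between the lock request and the lock grant.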
Marko Mäkelä	c562ccf796	MDEV-30233 DROP DATABASE test fails: Directory not empty Some tests drop the default mtr database "test". This may fail due to the directory not being empty. InnoDB may not delete all tables immediately, due to the "background drop table queue" or its replacement in commit `1bd681c8b3` (the purge of history would clean up after a DDL operation during which the server was killed). Let us try to avoid "drop database test" whenever it is easily possible. Where it is not, SET GLOBAL innodb_max_purge_lag_wait=0 will ensure that the replacement of the "background drop table queue" will have completed its job.	2022-12-15 11:14:23 +02:00
Marko Mäkelä	0aca3012a1	Merge 10.10 into 10.11	2022-12-14 09:18:30 +02:00
Marko Mäkelä	fa389b9098	Merge 10.9 into 10.10	2022-12-14 08:57:39 +02:00
Marko Mäkelä	b7914f562d	Merge 10.8 into 10.9	2022-12-13 18:24:51 +02:00
Marko Mäkelä	d7a4ce3c80	Merge 10.7 into 10.8	2022-12-13 18:11:24 +02:00
Marko Mäkelä	25b91c3f13	Merge 10.6 into 10.7	2022-12-13 18:01:49 +02:00
Marko Mäkelä	a8a5c8a1b8	Merge 10.5 into 10.6	2022-12-13 16:58:58 +02:00
Marko Mäkelä	1dc2f35598	Merge 10.4 into 10.5	2022-12-13 14:39:18 +02:00
Marko Mäkelä	fdf43b5c78	Merge 10.3 into 10.4	2022-12-13 11:37:33 +02:00
Marko Mäkelä	782b2a7500	MDEV-29144 ER_TABLE_SCHEMA_MISMATCH or crash on DISCARD/IMPORT mysql_discard_or_import_tablespace(): On successful ALTER TABLE...DISCARD TABLESPACE, evict the table handle from the table definition cache, so that ha_innobase::close() will be invoked, like InnoDB expects to be the case. This will avoid an assertion failure ut_a(table->get_ref_count() == 0) during IMPORT TABLESPACE. ha_innobase::open(): Do not issue any ER_TABLESPACE_DISCARDED warning. Member functions for DML will do that. ha_innobase::truncate(), ha_innobase::check_if_supported_inplace_alter(): Issue ER_TABLESPACE_DISCARDED warnings, to compensate for the removal of the warning in ha_innobase::open(). row_quiesce_write_indexes(): Only write information about committed indexes. The ALTER TABLE t NOWAIT ADD INDEX(c) in the nondeterministic test case will most of the time fail due to a metadata lock (MDL) timeout and leave behind an uncommitted index. Reviewed by: Sergei Golubchik	2022-12-09 10:42:19 +02:00
Marko Mäkelä	64071d30bd	Merge 10.10 into 10.11	2022-12-07 10:00:52 +02:00
Marko Mäkelä	3ff4eb07ed	Merge 10.9 into 10.10	2022-12-07 09:49:38 +02:00
Marko Mäkelä	23f705f3a2	Merge 10.8 into 10.9	2022-12-07 09:43:38 +02:00
Marko Mäkelä	b3c254339b	Merge 10.7 into 10.8	2022-12-07 09:43:13 +02:00
Marko Mäkelä	9e27e53dfa	Merge 10.6 into 10.7	2022-12-07 09:39:46 +02:00
Marko Mäkelä	e55397a46d	Merge 10.5 into 10.6	2022-12-05 18:04:23 +02:00
Jan Lindström	4eb8e51c26	Merge 10.4 into 10.5	2022-11-30 13:10:52 +02:00
Marko Mäkelä	b81b194393	Merge 10.10 into 10.11	2022-11-30 12:59:57 +02:00
Marko Mäkelä	a27bfb2a87	Merge 10.9 into 10.10	2022-11-30 12:34:45 +02:00
Marko Mäkelä	3ba8828396	Merge 10.8 into 10.9	2022-11-30 12:21:10 +02:00
Marko Mäkelä	0751bfbcaf	Merge 10.7 into 10.8	2022-11-30 12:12:07 +02:00
Marko Mäkelä	b7ae4d442a	Merge 10.6 into 10.7	2022-11-30 12:09:01 +02:00
Marko Mäkelä	d32b2e7e8e	Merge 10.5 into 10.6	2022-11-30 08:32:57 +02:00
Marko Mäkelä	1181564131	MDEV-24412: Disable the test on ./mtr --embedded	2022-11-30 08:32:05 +02:00
Marko Mäkelä	c59985fcf5	Merge 10.5 into 10.6	2022-11-30 07:06:41 +02:00
Marko Mäkelä	846112ce36	MDEV-24412: Create a separate test Some builders in our CI, most notably FreeBSD and IBM AIX, do not support sparse files. Also, Microsoft Windows requires special means for creating sparse files. Since these platforms do not run ./mtr --big-test, we will for now simply move the test to a separate file that requires that option.	2022-11-30 06:57:32 +02:00
Marko Mäkelä	499ef7bf23	Add a global suppression for O_DIRECT failures Fixes up commit `b8ad6fbd95`	2022-11-29 11:06:29 +02:00
Monty	b8ad6fbd95	Fixed warning from innodb.create_isl_with_direct if have_symlink is disabled	2022-11-29 03:34:35 +02:00

4126 commits