MDEV-20704 changed the rules of how the (HA_BINARY_PACK_KEY |
HA_VAR_LENGTH_KEY) flags are added. Older FRMs created before that fix
have these flags set for an index on a DOUBLE column. After that fix,
when ALTER sees such an old FRM it decides it cannot do an instant
alter because compare_keys_but_name() fails: it compares the flags
against the tmp table created by ALTER.
The MDEV-20704 fix was actually not about the DOUBLE type but about
FIELDFLAG_BLOB, which happened to affect DOUBLE. So there is no direct
knowledge that other types were not affected as well.
The proposed fix makes CHECK TABLE check whether the FRM has the
(HA_BINARY_PACK_KEY | HA_VAR_LENGTH_KEY) flags and was created prior
to MDEV-20704, and if so it reports "needs upgrade". When mysqlcheck
and mysql_upgrade see that status they issue ALTER TABLE FORCE and
upgrade the table to the current server version.
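For illustration, assuming t1 is such a pre-MDEV-20704 table with an
index on a DOUBLE column, the manual equivalent of what mysqlcheck and
mysql_upgrade do is roughly:
CHECK TABLE t1 FOR UPGRADE;   -- now reports "needs upgrade" for the old FRM
ALTER TABLE t1 FORCE;         -- rebuilds the table on the current server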
The handler for an existing partition was already index-inited at the
beginning of copy_partitions().
In the case of REORGANIZE PARTITION we fill the new partition by calling
its ha_write_row() (the handler is the storage engine of the new
partition). From there we go through the conditions below:
if (this->inited == RND)
table->clone_handler_for_update();
handler *h= table->update_handler ? table->update_handler : table->file;
First, the above misses the point of the this->inited check. Here it is
the new partition and its handler is not inited, so we assign
table->file, which is the ha_partition handler and is not really known
to be inited or not. The code assumes (this == table->file); otherwise
we are outside the logic for using update_handler. This patch adds a
DBUG_ASSERT for that.
Second, we call check_duplicate_long_entries() for table->file, and
that calls ha_partition::index_init(), which calls index_init() for
each partition's handler. But the existing partitions' handlers were
already inited in copy_partitions() and we fail on an assertion.
The fix implies that we don't need check_duplicate_long_entries()
per partition, as check_duplicate_long_entries() has already been done
for ha_partition. For REORGANIZE PARTITION that means an existing row
was already checked by previous INSERT/UPDATE commands, so there is no
need to check it again (see the NOTE in handler::ha_write_row()).
The fix also optimizes ha_update_row() so that
check_duplicate_long_entries_update() is not called per partition,
considering it was already called for ha_partition. Besides, a
per-partition duplicate check is not really usable.
Problem:
=======
This patch addresses two issues:
1. An incident event can be incorrectly reported for transactions
which are rolled back successfully. That is, an incident event
should only be generated for failed “non-transactional transactions”
(i.e., those which modify non-transactional tables) because they
cannot be rolled back.
2. When the MariaDB slave stops with an error upon receiving the
incident event, there is no description of what led to it, neither in
the event nor in the master's error log.
Solution:
========
Before reporting an incident event for a transaction, first validate
that it is “non-transactional” (i.e. cannot be safely rolled back).
To determine if a transaction is non-transactional,
lex->stmt_accessed_table(LEX::STMT_WRITES_NON_TRANS_TABLE)
is used because it is set previously in
THD::decide_logging_format().
Additionally, when an incident event is written, write an error
message to the server’s error log to indicate the underlying issue.
Reviewed by:
===========
Andrei Elkin <andrei.elkin@mariadb.com>
Also fixes
MDEV-27782 Wrong columns when using table level `CHARACTER SET utf8mb4 COLLATE DEFAULT`
MDEV-28644 Unexpected error on ALTER TABLE t1 CONVERT TO CHARACTER SET utf8mb3, DEFAULT CHARACTER SET utf8mb4
:: Syntax change ::
Keyword AUTO enables history partition auto-creation.
Examples:
CREATE TABLE t1 (x int) WITH SYSTEM VERSIONING
PARTITION BY SYSTEM_TIME INTERVAL 1 HOUR AUTO;
CREATE TABLE t1 (x int) WITH SYSTEM VERSIONING
PARTITION BY SYSTEM_TIME INTERVAL 1 MONTH
STARTS '2021-01-01 00:00:00' AUTO PARTITIONS 12;
CREATE TABLE t1 (x int) WITH SYSTEM VERSIONING
PARTITION BY SYSTEM_TIME LIMIT 1000 AUTO;
Or with explicit partitions:
CREATE TABLE t1 (x int) WITH SYSTEM VERSIONING
PARTITION BY SYSTEM_TIME INTERVAL 1 HOUR AUTO
(PARTITION p0 HISTORY, PARTITION pn CURRENT);
To disable or enable auto-creation one can use ALTER TABLE by adding
or removing AUTO from partitioning specification:
CREATE TABLE t1 (x int) WITH SYSTEM VERSIONING
PARTITION BY SYSTEM_TIME INTERVAL 1 HOUR AUTO;
# Disables auto-creation:
ALTER TABLE t1 PARTITION BY SYSTEM_TIME INTERVAL 1 HOUR;
# Enables auto-creation:
ALTER TABLE t1 PARTITION BY SYSTEM_TIME INTERVAL 1 HOUR AUTO;
If the rest of partitioning specification is identical to CREATE TABLE
no repartitioning will be done (for details see MDEV-27328).
:: Description ::
Before executing a history-generating DML command (see the list of
commands below) add N history partitions, so that N is sufficient for
the potentially generated history. N > 1 may be required when history
partitions are switched by INTERVAL and current_timestamp is N intervals
past the boundary of the last history partition.
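A worked example of the INTERVAL case (timestamps are made up for
illustration):
CREATE TABLE t1 (x int) WITH SYSTEM VERSIONING
PARTITION BY SYSTEM_TIME INTERVAL 1 HOUR AUTO;
-- If the last history partition covers history up to 10:00 and an
-- UPDATE runs at 13:30, up to N=4 history partitions may be added
-- before the UPDATE is executed.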
If the last history partition holds LIMIT records or more, then a new
history partition is created and selected as the working partition.
According to MDEV-28411 partitions cannot be switched (or created) while
the command is running. Thus LIMIT is not a strict limit and the history
partition size must be planned as the LIMIT value plus the average
number of history rows one DML command can generate.
Auto-creation is implemented by a synchronous fast_alter_partition_table()
call from the thread of the executing DML command, before the command
itself is run (via a fallback-and-retry mechanism similar to the Discovery
feature; see Open_table_context).
The names for newly added partitions are generated like default partition
names, with the extension of MDEV-22155 (which avoids name clashes by
advancing the assignment counter to the next free-enough gap).
These DML commands can trigger auto-creation:
DELETE (including multitable DELETE, excluding DELETE HISTORY)
UPDATE (including multitable UPDATE)
REPLACE (including REPLACE .. SELECT)
INSERT .. ON DUPLICATE KEY UPDATE (including INSERT .. SELECT .. ODKU)
LOAD DATA .. REPLACE
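A minimal illustration (table and data are made up):
CREATE TABLE t1 (x int) WITH SYSTEM VERSIONING
PARTITION BY SYSTEM_TIME LIMIT 1000 AUTO;
INSERT INTO t1 VALUES (1);   -- plain INSERT generates no history
UPDATE t1 SET x= 2;          -- history-generating DML; may auto-create
                             -- a new history partition
SELECT partition_name FROM information_schema.partitions
WHERE table_schema = DATABASE() AND table_name = 't1';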
:: Bug fixes ::
MDEV-23642 Locking timeout caused by auto-creation affects original DML
The reasons for not failing the original DML in this case are:
- Do not disrupt the main business process (the history is an auxiliary
service);
- Consequences are non-fatal (history is not lost, but goes into the wrong
partition; fixable by a partitioning rebuild);
- The application has more freedom to decide whether to fail in this case
or not: it may read the warning info and find the corresponding error
number;
- While a non-failing command is easy for an application to handle and
fail on its own, the opposite is hard to handle: there are no automatic
actions to fix a failed command and retry, DBA intervention is required,
and until then the application is non-functional.
MDEV-23639 Auto-create does not work under LOCK TABLES or inside triggers
Don't do tdc_remove_table() for OT_ADD_HISTORY_PARTITION because it is
not possible in locked tables mode.
LTM_LOCK_TABLES mode (and LTM_PRELOCKED_UNDER_LOCK_TABLES) works out
of the box as fast_alter_partition_table() can reopen tables via
locked_tables_list.
In LTM_PRELOCKED we reopen and relock table manually.
:: More fixes ::
* some_table_marked_for_reopen flag fix
some_table_marked_for_reopen affects only the reopen of
m_locked_tables, i.e. Locked_tables_list::reopen_tables() reopens only
tables from m_locked_tables.
* Unused can_recover_from_failed_open() condition
Can recover_from_failed_open() really be used after
open_and_process_routine()?
:: Reviewed by ::
Sergei Golubchik <serg@mariadb.org>
Analysis: There are two server variables, "old_mode" and "old". "old" is no
longer needed as "old_mode" has replaced it (however, it is still used in some
places in the code). "old_mode" and "old" have the same purpose: emulate
behavior from previous MariaDB versions. So they can be merged to avoid
confusion.
Fix: Deprecate the "old" variable and create another mode for @@old_mode to
mimic the behavior of the previous "old" variable. Create specific modes for
the specific tasks that the --old variable was covering earlier and use the
new modes instead.
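For example, after this change (warning text paraphrased):
SET @@global.old= 1;        -- still accepted, but reports a deprecation warning
SHOW WARNINGS;              -- points to @@old_mode as the replacement
SELECT @@global.old_mode;   -- the variable that carries the new mode(s)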
We will remove the parameter innodb_disallow_writes because it is badly
designed and implemented. The parameter was never allowed at startup.
It was only internally used by Galera snapshot transfer.
If a user executed
SET GLOBAL innodb_disallow_writes=ON;
the server could hang even on subsequent read operations.
During Galera snapshot transfer, we will block writes
to implement an rsync friendly snapshot, as follows:
sst_flush_tables() will acquire a global lock by executing
FLUSH TABLES WITH READ LOCK, which will block any writes
at the high level.
sst_disable_innodb_writes(), invoked via ha_disable_internal_writes(true),
will suspend or disable InnoDB background tasks or threads that could
initiate writes. As part of this, log_make_checkpoint() will be invoked
to ensure that anything in the InnoDB buf_pool.flush_list will be written
to the data files. This has the nice side effect that the Galera joiner
will avoid crash recovery.
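The SQL-visible part of this sequence is essentially (a sketch of what
the rsync SST script and sst_flush_tables() perform):
FLUSH TABLES WITH READ LOCK;   -- block writes at the high level
-- ... rsync the data directory to the joiner ...
UNLOCK TABLES;                 -- resume writes after the transfer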
The changes to sql/wsrep.cc and to the tests are based on a prototype
that was developed by Jan Lindström.
Reviewed by: Jan Lindström
Task 1:
If a table is added to the table list with the option TL_OPTION_SEQUENCE
(done when we have sequence functions), then we are dealing with a
sequence instead of a table, so the global table list will have
'sequence' set to true. This is used to check for and give the correct
error message about an unknown sequence instead of "table doesn't exist".
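For example (error wording approximate):
CREATE SEQUENCE s1;
SELECT NEXT VALUE FOR s1;   -- ok
SELECT NEXT VALUE FOR s2;   -- now reports that the SEQUENCE is unknown
                            -- instead of "table doesn't exist"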
Commit 6c39eaeb1 made crash recovery dependent on server_id.
Crash recovery could fail when restoring a new instance from the
original crashed data directory USING A NEW SERVER ID.
The issue doesn't exist in previous major versions before 10.6.
The root cause is that when generating the input XID to be searched in
the hash, the server id is populated with the current server id.
So if the server id changed when recovering, the XID couldn't be found
in the hash because the server id doesn't match.
The fix is to use the original server id when creating the input XID
object in the function `xarecover_do_commit_or_rollback`.
All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer Amazon Web
Services, Inc.
First, we do not add VERS_UPDATE_UNVERSIONED_FLAG for system fields,
which fixes the SHOW CREATE result.
Second, we have to call check_sys_fields() for any CREATE TABLE; there
the correct type of the system fields is checked.
Third, we update the system_time structure like the as_row structure
for ALTER TABLE, which makes check_sys_fields() happy for ALTER TABLE
when we make system fields hidden.
LIMIT history switching requires a number of history partitions to
be marked for read: from the first to the last non-empty one, plus one
empty. The least we can do is to fail with an error message if the
needed partition was not marked for read. As this is the handler
interface, we require a new handler error code to display a
user-friendly error message.
Switching by INTERVAL works out of the box with the
ER_ROW_DOES_NOT_MATCH_GIVEN_PARTITION_SET error.
MDEV-22617 Galera node crashes when trying to log to slow_log table in
streaming replication mode
Other things:
- Changed the name of wsrep_after_row() (two arguments) to
wsrep_after_row_internal() (one argument) to not depend on a
function signature with unused arguments.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
Added test case
The old code erroneously used default_charset_info to compare field names.
default_charset_info can point to any arbitrary collation,
including ucs2*, utf16*, utf32*, including those that do not
support strcasecmp().
my_charset_utf8mb4_unicode_ci, which is used in this scenario:
CREATE TABLE t1 ENGINE=InnoDB WITH SYSTEM VERSIONING AS SELECT 0;
does not support strcasecmp().
Fixing the code to use Lex_ident::streq(), which uses
system_charset_info instead of default_charset_info.
versioning_fields flag indicates that any columns were specified WITH
SYSTEM VERSIONING. In that case we imply WITH SYSTEM VERSIONING for
the whole table and WITHOUT SYSTEM VERSIONING for the other columns.
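For example, the following two definitions are treated the same way:
CREATE TABLE t1 (a int WITH SYSTEM VERSIONING, b int);
-- is interpreted as
CREATE TABLE t1 (a int, b int WITHOUT SYSTEM VERSIONING)
WITH SYSTEM VERSIONING;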
Mutex order violation when wsrep bf thread kills a conflicting trx,
the stack is
wsrep_thd_LOCK()
wsrep_kill_victim()
lock_rec_other_has_conflicting()
lock_clust_rec_read_check_and_lock()
row_search_mvcc()
ha_innobase::index_read()
ha_innobase::rnd_pos()
handler::ha_rnd_pos()
handler::rnd_pos_by_record()
handler::ha_rnd_pos_by_record()
Rows_log_event::find_row()
Update_rows_log_event::do_exec_row()
Rows_log_event::do_apply_event()
Log_event::apply_event()
wsrep_apply_events()
and mutexes are taken in the order
lock_sys->mutex -> victim_trx->mutex -> victim_thread->LOCK_thd_data
When a normal KILL statement is executed, the stack is
innobase_kill_query()
kill_handlerton()
plugin_foreach_with_mask()
ha_kill_query()
THD::awake()
kill_one_thread()
and mutexes are
victim_thread->LOCK_thd_data -> lock_sys->mutex -> victim_trx->mutex
This patch is the plan D variant for fixing the potential mutex locking
order violation exercised by BF aborting and KILL command execution.
In this approach, the KILL command is replicated as a TOI operation.
This guarantees total isolation for the KILL command execution
in the first node: there is no concurrent replication applying
and no concurrent DDL executing. Therefore there is no risk of
BF aborting happening in parallel with KILL command execution
either. Potential mutex deadlocks between the different mutex
access paths of KILL command execution and BF aborting therefore
cannot happen.
TOI replication is used, in this approach, purely as a means
to provide isolated KILL command execution in the first node.
The KILL command should not (and must not) be applied in secondary
nodes. In this patch, we make sure of this by skipping KILL
execution in secondary nodes, in the applying phase, where we
bail out if an applier thread is trying to execute a KILL command.
This is effective, but skipping the applying of the KILL command
could happen much earlier as well.
This also fixes unprotected calls to wsrep_thd_abort that use
wsrep_abort_transaction; this is fixed by holding THD::LOCK_thd_data
while we abort the transaction.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
If plugin->deinit() returns a failure, it is no longer ignored; it
means that the plugin isn't ready to be unloaded from memory yet.
So it's marked "dying", deinitialized as much as possible,
but stays in memory until shutdown.
also:
* increment MARIA_PLUGIN_INTERFACE_VERSION
* rewrite ha_rocksdb to use the new approach, update the test
Dead code cleanup:
The part_info->num_parts usage was wrong and worked incorrectly in
mysql_drop_partitions() because num_parts is already updated in
prep_alter_part_table(). We don't have to update part_info->partitions
because part_info is destroyed in alter_partition_lock_handling().
Cleanups:
- DBUG_EVALUATE_IF() macro replaced by shorter form DBUG_IF();
- Typo in ER_KEY_COLUMN_DOES_NOT_EXITS.
Refactorings:
- Split write_log_replace_delete_frm() into write_log_delete_frm()
and write_log_replace_frm();
- partition_info via DDL_LOG_STATE;
- set_part_info_exec_log_entry() removed.
DBUG_EVALUATE removed
DBUG_EVALUATE was only added for consistency together with
DBUG_EVALUATE_IF. It is not used anywhere in the code.
DBUG_SUICIDE() fix on release build
On release builds DBUG_SUICIDE() was a statement. That was wrong, as
DBUG_SUICIDE() is used in expression context.
When inserting a number of rows into an empty table,
InnoDB will buffer and pre-sort the records for each index, and
build the indexes one page at a time.
For each index, a buffer of innodb_sort_buffer_size will be created.
If the buffer runs out of memory then we will create temporary files
for storing the data.
At the end of the statement, we will sort and apply the buffered
records. Ideally, we would do this at the end of the transaction
or only when starting to execute a non-INSERT statement on the table.
However, it could be awkward if duplicate keys or similar errors
would be reported during the execution of a later statement.
This will be addressed in MDEV-25036.
Any columns longer than 2000 bytes will be buffered in temporary files.
innodb_prepare_commit_versioned(): Apply all buffered bulk insert
operations at the end of each statement.
ha_commit_trans(): Handle errors from innodb_prepare_commit_versioned().
row_merge_buf_write(): This function should also accept a blob
file handle and should write the field data that is longer than
2000 bytes.
row_merge_bulk_t: Data structure to maintain the data during
bulk insert operation.
trx_mod_table_time_t::start_bulk_insert(): Notify the start of a
bulk insert operation and create a new buffer for the given table.
trx_mod_table_time_t::add_tuple(): Buffer a record.
trx_mod_table_time_t::write_bulk(): Do bulk insert operation
present in the transaction
trx_mod_table_time_t::bulk_buffer_exist(): Whether the buffer
storage exists for the bulk transaction.
trx_mod_table_time_t::write_bulk(): Write all buffered insert
operation for the transaction and the table.
row_ins_clust_index_entry_low(): Insert the data into the
bulk buffer if it already exists.
row_ins_sec_index_entry(): Insert the secondary tuple
if the bulk buffer already exists.
row_merge_bulk_buf_add(): Insert the tuple into the bulk insert
buffer.
row_merge_buf_blob(): Write the field data whose length is
more than 2000 bytes into blob temporary file. Write the
file offset and length into the tuple field.
row_merge_copy_blob_from_file(): Copy the blob from blob file
handler based on reference of the given tuple.
row_merge_insert_index_tuples(): Handle blob for bulk insert
operation.
row_merge_bulk_t::row_merge_bulk_t(): Constructor. Initialize
the buffer and file for all the indexes except the FTS index.
row_merge_bulk_t::create_tmp_file(): Create new temporary file
for the given index.
row_merge_bulk_t::write_to_tmp_file(): Write the content from
buffer to disk file for the given index.
row_merge_bulk_t::add_tuple(): Insert the tuple into the merge
buffer for the given index. If memory runs out then InnoDB
sorts the buffer and writes it into the file.
row_merge_bulk_t::write_to_index(): Do bulk insert operation
from merge file/merge buffer for the given index
row_merge_bulk_t::write_to_table(): Do bulk insert operation
for all the indexes.
dict_stats_update(): If a bulk insert transaction is in progress,
treat the table as empty. The index creation could hold latches
for extended amounts of time.
In a rebase of the merge, two preceding commits were accidentally reverted:
commit 112b23969a (MDEV-26308)
commit ac2857a5fb (MDEV-25717)
Thanks to Daniele Sciascia for noticing this.
Contains the following fixes:
* allow TOI commands to time out while trying to acquire TOI;
override lock_wait_timeout with LONG_TIMEOUT only after
successfully entering TOI
* only ignore lock_wait_timeout in TOI
* fix the galera_split_brain test as a TOI operation now returns ER_LOCK_WAIT_TIMEOUT after lock_wait_timeout
* explicitly test for TOI
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
Updates to the transaction registry table shouldn't be replicated in
the cluster, so there is no need to append wsrep keys.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
Because wsrep uses its own xid format for its recovery,
the xid hashing has to be refined.
When an xid object is not in the server "mysql" format,
the hash record is made to contain the xid also in the full format.
1. work around MDEV-25912 to not apply the assert
at wsrep run time;
2. handle the wsrep mode of server recovery;
3. convert hton calls to static binlog_commit ones;
4. satisfy an MSAN complaint about an uninitialized std::pair.
Problem:
=======
When the semisync master has crashed and is restarted as a slave, it
could recover transactions that former slaves may never have seen.
A known method to clear out all prepared transactions,
--tc-heuristic-recover=rollback, does not care to adjust the binlog
accordingly.
Fix:
===
The binlog-based recovery is made aware of the semisync-slave role of
the post-crash restarted server.
No change in behavior is made to the "normal" binlogging server
and the semisync master.
When the restarted server is configured with
--rpl-semi-sync-slave-enabled=1
the refined recovery attempts to roll back prepared transactions
and truncate the binlog accordingly.
A transaction that is partially committed (that is, committed in at
least one of the engine participants) gets committed.
It is guaranteed that no committed (even partially committed)
transactions exist beyond the truncate position.
In case a non-transactional replication event (being in a way a
committed transaction) exists past the computed truncate position,
the recovery ends with an error.
After a master crash and failover to a slave, the demoted-to-slave
ex-master must be ready to face and accept its own (self-generated)
events, without the normally required --replicate-same-server-id.
So the acceptance conditions are relaxed for the semisync slave
to accept its own events without that option.
While gtid_strict_mode=ON ensures no duplicate transaction can be
(re-)executed, a master_use_gtid=none slave has to be configured
with --replicate-same-server-id.
*NOTE* for reviewers:
This patch does not handle user XA, which is done in the next git
commit.
This is a complete rewrite of DROP TABLE, also as part of other DDL,
such as ALTER TABLE, CREATE TABLE...SELECT, TRUNCATE TABLE.
The background DROP TABLE queue hack is removed.
If a transaction needs to drop and create a table by the same name
(like TRUNCATE TABLE does), it must first rename the table to an
internal #sql-ib name. No committed version of the data dictionary
will include any #sql-ib tables, because whenever a transaction
renames a table to a #sql-ib name, it will also drop that table.
Either the rename will be rolled back, or the drop will be committed.
Data files will be unlinked after the transaction has been committed
and a FILE_RENAME record has been durably written. The file will
actually be deleted when the detached file handle returned by
fil_delete_tablespace() will be closed, after the latches have been
released. It is possible that a purge of the delete of the SYS_INDEXES
record for the clustered index will execute fil_delete_tablespace()
concurrently with the DDL transaction. In that case, the thread that
arrives later will wait for the other thread to finish.
HTON_TRUNCATE_REQUIRES_EXCLUSIVE_USE: A new handler flag.
ha_innobase::truncate() now requires that all other references to
the table be released in advance. This was implemented by Monty.
ha_innobase::delete_table(): If CREATE TABLE..SELECT is detected,
we will "hijack" the current transaction, drop the table in
the current transaction and commit the current transaction.
This essentially fixes MDEV-21602. There is a FIXME comment about
making the check less failure-prone.
ha_innobase::truncate(), ha_innobase::delete_table():
Implement a fast path for temporary tables. We will no longer allow
temporary tables to use the adaptive hash index.
dict_table_t::mdl_name: The original table name for the purpose of
acquiring MDL in purge, to prevent a race condition between a
DDL transaction that is dropping a table, and purge processing
undo log records of DML that had executed before the DDL operation.
For #sql-backup- tables during ALTER TABLE...ALGORITHM=COPY, the
dict_table_t::mdl_name will differ from dict_table_t::name.
dict_table_t::parse_name(): Use mdl_name instead of name.
dict_table_rename_in_cache(): Update mdl_name.
For the internal FTS_ tables of FULLTEXT INDEX, purge would
acquire MDL on the FTS_ table name, but not on the main table,
and therefore it would be able to run concurrently with a
DDL transaction that is dropping the table. Previously, the
DROP TABLE queue hack prevented a race between purge and DDL.
For now, we introduce purge_sys.stop_FTS() to prevent purge from
opening any table, while a DDL transaction that may drop FTS_
tables is in progress. The function fts_lock_table(), which will
be invoked before the dictionary is locked, will wait for
purge to release any table handles.
trx_t::drop_table_statistics(): Drop statistics for the table.
This replaces dict_stats_drop_index(). We will drop or rename
persistent statistics atomically as part of DDL transactions.
On lock conflict for dropping statistics, we will fail instantly
with DB_LOCK_WAIT_TIMEOUT, because we will be holding the
exclusive data dictionary latch.
trx_t::commit_cleanup(): Separated from trx_t::commit_in_memory().
Relax an assertion around fts_commit() and allow DB_LOCK_WAIT_TIMEOUT
in addition to DB_DUPLICATE_KEY. The call to fts_commit() is
entirely misplaced here and may obviously break the consistency
of transactions that affect FULLTEXT INDEX. It needs to be fixed
separately.
dict_table_t::n_foreign_key_checks_running: Remove (MDEV-21175).
The counter was a work-around for missing meta-data locking (MDL)
on the SQL layer, and not really needed in MariaDB.
ER_TABLE_IN_FK_CHECK: Replaced with ER_UNUSED_28.
HA_ERR_TABLE_IN_FK_CHECK: Remove.
row_ins_check_foreign_constraints(): Do not acquire
dict_sys.latch either. The SQL-layer MDL will protect us.
This was reviewed by Thirunarayanan Balathandayuthapani
and tested by Matthias Leich.
The underlying problem with MDEV-25551 turned out to be that
transactions having changes for tables with no primary key
were not safe to apply in parallel. This is due to excessive locking
on the InnoDB side, where even unrelated row modifications could end
up in a lock conflict during applying.
The fix for MDEV-25551 has disabled parallel applying for tables with
no PK.
This fix depends on a change in wsrep-lib, where a separate PR allows
the application to modify transaction flags in wsrep-lib.
This commit has also separate mtr test for verifying that transactions
modifying a table with no primary key, will not apply in parallel.
This test is a modified version of the initial test created by Gabor
Orosz, the reporter of MDEV-25551.
Another mtr test was added in the galera_sr suite, to test whether
modifying tables with no primary key would cause issues for streaming
replication use cases.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
MDEV-25604 Atomic DDL: Binlog event written upon recovery does not
have default database
The purpose of this task is to ensure that ALTER TABLE is atomic even if
the MariaDB server would be killed at any point of the alter table.
This means that either the ALTER TABLE succeeds (including that triggers,
the status tables and the binary log are updated) or things should be
reverted to their original state.
If the server crashes before the new version is fully up to date and
committed, it will revert to the original table and remove all
temporary files and tables.
If the new version is committed, crash recovery will use the new version,
and update triggers, the status tables and the binary log.
The one exception is ALTER TABLE .. RENAME .., where no changes are done
to the table definition. This one will work as RENAME and will be rolled
back unless the whole statement completed, including updating the binary
log (if enabled).
Other changes:
- Added handlerton->check_version() function to allow the ddl recovery
code to check, in case of inplace alter table, if the table in the
storage engine is of the new or old version.
- Added handler->table_version() so that an engine can report the current
version of the table. This should be changed each time the table
definition changes.
- Added ha_signal_ddl_recovery_done() and
handlerton::signal_ddl_recovery_done() to inform all handlers when
ddl recovery has been done. (Needed by InnoDB).
- Added handlerton call inplace_alter_table_committed, to signal engine
that ddl_log has been closed for the alter table query.
- Added a new handlerton flag
HTON_REQUIRES_NOTIFY_TABLEDEF_CHANGED_AFTER_COMMIT to signal when we
should call hton->notify_tabledef_changed() during
mysql_inplace_alter_table. This was required as MyRocks and InnoDB
needed the call at different times.
- Added function server_uuid_value() to be able to generate a temporary
xid when ddl recovery writes the query to the binary log. This is
needed to be able to handle crashes during ddl log recovery.
- Moved freeing of the frm definition to end of mysql_alter_table() to
remove duplicate code and have a common exit strategy.
-------
InnoDB part of atomic ALTER TABLE
(Implemented by Marko Mäkelä)
innodb_check_version(): Compare the saved dict_table_t::def_trx_id
to determine whether an ALTER TABLE operation was committed.
We must correctly recover dict_table_t::def_trx_id for this to work.
Before purge removes any trace of DB_TRX_ID from system tables, it
will make an effort to load the user table into the cache, so that
the dict_table_t::def_trx_id can be recovered.
ha_innobase::table_version(): return garbage, or the trx_id that would
be used for committing an ALTER TABLE operation.
In InnoDB, table names starting with #sql-ib will remain special:
they will be dropped on startup. This may be revisited later in
MDEV-18518 when we implement proper undo logging and rollback
for creating or dropping multiple tables in a transaction.
Table names starting with #sql will retain some special meaning:
dict_table_t::parse_name() will not consider such names for
MDL acquisition, and dict_table_rename_in_cache() will treat such
names specially when handling FOREIGN KEY constraints.
Simplify InnoDB DROP INDEX.
Prevent purge wakeup
To ensure that dict_table_t::def_trx_id will be recovered correctly
in case the server is killed before ddl_log_complete(), we will block
the purge of any history in SYS_TABLES, SYS_INDEXES, SYS_COLUMNS
between ha_innobase::commit_inplace_alter_table(commit=true)
(purge_sys.stop_SYS()) and purge_sys.resume_SYS().
The completion callback purge_sys.resume_SYS() must be between
ddl_log_complete() and MDL release.
--------
MyRocks support for atomic ALTER TABLE
(Implemented by Sergei Petrunia)
Implement these SE API functions:
- ha_rocksdb::table_version()
- hton->check_version = rocksdb_check_version
The MyRocks data dictionary now stores a table version for each table.
(Absence of the table version record is interpreted as table_version=0,
that is, no upgrade changes are needed.)
- For inplace alter table of a partitioned table, call the underlying
handlerton when checking if the table is ok. This assumes that the
partition engine commits all changes at once.
There are a few different cases to consider:
Logging of CREATE TABLE and CREATE TABLE ... LIKE
- If REPLACE is used and there was an existing table, DDL log the drop of
the table.
- If discovery of table is to be done
- DDL LOG create table
else
- DDL log create table (with engine type)
- create the table
- If table was created
- Log entry to binary log with xid
- Mark DDL log completed
Crash recovery:
- If query was in binary log do nothing and exit
- If it was a discovered table
- Delete the .frm file
- else
- Drop created table and frm file
- If table was dropped, write a DROP TABLE statement in binary log
CREATE TABLE ... SELECT required a little more work, as when one is
using statement logging the query is written to the binary log before
the commit is done.
This was fixed by adding a DROP TABLE to the binary log during crash
recovery if the ddl log entry was not closed. In this case the binary log
will contain:
CREATE TABLE xxx ... SELECT ....
DROP TABLE xxx;
Other things:
- Added debug_crash_here() functionality to Aria to be able to test
crash in create table between the creation of the .MAI and the .MAD files.
Description of how DROP DATABASE works after this patch
- Collect list of tables
- DDL log tables as they are dropped
- DDL log drop database
- Delete db.opt
- Delete data directory
- Log either DROP TABLE or DROP DATABASE to binary log
- Deactivate the ddl log entry
This is in line with how things were before (minus ddl logging), except
that we delete the db.opt file last to not lose it if DROP DATABASE
fails.
On recovery we have to ensure that all dropped tables are logged in
binary log and that they are properly dropped (as with atomic drop
table).
No new tables are to be dropped as part of recovery.
Recovery of active drop database ddl log entry:
- If drop database was logged to ddl log but was not found in the binary
log:
- drop the db.opt file and database directory.
- Log DROP DATABASE to binary log
- If drop database was not logged to ddl log
- Update binary log with DROP TABLE of the dropped tables. If table list
is longer than max_allowed_packet, then the query will be split into
multiple DROP TABLE/VIEW queries.
Other things:
- Added DDL_LOG_STATE and 'current database' as arguments to
mysql_rm_table_no_locks(). This was needed to be able to combine
ddl logging of DROP DATABASE and DROP TABLE and make the generated
DROP TABLE statements shorter.
- To make the DROP TABLE statement created by the ddl log shorter, I
changed the binlogged query to use the current database and omit the
database part for all tables in the current database.
- Merged some DROP TABLE and DROP VIEW code in the ddl logger. This was
done to be able to get separate DROP VIEW and DROP TABLE statements in
the binary log.
- Added a 'recovery_state' variable to remember the state of dropped
tables and views.
- Moved out code that drops database objects (stored procedures) from
mysql_rm_db_internal() to drop_database_objects() for better code reuse.
- Made mysql_rm_db_internal() global so that it could be used by the ddl
recovery code.
Logging logic:
- Log tables, just before they are dropped, to the ddl log
- After the last table for the statement is dropped, log an xid for the
whole ddl log event
In case of crash:
- First remove any active DROP TABLE events from the ddl log that match
xids found in the binary log (this means the drop was successful and
was properly logged).
- Loop over all active DROP TABLE events
- Ensure that the table is completely dropped
- Write a DROP TABLE entry to the binary log with the dropped tables.
Other things:
- Added code to ha_drop_table() to be able to tell the difference between
get_new_handler() failing because of out-of-memory and the engine
refusing/not being able to create a handler. This was needed
to get sequences to work, as sequences need a share object to be passed
to get_new_handler().
- TC_LOG_BINLOG::recover() was changed to always collect Xid's from the
binary log and always call ddl_log_close_binlogged_events(). This was
needed to be able to collect DROP TABLE events with embedded Xid's
(used by ddl log).
- Added a new variable "$grep_script" to binlog filter to be able to find
only rows that matches a regexp.
- Had to adjust some tests that changed because drop statements are a bit
larger in the binary log than before (as we have to store the xid).
Other things:
- MDEV-25588 Atomic DDL: Binlog query event written upon recovery is corrupt
fixed (in the original commit).
- Major rewrite of ddl_log.cc and ddl_log.h
- ddl_log.cc describes at the beginning how the recovery works.
- ddl_log.log has unique signature and is dynamic. It's easy to
add more information to the header and other ddl blocks while still
being able to execute old ddl entries.
- IO_SIZE for ddl blocks is now dynamic. Can be changed without affecting
recovery of old logs.
- Code is more modular and is now usable outside of partition handling.
- Renamed the log file to ddl_recovery.log and added the option
--log-ddl-recovery to allow one to specify the path & filename.
- Added ddl_log_entry_phase[], the number of phases for each DDL action,
which allowed me to greatly simplify set_global_from_ddl_log_entry().
- Changed how strings are stored in log entries, which allows us to
store much more information in a log entry.
- The ddl log is now always created at start and deleted on normal
shutdown. This simplifies things notably.
- Added probes debug_crash_here() and debug_simulate_error() to simplify
crash testing and allow a crash after a given number of times a probe
is executed. See the comments in debug_sync.cc and rename_table.test for
how this can be used.
- Reverting failed table and view renames is done through the ddl log.
This ensures that the ddl log is tested also outside of recovery.
- Added helper function 'handler::needs_lower_case_filenames()'
- Extend the binary log with Q_XID events. The ddl log handling is using
this to check if a ddl log entry was logged to the binary log (if yes,
it will be deleted from the log during ddl_log_close_binlogged_events()).
- If a DDL entry fails 3 times, disable it. This is to ensure that if
we have a crash in the ddl recovery code the server will not get stuck
in a forever crash-restart-crash loop.
mysqltest.cc changes:
- --die will now replace $variables with their values
- $error will contain the error of the last failed statement
storage engine changes:
- maria_rename() was changed to be more robust against crashes during
rename.
This change removed 68 explicit strlen() calls from the code.
The following renames were done to ensure we don't use the old names
when merging code from earlier releases, as using the new variables
for the print functions could result in crashes:
- charset->csname renamed to charset->cs_name
- charset->name renamed to charset->coll_name
Almost everything was a mechanical change, except:
- Changed to use the new Protocol::store(LEX_CSTRING..) when possible
- Changed to use field->store(LEX_CSTRING*, CHARSET_INFO*) when possible
- Changed to use String->append(LEX_CSTRING&) when possible
Other things:
- There were compiler issues with ensuring that all character set names
point to the same string: gcc doesn't allow one to use integer constants
when defining global structures (constant char * pointers work fine).
To get around this, I declared defines for each character set name
length.
Changes:
- To detect automatic strlen() I removed the methods in String that
use 'const char *' without a length:
- String::append(const char*)
- Binary_string(const char *str)
- String(const char *str, CHARSET_INFO *cs)
- append_for_single_quote(const char *)
All usage of append(const char*) is changed to either use
String::append(char), String::append(const char*, size_t length) or
String::append(LEX_CSTRING)
- Added STRING_WITH_LEN() around constant string arguments to
String::append()
- Added overflow argument to escape_string_for_mysql() and
escape_quotes_for_mysql() instead of returning (size_t) -1 on overflow.
This was needed as most usage of the above functions never tested the
result for -1 and would have given wrong results or crashes in case
of overflows.
- Added Item_func_or_sum::func_name_cstring(), which returns LEX_CSTRING.
Changed all Item_func::func_name()'s to func_name_cstring()'s.
The old Item_func_or_sum::func_name() is now an inline function that
returns func_name_cstring().str.
- Changed Item::mode_name() and Item::func_name_ext() to return
LEX_CSTRING.
- Changed for some functions the name argument from const char * to
to const LEX_CSTRING &:
- Item::Item_func_fix_attributes()
- Item::check_type_...()
- Type_std_attributes::agg_item_collations()
- Type_std_attributes::agg_item_set_converter()
- Type_std_attributes::agg_arg_charsets...()
- Type_handler_hybrid_field_type::aggregate_for_result()
- Type_handler_geometry::check_type_geom_or_binary()
- Type_handler::Item_func_or_sum_illegal_param()
- Predicant_to_list_comparator::add_value_skip_null()
- Predicant_to_list_comparator::add_value()
- cmp_item_row::prepare_comparators()
- cmp_item_row::aggregate_row_elements_for_comparison()
- Cursor_ref::print_func()
- Removed String_space() as it was only used in one case and that case
could be simplified to not use String_space(), thanks to the fixed
my_vsnprintf().
- Added some const LEX_CSTRING's for common strings:
- NULL_clex_str, DATA_clex_str, INDEX_clex_str.
- Changed primary_key_name to a LEX_CSTRING
- Renamed String::set_quick() to String::set_buffer_if_not_allocated() to
clarify what the function really does.
- Rename of protocol function:
bool store(const char *from, CHARSET_INFO *cs) to
bool store_string_or_null(const char *from, CHARSET_INFO *cs).
This was done both to clarify the difference between this 'store'
function and the others, and to make it easier to find suboptimal usage
of store() calls.
- Added Protocol::store(const LEX_CSTRING*, CHARSET_INFO*)
- Changed some 'const char*' arrays to instead be of type LEX_CSTRING.
- class Item_func_units now uses LEX_CSTRING for the name.
Other things:
- Fixed a bug in mysql.cc:construct_prompt() where a wrong escape character
in the prompt would cause some part of the prompt to be duplicated.
- Fixed a lot of instances where the length of the argument to
append is known or easily obtained but was not used.
- Removed some unneeded 'virtual' qualifiers for functions that were
inherited from the parent; added 'override' to these.
- Fixed Ordered_key::print() to preallocate the needed buffer. The old
code could cause memory overruns.
- Simplified some loops when adding char * to a String with delimiters.
Starting with MariaDB 10.5, roughly after MDEV-23855 was fixed,
we are observing sporadic hangs during the execution of the
RESET MASTER statement. We are hoping to fix the hangs with these
changes, but due to the rather infrequent occurrence of the hangs
and our inability to reliably reproduce the hangs, we cannot be
sure of this.
What we do know is that innodb_force_recovery=2 (or a larger setting)
will prevent srv_master_callback (the former srv_master_thread) from
running. In that mode, periodic log flushes would never occur and
RESET MASTER could hang indefinitely. That is demonstrated by the new
test case that was developed by Andrei Elkin. We fix this case by
implementing a special case for it.
This also includes some code cleanup and renames of misleadingly
named code. The interface has nothing to do with log checkpoints in
the storage engine; it is only about requesting log writes to be
persistent.
handlerton::commit_checkpoint_request,
commit_checkpoint_notify_ha(): Remove the unused parameter hton.
log_requests.start: Replaces pending_checkpoint_list.
log_requests.end: Replaces pending_checkpoint_list_end.
log_requests.mutex: Replaces pending_checkpoint_mutex.
log_flush_notify_and_unlock(), log_flush_notify(): Replaces
innobase_mysql_log_notify(). The new implementation should be
functionally equivalent to the old one.
innodb_log_flush_request(): Replaces innobase_checkpoint_request().
Implement a fast path for common cases, and reduce the mutex hold time.
POSSIBLE FIX OF THE HANG: We will invoke commit_checkpoint_notify_ha()
for the current request if it is already satisfied, as well as invoke
log_flush_notify_and_unlock() for any satisfied requests.
log_write(): Invoke log_flush_notify() when the write is already durable.
This was missing WITH_PMEM when the log is in persistent memory.
Reviewed by: Vladislav Vaintroub
This feature adds the functionality of ignorability for indexes.
Indexes are not ignored by default.
To control index ignorability explicitly for a new index,
use IGNORE or NOT IGNORE as part of the index definition for
CREATE TABLE, CREATE INDEX, or ALTER TABLE.
Primary keys (explicit or implicit) cannot be made ignorable.
The table INFORMATION_SCHEMA.STATISTICS gets a new column named IGNORED
that stores whether an index is ignored or not.
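Examples, following the keyword spelling and placement described above
(placement after the key parts is an assumption of this sketch):
CREATE TABLE t1 (a INT, b INT, KEY k1 (b) IGNORE);
CREATE INDEX k2 ON t1 (a) NOT IGNORE;
ALTER TABLE t1 ADD KEY k3 (a, b) IGNORE;
SELECT index_name, ignored
FROM information_schema.statistics
WHERE table_schema = DATABASE() AND table_name = 't1';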
* be strict in CREATE TABLE, just like in ALTER TABLE, because
CREATE TABLE, just like ALTER TABLE, can be rolled back for any engine
* but don't auto-convert warnings into errors for engine warnings
(handler::create) - this matches ALTER TABLE behavior
* and don't convert them when creating a default record either; these
errors are handled specially (and replaced with ER_INVALID_DEFAULT)
* always issue a Note when a non-unique key is truncated, because it's
not a Warning that can be converted to an Error. Before this commit
it was a Note for blobs and a Warning for all other data types.
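A sketch of the Note case (engine, charset and length chosen only to
force key truncation):
SET sql_mode= 'STRICT_ALL_TABLES';
CREATE TABLE t1 (a VARCHAR(2000), KEY k1 (a))
ENGINE=MyISAM CHARACTER SET utf8mb4;
SHOW WARNINGS;   -- a Note that the non-unique key k1 was truncated,
                 -- not escalated to an error despite strict mode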
Server part:
kill_handlerton() was accessing thd->ha_data[] for some other thd,
while it could be concurrently modified by its owner thd.
Protect thd->ha_data[] modifications with a mutex.
Require this mutex when accessing thd->ha_data[] from kill_handlerton().
InnoDB part:
On close_connection, detach the trx from the thd before freeing the trx.
failed in Diagnostics_area::set_ok_status on INSERT
Analysis: An error is not returned when strict mode is enabled and the
value is truncated because the double is outside the range.
Fix: Return HA_ERR_AUTOINC_ERANGE if the error was reported when the
double is outside the range.
..causes error on slave.
Cause: if the master doesn't have the frm file for the table,
DROP TABLE code will call ha_delete_table_force() to drop the table
in all available storage engines.
The issue was that this code path didn't check the
HTON_TABLE_MAY_NOT_EXIST_ON_SLAVE flag of the storage engine,
and so did not add "... IF EXISTS" to the statement that's written
to the binary log. This can cause an error on the slave when it tries
to drop a table that's already gone.
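Schematically, for a table whose engine may not exist on the slave:
DROP TABLE t1;               -- executed on the master (no .frm, force-dropped)
DROP TABLE IF EXISTS t1;     -- what is now written to the binary log,
                             -- so the slave does not fail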
After Sergei's cleanup this assertion is no longer applicable: we can't
predict whether the handler was used for a lookup, especially in a
multi-update scenario.
`position(old_data)` is done earlier in `ha_check_overlaps`, therefore
it is guaranteed that we compare the right refs.
The problem here was that ha_check_overlaps internally uses
ha_index_read, which on failure overwrites table->status. Even though
the handlers are different, they share a common table, so the value
gets spoiled anyway. This is bad, and table->status is badly designed
and overloaded with functionality, but nothing can be done about it,
since the code related to this logic is ancient and it's impossible to
extract it with reasonable effort.
So let's just save and restore the value in ha_update_row before and
after the checks.
Other operations like INSERT and simple UPDATE are not at risk, since
they don't use this table->status approach.
DELETE does not do any unique checks, so it's also safe.
Change xarecover_handlerton so that transactions with WSREP-prefixed
xids are rolled back when Galera is disabled.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
This commit fixed the problems with S3 after the "DROP TABLE FORCE" changes.
It also fixes all failing replication S3 tests.
A slave is delayed if it is trying to execute replicated queries on a
table that is already converted to S3 by the master later in the binlog.
Fixes for replication events on S3 tables for delayed slaves:
- INSERT and INSERT ... SELECT and CREATE TABLE are ignored but written
to the binary log. UPDATE & DELETE will be fixed in a future commit.
Other things:
- On slaves with --s3-slave-ignore-updates set, allow S3 tables to be
opened in read-write mode. This was done to be able to
ignore-but-replicate queries like insert. Without this change any
open of an S3 table failed with 'Table is read only' which is too
early to be able to replicate the original query.
- Errors are now printed if handler::extra() call fails in
wait_while_tables_are_used().
- The error for row changes is changed from HA_ERR_WRONG_COMMAND
to HA_ERR_TABLE_READONLY.
- Disabled some maria_extra() calls for S3 tables, as these could cause
S3 tables to fail in some cases.
- Added missing thr_lock_delete() to ma_open() in case of failure.
- Removed the unneeded argument 'table' from mysql_prepare_insert().
- Remove row_start/row_end from keys in fix_create_like();
- Disable manual adding of implicit row_start/row_end to indexes on
CREATE TABLE. INVISIBLE_SYSTEM fields are not operable by the user;
- Fix a memory leak on allocation of Key_part_spec.
- row_search_mvcc() should return DB_INTERRUPTED when it got killed.
- Add a syncpoint for the ICP check.
- Add test coverage for killed-during-ICP-check scenario
Backport of MDEV-22761 fixes for ICP from 10.4 commits:
* a6f956488c
* c03885cd9c
XtraDB was fixed in deb3b9a174
Reviewer: Daniel Black
Part #2:
- row_search_mvcc() should return DB_INTERRUPTED when it got killed
- Move the sync point from innodb internals to
handler_rowid_filter_check() where other storage engines can use
it too
- Add a similar syncpoint for the ICP check.
- Add a bigger test and test coverage for Rowid Filter with MyISAM
- Add test coverage for killed-during-ICP-check scenario
MDEV-21953 deadlock between BACKUP STAGE BLOCK_COMMIT and parallel
replication
Fixed by partly reverting MDEV-21953 to put back MDL_BACKUP_COMMIT locking
before log_and_order.
The original problem for MDEV-21953 was that while a thread was waiting
for other threads to commit in 'log_and_order', it held the
MDL_BACKUP_COMMIT lock. The backup thread was waiting to get the
MDL_BACKUP_WAIT_COMMIT lock, which blocks all new MDL_BACKUP_COMMIT locks.
This causes a deadlock as the waited-for thread can never get past the
MDL_BACKUP_COMMIT lock in ha_commit_trans.
The main part of the bug fix is to release the MDL_BACKUP_COMMIT lock while
a thread is waiting for other 'previous' threads to commit. This ensures
that no transactional thread keeps MDL_BACKUP_COMMIT while waiting, which
ensures that there are no deadlocks anymore.
failed or late ER_PERIOD_FIELD_WRONG_ATTRIBUTES upon attempt to create
existing table
Analysis: The error state is not stored when the field is checked in
Table_period_info::check_field().
Fix: Store the error state by setting res to true.
This happened when using XA transactions. I also added some extra asserts
to ensure that m_transactions are properly cleared.
Other things:
- Removed set_time() from THD::init_for_queries() as dispatch_command()
is already doing that.
- Removed duplicate init_for_queries() from prepare_new_connection_state().
The init_for_queries() functions should only be called once per
connection.
The issue was:
T1, a parallel slave worker thread, is waiting for another worker thread to
commit. While waiting, it has the MDL_BACKUP_COMMIT lock.
T2, working for mariabackup, is doing BACKUP STAGE BLOCK_COMMIT and blocks
all commits.
This causes a deadlock, as the thread that T1 is waiting for can't commit.
Fixed by moving locking of MDL_BACKUP_COMMIT from ha_commit_trans() to
commit_one_phase_2()
Other things:
- Added a new argument to ha_commit_one_phase() to signal if the
transaction was a write transaction.
- Ensured that ha_maria::implicit_commit() is always called under
MDL_BACKUP_COMMIT. This code is not needed in 10.5
- Ensure that MDL_Request values 'type' and 'ticket' are always
initialized. This makes it easier to check the state of the MDL_Request.
- Moved thd->store_globals() earlier in handle_rpl_parallel_thread() as
thd->init_for_queries() could use an MDL that could crash if
store_globals were not called.
- Don't call ha_enable_transactions() in THD::init_for_queries() as this
is both slow (uses MDL locks) and not needed.
First try the engines that support discovery, then the rest.
Otherwise every DROP TABLE non_existent will do lots of I/O trying to
remove .MYI/.MYD/.MAI/.MAD/.CSV/etc. files.
This matches the old behavior where DROP TABLE always tried to discover
the table before dropping.
Don't do table discovery on DROP. DROP falls back to the "force"
approach when a table isn't found and will try to drop it in all
engines anyway. That is, trying to discover the table in all engines
before the drop is redundant and may be expensive.
First step in moving DROP TABLE out of the handler.
TODO: other methods that don't need an open table.
For now hton->drop_table is optional, for backward compatibility
reasons.
- Rewrote bool Query_compressed_log_event::write() to make it more readable
(no logic changes).
- Changed DBUG_PRINT of 'is_error:' to 'is_error():' to make it easier to
find error: in traces.
- Ensure that 'db' is never null in Query_log_event (Simplified code).
The used code is largely based on code from Tencent
The problem is that in some rare cases there may be a conflict between .frm
files and the files in the storage engine. In this case the DROP TABLE
was not able to properly drop the table.
Some MariaDB/MySQL forks have solved this by adding a FORCE option to
DROP TABLE. After some discussion among MariaDB developers, we concluded
that users expect DROP TABLE to always work, even if the table is not
consistent. There should not be a need to use a separate keyword to
ensure that the table is really deleted.
The used solution is:
- If a .frm file doesn't exist, try dropping the table from all storage
engines.
- If the .frm file exists but the table does not exist in the engine,
try dropping the table from all storage engines.
- Update storage engines using many table files (CSV, MyISAM, Aria) to
succeed with the drop even if some of the files are missing.
- Add HTON_AUTOMATIC_DELETE_TABLE to handlertons where delete_table()
is not needed and always succeeds. This is used by ha_delete_table_force()
to know which handlers to ignore when trying to drop a table without
a .frm file.
The disadvantage of this solution is that a DROP TABLE on a non existing
table will be a bit slower as we have to ask all active storage engines
if they know anything about the table.
Other things:
- Added a new flag MY_IGNORE_ENOENT to my_delete() to not give an error
if the file doesn't exist. This simplifies some of the code.
- Don't clear thd->error in ha_delete_table() if there was an active
error. This is a bug fix.
- handler::delete_table() will not abort if the first file doesn't exist.
This is a bug fix to handle the case when a drop table was aborted in
the middle.
- Cleaned up mysql_rm_table_no_locks() to ensure that if_exists uses
same code path as when it's not used.
- Use non_existing_table_error() to detect if the table didn't exist.
The old code used different error tests in different positions.
- Table_triggers_list::drop_all_triggers() now drops the trigger file if
it can't be parsed, instead of leaving it hanging around (bug fix).
- InnoDB no longer prints an error about the .frm file being out of sync
with the InnoDB dictionary if the .frm file does not exist. This change
was required to be able to try to drop an InnoDB table when the .frm
doesn't exist.
- Fixed a bug in mi_delete_table() where the .MYD file would not be
dropped if the .MYI file didn't exist.
- Fixed a memory leak in Mroonga when deleting a non-existing table.
- Fixed a memory leak in Connect when deleting a non-existing table.
Fixed bugs that were introduced by the original version of this commit:
MDEV-22826 Presence of Spider prevents tables from being force-deleted from
other engines
Apply this patch from Percona Server (amended for 10.5):
commit cd7201514fee78aaf7d3eb2b28d2573c76f53b84
Author: Laurynas Biveinis <laurynas.biveinis@gmail.com>
Date: Tue Nov 14 06:34:19 2017 +0200
Fix bug 1704195 / 87065 / TDB-83 (Stop ANALYZE TABLE from flushing table definition cache)
Make ANALYZE TABLE stop flushing affected tables from the table
definition cache, which has the effect of not blocking any subsequent
new queries involving the table if there's a parallel long-running
query:
- new table flag HA_ONLINE_ANALYZE, return it for InnoDB and TokuDB
tables;
- in mysql_admin_table, if we are performing ANALYZE TABLE, and the
table flag is set, do not remove the table from the table
definition cache, do not invalidate query cache;
- in partitioning handler, refresh the query optimizer statistics
after ANALYZE if the underlying handler supports HA_ONLINE_ANALYZE;
- new testcases main.percona_nonflushing_analyze_debug,
parts.percona_nonflushing_analyze_debug and a supporting debug sync
point.
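As an illustration of the decision in mysql_admin_table() (the types and the
flag bit below are stand-ins, only the HA_ONLINE_ANALYZE name comes from the
patch), the rule boils down to a single predicate:

  typedef unsigned long long Table_flags;
  static const Table_flags HA_ONLINE_ANALYZE= 1ULL << 40;   // stand-in bit

  struct fake_handler { Table_flags table_flags; };

  // true  -> keep the table in the table definition cache and do not
  //          invalidate the query cache after ANALYZE
  // false -> old behaviour: flush and invalidate
  static bool analyze_keeps_tdc_entry(const fake_handler *file,
                                      bool is_analyze_operation)
  {
    return is_analyze_operation &&
           (file->table_flags & HA_ONLINE_ANALYZE) != 0;
  }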
For TokuDB, this change exposes bug TDB-83 (Index cardinality stats
updated for handler::info(HA_STATUS_CONST), not often enough for
tokudb_cardinality_scale_percent). TokuDB may return different
rec_per_key values depending on dynamic variable
tokudb_cardinality_scale_percent value. The server does not have a way
of knowing that changing this variable invalidates the previous
rec_per_key values in any opened table shares, and so does not call
info(HA_STATUS_CONST) again. Fix by updating rec_per_key for both
HA_STATUS_CONST and HA_STATUS_VARIABLE. This also forces a re-record
of tokudb.bugs.db756_card_part_hash_1_pick, with the new output
seeming to be more correct.
reduce the amount of engine-specific code in the server,
particularly as it does not serve any purpose now.
may be needed for VP engine,
to be reconsidered in MDEV-7795
MDEV-20578 Got error 126 when executing undo undo_key_delete
upon Aria crash recovery
The crash happens in this scenario:
- Table with unique keys and non unique keys
- Batch insert (LOAD DATA or INSERT ... SELECT) with REPLACE
- Some insert succeeds followed by duplicate key error
In the above scenario the table gets corrupted.
The bug was that we don't generate any undo entry for the
failed insert as the whole insert can be ignored by undo.
The code did, however, not take into account that when bulk
insert is used, cached keys are written to the file on
failure, and undo would wrongly ignore these.
Fixed by moving the writing of the cache keys after we write
the aborted-insert event to the log.
MDEV-22617 Galera node crashes when trying to log to slow_log table in
streaming replication mode
Other things:
- Changed the name of wsrep_after_row() (two arguments) to
wsrep_after_row_internal() (one argument) to not depend on a
function signature with unused arguments.
MDEV-22531 Remove maria::implicit_commit()
MDEV-22607 Assertion `ha_info->ht() != binlog_hton' failed in
MYSQL_BIN_LOG::unlog_xa_prepare
From the handler point of view, Aria now looks like a transactional
engine. One effect of this is that we don't need to call
maria::implicit_commit() anymore.
This change also forces the server to call trans_commit_stmt() after doing
any reads or writes to system tables. This work will also make it easier
to later allow users to have system tables in engines other than Aria.
To handle the case that Aria doesn't support rollback, a new
handlerton flag, HTON_NO_ROLLBACK, was added to engines that have
transactions without rollback (for the moment only binlog and Aria).
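A minimal sketch (stand-in declarations) of what the HTON_NO_ROLLBACK change
amounts to: the engine sets the flag once, and the old comparison against
binlog_hton becomes a capability test:

  struct fake_handlerton { unsigned int flags; };
  static const unsigned int HTON_NO_ROLLBACK= 1U << 0;       // stand-in bit

  static void register_no_rollback_engine(fake_handlerton *hton)
  {
    hton->flags|= HTON_NO_ROLLBACK;    // e.g. Aria and the binlog "engine"
  }

  // old test:  ht == binlog_hton
  // new test:  the capability flag, which also covers Aria
  static bool cannot_rollback(const fake_handlerton *ht)
  {
    return (ht->flags & HTON_NO_ROLLBACK) != 0;
  }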
Other things
- Moved freeing of MARIA_SHARE to a separate function, as the MARIA_SHARE
can still be part of a transaction even if the table has been closed.
- Changed Aria checkpoint to use the new MARIA_SHARE free function. This
fixes a possible memory leak when using S3 tables
- Changed testing of binlog_hton to instead test for HTON_NO_ROLLBACK
- Removed checking of has_transaction_manager() in handler.cc, as we can
assume that, since the transaction was started by the engine, it does
support transactions.
- Added a new class 'start_new_trans' that can be used to start independent
sub-transactions, for example while reading mysql.proc, using help or
status tables etc.
- open_system_tables...() and open_proc_table_for_read() no longer take
an Open_tables_backup list. This is now handled by 'start_new_trans'.
- Split thd::has_transactions() to thd::has_transactions() and
thd::has_transactions_and_rollback()
- Added handlerton code to free cached transactions objects.
Needed by InnoDB.
All changes (except one) are of the type
thd->transaction. -> thd->transaction->
thd->transaction points by default to 'thd->default_transaction'.
This allows us to 'easily' have multiple active transactions for a
THD object, like when reading data from the mysql.proc table.
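A simplified model of this indirection (all names below are stand-ins, in the
spirit of the 'start_new_trans' class from the previous commit): THD holds a
default transaction object and a pointer that normally refers to it, so an
independent transaction can be swapped in temporarily:

  struct fake_transaction_state { /* stmt + all state, simplified */ };

  struct fake_THD
  {
    fake_transaction_state default_transaction;
    fake_transaction_state *transaction= &default_transaction;
  };

  // RAII helper in the spirit of 'start_new_trans'
  class scoped_new_trans
  {
    fake_THD *thd;
    fake_transaction_state *saved;
    fake_transaction_state independent;
  public:
    scoped_new_trans(fake_THD *thd_arg)
      : thd(thd_arg), saved(thd_arg->transaction)
    {
      thd->transaction= &independent;  // all thd->transaction-> accesses now
    }                                  // hit the independent transaction
    ~scoped_new_trans() { thd->transaction= saved; }
  };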
MDEV-22468 BACKUP STAGE BLOCK_COMMIT should block commit in the Aria engine
This is needed to ensure that mariabackup works properly with Aria tables
This code adds new calls to ha_maria::implicit_commit(). These will be
deleted by MDEV-22531 (Remove maria::implicit_commit()).
Rowid Filter check is just like Index Condition Pushdown check: before
we check the filter, we must check if we have walked out of the range
we are scanning. (If we did, we should return, and not continue the scan).
Consequences of this:
- Rowid filtering doesn't work for keys that have partially-covered
blob columns (just like Index Condition Pushdown)
- The rowid filter function has three return values: CHECK_POS (passed),
CHECK_NEG (filtered out), and CHECK_OUT_OF_RANGE.
All of the above is implemented in this patch
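An illustrative sketch of the three-way result (only the CHECK_* names come
from the patch; the Cursor interface is hypothetical): the range test is done
before the filter test, and each outcome is handled differently in the scan
loop:

  enum check_result { CHECK_POS, CHECK_NEG, CHECK_OUT_OF_RANGE };

  // Hypothetical per-row check: first verify we are still inside the
  // scanned range, only then consult the rowid filter.
  template <class Cursor>
  check_result check_rowid_filter(Cursor &cur)
  {
    if (cur.key_past_end_of_range())   // same rule as Index Condition
      return CHECK_OUT_OF_RANGE;       // Pushdown: stop the scan
    return cur.rowid_matches_filter() ? CHECK_POS : CHECK_NEG;
  }

  template <class Cursor>
  void scan_with_rowid_filter(Cursor &cur)
  {
    while (cur.fetch_next_key())
    {
      switch (check_rowid_filter(cur))
      {
      case CHECK_OUT_OF_RANGE: return;       // walked out of the range
      case CHECK_NEG:          continue;     // filtered out, skip the row
      case CHECK_POS:          cur.return_row(); break;
      }
    }
  }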
MDEV-22088 S3 partitioning support
All ALTER PARTITION commands should now work on S3 tables except
REBUILD PARTITION
TRUNCATE PARTITION
REORGANIZE PARTITION
In addition, partitioned S3 tables can also be replicated.
This is achieved by storing the partitioned table's .frm and .par files
on S3 for partitioned shared (S3) tables.
The discovery methods are enhanced by allowing engines that support
discovery to also discover the partitioned table's .frm and .par files.
Things in more detail
- The .frm and .par files of partitioned tables are stored in S3 and kept
in sync.
- Added the hton callback create_partitioning_metadata to inform the handler
that metadata for a partitioned file has changed.
- Added back handler::discover_check_version() to be able to check if
a table's or a part table's definition has changed.
- Added handler::check_if_updates_are_ignored(). Needed for partitioning.
- Renamed rebind() -> rebind_psi(), as it was before.
- Changed the CHF_xxx handler flags to an enum.
- Changed some checks from using table->file->ht to use
table->file->partition_ht() to get discovery to work with partitioning.
- If TABLE_SHARE::init_from_binary_frm_image() fails, ensure that we
don't leave any .frm or .par files around.
- Fixed that writefrm() doesn't leave unusable .frm files around
- Appended the extension to the path for writefrm() to be able to reuse the
function for creating .par files.
- Added DBUG_PUSH("") to a few functions that caused a lot of non-critical
tracing.
If a transaction had no effect due to INSERT IGNORE and a new
transaction was started with START TRANSACTION without committing
the previous one, the server crashed on an assertion when starting
a new wsrep transaction.
As a fix, the condition for doing wsrep_commit_empty() at the end of
ha_commit_trans() was refined.
In main.index_merge_myisam we remove the test that was added in
commit a2d24def8c because
it duplicates the test case that was added in
commit 5af12e4635.
`inited == NONE` at initialization time does not always mean
that it'll be `NONE` later, at execution time. Use more complex,
caller-specific logic to decide whether to create a cloned lookup handler.
Besides LOAD (as in the original bug report) make sure that all
prepare_for_insert() invocations are covered by tests. Add tests for
CREATE ... SELECT, multi-UPDATE, and multi-DELETE.
Don't enable write cache with long uniques.
was restored.
Optionally roll back prepared XAs on "mariabackup --prepare".
The fix MUST NOT be ported to 10.5+, as the MDEV-742 fix solves the issue
for slaves.
That is, don't call alloc_lookup_buffer() and create_lookup_handler()
for every row.
Also, don't call ha_check_overlaps() for every partition
after it was already done on the ha_partition level.
* The overlaps check is implemented on a handler level per row command.
It creates a separate cursor (actually, another handler instance) and
caches it inside the original handler, when ha_update_row or
ha_insert_row is issued. Cursor closes on unlocking the handler.
* Having the same key in the index means a unique constraint violation
even in the usual sense. So we fetch the left and right neighbours and check
that they have the same key prefix, excluding only the period part from the
key. If it doesn't match, then there's no such neighbour and the check passes.
Otherwise, we check if this neighbour intersects with the considered key.
* The check does not introduce a new error and fails with the ER_DUPP_KEY
error. This might break the REPLACE workflow and should be fixed separately.
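A simplified, self-contained model of the neighbour test described above (the
key layout is a stand-in: one scalar key part plus a half-open [start, end)
period):

  #include <cstdint>

  struct period_key
  {
    int64_t key_part;                   // everything except the period
    int64_t period_start, period_end;   // half-open [start, end)
  };

  static bool same_prefix(const period_key &a, const period_key &b)
  {
    return a.key_part == b.key_part;    // compare the key without the period
  }

  static bool periods_intersect(const period_key &a, const period_key &b)
  {
    return a.period_start < b.period_end && b.period_start < a.period_end;
  }

  // true -> the real check would raise the duplicate-key error
  static bool overlaps_with_neighbour(const period_key &new_key,
                                      const period_key *neighbour)
  {
    if (!neighbour || !same_prefix(new_key, *neighbour))
      return false;                     // no such neighbour: check passes
    return periods_intersect(new_key, *neighbour);
  }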
* rename to a generic name
* move remaining initializations from query exec to prepare time
* simplify/unify key handling in open_table_from_share and delayed
* remove dead code
* move tests where they belong
- multi_range_read_info_const now uses the new records_in_range interface
- Added handler::avg_io_cost()
- Don't calculate avg_io_cost() in get_sweep_read_cost if avg_io_cost is
not 1.0. In this case we trust the avg_io_cost() from the handler.
- Changed test_quick_select to use TIME_FOR_COMPARE instead of
TIME_FOR_COMPARE_IDX to align this with the rest of the code.
- Fixed bug when using test_if_cheaper_ordering where we didn't use
keyread if index was changed
- Fixed a bug where we didn't use index only read when using order-by-index
- Added keyread_time() to HEAP.
The default keyread_time() was optimized for blocks and not suitable for
HEAP. The effect was that HEAP preferred table scans over ranges for btree
indexes.
- Fixed get_sweep_read_cost() for HEAP tables
- Ensure that range and ref have same cost for simple ranges
Added a small cost (MULTI_RANGE_READ_SETUP_COST) to ranges to ensure
we favour ref over range for simple queries.
- Fixed that matching_candidates_in_table() uses same number of records
as the rest of the optimizer
- Added avg_io_cost() to JT_EQ_REF cost. This helps calculate the cost for
HEAP and temporary tables better. A few tests changed because of this.
- heap::read_time() and heap::keyread_time() adjusted to not add +1.
This was to ensure that handler::keyread_time() doesn't give
higher cost for heap tables than for normal tables. One effect of
this is that heap and derived tables stored in heap will prefer
key access as this is now regarded as cheap.
- Changed cost for index read in sql_select.cc to match
multi_range_read_info_const(). All index cost calculation is now
done through one function.
- 'ref' will now use quick_cost for keys if it exists. This is done
so that for '=' ranges, 'ref' is preferred over 'range'.
- scan_time() now takes avg_io_cost() into account
- get_delayed_table_estimates() uses block_size and avg_io_cost()
- Removed default argument to test_if_order_by_key(); simplifies code
This was done to both simplify the code and also to be easier to handle
storage engines that are clustered on some other index than the primary
key.
As pk_is_clustering_key() and is_clustering_key() now use only
index_flags, these were removed from all storage engines.
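As a toy illustration of the ref-versus-range tie-breaking (the constant's
value and the helper below are made up for this sketch): both access methods
get essentially the same per-row cost, and the small setup cost added to
ranges makes ref win for simple '=' lookups:

  static const double MULTI_RANGE_READ_SETUP_COST= 0.001;   // illustrative

  struct access_costs { double ref_cost, range_cost; };

  static access_costs simple_eq_costs(double rows, double cost_per_row)
  {
    access_costs c;
    c.ref_cost=   rows * cost_per_row;
    c.range_cost= rows * cost_per_row + MULTI_RANGE_READ_SETUP_COST;
    return c;                           // ref wins the tie for '=' ranges
  }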
MDEV-21605 Clean up and speed up interfaces for binary row logging
MDEV-21617 Bug fix for previous version of this code
The intention is to have as few 'if's as possible in ha_write() and
related functions. This is done by pre-calculating once per statement the
row_logging state for all tables.
Benefits are simpler and faster code both when binary logging is disabled
and when it's enabled.
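The shape of the change, as a hedged sketch with stand-in types: the decision
is made once per statement and cached in the handler, so the per-row path
only tests a boolean:

  struct fake_handler
  {
    bool is_temporary_or_system= false;
    bool row_logging= false;           // cached per-statement decision
  };

  // Done once per statement for every table it touches
  // (cf. THD::binlog_prepare_for_row_logging()).
  static void prepare_for_row_logging(fake_handler *file,
                                      bool binlog_open, bool row_format)
  {
    file->row_logging= binlog_open && row_format &&
                       !file->is_temporary_or_system;
  }

  // Per-row fast path in ha_write_row() and friends: one cheap test
  // instead of a chain of 'if's.
  static bool must_binlog_row(const fake_handler *file)
  {
    return file->row_logging;
  }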
Changes:
- Added handler->row_logging to make it easy to check if a table should be
row logged. This also made it easier to disable row logging for system,
internal and temporary tables.
- The tables' row_logging capabilities are checked once per "statement
that updates tables" in THD::binlog_prepare_for_row_logging(), which
is called when needed from THD::decide_logging_format().
- Removed most usage of tmp_disable_binlog(), reenable_binlog() and
temporary saving and setting of thd->variables.option_bits.
- Moved checks that can't change during a statement from
check_table_binlog_row_based() to check_table_binlog_row_based_internal()
- Removed flag row_already_logged (used by sequence engine)
- Moved binlog_log_row() into the handler class.
- Moved write_locked_table_maps() to THD::binlog_write_table_maps() as
most other related binlog functions are in THD.
- Removed binlog_write_table_map() and binlog_log_row_internal() as
they are now obsolete as 'has_transactions()' is pre-calculated in
prepare_for_row_logging().
- Remove 'is_transactional' argument from binlog_write_table_map() as this
can now be read from handler.
- Changed order of 'if's in handler::external_lock() and wsrep_mysqld.h
to first evaluate fast and likely cases before more complex ones.
- Added error checking in ha_write_row() and related functions if
binlog_log_row() failed.
- Don't clear check_table_binlog_row_based_result in
clear_cached_table_binlog_row_based_flag() as it's not needed.
- THD::clear_binlog_table_maps() has been replaced with
THD::reset_binlog_for_next_statement()
- Added 'MYSQL_OPEN_IGNORE_LOGGING_FORMAT' flag to open_and_lock_tables()
to avoid calculating of binary log format for internal opens. This flag
is also used to avoid reading statistics tables for internal tables.
- Added OPTION_BINLOG_LOG_OFF as a simple way to turn off the binlog
temporarily for CREATE (instead of using THD::sql_log_bin_off).
- Removed the flag THD::sql_log_bin_off (not needed anymore).
- Speed up THD::decide_logging_format() by remembering if blackhole engine
is used and avoid a loop over all tables if it's not used
(the common case).
- THD::decide_logging_format() is not called anymore if no tables are used
for the statement. This will speed up pure stored procedure code with
about 5%+ according to some simple tests.
- We now get annotated events on slave if a CREATE ... SELECT statement
is transformed on the slave from statement to row logging.
- In the original code, the master could come into a state where row
logging was enforced for all future events even if statement logging
could be used. This is now partly fixed.
Other changes:
- Ensure that all tables used by a statement have query_id set.
- Had to restore the row_logging flag for unused tables in
THD::binlog_write_table_maps (not a normal scenario).
- Removed injector::transaction::use_table(server_id_type sid, table tbl)
as it's not used.
- Cleaned up set_slave_thread_options()
- Some more DBUG_ENTER/DBUG_RETURN, code comments and minor indentation
changes.
- Ensure we only call THD::decide_logging_format_low() once in
mysql_insert() (fixes an inefficiency).
- Don't annotate INSERT DELAYED
- Removed zeroing pos_in_table_list in THD::open_temporary_table() as it's
already 0
MDEV-21606 Improve update handler (long unique keys on blobs)
MDEV-21470 MyISAM and Aria start_bulk_insert doesn't work with long unique
MDEV-21606 Bug fix for previous version of this code
MDEV-21819 2 Assertion `inited == NONE || update_handler != this'
- Move update_handler from TABLE to handler
- Move out initialization of update handler from ha_write_row() to
prepare_for_insert()
- Fixed that INSERT DELAYED works with update handler
- Give an error if using long unique with an autoincrement column
- Added handler function to check if table has long unique hash indexes
- Disable the write cache in MyISAM and Aria when using update_handler, as
if the cache is used, the row will not be inserted until the end of the
statement and update_handler would not find conflicting rows.
- Removed not used handler argument from
check_duplicate_long_entries_update()
- Syntax cleanups
- Indentation fixes
- Don't use single-character identifiers for arguments
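A rough standalone sketch of the new ordering (stand-in members and a
hypothetical clone helper): the lookup handler is created up front in
prepare_for_insert(), and the write cache stays disabled when long unique
hash indexes exist, so conflicting rows remain visible to the duplicate check:

  struct fake_handler
  {
    bool has_long_unique= false;       // table has long unique hash indexes
    fake_handler *update_handler= nullptr;
    bool write_cache_enabled= false;

    // hypothetical helper standing in for cloning the lookup handler
    fake_handler *clone_for_lookup() { return new fake_handler(*this); }

    // Done once, before the first row (cf. prepare_for_insert()),
    // instead of lazily inside ha_write_row().
    void prepare_for_insert_sketch()
    {
      if (has_long_unique)
      {
        update_handler= clone_for_lookup();
        write_cache_enabled= false;    // a cached row would escape the
      }                                // duplicate check until stmt end
      else
        write_cache_enabled= true;
    }
  };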
MDEV-19964 S3 replication support
Added new configure options:
s3_slave_ignore_updates
"If the slave has shares same S3 storage as the master"
s3_replicate_alter_as_create_select
"When converting S3 table to local table, log all rows in binary log"
This allows one to configure slaves to have their S3 storage shared with
or independent from the master.
Other things:
Added a new session variable '@@sql_if_exists' to force IF EXISTS on DDLs.
Lifted the long-standing limitation of XA that rolled the transaction back
at its connection's close even if the XA was prepared.
A prepared XA transaction is now made to survive connection close or server
restart.
The patch consists of
- a binary logging extension to write the prepared XA part of a
transaction, signified with its XID, in a new XA_prepare_log_event.
The conclusion part - with the Commit or Rollback decision - is logged
separately as a Query_log_event.
That is, in the binlog the XA consists of two separate groups of
events.
That means a whole XA may interleave in the binlog with
other XAs or regular transactions, but with no harm to
replication and data consistency.
Gtid_log_event receives two more flags to identify which of the
two XA phases of the transaction it represents. With either flag
set, XID info is also added to the event.
When binlog is ON on the server, XID::formatID is
constrained to 4 bytes.
- engines are made aware of the server policy to keep user-prepared
XAs, so they (InnoDB, RocksDB) no longer roll them back
in their disconnect methods.
- the slave applier is refined to cope with two-phase logged XAs,
including parallel modes of execution.
This patch does not address crash-safe logging of the new events which
is being addressed by MDEV-21469.
CORNER CASES: read-only, pure MyISAM, binlog-*, @@skip_log_bin, etc.
These are addressed with the following policies.
1. The read-only at reconnect marks XID to fail for future
completion with ER_XA_RBROLLBACK.
2. A binlog-* filtered XA that changes engine data is regarded as
loggable even when nothing got cached for the binlog. An empty
XA-prepare group is recorded. The subsequent Commit-or-Rollback
succeeds in the engine(s) as well as being recorded into the binlog.
3. The same applies to the non-transactional engine XA.
4. @@skip_log_bin=OFF does not record anything at XA-prepare
(obviously), but the completion event is recorded into the binlog to
admit inconsistency with the slave.
The following actions are taken by the patch.
At XA-prepare:
when empty binlog cache - don't do anything to binlog if RO,
otherwise write empty XA_prepare (assert(binlog-filter case)).
At Disconnect:
when Prepared && RO (=> no binlogging was done)
set Xid_cache_element::error := ER_XA_RBROLLBACK
*keep* XID in the cache, and rollback the transaction.
At XA-"complete":
Discover the error; if there is one, don't binlog the "complete" and
return the error to the user.
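The disconnect rule above, restated as a small sketch (the types and the
error constant are stand-ins for Xid_cache_element and ER_XA_RBROLLBACK): the
engine transaction is rolled back, but the XID stays in the cache carrying
the error for a later XA COMMIT/ROLLBACK:

  enum { STANDIN_ER_XA_RBROLLBACK= 1 };   // stands in for ER_XA_RBROLLBACK

  struct fake_xid_cache_element
  {
    bool prepared= false;
    bool binlogged= false;    // false when RO / nothing written to binlog
    int  error= 0;            // 0 or STANDIN_ER_XA_RBROLLBACK
  };

  // Sketch of the policy at client disconnect.
  static void disconnect_policy(fake_xid_cache_element *xid,
                                void (*engine_rollback)())
  {
    if (xid->prepared && !xid->binlogged)   // the "Prepared && RO" case
    {
      xid->error= STANDIN_ER_XA_RBROLLBACK; // a later XA "complete" will see
      engine_rollback();                    // this error and skip binlogging
      // note: the XID itself is kept in the cache
    }
  }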
Kudos
-----
Alexey Botchkov took to drive this work initially.
Sergei Golubchik, Sergei Petrunja, Marko Mäkelä provided a number of
good recommendations.
Sergei Voitovich made a magnificent review and improvements to the code.
They all deserve a bunch of thanks for getting this work done!