mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-31 02:51:44 +01:00

Author	SHA1	Message	Date
Marko Mäkelä	c425d93b92	Merge 10.2 into 10.3 except commit `1288dfffe7`	2021-04-24 10:37:21 +03:00
Marko Mäkelä	25ed665a20	MDEV-25459 MVCC read from index on CHAR or VARCHAR wrongly omits rows row_sel_sec_rec_is_for_clust_rec(): If the field in the clustered index record stored off page, always fetch it, also when the secondary index field has been built on the entire column. This was broken ever since the InnoDB Plugin for MySQL Server 5.1 introduced ROW_FORMAT=DYNAMIC and ROW_FORMAT=COMPRESSED for InnoDB tables. That code was first introduced in this tree in commit `3945d5e554`. For the original ROW_FORMAT=REDUNDANT and the MySQL 5.0.3 ROW_FORMAT=COMPRESSED, there was no problem, because for those tables we always stored at least a 768-byte prefix of each column in the clustered index record. row_sel_sec_rec_is_for_blob(): Allow prefix_len==0 for matching the full column.	2021-04-24 09:26:49 +03:00
Marko Mäkelä	b8c8692fd9	MDEV-24620 ASAN heap-buffer-overflow in btr_pcur_restore_position() Between btr_pcur_store_position() and btr_pcur_restore_position() it is possible that purge empties a table and enlarges index->n_core_fields and index->n_core_null_bytes. Therefore, we must cache index->n_core_fields in btr_pcur_t::old_n_core_fields so that btr_pcur_t::old_rec can be parsed correctly. Unfortunately, this is a huge change, because we will replace "bool leaf" parameters with "ulint n_core" (passing index->n_core_fields, or 0 for non-leaf pages). For special cases where we know that index->is_instant() cannot hold, we may also pass index->n_fields.	2021-04-13 10:28:13 +03:00
Marko Mäkelä	bcd160753c	Merge 10.2 into 10.3	2021-03-05 10:06:42 +02:00
Marko Mäkelä	7759991a06	fixup `58b56f14a0`: Remove dead code row_prebuilt_t::m_no_prefetch: Remove (it was always false). row_prebuilt_t::m_read_virtual_key: Remove (it was always false). Only ha_innopart ever set these fields.	2021-03-04 18:11:25 +02:00
Marko Mäkelä	e3d692aa09	Merge 10.2 into 10.3	2020-10-22 08:26:28 +03:00
Marko Mäkelä	620ea816ad	Merge 10.1 into 10.2	2020-10-21 14:02:04 +03:00
Sergei Petrunia	3e807d255e	MDEV-23938: innodb row_search_idx_cond_check handle ICP_ABORTED_BY_USER - row_search_mvcc() should return DB_INTERRUPTED when it got killed. - Add a syncpoint for the ICP check. - Add test coverage for killed-during-ICP-check scenario Backport of MDEV-22761 fixes for ICP from 10.4 commits: * `a6f956488c` * `c03885cd9c` XtraDB was fixed in `deb3b9a174` Reviewer: Daniel Black	2020-10-16 09:44:03 +11:00
Marko Mäkelä	7e07e38cf6	Merge 10.2 into 10.3	2020-09-09 13:06:46 +03:00
Marko Mäkelä	040ae4c59b	MDEV-22924 fixup: Replace C++11 auto	2020-09-09 13:02:25 +03:00
Marko Mäkelä	d44c0f46c5	MDEV-22924 fixup: Replace C++11 nullptr Only starting with MariaDB Server 10.4 we may depend on C++11.	2020-09-09 12:26:51 +03:00
Marko Mäkelä	f99cace77f	MDEV-22924 Corruption in MVCC read via secondary index An unsafe optimization was introduced by commit `2347ffd843` (MDEV-20301) which is based on mysql/mysql-server@3f3136188f or mysql/mysql-server@647a3814a9 in MySQL 8.0.12 or MySQL 8.0.13 (which in turn is based on the contribution in MySQL Bug #84958). Row_sel_get_clust_rec_for_mysql::operator(): In addition to checking that the pointer to the record matches, also check the latest modification of the page (FIL_PAGE_LSN) as well as the page identifier. Only if all three match, it is safe to reuse cached_old_vers. Row_sel_get_clust_rec_for_mysql::check_eq(): Assert that the PRIMARY KEY of the cached old version of the record corresponds to the latest version. We got a test case where CHECK TABLE, UPDATE and purge would be hammering on the same table (with only 6 rows) and a pointer that was originally pointing to a record pk=2 would match a cached_clust_rec that was pointing to a record pk=1. In the diagnosed `rr replay` trace, we would wrongly return an old cached version of the pk=1 record, instead of retrieving the correct version of the pk=2 record. Because of this, CHECK TABLE would fail to count one of the records in a secondary index, and report failure. This bug appears to affect MVCC reads via secondary indexes only. The purge of history in secondary indexes uses a different code path, and so do checks for implicit record locks.	2020-09-07 15:31:54 +03:00
Marko Mäkelä	c3752cef3c	Merge 10.2 into 10.3	2020-09-03 09:26:54 +03:00
Marko Mäkelä	4d51ca6386	Merge 10.1 into 10.2 This also fixes MDEV-20464.	2020-09-01 16:20:23 +03:00
Marko Mäkelä	94e9dc95d4	MDEV-23600 Division by 0 in row_search_with_covering_prefix The InnoDB index fields store bytes, not characters. Remove some unnecessary conversions from characters to bytes. This also fixes MDEV-20422 and the wrong-result bug MDEV-12486.	2020-09-01 15:52:36 +03:00
Nikita Malyavin	97db6c15ea	MDEV-20618 Assertion failed in row_upd_sec_index_entry Add a proper error handling of innobase_get_computed_value results in row_upd_store_row/row_upd_store_v_row. Also add an assertion in row_vers_build_clust_v_col to fail during row purge. Add one more assertion in row_sel_sec_rec_is_for_clust_rec for possible future catches.	2020-09-01 18:27:09 +10:00
Nikita Malyavin	a3d66090c7	MDEV-18366 Crash on SELECT on a table with indexed virtual columns The problem was in improper error handling behavior in `row_upd_build_difference_binary`: `innobase_free_row_for_vcol` wasn't called. To eliminate this problem in all potential places, a refactoring has been made: * class ib_vcol_row is added. It owns VCOL_STORAGE and heap and maintains it in RAII manner * all innobase_allocate_row_for_vcol/innobase_free_row_for_vcol pairs are substituted with ib_vcol_row usage * row_merge_buf_add is only left untouched because it doesn't own vheap passed as an argument * innobase_allocate_row_for_vcol does not allocate VCOL_STORAGE anymore and accepts it as an argument -- this reduces a number of memory allocations * move rec_printer out of `#ifndef DBUG_OFF` and mark it cold	2020-09-01 18:27:09 +10:00
Marko Mäkelä	b6ec1e8bbf	MDEV-20377 post-fix: Introduce MEM_MAKE_ADDRESSABLE In AddressSanitizer, we only want memory poisoning to happen in connection with custom memory allocation or freeing. The primary use of MEM_UNDEFINED is for declaring memory uninitialized in Valgrind or MemorySanitizer. We do not want MEM_UNDEFINED to have the unwanted side effect that AddressSanitizer would no longer be able to complain about accessing unallocated memory. MEM_UNDEFINED(): Define as no-op for AddressSanitizer. MEM_MAKE_ADDRESSABLE(): Define as MEM_UNDEFINED() or ASAN_UNPOISON_MEMORY_REGION(). MEM_CHECK_ADDRESSABLE(): Wrap also __asan_region_is_poisoned().	2020-07-02 17:59:28 +03:00
Monty	65f831d17c	Fixed bugs found by valgrind - Some of the bug fixes are backports from 10.5! - The fix in innobase/fil/fil0fil.cc is just a backport to get less error messages in mysqld.1.err when running with valgrind. - Renamed HAVE_valgrind_or_MSAN to HAVE_valgrind	2020-07-02 17:57:34 +03:00
Marko Mäkelä	1df1a63924	Merge 10.2 into 10.3	2020-07-02 06:17:51 +03:00
Marko Mäkelä	c36834c832	MDEV-20377: Make WITH_MSAN more usable MemorySanitizer (clang -fsanitize=memory) requires that all code be compiled with instrumentation enabled. The only exception is the C runtime library. Failure to use instrumented libraries will cause bogus messages about memory being uninitialized. In WITH_MSAN builds, we must avoid calling getservbyname(), because even though it is a standard library function, it is not instrumented, not even in clang 10. Note: Before MariaDB Server 10.5, ./mtr will typically fail due to the old PCRE library, which was updated in MDEV-14024. The following cmake options were tested on 10.5 in commit `94d0bb4dbe`: cmake \ -DCMAKE_C_FLAGS='-march=native -O2' \ -DCMAKE_CXX_FLAGS='-stdlib=libc++ -march=native -O2' \ -DWITH_EMBEDDED_SERVER=OFF -DWITH_UNIT_TESTS=OFF -DCMAKE_BUILD_TYPE=Debug \ -DWITH_INNODB_{BZIP2,LZ4,LZMA,LZO,SNAPPY}=OFF \ -DPLUGIN_{ARCHIVE,TOKUDB,MROONGA,OQGRAPH,ROCKSDB,CONNECT,SPIDER}=NO \ -DWITH_SAFEMALLOC=OFF \ -DWITH_{ZLIB,SSL,PCRE}=bundled \ -DHAVE_LIBAIO_H=0 \ -DWITH_MSAN=ON MEM_MAKE_DEFINED(): An alias for VALGRIND_MAKE_MEM_DEFINED() and __msan_unpoison(). MEM_GET_VBITS(), MEM_SET_VBITS(): Aliases for VALGRIND_GET_VBITS(), VALGRIND_SET_VBITS(), __msan_copy_shadow(). InnoDB: Replace the UNIV_MEM_ macros with corresponding MEM_ macros. ut_crc32_8_hw(), ut_crc32_64_low_hw(): Use the compiler built-in functions instead of inline assembler when building WITH_MSAN. This will require at least -msse4.2 when building for IA-32 or AMD64. The inline assembler would not be instrumented, and would thus cause bogus failures.	2020-07-01 17:23:00 +03:00
Marko Mäkelä	2e9f4cdc44	MDEV-21936 Assertion !btr_search_own... in btr_search_drop_page_hash_index This is a regression due to the cleanup commit `12f804acfa`. row_sel_open_pcur(): Remove the unnecessary parameter. It suffices for us to acquire the adaptive hash index latch only when btr_search_guess_on_hash() is called by btr_cur_search_to_nth_level_func(), in btr_pcur_open_with_no_init(). This code seems to be a relic from the times when there was only one btr_search_latch, which was held in shared mode for longer periods of time. Another relic of that era was removed in commit `e5980bf1b1`. This clean-up was missed when the btr_search_latch was split in mysql/mysql-server/commit@ab17ab91ce18a47bb6c5c49e4dc0505ad488a448 (MySQL 5.7.8).	2020-05-19 15:43:35 +03:00
Marko Mäkelä	3d0bb2b7f1	Merge 10.2 into 10.3	2020-05-15 19:11:57 +03:00
Marko Mäkelä	ad6171b91c	MDEV-22456 Dropping the adaptive hash index may cause DDL to lock up InnoDB If the InnoDB buffer pool contains many pages for a table or index that is being dropped or rebuilt, and if many of such pages are pointed to by the adaptive hash index, dropping the adaptive hash index may consume a lot of time. The time-consuming operation of dropping the adaptive hash index entries is being executed while the InnoDB data dictionary cache dict_sys is exclusively locked. It is not actually necessary to drop all adaptive hash index entries at the time a table or index is being dropped or rebuilt. We can let the LRU replacement policy of the buffer pool take care of this gradually. For this to work, we must detach the dict_table_t and dict_index_t objects from the main dict_sys cache, and once the last adaptive hash index entry for the detached table is removed (when the garbage page is evicted from the buffer pool) we can free the dict_table_t and dict_index_t object. Related to this, in MDEV-16283, we made ALTER TABLE...DISCARD TABLESPACE skip both the buffer pool eviction and the drop of the adaptive hash index. We shifted the burden to ALTER TABLE...IMPORT TABLESPACE or DROP TABLE. We can remove the eviction from DROP TABLE. We must retain the eviction in the ALTER TABLE...IMPORT TABLESPACE code path, so that in case the discarded table is being re-imported with the same tablespace identifier, the fresh data from the imported tablespace will replace any stale pages in the buffer pool. rpl.rpl_failed_drop_tbl_binlog: Remove the test. DROP TABLE can no longer be interrupted inside InnoDB. fseg_free_page(), fseg_free_step(), fseg_free_step_not_header(), fseg_free_page_low(), fseg_free_extent(): Remove the parameter that specifies whether the adaptive hash index should be dropped. btr_search_lazy_free(): Lazily free an index when the last reference to it is dropped from the adaptive hash index. buf_pool_clear_hash_index(): Declare static, and move to the same compilation unit with the bulk of the adaptive hash index code. dict_index_t::clone(), dict_index_t::clone_if_needed(): Clone an index that is being rebuilt while adaptive hash index entries exist. The original index will be inserted into dict_table_t::freed_indexes and dict_index_t::set_freed() will be called. dict_index_t::set_freed(), dict_index_t::freed(): Note that or check whether the index has been freed. We will use the impossible page number 1 to denote this condition. dict_index_t::n_ahi_pages(): Replaces btr_search_info_get_ref_count(). dict_index_t::detach_columns(): Move the assignment n_fields=0 to ha_innobase_inplace_ctx::clear_added_indexes(). We must have access to the columns when freeing the adaptive hash index. Note: dict_table_t::v_cols[] will remain valid. If virtual columns are dropped or added, the table definition will be reloaded in ha_innobase::commit_inplace_alter_table(). buf_page_mtr_lock(): Drop a stale adaptive hash index if needed. We will also reduce the number of btr_get_search_latch() calls and enclose some more code inside #ifdef BTR_CUR_HASH_ADAPT in order to benefit cmake -DWITH_INNODB_AHI=OFF.	2020-05-15 17:23:08 +03:00
Oleksandr Byelkin	7fb73ed143	Merge branch '10.2' into 10.3	2020-05-04 16:47:11 +02:00
Daniel Black	ba2061da52	MDEV-21595: innodb offset_t rename to rec_offs thanks to: perl -i -pe 's/\boffset_t\b/rec_offs/g' $(git grep -lw offset_t storage/innobase)	2020-04-29 12:02:47 +03:00
Marko Mäkelä	3466b47b0d	Merge 10.2 into 10.3	2019-12-13 10:08:57 +02:00
Eugene Kosov	f0aa073f2b	MDEV-20950 Reduce size of record offsets offset_t: this is a type which represents one record offset. It's unsigned short int. a lot of functions: replace ulint with offset_t btr_pcur_restore_position_func(), page_validate(), row_ins_scan_sec_index_for_duplicate(), row_upd_clust_rec_by_insert_inherit_func(), row_vers_impl_x_locked_low(), trx_undo_prev_version_build(): allocate record offsets on the stack instead of waiting for rec_get_offsets() to allocate it from mem_heap_t. So, reducing memory allocations. RECORD_OFFSET, INDEX_OFFSET: now it's less convenient to store pointers in offset_t* array. One pointer occupies now several offset_t. And those constants are start indexes into array to places where to store pointer values REC_OFFS_HEADER_SIZE: adjusted for the new reality REC_OFFS_NORMAL_SIZE: increase size from 100 to 300 which means less heap allocations. And sizeof(offset_t[REC_OFFS_NORMAL_SIZE]) now is 600 bytes which is smaller than previous 800 bytes. REC_OFFS_SEC_INDEX_SIZE: adjusted for the new reality rem0rec.h, rem0rec.ic, rem0rec.cc: various arguments, return values and local variables types were changed to fix numerous integer conversions issues. enum field_type_t: offset types concept was introduced which replaces old offset flags stuff. Like in earlier version, 2 upper bits are used to store offset type. And this enum represents those types. REC_OFFS_SQL_NULL, REC_OFFS_MASK: removed get_type(), set_type(), get_value(), combine(): these are convenience functions to work with offsets and its types rec_offs_base()[0]: still uses an old scheme with flags REC_OFFS_COMPACT and REC_OFFS_EXTERNAL rec_offs_base()[i]: these have type offset_t now. Two upper bits contain the type.	2019-12-13 00:26:50 +07:00
Faustin Lammler	2df2238cb8	Lintian complains on spelling error The lintian check complains on spelling error: https://salsa.debian.org/mariadb-team/mariadb-10.3/-/jobs/95739	2019-12-02 12:41:13 +02:00
Marko Mäkelä	dae1b3b04c	MDEV-15326: Backport trx_t::is_referenced() Backport the applicable part of Sergey Vojtovich's commit `0ca2ea1a65` from MariaDB Server 10.3. trx reference counter was updated under mutex and read without any protection. This is both slow and unsafe. Use atomic operations for reference counter accesses.	2019-09-04 09:42:38 +03:00
Marko Mäkelä	395e1dcd17	Merge 10.2 into 10.3	2019-08-16 10:02:18 +03:00
Marko Mäkelä	555af003e4	MDEV-8588/MDEV-19740: Restore a condition It looks like the merge of MySQL 5.7.9 to MariaDB 10.2.2 conflicted with earlier changes that were made in MDEV-8588. row_search_mvcc(): If the page is corrupted, avoid invoking btr_cur_store_position(). The caller should not try to fetch the next record after a hard error.	2019-08-16 09:54:33 +03:00
Marko Mäkelä	d50fe4021e	Merge 10.2 into 10.3	2019-08-15 15:59:32 +03:00
Marko Mäkelä	112589cded	MDEV-19740: Remove a bogus condition This triggered a "may be uninitialized" warning from GCC 9.2.1. The bogus-looking condition was added in `7e916bb86f`	2019-08-15 15:58:37 +03:00
Aleksey Midenkov	c23a5e0e5e	Merge 10.2 into 10.3	2019-08-14 19:16:08 +03:00
Aleksey Midenkov	2347ffd843	MDEV-20301 InnoDB's MVCC has O(N^2) behaviors If there're multiple row versions in InnoDB, reading one row from PK may have O(N) complexity and reading from secondary keys may have O(N^2) complexity. The problem occurs when there are many pending versions of the same row, meaning that the primary key is the same, but a secondary key is different. The slowdown occurs when the secondary index is traversed. This patch creates a helper class for the function row_sel_get_clust_rec_for_mysql() which can remember and re-use cached_clust_rec & cached_old_vers so that rec_get_offsets() does not need to be called over and over for the clustered record. Corrections by Kevin Lewis <kevin.lewis@oracle.com> MDEV-20341 Unstable innodb.innodb_bug14704286 Removed test that tested the ability of interrupting long query which is not long anymore.	2019-08-14 19:10:17 +03:00
Marko Mäkelä	90a9193685	Merge 10.2 into 10.3	2019-05-29 11:32:46 +03:00
Marko Mäkelä	6eefeb6fea	MDEV-19541: Avoid infinite loop of reading a corrupted page row_search_mvcc(): Duplicate the logic of btr_pcur_move_to_next() so that an infinite loop can be avoided when advancing to the next page fails due to a corrupted page.	2019-05-29 11:20:56 +03:00
Marko Mäkelä	74904a667e	Remove UT_NOT_USED btr_pcur_move_to_last_on_page(): Merge with the only caller.	2019-05-20 17:09:50 +03:00
Marko Mäkelä	be85d3e61b	Merge 10.2 into 10.3	2019-05-14 17:18:46 +03:00
Marko Mäkelä	26a14ee130	Merge 10.1 into 10.2	2019-05-13 17:54:04 +03:00
Vicențiu Ciorbaru	c0ac0b8860	Update FSF address	2019-05-11 19:25:02 +03:00
Marko Mäkelä	b6f4cccd19	Merge 10.2 into 10.3	2019-05-03 20:14:09 +03:00
Marko Mäkelä	ce195987c3	MDEV-19385: Inconsistent definition of dtuple_get_nth_v_field() The accessor dtuple_get_nth_v_field() was defined differently between debug and release builds in MySQL 5.7.8 in mysql/mysql-server@c47e1751b7 and a debug assertion to document or enforce the questionable assumption tuple->v_fields == &tuple->fields[tuple->n_fields] was missing. This was apparently no problem until MDEV-11369 introduced instant ADD COLUMN to MariaDB Server 10.3. With that work present, in one test case, trx_undo_report_insert_virtual() could in release builds fetch the wrong value for a virtual column. We replace many of the dtuple_t accessors with const-preserving inline functions, and fix missing or misleadingly applied const qualifiers accordingly.	2019-05-03 20:02:50 +03:00
Marko Mäkelä	d5a2bc6a0f	Merge 10.2 into 10.3	2019-04-04 19:41:12 +03:00
Marko Mäkelä	f602385776	Do not pass table_name_t to printf-like functions	2019-04-04 08:57:53 +03:00
Marko Mäkelä	d0116e10a5	Revert MDEV-18464 and MDEV-12009 This reverts commit `21b2fada7a` and commit `81d71ee6b2`. The MDEV-18464 change introduces a few data race issues. Contrary to the documentation, the field trx_t::victim is not always being protected by lock_sys_t::mutex and trx_t::mutex. Most importantly, it seems that KILL QUERY could wrongly avoid acquiring both mutexes when invoking lock_trx_handle_wait_low(), in case another thread had already set trx->victim=true. We also revert MDEV-12009, because it should depend on the MDEV-18464 fix being present.	2019-03-28 12:39:50 +02:00
Jan Lindström	21b2fada7a	MDEV-18464: Port kill_one_trx fixes from 10.4 to 10.1 Pushed the decision for innodb transaction and system locking down to lock0lock.cc level. With this, we can avoid releasing these mutexes for executions where these mutexes were acquired upfront. This patch will also fix BF aborting of native threads, e.g. threads which have declared wsrep_on=OFF. Earlier, we have used, for innodb trx locks, was_chosen_as_deadlock_victim flag, for marking innodb transactions, which are victims for wsrep BF abort. With native threads (wsrep_on==OFF), re-using was_chosen_as_deadlock_victim flag may lead to interference with real deadlock, and to deal with this, the patch has added new flag for marking wsrep BF aborts only: victim=true Similar way if replication decides to abort one of the threads we mark victim by: victim=true innobase_kill_query Remove lock sys and trx mutex handling. wsrep_innobase_kill_one_trx Mark victim trx with victim=true trx0trx.h Remove trx_abort_t type and abort type variable from trx struct. Add victim variable to trx. wsrep_kill_victim Remove abort_type lock_report_waiters_to_mysql Take also trx mutex and mark trx as a victim for replication abort. lock_trx_handle_wait_low New low level function to check whether the transaction has already been rolled back because it was selected as a deadlock victim, or if it has to wait then cancel the wait lock. lock_trx_handle_wait If transaction is not marked as victim take lock sys and trx mutex before calling lock_trx_handle_wait_low and release them after that. row_search_for_mysql Remove lock sys and trx mutex taking and releasing. trx_rollback_to_savepoint_for_mysql_low trx_commit_in_memory Clean up victim variable.	2019-03-28 07:40:03 +02:00
Oleksandr Byelkin	65c5ef9b49	dirty merge	2019-02-07 13:59:31 +01:00
Marko Mäkelä	625994b7cc	MDEV-16982 Server crashes in mem_heap_dup upon DELETE from table with virtual columns An uninitialized buffer is passed to row_sel_store_mysql_rec() but InnoDB may not initialize everything. Looks like it's ok in most cases but not always. The partially initialized buffer was later passed to ha_innobase::write_row() which reads random NULL bit values for virtual columns and random stuff happens. No test case for MariaDB 10.2 was found. The test case for MariaDB 10.3 involves partitioning, system versioning and the TRASH_ALLOC fill pattern 0xA5. Test case depends very much on the number and layout of columns. Think about 0xA5 byte for a NULL bit mask. row_sel_store_mysql_rec(): always initialize virtual columns NULL bit Closes #1144	2019-02-05 12:02:41 +02:00
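
Note on MDEV-20950 (commit f0aa073f2b): the message describes offset_t as an unsigned short whose two upper bits carry an offset type, with get_type(), set_type(), get_value() and combine() as helpers. A minimal standalone sketch of that kind of encoding is given here. The enumerator names and values, the TYPE_MASK constant, and the function signatures are assumptions for illustration and are not copied from rem0rec.h; only the 16-bit width, the two type bits, and the 300-element array size come from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// One record offset: an unsigned 16-bit value, as stated in the commit message.
typedef uint16_t offset_t;

// Hypothetical type tags kept in the two upper bits (names and values assumed).
enum field_type_t {
  IN_RECORD     = 0 << 14,
  OFFPAGE       = 1 << 14,
  SQL_NULL      = 2 << 14,
  DEFAULT_VALUE = 3 << 14
};

// Hypothetical mask selecting the two type bits.
static const offset_t TYPE_MASK = 3 << 14;

// REC_OFFS_NORMAL_SIZE is 300 according to the commit message.
static const unsigned REC_OFFS_NORMAL_SIZE = 300;
static_assert(sizeof(offset_t[REC_OFFS_NORMAL_SIZE]) == 600,
              "300 two-byte offsets occupy 600 bytes");

// Extract the type tag from the two upper bits.
inline field_type_t get_type(offset_t n)
{
  return static_cast<field_type_t>(n & TYPE_MASK);
}

// Extract the plain offset from the lower 14 bits.
inline offset_t get_value(offset_t n)
{
  return static_cast<offset_t>(n & ~TYPE_MASK);
}

// Pack a 14-bit offset and a type tag into one offset_t.
inline offset_t combine(offset_t value, field_type_t type)
{
  assert(get_value(value) == value);  // value must fit in the lower 14 bits
  return static_cast<offset_t>(value | type);
}

// Replace the type tag while keeping the offset value.
inline void set_type(offset_t &n, field_type_t type)
{
  n = combine(get_value(n), type);
}
```

With this layout an offset can be at most 16383, and changing the tag with set_type() leaves the stored offset untouched.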


208 commits