mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-16 12:02:42 +01:00

Author	SHA1	Message	Date
Sergei Golubchik	761d5c8987	MDEV-33092 Undefined reference to concurrency on Solaris remove thr_setconcurrency() followup for `8bbcaab160` Fix by Rainer Orth	2024-01-10 10:16:20 +01:00
Marko Mäkelä	a3dd7ea09f	Merge 10.4 into 10.5	2023-12-21 11:30:32 +02:00
Sergei Golubchik	98a39b0c91	Merge branch '10.4' into 10.5	2023-12-02 01:02:50 +01:00
Marko Mäkelä	f5fdb9cec5	MDEV-16660: Increase the DEFAULT_THREAD_STACK for ASAN To allow cmake -DWITH_ASAN=ON to work out of the box when using newer compilers, we must increase the default thread stack size. By design, AddressSanitizer will allocate some "sentinel" areas in stack frames so that it can better catch buffer overflows, by trapping access to memory addresses that reside between stack-allocated variables. Apparently, some parameters related to this have been changed recently, possibly to allow -fsanitize=address to catch more errors.	2023-11-17 14:12:48 +02:00
Sergei Golubchik	6b685ea7b0	correctness assert thd_get_ha_data() can be used without a lock, but only from the current thd thread, when calling from another thread it must be protected by thd->LOCK_thd_data * fix group commit code to take thd->LOCK_thd_data * remove innobase_close_connection() from the innodb background thread, it's not needed after `87775402cd` and was failing the assert with current_thd==0	2022-09-29 10:44:39 +02:00
Marko Mäkelä	a8ded39557	Merge 10.4 into 10.5	2021-10-28 08:48:36 +03:00
Marko Mäkelä	3a79e5fd31	Merge 10.3 into 10.4	2021-10-28 08:28:39 +03:00
Marko Mäkelä	657bcf928e	Merge 10.2 into 10.3	2021-10-28 07:50:05 +03:00
Oleksandr Byelkin	1f70e4b00c	pthread_yield() is deprecated now, so use sched_yield() if possible.	2021-10-26 15:05:13 +02:00
Marko Mäkelä	7b48da4d7e	Merge 10.4 into 10.5	2021-04-08 07:47:49 +03:00
Daniel Black	f69c1c9dcb	MDEV-19508: SI_KERNEL is not on all implementations SI_USER is; however, in FreeBSD there are a couple of non-kernel user signal information codes above SI_KERNEL. Put a fallback just in case there is nothing available.	2021-04-07 14:01:56 +10:00
Vladislav Vaintroub	031b3dfc22	MDEV-25123 support MSVC ASAN	2021-03-12 08:44:55 +01:00
Marko Mäkelä	7cffb5f6e8	MDEV-23399: Performance regression with write workloads The buffer pool refactoring in MDEV-15053 and MDEV-22871 shifted the performance bottleneck to the page flushing. The configuration parameters will be changed as follows: innodb_lru_flush_size=32 (new: how many pages to flush on LRU eviction) innodb_lru_scan_depth=1536 (old: 1024) innodb_max_dirty_pages_pct=90 (old: 75) innodb_max_dirty_pages_pct_lwm=75 (old: 0) Note: The parameter innodb_lru_scan_depth will only affect LRU eviction of buffer pool pages when a new page is being allocated. The page cleaner thread will no longer evict any pages. It used to guarantee that some pages will remain free in the buffer pool. Now, we perform that eviction 'on demand' in buf_LRU_get_free_block(). The parameter innodb_lru_scan_depth(srv_LRU_scan_depth) is used as follows: * When the buffer pool is being shrunk in buf_pool_t::withdraw_blocks() * As a buf_pool.free limit in buf_LRU_list_batch() for terminating the flushing that is initiated e.g., by buf_LRU_get_free_block() The parameter also used to serve as an initial limit for unzip_LRU eviction (evicting uncompressed page frames while retaining ROW_FORMAT=COMPRESSED pages), but now we will use a hard-coded limit of 100 or unlimited for invoking buf_LRU_scan_and_free_block(). The status variables will be changed as follows: innodb_buffer_pool_pages_flushed: This includes also the count of innodb_buffer_pool_pages_LRU_flushed and should work reliably, updated one by one in buf_flush_page() to give more real-time statistics. The function buf_flush_stats(), which we are removing, was not called in every code path. For both counters, we will use regular variables that are incremented in a critical section of buf_pool.mutex. Note that show_innodb_vars() directly links to the variables, and reads of the counters will not be protected by buf_pool.mutex, so you cannot get a consistent snapshot of both variables. 
The following INFORMATION_SCHEMA.INNODB_METRICS counters will be removed, because the page cleaner no longer deals with writing or evicting least recently used pages, and because the single-page writes have been removed: * buffer_LRU_batch_flush_avg_time_slot * buffer_LRU_batch_flush_avg_time_thread * buffer_LRU_batch_flush_avg_time_est * buffer_LRU_batch_flush_avg_pass * buffer_LRU_single_flush_scanned * buffer_LRU_single_flush_num_scan * buffer_LRU_single_flush_scanned_per_call When moving to a single buffer pool instance in MDEV-15058, we missed some opportunity to simplify the buf_flush_page_cleaner thread. It was unnecessarily using a mutex and some complex data structures, even though we always have a single page cleaner thread. Furthermore, the buf_flush_page_cleaner thread had separate 'recovery' and 'shutdown' modes where it was waiting to be triggered by some other thread, adding unnecessary latency and potential for hangs in relatively rarely executed startup or shutdown code. The page cleaner was also running two kinds of batches in an interleaved fashion: "LRU flush" (writing out some least recently used pages and evicting them on write completion) and the normal batches that aim to increase the MIN(oldest_modification) in the buffer pool, to help the log checkpoint advance. The buf_pool.flush_list flushing was being blocked by buf_block_t::lock for no good reason. Furthermore, if the FIL_PAGE_LSN of a page is ahead of log_sys.get_flushed_lsn(), that is, what has been persistently written to the redo log, we would trigger a log flush and then resume the page flushing. This would unnecessarily limit the performance of the page cleaner thread and trigger the infamous messages "InnoDB: page_cleaner: 1000ms intended loop took 4450ms. The settings might not be optimal" that were suppressed in commit `d1ab89037a` unless log_warnings>2. 
Our revised algorithm will make log_sys.get_flushed_lsn() advance at the start of buf_flush_lists(), and then execute a 'best effort' to write out all pages. The flush batches will skip pages that were modified since the log was written, or are currently exclusively locked. The MDEV-13670 message "page_cleaner: 1000ms intended loop took" will be removed, because by design, the buf_flush_page_cleaner() should not be blocked during a batch for extended periods of time. We will remove the single-page flushing altogether. Related to this, the debug parameter innodb_doublewrite_batch_size will be removed, because all of the doublewrite buffer will be used for flushing batches. If a page needs to be evicted from the buffer pool and all 100 least recently used pages in the buffer pool have unflushed changes, buf_LRU_get_free_block() will execute buf_flush_lists() to write out and evict innodb_lru_flush_size pages. At most one thread will execute buf_flush_lists() in buf_LRU_get_free_block(); other threads will wait for that LRU flushing batch to finish. To improve concurrency, we will replace the InnoDB ib_mutex_t and os_event_t native mutexes and condition variables in this area of code. Most notably, this means that the buffer pool mutex (buf_pool.mutex) is no longer instrumented via any InnoDB interfaces. It will continue to be instrumented via PERFORMANCE_SCHEMA. For now, both buf_pool.flush_list_mutex and buf_pool.mutex will be declared with MY_MUTEX_INIT_FAST (PTHREAD_MUTEX_ADAPTIVE_NP). The critical sections of buf_pool.flush_list_mutex should be shorter than those for buf_pool.mutex, because in the worst case, they cover a linear scan of buf_pool.flush_list, while the worst case of a critical section of buf_pool.mutex covers a linear scan of the potentially much longer buf_pool.LRU list. mysql_mutex_is_owner(), safe_mutex_is_owner(): New predicate, usable with SAFE_MUTEX. 
Some InnoDB debug assertions need this predicate instead of mysql_mutex_assert_owner() or mysql_mutex_assert_not_owner(). buf_pool_t::n_flush_LRU, buf_pool_t::n_flush_list: Replaces buf_pool_t::init_flush[] and buf_pool_t::n_flush[]. The number of active flush operations. buf_pool_t::mutex, buf_pool_t::flush_list_mutex: Use mysql_mutex_t instead of ib_mutex_t, to have native mutexes with PERFORMANCE_SCHEMA and SAFE_MUTEX instrumentation. buf_pool_t::done_flush_LRU: Condition variable for !n_flush_LRU. buf_pool_t::done_flush_list: Condition variable for !n_flush_list. buf_pool_t::do_flush_list: Condition variable to wake up the buf_flush_page_cleaner when a log checkpoint needs to be written or the server is being shut down. Replaces buf_flush_event. We will keep using timed waits (the page cleaner thread will wake _at least_ once per second), because the calculations for innodb_adaptive_flushing depend on fixed time intervals. buf_dblwr: Allocate statically, and move all code to member functions. Use a native mutex and condition variable. Remove code to deal with single-page flushing. buf_dblwr_check_block(): Make the check debug-only. We were spending a significant amount of execution time in page_simple_validate_new(). flush_counters_t::unzip_LRU_evicted: Remove. IORequest: Make more members const. FIXME: m_fil_node should be removed. buf_flush_sync_lsn: Protect by std::atomic, not page_cleaner.mutex (which we are removing). page_cleaner_slot_t, page_cleaner_t: Remove many redundant members. pc_request_flush_slot(): Replaces pc_request() and pc_flush_slot(). recv_writer_thread: Remove. Recovery works just fine without it, if we simply invoke buf_flush_sync() at the end of each batch in recv_sys_t::apply(). recv_recovery_from_checkpoint_finish(): Remove. We can simply call recv_sys.debug_free() directly. srv_started_redo: Replaces srv_start_state. SRV_SHUTDOWN_FLUSH_PHASE: Remove. 
logs_empty_and_mark_files_at_shutdown() can communicate with the normal page cleaner loop via the new function flush_buffer_pool(). buf_flush_remove(): Assert that the calling thread is holding buf_pool.flush_list_mutex. This removes unnecessary mutex operations from buf_flush_remove_pages() and buf_flush_dirty_pages(), which replace buf_LRU_flush_or_remove_pages(). buf_flush_lists(): Renamed from buf_flush_batch(), with simplified interface. Return the number of flushed pages. Clarified comments and renamed min_n to max_n. Identify LRU batch by lsn=0. Merge all the functions buf_flush_start(), buf_flush_batch(), buf_flush_end() directly to this function, which was their only caller, and remove 2 unnecessary buf_pool.mutex release/re-acquisition that we used to perform around the buf_flush_batch() call. At the start, if not all log has been durably written, wait for a background task to do it, or start a new task to do it. This allows the log write to run concurrently with our page flushing batch. Any pages that were skipped due to too recent FIL_PAGE_LSN or due to them being latched by a writer should be flushed during the next batch, unless there are further modifications to those pages. It is possible that a page that we must flush due to small oldest_modification also carries a recent FIL_PAGE_LSN or is being constantly modified. In the worst case, all writers would then end up waiting in log_free_check() to allow the flushing and the checkpoint to complete. buf_do_flush_list_batch(): Clarify comments, and rename min_n to max_n. Cache the last looked up tablespace. If neighbor flushing is not applicable, invoke buf_flush_page() directly, avoiding a page lookup in between. buf_flush_space(): Auxiliary function to look up a tablespace for page flushing. buf_flush_page(): Defer the computation of space->full_crc32(). Never call log_write_up_to(), but instead skip persistent pages whose latest modification (FIL_PAGE_LSN) is newer than the redo log. 
Also skip pages on which we cannot acquire a shared latch without waiting. buf_flush_try_neighbors(): Do not bother checking buf_fix_count because buf_flush_page() will no longer wait for the page latch. Take the tablespace as a parameter, and only execute this function when innodb_flush_neighbors>0. Avoid repeated calls of page_id_t::fold(). buf_flush_relocate_on_flush_list(): Declare as cold, and push down a condition from the callers. buf_flush_check_neighbor(): Take id.fold() as a parameter. buf_flush_sync(): Ensure that the buf_pool.flush_list is empty, because the flushing batch will skip pages whose modifications have not yet been written to the log or were latched for modification. buf_free_from_unzip_LRU_list_batch(): Remove redundant local variables. buf_flush_LRU_list_batch(): Let the caller buf_do_LRU_batch() initialize the counters, and report n->evicted. Cache the last looked up tablespace. If neighbor flushing is not applicable, invoke buf_flush_page() directly, avoiding a page lookup in between. buf_do_LRU_batch(): Return the number of pages flushed. buf_LRU_free_page(): Only release and re-acquire buf_pool.mutex if adaptive hash index entries are pointing to the block. buf_LRU_get_free_block(): Do not wake up the page cleaner, because it will no longer perform any useful work for us, and we do not want it to compete for I/O while buf_flush_lists(innodb_lru_flush_size, 0) writes out and evicts at most innodb_lru_flush_size pages. (The function buf_do_LRU_batch() may complete after writing fewer pages if more than innodb_lru_scan_depth pages end up in buf_pool.free list.) Eliminate some mutex release-acquire cycles, and wait for the LRU flush batch to complete before rescanning. buf_LRU_check_size_of_non_data_objects(): Simplify the code. buf_page_write_complete(): Remove the parameter evict, and always evict pages that were part of an LRU flush. buf_page_create(): Take a pre-allocated page as a parameter. 
buf_pool_t::free_block(): Free a pre-allocated block. recv_sys_t::recover_low(), recv_sys_t::apply(): Preallocate the block while not holding recv_sys.mutex. During page allocation, we may initiate a page flush, which in turn may initiate a log flush, which would require acquiring log_sys.mutex, which should always be acquired before recv_sys.mutex in order to avoid deadlocks. Therefore, we must not be holding recv_sys.mutex while allocating a buffer pool block. BtrBulk::logFreeCheck(): Skip a redundant condition. row_undo_step(): Do not invoke srv_inc_activity_count() for every row that is being rolled back. It should suffice to invoke the function in trx_flush_log_if_needed() during trx_t::commit_in_memory() when the rollback completes. sync_check_enable(): Remove. We will enable innodb_sync_debug from the very beginning. Reviewed by: Vladislav Vaintroub	2020-10-15 17:04:56 +03:00
Eugene Kosov	89ff4176c1	MDEV-22437 make THR_THD* variable thread_local Now all access goes through _current_thd() and set_current_thd() functions. Some functions like THD::store_globals() can not fail now.	2020-05-05 18:13:31 +03:00
Marko Mäkelä	496d0372ef	Merge 10.4 into 10.5	2020-04-29 15:40:51 +03:00
Marko Mäkelä	b63446984c	Merge 10.3 into 10.4	2020-04-27 17:38:17 +03:00
Marko Mäkelä	2e12d471ea	Merge 10.2 into 10.3	2020-04-27 14:24:41 +03:00
Marko Mäkelä	fbe2712705	Merge 10.4 into 10.5 The functional changes of commit `5836191c8f` (MDEV-21168) are omitted due to MDEV-742 having addressed the issue.	2020-04-25 21:57:52 +03:00
Eugene Kosov	2c5067b689	cleanup THR_KEY_mysys read TLS with my_thread_var write TLS with set_mysys_var() my_thread_var is no longer __attribute__ ((const)): this attribute is simply incorrect here. Read gcc manual for more information. sql/threadpool_generic.cc fails with that attribute.	2020-04-25 00:55:39 +03:00
Sergey Vojtovich	5679a2b6b3	Shrink my_atomic.h and my_cpu.h scope	2020-04-15 22:23:03 +04:00
Sergey Vojtovich	4bd9f82a8f	slave_open_temp_tables to Atomic_counter	2020-04-15 21:05:21 +04:00
Sergey Vojtovich	5876ed9e5b	Relay_log_info::executed_entries to Atomic_counter	2020-04-15 18:36:07 +04:00
Marko Mäkelä	37c14690fc	Merge 10.4 into 10.5	2020-03-30 19:07:25 +03:00
Marko Mäkelä	e2f1f88fa6	Merge 10.3 into 10.4	2020-03-30 14:50:23 +03:00
Marko Mäkelä	1a9b6c4c7f	Merge 10.2 into 10.3	2020-03-30 11:12:56 +03:00
Eugene Kosov	a7cbce06d4	unoptimized -fsanitize=undefined build on clang requires more stack space	2020-03-23 17:42:57 +03:00
Marko Mäkelä	d82ac8d374	MDEV-21907: Fix some -Wconversion outside InnoDB Some .c and .cc files are compiled as part of Mariabackup. Enabling -Wconversion for InnoDB would also enable it for Mariabackup. The .h files are being included during InnoDB or Mariabackup compilation. Notably, GCC 5 (but not GCC 4 or 6 or later versions) would report -Wconversion for x|=y when the type is unsigned char. So, we will either write x=(uchar)(x|y) or disable the -Wconversion warning for GCC 5. bitmap_set_bit(), bitmap_flip_bit(), bitmap_clear_bit(), bitmap_is_set(): Always implement as inline functions.	2020-03-12 19:44:52 +02:00
Sergei Golubchik	7af733a5a2	perfschema compilation, test and misc fixes	2020-03-10 19:24:23 +01:00
Jon Olav Hauglid	52d7980753	Bug#18913935: REMOVE SUPPORT FOR LINUXTHREADS This patch removes support for LinuxThreads. It was superseded by NPTL in Linux 2.6 (2003).	2020-03-10 19:24:21 +01:00
Monty	6c50875a38	MDEV-20279 Increase Aria index length limit Limit increased from 1000 to 2000. Avoiding stack overflow by only storing keys and pages on the stack in recursive functions if there is plenty of space on it. Other things: - Use less stack space for b-tree operations as we now only allocate as much space as needed instead of always allocating HA_MAX_KEY_LENGTH. - Replaced most usage of my_safe_alloca() in Aria with the stack_alloc interface. - Moved my_setstacksize() to mysys/my_pthread.c	2019-08-23 11:26:04 +02:00
Vladislav Vaintroub	73be875c8e	MDEV-19773 : simplify implementation of Windows rwlock No need to do dynamic loading and fallbacks anymore. We can safely assume Windows 7, and availability of all SRWLock functions.	2019-06-18 00:37:09 +01:00
Marko Mäkelä	826f9d4f7e	Merge 10.4 into 10.5	2019-05-23 10:32:21 +03:00
Monty	ab38b7511b	MDEV-17841 S3 storage engine A read-only storage engine that stores its data in (aws) S3 To store data in S3 one could use ALTER TABLE: ALTER TABLE table_name ENGINE=S3 libmarias3 integration done by Sergei Golubchik libmarias3 created by Andrew Hutchings	2019-05-23 02:28:23 +03:00
Oleksandr Byelkin	c07325f932	Merge branch '10.3' into 10.4	2019-05-19 20:55:37 +02:00
Marko Mäkelä	be85d3e61b	Merge 10.2 into 10.3	2019-05-14 17:18:46 +03:00
Marko Mäkelä	26a14ee130	Merge 10.1 into 10.2	2019-05-13 17:54:04 +03:00
Vicențiu Ciorbaru	cb248f8806	Merge branch '5.5' into 10.1	2019-05-11 22:19:05 +03:00
Vicențiu Ciorbaru	5543b75550	Update FSF Address * Update wrong zip-code	2019-05-11 21:29:06 +03:00
Sergei Golubchik	15c79c41e4	MDEV-17845 Extreme high open file limit used SHOW STATUS LIKE 'Open_files' was showing 18446744073709551615 my_file_opened used statistic_increment/statistic_decrement, so one-off errors were normal and expected. But they confused monitoring tools, so let's move my_file_opened to use atomics.	2019-05-07 18:40:36 +02:00
Marko Mäkelä	c67b306e4f	Merge 10.3 into 10.4	2019-03-08 11:19:48 +02:00
Marko Mäkelä	94eb56fb29	Give ASAN some more stack When compiling CMAKE_BUILD_TYPE=Debug WITH_ASAN using clang-7 -O2 the following tests could fail due to insufficient stack size: main.signal_demo3 sys_vars.max_sp_recursion_depth_func	2019-03-08 10:40:30 +02:00
Marko Mäkelä	2d0dd62cf7	Merge 10.2 into 10.3	2019-03-08 00:26:55 +02:00
Marko Mäkelä	913e33e423	Merge 10.1 into 10.2 Rewrite the MDEV-13818 fix to prevent heap-use-after-free. Add a test case for MDEV-18272.	2019-03-07 17:52:27 +02:00
Sergei Golubchik	84645366c4	ASAN loves stack, give it some fixes these test failures in ASAN builds (in 10.1 and 10.4): * main.signal_demo3 * main.sp * sys_vars.max_sp_recursion_depth_func * mroonga/storage.foreign_key_delete_existent * mroonga/storage.foreign_key_delete_nonexistent * mroonga/storage.foreign_key_insert_existent * mroonga/storage.foreign_key_update_existent * mroonga/storage.foreign_key_update_nonexistent * mroonga/storage.function_command_auto-escape * mroonga/storage.function_command_select * mroonga/storage.variable_enable_operations_recording_insert	2019-03-06 15:12:11 +01:00
Sergei Golubchik	07e9b13898	mysqld: ignore SIGHUP sent by the kernel SIGHUP causes debug info in the error log and reload of logs/privileges/tables/etc. The server should only do it when a user intentionally sends SIGHUP, not when a parent terminal gets disconnected or something. In particular, not ignoring kernel SIGHUP causes FLUSH PRIVILEGES at some random point during non-systemd Debian upgrades (Debian restarts mysqld, debian-start script runs mysql_upgrade in the background, postinit script ends and kernel sends SIGHUP to all background processes it has started). And during mysql_upgrade privilege tables aren't necessarily ready to be reloaded.	2018-12-12 00:31:04 +01:00
Marko Mäkelä	ae9d82c9f8	Merge 10.2 into 10.3	2018-10-11 08:22:08 +03:00
Marko Mäkelä	07815d9555	Merge 10.1 into 10.2	2018-10-11 08:16:08 +03:00
Sergey Vojtovich	1655053ac1	MDEV-17200 - pthread_detach called for already detached threads pthread_detach_this_thread() was intended to be defined to something meaningful only on some ancient unixes, which don't have pthread_attr_setdetachstate() defined. Otherwise, on normal unixes, threads are created detached in the first place. This was broken in `0f01bf2676` so that we started calling pthread_detach() for already detached threads. Intention was to detach aria checkpoint thread. However in `87007dc2f7` aria service threads were made joinable with appropriate handling, which makes breaking revision unnecessary. Revert remnants of `0f01bf2676`, so that pthread_detach_this_thread() is meaningful only on some ancient unixes again.	2018-10-05 14:37:15 +04:00
luz.paz	3dd01669b4	Misc. typos Found via `codespell -i 3 -w --skip="./debian/po" -I ../mariadb-server-word-whitelist.txt ./cmake/ ./debian/ ./Docs/ ./include/ ./man/ ./plugin/ ./strings/`	2018-04-05 15:26:57 +04:00
Vladislav Vaintroub	19bb7fdcd6	MDEV-15694 Windows : use GetSystemTimePreciseAsFileTime if available for high resolution time Use high accuracy timer on Windows 8.1+ for system versioning, it needs accurate high resolution start query time. Continue to use the inaccurate (but much faster timer function) GetSystemTimeAsFileTime() where accuracy does not matter, e.g. in set_timespec_time_nsec(), or my_time()	2018-04-01 14:38:45 +00:00
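
Note on d82ac8d374 (MDEV-21907): the message says that a compound assignment on an unsigned char is rewritten as x=(uchar)(x|y) to keep -Wconversion quiet, and that the bitmap bit helpers are always inline. The sketch below illustrates that pattern only. The sketch_* names and the raw byte-array interface are assumptions made for illustration; they are not the server's bitmap API or its actual signatures.

```cpp
#include <cassert>

typedef unsigned char uchar;

// Hypothetical helpers (not the server's functions): each one spells out
// the integer promotion and narrows the result back to uchar with an
// explicit cast, instead of using |=, &= or ^= on the byte.

inline void sketch_bitmap_set_bit(uchar *map, unsigned bit)
{
  assert(map != nullptr);
  uchar &byte = map[bit / 8];
  byte = (uchar) (byte | (1U << (bit & 7)));
}

inline void sketch_bitmap_clear_bit(uchar *map, unsigned bit)
{
  assert(map != nullptr);
  uchar &byte = map[bit / 8];
  byte = (uchar) (byte & ~(1U << (bit & 7)));
}

inline void sketch_bitmap_flip_bit(uchar *map, unsigned bit)
{
  assert(map != nullptr);
  uchar &byte = map[bit / 8];
  byte = (uchar) (byte ^ (1U << (bit & 7)));
}

inline bool sketch_bitmap_is_set(const uchar *map, unsigned bit)
{
  assert(map != nullptr);
  return (map[bit / 8] & (1U << (bit & 7))) != 0;
}
```

The explicit cast makes the narrowing visible at the point where it happens, so the result is the same as with the compound operators.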

348 commits
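
Note on 15c79c41e4 (MDEV-17845): the message reports Open_files showing 18446744073709551615 and moves my_file_opened to atomics. That number is what a 64-bit unsigned counter holds after it is decremented from zero, which fits the "one-off errors" the message describes. The sketch below shows that arithmetic and an atomic counter as a possible shape of the fix. The sketch_* names, the 64-bit counter type, and the debug assertion are assumptions for illustration, not the server's code.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the open-file counter (assumed to be 64-bit).
static std::atomic<std::uint64_t> sketch_files_opened{0};

// What a plain unsigned counter shows if one increment is lost and the
// matching decrement still runs: 0 - 1 wraps around to 2^64 - 1.
std::uint64_t sketch_wrapped_value()
{
  std::uint64_t counter = 0;
  counter--;
  return counter;
}

// Atomic read-modify-write: concurrent callers cannot lose an update.
void sketch_file_opened()
{
  sketch_files_opened.fetch_add(1, std::memory_order_relaxed);
}

void sketch_file_closed()
{
  std::uint64_t previous =
    sketch_files_opened.fetch_sub(1, std::memory_order_relaxed);
  // Added for this sketch only: catch an unbalanced close in debug builds.
  assert(previous != 0);
  (void) previous;
}

std::uint64_t sketch_open_files()
{
  return sketch_files_opened.load(std::memory_order_relaxed);
}
```

Relaxed ordering is used here because the value is only a statistic and does not guard other data; whether the server makes the same choice is not stated in the commit message.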