mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-31 11:01:52 +01:00

Author	SHA1	Message	Date
Marko Mäkelä	7de38492fc	After-merge fix: cmake -DPLUGIN_PERFSCHEMA=NO An #include was forgotten in `b6ac67389d`	2019-07-25 13:34:31 +03:00
Marko Mäkelä	b6ac67389d	Merge 10.1 into 10.2	2019-07-25 12:14:27 +03:00
Marko Mäkelä	0c7c61019d	Remove the wrappers ut_time(), ut_difftime(), ib_time_t	2019-07-24 21:59:26 +03:00
Marko Mäkelä	10ee1b95b8	Remove ut_usectime(), ut_gettimeofday() Replace ut_usectime() with my_interval_timer(), which is equivalent, but monotonically counting nanoseconds instead of counting the microseconds of real time. os_event_wait_time_low(): Use my_hrtime() instead of ut_usectime(). FIXME: Set a clock attribute on the condition variable that allows a monotonic clock to be chosen as the time base, so that the wait is immune to adjustments of the system clock.	2019-07-24 21:59:26 +03:00
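The switch in `10ee1b95b8` from wall-clock microseconds to a monotonic nanosecond counter can be sketched with the C++ standard library. This is only an illustration: `demo_interval_timer_ns()` and `demo_elapsed_ns()` are hypothetical names, not the actual `my_interval_timer()`, and `std::chrono::steady_clock` is assumed as the monotonic time base.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for a monotonic interval timer (not MariaDB code).
// steady_clock never goes backwards, so the difference between two readings
// is unaffected by adjustments of the system (wall) clock.
inline std::uint64_t demo_interval_timer_ns()
{
  using namespace std::chrono;
  return static_cast<std::uint64_t>(
    duration_cast<nanoseconds>(steady_clock::now().time_since_epoch()).count());
}

// Nanoseconds elapsed since an earlier reading of demo_interval_timer_ns().
inline std::uint64_t demo_elapsed_ns(std::uint64_t start_ns)
{
  const std::uint64_t now = demo_interval_timer_ns();
  assert(now >= start_ns);  // holds because the clock is monotonic
  return now - start_ns;
}
```

A timer like this suits measuring durations and timeouts; it says nothing about calendar time.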
Marko Mäkelä	cbac8f9351	MDEV-19725 Incorrect error handling in ALTER TABLE Some I/O functions and macros that are declared in os0file.h used to return a Boolean status code (nonzero on success). In MySQL 5.7, they were changed to return dberr_t instead. Alas, in MariaDB Server 10.2, some uses of functions were not adjusted to the changed return value. Until MDEV-19231, the valid values of dberr_t were always nonzero. This means that some code that was incorrectly checking for a zero return value from the functions would never detect a failure. After MDEV-19231, some tests for ALTER ONLINE TABLE would fail with cmake -DPLUGIN_PERFSCHEMA=NO. It turned out that the wrappers pfs_os_file_read_no_error_handling_int_fd_func() and pfs_os_file_write_int_fd_func() were wrongly returning bool instead of dberr_t. Also the callers of these functions were wrongly expecting bool (nonzero on success) instead of dberr_t. This mistake had been made when the addition of these functions was merged from MySQL 5.6.36 and 5.7.18 into MariaDB Server 10.2.7. This fix also reverts commit `40becbc3c7` which attempted to work around the problem.	2019-06-10 18:15:25 +03:00
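The failure mode described in `cbac8f9351` (a status code checked as if it were a Boolean) can be sketched as follows. The enum and its numeric values are illustrative assumptions and do not reproduce the real `dberr_t`; all `demo_` names are hypothetical.

```cpp
#include <cassert>

// Illustrative status codes only; every value is nonzero, as the commit
// message says was the case for dberr_t before MDEV-19231.
enum demo_err_t { DEMO_SUCCESS = 10, DEMO_IO_ERROR = 11 };

// A function that reports its outcome as a status code (hypothetical).
inline demo_err_t demo_read(bool simulate_failure)
{
  return simulate_failure ? DEMO_IO_ERROR : DEMO_SUCCESS;
}

// WRONG: treats the status code as "nonzero on success".
// While every code is nonzero, this can never report a failure.
inline bool demo_failed_boolean_check(demo_err_t err)
{
  assert(err != 0);  // precondition of this sketch: no code is zero
  return !err;
}

// RIGHT: compare against the success code explicitly.
inline bool demo_failed_explicit_check(demo_err_t err)
{
  return err != DEMO_SUCCESS;
}
```

With these definitions, a simulated I/O error is missed by the Boolean-style check and caught by the explicit comparison.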
Marko Mäkelä	26a14ee130	Merge 10.1 into 10.2	2019-05-13 17:54:04 +03:00
Vicențiu Ciorbaru	c0ac0b8860	Update FSF address	2019-05-11 19:25:02 +03:00
Sachin Agarwal	06ec56f579	Bug #27850600 INNODB ASYNC IO ERROR HANDLING IN IO_EVENT Problem: io_getevents() reads asynchronous I/O events from the completion queue. For each I/O event, the res field in io_event tells whether the I/O succeeded. To see whether the I/O actually succeeded, we always need to check event.res (negative = error, positive = bytes read/written). LinuxAIOHandler::collect() doesn't check the event.res value for each event, which leads to an incorrect value in n_bytes for the I/O context (or I/O slot). Fix: Added a check for a negative event.res value. RB: 20871 Reviewed by : annamalai.gurusami@oracle.com	2019-04-26 17:40:20 +03:00
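The per-event check described in `06ec56f579` might look roughly like the sketch below. It does not use the real Linux AIO headers: `demo_io_event` is a hypothetical stand-in with a signed `res` field, and `demo_collect()` is not the actual `LinuxAIOHandler::collect()`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for a completed asynchronous I/O event (hypothetical type).
// res < 0 is a negated error code; res >= 0 is the number of bytes transferred.
struct demo_io_event { std::int64_t res; };

// Sum the bytes transferred by all events. Returns false and stores the
// first error code in *err as soon as one event reports a failure.
inline bool demo_collect(const demo_io_event* events, std::size_t n,
                         std::uint64_t* n_bytes, int* err)
{
  assert(events != nullptr || n == 0);
  *n_bytes = 0;
  *err = 0;
  for (std::size_t i = 0; i < n; i++) {
    if (events[i].res < 0) {
      *err = static_cast<int>(-events[i].res);
      return false;
    }
    *n_bytes += static_cast<std::uint64_t>(events[i].res);
  }
  return true;
}
```

Without the `res < 0` branch, a negative result would be added to the byte count as if it were a transfer size.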
Marko Mäkelä	e7f426d2c9	MDEV-19212: Replace macros with type-safe inline functions The regression that was reported in MDEV-19212 occurred due to use of macros that did not ensure that the arguments have compatible types. ut_2pow_remainder(), ut_2pow_round(), ut_calc_align(): Define as inline function templates. UT_CALC_ALIGN(): Define as a macro, because this is used in compile_time_assert(). Only starting with C++11 (MariaDB 10.4) we could define the inline functions as constexpr.	2019-04-08 21:33:49 +03:00
Marko Mäkelä	f120a15b93	MDEV-19212 4GB Limit on large_pages - integer overflow os_mem_alloc_large(): Invoke the macro ut_2pow_round() with the correct argument type. innobase_large_page_size, innobase_use_large_pages, os_use_large_pages, os_large_page_size: Remove. Simply refer to opt_large_page_size, my_use_large_pages.	2019-04-08 21:33:49 +03:00
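The two MDEV-19212 entries above describe an overflow caused by a macro whose arguments had different widths. A minimal sketch of that effect, assuming a macro of the usual mask-and-round form (the exact original macro text is not reproduced here, and all `DEMO_`/`demo_` names are hypothetical):

```cpp
#include <cassert>
#include <cstdint>

// Macro in the style of the old helpers (illustrative only).
// The mask is computed in the type of m, which may be narrower than n.
#define DEMO_2POW_ROUND_MACRO(n, m) ((n) & ~((m) - 1))

// Function template: both arguments must have the same type T, so a
// narrower mask cannot silently clear the upper bits of n.
template <typename T>
inline T demo_2pow_round(T n, T m)
{
  assert(m != 0 && (m & (m - 1)) == 0);  // m must be a power of two
  return n & ~(m - 1);
}
```

With a 64-bit size of 5 GiB + 123 bytes and a 32-bit page size of 2 MiB, the macro yields 1 GiB because the 32-bit mask is zero-extended, while `demo_2pow_round<std::uint64_t>(size, page)` yields 5 GiB. Calling the template without an explicit type on mismatched arguments fails to compile, which is the type safety the commit refers to.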
Sergei Golubchik	f97d879bf8	cmake: re-enable -Werror in the maintainer mode now we can afford it. Fix -Werror errors. Note: * old gcc is bad at detecting uninit variables, disable it. * time_t is int or long, cast it for printf's	2019-03-27 22:51:37 +01:00
Marko Mäkelä	00572a0b0c	MDEV-17482 InnoDB fails to say which fatal error fsync() returned os_file_fsync_posix(): If fsync() returns a fatal error, do include errno in the error message. In the future, we might handle fsync() or write or allocation failures on InnoDB data files a little more gracefully: flag the affected index or table as corrupted, and deny any subsequent writes to the table. If a write to the undo log or redo log fails, an alternative to killing the server could be to deny any writes to InnoDB tables until the server has been restarted.	2019-03-18 12:32:10 +02:00
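Including `errno` in the message, as `00572a0b0c` does for `os_file_fsync_posix()`, could be sketched like this. `demo_fsync_report()` is a hypothetical helper, not the InnoDB function, and the message wording is an assumption.

```cpp
#include <cassert>
#include <cerrno>
#include <cstring>
#include <string>
#include <unistd.h>

// Hypothetical helper: flush fd and, on failure, build a message that
// names the errno value instead of a generic "fsync failed".
inline bool demo_fsync_report(int fd, std::string* message)
{
  assert(message != nullptr);
  if (fsync(fd) == 0) {
    message->clear();
    return true;
  }
  const int err = errno;  // save before any other call can overwrite it
  *message = "fsync() returned " + std::to_string(err) + " ("
    + std::strerror(err) + ")";
  return false;
}
```

Calling it with an invalid descriptor such as `-1` fails with `EBADF`, and the message carries that number.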
Sergei Golubchik	f1134d5676	post-merge: gcc 8 warnings note: Inherit String from Sql_alloc, to get operators new and new[] in sync in rocksdb gcc was complaining that non-lvalue was cast to const.	2019-03-15 21:00:50 +01:00
Marko Mäkelä	2565c02ca5	Remove unnecessary type casts	2019-01-23 14:42:21 +02:00
Marko Mäkelä	8e80fd6bfd	Merge 10.1 into 10.2	2019-01-17 11:24:38 +02:00
Marko Mäkelä	71eb762611	Merge 10.0 into 10.1	2019-01-17 06:40:24 +02:00
Eugene Kosov	e0633f25e8	MDEV-18243 incorrect ASAN instrumentation Poisoning memory after munmap() and friends is totally incorrect as this memory could be anything. os_mem_free_large(): remove memory poisoning	2019-01-15 15:32:18 +03:00
Eugene Kosov	0dafcf529c	cleanup os_event	2018-12-21 10:16:03 +02:00
Eugene Kosov	ed166f53fa	MDEV-18043 data race in os_event os_event::is_set(): protect os_event::m_set with os_event::mutex	2018-12-21 10:16:03 +02:00
Marko Mäkelä	447e493179	Remove some unnecessary InnoDB #include	2018-11-29 12:53:44 +02:00
Marko Mäkelä	ff88e4bb8a	Remove many redundant #include from InnoDB	2018-11-19 11:42:14 +02:00
Marko Mäkelä	055a3334ad	MDEV-13564 Mariabackup does not work with TRUNCATE Implement undo tablespace truncation via normal redo logging. Implement TRUNCATE TABLE as a combination of RENAME to #sql-ib name, CREATE, and DROP. Note: Orphan #sql-ib.ibd may be left behind if MariaDB Server 10.2 is killed before the DROP operation is committed. If MariaDB Server 10.2 is killed during TRUNCATE, it is also possible that the old table was renamed to #sql-ib.ibd but the data dictionary will refer to the table using the original name. In MariaDB Server 10.3, RENAME inside InnoDB is transactional, and #sql-* tables will be dropped on startup. So, this new TRUNCATE will be fully crash-safe in 10.3. ha_mroonga::wrapper_truncate(): Pass table options to the underlying storage engine, now that ha_innobase::truncate() will need them. rpl_slave_state::truncate_state_table(): Before truncating mysql.gtid_slave_pos, evict any cached table handles from the table definition cache, so that there will be no stale references to the old table after truncating. == TRUNCATE TABLE == WL#6501 in MySQL 5.7 introduced separate log files for implementing atomic and crash-safe TRUNCATE TABLE, instead of using the InnoDB undo and redo log. Some convoluted logic was added to the InnoDB crash recovery, and some extra synchronization (including a redo log checkpoint) was introduced to make this work. This synchronization has caused performance problems and race conditions, and the extra log files cannot be copied or applied by external backup programs. In order to support crash-upgrade from MariaDB 10.2, we will keep the logic for parsing and applying the extra log files, but we will no longer generate those files in TRUNCATE TABLE. A prerequisite for crash-safe TRUNCATE is a crash-safe RENAME TABLE (with full redo and undo logging and proper rollback). This will be implemented in MDEV-14717. ha_innobase::truncate(): Invoke RENAME, create(), delete_table(). 
Because RENAME cannot be fully rolled back before MariaDB 10.3 due to missing undo logging, add some explicit rename-back in case the operation fails. ha_innobase::delete(): Introduce a variant that takes sqlcom as a parameter. In TRUNCATE TABLE, we do not want to touch any FOREIGN KEY constraints. ha_innobase::create(): Add the parameters file_per_table, trx. In TRUNCATE, the new table must be created in the same transaction that renames the old table. create_table_info_t::create_table_info_t(): Add the parameters file_per_table, trx. row_drop_table_for_mysql(): Replace a bool parameter with sqlcom. row_drop_table_after_create_fail(): New function, wrapping row_drop_table_for_mysql(). dict_truncate_index_tree_in_mem(), fil_truncate_tablespace(), fil_prepare_for_truncate(), fil_reinit_space_header_for_table(), row_truncate_table_for_mysql(), TruncateLogger, row_truncate_prepare(), row_truncate_rollback(), row_truncate_complete(), row_truncate_fts(), row_truncate_update_system_tables(), row_truncate_foreign_key_checks(), row_truncate_sanity_checks(): Remove. row_upd_check_references_constraints(): Remove a check for TRUNCATE, now that the table is no longer truncated in place. The new test innodb.truncate_foreign uses DEBUG_SYNC to cover some race-condition like scenarios. The test innodb-innodb.truncate does not use any synchronization. We add a redo log subformat to indicate backup-friendly format. MariaDB 10.4 will remove support for the old TRUNCATE logging, so crash-upgrade from old 10.2 or 10.3 to 10.4 will involve limitations. == Undo tablespace truncation == MySQL 5.7 implements undo tablespace truncation. It is only possible when innodb_undo_tablespaces is set to at least 2. The logging is implemented similar to the WL#6501 TRUNCATE, that is, using separate log files and a redo log checkpoint. We can simply implement undo tablespace truncation within a single mini-transaction that reinitializes the undo log tablespace file. 
Unfortunately, due to the redo log format of some operations, currently, the total redo log written by undo tablespace truncation will be more than the combined size of the truncated undo tablespace. It should be acceptable to have a little more than 1 megabyte of log in a single mini-transaction. This will be fixed in MDEV-17138 in MariaDB Server 10.4. recv_sys_t: Add truncated_undo_spaces[] to remember for which undo tablespaces a MLOG_FILE_CREATE2 record was seen. namespace undo: Remove some unnecessary declarations. fil_space_t::is_being_truncated: Document that this flag now only applies to undo tablespaces. Remove some references. fil_space_t::is_stopping(): Do not refer to is_being_truncated. This check is for tablespaces of tables. Potentially used tablespaces are never truncated any more. buf_dblwr_process(): Suppress the out-of-bounds warning for undo tablespaces. fil_truncate_log(): Write a MLOG_FILE_CREATE2 with a nonzero page number (new size of the tablespace in pages) to inform crash recovery that the undo tablespace size has been reduced. fil_op_write_log(): Relax assertions, so that MLOG_FILE_CREATE2 can be written for undo tablespaces (without .ibd file suffix) for a nonzero page number. os_file_truncate(): Add the parameter allow_shrink=false so that undo tablespaces can actually be shrunk using this function. fil_name_parse(): For undo tablespace truncation, buffer MLOG_FILE_CREATE2 in truncated_undo_spaces[]. recv_read_in_area(): Avoid reading pages for which no redo log records remain buffered, after recv_addr_trim() removed them. trx_rseg_header_create(): Add a FIXME comment that we could write much less redo log. trx_undo_truncate_tablespace(): Reinitialize the undo tablespace in a single mini-transaction, which will be flushed to the redo log before the file size is trimmed. recv_addr_trim(): Discard any redo logs for pages that were logged after the new end of a file, before the truncation LSN. 
If the rec_list becomes empty, reduce n_addrs. After removing any affected records, actually truncate the file. recv_apply_hashed_log_recs(): Invoke recv_addr_trim() right before applying any log records. The undo tablespace files must be open at this point. buf_flush_or_remove_pages(), buf_flush_dirty_pages(), buf_LRU_flush_or_remove_pages(): Add a parameter for specifying the number of the first page to flush or remove (default 0). trx_purge_initiate_truncate(): Remove the log checkpoints, the extra logging, and some unnecessary crash points. Merge the code from trx_undo_truncate_tablespace(). First, flush all to-be-discarded pages (beyond the new end of the file), then trim the space->size to make the page allocation deterministic. At the only remaining crash injection point, flush the redo log, so that the recovery can be tested.	2018-09-07 22:10:02 +03:00
Marko Mäkelä	de469a2f29	MDEV-14637: Fix hang due to persistent statistics Similar to the tables SYS_FOREIGN and SYS_FOREIGN_COLS, the tables mysql.innodb_table_stats and mysql.innodb_index_stats are updated by the InnoDB internal SQL parser, which fails to enforce the size limits of the data. Due to this, it is possible for InnoDB to hang when there are persistent statistics defined on partitioned tables where the total length of table name, partition name and subpartition name exceeds the incorrectly defined limit VARCHAR(64). That column should have been defined as VARCHAR(199). btr_node_ptr_max_size(): Interpret the VARCHAR(64) as VARCHAR(199), to prevent a hang in the case that the upgrade script has not been run. dict_table_schema_check(): Ignore difference in the length of the table_name column. ha_innobase::max_supported_key_length(): For innodb_page_size=4k, return a larger value so that the table mysql.innodb_index_stats can be created. This could allow "impossible" tables to be created, such that it is not possible to insert anything into a secondary index when both the secondary key and the primary key are long, but this is the easiest and most consistent way. The Oracle fix would only ignore the maximum length violation for the two statistics tables. os_file_get_status_posix(), os_file_get_status_win32(): Handle ENAMETOOLONG as well. This patch is based on the following change in MySQL 5.7.23. Not all changes were applied, and our variant allows persistent statistics to work without hangs even if the table definitions were not upgraded. From fdbdce701ab8145ae234c9d401109dff4e4106cb Mon Sep 17 00:00:00 2001 From: Aditya A <aditya.a@oracle.com> Date: Thu, 17 May 2018 16:11:43 +0530 Subject: [PATCH] Bug #26390736 THE FIELD TABLE_NAME (VARCHAR(64)) FROM MYSQL.INNODB_TABLE_STATS CAN OVERFLOW. In mysql.innodb_index_stats and mysql.innodb_table_stats tables the table name column didn't take into consideration partition names which can be more than varchar(64).	2018-08-03 08:33:38 +03:00
Allen Lai	f70a318576	Bug#27805553 HARD ERROR SHOULD BE REPORTED WHEN FSYNC() RETURN EIO. fsync() will just return EIO only once when the IO error happens, so, it's wrong to keep trying to call it till it return success. When fsync() returns EIO it should be treated as a hard error and InnoDB must abort immediately.	2018-08-03 08:32:17 +03:00
Vladislav Vaintroub	b71c9ae030	Amend the fix for MDEV-16596: do not use the CREATE_NEW flag when reopening the redo log file. Use OPEN_ALWAYS instead, since we know the file already exists.	2018-07-01 14:00:29 +01:00
Vladislav Vaintroub	c612a1e77c	MDEV-16596 : Windows - redo log does not work on native 4K sector disks. Disks with native 4K sectors need 4K alignment and size for unbuffered IO (i.e for files opened with FILE_FLAG_NO_BUFFERING) Innodb opens redo log with FILE_FLAG_NO_BUFFERING, however it always does 512byte IOs. Thus, the IO on 4K native sectors will fail, rendering Innodb non-functional. The fix is to check whether OS_FILE_LOG_BLOCK_SIZE is multiple of logical sector size, and if it is not, reopen the redo log without FILE_FLAG_NO_BUFFERING flag.	2018-06-30 11:04:51 +01:00
Vladislav Vaintroub	04677f44c7	Innodb : do not use errno on Windows to print os_file_pwrite() error. Use GetLastError() instead.	2018-06-28 17:23:05 +01:00
Sergei Golubchik	b942aa34c1	Merge branch '10.1' into 10.2	2018-06-21 23:47:39 +02:00
Vicențiu Ciorbaru	6e55236c0a	Merge branch '10.0-galera' into 10.1	2018-06-12 19:39:37 +03:00
Vicențiu Ciorbaru	aa59ecec89	Merge branch '10.0' into 10.1	2018-06-12 18:55:27 +03:00
Marko Mäkelä	8f5f0575ab	MDEV-16456 InnoDB error "returned OS error 71" complains about wrong path When attempting to rename a table to a non-existing database, InnoDB would misleadingly report "OS error 71" when in fact the error code is InnoDB's own (OS_FILE_NOT_FOUND), and not report both pathnames. Errors on rename could occur due to reasons connected to either pathname. os_file_handle_rename_error(): New function, to report errors in renaming files.	2018-06-12 10:25:23 +03:00
Marko Mäkelä	0ad9c3a016	MDEV-16456 InnoDB error "returned OS error 71" complains about wrong path When attempting to rename a table to a non-existing database, InnoDB would misleadingly report "OS error 71" when in fact the error code is InnoDB's own (OS_FILE_NOT_FOUND), and not report both pathnames. Errors on rename could occur due to reasons connected to either pathname. os_file_handle_rename_error(): New function, to report errors in renaming files.	2018-06-12 09:54:31 +03:00
Marko Mäkelä	df42830b28	Merge 10.1 into 10.2	2018-06-06 11:25:33 +03:00
Marko Mäkelä	1d4e1d3263	Merge 10.0 to 10.1	2018-06-06 11:04:17 +03:00
Marko Mäkelä	55abcfa7b7	MDEV-16124 fil_rename_tablespace() times out and crashes server during table-rebuilding ALTER TABLE InnoDB insisted on closing the file handle before renaming a file. Renaming a file should never be a problem on POSIX systems. Also on Windows it should work if the file was opened in FILE_SHARE_DELETE mode. fil_space_t::stop_ios: Remove. We no longer need to stop file access during rename operations. fil_mutex_enter_and_prepare_for_io(): Remove the wait for stop_ios. fil_rename_tablespace(): Remove the retry logic; do not close the file handle. Remove the unused fault injection that was added along with the DATA DIRECTORY functionality (MySQL WL#5980). os_file_create_simple_func(), os_file_create_func(), os_file_create_simple_no_error_handling_func(): Include FILE_SHARE_DELETE in the share_mode. (We will still prevent multiple InnoDB instances from using the same files by not setting FILE_SHARE_WRITE.)	2018-06-05 18:16:12 +03:00
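The claim in `55abcfa7b7` that renaming an open file is unproblematic on POSIX systems can be checked with a small sketch. `demo_rename_while_open()` and its path arguments are hypothetical; the Windows `FILE_SHARE_DELETE` side is not covered.

```cpp
#include <cassert>
#include <cstdio>
#include <fcntl.h>
#include <string>
#include <unistd.h>

// Rename a file while a descriptor to it stays open, then keep writing
// through that descriptor. Returns true if every step succeeded.
// Both paths are chosen by the caller (hypothetical names in any usage).
inline bool demo_rename_while_open(const std::string& from, const std::string& to)
{
  assert(!from.empty() && !to.empty());
  const int fd = open(from.c_str(), O_CREAT | O_RDWR | O_TRUNC, 0600);
  if (fd < 0) return false;
  bool ok = std::rename(from.c_str(), to.c_str()) == 0;
  // The descriptor refers to the file itself, not to its name,
  // so it remains usable after the rename.
  ok = ok && write(fd, "x", 1) == 1;
  ok = (close(fd) == 0) && ok;
  return ok;
}
```

After a successful call, only the new name exists and it contains the byte written through the still-open descriptor.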
Jan Lindström	648cf7176c	Merge remote-tracking branch 'origin/5.5-galera' into 10.0-galera	2018-05-07 13:49:14 +03:00
Vladislav Vaintroub	47ea2227e5	fix typo, amend last commit	2018-04-14 23:59:59 +01:00
Vladislav Vaintroub	043a9b4e1b	Windows, innodb : reduce noise from os_file_get_block_size() if volume can't be opened due to permissions, or IOCTL_STORAGE_QUERY_PROPERTY fails with not implemented, do not report it. Those errors happen, there is nothing user can do. This patch amends fix for MDEV-12948.	2018-04-14 23:53:11 +01:00
Vladislav Vaintroub	7b16291c36	MDEV-15707 : deadlock in Innodb IO code, caused by change buffering. In async IO completion code, after reading a page, Innodb can wait for completion of other buffer pool reads. This is, for example, what happens if change buffering is active. Innodb on Windows could deadlock, as it did not have dedicated threads for processing change buffer asynchronous reads. The fix is for Windows to have the same background threads, including dedicated threads for ibuf and log AIOs. The ibuf/read completions are now dispatched to their threads with PostQueuedCompletionStatus(); the write and log completions are processed in the thread where they arrive.	2018-04-08 21:32:02 +00:00
Marko Mäkelä	3d7915f000	Merge 10.1 into 10.2	2018-03-21 22:58:52 +02:00
Vicențiu Ciorbaru	82aeb6b596	Merge branch '10.1' into 10.2	2018-03-21 10:36:49 +02:00
Marko Mäkelä	e0a0fe7d81	MDEV-12396 IMPORT TABLESPACE: Do not retry partial reads fil_iterate(), fil_tablespace_iterate(): Replace os_file_read() with os_file_read_no_error_handling(). os_file_read_func(), os_file_read_no_error_handling_func(): Do not retry partial reads. There used to be an infinite amount of retries. Because InnoDB extends both data and log files upfront, partial reads should be impossible during normal operation.	2018-03-20 15:31:39 +02:00
Vicențiu Ciorbaru	24b353162f	Merge branch '10.0-galera' into 10.1	2018-03-19 15:21:01 +02:00
Daniel Black	26e4a48bda	MDEV-8743: ib_logfile0 Use O_CLOEXEC so galera SST scripts don't get fd	2018-03-02 11:09:51 +11:00
Daniel Black	9629bca1f0	MDEV-8743: use O_CLOEXEC (innodb/xtradb)	2018-03-02 10:54:00 +11:00
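The two MDEV-8743 entries above rely on `O_CLOEXEC` to keep descriptors from leaking into child programs. A minimal sketch, with hypothetical helper names and a read-only open for simplicity:

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Open a file so that the descriptor is closed automatically across exec(),
// which keeps it out of child programs such as helper scripts.
inline int demo_open_cloexec(const char* path)
{
  assert(path != nullptr);
  return open(path, O_RDONLY | O_CLOEXEC);
}

// True if the close-on-exec flag is set on fd.
inline bool demo_is_cloexec(int fd)
{
  const int flags = fcntl(fd, F_GETFD);
  return flags != -1 && (flags & FD_CLOEXEC) != 0;
}
```

Passing the flag to `open()` sets it atomically; setting `FD_CLOEXEC` later with `fcntl()` leaves a window in which another thread could fork and exec.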
Marko Mäkelä	00f0c039d2	MDEV-15270 Mariabackup should not try to use doublewrite buffer When Mariabackup gets a bad read of the first page of the system tablespace file, it would inappropriately try to apply the doublewrite buffer and write changes back to the data file (to the source file)! This is very wrong and must be prevented. The correct action would be to retry reading the system tablespace as well as any other files whose first page was read incorrectly. Fixing this was not attempted. xb_load_tablespaces(): Shorten a bogus message to be more relevant. The message can be displayed by --backup or --prepare. xtrabackup_backup_func(), os_file_write_func(): Add a missing space to a message. Datafile::restore_from_doublewrite(): Do not even attempt the operation in Mariabackup. recv_init_crash_recovery_spaces(): Do not attempt to restore the doublewrite buffer in Mariabackup (--prepare or --export), because all pages should have been copied correctly in --backup already, and because --backup should ignore the doublewrite buffer. SysTablespace::read_lsn_and_check_flags(): Do not attempt to initialize the doublewrite buffer in Mariabackup. innodb_make_page_dirty(): Correct the bounds check. Datafile::read_first_page(): Correct the name of the parameter.	2018-02-12 16:56:01 +02:00
Marko Mäkelä	c19ef508b8	InnoDB: Remove ut_snprintf() and the use of my_snprintf(); use snprintf()	2017-11-13 02:11:48 +02:00
Marko Mäkelä	51679e5c38	MDEV-14132 InnoDB page corruption On some old GNU/Linux systems, invoking posix_fallocate() with offset=0 would sometimes cause already allocated bytes in the data file to be overwritten. Fix a correctness regression that was introduced in commit `420798a81a` by invoking posix_fallocate() in a safer way. A similar change was made in MDEV-5746 earlier. os_file_get_size(): Avoid changing the state of the file handle, by invoking fstat() instead of lseek(). os_file_set_size(): Determine the current size of the file by os_file_get_size(), and then extend the file from that point onwards.	2017-11-06 08:53:51 +02:00
Marko Mäkelä	30a8764b92	MDEV-14244 MariaDB fails to run with O_DIRECT os_file_set_size(): If posix_fallocate() returns EINVAL, fall back to writing zero bytes to the file. Also, remove some error log output, and make it possible for a server shutdown to interrupt the fall-back code. MariaDB used to ignore any possible return value from posix_fallocate() ever since innodb_use_fallocate was introduced in MDEV-4338. If EINVAL was returned, the file would not be extended. Starting with MDEV-11520, MariaDB would treat EINVAL as a hard error. Why is the EINVAL returned? The GNU posix_fallocate() function would first try the fallocate() system call, which would return -EOPNOTSUPP for many file systems (notably, not ext4). Then, it would fall back to extending the file one block at a time by invoking pwrite(fd, "", 1, offset) where offset is 1 less than a multiple of the file block size. This would fail with EINVAL if the file is in O_DIRECT mode, because O_DIRECT requires aligned operation.	2017-11-06 08:53:50 +02:00
Marko Mäkelä	19733efa7b	MDEV-14244 MariaDB 10.2.10 fails to run on Debian Stretch with ext3 and O_DIRECT os_file_set_size(): If posix_fallocate() returns EINVAL, fall back to writing zero bytes to the file. Also, remove some error log output, and make it possible for a server shutdown to interrupt the fall-back code. MariaDB 10.2 used to handle the EINVAL return value from posix_fallocate() before commit `b731a5bcf2` which refactored os_file_set_size() to try posix_fallocate(). Why is the EINVAL returned? The GNU posix_fallocate() function would first try the fallocate() system call, which would return -EOPNOTSUPP for many file systems (notably, not ext4). Then, it would fall back to extending the file one block at a time by invoking pwrite(fd, "", 1, offset) where offset is 1 less than a multiple of the file block size. This would fail with EINVAL if the file is in O_DIRECT mode, because O_DIRECT requires aligned operation.	2017-11-02 16:18:41 +02:00


457 commits