The function posix_fallocate() as well as the Linux system call
fallocate() can return EINTR when the operation was interrupted
by a signal. In that case, keep retrying the operation, except
if InnoDB shutdown has been initiated.
it was race condition prone. instead use either a pair of my_delete()
calls with already resolved paths, or a safe high-level function
my_handler_delete_with_symlink(), like MyISAM and Aria already do.
TOCTOU bug. The path is checked to be valid, symlinks are resolved.
Then the resolved path is opened. Between the check and the open,
there's a window when one can replace some path component with a
symlink, bypassing validity checks.
Fix: after we resolved all symlinks in the path, don't allow open()
to resolve symlinks, there should be none.
Compared to the old MyISAM/Aria code:
* fastpath. Opening of not-symlinked files is just one open(),
no fn_format() and lstat() anymore.
* opening of symlinked tables doesn't do fn_format() and lstat() either.
it also doesn't to realpath() (which was lstat-ing every path
component), instead if opens every path component with O_PATH.
* share->data_file_name stores realpath(path) not readlink(path). So,
SHOW CREATE TABLE needs to do lstat/readlink() now (see ::info()),
and certain error messages (cannot open file "XXX") show the real
file path with all symlinks resolved.
Before the MDEV-11520 fixes, fil_extend_space_to_desired_size()
in MariaDB Server 5.5 incorrectly passed the desired file size as the
third argument to posix_fallocate(), even though the length of the
extension should have been passed. This looks like a regression
that was introduced in the 5.5 version of MDEV-5746.
Remove the unused variable desired_size.
Also, correct the expression for the posix_fallocate() start_offset,
and actually test that it works with a multi-file system tablespace.
Before MDEV-11520, the expression was wrong in both innodb_plugin and
xtradb, in different ways.
The start_offset formula was tested with the following:
./mtr --big-test --mysqld=--innodb-use-fallocate \
--mysqld=--innodb-data-file-path='ibdata1:5M;ibdata2:5M:autoextend' \
--parallel=auto --force --retry=0 --suite=innodb &
ls -lsh mysql-test/var/*/mysqld.1/data/ibdata2
a large memory buffer on Windows
fil_extend_space_to_desired_size(), os_file_set_size(): Use calloc()
for memory allocation, and handle failures. Properly check the return
status of posix_fallocate().
On Windows, instead of extending the file by at most 1 megabyte at a time,
write a zero-filled page at the end of the file.
According to the Microsoft blog post
https://blogs.msdn.microsoft.com/oldnewthing/20110922-00/?p=9573
this will physically extend the file by writing zero bytes.
(InnoDB never uses DeviceIoControl() to set the file sparse.)
For innodb_plugin, port the XtraDB fix for MySQL Bug#56433
(introducing fil_system->file_extend_mutex). The bug was
fixed differently in MySQL 5.6 (and MariaDB Server 10.0).
MY_THREAD_INIT IN BACKGROUND THREAD
Description:
===========
Add my_thread_init() and my_thread_exit() for background threads which
initializes and frees the st_my_thread_var structure.
Reviewed-by: Jimmy Yang<jimmy.yang@oracle.com>
RB: 15003
Prevent GCC from moving a mach_read_from_4() before we have checked that
we have 4 bytes to read. The pointer may only point to a 1, 2 or 3
bytes in which case the code should not read 4 bytes. This is a
workaround to a GCC bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77673
Patch submitted by: Laurynas Biveinis <laurynas.biveinis@gmail.com>
RB: 14135
Reviewed by: Pawel Olchawa <pawel.olchawa@oracle.com>
During REPAIR TABLE of a MyISAM table, a temporary data file (.TMD)
is created. When repair finishes, this file is renamed to the original
.MYD file. The problem was that during this rename, we copied the
stats from the old file to the new file with chmod/chown. If a user
managed to replace the temporary file before chmod/chown was executed,
it was possible to get an arbitrary file with the privileges of the
mysql user.
This patch fixes the problem by not copying stats from the old
file to the new file. This is not needed as the new file was
created with the correct stats. This fix only changes server
behavior - external utilities such as myisamchk still does
chmod/chown.
No test case provided since the problem involves synchronization
with file system operations.
Fix memory barrier issues on releasing mutexes. We must have a full
memory barrier between releasing a mutex lock and reading its waiters.
This prevents us from missing to release waiters due to reading the
number of waiters speculatively before releasing the lock. If threads
try and wait between us reading the waiters count and releasing the
lock, those threads might stall indefinitely.
Also, we must use proper ACQUIRE/RELEASE semantics for atomic
operations, not ACQUIRE/ACQUIRE.
Analysis: row_drop_table_for_mysql did not allow dropping
referenced table even in case when actual creating of the
referenced table was not successfull if foreign_key_checks=1.
Fix: Allow dropping referenced table even if foreign_key_checks=1
if actual table create returned error.
Fix the assertion failure by setting the struct to 0. This can not be
done using a macro due to different definitions of mutexes on various
OS-es.
Afterwards we call toku_mutex_init and completely initialize the locks.
Fixed wrong calculation of buffer sizes. ulong datatype was used wrongly,
as were the casts to ulong. Buffer sizes should be of type size_t,
not ulong, or bad things happen on 64 bit Windows.
This patch changes pagecache struct to use size_t/ssize_t
where long/ulong were previously used. Also, removed several casts.
Both aria and myisam storage engines feature a logic path in
thr_find_all_keys that leads to undefined behaviour by bypassing the
initialization code of variables after my_thread_init().
By refactoring the nested logic into a separate function, this problem
is resolved.
Aria service threads are created "joinable", but they're not "joined" on
completion. This causes memory leaks around thread local storage.
Fixed by joining service thread. Simplified relevant code and cleaned up
relevant valgrind suppressions.
Use direct persistent index corruption set on InnoDB dictionary
for this test. Do not allow creating new indexes if one of the
existing indexes is already marked as corrupted.
For some reason check_cxx_compiler_flag() passes result variable name down to
compiler:
https://github.com/Kitware/CMake/blob/master/Modules/CheckCXXSourceCompiles.cmake#L57
But compiler doesn't permit dashes in macro name, like in
-DHAVE_CXX_-fimplicit-templates.
Workarounded by renaming HAVE_CXX_-fimplicit-templates to
HAVE_CXX_IMPLICIT_TEMPLAES.
The main.merge test case was failing when tested using row based
binlog format.
While analyzing the issue it was found the following issues:
a) The server is calling binlog related code even when a statement will
not be binlogged;
b) The child table list was not present into table structure by the time
to generate the create table statement;
c) The tables in the child table list will not be opened yet when
generating table create info using row based replication;
d) CREATE TABLE LIKE TEMP_TABLE does not preserve original table storage
engine when using row based replication;
This patch addressed all above issues.
@ sql/sql_class.h
Added a function to determine if the binary log is disabled to
the current session. This is related with issue (a) above.
@ sql/sql_table.cc
Added code to skip binary logging related code if the statement
will not be binlogged. This is related with issue (a) above.
Added code to add the children to the query list of the table that
will have its CREATE TABLE generated. This is related with issue (b)
above.
Added code to force the storage engine to be generated into the
CREATE TABLE. This is related with issue (d) above.
@ storage/myisammrg/ha_myisammrg.cc
Added a test to skip a table getting info about a child table if the
child table is not opened. This is related to issue (c) above.