main.derived_cond_pushdown: Move all 10.3 tests to the end,
trim trailing white space, and add an "End of 10.3 tests" marker.
Add --sorted_result to tests where the ordering is not deterministic.
main.win_percentile: Add --sorted_result to tests where the
ordering is no longer deterministic.
Users expect window functions to produce a certain ordering of rows in
the final result set. Although the standard does not require this, we
already have the filesort result done for when we computed the window
function. If there is no ORDER BY attached to the query, just keep it
till the SELECT is completely evaluated and use that to print the
result.
Update test cases as many did not take care to guarantee a stable
result.
According to logs analysis the Dump thread attempted to read again data which
was already sent. The reason of regressed read turns out in an _my_b_cache_read()
early exit branch which missed to distinguish between total zero size read (e.g
ineffective read when Count argument is zero) from a case when the
requested amount of data is fully read out by sole accessing the cache's
file. In the latter case such then *effective* reading was not
reflected in the cache's state to screw the cache's state.
Fixed with a check introduced of whether the file reading was effective prior to
early exit. When this is the case conduct standard cache state change to
account the actual read size.
Notice the bug can show up also as an error to read binlog event e.g
through BINLOG_GTID_POS() (of MDEV-16886).
This assert is hit when we do filesort using the priority queue and try to insert elements in
the queue. The compare function used for the priority queue should handle the case for zerolength
sortkey.
MySQL bug number 90264
Contribution by Yura Sorokin.
Problem:
File mysys/mf_iocache2.c contains non instrumented file io operations.
This causes inaccurate statistics in PERFORMANCE_SCHEMA.
Solution:
Use the instrumentation apis (mysql_file_tell instead of my_tell, etc).
MEMORY table could be renamed into a non-extistent database.
rename() is documented to return ENOENT when the source file does not
exist OR when the target directory not exist. Nonexistent source .frm
file is ok (table can still exist in the engine), nonexistent target
directory is not.
Make my_rename to use ENOTDIR for the latter case. Make RENAME TABLE
issue an appropriate error ("unknown database" instead of "unknown table")
O_TMPFILE creates a tempfile not attached to a filename.
This is what we want for MY_TEMPORARY.
We preserve a state O_TMPFILE_works, because kernel version or
filesystem could cause failure.
Closes#662
Make mariadb crc32 lib platform independent
It looks strange that someone can make use of 2 crc libraries
(Power64 or AArch64) at the same time.
The patch sets macros 'CRC32_LIBRARY' to make platform independence as an optional crc32 library.
Change-Id: I68bbf73cafb6a12f7fb105ad57d117b114a8c4af
Signed-off-by: Yuqi Gu <yuqi.gu@arm.com>
don't rely on imprecise my_file_opened | my_stream_opened,
scan the array for open handles instead.
also, remove my_print_open_files() and embed it in my_end()
to have one array scan instead of two.
Change the following to statistic counters:
* my_file_opened
* my_file_total_opened
* my_stream_opened
* my_tmp_file_created
There is one non-statistics use of my_file_opened/my_stream_opened
in my_end which prints a warning if we shutdown and its still open.
It seems excessive to hold locks to prevent this warning.
A file descriptor is already a unique element per process - in Windows,
protection occurs at fd allocation using THR_LOCK_open in my_win_{,f}open
and in other OSes, a unique fd to file map exists at the OS level.
So accesses to my_file_info[fd] don't need to be protected by the
THR_LOCK_open.
my_close/my_fclose where restructured to clear out the my_file_info
before the close/my_win_close/my_win_fclose. After these calls another
thread could gain the same file descriptor. So for Windows this
the file_info elements available to the my_win_{,f}_open are released
during the invalidate_fd call within my_win_close. No locking is needed
as the my_win_{,f}open is searching for a invalidate entry which is
determined by a single value change.
my_fclose also changed for non-Windows to retry closing if EINTR was
returned, same as my_close.
Closes#657
simplify. move common code inside, specify common flags inside,
rewrite dead code (`if (mode & O_TEMPORARY)` on Linux, where
`O_TEMPORARY` is always 0) to actually do something.
This previously unreported warning comes from casting size_t to ulong
in sql_hset.h in Hash_Set::at().
Change my_hash_element to accept size_t index parameter.
Performance schema likely/unlikely assume that performance schema
is enabled by default, which causes a performance degradation for
default installations that doesn't have performance schema enabled.
Fixed by changing the likely/unlikely in PS to assume it's
not enabled. This can be changed by compiling with
-DPSI_ON_BY_DEFAULT
Other changes:
- Added psi_likely/psi_unlikely that is depending on
PSI_ON_BY_DEFAULT. psi_likely() is assumed to be true
if PS is enabled.
- Added likely/unlikely to some PS interface code.
- Moved pfs_enabled to mysys (was initialized but not used before)
- Added "if (pfs_likely(pfs_enabled))" around calls to PS to avoid
an extra call if PS is not enabled.
- Moved checking flag_global_instrumention before other flags
to speed up the case when PS is not enabled.
To use:
- Compile with -DUSE_MY_LIKELY
- Change (with replace) all likely/unlikely to my_likely/my_/unlikely
replace likely my_likely unlikely my_unlikely -- *c *h
- Start mysqld with -T
- run some test
- When mysqld has shut down cleanely, report will be on stderr
Also clarify which --{no-,}default* options, must be first.
Sample output:
$ client/mysql --help
client/mysql Ver 15.1 Distrib 5.5.59-MariaDB, for Linux (x86_64) using readline 5.1
Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.
Usage: client/mysql [OPTIONS] [database]
Default options are read from the following files in the given order:
/etc/my.cnf /etc/mysql/my.cnf ~/.my.cnf
The following groups are read: mysql client client-server client-mariadb
The following options may be given as the first argument:
--print-defaults Print the program argument list and exit.
--no-defaults Don't read default options from any option file.
The following specify which files/groups are read (specified before other options):
--defaults-file=# Only read default options from the given file #.
--defaults-extra-file=# Read this file after the global files are read.
--defaults-group-suffix=# Additionally read default groups with # appended as a suffix.
tests running from build directory:
TEST: print defaults ignored as not first
$ sql/mysqld --no-defaults --print-defaults --lc-messages-dir=${PWD}/sql/share
TEST: no startup occurs as --print-defaults specified
$ sql/mysqld --print-defaults --lc-messages-dir=${PWD}/sql/share
sql/mysqld would have been started with the following arguments:
--lc-messages-dir=/home/dan/repos/build-mariadb-5.5/sql/share
TEST: default args can't be anywhere
$ client/mysql --user=bob --defaults-file=/etc/my.cnf
client/mysql: unknown variable 'defaults-file=/etc/my.cnf'
$ client/mysql --user=bob --defaults-group-suffix=.group
client/mysql: unknown variable 'defaults-group-suffix=.group'
/etc/my.cnf:
[client-server.group]
socket=/var/lib/mysql-multi/group/mysqld.sock
user=bob
/etc/my.other.cnf:
socket=/var/lib/mysql-other/mysqld.sock
TEST: defaults file read and suffix also applied
$ client/mysql --defaults-file=/etc/my.other.cnf --defaults-group-suffix=.group
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql-other/mysqld.sock' (2)
TEST: defaults extra file
$ client/mysql --defaults-extra-file=/etc/my.other.cnf --defaults-group-suffix=.group
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql-other/mysqld.sock' (2)
Use high accuracy timer on Windows 8.1+ for system versioning,it needs
accurate high resoution start query time.
Continue to use the inaccurate (but much faster timer function)
GetSystemTimeAsFileTime() where accuracy does not matter, e.g in
set_timespec_time_nsec(),or my_time()
The merge only covered 10.1 up to
commit 4d248974e0.
Actually merge the changes up to
commit 0a534348c7.
Also, remove the unused InnoDB field trx_t::abort_type.
For galera compatibility, the main thing is to ensure the FD 1, 2 are
not opened with O_CLOEXEC otherwise galera sst errors don't appear in
the error.log
Files without O_CLOEXEC from the test below:
0 -> /dev/pts/9
1 -> /tmp/error.log (intended)
2 -> /tmp/error.log (intended)
5 -> /tmp/datadir
6 -> /tmp/datadir/aria_log.00000001
(Innodb temp files)
8 -> /tmp/ibIIrhFL (deleted)
9 -> /tmp/ibfx1vai (deleted)
10 -> /tmp/ibAQUKFO (deleted)
11 -> /tmp/ibWBQSHR (deleted)
15 -> /tmp/ibEXEcfo (deleted)
20 -> /tmp/datadir/mysql/host.MYD
22 -> /tmp/datadir/mysql/user.MYD
... (rest of MYD files)
Test for this and the previous commit.
sql/mysqld --skip-networking --datadir=/tmp/datadir --log-bin=/tmp/datadir/mysqlbin --socket /tmp/s.sock --lc-messages-dir=${PWD}/sql/share --verbose --log-error=/tmp/error.log --general-log-file=/tmp/general.log --general-log=1 --slow-query-log-file=/tmp/slow.log --slow-query-log=1
180302 10:56:41 [Note] sql/mysqld (mysqld 5.5.60-MariaDB-wsrep) starting as process 26056 ...
$ cd /proc/26056
$ ls -la --sort=none fd
total 0
dr-x------. 2 dan dan 0 Mar 2 10:57 .
dr-xr-xr-x. 9 dan dan 0 Mar 2 10:56 ..
lrwx------. 1 dan dan 64 Mar 2 10:57 0 -> /dev/pts/9
l-wx------. 1 dan dan 64 Mar 2 10:57 1 -> /tmp/error.log
l-wx------. 1 dan dan 64 Mar 2 10:57 2 -> /tmp/error.log
lrwx------. 1 dan dan 64 Mar 2 10:57 3 -> /tmp/datadir/mysqlbin.index
lrwx------. 1 dan dan 64 Mar 2 10:57 4 -> /tmp/datadir/aria_log_control
lr-x------. 1 dan dan 64 Mar 2 10:57 5 -> /tmp/datadir
lrwx------. 1 dan dan 64 Mar 2 10:57 6 -> /tmp/datadir/aria_log.00000001
lrwx------. 1 dan dan 64 Mar 2 10:57 7 -> /tmp/datadir/ibdata1
lrwx------. 1 dan dan 64 Mar 2 10:57 8 -> /tmp/ibIIrhFL (deleted)
lrwx------. 1 dan dan 64 Mar 2 10:57 9 -> /tmp/ibfx1vai (deleted)
lrwx------. 1 dan dan 64 Mar 2 10:57 10 -> /tmp/ibAQUKFO (deleted)
lrwx------. 1 dan dan 64 Mar 2 10:57 11 -> /tmp/ibWBQSHR (deleted)
lrwx------. 1 dan dan 64 Mar 2 10:57 12 -> /tmp/datadir/ib_logfile0
lrwx------. 1 dan dan 64 Mar 2 10:57 13 -> /tmp/datadir/ib_logfile1
l-wx------. 1 dan dan 64 Mar 2 10:57 14 -> /tmp/slow.log
lrwx------. 1 dan dan 64 Mar 2 10:57 15 -> /tmp/ibEXEcfo (deleted)
l-wx------. 1 dan dan 64 Mar 2 10:57 16 -> /tmp/general.log
lrwx------. 1 dan dan 64 Mar 2 10:57 17 -> socket:[1897356]
lrwx------. 1 dan dan 64 Mar 2 10:57 18 -> socket:[45335]
l-wx------. 1 dan dan 64 Mar 2 10:57 19 -> /tmp/datadir/mysqlbin.000004
lrwx------. 1 dan dan 64 Mar 2 10:57 20 -> /tmp/datadir/mysql/host.MYD
lrwx------. 1 dan dan 64 Mar 2 10:57 21 -> /tmp/datadir/mysql/host.MYI
lrwx------. 1 dan dan 64 Mar 2 10:57 22 -> /tmp/datadir/mysql/user.MYD
lrwx------. 1 dan dan 64 Mar 2 10:57 23 -> /tmp/datadir/mysql/user.MYI
lrwx------. 1 dan dan 64 Mar 2 10:57 24 -> /tmp/datadir/mysql/db.MYD
lrwx------. 1 dan dan 64 Mar 2 10:57 25 -> /tmp/datadir/mysql/db.MYI
lrwx------. 1 dan dan 64 Mar 2 10:57 26 -> /tmp/datadir/mysql/proxies_priv.MYD
lrwx------. 1 dan dan 64 Mar 2 10:57 27 -> /tmp/datadir/mysql/proxies_priv.MYI
lrwx------. 1 dan dan 64 Mar 2 10:57 28 -> /tmp/datadir/mysql/tables_priv.MYD
lrwx------. 1 dan dan 64 Mar 2 10:57 29 -> /tmp/datadir/mysql/tables_priv.MYI
lrwx------. 1 dan dan 64 Mar 2 10:57 30 -> /tmp/datadir/mysql/columns_priv.MYD
lrwx------. 1 dan dan 64 Mar 2 10:57 31 -> /tmp/datadir/mysql/columns_priv.MYI
lrwx------. 1 dan dan 64 Mar 2 10:57 32 -> /tmp/datadir/mysql/procs_priv.MYD
lrwx------. 1 dan dan 64 Mar 2 10:57 33 -> /tmp/datadir/mysql/procs_priv.MYI
lrwx------. 1 dan dan 64 Mar 2 10:57 34 -> /tmp/datadir/mysql/servers.MYD
lrwx------. 1 dan dan 64 Mar 2 10:57 35 -> /tmp/datadir/mysql/servers.MYI
lrwx------. 1 dan dan 64 Mar 2 10:57 36 -> /tmp/datadir/mysql/event.MYD
lrwx------. 1 dan dan 64 Mar 2 10:57 37 -> /tmp/datadir/mysql/event.MYI
O_CLOEXEC files are those with flags 02000000
/usr/include/bits/fcntl-linux.h:# define __O_CLOEXEC 02000000
/usr/include/bits/fcntl-linux.h:# define O_CLOEXEC __O_CLOEXEC /* Set close_on_exec. */
$ find fdinfo/ -type f -ls -exec cat {} \; | more
1924720 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/0
pos: 0
flags: 0100002
mnt_id: 25
1924721 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/1
pos: 9954
flags: 0102001
mnt_id: 82
1924722 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/2
pos: 10951
flags: 0102001
mnt_id: 82
1924723 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/3
pos: 116
flags: 02100002
mnt_id: 82
1924724 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/4
pos: 52
flags: 0100002
mnt_id: 82
lock: 1: POSIX ADVISORY WRITE 26056 00:2c:1866365 0 EOF
1924725 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/5
pos: 0
flags: 0100000
mnt_id: 82
1924726 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/6
pos: 16384
flags: 0100002
mnt_id: 82
1924727 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/7
pos: 0
flags: 02100002
mnt_id: 82
lock: 1: POSIX ADVISORY WRITE 26056 00:2c:1866491 0 EOF
1924728 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/8
pos: 0
flags: 0100002
mnt_id: 82
1924729 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/9
pos: 0
flags: 0100002
mnt_id: 82
1924730 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/10
pos: 0
flags: 0100002
mnt_id: 82
1924731 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/11
pos: 0
flags: 0100002
mnt_id: 82
1924732 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/12
pos: 0
flags: 02100002
mnt_id: 82
lock: 1: POSIX ADVISORY WRITE 26056 00:2c:1866492 0 EOF
1924733 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/13
pos: 0
flags: 02100002
mnt_id: 82
lock: 1: POSIX ADVISORY WRITE 26056 00:2c:1866493 0 EOF
1924734 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/14
pos: 763
flags: 02102001
mnt_id: 82
1924735 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/15
pos: 0
flags: 0100002
mnt_id: 82
1924736 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/16
pos: 473
flags: 02102001
mnt_id: 82
1924737 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/17
pos: 0
flags: 02000002
mnt_id: 9
1924738 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/18
pos: 0
flags: 02
mnt_id: 9
1924739 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/19
pos: 245
flags: 02100001
mnt_id: 82
1924740 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/20
pos: 0
flags: 0100002
mnt_id: 82
1924741 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/21
pos: 503
flags: 0500002
mnt_id: 82
1924742 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/22
pos: 324
flags: 0100002
mnt_id: 82
1924743 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/23
pos: 642
flags: 0500002
mnt_id: 82
1924744 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/24
pos: 880
flags: 0100002
mnt_id: 82
1924745 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/25
pos: 581
flags: 0500002
mnt_id: 82
1924746 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/26
pos: 1386
flags: 0100002
mnt_id: 82
1924747 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/27
pos: 498
flags: 0500002
mnt_id: 82
1924748 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/28
pos: 0
flags: 0100002
mnt_id: 82
1924749 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/29
pos: 513
flags: 0500002
mnt_id: 82
1924750 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/30
pos: 0
flags: 0100002
mnt_id: 82
1924751 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/31
pos: 494
flags: 0500002
mnt_id: 82
1924752 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/32
pos: 0
flags: 0100002
mnt_id: 82
1924753 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/33
pos: 535
flags: 0500002
mnt_id: 82
1924754 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/34
pos: 0
flags: 0100002
mnt_id: 82
1924755 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/35
pos: 396
flags: 0500002
mnt_id: 82
1924756 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/36
pos: 0
flags: 0100002
mnt_id: 82
1924757 0 -r-------- 1 dan dan 0 Mar 2 10:57 fdinfo/37
pos: 517
flags: 0500002
mnt_id: 82
But set _CRT_NONSTDC_NO_WARNINGS to silence silly warnings about
ANSI C function being non-standard
Remove now deprecated GetVersion()/GetVersionEx(),except single case
where where it is really needed, in feedback plugin. Remove checks for
Windows NT.
Avoid old IPv4-only inet_aton, which generated the warning.
Handle string length as size_t, consistently (almost always:))
Change function prototypes to accept size_t, where in the past
ulong or uint were used. change local/member variables to size_t
when appropriate.
This fix excludes rocksdb, spider,spider, sphinx and connect for now.
This will make it easier to how memory allocation is done when debugging
with either DBUG or gdb.
Will especially help when debugging stored procedures
Main change is a name argument as second argument to init_alloc_root()
init_sql_alloc()
Other things:
- Added DBUG_ENTER/EXIT to some Virtual_tmp_table functions
One can use -DTRASH_FREED_MEMORY to enable TRASH
macros. Useful to do when one suspects that there
is accesses to freed memory.
Extended my_free() to TRASH freed memory
escape all charecters less or equal 0x1F (control symbols)
(shorter sequence are not used to make code simple, long encoding is always legal according to the rfc4627)
TRASH was mapped to TRASH_FREE and was supposed to be used for memory
that should not be accessed anymore, while TRASH_ALLOC() is to be
used for uninitialized but to-be-used memory.
But sometimes TRASH() was used in the latter sense.
Remove TRASH() macro, always use explicit TRASH_ALLOC() or TRASH_FREE().
Resolving a stacktrace including functions in dynamic libraries requires
us to look inside the libraries for the symbols. Addr2line needs to be
started with the correct binary for each address on the stack. To do this,
figure out which library it is using dladdr, then if the addr2line
binary was started with a different binary, fork it again with the
correct one.
We only have one addr2line process running at any point during the
stacktrace resolving step. The maximum number of forks for addr2line should
generally be around 6.
One for server stacktrace code, one for plugin code, one when going back
into server code, one for pthread library, one for libc, one for the
_start function in the server. More can come up if plugin calls server
function which goes back to a plugin, etc.
(from 10.1 to 10.0-galera)
This conflicted signficantly with 7d550c76be
which added --defaults-group-suffix support.
Took the approach of 4bb49d84a9 and adapted the
--defaults-group-suffix handling to be consistent.
The following changes as follows:
SST scripts now use $MY_PRINT_DEFAULTS rather than the lowercase for
consistency and this include all required --default arguements.
Backport/merge by Daniel Black <daniel@linux.vnet.ibm.com>
The reason for adding this was that while testing mysqlbinlog on
a replication event with 3G event output, Linux failed reading
the whole file in memory with one read (only got 2G on first read
even if file had just been written).
- Don't reset info->error on write error in IO_CACHE.
- In case of write_error in IO_CACHE , always return -1
- Fixed wrong result from my_read when using MY_FULL_IO. Also don't give
an error in case of retry.
Main problem was that no log-event print function checked for disk
full error on the IO_CACHE.
All changes in this patch only affects mysqlbinlog, not the server!
- Changed all log-event print functions to return 1 on error
- Fixed memory usage when not using --flashback.
- Added printing of number of rows in row events. Can be disabled with
--print-row-count=0
- Print annotated rows when using mysqlbinlog --short-form
- Fixed that mysqlbinlog --debug works
- Fixed create_drop_binlog.test test failure
- Reorganized fields in PRINT_EVENT_INFO to be according to size to
optimize storage
- Don't change print_row_event_position or print_row_counts if set by user
- Remove some testing of argument to my_free is 0
- base64-output=never is now supported and works in all context
- Updated help information for --base64-output and --short-form
- print_row_count is now on by default. Reset automatically if --short-form
is used
- Removed obsolote warning for mysql 5.6.0
- More DBUG_PRINT for mysqltest.cc
- my_b_write_byte() now checks for flush failures. This fixed a memory
overrun on disk full
- my_b_printf() now returns 1 on failure, 0 on ok. This simplifies code
and no old code was using the old return value of my_b_printf().
- my_b_Write_backtick_quote() now returns 1 on failure and 0 on ok
- Fixed some error conditions in log printing that was not previously
handled.
- Slave_rows_error_report() can now handle longlong positions
- Write_on_release_cache() rewritten so that we can detect errors
on flush. Not depending on automatic release anymore.
- Changed types for Pos and End_log_pos to 64 bit in SHOW BINLOG EVENTS
- Fixed that copy_event_cache_to_string_and_reinit() works with strings
longer than 4G (Changed to use LEX_STRING instead of String)
- Restricted binlog_rows_event_max_size to UINT32_MAX-1 as String's are
anyway restricted to UINT32_MAX
- Fixed bug in rpl_binlog_state::write_to_iocache() which hide write
failures (duplicate variable name)
- Fixed bug in String::append if original string was not allocated
- Stop mysqlbinlog output at once if there is an error.
- Before printing error message, flush result file. This ensures that
the error message is printed last. (Easier to find)
find_type_or_exit() client helper did exit(1) on error, exit(1) moved to
clients.
mysql_read_default_options() did exit(1) on error, error is passed through and
handled now.
my_str_malloc_default() did exit(1) on error, replaced my_str_ allocator
functions with normal my_malloc()/my_realloc()/my_free().
sql_connect.cc did many exit(1) on hash initialisation failure. Removed error
check since my_hash_init() never fails.
my_malloc() did exit(1) on error. Replaced with abort().
my_load_defaults() did exit(1) on error, replaced with return 2.
my_load_defaults() still does exit(0) when invoked with --print-defaults.
* The version of tcmalloc is written to the system variable
'version_malloc_library' if tcmalloc is used, similarly to
jemalloc
* Extracted method guess_malloc_library()
Other things:
- Cleanup of allocated bitmaps done in open(), which
simplifies init_partition_bitmaps()
- Add needed defines in ha_spider.cc to enable new spider code
- Fixed some DBUG_PRINT() to be consistent with normal code
- Removed end space
- The changes in test cases partition_innodb, partition_range,
partition_pruning etc are becasue partitions can now more exactly
calculate the number of rows in a range.
Contains spider patches:
014,015,023,033,035,037,040,042,044,045,049,050,051,053,059
MDL_CONTEXT::TRY_ACQUIRE_LOCK_IMPL
ANALYSIS:
=========
Server sometimes exited when multiple threads tried to
acquire and release metadata locks simultaneously (for
example, necessary to access a table). The same problem
could have occurred when new objects were registered/
deregistered in Performance Schema.
The problem was caused by a bug in LF_HASH - our lock free
hash implementation which is used by metadata locking
subsystem in 5.7 branch. In 5.5 and 5.6 we only use LF_HASH
in Performance Schema Instrumentation implementation. So
for these versions, the problem was limited to P_S.
The problem was in my_lfind() function, which searches for
the specific hash element by going through the elements
list. During this search it loads information about element
checked such as key pointer and hash value into local
variables. Then it confirms that they are not corrupted by
concurrent delete operation (which will set pointer to 0)
by checking if element is still in the list. The latter
check did not take into account that compiler (and
processor) can reorder reads in such a way that load of key
pointer will happen after it, making result of the check
invalid.
FIX:
====
This patch fixes the problem by ensuring that no such
reordering can take place. This is achieved by using
my_atomic_loadptr() which contains compiler and processor
memory barriers for the check mentioned above and other
similar places.
The default (for non-Windows systems) implementation of
my_atomic*() relies on old __sync intrisics and implements
my_atomic_loadptr() as read-modify operation. To avoid
scalability/performance penalty associated with addition of
my_atomic_loadptr()'s we change the my_atomic*() to use
newer __atomic intrisics when available. This new default
implementation doesn't have such a drawback.
The background is that one user had a lot of views and using some complex
queries on information schema temporary memory of more than 2G was used.
- Added new element 'total_alloc' to MEM_ROOT for easier debugging.
- Added MAX_MEMORY_USED to information_schema.processlist.
- Added new status variable "Memory_used_initial" that shows how much MariaDB
uses at startup. This gives the base value for "Memory_used".
- Reuse memory continuously for information schema queries instead of
only freeing memory at query end.
Other things
- Removed some not needed set_notnull() calls for not null columns.
finish_resize_simple_key_cache removed argument acquire for acquiring
locks.
resize_simple_key_cache enforces assertion that the cache is inited.
read_block was really two functions, primary and secondary so separated
as such. Make the callers of read_block explictly use the required function.
Signed-off-by: Daniel Black <daniel@linux.vnet.ibm.com>
- Fix win64 pointer truncation warnings
(usually coming from misusing 0x%lx and long cast in DBUG)
- Also fix printf-format warnings
Make the above mentioned warnings fatal.
- fix pthread_join on Windows to set return value.
Reintroduce environment variable MYSQL_GROUP_SUFFIX to be used as
--default-group-suffix value if not already set.
The environment variable was accidentally renamed to DEFAULT_GROUP_SUFFIX_ENV
in MySQL server 5.5.
ERROR_FILE_SYSTEM_LIMITATION was seen by support when backing up large
file. However mariabackup error message was not very helpful,
since it mapped the error to generic catch-all EINVAL.
With the patch, ERROR_FILE_SYSTEM_LIMITATION will be mapped to more
appropriate EFBIG. Also add mapping from ERROR_NO_SYSTEM_RESOURCES
to ENOMEM.
This fixes MDEV-7742 and MDEV-8305 (Allow user to specify if stored
procedures should be logged in the slow and general log)
New functionality:
- Added new variables log_slow_disable_statements and log_disable_statements
that can be used to disable logging of certain queries to slow and
general log. Currently supported options are 'admin', 'call', 'slave'
and 'sp'.
Defaults are as before. Only 'sp' (stored procedure statements) is
disabled for slow and general_log.
- Slow log to files now includes the following new information:
- When logging stored procedure statements the name of stored
procedure is logged.
- Number of created tmp_tables, tmp_disk_tables and the space used
by temporary tables.
- When logging 'call', the logged status now contains the sum of all
included statements. Before only 'time' was correct.
- Added filsort_priority_queue as an option for log_slow_filter (this
variable existed before, but was not exposed)
- Added support for BIT types in my_getopt()
Mapped some old variables to bitmaps (old variables can still be used)
- Variable 'log_queries_not_using_indexes' is mapped to
log_slow_filter='not_using_index'
- Variable 'log_slow_slave_statements' is mapped to
log_slow_disabled_statements='slave'
- Variable 'log_slow_admin_statements' is mapped to
log_slow_disabled_statements='admin'
- All the above variables are changed to session variables from global
variables
Other things:
- Simplified LOGGER::log_command. We don't need to check for super if
OPTION_LOG_OFF is set as this flag can only be set if one is a super
user.
- Removed some setting of enable_slow_log as it's guaranteed to be set by
mysql_parse()
- mysql_admin_table() now sets thd->enable_slow_log
- Added prepare_logs_for_admin_command() to reset thd->enable_slow_log if
needed.
- Added new functions to store, restore and add slow query status
- Added new functions to store and restore query start time
- Reorganized Sub_statement_state according to types
- Added code in dispatch_command() to ensure that
thd->reset_for_next_command() is always called for a query.
- Added thd->last_sql_command to simplify checking of what was the type
of the last command. Needed when logging to slow log as lex->sql_command
may have changed before slow logging is called.
- Moved QPLAN_TMP_... to where status for tmp tables are updated
- Added new THD variable, affected_rows, to be able to correctly log
number of affected rows to slow log.
If compiling a non DBUG binary with
-DDBUG_ASSERT_AS_PRINTF asserts will be
changed to printf + stack trace (of stack
trace are enabled).
- Changed #ifndef DBUG_OFF to
#ifdef DBUG_ASSERT_EXISTS
for those DBUG_OFF that was just used to enable
assert
- Assert checking that could greatly impact
performance where changed to DBUG_ASSERT_SLOW which
is not affected by DBUG_ASSERT_AS_PRINTF
- Added one extra option to my_print_stacktrace() to
get more silent in case of stack trace printing as
part of assert.
- Simplified use_trans_cache() to return at once if is_transactional is set
- Indentation and spelling errors fixed
- Don't call signal_update() if update_binlog_end_pos() is called as the
function already calls signal_update()
- Removed not used function wait_for_update_bin_log(), which would cause
errors if ever used.
- Simplified handler::clone() by always allocating 'ref' in ha_open(). To do
this I added an optional MEM_ROOT argument to ha_open() to be used when
allocating 'ref'
- Changed arguments to get_system_var() from LEX_CSTRING to LEX_CSTRING*
- Added THD as argument to create_select_for_variable(). Changed also char*
argument to LEX_CSTRING to avoid strlen() call.
- Change calls to append() to use LEX_CSTRING
- Added sql/mariadb.h file that should be included first by files in sql
directory, if sql_plugin.h is not used (sql_plugin.h adds SHOW variables
that must be done before my_global.h is included)
- Removed a lot of include my_global.h from include files
- Removed include's of some files that my_global.h automatically includes
- Removed duplicated include's of my_sys.h
- Replaced include my_config.h with my_global.h
A few tests assumes that the CYCLE timer is always available,
which is not true on some platforms (e.g. ARM).
Fixing the tests not to reply on the CYCLE availability.
These self references were previously used to avoid having to check the
IO_CACHE's type. However, a benchmark shows that on x86 5930k stock,
the type comparison is marginally faster than the double pointer dereference.
For 40 billion my_b_tell calls, the difference is .1 seconds in favor of performing the
type check. (Basically there is no measurable difference)
To prevent bugs from copying the structure using the equals(=) operator,
and having to do the bookkeeping manually, remove these "convenience"
variables.
The XtraDB storage engine was already replaced by InnoDB
and disabled in MariaDB Server 10.2. Let us remove it altogether
to avoid dragging dead code around.
Replace some references to XtraDB with references to InnoDB.
rpl_get_position_info(): Remove.
Remove the mysql-test-run --suite=percona, because it only contains
tests specific to XtraDB, many of which were disabled already in
earlier MariaDB versions.
This excludes MDEV-12472 (InnoDB should accept XtraDB parameters,
warning that they are ignored). In other words, MariaDB 10.3 will not
recognize any XtraDB-specific parameters.
- Old sequence code forced row based replication for any statements that
refered to a sequence table. What is new is that row based replication
is now sequence aware:
- NEXT VALUE is now generating a short row based event with only
next_value and round being replicated.
- Short row based events are now on the slave updated as trough
SET_VALUE(sequence_name)
- Full row based events are on the slave updated with a full insert,
which is practically same as ALTER SEQUENCE.
- INSERT on a SEQUENCE table does now a EXCLUSIVE LOCK to ensure that
it is logged in binary log before any following NEXT VALUE calls.
- Enable all sequence tests and fixed found bugs
- ALTER SEQUENCE doesn't anymore allow changes that makes the next_value
outside of allowed range
- SEQUENCE changes are done with TL_WRITE_ALLOW_WRITE. Because of this
one can generate a statement for MyISAM with both
TL_WRITE_CONCURRENT_INSERT and TL_WRITE_ALLOW_WRITE. To fix a warning
I had to add an extra test in thr_lock.c for this.
- Removed UPDATE of SEQUENCE (no need to support this as we
have ALTER SEQUENCE, which takes the EXCLUSIVE lock properly.
- Removed DBUG_ASSERT() in MDL_context::upgrade_shared_lock. This was
removed upstream in MySQL 5.6 in 72f823de453.
- Simplified test in decided_logging_format() by using sql_command_flags()
- Fix that we log DROP SEQUENCE correctly.
- Fixed that Aria works with SEQUENCE
Significantly reduce the amount of InnoDB, XtraDB and Mariabackup
code changes by defining pfs_os_file_t as something that is
transparently compatible with os_file_t.
This affected mainly MyISAM and Aria engines.
Also fixed that end_bulk_insert() detects errors from
internal mi_end_bulk_insert() and ma_end_bulk_insert()
- delete_tree() and delete_tree_element() now has an
extra argument that marks if future calls to
tree->free should be ignored.
- tree->free changed to function returning int, to be
able to signal errors.
- Restored deleting flag in MyISAM that was accidently
disabled in mi_extra(PREPARE_FOR_DROP)
SYMLINK CHECK RACE CONDITIONS
ANALYSIS:
=========
A potential defect exists in the handling of CREATE
TABLE .. DATA DIRECTORY/ INDEX DIRECTORY which gives way to
the user to gain access to another user table or a system
table.
FIX:
====
The lstat and fstat output of the target files are now
stored which help in determining the identity of the target
files thus preventing the unauthorized access to other
files.
This was because of two issues:
- thr_multi_lock_after_thr_lock needed to be hit 3 times before go2 could
be signaled, because 2 of these happened before statistics_update_start
was reached.
- The original code didn't take into accunt that thr_locks can be executed in
any random order, which caused sporadic failures when waiting for 1 lock
of 3, as if the locks where in different order, there would be a dead-lock.
Fixed by introducing thr_multi_lock_before_thr_lock which is deterministic.
- Some of the test failures where not noticed as the DEBUG_SYNC timeout
would cause the test to pass (after 300 seconds).
Working features:
CREATE OR REPLACE [TEMPORARY] SEQUENCE [IF NOT EXISTS] name
[ INCREMENT [ BY | = ] increment ]
[ MINVALUE [=] minvalue | NO MINVALUE ]
[ MAXVALUE [=] maxvalue | NO MAXVALUE ]
[ START [ WITH | = ] start ] [ CACHE [=] cache ] [ [ NO ] CYCLE ]
ENGINE=xxx COMMENT=".."
SELECT NEXT VALUE FOR sequence_name;
SELECT NEXTVAL(sequence_name);
SELECT PREVIOUS VALUE FOR sequence_name;
SELECT LASTVAL(sequence_name);
SHOW CREATE SEQUENCE sequence_name;
SHOW CREATE TABLE sequence_name;
CREATE TABLE sequence-structure ... SEQUENCE=1
ALTER TABLE sequence RENAME TO sequence2;
RENAME TABLE sequence TO sequence2;
DROP [TEMPORARY] SEQUENCE [IF EXISTS] sequence_names
Missing features
- SETVAL(value,sequence_name), to be used with replication.
- Check replication, including checking that sequence tables are marked
not transactional.
- Check that a commit happens for NEXT VALUE that changes table data (may
already work)
- ALTER SEQUENCE. ANSI SQL version of setval.
- Share identical sequence entries to not add things twice to table list.
- testing insert/delete/update/truncate/load data
- Run and fix Alibaba sequence tests (part of mysql-test/suite/sql_sequence)
- Write documentation for NEXT VALUE / PREVIOUS_VALUE
- NEXTVAL in DEFAULT
- Ensure that NEXTVAL in DEFAULT uses database from base table
- Two NEXTVAL for same row should give same answer.
- Oracle syntax sequence_table.nextval, without any FOR or FROM.
- Sequence tables are treated as 'not read constant tables' by SELECT; Would
be better if we would have a separate list for sequence tables so that
select doesn't know about them, except if refereed to with FROM.
Other things done:
- Improved output for safemalloc backtrack
- frm_type_enum changed to Table_type
- Removed lex->is_view and replaced with lex->table_type. This allows
use to more easy check if item is view, sequence or table.
- Added table flag HA_CAN_TABLES_WITHOUT_ROLLBACK, needed for handlers
that want's to support sequences
- Added handler calls:
- engine_name(), to simplify getting engine name for partition and sequences
- update_first_row(), to be able to do efficient sequence implementations.
- Made binlog_log_row() global to be able to call it from ha_sequence.cc
- Added handler variable: row_already_logged, to be able to flag that the
changed row is already logging to replication log.
- Added CF_DB_CHANGE and CF_SCHEMA_CHANGE flags to simplify
deny_updates_if_read_only_option()
- Added sp_add_cfetch() to avoid new conflicts in sql_yacc.yy
- Moved code for add_table_options() out from sql_show.cc::show_create_table()
- Added String::append_longlong() and used it in sql_show.cc to simplify code.
- Added extra option to dd_frm_type() and ha_table_exists to indicate if
the table is a sequence. Needed by DROP SQUENCE to not drop a table.
Also, implement MDEV-11027 a little differently from 5.5 and 10.0:
recv_apply_hashed_log_recs(): Change the return type back to void
(DB_SUCCESS was always returned).
Report progress also via systemd using sd_notifyf().
mysql_client uses some inline assembly code to switch thread stacks.
This works, however tools that perform backtrace get confused to fix
this we write a specific constant to signify bottom of stack. This
constant is needed when compiling with CLang as well.
Define my_thread_id as an unsigned type, to avoid mismatch with
ulonglong. Change some parameters to this type.
Use size_t in a few more places.
Declare many flag constants as unsigned to avoid sign mismatch
when shifting bits or applying the unary ~ operator.
When applying the unary ~ operator to enum constants, explictly
cast the result to an unsigned type, because enum constants can
be treated as signed.
In InnoDB, change the source code line number parameters from
ulint to unsigned type. Also, make some InnoDB functions return
a narrower type (unsigned or uint32_t instead of ulint;
bool instead of ibool).
my_readline can fail due to missing file. Make my_readline report this
condition separately so that we can catch it and report an appropriate
error message to the user.
The reason for this is that stop slave takes LOCK_active_mi over the
whole operation while some slave operations will also need LOCK_active_mi
which causes deadlocks.
Fixed by introducing object counting for Master_info and not taking
LOCK_active_mi over stop slave or even stop_all_slaves()
Another benefit of this approach is that it allows:
- Multiple threads can run SHOW SLAVE STATUS at the same time
- START/STOP/RESET/SLAVE STATUS on a slave will not block other slaves
- Simpler interface for handling get_master_info()
- Added some missing unlock of 'log_lock' in error condtions
- Moved rpl_parallel_inactivate_pool(&global_rpl_thread_pool) to end
of stop_slave() to not have to use LOCK_active_mi inside
terminate_slave_threads()
- Changed argument for remove_master_info() to Master_info, as we always
have this available
- Fixed core dump when doing FLUSH TABLES WITH READ LOCK and parallel
replication. Problem was that waiting for pause_for_ftwrl was not done
when deleting rpt->current_owner after a force_abort.
it was race condition prone. instead use either a pair of my_delete()
calls with already resolved paths, or a safe high-level function
my_handler_delete_with_symlink(), like MyISAM and Aria already do.
TOCTOU bug. The path is checked to be valid, symlinks are resolved.
Then the resolved path is opened. Between the check and the open,
there's a window when one can replace some path component with a
symlink, bypassing validity checks.
Fix: after we resolved all symlinks in the path, don't allow open()
to resolve symlinks, there should be none.
Compared to the old MyISAM/Aria code:
* fastpath. Opening of not-symlinked files is just one open(),
no fn_format() and lstat() anymore.
* opening of symlinked tables doesn't do fn_format() and lstat() either.
it also doesn't to realpath() (which was lstat-ing every path
component), instead if opens every path component with O_PATH.
* share->data_file_name stores realpath(path) not readlink(path). So,
SHOW CREATE TABLE needs to do lstat/readlink() now (see ::info()),
and certain error messages (cannot open file "XXX") show the real
file path with all symlinks resolved.
Don't let my_register_filename() fail because strdup() failed. Better to
have NULL for a filename, then to fail the already successful open().
Filenames are only used for error reporting and there was already code
to ignore OOMs (my_fdopen()) and to cope with missing filenames
(my_filename()).