Revert following bug fix:
Bug#20685029: SLAVE IO THREAD SHOULD STOP WHEN DISK IS
FULL
Bug#21753696: MAKE SHOW SLAVE STATUS NON BLOCKING IF IO
THREAD WAITS FOR DISK SPACE
This fix results in a deadlock between slave IO thread
and SQL thread.
(cherry picked from commit e3fea6c6dbb36c6ab21c4ab777224560e9608b53)
- Avoid some realloc() during startup
- Ensure that file_key_management_plugin frees it's memory early, even if
it's linked statically.
- Fixed compiler warnings from unused variables and missing destructors
- Fixed wrong indentation
FAILURES
Analysis:
=========
Test script is not ensuring that "assert_grep.inc" should be
called only after 'Disk is full' error is written to the
error log.
Test checks for "Queueing master event to the relay log"
state. But this state is set before invoking 'queue_event'.
Actual 'Disk is full' error happens at a very lower level.
It can happen that we might even reset the debug point
before even the actual disk full simulation occurs and the
"Disk is full" message will never appear in the error log.
In order to guarentee that we must have some mechanism where
in after we write "Disk is full" error messge into the error
log we must signal the test to execute SSS and then reset
the debug point. So that test is deterministic.
Fix:
===
Added debug sync point to make script deterministic.
This is done by splitting variables.errmsg and locale.errmsg to
variables.errmsg_extra and locale.errmsg_extra
The ER() macros in unireg.h now looks more complex than before, but this
isn't critical as most usage of them are with constants and the compiler
will remove most of the test code.
- Removing the "diff_if_only_endspace_difference" argument from
MY_COLLATION_HANDLER::strnncollsp(), my_strnncollsp_simple(),
as well as in the function template MY_FUNCTION_NAME(strnncollsp)
in strcoll.ic
- Removing the "diff_if_only_space_different" from ha_compare_text(),
hp_rec_key_cmp().
- Adding a new function my_strnncollsp_padspace_bin() and reusing
it instead of duplicate code pieces in my_strnncollsp_8bit_bin(),
my_strnncollsp_latin1_de(), my_strnncollsp_tis620(),
my_strnncollsp_utf8_cs().
- Adding more tests for better coverage of the trailing space handling.
- Removing the unused definition of HA_END_SPACE_ARE_EQUAL
don't allocate all the stack, leave some stack for
function calls.
To test I added the following line:
alloca_size = available_stack_size() - X
at X=4096 or less mysqld crashed, at 8192 mtr test passed.
UNIQUE::~UNIQUE | SQL/UNIQUES.CC:355
Analysis
========
Enabling the sort_buffer_size with a large value
can cause operations utilizing the sort buffer
like DELETE as mentioned in the bug report to
fail. 5.5 and 5.6 versions reports OOM error
while in 5.7+, the server crashes.
While initializing the mem_root for the sort buffer
tree, the block size for the mem_root is determined
from the 'sort_buffer_size' value. This unsigned
long value is typecasted to unsigned int, hence
it becomes zero. Further block_size computation
while initializing the mem_root results in a very
large block_size value. Hence while trying to
allocate a block during the DELETE operation,
an OOM error is reported. In case of 5.7+, the PFS
instrumentation for memory allocation, overshoots
the unsigned value and allocates a block of just
one byte. While trying to free the block of the
mem_root, the original block_size is used. This
triggers the crash since the server tries to free
unallocated memory.
Fix:
====
In order to restrict usage of such unreasonable
sort_buffer_size, the typecast of block size
to 'unsigned int' is removed and hence reports
OOM error across all versions for sizes
exceeding unsigned int range.
FULL
Bug#21753696: MAKE SHOW SLAVE STATUS NON BLOCKING IF IO
THREAD WAITS FOR DISK SPACE
Problem:
========
Currently SHOW SLAVE STATUS blocks if IO thread waits for
disk space. This makes automation tools verifying
server health block on taking relevant action. Finally this
will create SHOW SLAVE STATUS piles.
Analysis:
=========
SHOW SLAVE STATUS hangs on mi->data_lock if relay log write
is waiting for free disk space while holding mi->data_lock.
mi->data_lock is needed to protect the format description
event (mi->format_description_event) which is accessed by
the clients running FLUSH LOGS and slave IO thread. Note
relay log writes don't need to be protected by
mi->data_lock, LOCK_log is used to protect relay log between
IO and SQL thread (see MYSQL_BIN_LOG::append_event). The
code takes mi->data_lock to protect
mi->format_description_event during relay log rotate which
might get triggered right after relay log write.
Fix:
====
Release the data_lock just for the duration of writing into
relay log.
Made change to ensure the following lock order is maintained
to avoid deadlocks.
data_lock, LOCK_log
data_lock is held during relay log rotations to protect
the description event.
Compile time assertion "sizeof(struct st_irem) % sizeof(double) == 0" started
to fail on 32bit systems after my_thread_id was changed from ulong to int64.
Fixed by added padding to struct st_irem on 32bit systems.
Creating a CONNECT object on client connect and pass this to the working thread which creates the THD.
Split LOCK_thread_count to different mutexes
Added LOCK_thread_start to syncronize threads
Moved most usage of LOCK_thread_count to dedicated functions
Use next_thread_id() instead of thread_id++
Other things:
- Thread id now starts from 1 instead of 2
- Added cast for thread_id as thread id is now of type my_thread_id
- Made THD->host const (To ensure it's not changed)
- Removed some DBUG_PRINT() about entering/exiting mutex as these was already logged by mutex code
- Fixed that aborted_connects and connection_errors_internal are counted in all cases
- Don't take locks for current_linfo when we set it (not needed as it was 0 before)
cherry-pick f1daf9ce from 10.0 branch
-------------------------------------
Fix build failures caused by new C runtime library
- isnan, snprintf, struct timespec are now defined, attempt to
redefine them leads
- P_tmpdir, tzname are no more defined
- lfind() and lsearch() in lf_hash.c had to be renamed, declaration
conflicts with some C runtime functions with the same name declared in
a header included by stdlib.h
Also fix couple of annoying warnings :
- remove #define NOMINMAX from config.h to avoid "redefined" compiler
warnings(NOMINMAX is already in compile flags)
- disable incremental linker in Debug as well (feature not used much
and compiler crashes often)
Also simplify package building with Wix, require Wix 3.9 or later
(VS2015 is not compatible with old Wix 3.5/3.6)
Post-push fix: The problem was that condition variable
timeouts could in some cases (slow machines and/or short
timeouts) be infinite.
When the number of milliseconds to wait is computed, the
end time is computed before the now() time. This can result
in the now() time being later than the end time, leading to
negative timeout. Which after conversion to unsigned becomes
~infinite.
This patch fixes the problem by explicitly checking if we
get negative timeout and then using 0 if this is the case.
In order to get all the input from addr2line we must read in a loop,
until the response is complete. Also, in case that the response is
malformed, we must not end up reading invalid memory.
Problem Statement
=========
Fix various issues when building MySQL with Visual Studio 2015.
Fix:
=======
- Visual Studio 2015 adds support for timespec. Add check and
related code to use this and only use our replacement if
timespec is not defined.
- Rename lfind/lsearch to my* to avoid redefinition problems.
- Set default value for TMPDIR to "" on Windows as P_tmpdir
no longer exists.
- using VS definition of snprintf if available
- tzname are now renamed to _tzname.
This includes fixing all utilities to not have any memory leaks,
as safemalloc warnings stopped tests from passing on MacOSX.
- Ensure that all clients takes character-set-dir, as the
libmysqlclient library will use it.
- mysql-test-run now passes character-set-dir to all external clients.
- Changed dynstr_free() so that it can be called twice (made freeing code easier)
- Changed rpl_global_gtid_slave_state to be allocated dynamicly as it
includes a mutex that needs to be initizlied/destroyed before my_end() is called.
- Removed rpl_slave_state::init() and rpl_slave_stage::deinit() as
their job are better handling by constructor and delete.
- Print alias instead of table_name in check_duplicate_key as
table_name may have been converted to lower case.
Other things:
- Fixed a case in time_to_datetime_with_warn() where we where
using && instead of & in tests
The bitmap implementation defines two template Bitmap classes. One
optimized for 64-bit (default) wide bitmaps while the other is used for
all other widths.
In order to optimize the computations, Bitmap<64> class has defined its
own member functions for bitmap operations, the other one, however,
relies on mysys' bitmap implementation (mysys/my_bitmap.c).
Issue 1:
In case of non 64-bit Bitmap class, intersect() wrongly reset the
received bitmap while initialising a new local bitmap structure
(bitmap_init() clears the bitmap buffer) thus, the received bitmap was
getting cleared.
Fixed by initializing the local bitmap structure by using a temporary
buffer and later copying the received bitmap to the initialised bitmap
structure.
Issue 2:
The non 64-bit Bitmap class had the Iterator missing which caused
compilation failure.
Also added a cmake variable to hold the MAX_INDEXES value when supplied
from the command prompt. (eg. cmake .. -DMAX_INDEXES=128U). Checks have
been put in place to trigger build failure if MAX_INDEXES value is
greater than 128.
Test modifications:
* Introduced include/have_max_indexes_[64|128].inc to facilitate
skipping of tests for which the output differs with different
MAX_INDEXES.
* Introduced include/max_indexes.inc which would get modified by cmake
to reflect the MAX_INDEXES value used to build the server. This file
simply sets an mtr variable '$max_indexes' to show the MAX_INDEXES
value, which will then be consumed by the above introduced include file.
* Some tests (portions), dependent on MAX_INDEXES value, have been moved
to separate test files.
Fix build failures caused by new C runtime library
- isnan, snprintf, struct timespec are now defined, attempt to
redefine them leads
- P_tmpdir, tzname are no more defined
- lfind() and lsearch() in lf_hash.c had to be renamed, declaration
conflicts with some C runtime functions with the same name declared in
a header included by stdlib.h
Also fix couple of annoying warnings :
- remove #define NOMINMAX from config.h to avoid "redefined" compiler
warnings(NOMINMAX is already in compile flags)
- disable incremental linker in Debug as well (feature not used much
and compiler crashes often)
Also simplify package building with Wix, require Wix 3.9 or later
(VS2015 is not compatible with old Wix 3.5/3.6)
A few tests assumes that the CYCLE timer is always available,
which is not true on some platforms (e.g. ARM).
Fixing the tests not to reply on the CYCLE availability.
Addendum:
* Before calling THD::init_for_queries(), flip the current_thd to wsrep
thread so that memory gets allocated for the right THD.
* Use wsrep_creating_startup_threads instead of plugins_are_initialized
as the condition for the execution of THD::init_for_queries() within
start_wsrep_THD(), as use of latter could still leave some room for
race.
Instead of encrypt(src, dst, key, iv) that encrypts all
data in one go, now we have encrypt_init(key,iv),
encrypt_update(src,dst), and encrypt_finish(dst).
This also causes collateral changes in the internal my_crypt.cc
encryption functions and in the encryption service.
There are wrappers to provide the old all-at-once encryption
functionality. But binlog events are often written piecewise,
they'll need the new api.
PID_FILE CHECK LEADS TO OOM SIG 11
Description:- A server started with 'query_alloc_block_size'
option set to a certain range of negative values on a
machine without enough memory may lead to OOM.
Analysis:- Server uses 'strtoull()' to convert server
variable values of type 'GET_UINT', 'GET_ULONG' or 'GET_ULL'
from string to unsigned long long. According to the man
page, 'strtoull()' function returns either the result of the
conversion or, if there was a leading minus sign, the
negation of the result of the conversion represented as an
unsigned value, unless the original(nonnegated) value would
overflow; in the latter case, strtoull() returns ULLONG_MAX
and sets errno to ERANGE. So 'strtoull()' converts a small
negative value to a larger postive value. For example string
'-1125899906842624' will be converted to an unsigned value,
'18445618173802708992' (ulonglong typecast of
'-1125899906842624'). So a
server started with 'query_alloc_block_size' set to
"-1125899906842624" on a machine without enough memory will
lead to OOM since server allocates '18445618173802708992'
bytes(17178820608 GB) for query allocation block.
Fix:- When server is started with any server variable, of
type "GET_UINT", "GET_ULONG" or "GET_ULL", set to a negative
value, a warning, "option xxx: value -yyy adjusted to zzz"
is thrown and the value is adjusted to the lowest possible
value for that variable. The dynamic server variable which
is configured through the client exhibit the same behavior
as fix made for variables configured during the server
start up.
of more than 45G with a key_cache_block_size of 1024 or less.
The problem was that some of the arguments to my_multi_malloc() got to be
more than 4G.
Fix:
- Inntroduced my_multi_malloc_large() that can handle big regions.
- Changed MyISAM and Aria key caches to use my_multi_malloc_large().
I didn't change the default my_multi_malloc() as this would be a too big
patch and we don't allocate 4G blocks anywhere else.
- Changed ER(ER_...) to ER_THD(thd, ER_...) when thd was known or if there was many calls to current_thd in the same function.
- Changed ER(ER_..) to ER_THD_OR_DEFAULT(current_thd, ER...) in some places where current_thd is not necessary defined.
- Removing calls to current_thd when we have access to thd
Part of this is optimization (not calling current_thd when not needed),
but part is bug fixing for error condition when current_thd is not defined
(For example on startup and end of mysqld)
Notable renames done as otherwise a lot of functions would have to be changed:
- In JOIN structure renamed:
examined_rows -> join_examined_rows
record_count -> join_record_count
- In Field, renamed new_field() to make_new_field()
Other things:
- Added DBUG_ASSERT(thd == tmp_thd) in Item_singlerow_subselect() just to be safe.
- Removed old 'tab' prefix in JOIN_TAB::save_explain_data() and use members directly
- Added 'thd' as argument to a few functions to avoid calling current_thd.
* remove unused (and not implemented) WRITE_NET type
* remove cast in my_b_write() macro. my_b_* macros are
function-like, casts are responsibility of the caller
* replace hackish _my_b_write(info,0,0) with the explicit
my_b_flush_io_cache() in my_b_write_byte()
* remove unused my_b_fill_cache()
* replace pbool -> my_bool
* make internal IO_CACHE functions static
* reformat comments, correct typos, remove obsolete comments (ISAM)
* assert valid cache type in init_functions()
* use IO_ROUND_DN() macro where appropriate
* remove unused DBUG_EXECUTE_IF in _my_b_cache_write()
* remove unnecessary __attribute__((unused))
* fix goto error in parse_file.cc
* remove redundant reinit_io_cache() in uniques.cc
* don't do reinit_io_cache() if the cache was not initialized
in ma_check.c
* extract duplicate functionality from various _my_b_*_read
functions into a common wrapper. Same for _my_b_*_write
* create _my_b_cache_write_r instead of having if's in
_my_b_cache_write (similar to existing _my_b_cache_read and
_my_b_cache_read_r)
* don't call mysql_file_write() from my_b_flush_io_cache(),
call info->write_function() instead
remove some 14-year old code that added support for
LOAD DATA replication to IO_CACHE:
* three callbacks, of which only two were actually used and that
were only needed for LOAD DATA replication but were
tested in every IO_CACHE instance
* an additional opaque void * argument in IO_CACHE, also only
used for LOAD DATA replication, but present everywhere
* the code to close IO_CACHE prematurely in LOAD DATA to have
these callbacks called in the correct order and a long
comment explaining what will happen if IO_CACHE is not
closed prematurely
* a variable to track whether IO_CACHE was closed prematurely
(to avoid double-closing it)
* my_aes.h doesn't compile without my_global.h
* typo in a comment
* redundant condition
* if encryption plugin fails, there's no encryption_key_manager
at plugin deinit time
* encryption plugin tests must run when plugin.so is present,
not when a plugin is active (otherwise the test will be skipped
when plugin fails to initialize).
Take into account that PIE binaries are loaded at some offset, so
addresses cannot be directly resolved with addr2line. Find this
offset and subtract it before resolving an address.
fails in buildbot)
There was a race condition in timer functionality of query timeouts.
Race was as following:
main thread: init_thr_timers()
timer handler thread: my_thread_init()
main thread: end_thr_timer()/timer_thread_state= ABORTING
timer handler thread: timer_thread_state= RUNNING, continue normal op
main thread: waits indefinitely for timer handler thread to go down
The original idea of the fix is to set RUNNING state in main thread, before
starting timer handler thread. But it didn't survive further cleanups:
- removed "timer_thread_state" and used "thr_timer_inited" for this purpose
- removed unused "timer_thread_running"
- removed code responisble for "timer handler thread" shutdown synchronization,
use pthread_join() instead.
mysql-test/suite/maria/insert_select.result:
Added test case
mysql-test/suite/maria/insert_select.test:
Added test case
mysys/thr_lock.c:
Ensure we don't allow concurrent_insert when a read_no_write lock is in use
Stage "Filling schema table" is now properly shown in 'show processlist'
mysys/mf_keycache.c:
Simple cleanup with more comments
sql/lock.cc:
Return to original stage after mysql_lock_tables
Made 'Table lock' as a true stage
sql/sql_show.cc:
Restore original stage after get_schema_tables_result
if lf_hash_delete() and lf_hash_search() cannot create a new bucket
because of OOM, they'll start the search from the parent bucket.
As for lf_hash_iterate() - it only ever uses bucket number 0, so if it
cannot create *that* bucket, the hash must surely be empty.
casts, etc
real changes are:
* remove one retry, it is enough to check for DELETED
after the key is read
* advance 'head' pointer when we see a dummy node to have
shorter retries
There was a bug in lock handling when mixing INSERT ... SELECT on the same table.
mysql-test/suite/maria/insert_select.result:
Test case for MDEV_4010
mysql-test/suite/maria/insert_select.test:
Test case for MDEV_4010
mysys/thr_lock.c:
We wrongly alldoed TL_WRITE_CONCURRENT_INSERT when there was a TL_READ_NO_INSERT lock
commit 85fd3d901311688e18ffce92ffc78129e5625791
Author: Monty <monty@mariadb.org>
Date: Fri Aug 29 14:07:43 2014 +0300
my_alloc.c
- Changed 0x%lx -> %p
array.c:
- Static (preallocated) buffer can now be anywhere
my_sys.h
- Define MY_INIT_BUFFER_USED
sql_delete.cc & sql_lex.cc
- Use memroot when allocating classes (avoids call to current_thd)
sql_explain.h:
- Use preallocated buffers
sql_explain.cc:
- Use preallocated buffers and memroot
sql_select.cc:
- Use multi_alloc_root() instead of many alloc_root()
- Update calls to Explain
All callers of open_cached_file() use 2 characters prefix. Allocating memory for
such short string is an overkill. Store it on IO_CACHE structure instead.
All callers of open_cached_file() use mysql_tmpdir as dir. No need to allocate
memory for it since it is constant and available till server shutdown.
This reduces number of allocations from 31 to 27 per OLTP RO transaction.
Problem: For every event read, mysqlbinlog calls localtime() which in turn
calls stat(/etc/localtime) which is causing kernel mutex contention.
Analysis and Fix:
localtime() calls stat(/etc/localtime) for every instance of the call
where as localtime_r() the reentrant version was optimized to store
the read only tz internal structure. Hence it will not call
stat(/etc/localtime). It will call only once at the beginning.
The mysql server is calling localtime_r() and mysqlbinlog tool is
one place where we are still using localtime().
Once the process (mysqlbinlog) is started if timezone is changed
it will be not picked up the the process and it will continue
with the same values as the beginning of the process. This
behavior is in-lined with mysql server.
Also adding localtime_r() and gmtime_r() support for windows.
Problem: For every event read, mysqlbinlog calls localtime() which in turn
calls stat(/etc/localtime) which is causing kernel mutex contention.
Analysis and Fix:
localtime() calls stat(/etc/localtime) for every instance of the call
where as localtime_r() the reentrant version was optimized to store
the read only tz internal structure. Hence it will not call
stat(/etc/localtime). It will call only once at the beginning.
The mysql server is calling localtime_r() and mysqlbinlog tool is
one place where we are still using localtime().
Once the process (mysqlbinlog) is started if timezone is changed
it will be not picked up the the process and it will continue
with the same values as the beginning of the process. This
behavior is in-lined with mysql server.
Also adding localtime_r() and gmtime_r() support for windows.
Added MAX_STATEMENT_TIME user variable to automaticly kill queries after a given time limit has expired.
- Added timer functions based on pthread_cond_timedwait
- Added kill_handlerton() to signal storage engines about kill/timeout
- Added support for GRANT ... MAX_STATEMENT_TIME=#
- Copy max_statement_time to current user, if stored in mysql.user
- Added status variable max_statement_time_exceeded
- Added KILL_TIMEOUT
- Removed digest hash from performance schema tests as they change all the time.
- Updated test results that changed because of the new user variables or new fields in mysql.user
This functionallity is inspired by work done by Davi Arnaut at twitter.
Test case is copied from Davi's work.
Documentation can be found at
https://kb.askmonty.org/en/how-to-limittimeout-queries/
mysql-test/r/mysqld--help.result:
Updated for new help message
mysql-test/suite/perfschema/r/all_instances.result:
Added new mutex
mysql-test/suite/sys_vars/r/max_statement_time_basic.result:
Added testing of max_statement_time
mysql-test/suite/sys_vars/t/max_statement_time_basic.test:
Added testing of max_statement_time
mysql-test/t/max_statement_time.test:
Added testing of max_statement_time
mysys/CMakeLists.txt:
Added thr_timer
mysys/my_init.c:
mysys/mysys_priv.h:
Added new mutex and condition variables
Added new mutex and condition variables
mysys/thr_timer.c:
Added timer functions based on pthread_cond_timedwait()
This can be compiled with HAVE_TIMER_CREATE to benchmark agains timer_create()/timer_settime()
sql/lex.h:
Added MAX_STATEMENT_TIME
sql/log_event.cc:
Safety fix (timeout should be threated as an interrupted query)
sql/mysqld.cc:
Added support for timers
Added status variable max_statement_time_exceeded
sql/share/errmsg-utf8.txt:
Added ER_QUERY_TIMEOUT
sql/signal_handler.cc:
Added support for KILL_TIMEOUT
sql/sql_acl.cc:
Added support for GRANT ... MAX_STATEMENT_TIME=#
Copy max_statement_time to current user
sql/sql_class.cc:
Added timer functionality to THD.
Added thd_kill_timeout()
sql/sql_class.h:
Added timer functionality to THD.
Added KILL_TIMEOUT
Added max_statement_time variable in similar manner as long_query_time was done.
sql/sql_connect.cc:
Added handling of max_statement_time_exceeded
sql/sql_parse.cc:
Added starting and stopping timers for queries.
sql/sql_show.cc:
Added max_statement_time_exceeded for user/connects status in MariaDB 10.0
sql/sql_yacc.yy:
Added support for GRANT ... MAX_STATEMENT_TIME=# syntax, to be enabled in 10.0
sql/structs.h:
Added max_statement_time user resource
sql/sys_vars.cc:
Added max_statement_time variables
mysql-test/suite/roles/create_and_drop_role_invalid_user_table.test
Removed test as we require all fields in mysql.user table.
scripts/mysql_system_tables.sql
scripts/mysql_system_tables_data.sql
scripts/mysql_system_tables_fix.sql
Updated mysql.user with new max_statement_time field
pass --defaults-file and --defaults-extra-file
(whatever was specified, or none)
from mysqld down to SST scripts.
parse these options in SST scripts and pass them down
to mysqldump, my_print_defaults, and xtrabackup
Merged lp:maria/maria-10.0-galera up to revision 3879.
Added a new functions to handler API to forcefully abort_transaction,
producing fake_trx_id, get_checkpoint and set_checkpoint for XA. These
were added for future possiblity to add more storage engines that
could use galera replication.
~40% bugfixed(*) applied
~40$ bugfixed reverted (incorrect or we're not buggy)
~20% bugfixed applied, despite us being not buggy
(*) only changes in the server code, e.g. not cmakefiles
when looking for my.cnf files: if DEFAULT_SYSCONFDIR (or INSTALL_SYSCONFDIR)
is specified (for rpms it always is), use that instead of hardcoded /etc path.