There was a race between end_slave() and cleanup code at the end of
handle_slave_sql(). This could cause access to master_info_index and
global_rpl_thread_pool after they had been freed.
Fix by skipping that cleanup if server shutdown is in progress, as is done
in other parts of the code as well (the cleanup, which stops worker threads
that are not needed anymore, is redundant anyway when the server is shutting
down).
INTERVALS
ISSUE:
------
Some string functions return one or a combination of the
parameters as their result. Here the resultant string's
charset could be incorrectly set to that of the chosen
parameter.
This results in incorrect behavior when an ascii string is
expected.
SOLUTION:
---------
Since an ascii string is expected, val_str_ascii should
explicitly convert the string.
Part of the fix is a backport of Bug#22340858 for mysql-5.5
and mysql-5.6.
IS NOT FOUND
DESCRIPTION
===========
If script mysqld_multi and utility my_print_defaults are in
the same folder (not included in $PATH) and the former is
made to run, it complaints that the mysqld binary is absent
eventhough the binary exists.
ANALYSIS
========
We've a subroutine my_which() mimicking the behaviour of
POSIX "which" command. Its current behaviour is to check
for a given argument as follows:
- Step 1: Assume the argument to be a command having full
fledged absolute path. If it exists "as-is", return the
argument (which will be pathname), else proceed to Step 2.
- Step 2: Assume the argument to be a plain command with no
aboslute path. Try locating it in all of the paths
(mentioned in $PATH) one by one. If found return the
pathname. If found nowhere, return NULL.
Currently when my_which(my_print_defaults) is called, it
returns from Step 1 (since utlity exists in current
folder) and doesn't proceed to Step 2. This is wrong since
the returned value is same as the argument i.e.
'my_print_default' which defies the purpose of this
subroutine whose job is to return a pathname either in Step
1 or Step 2.
Later when the utility is executed in subroutine
defaults_for_group(), it evaluates to NULL and returns the
same. This is because the plain command 'my_print_defaults
{options} ...' would execute properly only if
my_print_defaults exists in one of the paths (in $PATH). In
such a case, in the course of the flow it looks onto the
variable $mysqld_found which comes out to be NULL and
hence ethe error.
In this case, call to my_which should fail resulting in
script being aborted and thus avoiding this mess.
FIX
===
This utility my_print_defaults should be tested only in
Step 2 since it does not have an absolute path. Thus added
a condition in Step 1 so that is gets executed iff not
called for my_print_defaults thus bypassing it to proceed
to Step 2 where the check is made for various paths (in
$PATH)
FULL
Bug#21753696: MAKE SHOW SLAVE STATUS NON BLOCKING IF IO
THREAD WAITS FOR DISK SPACE
Problem:
========
Currently SHOW SLAVE STATUS blocks if IO thread waits for
disk space. This makes automation tools verifying
server health block on taking relevant action. Finally this
will create SHOW SLAVE STATUS piles.
Analysis:
=========
SHOW SLAVE STATUS hangs on mi->data_lock if relay log write
is waiting for free disk space while holding mi->data_lock.
mi->data_lock is needed to protect the format description
event (mi->format_description_event) which is accessed by
the clients running FLUSH LOGS and slave IO thread. Note
relay log writes don't need to be protected by
mi->data_lock, LOCK_log is used to protect relay log between
IO and SQL thread (see MYSQL_BIN_LOG::append_event). The
code takes mi->data_lock to protect
mi->format_description_event during relay log rotate which
might get triggered right after relay log write.
Fix:
====
Release the data_lock just for the duration of writing into
relay log.
Made change to ensure the following lock order is maintained
to avoid deadlocks.
data_lock, LOCK_log
data_lock is held during relay log rotations to protect
the description event.
REPLICATION
Problem: In RBR mode, merge table updates are not successfully applied on a cascading
replication.
Analysis & Fix: Every type of row event is preceded by one or more table_map_log_events
that gives the information about all the tables that are involved in the row
event. Server maintains the list in RPL_TABLE_LIST and it goes through all the
tables and checks for the compatibility between master and slave. Before
checking for the compatibility, it calls 'open_tables()' which takes the list
of all tables that needs to be locked and opened. In RBR, because of the
Table_map_log_event , we already have all the tables including base tables in
the list. But the open_tables() which is generic call takes care of appending
base tables if the list contains merge tables. There is an assumption in the
current replication layer logic that these tables (TABLE_LIST type objects) are always
added in the end of the list. Replication layer maintains the count of
tables(tables_to_lock_count) that needs to be verified for compatibility check
and runs through only those many tables from the list and rest of the objects
in linked list can be skipped. But this assumption is wrong.
open_tables()->..->add_children_to_list() adds base tables to the list immediately
after seeing the merge table in the list.
For eg: If the list passed to open_tables() is t1->t2->t3 where t3 is merge
table (and t1 and t2 are base tables), it adds t1'->t2' to the list after t3.
New table list looks like t1->t2->t3->t1'->t2'. It looks like it added at the
end of the list but that is not correct. If the list passed to open_tables()
is t3->t1->t2 where t3 is merge table (and t1 and t2 are base tables), the new
prepared list will be t3->t1'->t2'->t1->t2. Where t1' and t2' are of
TABLE_LIST objects which were added by add_children_to_list() call and replication
layer should not look into them. Here tables_to_lock_count will not help as the
objects are added in between the list.
Fix: After investigating add_children_list() logic (which is called from open_tables()),
there is no flag/logic in it to skip adding the children to the list even if the
children are already included in the table list. Hence to fix the issue, a
logic should be added in the replication layer to skip children in the list by
checking whether 'parent_l' is non-null or not. If it is children, we will skip 'compatibility'
check for that table.
Also this patch is not removing 'tables_to_lock_count' logic for the performance issues
if there are any children at the end of the list, those can be easily skipped directly by
stopping the loop with tables_to_lock_count check.
FOUND
Description:- Failure during the validation of CA
certificate path which is provided as an option for 'ssl-ca'
returns two different errors for YaSSL and OPENSSL.
Analysis:- 'ssl-ca', option used for specifying the ssl ca
certificate path. Failing to validate this certificate with
OPENSSL returns an error, "ERROR 2026 (HY000): SSL
connection error: SSL_CTX_set_default_verify_paths failed".
While YASSL returns "ERROR 2026 (HY000): SSL connection
error: ASN: bad other signature confirmation". Error
returned by the OPENSSL is correct since
"SSL_CTX_load_verify_locations()" returns 0 (in case of
OPENSSL) for the failure and sets error as
"SSL_INITERR_BAD_PATHS". In case of YASSL,
"SSL_CTX_load_verify_locations()" returns an error number
which is less than or equal to 0 in case of error. Error
numbers for YASSL is mentioned in the file,
'extra/yassl/include/openssl/ssl.h'(line no : 292). Also
'ssl-ca' does not accept tilde home directory path
substitution.
Fix:- The condition which checks for the error in the
"SSL_CTX_load_verify_locations()" is changed in order to
accommodate YASSL as well. A logic is written in
"mysql_ssl_set()" in order accept the tilde home directory
path substitution for all ssl options.
The main.merge test case was failing when tested using row based
binlog format.
While analyzing the issue it was found the following issues:
a) The server is calling binlog related code even when a statement will
not be binlogged;
b) The child table list was not present into table structure by the time
to generate the create table statement;
c) The tables in the child table list will not be opened yet when
generating table create info using row based replication;
d) CREATE TABLE LIKE TEMP_TABLE does not preserve original table storage
engine when using row based replication;
This patch addressed all above issues.
@ sql/sql_class.h
Added a function to determine if the binary log is disabled to
the current session. This is related with issue (a) above.
@ sql/sql_table.cc
Added code to skip binary logging related code if the statement
will not be binlogged. This is related with issue (a) above.
Added code to add the children to the query list of the table that
will have its CREATE TABLE generated. This is related with issue (b)
above.
Added code to force the storage engine to be generated into the
CREATE TABLE. This is related with issue (d) above.
@ storage/myisammrg/ha_myisammrg.cc
Added a test to skip a table getting info about a child table if the
child table is not opened. This is related to issue (c) above.
The test created a file in location relative to the datadir
(a few levels above datadir).
The file was created by MariaDB server (via INTO OUTFILE), and
later removed by mysqltest (via remove_file). The problem is that
when the vardir is a symlink, MariaDB server and mysqltest can
resolve such paths differently. MariaDB server would return back
to where the symlink is located, while mysqltest would go above
the real directory. For example, if the test is run with --mem,
and /bld/5.5/mysql-test/var points at /dev/shm/var_auto_X, then
SELECT INTO OUTFILE created a file in /bld/5.5/mysql-test , but
remove_file would look for it in /dev/shm/.
The test is re-written so that all paths are resolved in perl,
the logic itself hasn't changed.
If any given variable the xtrabackup-v2 sst script looks for is specified
multiple times in cnf file then it tend to pick both of them causing
some of the follow-up command to fail.
Avoid this programatic mistake by honoring only the last variable assigned
setting as done by mysqld too.
Check https://bugs.launchpad.net/percona-xtradb-cluster/+bug/1362830
Semantics:
---------
* Generally end-user will create a separate user with needed
privileges for
performing DONOR action.
* This user credentials are specified using wsrep_sst_auth.
* Along with this user there could be other user(s) created on the
server
that sysadmin may use for normal or other operations
* Credentials for these user(s) can be specified in same
cluster/server
cnf file as part of [client] section
When cluster act as DONOR and if wsrep_sst_auth is provided then it
should
strictly use it for performing SST based action.
What if end-user has same credentials for performing both SST action
and
normal admin work ?
* Then end-user can simply specify these credentials as part of
[client]
section in cnf file and skip providing wsrep_sst_auth.
Issue:
-----
MySQL client user/password parsing preference order is as follows:
* command line (through --user/--password)
* cnf file
* MYSQL_PWD enviornment variable.
Recent change tried passing sst user password through MYSQL_PWD
(and user though --user command line param as before).
On the system where-in admin had another user for performing non-SST
actions,
credentials for such user were present in cnf file under [client]
section.
Due to mysql client preference order, SST user name was used (as it
was
passed through command line) but password of other user (meant for
non-SST)
action was being used as it was passed through cnf file.
Password passed through MYSQL_PWD was completely ignored causing
user-name/password mismatch.
Solution:
---------
* If user has specified credentials for SST then pass them through
command
line so that they are used in priority.
(There could be security concern on passing things through command
line but
when I tried passing user-name and password through command line to
mysql
client and then did ps I saw this
./bin/mysql --user=sstuser --password=x xxxxxxxx -S /tmp/n1.sock
so seems like password is not shown)
- avoiding the race condition, by not grabbing thd->LOCK_wsrep_thd for
accessing thd->wsrep_exec_mode. The caller is same thread and exec mode
can only be changed by self.
Fix remaining issues with wsrep_sync_wait and query cache.
- Fixes misplaced call to invalidate query cache in
Rows_log_event::do_apply_event().
Query cache was invalidated too early, and allowed old
entries to be inserted to the cache.
- Reset thd->wsrep_sync_wait_gtid on query cache hit.
THD->cleanup_after_query is not called in such cases,
and thd->wsrep_sync_wait_gtid remained initialized.
* Total order isolation was started twice for FLUSH TABLES, from
reload_acl_and_cache() and from mysql_execute_command(). Removed
the reload_acl_and_cache() part.
* Removed PXC specific stuff from MTR tests
- Eliminates code duplication in query cache patch
- Reduces the number of iterations in mysql-wsrep#201.test
to shorten the execution time
- Adds a new test case that exercises more scenarios