Description:
THREAD_CONCURRENCY is deprecated and there is no
deprecation warning message while setting this variable
while starting the server.
Analysis:
This variable is specific to Solaris 8 and earlier systems
and is ignored on all other platforms. But since many
customers, who uses other than Solaris, still has this
variable in their configuration file, it is important to
have a deprecation warning.
Fix:
THREAD_CONCURRENCY deprecation warning message is added.
SHOW PROCESSLIST, SHOW BINLOGS
Problem: A deadlock was occurring when 4 threads were
involved in acquiring locks in the following way
Thread 1: Dump thread ( Slave is reconnecting, so on
Master, a new dump thread is trying kill
zombie dump threads. It acquired thread's
LOCK_thd_data and it is about to acquire
mysys_var->current_mutex ( which LOCK_log)
Thread 2: Application thread is executing show binlogs and
acquired LOCK_log and it is about to acquire
LOCK_index.
Thread 3: Application thread is executing Purge binary logs
and acquired LOCK_index and it is about to
acquire LOCK_thread_count.
Thread 4: Application thread is executing show processlist
and acquired LOCK_thread_count and it is
about to acquire zombie dump thread's
LOCK_thd_data.
Deadlock Cycle:
Thread 1 -> Thread 2 -> Thread 3-> Thread 4 ->Thread 1
The same above deadlock was observed even when thread 4 is
executing 'SELECT * FROM information_schema.processlist' command and
acquired LOCK_thread_count and it is about to acquire zombie
dump thread's LOCK_thd_data.
Analysis:
There are four locks involved in the deadlock. LOCK_log,
LOCK_thread_count, LOCK_index and LOCK_thd_data.
LOCK_log, LOCK_thread_count, LOCK_index are global mutexes
where as LOCK_thd_data is local to a thread.
We can divide these four locks in two groups.
Group 1 consists of LOCK_log and LOCK_index and the order
should be LOCK_log followed by LOCK_index.
Group 2 consists of other two mutexes
LOCK_thread_count, LOCK_thd_data and the order should
be LOCK_thread_count followed by LOCK_thd_data.
Unfortunately, there is no specific predefined lock order defined
to follow in the MySQL system when it comes to locks across these
two groups. In the above problematic example,
there is no problem in the way we are acquiring the locks
if you see each thread individually.
But If you combine all 4 threads, they end up in a deadlock.
Fix:
Since everything seems to be fine in the way threads are taking locks,
In this patch We are changing the duration of the locks in Thread 4
to break the deadlock. i.e., before the patch, Thread 4
('show processlist' command) mysqld_list_processes()
function acquires LOCK_thread_count for the complete duration
of the function and it also acquires/releases
each thread's LOCK_thd_data.
LOCK_thread_count is used to protect addition and
deletion of threads in global threads list. While show
process list is looping through all the existing threads,
it will be a problem if a thread is exited but there is no problem
if a new thread is added to the system. Hence a new mutex is
introduced "LOCK_thd_remove" which will protect deletion
of a thread from global threads list. All threads which are
getting exited should acquire LOCK_thd_remove
followed by LOCK_thread_count. (It should take LOCK_thread_count
also because other places of the code still thinks that exit thread
is protected with LOCK_thread_count. In this fix, we are changing
only 'show process list' query logic )
(Eg: unlink_thd logic will be protected with
LOCK_thd_remove).
Logic of mysqld_list_processes(or file_schema_processlist)
will now be protected with 'LOCK_thd_remove' instead of
'LOCK_thread_count'.
Now the new locking order after this patch is:
LOCK_thd_remove -> LOCK_thd_data -> LOCK_log ->
LOCK_index -> LOCK_thread_count
This is a backport of the patch of bug#11765785. Commit message
by Prabakaran Thirumalai from bug#11765785 is reproduced below:
Description:
------------
Global Query ID (global_query_id ) is not incremented for PING and
statistics command. These two query types are filtered before
incrementing the global query id. This causes race condition and
results in duplicate query id for different queries originating from
different connections.
Analysis:
---------
sqlparse.cc::dispath_command() is the only place in code which sets
thd->query_ id to global_query_id and then increments it based on the
query type. In all other places it is incremented first and then
assigned to thd->query_id.
This is done such that global_query_id is not incremented for PING
and statistics commands in dispatch_command() function.
Fix:
----
As per suggestion from Serg, "There is no reason to skip query_id for
the PING and STATISTICS command.", removing the check which filters
PING and statistics commands.
Instead of using get_query_id() and next_query_id() which can still
cause race condition if context switch happens soon after executing
get_query_id(), changing the code to use next_query_id() instead of
get_query_id() as it is done in other parts of code which deals with
global_query_id.
Removed get_query_id() function and forced next_query_id() caller
to use the return value by specifying warn_unused_result attribute.
PROBLEM:
When large number of connections are continuously made
with wait_timeout of 600 seconds for some hours, some
connections remain after wait_timeout expired and also
new connections get struck under the configuration and
the scenario reported in bug#16196591.
FIX:
The cause of this bug is the issue identified and fixed in
the BUG#16088658 in 5.6.Also LOCK_thread_count contention
issue fixed in BUG#15921866 in 5.6 need to be in 5.5 as
well. Since the issue is not reproducible, it has been
verified at customer configuration the issue could not
be reproduced after a 48-hour test with a non-debug build
which includes the above two fixes backported.
IF LOCALHOST IS BOTH IPV4/IPV6 ENABLED.
The original patch removed default value of the bind-address option.
So, the default value became NULL. By coincedence NULL resolves
to 0.0.0.0 and ::, and since the server chooses first IPv4-address,
0.0.0.0 is choosen. So, there was no change in the behaviour.
This patch restores default value of the bind-address option to "0.0.0.0".
MEMORY LEAK.
Background:
- There are caches for stored functions and stored procedures (SP-cache);
- There is no similar cache for events;
- Triggers are cached together with TABLE objects;
- Those SP-caches are per-session (i.e. specific to each session);
- A stored routine is represented by a sp_head-instance internally;
- SP-cache basically contains sp_head-objects of stored routines, which
have been executed in a session;
- sp_head-object is added into the SP-cache before the corresponding
stored routine is executed;
- SP-cache is flushed in the end of the session.
The problem was that SP-cache might grow without any limit. Although this
was not a pure memory leak (the SP-cache is flushed when session is closed),
this is still a problem, because the user might take much memory by
executing many stored routines.
The patch fixes this problem in the least-intrusive way. A soft limit
(similar to the size of table definition cache) is introduced. To represent
such limit the new runtime configuration parameter 'stored_program_cache'
is introduced. The value of this parameter is stored in the new global
variable stored_program_cache_size that used to control the size of SP-cache
to overflow.
The parameter 'stored_program_cache' limits number of cached routines for
each thread. It has the following min/default/max values given from support:
min = 256, default = 256, max = 512 * 1024.
Also it should be noted that this parameter limits the size of
each cache (for stored procedures and for stored functions) separately.
The SP-cache size is checked after top-level statement is parsed.
If SP-cache size exceeds the limit specified by parameter
'stored_program_cache' then SP-cache is flushed and memory allocated for
cache objects is freed. Such approach allows to flush cache safely
when there are dependencies among stored routines.
Problem : The basic problem is the way the thread sleeps in mysql-5.5 and also in mysql-5.1
when we execute a stop slave on windows platform.
On windows platform if the stop slave is executed after the master dies, we have
this long wait before the stop slave return a value. This is because there is a
sleep of the thread. The sleep is uninterruptable in the two above version,
which was fixed by Davi patch for the BUG#11765860 for mysql-trunk. Backporting
his patch for mysql-5.5 fixes the problem.
Solution : A new pair of mutex and condition variable is introduced to synchronize thread
sleep and finalization. A new mutex is required because the slave threads are
terminated while holding the slave thread locks (run_lock), which can not be
relinquished during termination as this would affect the lock order.
BIN LOG HAS BEEN MOVED
When moving the binary/relay log files from one location to
another and restarting the server with a different log-bin or
relay-log paths, would cause the startup process to abort. The
root cause was that the server would not be able to find the log
files because it would consider old paths for entries in the
index file instead of the new location. What's even worse, the
relative paths would not be considered relative to the path
provided in log-bin and relay-log, but to mysql_data_dir.
We fix the cases where the server contains relative paths. When
the server is reading from the index file, it checks whether the
entry contains relative paths. If it does, we replace it with the
absolute path set in log-bin/relay-log option. Absolute paths
remain unchanged and the index must be manually edited to
consider the new log-bin and/or relay-log path (this should be
documented). This is a fix for a GA version, that does not break
behavior (that much).
For development versions, we should go with Zhenxing's approach
that removes paths altogether from index files.
@ mysql-test/r/ctype_latin1.result
@ mysql-test/r/ctype_utf8.result
@ mysql-test/t/ctype_latin1.test
@ mysql-test/t/ctype_utf8.test
Adding tests
@ sql/mysqld.h
@ sql/item.cc
@ sql/sql_parse.cc
@ sql/sql_view.cc
Refactoring (thanks to Guilhem for the idea):
Item_string::print() was hard to understand because of the different
QT_ constants: in "query_type==QT_x", QT_x is explicitely included
but the other two QT_ are implicitely excluded. The combinations
with '||' and '&&' make this even harder.
- logic is now more "explicit" by changing QT_ constants to a bitmap of flags:
QT_ORDINARY: no change,
QT_IS -> QT_TO_SYSTEM_CHARSET | QT_WITHOUT_INTRODUCERS,
QT_EXPLAIN -> QT_TO_SYSTEM_CHARSET
(QT_EXPLAIN was introduced in the first version of the Bug#57341 patch)
- Item_string::print() is rewritten using those flags
Bugfix itself:
When QT_TO_SYSTEM_CHARSET is used alone (with no QT_WITHOUT_INTRODUCERS),
we print string literals as follows:
- display introducers if they were in the original query
- print ASCII characters as is
- print non-ASCII characters using hex-escape
Note: as "EXPLAIN" output is only for human readability purposes
and does not need to be a pasrable SQL, so using hex-escape is Ok.
ErrConvString class perfectly suites for hex escaping purposes.
Before this fix, all the performance schema instrumentation for both the binary log
and the relay log would use the following instruments:
- wait/io/file/sql/binlog
- wait/io/file/sql/binlog_index
- wait/synch/mutex/sql/MYSQL_BIN_LOG::LOCK_index
- wait/synch/cond/sql/MYSQL_BIN_LOG::update_cond
This instrumentation is too general and can be more specific.
With this fix, the binlog instrumentation is identical,
and the relay log instrumentation is changed to:
- wait/io/file/sql/relaylog
- wait/io/file/sql/relaylog_index
- wait/synch/mutex/sql/MYSQL_RELAY_LOG::LOCK_index
- wait/synch/cond/sql/MYSQL_RELAY_LOG::update_cond
With this change, the performance instrumentation for the binary log and the relay log,
which share the same structure but have different uses, is more detailed.
This is especially important for hosts in the middle of a replication chain,
that are both masters (binlog) and slaves (relaylog).
The problem is a race between a session closing its vio
(i.e. after a COM_QUIT) at the same time it is being killed by
another thread. This could trigger a assertion in vio_close()
as the two threads could end up closing the same vio, at the
same time. This could happen due to the implementation of
SIGNAL_WITH_VIO_CLOSE, which closes the vio of the thread
being killed.
The solution is to serialize the close of the Vio under
LOCK_thd_data, which protects THD data.
No regression test is added as this is essentially a debug
issue and the test case would be quite convoluted as we would
need to synchronize a session that is being killed -- which
is a bit difficult since debug sync points code does not
synchronize killed sessions.
Before this fix, file io for the binary log file was not accounted properly,
and showed no io at all.
This bug was due to the following issues:
1) file io for the binlog was instrumented:
- sometime as "wait/io/file/sql/binlog"
- sometime as "wait/io/file/sql/MYSQL_LOG"
leading to inconsistent event_names.
2) the binlog file itself was using an IO_CACHE,
but the IO_CACHE implementation in mysys/mf_iocache.c was
not instrumented to make performance schema calls to record file io.
3) The "wait/io/file/sql/MYSQL_LOG" instrumentation was used
for several log files, such as:
- the binary log
- the slow log
- the query log
which caused file io in these different log files to be accounted
against the same instrument.
The instrumentation needs to have a finer grain and report io
in different event_names, because each file really serves a different purpose.
With this fix:
- the IO_CACHE implementation is now instrumented
- the "wait/io/file/sql/MYSQL_LOG" instrument has been removed
- binlog io is now always instrumented with "wait/io/file/sql/binlog"
- the slow log is instrumented with a new name, "wait/io/file/sql/slow_log"
- the query log is instrumented with a new name, "wait/io/file/sql/query_log"
bug #57006 "Deadlock between HANDLER and FLUSH TABLES WITH READ
LOCK" and bug #54673 "It takes too long to get readlock for
'FLUSH TABLES WITH READ LOCK'".
The first bug manifested itself as a deadlock which occurred
when a connection, which had some table open through HANDLER
statement, tried to update some data through DML statement
while another connection tried to execute FLUSH TABLES WITH
READ LOCK concurrently.
What happened was that FTWRL in the second connection managed
to perform first step of GRL acquisition and thus blocked all
upcoming DML. After that it started to wait for table open
through HANDLER statement to be flushed. When the first connection
tried to execute DML it has started to wait for GRL/the second
connection creating deadlock.
The second bug manifested itself as starvation of FLUSH TABLES
WITH READ LOCK statements in cases when there was a constant
stream of concurrent DML statements (in two or more
connections).
This has happened because requests for protection against GRL
which were acquired by DML statements were ignoring presence of
pending GRL and thus the latter was starved.
This patch solves both these problems by re-implementing GRL
using metadata locks.
Similar to the old implementation acquisition of GRL in new
implementation is two-step. During the first step we block
all concurrent DML and DDL statements by acquiring global S
metadata lock (each DML and DDL statement acquires global IX
lock for its duration). During the second step we block commits
by acquiring global S lock in COMMIT namespace (commit code
acquires global IX lock in this namespace).
Note that unlike in old implementation acquisition of
protection against GRL in DML and DDL is semi-automatic.
We assume that any statement which should be blocked by GRL
will either open and acquires write-lock on tables or acquires
metadata locks on objects it is going to modify. For any such
statement global IX metadata lock is automatically acquired
for its duration.
The first problem is solved because waits for GRL become
visible to deadlock detector in metadata locking subsystem
and thus deadlocks like one in the first bug become impossible.
The second problem is solved because global S locks which
are used for GRL implementation are given preference over
IX locks which are acquired by concurrent DML (and we can
switch to fair scheduling in future if needed).
Important change:
FTWRL/GRL no longer blocks DML and DDL on temporary tables.
Before this patch behavior was not consistent in this respect:
in some cases DML/DDL statements on temporary tables were
blocked while in others they were not. Since the main use cases
for FTWRL are various forms of backups and temporary tables are
not preserved during backups we have opted for consistently
allowing DML/DDL on temporary tables during FTWRL/GRL.
Important change:
This patch changes thread state names which are used when
DML/DDL of FTWRL is waiting for global read lock. It is now
either "Waiting for global read lock" or "Waiting for commit
lock" depending on the stage on which FTWRL is.
Incompatible change:
To solve deadlock in events code which was exposed by this
patch we have to replace LOCK_event_metadata mutex with
metadata locks on events. As result we have to prohibit
DDL on events under LOCK TABLES.
This patch also adds extensive test coverage for interaction
of DML/DDL and FTWRL.
Performance of new and old global read lock implementations
in sysbench tests were compared. There were no significant
difference between new and old implementations.
After the WL#2687, the binlog_cache_size and max_binlog_cache_size affect both the
stmt-cache and the trx-cache. This means that the resource used is twice the amount
expected/defined by the user.
The binlog_cache_use is incremented when the stmt-cache or the trx-cache is used
and binlog_cache_disk_use is incremented when the disk space from the stmt-cache or the
trx-cache is used. This behavior does not allow to distinguish which cache may be harming
performance due to the extra disk accesses and needs to have its in-memory cache
increased.
To fix the problem, we introduced two new options and status variables related to the
stmt-cache:
Options:
. binlog_stmt_cache_size
. max_binlog_stmt_cache_size
Status Variables:
. binlog_stmt_cache_use
. binlog_stmt_cache_disk_use
So there are
. binlog_cache_size that defines the size of the transactional cache for
updates to transactional engines for the binary log.
. binlog_stmt_cache_size that defines the size of the statement cache for
updates to non-transactional engines for the binary log.
. max_binlog_cache_size that sets the total size of the transactional
cache.
. max_binlog_stmt_cache_size that sets the total size of the statement
cache.
. binlog_cache_use that identifies the number of transactions that used the
temporary transactional binary log cache.
. binlog_cache_disk_use that identifies the number of transactions that used
the temporary transactional binary log cache but that exceeded the value of
binlog_cache_size.
. binlog_stmt_cache_use that identifies the number of statements that used the
temporary non-transactional binary log cache.
. binlog_stmt_cache_disk_use that identifies the number of statements that used
the temporary non-transactional binary log cache but that exceeded the value of
binlog_stmt_cache_size.
TABLES <list> WITH READ LOCK are incompatible".
The problem was that FLUSH TABLES <list> WITH READ LOCK
which was issued when other connection has acquired global
read lock using FLUSH TABLES WITH READ LOCK was blocked
and has to wait until global read lock is released.
This issue stemmed from the fact that FLUSH TABLES <list>
WITH READ LOCK implementation has acquired X metadata locks
on tables to be flushed. Since these locks required acquiring
of global IX lock this statement was incompatible with global
read lock.
This patch addresses problem by using SNW metadata type of
lock for tables to be flushed by FLUSH TABLES <list> WITH
READ LOCK. It is OK to acquire them without global IX lock
as long as we won't try to upgrade those locks. Since SNW
locks allow concurrent statements using same table FLUSH
TABLE <list> WITH READ LOCK now has to wait until old
versions of tables to be flushed go away after acquiring
metadata locks. Since such waiting can lead to deadlock
MDL deadlock detector was extended to take into account
waits for flush and resolve such deadlocks.
As a bonus code in open_tables() which was responsible for
waiting old versions of tables to go away was refactored.
Now when we encounter old version of table in open_table()
we don't back-off and wait for all old version to go away,
but instead wait for this particular table to be flushed.
Such approach supported by deadlock detection should reduce
number of scenarios in which FLUSH TABLES aborts concurrent
multi-statement transactions.
Note that active FLUSH TABLES <list> WITH READ LOCK still
blocks concurrent FLUSH TABLES WITH READ LOCK statement
as the former keeps tables open and thus prevents the
latter statement from doing flush.
DATABASE with open HANDLER"
Remove LOCK_create_db, database name locks, and use metadata locks instead.
This exposes CREATE/DROP/ALTER DATABASE statements to the graph-based
deadlock detector in MDL, and paves the way for a safe, deadlock-free
implementation of RENAME DATABASE.
Database DDL statements will now take exclusive metadata locks on
the database name, while table/view/routine DDL statements take
intention exclusive locks on the database name. This prevents race
conditions between database DDL and table/view/routine DDL.
(e.g. DROP DATABASE with concurrent CREATE/ALTER/DROP TABLE)
By adding database name locks, this patch implements
WL#4450 "DDL locking: CREATE/DROP DATABASE must use database locks" and
WL#4985 "DDL locking: namespace/hierarchical locks".
The patch also changes code to use init_one_table() where appropriate.
The new lock_table_names() function requires TABLE_LIST::db_length to
be set correctly, and this is taken care of by init_one_table().
This patch also adds a simple template to help work with
the mysys HASH data structure.
Most of the patch was written by Konstantin Osipov.
In order to allow thread schedulers to be dynamically loaded,
it is necessary to make the following changes to the server:
- Two new service interfaces
- Modifications to InnoDB to inform the thread scheduler of state changes.
- Changes to the VIO subsystem for checking if data is available on a socket.
- Elimination of remains of the old thread pool implementation.
The two new service interfaces introduces are:
my_thread_scheduler
A service interface to register a thread
scheduler.
thd_wait
A service interface to inform thread scheduler
that the thread is about to start waiting.
In addition, the patch adds code that:
- Add a call to thd_wait for table locks in mysys
thd_lock.c by introducing a set function that
can be used to set a callback to be used when
waiting on a lock and resuming from waiting.
- Calling the mysys set function from the server
to set the callbacks correctly.
Conflicts:
Text conflict in mysql-test/r/explain.result
Text conflict in mysql-test/t/explain.test
Text conflict in sql/net_serv.cc
Text conflict in sql/sp_head.cc
Text conflict in sql/sql_priv.h
when trying to build innodb as plugin.
The reason for the error is mismatch in mysql_temp_dir_list
declaration between mysqld.h and usage in ha_innodb.cc
Add missing MYSQL_PLUGIN_IMPORT to mysql_tmpdir_list
(variables exported by the server and used by plugin need it).
This patch:
- Moves all definitions from the mysql_priv.h file into
header files for the component where the variable is
defined
- Creates header files if the component lacks one
- Eliminates all include directives from mysql_priv.h
- Eliminates all circular include cycles
- Rename time.cc to sql_time.cc
- Rename mysql_priv.h to sql_priv.h