Description: On an example MySQL instance with 28k empty
InnoDB tables, a specific query to information_schema.tables
and information_schema.columns leads to memory consumption
of over 38GB RSS.
Analysis: In the get_all_tables() call, we fill the I_S tables
from frm files and the storage engine. As part of that process
we call make_table_name_list() and allocate memory for all
28k frm file names in the THD mem_root through
make_lex_string_root(). Since this is done around
28k * 28k times, a huge amount of memory is hogged in the
THD mem_root. This causes the RSS to grow to 38GB.
Fix: As part of the fix we create a temporary mem_root
in get_all_tables() and pass it to fill_fields(). There we
replace the THD mem_root with the temporary mem_root,
allocate the file names in the temporary mem_root, and free
it once we have filled the I_S tables in get_all_tables(),
re-assigning the original mem_root back to the THD.
Note: Checked the massif output with the fix; the memory growth is now
only around 580MB at peak.
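A minimal self-contained C++17 sketch of the pattern (illustrative
only, not the server code; the pmr arena stands in for MEM_ROOT and
all names here are invented):

    #include <memory_resource>
    #include <string>
    #include <vector>

    // Fill one I_S table per iteration; the per-iteration file names
    // live in a short-lived arena (the "temporary mem_root") instead of
    // the long-lived per-connection arena (the "THD mem_root").
    void fill_is_tables(std::pmr::monotonic_buffer_resource &thd_mem_root)
    {
      for (int db= 0; db < 28000; db++)
      {
        std::pmr::monotonic_buffer_resource tmp_root;  // temporary mem_root
        std::pmr::vector<std::pmr::string> table_names(&tmp_root);
        table_names.emplace_back("t1.frm");  // make_table_name_list() analogue
        // ... fill the I_S rows from table_names ...
      }  // tmp_root freed here: the names never accumulate in thd_mem_root
      (void) thd_mem_root;
    }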
On disconnect, the THD must clean the user_var_events array before
dropping temporary tables. Otherwise, when binlogging a DROP,
it will access user_var_events, but they were allocated
in the already-freed memroot.
If we clear the error status (in THD::clear_error()),
make sure to also clear a thd->killed state of KILL_BAD_DATA,
because that kill was caused by the very error we are clearing.
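Roughly, with stub types (the real change is inside THD::clear_error();
the enum values follow the names in this message):

    enum killed_state { NOT_KILLED, KILL_BAD_DATA };

    struct Thd_stub
    {
      killed_state killed= NOT_KILLED;
      bool is_error= false;

      void clear_error()
      {
        is_error= false;
        if (killed == KILL_BAD_DATA)
          killed= NOT_KILLED;  // this kill came from the error just cleared
      }
    };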
Remove the overly restrictive bugfix for bug#67088.
A FIFO can be used for the general/slow logs, but lseek() and fsync() on
a FIFO fail. And open() needs to be non-blocking, in case the other
end isn't reading.
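A hedged POSIX sketch of those constraints (not the actual server
patch): with O_NONBLOCK, a write-open of a reader-less FIFO fails
immediately with ENXIO instead of hanging, and seek/sync are attempted
only on regular files:

    #include <fcntl.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int open_log(const char *path)
    {
      // Non-blocking: if the FIFO has no reader yet, open() fails with
      // ENXIO instead of blocking the server.
      int fd= open(path, O_WRONLY | O_APPEND | O_CREAT | O_NONBLOCK, 0660);
      if (fd < 0)
        return -1;
      struct stat st;
      if (fstat(fd, &st) == 0 && S_ISREG(st.st_mode))
      {
        lseek(fd, 0, SEEK_END);  // lseek()/fsync() fail on a FIFO,
        fsync(fd);               // so only do them on regular files
      }
      return fd;
    }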
Three-way deadlock:
T1: SHOW GLOBAL STATUS
-> acquire LOCK_status
T2: STOP SLAVE
-> acquire LOCK_active_mi
-> terminate_slave_thread()
-> -> cond_timedwait for handle_slave_sql to stop
T3: SQL slave thread (the same applies to the IO thread)
-> handle_slave_sql(), when exiting
-> -> THD::add_status_to_global()
-> -> -> wait for LOCK_status...
T1: SHOW GLOBAL STATUS
-> for "Slave_heartbeat_period" status variable
-> -> show_heartbeat_period()
-> -> -> wait for LOCK_active_mi
cherry-pick from 5.6:
commit fc8b395898f40387b3468122bd0dae31e29a6fde
Author: Venkatesh Duggirala <venkatesh.duggirala@oracle.com>
Date: Wed Jun 12 21:41:05 2013 +0530
BUG#16904035-SHOW STATUS - EXCESSIVE LOCKING ON LOCK_ACTIVE_MI AND
ACTIVE_MI->RLI->DATA_LOCK
Problem: Excessive locking on lock_active_mi and rli->data_lock
while executing any `show status like 'X'` command.
Analysis: The SHOW_FUNCs for Slave_running, Slave_retried_transactions,
Slave_heartbeat_period, Slave_received_heartbeats and
Slave_last_heartbeat acquire lock_active_mi and rli->data_lock
to show their variable values. It is OK to show stale data for
status variables; even if they miss one update, it will
not cause any great trouble.
Fix: Remove the locks from the above-mentioned SHOW_FUNC functions.
Add a test case
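The gist in a small sketch (the actual fix simply drops the mutex
acquisitions; the atomic here is my own way of modelling a tolerated
stale read):

    #include <atomic>

    static std::atomic<unsigned long long> heartbeat_period{0};

    // slave thread (writer)
    void set_heartbeat_period(unsigned long long v)
    { heartbeat_period.store(v, std::memory_order_relaxed); }

    // SHOW GLOBAL STATUS (reader): no lock_active_mi, no rli->data_lock;
    // a value that is one update stale is acceptable here
    unsigned long long show_heartbeat_period()
    { return heartbeat_period.load(std::memory_order_relaxed); }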
Added handling of the implicit dependence of a constant view field
on a nullable table of a LEFT JOIN.
Fixed finding the real table when checking whether it was turned to NULL
(materialized views and derived tables are taken into account).
Removed an incorrect uninitialization.
There was a rare race, where a deadlock error might not be correctly
handled, causing the slave to stop with something like this in the error
log:
150423 14:04:10 [ERROR] Slave SQL: Connection was killed, Gtid 0-1-2, Internal MariaDB error code: 1927
150423 14:04:10 [Warning] Slave: Connection was killed Error_code: 1927
150423 14:04:10 [Warning] Slave: Deadlock found when trying to get lock; try restarting transaction Error_code: 1213
150423 14:04:10 [Warning] Slave: Connection was killed Error_code: 1927
150423 14:04:10 [Warning] Slave: Connection was killed Error_code: 1927
150423 14:04:10 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'master-bin.000001' position 1234
The problem was incorrect error handling. When a deadlock is detected, it
causes a KILL CONNECTION on the offending thread. This error is then later
converted to a deadlock error, and the transaction is retried.
However, the deadlock error was not cleared at the start of the retry, nor
was the lingering kill signal. So it was possible to get another deadlock
kill early during retry. If this happened with particular thread
scheduling/timing, it was possible that the new KILL CONNECTION error was
masked by the earlier deadlock error, so that the second kill was not
properly converted into a deadlock error and retry.
This patch adds code that clears the old error and killed flag before
starting the retry. It also adds code to handle a deadlock kill caught in a
couple of places where it was not handled before.
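A toy version of the retry entry point (invented types; the real code
is in the parallel replication worker):

    struct Worker
    {
      bool killed= false;      // lingering deadlock-kill signal
      bool has_error= false;   // earlier error, e.g. a converted deadlock

      bool apply_event_group()
      {
        // placeholder: apply the transaction; a real implementation
        // returns false if it was deadlock-killed again
        return !killed && !has_error;
      }
    };

    bool retry_event_group(Worker &w, int max_retries)
    {
      for (int attempt= 0; attempt <= max_retries; attempt++)
      {
        w.has_error= false;  // clear the old deadlock error ...
        w.killed= false;     // ... and the kill flag, so a new deadlock
                             // kill cannot be masked by the stale state
        if (w.apply_event_group())
          return true;
      }
      return false;  // retries exhausted: stop the slave with an error
    }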
convert_subq_to_sj() must check the results of in_equality->fix_fields()
call. It can fail in a meaningful way when e.g. we're trying to compare
columns with incompatible collations.
This was a regression from the patch for MDEV-7668.
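The shape of the check, with stand-in classes (in the server,
fix_fields() returns true on error):

    struct Item
    {
      virtual bool fix_fields() { return false; }  // false = success
      virtual ~Item() {}
    };

    bool convert_subq_to_sj_sketch(Item *in_equality)
    {
      if (in_equality->fix_fields())
        return true;  // e.g. incompatible collations: propagate the error
      // ... continue building the semi-join ...
      return false;
    }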
A test was incorrect, so the slave would not properly handle re-using
temporary tables, which led to replication failure in this case.
It is possible for Item_field to have a NULL field_name. This is true if
the Item_field is created based on a field in a temporary table that has
no name. It is thus necessary to do a null check before attempting a
strcmp.
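The guard itself is one line; sketched here (field_name may be NULL for
a nameless temp-table field):

    #include <cstring>

    static bool field_name_matches(const char *field_name, const char *name)
    {
      // NULL check first: strcmp(NULL, ...) is undefined behaviour
      return field_name != nullptr && std::strcmp(field_name, name) == 0;
    }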
Add some suppressions that were missing. They cover the case where a
STOP SLAVE is executed early during IO thread startup, while it is
negotiating with the master. The master connection may be killed in the
middle of a mysql_real_query(), which is not a test failure if it is a
network error.
This also caught one real code error, fixed with this commit: The I/O thread
would fail to automatically reconnect if a network error happened while
fetching the value of @@GLOBAL.gtid_domain_id.
Make sure that in parallel replication, we execute wait_for_prior_commit()
before setting table->in_use for a temporary table. Otherwise we can end up
with two parallel replication worker threads competing with each other for
use of a temporary table.
Re-factor the use of find_temporary_table() so that errors can be
handled in the caller (as wait_for_prior_commit() can return an error
in case of a deadlock kill).
[This commit cherry-picked to be able to merge MDEV-7936, of which it
is a pre-requisite, into both 10.0 and 10.1.]
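Sketch of the resulting ordering and error handling (stub types and
names, not the actual signatures):

    struct TableStub { void *in_use= nullptr; };

    // true = error (e.g. this worker was deadlock-killed while waiting)
    bool wait_for_prior_commit_stub() { return false; }

    bool use_temp_table(TableStub *table, void *thd)
    {
      if (wait_for_prior_commit_stub())  // serialise behind prior commits
        return true;                     // caller handles the deadlock kill
      table->in_use= thd;                // only now may this worker claim it
      return false;
    }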
Parallel replication depends on locking (table locks, row locks, etc.) to
prevent two conflicting transactions from running and committing in parallel.
But temporary tables are designed to be visible only to one thread, and have
no such locking.
In the concrete issue, an intermediate master could commit a CREATE TEMPORARY
TABLE in the same group commit as an INSERT into that table. Thus, a
lower-level slave could attempt to run them in parallel and get an error.
More generally, we need protection from parallel replication trying to run
transactions in parallel that access a common temporary table.
This patch simply causes use of a temporary table from parallel replication
to wait for all previous transactions to commit, serialising the replication
at that point.
(More fine-grained locking could possibly be added later. However,
using temporary tables in statement-based replication is in any case
normally undesirable; for example, a restart of the server will lose
temporary tables and can break replication.)
Note that row-based replication is not affected, as it does not use any
temporary tables on the slave side.
This patch also cleans up the locking around protecting the list of
temporary tables in Relay_log_info. This used to take the
rli->data_lock at the end of every statement, which is very bad for
concurrency. With this patch, the lock is not taken unless temporary
tables (with statement-based binlogging) are in use on the slave.
> +  if (lex->no_write_to_binlog && lex->only_view)
> +  {
> +    my_parse_error(ER(ER_SYNTAX_ERROR));
> +    MYSQL_YYABORT;
Why? REPAIR NO_WRITE_TO_BINLOG VIEW makes perfect sense to me, why did
you want to disallow it?
The hangs occur when the group_commit_orderer object is freed before the last
mark_start_commit() call on it - this loses the wakeup to other waiting worker
threads, causing them to hang until killed manually.
The object was freed because wakeup_subsequent_commits() was called too early
in two places: for MDEV-7888 during ANALYZE TABLE, and for MDEV-7929 during
record_gtid() after processing a DDL event. The group_commit_orderer object
can be freed only once its last transaction has called wait_for_prior_commit().
Fix by implementing a suspend/resume mechanism for wakeup_subsequent_commits()
that can be used in places where a transaction is committed without this being
the commit of the actual replication event group.
Also add a protection mechanism (that asserts in debug builds) which can
prevent the too-early free and hang if other similar bugs should remain in
other parts of the code.
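A condition-variable model of the suspend/resume idea (all names
invented; the server's implementation differs in detail):

    #include <condition_variable>
    #include <mutex>

    class Wakeup
    {
      std::mutex m;
      std::condition_variable cv;
      bool suspended= false, done= false;

    public:
      void suspend() { std::lock_guard<std::mutex> g(m); suspended= true; }

      void resume()  // deliver a wakeup that arrived while suspended
      {
        bool fire;
        { std::lock_guard<std::mutex> g(m); suspended= false; fire= done; }
        if (fire) cv.notify_all();
      }

      void wakeup()  // wakeup_subsequent_commits() analogue
      {
        bool fire;
        { std::lock_guard<std::mutex> g(m); done= true; fire= !suspended; }
        if (fire) cv.notify_all();
      }

      void wait()    // waiters cannot run (and free state) too early
      {
        std::unique_lock<std::mutex> l(m);
        cv.wait(l, [this] { return done && !suspended; });
      }
    };

An intermediate commit (e.g. during ANALYZE TABLE or record_gtid())
runs between suspend() and resume(), so its wakeup cannot release the
waiters prematurely.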
This patch fixes a bug in the error handling in parallel replication, when one
worker thread gets a failure and other worker threads processing later
transactions have to rollback and abort.
The problem was with the lifetime of group_commit_orderer objects (GCOs).
A GCO is freed when we register that its last event group has committed. This
relies on register_wait_for_prior_commit() and wait_for_prior_commit() to
ensure that the fact that T2 has committed implies that any earlier T1 has
also committed, and can thus no longer execute mark_start_commit().
However, in the error case, the code was skipping the
register_wait_for_prior_commit() and wait_for_prior_commit() calls. Thus
commit ordering was not guaranteed, and a GCO could be freed too early. Then a
later mark_start_commit() would reference the deallocated GCO, which could lead
to a lost wakeup (causing slave threads to hang) or other corruption.
This patch makes the error case also respect commit order. This way, the error
case gets the GCO lifetime correct as well, and the hang no longer occurs.
When a transaction in parallel replication needs to retry (e.g. because of
deadlock kill), first wait for all prior transactions to commit before doing
the retry. This way, we avoid the retry once again conflicting with a prior
transaction and requiring yet another retry.
Without this patch, we saw "in the wild" that transactions had to be retried
more than 10 times to succeed, which exceeds the default
--slave_transaction_retries value and is in any case undesirable.
(We already do this in 10.1 in "optimistic" parallel replication mode; this
patch just makes the code use the same logic for "conservative" mode, the only
mode in 10.0.)
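In sketch form (invented stubs), the retry path becomes:

    bool wait_for_all_prior_commits() { return false; }  // stub: true = error
    bool apply_event_group_retry()    { return true;  }  // stub: the retry

    bool retry_transaction()
    {
      // Let every prior transaction commit first, so the retry cannot
      // conflict with them and need yet another retry.
      if (wait_for_all_prior_commits())
        return false;  // killed while waiting: handled elsewhere
      return apply_event_group_retry();
    }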
Backport from mysql-5.5 to mysql-5.1
Bug# 19699237: UNINITIALIZED VARIABLE IN
ITEM_FIELD::STR_RESULT LEADS TO INCORRECT
BEHAVIOR
ISSUE:
------
When the following conditions are satisfied in a query, a
server crash occurs:
a) Two rows are compared using the NULL-safe equal-to operator (<=>).
b) The rows belong to different charsets.
SOLUTION:
---------
When one charset is converted to another for comparison,
the constructor of Item_func_conv_charset is called.
It attempts to use the Item_cache if the string is a
constant. This check succeeds because the used_table_map
of the Item_cache class is never set to the correct value.
Since the item is mistakenly assumed to be a constant, the
code tries to fetch the relevant null-value related fields,
which are yet to be initialized. This results in valgrind
issues and wrong results.
The fix is to update the used_table_map of the Item_cache.
This allows Item_func_conv_charset to realise that the
argument is not a constant.
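Sketched with stand-in types: once the cache inherits its example's
table map, const_item() stops reporting true and the premature fetch of
the null-value fields never happens:

    #include <cstdint>
    typedef uint64_t table_map;

    struct ItemStub
    {
      table_map used_table_map= 0;
      table_map used_tables() const { return used_table_map; }
      bool const_item() const { return used_table_map == 0; }
    };

    struct ItemCacheStub : ItemStub
    {
      void setup(const ItemStub *example)
      {
        // the missing step: record which tables the cached value depends on
        used_table_map= example->used_tables();
      }
    };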
Problem: The UDF code doesn't handle string-type arguments
properly, due to a misplaced break. The length of an
argument is also not set properly when the argument is NULL.
Solution: Fixed the code by putting the break in the right place
and setting the argument length to zero when the
argument is NULL.
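Sketch with a minimal stand-in for UDF_ARGS (the real struct is in
mysql.h), showing the corrected break placement and the NULL-argument
length reset:

    enum Item_result_stub { STRING_RESULT, REAL_RESULT, INT_RESULT };

    struct Udf_args_stub
    {
      unsigned int arg_count;
      Item_result_stub *arg_type;
      char **args;
      unsigned long *lengths;
    };

    void handle_args(Udf_args_stub *a)
    {
      for (unsigned int i= 0; i < a->arg_count; i++)
      {
        if (a->args[i] == nullptr)
        {
          a->lengths[i]= 0;  // NULL argument: reset the stale length
          continue;
        }
        switch (a->arg_type[i])
        {
        case STRING_RESULT:
          /* ... handle the string argument ... */
          break;             // the break belongs here, not a case later
        case INT_RESULT:
        case REAL_RESULT:
          /* ... handle numeric arguments ... */
          break;
        }
      }
    }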
Backport from mysql-5.5 to mysql-5.1
Bug#19880368 : GROUP_CONCAT CRASHES AFTER DUMP_LEAF_KEY
Problem:
find_order_by_list does not update the address of order_item
correctly after resolving.
Solution:
Change the ref_by address for an order_by field, if it is a
SUM_FUNC_ITEM, to the address of the field present in
all_fields.
when using function.
Merged upstream fix to Bug#16221433 MYSQL REJECTS QUERY DUE TO BAD
RESOLUTION OF NAMES IN HAVING; VIEW UNREADABLE
authored by Guilhem Bichot <guilhem.bichot@oracle.com>.
Backport from mysql-5.5 to mysql-5.1
Bug #19612819 : FILESORT: ASSERTION FAILED: POS->FIELD != 0 || POS->ITEM != 0
Problem:
While getting the temp table field for a REF_ITEM,
make_sortorder uses the real_item. As a result the
server fails later with an assert.
Solution:
Do not use real_item to get the temp table field.
Instead, use the REF_ITEM itself, as temp table fields
are created for the REF_ITEM, not the real_item.
If a spatial key is used within an equality comparison, the comparison
generally does not produce relevant results, as identical geometries can be
stored differently. Still, we want to support the operation. In order
to allow a hash join plan, we must define a key_length for Field_geom.
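A stand-in sketch (invented classes; the concrete length computation in
the actual patch may differ), showing only that Field_geom must report
a non-zero key_length for the hash-join code to size its keys:

    struct FieldStub
    {
      virtual unsigned int key_length() const = 0;
      virtual ~FieldStub() {}
    };

    struct FieldGeomStub : FieldStub
    {
      unsigned int packed_value_length= 0;
      // Assumption for illustration: hash keys are sized from the packed
      // geometry value; equal geometries stored differently may still
      // hash apart, which matches the caveat above.
      unsigned int key_length() const override { return packed_value_length; }
    };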