The actual Bug#11754376 does not exist in MySQL 5.5 because at startup
we drop entries for temporary tables from InnoDB dictionary cache (only
if ROW_FORMAT is not REDUNDANT). But nevertheless the bug in
normalize_table_name_low() is present so we fix it.
GRACEFUL SHUTDOWN
During startup mysql picks up .frm files from the tmpdir directory and
tries to drop those tables in the storage engine.
The problem is that when tmpdir ends in / then ha_innobase::delete_table()
is passed a string like "/var/tmp//#sql123", then it wrongly normalizes it
to "/#sql123" and calls row_drop_table_for_mysql() which of course fails
to delete the table entry from the InnoDB dictionary cache.
ha_innobase::delete_table() returns an error but nevertheless mysql wipes
away the .frm file and the entry in the InnoDB dictionary cache remains
orphaned with no easy way to remove it.
The "no easy" way to remove it is to create a similar temporary table again,
copy its .frm file to tmpdir under "#sql123.frm" and restart mysqld with
tmpdir=/var/tmp (no trailing slash) - this way mysql will pick the .frm file
after restart and will try to issue drop table for "/var/tmp/#sql123"
(notice do double slash), ha_innobase::delete_table() will normalize it to
"tmp/#sql123" and row_drop_table_for_mysql() will successfully remove the
table entry from the dictionary cache.
The solution is to fix normalize_table_name_low() to normalize things like
"/var/tmp//table" correctly to "tmp/table".
This patch also adds a test function which invokes
normalize_table_name_low() with various inputs to make sure it works
correctly and a mtr test that calls this test function.
Reviewed by: Marko (http://bur03.no.oracle.com/rb/r/929/)
rpl_heartbeat_basic test fails sporadically on pushbuild because did
not received all heartbeats from slave in circular replication.
MASTER_HEARTBEAT_PERIOD had the default value (slave_net_timeout/2) so
wait on "Heartbeat event received on master", that only waits for 1
minute, sometimes timeout before heartbeat arrives. Fixed setting a
smaller period value.
MEMORY LEAK.
Background:
- There are caches for stored functions and stored procedures (SP-cache);
- There is no similar cache for events;
- Triggers are cached together with TABLE objects;
- Those SP-caches are per-session (i.e. specific to each session);
- A stored routine is represented by a sp_head-instance internally;
- SP-cache basically contains sp_head-objects of stored routines, which
have been executed in a session;
- sp_head-object is added into the SP-cache before the corresponding
stored routine is executed;
- SP-cache is flushed in the end of the session.
The problem was that SP-cache might grow without any limit. Although this
was not a pure memory leak (the SP-cache is flushed when session is closed),
this is still a problem, because the user might take much memory by
executing many stored routines.
The patch fixes this problem in the least-intrusive way. A soft limit
(similar to the size of table definition cache) is introduced. To represent
such limit the new runtime configuration parameter 'stored_program_cache'
is introduced. The value of this parameter is stored in the new global
variable stored_program_cache_size that used to control the size of SP-cache
to overflow.
The parameter 'stored_program_cache' limits number of cached routines for
each thread. It has the following min/default/max values given from support:
min = 256, default = 256, max = 512 * 1024.
Also it should be noted that this parameter limits the size of
each cache (for stored procedures and for stored functions) separately.
The SP-cache size is checked after top-level statement is parsed.
If SP-cache size exceeds the limit specified by parameter
'stored_program_cache' then SP-cache is flushed and memory allocated for
cache objects is freed. Such approach allows to flush cache safely
when there are dependencies among stored routines.
sql/mysqld.cc:
Added global variable stored_program_cache_size to store value of
configuration parameter 'stored-program-cache'.
sql/mysqld.h:
Added declaration of global variable stored_program_cache_size.
sql/sp_cache.cc:
Extended interface for sp_cache by adding helper routine
sp_cache_enforce_limit to control size of stored routines cache for
overflow. Also added method enforce_limit into class sp_cache that
implements control of cache size for overflow.
sql/sp_cache.h:
Extended interface for sp_cache by adding standalone routine
sp_cache_enforce_limit to control size of stored routines cache
for overflow.
sql/sql_parse.cc:
Added flush of sp_cache after processing of next sql-statement
received from a client.
sql/sql_prepare.cc:
Added flush of sp_cache after preparation/execution of next prepared
sql-statement received from a client.
sql/sys_vars.cc:
Added support for configuration parameter stored-program-cache.
Problem : The basic problem is the way the thread sleeps in mysql-5.5 and also in mysql-5.1
when we execute a stop slave on windows platform.
On windows platform if the stop slave is executed after the master dies, we have
this long wait before the stop slave return a value. This is because there is a
sleep of the thread. The sleep is uninterruptable in the two above version,
which was fixed by Davi patch for the BUG#11765860 for mysql-trunk. Backporting
his patch for mysql-5.5 fixes the problem.
Solution : A new pair of mutex and condition variable is introduced to synchronize thread
sleep and finalization. A new mutex is required because the slave threads are
terminated while holding the slave thread locks (run_lock), which can not be
relinquished during termination as this would affect the lock order.
mysql-test/suite/rpl/r/rpl_start_stop_slave.result:
The result file associated with the test added.
mysql-test/suite/rpl/t/rpl_start_stop_slave.test:
A test to check the new functionality.
sql/rpl_mi.cc:
The constructor using the new mutex and condition variables for the master_info.
sql/rpl_mi.h:
The condition variable and mutex have been added for the master_info.
sql/rpl_rli.cc:
The constructor using the new mutex and condition variables for the realy_log_info.
sql/rpl_rli.h:
The condition variable and mutex have been added for the relay_log_info.
sql/slave.cc:
Use a timed wait on a condition variable to implement a interruptible sleep.
The wait is registered with the THD object so that the thread will be woken
up if killed.
If an error message contains '\' backslash it is displayed correctly
through show-slave-status or
query_get_value(SHOW SLAVE STATUS, Last_IO_Error, 1);. But when
SELECT REPLACE(...) is applied backslash is escaped resulting in a
different test output.
Disabled backslash escape on show_slave_status.inc and replaced '\' for
'/' using replace_regex function in order to achieve the same test
output when different path separators are used.
A follow-up patch corrects max sizes of printed strings and changes llstr() to %lld.
Credits go to Davi who provided a great feedback.
sql/share/errmsg-utf8.txt:
Max size for the whole message is 512 so a part of - like '%-.512s' should be less,
reduction to 320 is safe and with good chances won't cut off a part of a rather log
message in Last_IO_Error = 'Got fatal error 1236 ...'
sql/sql_repl.cc:
llstr() is replaced by %lld.
The server crashes when receiving a COM_BINLOG_DUMP command with a position of 0 or
larger than the file size.
The execution proceeds to an error block having the last read replication coordinates
pointer be NULL and its dereferencing crashed the server.
Fixed with making "public" previously used only for heartbeat coordinates.
mysql-test/extra/rpl_tests/rpl_start_stop_slave.test:
regression test for bug#3593869-64035 is added.
mysql-test/suite/rpl/r/rpl_cant_read_event_incident.result:
results updated (error mess format is changed).
mysql-test/suite/rpl/r/rpl_log_pos.result:
results updated (error mess format is changed).
mysql-test/suite/rpl/r/rpl_manual_change_index_file.result:
results updated (error mess format is changed).
mysql-test/suite/rpl/r/rpl_packet.result:
results updated (error mess format is changed).
mysql-test/suite/rpl/r/rpl_stm_start_stop_slave.result:
results updated (error mess format is changed).
mysql-test/suite/rpl/t/rpl_stm_start_stop_slave.test:
Slave is stopped by bug#3593869-64035 tests so
-let $rpl_only_running_threads= 1 is set prior to rpl_end.
sql/share/errmsg-utf8.txt:
Increasing the max length of explanatory message to 512.
sql/sql_repl.cc:
Making `coord' to carry the last read from binlog event coordinates
regardless of heartbeat.
Renaming, small cleanup and simplifying the code after if (coord) becomes unnecessary.
Adding yet another 3rd pair of coordinates - the starting replication -
into error text.
and cryptic error 1126 message
The problem was that dlopen() related code was using just a subset
of the path normalization routines used in other places.
Fixed the expansion of the pre-dlopen() behavior for plugins and UDFs
to use a platform-dependent consistent encoding of the paths.
Fixed the error dlopen() error handling to take the correct error message
and strip off the trailing newline character(s).
Fixed tests to do a platform independent replace of directories and to
account for the traling slash.
Test extra/rpl_tests/rpl_extra_col_master.test (used by
rpl_extra_col_master_*) ends with the active connection pointing to the
slave. Thus, the two last tests never succeed in changing the binlog
format of the master away from 'row'. With correct active connection
(master) tests fail for binlog 'statement' and 'mixed' formats.
Tests rpl_extra_col_master_* only run when binary log format is
row. Statement and mix replication do not make sense in this
tests since it will try to execute statements on columns that do
not exist. This fix is basically a backport from mysql-5.5, see
changes done as part of BUG 39934.
If we meet DB_TOO_MANY_CONCURRENT_TRXS during the execution tab_create_graph from row_create_table_for_mysql(), .ibd file for the table should be created already but was not deleted for the error handling.
rb:875 approved by Jimmy Yang
If we meet DB_TOO_MANY_CONCURRENT_TRXS during the execution tab_create_graph from row_create_table_for_mysql(), .ibd file for the table should be created already but was not deleted for the error handling.
rb:875 approved by Jimmy Yang
There was memory leak when running some tests on PB2.
The reason of the failure is an early return from change_master()
that was supposed to deallocate a dyn-array.
Actually the same bug58915 was fixed in trunk with relocating the dyn-array
destruction into THD::cleanup_after_query() which can't be bypassed.
The current patch backports magne.mahre@oracle.com-20110203101306-q8auashb3d7icxho
and adds two optimizations: were done: the static buffer for the dyn-array to base on,
and the array initialization is called precisely when it's necessary rather than
per each CHANGE-MASTER as before.
mysql-test/suite/rpl/t/rpl_empty_master_host.test:
the test is binlog-format insensitive so it will be run with MIXED mode only.
mysql-test/suite/rpl/t/rpl_server_id_ignore.test:
the test is binlog-format insensitive so it will be run with MIXED mode only.
sql/sql_class.cc:
relocating the dyn-array
destruction into THD::cleanup_after_query().
sql/sql_lex.cc:
LEX.mi zero initialization is done in LEX().
sql/sql_lex.h:
Optimization for repl_ignore_server_ids to base on a static buffer
which size is chosen to fit to most common use cases.
sql/sql_repl.cc:
dyn-array destruction is relocated to THD::cleanup_after_query().
sql/sql_yacc.yy:
Refining logics of Lex->mi.repl_ignore_server_ids initialization.
The array is initialized once a corresponding option in CHANGE MASTER token sequence
is found.
The counter handler_read_key (SSV::ha_read_key_count) is incremented
incorrectly.
The mysql server maintains a per thread system_status_var (SSV)
object. This object contains among other things the counter
SSV::ha_read_key_count. The purpose of this counter is to measure the
number of requests to read a row based on a key (or the number of
index lookups).
This counter was wrongly incremented in the
ha_innobase::innobase_get_index(). The fix removes
this increment statement (for both innodb and innodb_plugin).
The various callers of the innobase_get_index() was checked to
determine if anybody must increment this counter (if they first call
innobase_get_index() and then perform an index lookup). It was found
that no caller of innobase_get_index() needs to worry about the
SSV::ha_read_key_count counter.
WITH LARGE BUFFER POOL
(Note: this a backport of revno:3472 from mysql-trunk)
rb://845
approved by: Marko
When dropping a table (with an .ibd file i.e.: with
innodb_file_per_table set) we scan entire LRU to invalidate pages from
that table. This can be painful in case of large buffer pools as we hold
the buf_pool->mutex for the scan. Note that gravity of the problem does
not depend on the size of the table. Even with an empty table but a
large and filled up buffer pool we'll end up scanning a very long LRU
list.
The fix is to scan flush_list and just remove the blocks belonging to
the table from the flush_list, marking them as non-dirty. The blocks
are left in the LRU list for eventual eviction due to aging. The
flush_list is typically much smaller than the LRU list but for cases
where it is very long we have the solution of releasing the
buf_pool->mutex after scanning 1K pages.
buf_page_[set|unset]_sticky(): Use new IO-state BUF_IO_PIN to ensure
that a block stays in the flush_list and LRU list when we release
buf_pool->mutex. Previously we have been abusing BUF_IO_READ to achieve
this.
Refactored the test case: hardened and extended it. Created test inc file
to abstract the task of relocating binlogs.
Also, disabled it on windows for not cluttering the test case any further,
as it depends heavily on doing filesystem operations and path handling.
mysql-test/include/relocate_binlogs.inc:
Auxiliar include file that performs the relocation of binary logs
listed in an index file.
There was memory leak when running some tests on PB2.
The reason of the failure is an early return from change_master()
that was supposed to deallocate a dyn-array.
Fixed with relocating the dyn-array's destructor at ~LEX() that is
the end of the session, per Gleb's patch idea.
Two optimizations were done: the static buffer for the dyn-array to base on,
and the array initialization is called precisely when it's necessary rather than
per each CHANGE-MASTER as before.
mysql-test/suite/rpl/t/rpl_empty_master_host.test:
the test is binlog-format insensitive so it will be run with MIXED mode only.
sql/sql_lex.cc:
the new flag is initialized.
sql/sql_lex.h:
A new bool flag new member to LEX.mi is added to stay UP since after
LEX.mi.repl_ignore_server_ids dynarray initialization was called
for the first time on the session. So it is set once and its life time
is session.
The array is destroyed at the end of the session.
sql/sql_repl.cc:
dyn-array destruction is relocated to ~LEX.
sql/sql_yacc.yy:
Refining logics of Lex->mi.repl_ignore_server_ids initialization.
The array is initialized once a corresponding option in CHANGE MASTER token sequence
is found.
The fact of initialization is memorized into the new flag.
BIN LOG HAS BEEN MOVED
When moving the binary/relay log files from one location to
another and restarting the server with a different log-bin or
relay-log paths, would cause the startup process to abort. The
root cause was that the server would not be able to find the log
files because it would consider old paths for entries in the
index file instead of the new location. What's even worse, the
relative paths would not be considered relative to the path
provided in log-bin and relay-log, but to mysql_data_dir.
We fix the cases where the server contains relative paths. When
the server is reading from the index file, it checks whether the
entry contains relative paths. If it does, we replace it with the
absolute path set in log-bin/relay-log option. Absolute paths
remain unchanged and the index must be manually edited to
consider the new log-bin and/or relay-log path (this should be
documented). This is a fix for a GA version, that does not break
behavior (that much).
For development versions, we should go with Zhenxing's approach
that removes paths altogether from index files.
mysql-test/include/begin_include_file.inc:
Added parameter to keep the begin_include_file.inc silent. Useful when
including scripts that contain platform dependent parameters, for example:
--let $rpl_server_parameters=--log-bin=$tmpdir/slave-bin --relay-log=$tmpdir/slave-relay-bin
--let $keep_include_silent=1
source include/rpl_start_server.inc;
--let $keep_include_silent=0
We want the paths ($tmpdir/slave-bin and $tmpdir/slave-relay-bin) not to be in the
result file.
mysql-test/suite/rpl/t/rpl_binlog_index.test:
Test case.
sql/log.cc:
When finding the corresponding log entry in the index file, we first
normalize the paths before doing the comparison. This will make relative
paths to be turned into absolute paths (based on the opt_bin_logname or
opt_relay_logname) and then compared against also, expanded paths entered,
through CHANGE MASTER for instance.
sql/log.h:
Added normalize_binlog_name, which turns relative paths, into absolute paths
given the parameter: is_relay_log ? opt_relay_logname : opt_bin_logname .
sql/mysqld.cc:
Exposing opt_bin_logname.
sql/mysqld.h:
Exposing opt_bin_logname.
When passing an empty user to the connect function will cause
valgrind warnings. Seems that the client code is not prepared
to handle empty users. On 5.6 this can even be triggered by
START SLAVE PASSWORD='...'; i.e., without setting USER='...' on
the START SLAVE command (see WL#4143 for details on the new
additional START SLAVE commands).
To fix this, we disallow empty users when configuring the slave
connection parameters (this decision might be revisited if the
client code accepts empty users in the future).
sql/slave.cc:
We throw an error if an empty user is supplied to the connection
function.
Setting query_cache_size to larger values might fail depending on the memory
pressure being put on the system. This can be seen on pushbuild as the test
case query_cache_size_basic tries to allocate a +3GB query cache, which
succeeds in some machines and fails in others.
So this part of the code where the test case tries to allocate +3GB query cache has been
disabled for now to get the test running on pb2.
rb://816
approved by: Marko Makela
The title is misleading. This bug was actually introduced by
bug 12635227 and was unearthed by a later optimization.
We need to free buf_page_t structs that we are allocating using
malloc() at shutdown.
BY CACHING OR REDUCING CREATEEVENT CALLS".
5.5 versions of MySQL server performed worse than 5.1 versions
under single-connection workload in autocommit mode on Windows XP.
Part of this slowdown can be attributed to overhead associated
with constant creation/destruction of MDL_lock objects in the MDL
subsystem. The problem is that creation/destruction of these
objects causes creation and destruction of associated
synchronization primitives, which are expensive on Windows XP.
This patch tries to alleviate this problem by introducing a cache
of unused MDL_object_lock objects. Instead of destroying such
objects we put them into the cache and then reuse with a new
key when creation of a new object is requested.
To limit the size of this cache, a new --metadata-locks-cache-size
start-up parameter was introduced.
mysql-test/r/mysqld--help-notwin.result:
Updated test after adding --metadata-locks-cache-size
parameter.
mysql-test/r/mysqld--help-win.result:
Updated test after adding --metadata-locks-cache-size
parameter.
mysql-test/suite/sys_vars/r/metadata_locks_cache_size_basic.result:
Added test coverage for newly introduced --metadata_locks_cache_size
start-up parameter and corresponding global read-only variable.
mysql-test/suite/sys_vars/t/metadata_locks_cache_size_basic-master.opt:
Added test coverage for newly introduced --metadata_locks_cache_size
start-up parameter and corresponding global read-only variable.
mysql-test/suite/sys_vars/t/metadata_locks_cache_size_basic.test:
Added test coverage for newly introduced --metadata_locks_cache_size
start-up parameter and corresponding global read-only variable.
sql/mdl.cc:
Introduced caching of unused MDL_object_lock objects, in order to
avoid costs associated with constant creation and destruction of
such objects in single-connection workloads run in autocommit mode.
Such costs can be pretty high on systems where creation and
destruction of synchronization primitives require a system call
(e.g. Windows XP).
To implement this cache,a list of unused MDL_object_lock instances
was added to MDL_map object. Instead of being destroyed
MDL_object_lock instances are put into this list and re-used later
when creation of a new instance is required. Also added
MDL_lock::m_version counter to allow threads having outstanding
references to an MDL_object_lock instance to notice that it has
been moved to the unused objects list.
Added a global variable for a start-up parameter that limits
the size of the unused objects list.
Note that we don't cache MDL_scoped_lock objects since they
are supposed to be created only during execution of DDL
statements and therefore should not affect performance much.
sql/mdl.h:
Added a global variable for start-up parameter that limits the
size of the unused MDL_object_lock objects list and constant
for its default value.
sql/sql_plist.h:
Added I_P_List<>::pop_front() function.
sql/sys_vars.cc:
Introduced --metadata-locks-cache-size start-up parameter
for specifying size of the cache of unused MDL_object_lock
objects.
SCAN/CPU) => SLAVE FAILURE
When a statement containing a large amount of ROWs to be applied on
the slave, and the slave's table does not contain a PK, it can take a
considerable amount of time to find and change all the rows that are
to be changed.
The proper slave enhancement will be implemented in WL 5597. However,
in this bug we are making it clear to the user what the problem is, by
printing a message to the error log if the execution time, for a given
statement in RBR, takes more than LONG_FIND_ROW_THRESHOLD (set to 60
seconds). This shall help the DBA to diagnose what's happening when
facing a slave server that is quiet for no apparent reason...
The note is only printed to the error log if log_warnings is set to be
greater than 1.
sql/log_event.cc:
Core of the patch.
In Rows_log_event::do_apply_event, sets STMT start
timestamp.
In Rows_log_event::find_row, if a PK is not used, then the start
timestamp is checked to see if the time spent on this STMT is enough
to justify the printing of a note (only if it was not printed before).
sql/log_event.h:
Defining LONG_FIND_ROW_THRESHOLD.
sql/rpl_rli.cc:
Resets long_find_row state in rli->context_cleanup().
sql/rpl_rli.h:
Two new rli properties that are necessary to control when to
emit a note in the error log: 1) timestamp that states when the
ROW statement started; 2) flag indicating whether the note has
been emitted for the current statement or not. Also deployed
accessors.
The bug was accidentally fixed by fixing
Bug#11759688 52020: InnoDB can still deadlock on just INSERT...ON DUPLICATE KEY
a.k.a. the reintroduction of
Bug#7975 deadlock without any locking, simple select and update