Backporting WL#5716, "Information schema table for InnoDB
buffer pool information". Backporting revisions 2876.244.113 and
2876.244.102 from mysql-trunk.
rb://1175 approved by Jimmy Yang.
Problem description:
Giving "help 'contents'" in the mysql client as a first statement
gives error
Analysis:
In com_server_help() function the "server_cmd" variable was
initialised with buffer->ptr(). And the "server_cmd" variable is not
updated since we are passing "'contents'"(with single quote) so the
buffer->ptr() consists of the previous buffer values and it was sent
to the mysql_real_query() hence we are getting error.
Fix:
The "server_cmd" variable is no longer initialised up front; instead
it is updated with "server_cmd= cmd_buf" in both cases, i.e. whether
the contents argument is given with or without single quotes.
As part of error message improvement, added new error message in case
of "help 'contents'".
"ORDER BY" AND "LIMIT BY" CLAUSE
PROBLEM:
When a 'limit' clause is specified in a query along with
group by and order by, the optimizer chooses the wrong index,
thereby examining more rows than required.
However, without the 'limit' clause, the optimizer chooses
the right index.
ANALYSIS:
For the query in question, the range optimizer chooses the
first index, as a range is present (on 'a'). The optimizer
then checks for an index that would return records in sorted
order for the 'group by' clause.
During this check it chooses the second index (on 'c,b,a'),
based on the specified 'limit' and the selectivity of
'quick_condition_rows' (the number of rows present in the range),
in the 'test_if_skip_sort_order' function.
But it fails to consider that an order by clause on a
different column will result in scanning the entire index, so
the number of rows estimated above is wrong (which results in
choosing the second index).
FIX:
Do not enforce the 'limit' clause in the call to
'test_if_skip_sort_order' if we are creating a temporary
table. Creation of a temporary table indicates that there will be
more post-processing, and hence all the rows will be needed.
This fix is backported from 5.6, where the problem was fixed as
part of the changes for work log #5558.
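A self-contained sketch of this idea, with illustrative names
(NO_LIMIT stands in for the server's HA_POS_ERROR "no limit"
sentinel):
  #include <limits>

  typedef unsigned long long ha_rows_sketch;
  static const ha_rows_sketch NO_LIMIT =
      std::numeric_limits<ha_rows_sketch>::max();

  /* When the plan creates a temporary table, evaluate index choice as
     if there were no LIMIT, because post-processing will need all
     rows; otherwise the LIMIT may be used. */
  ha_rows_sketch limit_for_sort_test(bool need_tmp_table,
                                     ha_rows_sketch select_limit)
  {
    return need_tmp_table ? NO_LIMIT : select_limit;
  }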
RBR AND RC
Description: When scanning and locking rows with < or <=, InnoDB locks
the next row even though row-based binary logging and read committed
are used.
Solution: In the handler, when the row is identified as falling outside
of the range (as specified in the query predicates), request the
storage engine to unlock the row (if possible). This is done in
handler::read_range_first() and handler::read_range_next().
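A self-contained analogue of the pattern (a hypothetical engine
interface, not the real handler class):
  #include <functional>

  struct Row { int key; };

  struct Engine {
    std::function<bool(Row&)> next_row;  /* fetch next row; false at EOF */
    std::function<void()> unlock_row;    /* engine may release last lock */
  };

  /* Analogue of handler::read_range_next(): stop at the first row past
     end_key and ask the engine to unlock it, so no lock is kept on a
     row the query will never return. */
  bool read_range_next_sketch(Engine &se, int end_key, Row &out)
  {
    if (!se.next_row(out))
      return false;                      /* end of index */
    if (out.key > end_key) {             /* outside the range predicate */
      se.unlock_row();                   /* release the unneeded lock */
      return false;
    }
    return true;
  }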
Problem:
=======
The return value from my_b_write is ignored by `my_b_write_quoted',
`my_b_write_bit' and `Query_log_event::print_query_header'.
Most callers of `my_b_printf' ignore the return value; `log_event.cc'
has many calls to it.
Analysis:
========
`my_b_write' is used to write data into a file. If the write fails, it
sets an appropriate error number and error message through a my_error()
call and sets IO_CACHE::error to -1.
`my_b_printf' is also used to write data into a file; it internally
invokes my_b_write to do the write operation. Upon success it returns
the number of characters written to the file; on error it returns -1,
sets the error through my_error(), and also sets IO_CACHE::error to -1.
Most of the event-specific print functions, for example
`Create_file_log_event::print' and `Execute_load_log_event::print',
make several calls to the above two functions and do not check the
return value after the 'print' call. All the misuse cases mentioned
above are on the client side.
Fix:
===
As part of the bug fix, a check for IO_CACHE::error == -1 has been
added at a high level, after the call to the 'print' function. There
are a few more places where the return value of "my_b_write" was
ignored; those are shown below.
+++ mysys/mf_iocache2.c 2012-06-04 07:03:15 +0000
@@ -430,7 +430,8 @@
memset(buffz, '0', minimum_width - length2);
else
memset(buffz, ' ', minimum_width - length2);
- my_b_write(info, buffz, minimum_width - length2);
+++ sql/log.cc 2012-06-08 09:04:46 +0000
@@ -2388,7 +2388,12 @@
{
end= strxmov(buff, "# administrator command: ", NullS);
buff_len= (ulong) (end - buff);
- my_b_write(&log_file, (uchar*) buff, buff_len);
At these places appropriate return value handlers have been added.
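A sketch of the return-value discipline being applied, with stub types
standing in for IO_CACHE and my_b_write (names here are illustrative):
  #include <cstddef>

  struct CacheSketch { int error; };     /* stands in for IO_CACHE */

  /* Stand-in for my_b_write(): returns non-zero on failure and records
     the error in the cache, as the real function does. */
  static int write_sketch(CacheSketch *info, const unsigned char *, size_t)
  {
    /* ... perform the write; on failure:
       info->error= -1; return 1; */
    (void) info;
    return 0;
  }

  /* The pattern the patch applies: propagate the first failed write
     instead of silently continuing to print. */
  static int print_header_sketch(CacheSketch *cache,
                                 const unsigned char *hdr, size_t len)
  {
    if (write_sketch(cache, hdr, len))   /* non-zero: write failed */
      return 1;                          /* caller checks cache->error */
    return 0;
  }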
Fixes for BUG11761686 left a flaw that managed to slip away from testing:
only the effective filtering branch was actually tested, with a regression
test added to rpl_filter_tables_not_exist.
The failure is caused by destruction, too early, of mem-root-allocated
memory at the end of the deferred User-var's do_apply_event().
Fixed by bypassing free_root() in the deferred execution branch.
Deallocation of the items created in do_apply_event() is done by the base
code through THD::cleanup_after_query() -> free_items(), which the parent
Query event cannot miss.
HANDLE_FATAL_SIGNAL IN STRNLEN
Fixed the following bounds-checking problems:
1. In check_if_legal_filename(), make sure the null-terminated
string is long enough before accessing its bytes.
Prevents a potential read past the end of the buffer.
2. In my_wc_mb_filename() of the filename charset, check
for the end of the destination buffer before writing single-byte
characters into it.
Prevents write-past-end-of-buffer errors (and the stack garbling
in the cases reported here).
Added test cases.
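An illustrative sketch of the bounds check in (1), using a
hypothetical reserved-name matcher (not the server's actual code):
  #include <cstddef>
  #include <cctype>

  /* Only compare when at least `len` bytes remain before name_end; a
     missing length check like this is what allows a read past the end
     of a short string. */
  static bool matches_reserved(const char *name, const char *name_end,
                               const char *reserved, size_t len)
  {
    if ((size_t)(name_end - name) < len)
      return false;                      /* too short: cannot match */
    for (size_t i= 0; i < len; i++)
      if (tolower((unsigned char) name[i]) !=
          tolower((unsigned char) reserved[i]))
        return false;
    return true;
  }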
This is a follow-up patch for the bug, enabling the test
i_binlog.binlog_mysqlbinlog_file_write.test.
It was disabled in mysql trunk and mysql 5.5 because in release
builds mysqlbinlog was not debug compiled whereas mysqld was.
Since the have_debug.inc script checks only whether mysqld is debug
compiled, the test was not being skipped on release builds.
We resolve this problem by creating a new inc file,
mysqlbinlog_have_debug.inc, which checks exclusively whether
mysqlbinlog is debug compiled. If not, it skips the test.
Several fixes :
* sql-common/client.c
Added a validity check of the fields metadata packet sent
by the server.
Now libmysql will check whether the length of the data sent by
the server matches what the protocol expects before
using the data (see the sketch after this list).
* client/mysqltest.cc
Fixed the error handling code in mysqltest to avoid sending
new commands when reading the result set failed (and
there is unread data in the pipe).
* sql_common.h + libmysql/libmysql.c + sql-common/client.c
unpack_fields() now generates a proper error when it fails.
Added a new argument to this function to support the error
generation.
* sql/protocol.cc
Added a debug trigger to cause the server to send a NULL
instead of the packet expected by the client, for testing
purposes.
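A sketch of the client-side length validation mentioned for
sql-common/client.c (names are illustrative, not libmysql's):
  #include <cstddef>
  #include <cstdint>

  /* Before consuming a field of `field_len` bytes from the metadata
     packet, verify the advertised length fits in what actually arrived;
     otherwise treat the packet as malformed instead of reading past
     its end. */
  static bool field_fits(const uint8_t *pos, const uint8_t *packet_end,
                         size_t field_len)
  {
    return (size_t)(packet_end - pos) >= field_len;
  }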
Print the warning (note):
YEAR(x) is deprecated and will be removed in a future release. Please use YEAR(4) instead
on "CREATE TABLE ... YEAR(x)" or "ALTER TABLE MODIFY ... YEAR(x)", where x != 4.
Problem: Some queries with subqueries and a HAVING clause that
consists only of a column not in the select or grouping lists cause
the server to crash.
During parsing, an Item_ref is constructed for the HAVING column. The
name of the column is resolved when JOIN::prepare calls fix_fields()
on its having clause. Since the column is not mentioned in the select
or grouping lists, a ref pointer is not found and a new Item_field is
created instead. The Item_ref is replaced by the Item_field in the
tree of HAVING clauses. Since the tree consists only of this item, the
pointer that is updated is JOIN::having. However,
st_select_lex::having still points to the Item_ref as the root of the
tree of HAVING clauses.
The bug is triggered when doing filesort for create_sort_index(). When
find_all_keys() calls select->cond->walk() it eventually reaches
Item_subselect::walk() where it continues to walk the having clauses
from lex->having. This means that it finds the Item_ref instead of the
new Item_field, and Item_ref::walk() tries to dereference the ref
pointer, which is still null.
The crash is reproducible only in 5.5, but the problem lies latent in
5.1 and trunk as well.
Fix: After calling fix_fields on the having clause in JOIN::prepare(),
set select_lex::having to point to the same item as JOIN::having.
This patch also fixes a bug in 5.1 and 5.5 that is triggered if the
query is executed as a prepared statement. The Item_field is created
in the runtime arena when the query is prepared, and the pointer to
the item is saved by st_select_lex::fix_prepare_information() and
brought back as a dangling pointer when the query is executed, after
the runtime arena has been reclaimed.
Fix: Backport fix from trunk that switches to the permanent arena
before calling Item_ref::fix_fields() in JOIN::prepare().
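A simplified sketch of the pointer synchronisation in the first fix
(stub Item/JOIN shapes, not server code):
  struct Item { virtual ~Item() {} };
  struct SelectLex { Item *having; };
  struct Join { Item *having; SelectLex *select_lex; };

  /* Stub: fix_fields() may replace *ref with a new item (e.g. an
     Item_ref swapped for a freshly created Item_field). */
  static bool fix_fields_stub(Item **ref) { (void) ref; return false; }

  static bool prepare_having(Join *join)
  {
    if (join->having && fix_fields_stub(&join->having))
      return true;                       /* error */
    /* The fix: keep the parse tree's root in sync with JOIN::having so
       later walks (e.g. from Item_subselect::walk) see the resolved
       item, not the old Item_ref with a null ref pointer. */
    join->select_lex->having= join->having;
    return false;
  }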
INC_HOST_ERRORS() IS CALLED.
Issue : The sequence of calls to inc_host_errors()
and reset_host_errors() required some
changes in order to maintain a correct
connection error count.
Solution : The call to reset_host_errors() is shifted
to a location after which no calls to
inc_host_errors() are made.
Problem
========
Replication breaks when the event length exceeds
the size of the master Dump thread's max_allowed_packet.
The failure occurs because the event length plus the
max_event_header length exceeds the max_allowed_packet of the Dump
thread. This causes the Dump thread to break replication and throw
an error.
That can happen e.g. with row-based replication in an Update_rows event.
Fix
====
The problem is fixed in 2 steps:
1.) The Dump thread's limit for reading an event is raised to the
upper limit, i.e. the Dump thread reads whatever gets logged in the
binary log.
2.) On the slave side we increase the max_allowed_packet for the
slave's threads (IO/SQL) to 1GB.
This is done using the new server option slave_max_allowed_packet,
which lets the DBA regulate the max_allowed_packet of the slave
threads (IO/SQL) and facilitates the sending of large packets from
the master to the slave.
This allows large packets to be received by the slave and applied
successfully.
WHEN KILLING
Suppose there is a query waiting for a lock. If the user kills
this query, then the "Got error -1 when reading table" error message
must not be logged in the server log file. Since this is a
user-requested interruption, no spurious error message must be logged
in the server log. This patch removes the error message from
the log.
approved by joh and tatjana
Problem: mysqlbinlog exits without any error code in case of a
file write error. This is because calls to the
Log_event::print() method do not return a value, and
thus any errors were being ignored.
Resolution: We resolve this problem by checking for
IO_CACHE::error == -1 after every call to Log_event::print()
and terminating further execution.
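A sketch of the caller-side check (stub event/cache shapes; the exit
handling is illustrative, not mysqlbinlog's actual code):
  #include <cstddef>

  struct OutCacheSketch { int error; };  /* stands in for IO_CACHE */
  struct EventSketch {
    void print(OutCacheSketch *out) { (void) out; /* may set error */ }
  };

  /* After each print(), inspect the cache's error flag and terminate
     with a non-zero status instead of silently writing a bad dump. */
  static int process_events(EventSketch *events, size_t n,
                            OutCacheSketch *out)
  {
    for (size_t i= 0; i < n; i++) {
      events[i].print(out);
      if (out->error == -1)
        return 1;                        /* file write error: stop */
    }
    return 0;
  }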
rb://1088
approved by: Marko Makela
This bug was introduced in the early stages of the plugin. We were not
checking for an implicit lock on a secondary index record for the
trx_id that is stamped on the current version of the clustered index
record, in the case where the clustered index record has a previous
delete-marked version.
INNODB_AUTOINC_LOCK_MODE=1 AND USING TRIGGER
When an insert stmt like "insert into t values (1),(2),(3)" is
executed, the autoincrement values assigned to these three rows are
expected to be contiguous. In the given lock mode
(innodb_autoinc_lock_mode=1), the auto inc lock will be released
before the end of the statement. So to make the autoincrement
contiguous for a given statement, we need to reserve the auto inc
values at the beginning of the statement.
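A self-contained analogue of the reservation idea (a single atomic
counter stands in for the table's autoinc state):
  #include <atomic>

  static std::atomic<unsigned long long> next_autoinc{1};

  /* Reserve all `rows` values in one step at the start of the
     statement; the returned first value and the rows-1 values after it
     stay contiguous even though the autoinc lock is released before
     the statement ends. */
  static unsigned long long reserve_autoinc(unsigned long long rows)
  {
    return next_autoinc.fetch_add(rows);
  }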
Modified the fix based on review comment by Svoj.
Problem
========
SQL statements close to the size of max_allowed_packet produce binary
log events larger than max_allowed_packet.
The failure occurs because the event length is more than
max_allowed_packet plus the max_event_header length. Since the event
length exceeds this size, the master Dump thread is unable to send
the packet on to the slave.
That can happen e.g. with row-based replication in an Update_rows event.
Fix
====
The problem was fixed by increasing the max_allowed_packet for the
slave's threads (IO/SQL) to 1GB.
This is done using a new server option included to regulate the
max_allowed_packet of the slave threads (IO/SQL).
This allows large packets to be received by the slave and applied
successfully.
Problem: After the fix for Bug#12589870, a new field that
stores the length of db name was added in the buffer that
stores the query to be executed. Unlike the plain user
session, replication execution did not allocate the
necessary chunk in the Query-event constructor. This caused an
invalid read while accessing this field.
Solution: We fix this problem by allocating the necessary chunk in
the buffer created in Query_log_event::Query_log_event() and
storing the length of the database name there.
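An illustrative sizing sketch; the field order and one-byte length
field here are assumptions for illustration, not the event's actual
layout:
  #include <cstdlib>
  #include <cstring>

  /* Allocate the replication-side buffer with room for the db-name
     length field, mirroring what the plain-session path already did. */
  static char *alloc_query_buf(const char *query, size_t query_len,
                               const char *db, size_t db_len)
  {
    size_t buf_len= query_len + 1      /* query + NUL              */
                  + db_len + 1         /* database name + NUL      */
                  + 1;                 /* the db-name length field */
    char *buf= (char *) malloc(buf_len);
    if (buf == NULL)
      return NULL;
    buf[0]= (char) db_len;             /* store the db name length */
    memcpy(buf + 1, db, db_len);
    buf[1 + db_len]= '\0';
    memcpy(buf + 2 + db_len, query, query_len);
    buf[2 + db_len + query_len]= '\0';
    return buf;
  }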
PROBLEM:
Threads end up in a deadlock due to locks acquired as described
below.
con1: Runs a query on a table.
It is important that this SELECT backs off while
trying to open t1 and enters wait_for_condition().
The SELECT is then blocked trying to lock mysys_var->mutex,
which is held by con3. The very significant fact here is
that mysys_var->current_mutex will still point to LOCK_open,
even if LOCK_open is no longer held by con1 at this point.
con2: Tries dropping the table used in con1, or queries some table.
It will hold LOCK_open and be blocked trying to lock
kernel_mutex, which is held by con4.
con3: Tries killing the query run by con1.
It will hold THD::LOCK_thd_data belonging to con1 while
trying to lock mysys_var->current_mutex belonging to con1.
But current_mutex points to LOCK_open, which is held
by con2.
con4: Gets the innodb engine status.
It will hold kernel_mutex while trying to lock THD::LOCK_thd_data
belonging to con1, which is held by con3.
So while technically only con2, con3 and con4 participate in the
deadlock, con1's mysys_var->current_mutex pointing to LOCK_open
is a vital component of the deadlock.
CYCLE = (THD::LOCK_thd_data -> LOCK_open ->
kernel_mutex -> THD::LOCK_thd_data)
FIX:
LOCK_thd_data has the responsibility of protecting:
1) thd->query, thd->query_length
2) VIO
3) thd->mysys_var (used by the KILL statement and shutdown)
4) THD during thread delete.
Among the above, (1), (2), and (3,4) are three independent
groups of responsibility. If a different lock owns
responsibility (3,4), the above deadlock cycle can be
avoided. This fix introduces LOCK_thd_kill to handle
responsibility (3,4), which eliminates the deadlock issue.
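A conceptual sketch of the split (std::mutex stands in for the
server's mutexes):
  #include <mutex>

  struct ThdSketch {
    std::mutex LOCK_thd_data;   /* (1) query text, (2) VIO */
    std::mutex LOCK_thd_kill;   /* (3) mysys_var, (4) thread delete */
  };

  /* KILL path analogue: only LOCK_thd_kill is taken, so the kill no
     longer competes for the lock that sits on the deadlock cycle
     THD::LOCK_thd_data -> LOCK_open -> kernel_mutex. */
  static void awake_sketch(ThdSketch &victim)
  {
    std::lock_guard<std::mutex> guard(victim.LOCK_thd_kill);
    /* ... signal the victim through its mysys_var condition ... */
  }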
Note: The problem is not found in 5.5. The introduction of the MDL
subsystem moved metadata locking responsibility from the TDC/TC to
the MDL subsystem, reducing the responsibility of LOCK_open.
As the use of LOCK_open is removed in open_table() and
mysql_rm_table(), the above mentioned CYCLE does not form.
Revision ID for changes,
open_table() = dlenev@mysql.com-20100727133458-m3ua9oslnx8fbbvz
mysql_rm_table() = jon.hauglid@oracle.com-20101116100012-kxep9txz2fxy3nmw
The following scenario crashes our mysql server:
1. set global innodb_file_per_table=1;
2. create table t1(c1 int) engine=innodb;
3. alter table t1 discard tablespace;
4. alter table t1 add unique index(c1);
Step 4 crashes the server. This patch introduces a check for a
discarded tablespace to avoid the crash.
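A hedged sketch of the guard; the flag and error code here are
illustrative, not InnoDB's actual names:
  /* Before building a new index, refuse the operation if the table's
     tablespace has been discarded, instead of dereferencing missing
     tablespace state and crashing. */
  struct TableSketch { bool tablespace_discarded; };

  static int add_index_sketch(TableSketch *table)
  {
    if (table->tablespace_discarded)
      return -1;                     /* report an error to the caller */
    /* ... proceed with index creation ... */
    return 0;
  }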
rb://1041 approved by Marko Makela
FULLTEXT INDEX AND CONCURRENT DML.
Problem Statement:
------------------
1) Create a table with FT index.
2) Enable concurrent inserts.
3) In multiple threads do below operations repeatedly
a) truncate table
b) insert into table ....
c) select ... match .. against .. non-boolean/boolean mode
After some time we could observe two different assert core dumps.
Analysis:
--------
1) Assert core dump in key_read_cache():
Two select threads operating in parallel on the same key
root block.
The 1st select thread's block->status is set to BLOCK_ERROR
because the my_pread() in read_block() is returning '0'.
Truncating the table made the index file size 1024, and pread
was asked to get a block of count bytes (1024 bytes)
from offset 1024, which it cannot read since that is
"end of file", so it returns '0' and sets
"my_errno= HA_ERR_FILE_TOO_SHORT"; key_file_length and
key_root[0] are the same, i.e. 1024. Since the block status is
BLOCK_ERROR, the 1st select thread enters free_block() and
waits on a conditional mutex, setting the status to
BLOCK_REASSIGNED and going into wait_on_readers(). The other select
thread also works on the same block, sees the status as
BLOCK_ERROR, enters free_block(), checks for BLOCK_REASSIGNED,
and asserts the server.
2) Assert core dump in key_write_cache():
One select thread and one insert thread.
The select thread unlocks 'keycache->cache_lock',
which allows other threads to continue, gets the pread()
return value as '0' (see the explanation above), and then
tries to get the lock on 'keycache->cache_lock', waiting
there for the lock.
The insert thread requests the block; the block is assigned
from the hash list, the page_status is set to
'PAGE_WAIT_TO_BE_READ', and it goes for read_block(), waiting
in the queue since some other threads are performing
reads on the same block.
The select thread which was waiting for the 'keycache->cache_lock'
mutex in read_block() continues after getting the my_pread()
value as '0', sets the block status to BLOCK_ERROR, goes to
free_block(), and then to wait_for_readers().
Now the insert thread awakes, continues, finds
block->status is not BLOCK_READ, and asserts.
Fix:
---
In the full-text code, multiple readers of the index file are not
guarded. Hence the code below was added in _ft2_search() and
walk_and_match().
To lock the key_root, the following is used in _ft2_search():
  if (info->s->concurrent_insert)
    mysql_rwlock_rdlock(&share->key_root_lock[0]);
and to unlock:
  if (info->s->concurrent_insert)
    mysql_rwlock_unlock(&share->key_root_lock[0]);
INNODB_AUTOINC_LOCK_MODE=1 AND USING TRIGGER
When an insert stmt like "insert into t values (1),(2),(3)" is
executed, the autoincrement values assigned to these three rows are
expected to be contiguous. In the given lock mode
(innodb_autoinc_lock_mode=1), the auto inc lock will be released
before the end of the statement. So to make the autoincrement
contiguous for a given statement, we need to reserve the auto inc
values at the beginning of the statement.
rb://1074 approved by Alexander Nozdrin
dict_table_replace_index_in_foreign_list(): Replace the dropped index
also in the foreign key constraints of child tables that are
referencing this table.
row_ins_check_foreign_constraint(): If the underlying index is
missing, refuse the operation.
rb:1051 approved by Jimmy Yang