MySQL's hash functions MD5 and SHA relied on the somewhat slow
sprintf function to convert the digests to hex representations.
This patch replaces the sprintf with a specific and inline hex
conversion function.
Patch contributed by Jan Steemann.
In RBR, DDL statement will change binlog format to non row-based
format before it is binlogged, but the binlog format was not be
restored, and then manipulating a temporary table can not reset binlog
format to row-based format rightly. So that the manipulated statement
is binlogged with statement-based format.
To fix the problem, restore the state of binlog format after the DDL
statement is binlogged.
cant find record
Some engines return data for the record. Despite the fact that
the null bit is set for some fields, their old value may still in
the row. This can happen when unpacking an AI from the binlog on
top of a previous record in which a field is set to NULL, which
previously contained a value. Ultimately, this may cause the
comparison of records to fail when the slave is doing an index or
range scan.
We fix this by deploying a call to reset() for each field that is
set to null while unpacking a row from the binary log.
Furthermore, we also add mixed mode test case to cover the
scenario where updating and setting a field to null through a
Query event and later searching it through a rows event will
succeed.
Finally, we also change the reset() method, from Field_bit class,
so that it takes into account bits stored among the null bits and
not only the ones stored in the record.
Several items said to be deprecated in the 4.1 manual
have never been removed. This worklog adds deprecation
warnings when these items are used, and warns the user
that the items will be removed in MySQL 5.6.
A couple of previously deprecation decision have been
reversed (see single file comments)
It is well-known that due to concurrency issues, a slave can become
inconsistent when a transaction contains updates to both transaction and
non-transactional tables in statement and mixed modes.
In a nutshell, the current code-base tries to preserve causality among the
statements by writing non-transactional statements to the txn-cache which
is flushed upon commit. However, modifications done to non-transactional
tables on behalf of a transaction become immediately visible to other
connections but may not immediately get into the binary log and therefore
consistency may be broken.
In general, it is impossible to automatically detect causality/dependency
among statements by just analyzing the statements sent to the server. This
happen because dependency may be hidden in the application code and it is
necessary to know a priori all the statements processed in the context of
a transaction such as in a procedure. Moreover, even for the few cases that
we could automatically address in the server, the computation effort
required could make the approach infeasible.
So, in this patch we introduce the option
- "--binlog-direct-non-transactional-updates" that can be used to bypass
the current behavior in order to write directly to binary log statements
that change non-transactional tables.
'CREATE TABLE IF NOT EXISTS ... SELECT' statement were causing 'CREATE
TEMPORARY TABLE ...' to be written to the binary log in row-based
mode (a.k.a. RBR), when there was a temporary table with the same name.
Because the 'CREATE TABLE ... SELECT' statement was executed as
'INSERT ... SELECT' into the temporary table. Since in RBR mode no
other statements related to temporary tables are written into binary log,
this sometimes broke replication.
This patch changes behavior of 'CREATE TABLE [IF NOT EXISTS] ... SELECT ...'.
it ignores existence of temporary table with the
same name as table being created and is interpreted
as attempt to create/insert into base table. This makes behavior of
'CREATE TABLE [IF NOT EXISTS] ... SELECT' consistent with
how ordinary 'CREATE TABLE' and 'CREATE TABLE ... LIKE' behave.
The optimizer must not continue executing the current query
if e.g. the storage engine reports an error.
This is somewhat hard to implement with Item::val_xxx()
because they do not have means to return error code.
This is why we need to check the thread's error state after
a call to one of the Item::val_xxx() methods.
Fixed store_key_item::copy_inner() to return an error state
if an error happened during the call to Item::save_in_field()
because it calls Item::val_xxx().
Also added similar checks to related places.
BUG#49481: RBR: MyISAM and bit fields may cause slave to stop on delete:
cant find record
BUG#49482: RBR: Replication may break on deletes when MyISAM tables +
char field are used
When using MyISAM tables, despite the fact that the null bit is
set for some fields, their old value is still in the row. This
can cause the comparison of records to fail when the slave is
doing an index or range scan.
We fix this by avoiding memcmp for MyISAM tables when comparing
records. Additionally, when comparing field by field, we first
check if both fields are not null and if so, then we compare
them. If just one field is null we return failure immediately. If
both fields are null, we move on to the next field.
In certain rare cases when a process was interrupted
during a FLUSH PRIVILEGES operation the diagnostic
area would be set to an error state but the function
responsible for the operation would still signal
success. This would lead to a debug assertion error
later on when the server would attempt to reset the
DA before sending the error message.
This patch fixes the issue by assuring that
reload_acl_and_cache() always fails if an error
condition is raised.
The second issue was that a KILL could cause
a console error message which referred to a DA
state without first making sure that such a
state existed.
This patch fixes this issue in two different
palces by first checking DA state before
fetching the error message.
Problem: When RAND() is binlogged in statement mode, the seed is
binlogged too, so the replication slave generates the same
sequence of random numbers. This makes replication work in many
cases, but not in all cases: the order of rows is not guaranteed
for, e.g., UPDATE or INSERT...SELECT statements, so the row data
will be different if master and slave retrieve the rows in
different orders.
Fix: Mark RAND() as unsafe. It will generate a warning if
binlog_format=STATEMENT and switch to row-logging if
binlog_format=ROW.
Selecting of the CONCAT_WS(...<PS parameter>...) result into
a user variable may return wrong data.
Item_func_concat_ws::val_str contains a number of memory
allocation-saving optimization tricks. After the fix
for bug 46815 the control flow has been changed to a
branch that is commented as "This is quite uncommon!":
one of places where we are trying to concatenate
strings inplace. However, that "uncommon" place
didn't care about PS parameters, that have another
trick in Item_sp_variable::val_str(): they use the
intermediate Item_sp_variable::str_value field,
where they may store a reference to an external
argument's buffer.
The Item_func_concat_ws::val_str function has been
modified to take into account val_str functions
(such as Item_sp_variable::val_str) that return a
pointer to an internal Item member variable that
may reference to a buffer provided.
MySQL handles the join syntax "JOIN ... USING( field1,
... )" and natural joins by building the same parse tree as
a corresponding join with an "ON t1.field1 = t2.field1 ..."
expression would produce. This parse tree was not cleaned up
properly in the following scenario. If a thread tries to
lock some tables and finds that the tables were dropped and
re-created while waiting for the lock, it cleans up column
references in the statement by means a per-statement free
list. But if the statement was part of a stored procedure,
column references on the stored procedure's free list weren't
cleaned up and thus contained pointers to freed objects.
Fixed by adding a call to clean up the current prepared
statement's free list.
Manually deleteing one or more entries from 'master-bin.index', will
cause master infinitely loop to send one binlog file.
When starting a dump session, master opens index file and search the binlog file
which is being requested by the slave. The position of the binlog file in the
index file is recorded. it will be used to find the next binlog file when current
binlog file has dumped completely. As only the position is used, it may
not get the correct file if some entries has been removed manually from the index file.
the master will reopen the current binlog file which has been dump completely
and redump it if it can not get the next binlog file's name from index file.
It obviously is a logical error.
Even though it is allowed to manually change index file,
but it is not recommended. so after this patch, master
sends a fatal error to slave and close the dump session if a new binlog file
has been generated and master can not get it from the index file.
For tables with metadata sizes ranging from 251 to 255 the size
of the event data (m_data_size) was being improperly calculated
in the Table_map_log_event constructor. This was due to the fact
that when writing the Table_map_log_event body (in
Table_map_log_event::write_data_body) a call to net_store_length
is made for packing the m_field_metadata_size. It happens that
net_store_length uses *one* byte for storing
m_field_metadata_size when it is smaller than 251 but *three*
bytes when it exceeds that value. BUG 42749 had already
pinpointed and fix this fact, but the fix was incomplete, as the
calculation in the Table_map_log_event constructor considers 255
instead of 251 as the threshold to increment m_data_size by
three. Thence, the window for having a mismatch between the
number of bytes written and the number of bytes accounted in the
event length (m_data_size) was left open for
m_field_metadata_size values between 251 and 255.
We fix this by changing the condition in the Table_map_log_event
constructor to match the one in the net_store_length, ie,
increment one byte if m_field_metadata_size < 251 and three if it
exceeds this value.
In statement-based or mixed-mode replication, use DROP TEMPORARY TABLE
to drop multiple tables causes different errors on master and slave,
when one or more of these tables do not exist. Because when executed
on slave, it would automatically add IF EXISTS to the query to ignore
all ER_BAD_TABLE_ERROR errors.
To fix the problem, do not add IF EXISTS when executing DROP TEMPORARY
TABLE on the slave, and clear the ER_BAD_TABLE_ERROR error after
execution if the query does not expect any errors.
In statement-based or mixed-mode replication, use DROP TEMPORARY TABLE
to drop multiple tables causes different errors on master and slave,
when one or more of these tables do not exist. Because when executed
on slave, it would automatically add IF EXISTS to the query to ignore
all ER_BAD_TABLE_ERROR errors.
To fix the problem, do not add IF EXISTS when executing DROP TEMPORARY
TABLE on the slave, and clear the ER_BAD_TABLE_ERROR error after
execution if the query does not expect any errors.
subselect_single_select_engine::exec()
When a subquery doesn't need to be evaluated because
it returns only aggregate functions and these aggregates
can be calculated from the metadata about the table it
was not updating all the relevant members of the JOIN
structure to reflect that this is a constant query.
This caused problems to the enclosing subquery
('<> SOME' in the test case above) trying to read some
data about the tables.
Fixed by setting const_tables to the number of tables
when the SELECT is optimized away.
REORGANIZE PARTITION
There were several problems which lead to this this,
all related to bad error handling.
1) There was several bugs preventing the ddl-log to be used for
cleaning up created files on error.
2) The error handling after the copy partition rows did not close
and unlock the tables, resulting in deletion of partitions
which were in use, which lead InnoDB to put the partition to
drop in a background queue.
error in the query.
Fixes a leak after materializing a GROUP BY subquery to a
temp table when the subquery has a blob column in the SELECT
list.
Fixed by correctly destructing temporary buffers after doing
the conversion.
Problem was when calculating the range of partitions for
pruning.
Solution was to get the calculation correct. I also simplified
it a bit for easier understanding.
Several problems fixed :
1. Non constant expressions in UNION ... ORDER BY were not correctly cleaned up
in st_select_lex_unit::cleanup() causing crashes in EXPLAIN EXTENDED because of
fields quoted by these expressions pointing to the already freed temporary table
used to calculate the UNION.
Fixed by correctly cleaning up expressions of any depth.
2. Subqueries in the order by part of UNION ... ORDER BY ... caused a crash in
EXPLAIN EXTENDED because of a transformation attempt made during EXPLAIN EXTENDED
execution. Fixed by not doing the transformation when in EXPLAIN.
3. Fulltext functions caused crash when in the ORDER BY part of an un-parenthesized
UNION that gets "promoted" to be valid for the whole union, e.g.
SELECT * FROM t1 UNION SELECT * FROM t2 ORDER BY MATCHES (a) AGAINST ('abc' IN BOOLEAN MODE).
This is a case that demonstrates a more general problem of parts of the query being
moved to another level. When doing such transformation late in the optimization run
when most of the flags about the contents of the query are already aggregated it's possible
to "split" the flags so that they correctly reflect the new queries after the transformation.
In specific the ST_SELECT_LEX::ftfunc_list is holding all the free text function for all the
parts of the second SELECT in the UNION and we don't know what part of that is in the ORDER BY
that we're to move to the UNION level and what part is about the other parts of the second SELECT.
Fixed by throwing and error when such statements are about to be processed by adding a check
for the presence of MATCH() inside the ORDER BY clause that's going to get promoted to UNION.
To workaround this new limitation one must parenthesize the UNION SELECTs and provide a real
global ORDER BY for the UNION outside of the parenthesis.
At the end of execution top level join execution
we cleanup this join with true argument.
It leads to underlying join cleanup(subquery) with true argument too
and to tmp_table_param->field array cleanup which is required later.
The problem is that Item_func_set_user_var does not set
result_filed which leads to unnecessary repeated excution of subquery
on final stage.
The fix is to set result_field for Item_func_set_user_var.
on re-execution of prepared statement
Problem: some (see eq_ref_table()) ORDER BY/GROUP BY optimization
is called before each PS execution. However, we don't properly
initialize its stucture every time before the call.
Fix: properly initialize the sturture used.