In case of out-of-memory error received from the master, print the corresponding message to the error log and stop slave I/O thread to avoid reconnecting with a wrong binary log position.
The issue found with bug 25411 is due to the function skip_rear_comments()
which damages the source code while implementing a work around.
The root cause of the problem is in the lexical analyser, which does not
process special comments properly.
For special comments like :
[1] aaa /*!50000 bbb */ ccc
since 5.0 is a version older that the current code, the parser is in lining
the content of the special comment, so that the query to process is
[2] aaa bbb ccc
However, the text of the query captured when processing a stored procedure,
stored function or trigger (or event in 5.1), can be after rebuilding it:
[3] aaa bbb */ ccc
which is wrong.
To fix bug 25411 properly, the lexical analyser needs to return [2] when
in lining special comments.
In order to implement this, some preliminary cleanup is required in the code,
which is implemented by this patch.
Before this change, the structure named LEX (or st_lex) contains attributes
that belong to lexical analysis, as well as attributes that represents the
abstract syntax tree (AST) of a statement.
Creating a new LEX structure for each statements (which makes sense for the
AST part) also re-initialized the lexical analysis phase each time, which
is conceptually wrong.
With this patch, the previous st_lex structure has been split in two:
- st_lex represents the Abstract Syntax Tree for a statement. The name "lex"
has not been changed to avoid a bigger impact in the code base.
- class lex_input_stream represents the internal state of the lexical
analyser, which by definition should *not* be reinitialized when parsing
multiple statements from the same input stream.
This change is a pre-requisite for bug 25411, since the implementation of
lex_input_stream will later improve to deal properly with special comments,
and this processing can not be done with the current implementation of
sp_head::reset_lex and sp_head::restore_lex, which interfere with the lexer.
This change set alone does not fix bug 25411.
Problem: to handle a situation when the size of event on the master is greater than max_allowed_packet on slave, we checked for the wrong constant (ER_NET_PACKET_TOO_LARGE instead of CR_NET_PACKET_TOO_LARGE).
Solution: test for the client "packet too large" error code instead of the server one in slave I/O thread.
"INSERT... ON DUPLICATE KEY UPDATE skips auto_increment values"
didn't make it into 5.0.36 and 5.1.16,
so we need to adjust the bug-detection-based-on-version-number code.
Because the rpl tree has a too old version, rpl_insert_id cannot pass,
so I disable it (like is already the case in 5.1-rpl for the same reason),
and the repl team will re-enable it when they merge 5.0 and 5.1 into
their trees (thus getting the right version number).
"INSERT... ON DUPLICATE KEY UPDATE skips auto_increment values".
When in an INSERT ON DUPLICATE KEY UPDATE, using
an autoincrement column, we inserted some autogenerated values and
also updated some rows, some autogenerated values were not used
(for example, even if 10 was the largest autoinc value in the table
at the start of the statement, 12 could be the first autogenerated
value inserted by the statement, instead of 11). One autogenerated
value was lost per updated row. Led to exhausting the range of the
autoincrement column faster.
Bug introduced by fix of BUG#20188; present since 5.0.24 and 5.1.12.
This bug breaks replication from a pre-5.0.24 master.
But the present bugfix, as it makes INSERT ON DUP KEY UPDATE
behave like pre-5.0.24, breaks replication from a [5.0.24,5.0.34]
master to a fixed (5.0.36) slave! To warn users against this when
they upgrade their slave, as agreed with the support team, we add
code for a fixed slave to detect that it is connected to a buggy
master in a situation (INSERT ON DUP KEY UPDATE into autoinc column)
likely to break replication, in which case it cannot replicate so
stops and prints a message to the slave's error log and to SHOW SLAVE
STATUS.
For 5.0.36->[5.0.24,5.0.34] replication we cannot warn as master
does not know the slave's version (but we always recommended to users
to have slave at least as new as master).
As agreed with support, I'll also ask for an alert to be put into
the MySQL Network Monitoring and Advisory Service.
The possibility of the race is removed by changing sequence of calls
pthread_mutex_unlock(&mi->run_lock);
pthread_cond_broadcast(&mi->stop_cond);
into
pthread_cond_broadcast(&mi->stop_cond);
pthread_mutex_unlock(&mi->run_lock);
at the end of I/O thread (similar change at the end of SQL thread). This ensures
that no thread waiting on the condition executes between the broadcast and the
unlock and thus can't delete the mi structure which caused the bug.
The relay log may not be open for some reason (e.g. disk error) after rotation,
and using it causes the slave crash.
Fix: check we have it open before access, return error otherwise.
- Removed not used variables and functions
- Added #ifdef around code that is not used
- Renamed variables and functions to avoid conflicts
- Removed some not used arguments
Fixed some class/struct warnings in ndb
Added define IS_LONGDATA() to simplify code in libmysql.c
I did run gcov on the changes and added 'purecov' comments on almost all lines that was not just variable name changes
Fixed compiler warnings (detected by VC++):
- Removed not used variables
- Added casts
- Fixed wrong assignments to bool
- Fixed wrong calls with bool arguments
- Added missing argument to store(longlong), which caused wrong store method to be called.
(Mostly in DBUG_PRINT() and unused arguments)
Fixed bug in query cache when used with traceing (--with-debug)
Fixed memory leak in mysqldump
Removed warnings from mysqltest scripts (replaced -- with #)
When we write 'query=...' string to a frm file for views on a slave,
indentifiers are not properly quoted due to missing OPTION_QUOTE_SHOW_CREATE
flag in the thd->options.
Fix: properly set thd->options for the slave thread.
Transaction on the slave sql thread got blocked against a slave's mysqld local ta's
lock. Since the default, slave-transaction-retries=10, there was replaying of the
replicated ta. That failed because of a new started from 5.0.13 policy not to rollback
a timed-out transaction. Effectively the first round of a timed-out ta becomes committed
by the replaying's first "BEGIN".
It was decided to backport already existed method working in 5.1 implemented in
bug #16228 for handling symmetrical deadlock problem. That patch introduced end_trans
execution whenever a replicated ta deadlocks or timed-out.
Note, that this solution can be practically suboptimal - in the light of the changed behavior
due to timeout we still could replay only the last statement - only with a high rate of timeouting
replicated transactions.
A communication packet can also be a binlog event sent from the master to the slave.
To be sent by master dump and accepted by slave io thread both have to have
the value of max_allowed_packet bigger than one that client connection had.
In the patch there is the MAX possible replicatio header size estimation for events
in binlog that embedded user query. Only these events of query_log_event type, i.e
just plain queries, require attention.
Fix when __attribute__() is stubbed out, add ATTRIBUTE_FORMAT() for specifying
__attribute__((format(...))) safely, make more use of the format attribute,
and fix some of the warnings that this turns up (plus a bonus unrelated one).
when calling a SP from C API"
The bug was caused by lack of checks for misuse in mysql_real_query.
A stored procedure always returns at least one result, which is the
status of execution of the procedure itself.
This result, or so-called OK packet, is similar to a result
returned by INSERT/UPDATE/CREATE operations: it contains the overall
status of execution, the number of affected rows and the number of
warnings. The client test program attached to the bug did not read this
result and ivnoked the next query. In turn, libmysql had no check for
such scenario and mysql_real_query was simply trying to send that query
without reading the pending response, thus messing up the communication
protocol.
The fix is to return an error from mysql_real_query when it's called
prior to retrieval of all pending results.
No test case as the bug is in an existing test case (rpl_trigger.test
when it is run under valgrind).
The warning was caused by memory corruption in replication slave: thd->db
was pointing at a stack address that was previously used by
sp_head::execute()::old_db. This happened because mysql_change_db
behaved differently in replication slave and did not make a copy of the
argument to assign to thd->db.
The solution is to always free the old value of thd->db and allocate a new
copy, regardless whether we're running in a replication slave or not.
Bug#19022 "Memory bug when switching db during trigger execution"
Bug#17199 "Problem when view calls function from another database."
Bug#18444 "Fully qualified stored function names don't work correctly in
SELECT statements"
Documentation note: this patch introduces a change in behaviour of prepared
statements.
This patch adds a few new invariants with regard to how THD::db should
be used. These invariants should be preserved in future:
- one should never refer to THD::db by pointer and always make a deep copy
(strmake, strdup)
- one should never compare two databases by pointer, but use strncmp or
my_strncasecmp
- TABLE_LIST object table->db should be always initialized in the parser or
by creator of the object.
For prepared statements it means that if the current database is changed
after a statement is prepared, the database that was current at prepare
remains active. This also means that you can not prepare a statement that
implicitly refers to the current database if the latter is not set.
This is not documented, and therefore needs documentation. This is NOT a
change in behavior for almost all SQL statements except:
- ALTER TABLE t1 RENAME t2
- OPTIMIZE TABLE t1
- ANALYZE TABLE t1
- TRUNCATE TABLE t1 --
until this patch t1 or t2 could be evaluated at the first execution of
prepared statement.
CURRENT_DATABASE() still works OK and is evaluated at every execution
of prepared statement.
Note, that in stored routines this is not an issue as the default
database is the database of the stored procedure and "use" statement
is prohibited in stored routines.
This patch makes obsolete the use of check_db_used (it was never used in the
old code too) and all other places that check for table->db and assign it
from THD::db if it's NULL, except the parser.
How this patch was created: THD::{db,db_length} were replaced with a
LEX_STRING, THD::db. All the places that refer to THD::{db,db_length} were
manually checked and:
- if the place uses thd->db by pointer, it was fixed to make a deep copy
- if a place compared two db pointers, it was fixed to compare them by value
(via strcmp/my_strcasecmp, whatever was approproate)
Then this intermediate patch was used to write a smaller patch that does the
same thing but without a rename.
TODO in 5.1:
- remove check_db_used
- deploy THD::set_db in mysql_change_db
See also comments to individual files.