Fix remaining cases of Bug #3596: fix possible races caused by an obsolete value of thd->query_length in SHOW PROCESSLIST and SHOW INNODB STATUS; this fix depends on the fact that thd->query is always set to NULL before setting it to point to a new query
in hard-coded replication messages, always put small-length info (error codes, explanation of the error) at the beginning,
so that it is not cut by truncation if the query is very long (which happens if the query goes first).
too big by 6 bytes. So I add code to substract 6 bytes if the master is 3.23.
This is not perfect (because it won't work if the slave I/O thread has not
noticed yet that the master is 3.23), but as long as the slave I/O thread
starts Exec_master_log_pos will be ok.
It must be merged to 4.1 but not to 5.0 (or it can be, because of #if MYSQL_VERSION_ID),
because 5.0 already works if the master is 3.23 (and in a more natural way:
in 5.0 we store the end_log_pos in the binlog and relay log).
I had to move functions from slave.h to slave.cc to satisfy gcc.
Fixed bugs in group_concat with ORDER BY and DISTINCT (Bugs #2695, #3381 and #3319)
Fixed crash when doing rollback in slave and the io thread catched up with the sql thread
Set locked_in_memory properly
We introduce a new function mysql_test_parse_for_slave().
If the slave sees that the query got a really bad error on master
(killed e.g.), then it calls this function to know if this query
can be ignored because of replicate-*-table rules (do not worry
about replicate-*-db rules: they are checked so early that they have
no bug). If the answer is yes, it skips the query and continues. If
it's no, then it stops and say "fix your slave data manually" (like it
did before this change).
- the one about BUG#2921
- the one about relay log flushing
Both will be rewritten in a next changeset
(this one will not be pushed before the next changeset).
INSERT DELAYED works only for one-row inserts (in latest 4.0 versions
at least). So killing a delayed_insert thread does not spoil replication:
the rows which actually went into the table are exactly those listed
in the binlog. So when the delayed_insert thread is killed, don't log
it as 'killed', because it causes superfluous stops on the slave.
"(binlog, position) stored by InnoDB for a replication slave can be wrong".
This code contains conditional #if to distinguish between versions;
it should be merged into 4.1 and 5.0.
Added more DBUG statements
Ensure that we are comparing end space with BINARY strings
Use 'any_db' instead of '' to mean any database. (For HANDLER command)
Only strip ' ' when comparing CHAR, not other space-like characters (like \t)
reset errors (in thd) before executing the event. Otherwise if an event is ignored
because of replicate-*-table rules (error ER_SLAVE_IGNORED_TABLE) this error code
may remain in thd->net and the next event may pick it.
For previous commit I had run only rpl* tests, here the other ones had a
few surprises. Latest status:
- all tests pass
- all replication tests pass with Valgrind
This is the final-final commit & push.
Doc remains.
* A more dynamic binlog format which allows small changes (1064)
* Log session variables in Query_log_event (1063)
It contains a few bugfixes (which I made when running the testsuite).
I carefully updated the results of the testsuite (i.e. I checked for every one,
if the difference between .reject and .result could be explained).
Apparently mysql-test-run --manager is broken in 4.1 and 5.0 currently,
so I could neither run the few tests which require --manager, nor check
that they pass nor modify their .result. But for builds, we don't run
with --manager.
Apart from --manager, the full testsuite passes, with Valgrind too (no errors).
I'm going to push in the next minutes. Remains: update the manual.
Note: by chance I saw that (in 4.1, in 5.0) rpl_get_lock fails when run alone;
this is normal at it makes assumptions on thread ids. I will fix this one day
in 4.1.
This is the main commit for Worklog tasks:
* A more dynamic binlog format which allows small changes (1064)
* Log session variables in Query_log_event (1063)
Below 5.0 means 5.0.0.
MySQL 5.0 is able to replicate FOREIGN_KEY_CHECKS, UNIQUE_KEY_CHECKS (for speed),
SQL_AUTO_IS_NULL, SQL_MODE. Not charsets (WL#1062), not some vars (I can only think
of SQL_SELECT_LIMIT, which deserves a special treatment). Note that this
works for queries, except LOAD DATA INFILE (for this it would have to wait
for Dmitri's push of WL#874, which in turns waits for the present push, so...
the deadlock must be broken!). Note that when Dmitri pushes WL#874 in 5.0.1,
5.0.0 won't be able to replicate a LOAD DATA INFILE from 5.0.1.
Apart from that, the new binlog format is designed so that it can tolerate
a little variation in the events (so that a 5.0.0 slave could replicate a
5.0.1 master, except for LOAD DATA INFILE unfortunately); that is, when I
later add replication of charsets it should break nothing. And when I later
add a UID to every event, it should break nothing.
The main change brought by this patch is a new type of event, Format_description_log_event,
which describes some lengthes in other event types. This event is needed for
the master/slave/mysqlbinlog to understand a 5.0 log. Thanks to this event,
we can later add more bytes to the header of every event without breaking compatibility.
Inside Query_log_event, we have some additional dynamic format, as every Query_log_event
can have a different number of status variables, stored as pairs (code, value); that's
how SQL_MODE and session variables and catalog are stored. Like this, we can later
add count of affected rows, charsets... and we can have options --don't-log-count-affected-rows
if we want.
MySQL 5.0 is able to run on 4.x relay logs, 4.x binlogs.
Upgrading a 4.x master to 5.0 is ok (no need to delete binlogs),
upgrading a 4.x slave to 5.0 is ok (no need to delete relay logs);
so both can be "hot" upgrades.
Upgrading a 3.23 master to 5.0 requires as much as upgrading it to 4.0.
3.23 and 4.x can't be slaves of 5.0.
So downgrading from 5.0 to 4.x may be complicated.
Log_event::log_pos is now the position of the end of the event, which is
more useful than the position of the beginning. We take care about compatibility
with <5.0 (in which log_pos is the beginning).
I added a short test for replication of SQL_MODE and some other variables.
TODO:
- after committing this, merge the latest 5.0 into it
- fix all tests
- update the manual with upgrade notes.
"EE_ error codes (EE_DELETE, EE_WRITE) end up in the binlog, making slave stop".
The problem was that during execution of the command on the master, an error
can occur (for example, not space left on device, then mysqld waits and when
there is space it completes successfully: so finally it worked but the error
EE_WRITE remains in thd->net.last_errno and thd->net.last_error).
To know if finally the command succeeded, we test the 'error' variable in
every place, and if it shows no failure we reset thd->net.last_err* using
the function THD::clear_error() which is backported from 4.1.
A new test to see if now only real errors get to the binlog (note: the test
uses "rm").
Also a bit of memory free/alloc saving in log_event.cc (do not free the whole
mem_root after every query in the slave SQL thread: we can keep the initial
block of it; which will be freed when the thread terminates).
The constructor of Rotate_log_event used when we are rotating our binlog or
relay log, should not assume that there is a nonzero THD available.
For example, when we are reacting to SIGHUP, the THD is 0.
In fact we don't need to use the THD in this constructor;
we can do like for Stop_log_event, and use the minimal Log_event
constructor.
If we were allowed to put Unix-specific commands in the testsuite,
I'd add a test for this (<sigh>).