This patch fixes problem that LOAD DATA could use different
character sets when loading files on master and on slave sides:
- Adding replication of thd->variables.collation_database
- Adding optional character set clause into LOAD DATA
Note, the second way, with explicit CHARACTER SET clause
should be the recommended way to load data using an alternative
character set.
The old way, using "SET @@character_set_database=xxx" should be
gradually depricated.
The problem happened because those tests were using "cp932" and "ucs2" without checking whether these character sets are available. This fix moves test parts to make character set specific parts be tested only if they are:
- some parts were moved to "ctype_ucs.test" and "ctype_cp932.test"
- some parts were moved to the newly added tests "innodb-ucs2.test", "mysqlbinglog-cp932.test" and "sp-ucs2.test"
Problem: mysqlbinlog_base64 failed sporadically.
Reason: Missing "flush logs" before running $MYSQL_BINLOG,
which could start dumping the log file before server
has finished writting into it.
Fix:
- implementing --force-if-open option to "mysqlbinlog"
- adding --disable-force-if-open to make $MYSQL_BINLOG
fail on non-closed log files, to garantee that nobody
will forget "flush logs" in the future.
- adding "flush logs" into all affected tests.
Problem: when loading mysqlbinlog dumps, CREATE PROCEDURE having semicolons
in their bodies failed.
Fix: Using safe delimiter "/*!*/;" to dump log entries.
Binlog lacks encoding info about DROPped temporary table.
Idea of the fix is to switch temporary to system_charset_info when a temporary table
is DROPped for binlog. Since that is the server, that automatically, but not the client, who generates the query
the binlog should be updated on the server's encoding for the coming DROP.
The `write_binlog_with_system_charset()' is introduced to replace similar problematic places in the code.
internal charset to one associated with currently being handled query.
To note such a query can come from interactive client either.
There was a discussion within replication team and Monty who's suggestion won.
It avoids straightforward parsing of all `set' queries that could affect client side
character set.
According to the idea, mysql client does not parse `set' queries but rather cares of
`charset new_cs_name' command.
This command is generated by mysqlbinlog in form of exclaiming comment (Lars' suggestion)
so that enlightened clients like `mysql' knows what to do with it.
Interactive human can switch between many multi-byte charsets during the session
providing the command explicitly.
To note that setting new internal mysql's charset does not
trigger sending any `SET' sql statement to the server.
Now one can use user variables as target for data loaded from file
(besides table's columns). Also LOAD DATA got new SET-clause in which
one can specify values for table columns as expressions.
For example the following is possible:
LOAD DATA INFILE 'words.dat' INTO TABLE t1 (a, @b) SET c = @b + 1;
This patch also implements new way of replicating LOAD DATA.
Now we do it similarly to other queries.
We store LOAD DATA query in new Execute_load_query event
(which is last in the sequence of events representing LOAD DATA).
When we are executing this event we simply rewrite part of query which
holds name of file (we use name of temporary file) and then execute it
as usual query. In the beggining of this sequence we use Begin_load_query
event which is almost identical to Append_file event
we store 7 bytes (1 + 2*3) in every Query_log_event.
In the future if users want binlog optimized for small size and less safe,
we could add --binlog-no-charset (and binlog-no-sql-mode etc): charset info
is something by design optional (even if for now we don't offer possibility to disable it):
it's not a binlog format change.
We try to reduce the number of get_charset() calls in the slave SQL thread to a minimum
by caching the charset read from the previous event (which will often be equal to the one of the current event).
We don't use SET ONE_SHOT for charset-aware repl (we still do for timezones, will be fixed later).
No more errors if one changes the global value of charset vars on master or slave
(as we log charset info in all Query_log_event).
Not fixing Load_log_event as it will be rewritten soon by Dmitri.
Testing how mysqlbinlog behaves in rpl_charset.test.
mysqlbinlog needs to know where charset file is (to be able to convert a charset number found
in binlog (e.g. in User_var_log_event) to a charset name); mysql-test-run needs to pass
the correct value for this option to mysqlbinlog.
Many result udpates (adding charset info into every event shifts log_pos in SHOW BINLOG EVENTS).
Roughly the same job is to be done for timezones :)
Note: The following tests fails
- fulltext (Sergei has promised to fix)
- rpl_charset (Guilhem should fix)
- rpl_timezone (Dimitray has promised to fix)
Sanja needs to check out the calling of close_thread_tables() in sp_head.cc
binlog even if they changed nothing, and a test for this.
This is useful when users use these commands to clean up their master and slave by issuing
one command on master (assume master and slave have slightly different data for some
reason and you want to clean up both).
Note that I have not changed multi-table DELETE and multi-table UPDATE because their
error-reporting mechanism is more complicated.
we now detect that the server is sending us a log which we did not request
by testing the info in the fake Rotate event.
I also changed code to not print the fake Rotate which describes the
log we asked for (it's always the first received event but old masters
may not send it).