- BUG#11986: Stored routines and triggers can fail if the code
has a non-ascii symbol
- BUG#16291: mysqldump corrupts string-constants with non-ascii-chars
- BUG#19443: INFORMATION_SCHEMA does not support charsets properly
- BUG#21249: Character set of SP-var can be ignored
- BUG#25212: Character set of string constant is ignored (stored routines)
- BUG#25221: Character set of string constant is ignored (triggers)
There were a few general problems that caused these bugs:
1. Character set information of the original (definition) query for views,
triggers, stored routines and events was lost.
2. mysqldump output query in client character set, which can be
inappropriate to encode definition-query.
3. INFORMATION_SCHEMA used strings with mixed encodings to display object
definition;
1. No query-definition-character set.
In order to compile query into execution code, some extra data (such as
environment variables or the database character set) is used. The problem
here was that this context was not preserved. So, on the next load it can
differ from the original one, thus the result will be different.
The context contains the following data:
- client character set;
- connection collation (character set and collation);
- collation of the owner database;
The fix is to store this context and use it each time we parse (compile)
and execute the object (stored routine, trigger, ...).
2. Wrong mysqldump-output.
The original query can contain several encodings (by means of character set
introducers). The problem here was that we tried to convert original query
to the mysqldump-client character set.
Moreover, we stored queries in different character sets for different
objects (views, for one, used UTF8, triggers used original character set).
The solution is
- to store definition queries in the original character set;
- to change SHOW CREATE statement to output definition query in the
binary character set (i.e. without any conversion);
- introduce SHOW CREATE TRIGGER statement;
- to dump special statements to switch the context to the original one
before dumping and restore it afterwards.
Note, in order to preserve the database collation at the creation time,
additional ALTER DATABASE might be used (to temporary switch the database
collation back to the original value). In this case, ALTER DATABASE
privilege will be required. This is a backward-incompatible change.
3. INFORMATION_SCHEMA showed non-UTF8 strings
The fix is to generate UTF8-query during the parsing, store it in the object
and show it in the INFORMATION_SCHEMA.
Basically, the idea is to create a copy of the original query convert it to
UTF8. Character set introducers are removed and all text literals are
converted to UTF8.
This UTF8 query is intended to provide user-readable output. It must not be
used to recreate the object. Specialized SHOW CREATE statements should be
used for this.
The reason for this limitation is the following: the original query can
contain symbols from several character sets (by means of character set
introducers).
Example:
- original query:
CREATE VIEW v1 AS SELECT _cp1251 'Hello' AS c1;
- UTF8 query (for INFORMATION_SCHEMA):
CREATE VIEW v1 AS SELECT 'Hello' AS c1;
Sometimes the number of really updated rows (with changed
column values) cannot be determined at the server level
alone (e.g. if the storage engine does not return enough
column values to verify that). So the only dependable way
in such cases is to let the storage engine return that
information if possible.
Fixed the bug at server level by providing a way for the
storage engine to return information about wether it
actually updated the row or the old and the new column
values are the same. It can do that by returning
HA_ERR_RECORD_IS_THE_SAME in ha_update_row().
Note that each storage engine may choose not to try to
return this status code, so this behaviour remains
storage engine specific.
MySQL uses _beginthread()/_endthread() instead of
_beginthreadex()/_endthreadex() to create/end its threads
on Windows.
According to MSDN _endthread() does close the thread handle.
So there's no need the handle to be closed explicitly.
Besides : WaitForSingleObject(, INFINITE) != WAIT_OBJECT_0) is
true for all practical cases as the other two possible return
codes (according to MSDN) cannot happen in that case the
CloseHandle() was actually a dead code.
Fixed by removing the CloseHandle() call. No test case added
because it's not possible to test for absence of dead code.
The patch contains the following changes:
- Introduce auxilary functions to convenient work with character sets:
- resolve_charset();
- resolve_collation();
- get_default_db_collation();
- Introduce lex_string_set();
- Refactor Table_trigger_list::process_triggers() &
sp_head::execute_trigger() to be consistent with other code;
- Move reusable code from add_table_for_trigger() into
build_trn_path(), check_trn_exists() and load_table_name_for_trigger()
to be used in the following patch.
- Rename triggers_file_ext and trigname_file_ext into TRN_EXT and
TRG_EXT respectively.
and is not described in the manual
- Adding missing initialization for utf8 collations
- Minor code clean-ups: renaming variables,
moving code into a new separate function.
- Adding test, to check that both ucs2 and utf8 user
defined collations work (ucs2_test_ci and utf8_test_ci)
- Adding Vietnamese collation as a complex user defined
collation example.
The value of "low-priority-updates" option and the LOW PRIORITY
prefix was taken into account at parse time.
This caused triggers (among others) to ignore this flag (if
supplied for the DML statement).
Moved reading of the LOW_PRIORITY flag at run time.
Fixed an incosistency when handling
SET GLOBAL LOW_PRIORITY_UPDATES : now it is in effect for
delayed INSERTs.
Tested by checking the effect of LOW_PRIORITY flag via a
trigger.
Setting a key_cache_block_size which is not a power of 2
could corrupt MyISAM tables.
A couple of computations in the key cache code use bit
operations which do only work if key_cache_block_size
is a power of 2.
Replaced bit operations by arithmetic operations
to make key cache able to handle block sizes that are
not a power of 2.