Test that we can start a MySQL Server with a default multibyte charset with
NDB running. Test *really* basic functionality too.
Index: ndb-work/mysql-test/r/rpl_ndb_ctype_ucs2_def.result
===================================================================
NDB util thread calls mysql_parse internally with plain old c strings (7bit ascii) to create tables (e.g. mysql.ndb_schema). With mysqld default charset set to a multibyte one (e.g. ucs2) mysql_parse would try to interpret the 7bit string as UCS2 and promptly explode in a heap.
Solution is to set the util thread to be using utf8 charset.
Index: ndb-work/sql/ha_ndbcluster.cc
===================================================================
previous correction didn't. make sure "tail" is fixed up
when filling cache several times; rework formulae.
---
Merge sin.intern.azundris.com:/home/tnurnberg/22540/50-22540
into sin.intern.azundris.com:/home/tnurnberg/22540/51-22540
When a UNION statement forced conversion of an UTF8
charset value to a binary charset value, the byte
length of the result values was truncated to the
CHAR_LENGTH of the original UTF8 value.
In Automake,
mysqld_LDADD = libndb.la
only adds libndb to the link command for mysqld,
but does not declare a dependency.
Added libndb.la to mysqld_DEPENDENCIES
This fix a build race condition that currently
breaks make -J builds, and also enforce a re-link
of mysqld when libndb.la changes.
spaces.
When the my_strnncollsp_simple function compares two strings and one is a prefix
of another then this function compares characters in the rest of longer key
with the space character to find whether the longer key is greater or less.
But the sort order of the collation isn't used in this comparison. This may
lead to a wrong comparison result, wrongly created index or wrong order of the
result set of a query with the ORDER BY clause.
Now the my_strnncollsp_simple function uses collation sort order to compare
the characters in the rest of longer key with the space character.
Problem: "Under high load, the slave registering to the master can timeout
during the COM_REGISTER_SLAVE execution. This causes an error, which
prevents the slave from connecting at all."
Fix: Do not abort immediately, but retry registering on master.
1. Fix ddl_i18n_koi8r, ddl_i18n_utf8: explicitly specify character-sets
directory for mysqldump;
2. Fix crash in mysqldump if collation is not found;
3. Use proper way to compare character set names.
query / no aggregate of subquery
The optimizer counts the aggregate functions that
appear as top level expressions (in all_fields) in
the current subquery. Later it makes a list of these
that it uses to actually execute the aggregates in
end_send_group().
That count is used in several places as a flag whether
there are aggregates functions.
While collecting the above info it must not consider
aggregates that are not aggregated in the current
context. It must treat them as normal expressions
instead. Not doing that leads to incorrect data about
the query, e.g. running a query that actually has no
aggregate functions as if it has some (and hence is
expected to return only one row).
Fixed by ignoring the aggregates that are not aggregated
in the current context.
One other smaller omission discovered and fixed in the
process : the place of aggregation was not calculated for
user defined functions. Fixed by calling
Item_sum::init_sum_func_check() and
Item_sum::check_sum_func() as it's done for the rest of
the aggregate functions.
- BUG#11986: Stored routines and triggers can fail if the code
has a non-ascii symbol
- BUG#16291: mysqldump corrupts string-constants with non-ascii-chars
- BUG#19443: INFORMATION_SCHEMA does not support charsets properly
- BUG#21249: Character set of SP-var can be ignored
- BUG#25212: Character set of string constant is ignored (stored routines)
- BUG#25221: Character set of string constant is ignored (triggers)
There were a few general problems that caused these bugs:
1. Character set information of the original (definition) query for views,
triggers, stored routines and events was lost.
2. mysqldump output query in client character set, which can be
inappropriate to encode definition-query.
3. INFORMATION_SCHEMA used strings with mixed encodings to display object
definition;
1. No query-definition-character set.
In order to compile query into execution code, some extra data (such as
environment variables or the database character set) is used. The problem
here was that this context was not preserved. So, on the next load it can
differ from the original one, thus the result will be different.
The context contains the following data:
- client character set;
- connection collation (character set and collation);
- collation of the owner database;
The fix is to store this context and use it each time we parse (compile)
and execute the object (stored routine, trigger, ...).
2. Wrong mysqldump-output.
The original query can contain several encodings (by means of character set
introducers). The problem here was that we tried to convert original query
to the mysqldump-client character set.
Moreover, we stored queries in different character sets for different
objects (views, for one, used UTF8, triggers used original character set).
The solution is
- to store definition queries in the original character set;
- to change SHOW CREATE statement to output definition query in the
binary character set (i.e. without any conversion);
- introduce SHOW CREATE TRIGGER statement;
- to dump special statements to switch the context to the original one
before dumping and restore it afterwards.
Note, in order to preserve the database collation at the creation time,
additional ALTER DATABASE might be used (to temporary switch the database
collation back to the original value). In this case, ALTER DATABASE
privilege will be required. This is a backward-incompatible change.
3. INFORMATION_SCHEMA showed non-UTF8 strings
The fix is to generate UTF8-query during the parsing, store it in the object
and show it in the INFORMATION_SCHEMA.
Basically, the idea is to create a copy of the original query convert it to
UTF8. Character set introducers are removed and all text literals are
converted to UTF8.
This UTF8 query is intended to provide user-readable output. It must not be
used to recreate the object. Specialized SHOW CREATE statements should be
used for this.
The reason for this limitation is the following: the original query can
contain symbols from several character sets (by means of character set
introducers).
Example:
- original query:
CREATE VIEW v1 AS SELECT _cp1251 'Hello' AS c1;
- UTF8 query (for INFORMATION_SCHEMA):
CREATE VIEW v1 AS SELECT 'Hello' AS c1;
Sometimes the number of really updated rows (with changed
column values) cannot be determined at the server level
alone (e.g. if the storage engine does not return enough
column values to verify that). So the only dependable way
in such cases is to let the storage engine return that
information if possible.
Fixed the bug at server level by providing a way for the
storage engine to return information about wether it
actually updated the row or the old and the new column
values are the same. It can do that by returning
HA_ERR_RECORD_IS_THE_SAME in ha_update_row().
Note that each storage engine may choose not to try to
return this status code, so this behaviour remains
storage engine specific.
SHOW CREATE TABLE or SELECT FROM I_S.
Actually, the bug discovers two problems:
- the original query is not preserved properly. This is the problem
of BUG#16291;
- the resultset of SHOW CREATE TABLE statement is binary.
This patch fixes the second problem for the 5.0.
Both problems will be fixed in 5.1.