After the MDEV-13118 fix there's no code in the server that
wants caseup/casedn to change the argument in place for simple
charsets. Let's remove this logic and always return the result in a
new string for all charsets, both simple and complex.
1. Removing the optimization that *some* character sets used in casedn()
and caseup(), which allowed (and required) to change the case in-place,
overwriting the string passed as the "src" argument.
Now all CHARSET_INFO's work in the same way:
non of them change the source string in-place, all of them now convert
case from the source string to the destination string, leaving
the source string untouched.
2. Adding "const" qualifier to the "char *src" parameter
to caseup() and casedn().
3. Removing duplicate implementations in ctype-mb.c.
Now both caseup() and casedn() implementations for all CJK character sets
use internally the same function my_casefold_mb()
(the former my_casefold_mb_varlen()).
4. Removing the "unused" attribute from parameters of some my_case{up|dn}_xxx()
implementations, as the affected parameters are now *used* in the code.
Previously these parameters were used only in DBUG_ASSERT().
This is a pre-requisite patch for:
- MDEV-8433 Make field<'broken-string' use indexes
- MDEV-8625 Bad result set with ignorable characters when using a prefix key
- MDEV-8626 Bad result set with contractions when using a prefix key
Adding a new virtual function MY_CHARSET_HANDLER::copy_abort().
Moving character set specific code into the correspoding implementations
(for simple, multi-byte and mbmaxlen>1 character sets).
The problem was that my_hash_sort didn't properly delete end-space characters properly, so strings that should compare
identically was seen as different strings. (Space was handled correctly, but not NBSP)
This caused duplicate key errors when a heap table was converted to Aria as part of overflow in group by.
Fixed by removing all characters that compares as end space when creating a hash.
Other things:
- Fixed that --sorted_results also works for errors in mysqltest.
- Speed up hash by not comparing strings that has different hash.
- Speed up many my_hash_sort functions by using registers to calculate hash instead of pointers.
This was previously done for some functions, but not for all.
- Made a macro of the hash function, to simplify code and to be able to experiment with new hash functions.
client/mysqltest.cc:
Fixed that --sorted_results also works for error messages.
mysql-test/r/ctype_partitions.result:
New test to ensure that partitions on hash works
mysql-test/suite/multi_source/gtid.result:
Updated result
mysql-test/suite/multi_source/gtid.test:
Test that --sorted_result works for error messages
mysql-test/suite/multi_source/gtid_ignore_duplicates.result:
Updated result
mysql-test/suite/multi_source/gtid_ignore_duplicates.test:
Updated result
mysql-test/suite/multi_source/load_data.result:
Updated result
mysql-test/suite/multi_source/load_data.test:
Updated result
mysql-test/t/ctype_partitions.test:
New test to ensure that partitions on hash works
storage/heap/hp_write.c:
Speed up hash by not comparing strings that has different hash.
storage/maria/ma_check.c:
Extra debug
strings/ctype-bin.c:
Use macro for hash function
strings/ctype-latin1.c:
Use macro for hash function
Use registers to calculate hash (speedup)
strings/ctype-mb.c:
Use macro for hash function
Use registers to calculate hash (speedup)
strings/ctype-simple.c:
Use macro for hash function
Use same variable names as in other my_hash_sort functions.
Update my_hash_sort_simple() to properly remove end space (patch by Bar)
strings/ctype-uca.c:
Ignore duplicated space inside strings and end space in my_hash_sort_uca(). This fixed MDEV-6255
Use macro for hash function
Use registers to calculate hash (speedup)
strings/ctype-ucs2.c:
Use macro for hash function
Use registers to calculate hash (speedup)
strings/ctype-utf8.c:
Use macro for hash function
Use registers to calculate hash (speedup)
strings/strings_def.h:
Made a macro of the hash function, to simplify code and to be able to experiment with new hash functions.
Problem:-
We have created a table with UTF8_BIN collation.
In case, when in our query we have ORDER BY clause over a function
call we are getting result in incorrect order.
Note:the bug is not there in 5.5.
Analysis:
In 5.5, for UTF16_BIN, we have min and max multi-byte length is 2 and 4
respectively.In make_sortkey(),for 2 byte character character we are
assuming that the resultant length will be 2 byte/character. But when we
use my_strnxfrm_unicode_full_bin(), we store sorting weights using 3 bytes
per character.This result in truncated result.
Same thing happen for UTF8MB4, where we have 1 byte min multi-byte and
4 byte max multi-byte.We will accsume resultant data as 1 byte/character,
which result in truncated result.
Solution:-
use strnxfrm(means use of MY_CS_STRNXFRM macro) is used for sort, in
which the resultant length is not dependent on source length.
Workaround for a possible GCC bug happening in my_fill_ucs2:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58039
Commenting out the optimized loop.
Using the original MySQL version.
modified:
strings/ctype-ucs2.c
includes:
* remove some remnants of "Bug#14521864: MYSQL 5.1 TO 5.5 BUGS PARTITIONING"
* introduce LOCK_share, now LOCK_ha_data is strictly for engines
* rea_create_table() always creates .par file (even in "frm-only" mode)
* fix a 5.6 bug, temp file leak on dummy ALTER TABLE
mysql-test/suite/innodb/t/group_commit_crash.test:
remove autoincrement to avoid rbr being used for insert ... select
mysql-test/suite/innodb/t/group_commit_crash_no_optimize_thread.test:
remove autoincrement to avoid rbr being used for insert ... select
mysys/my_addr_resolve.c:
a pointer to a buffer is returned to the caller -> the buffer cannot be on the stack
mysys/stacktrace.c:
my_vsnprintf() is ok here, in 5.5