- Adding a new argument "flag" to MY_COLLATION_HANDLER::strnncollsp_nchars()
and a flag MY_STRNNCOLLSP_NCHARS_EMULATE_TRIMMED_TRAILING_SPACES.
The flag defines if strnncollsp_nchars() should emulate trailing spaces
which were possibly trimmed earlier (e.g. in InnoDB CHAR compression).
This is important for NOPAD collations.
For example, with this input:
- str1= 'a ' (Latin letter a followed by one space)
- str2= 'a ' (Latin letter a followed by two spaces)
- nchars= 3
if the flag is given, strnncollsp_nchars() will virtually restore
one trailing space to str1 up to nchars (3) characters and compare two
strings as equal:
- str1= 'a ' (one extra trailing space emulated)
- str2= 'a ' (as is)
If the flag is not given, strnncollsp_nchars() does not add trailing
virtual spaces, so in case of a NOPAD collation, str1 will be compared
as less than str2 because it is shorter.
- Field_string::cmp_prefix() now passes the new flag.
Field_varstring::cmp_prefix() and Field_blob::cmp_prefix() do
not pass the new flag.
- The branch in cmp_whole_field() in storage/innobase/rem/rem0cmp.cc
(which handles the CHAR data type) now also passed the new flag.
- Fixing UCA collations to respect the new flag.
Other collations are possibly also affected, however
I had no success in making an SQL script demonstrating the problem.
Other collations will be extended to respect this flags in a separate
patch later.
- Changing the meaning of the last parameter of Field::cmp_prefix()
from "number of bytes" (internal length)
to "number of characters" (user visible length).
The code calling cmp_prefix() from handler.cc was wrong.
After this change, the call in handler.cc became correct.
The code calling cmp_prefix() from key_rec_cmp() in key.cc
was adjusted according to this change.
- Old strnncollsp_nchar() related tests in unittest/strings/strings-t.c
now pass the new flag.
A few new tests also were added, without the flag.
This follows up commit
commit 94a520ddbe and
commit 7c5519c12d.
After these changes, the default test suites on a
cmake -DWITH_UBSAN=ON build no longer fail due to passing
null pointers as parameters that are declared to never be null,
but plenty of other runtime errors remain.
After the MDEV-13118 fix there's no code in the server that
wants caseup/casedn to change the argument in place for simple
charsets. Let's remove this logic and always return the result in a
new string for all charsets, both simple and complex.
1. Removing the optimization that *some* character sets used in casedn()
and caseup(), which allowed (and required) to change the case in-place,
overwriting the string passed as the "src" argument.
Now all CHARSET_INFO's work in the same way:
non of them change the source string in-place, all of them now convert
case from the source string to the destination string, leaving
the source string untouched.
2. Adding "const" qualifier to the "char *src" parameter
to caseup() and casedn().
3. Removing duplicate implementations in ctype-mb.c.
Now both caseup() and casedn() implementations for all CJK character sets
use internally the same function my_casefold_mb()
(the former my_casefold_mb_varlen()).
4. Removing the "unused" attribute from parameters of some my_case{up|dn}_xxx()
implementations, as the affected parameters are now *used* in the code.
Previously these parameters were used only in DBUG_ASSERT().
- Removing the "diff_if_only_endspace_difference" argument from
MY_COLLATION_HANDLER::strnncollsp(), my_strnncollsp_simple(),
as well as in the function template MY_FUNCTION_NAME(strnncollsp)
in strcoll.ic
- Removing the "diff_if_only_space_different" from ha_compare_text(),
hp_rec_key_cmp().
- Adding a new function my_strnncollsp_padspace_bin() and reusing
it instead of duplicate code pieces in my_strnncollsp_8bit_bin(),
my_strnncollsp_latin1_de(), my_strnncollsp_tis620(),
my_strnncollsp_utf8_cs().
- Adding more tests for better coverage of the trailing space handling.
- Removing the unused definition of HA_END_SPACE_ARE_EQUAL
Note, the patch for MDEV-8661 unintentionally fixed MDEV-8694 as well,
as a side effect. Adding a real clear fix: implementing
Item_func_like::propagate_equal_fields() with comments.
This is a pre-requisite patch for:
- MDEV-8433 Make field<'broken-string' use indexes
- MDEV-8625 Bad result set with ignorable characters when using a prefix key
- MDEV-8626 Bad result set with contractions when using a prefix key
Adding a new virtual function MY_CHARSET_HANDLER::copy_abort().
Moving character set specific code into the correspoding implementations
(for simple, multi-byte and mbmaxlen>1 character sets).
The problem was that my_hash_sort didn't properly delete end-space characters properly, so strings that should compare
identically was seen as different strings. (Space was handled correctly, but not NBSP)
This caused duplicate key errors when a heap table was converted to Aria as part of overflow in group by.
Fixed by removing all characters that compares as end space when creating a hash.
Other things:
- Fixed that --sorted_results also works for errors in mysqltest.
- Speed up hash by not comparing strings that has different hash.
- Speed up many my_hash_sort functions by using registers to calculate hash instead of pointers.
This was previously done for some functions, but not for all.
- Made a macro of the hash function, to simplify code and to be able to experiment with new hash functions.
client/mysqltest.cc:
Fixed that --sorted_results also works for error messages.
mysql-test/r/ctype_partitions.result:
New test to ensure that partitions on hash works
mysql-test/suite/multi_source/gtid.result:
Updated result
mysql-test/suite/multi_source/gtid.test:
Test that --sorted_result works for error messages
mysql-test/suite/multi_source/gtid_ignore_duplicates.result:
Updated result
mysql-test/suite/multi_source/gtid_ignore_duplicates.test:
Updated result
mysql-test/suite/multi_source/load_data.result:
Updated result
mysql-test/suite/multi_source/load_data.test:
Updated result
mysql-test/t/ctype_partitions.test:
New test to ensure that partitions on hash works
storage/heap/hp_write.c:
Speed up hash by not comparing strings that has different hash.
storage/maria/ma_check.c:
Extra debug
strings/ctype-bin.c:
Use macro for hash function
strings/ctype-latin1.c:
Use macro for hash function
Use registers to calculate hash (speedup)
strings/ctype-mb.c:
Use macro for hash function
Use registers to calculate hash (speedup)
strings/ctype-simple.c:
Use macro for hash function
Use same variable names as in other my_hash_sort functions.
Update my_hash_sort_simple() to properly remove end space (patch by Bar)
strings/ctype-uca.c:
Ignore duplicated space inside strings and end space in my_hash_sort_uca(). This fixed MDEV-6255
Use macro for hash function
Use registers to calculate hash (speedup)
strings/ctype-ucs2.c:
Use macro for hash function
Use registers to calculate hash (speedup)
strings/ctype-utf8.c:
Use macro for hash function
Use registers to calculate hash (speedup)
strings/strings_def.h:
Made a macro of the hash function, to simplify code and to be able to experiment with new hash functions.
~40% bugfixed(*) applied
~40$ bugfixed reverted (incorrect or we're not buggy)
~20% bugfixed applied, despite us being not buggy
(*) only changes in the server code, e.g. not cmakefiles
Reduced number of my_hash_sort_bin() calls from 4 to 1 per query.
Reduced number of memory accesses done by my_hash_sort_bin().
Details:
- let MDL subsystem use pre-calculated hash value for hash
inserts and deletes
- let table cache use pre-calculated MDL hash value
- MDL namespace is excluded from hash value calculation, so that
hash value can be used by table cache as is
- hash value for MDL is calculated as resulting hash value + MDL
namespace
- extended hash implementation to accept user defined hash function
includes:
* remove some remnants of "Bug#14521864: MYSQL 5.1 TO 5.5 BUGS PARTITIONING"
* introduce LOCK_share, now LOCK_ha_data is strictly for engines
* rea_create_table() always creates .par file (even in "frm-only" mode)
* fix a 5.6 bug, temp file leak on dummy ALTER TABLE
sql/sql_insert.cc:
CREATE ... IF NOT EXISTS may do nothing, but
it is still not a failure. don't forget to my_ok it.
******
CREATE ... IF NOT EXISTS may do nothing, but
it is still not a failure. don't forget to my_ok it.
sql/sql_table.cc:
small cleanup
******
small cleanup
Added strings_def.h into strings library to be able to have a DBUG_ASSERT() version without _db_flush() call (as strings.a should not depend on dbug.a)
Remove include of m_string.h in all string files (as it's included by string_def.h).
Fixed include order.
Changed "m_ctype.h" -> <m_ctype.h>
include/my_dbug.h:
Flush DBUG log in case of DBUG_ASSERT()
strings/bchange.c:
Include strings_def.h
strings/bcmp.c:
Include strings_def.h
strings/bfill.c:
Include strings_def.h
strings/bmove.c:
Include strings_def.h
strings/bmove512.c:
Include strings_def.h
strings/bmove_upp.c:
Include strings_def.h
strings/conf_to_src.c:
Include strings_def.h
Fixed copyright
strings/ctype-big5.c:
Include strings_def.h
strings/ctype-bin.c:
Include strings_def.h
strings/ctype-cp932.c:
Include strings_def.h
strings/ctype-czech.c:
Include strings_def.h
strings/ctype-euc_kr.c:
Include strings_def.h
strings/ctype-eucjpms.c:
Include strings_def.h
strings/ctype-extra.c:
Include strings_def.h
strings/ctype-gbk.c:
Include strings_def.h
strings/ctype-latin1.c:
Include strings_def.h
strings/ctype-mb.c:
Include strings_def.h
strings/ctype-simple.c:
Include strings_def.h
strings/ctype-sjis.c:
Include strings_def.h
strings/ctype-tis620.c:
Include strings_def.h
strings/ctype-uca.c:
Include strings_def.h
strings/ctype-ucs2.c:
Include strings_def.h
strings/ctype-ujis.c:
Include strings_def.h
strings/ctype-utf8.c:
Include strings_def.h
strings/ctype-win1250ch.c:
Include strings_def.h
strings/ctype.c:
Include strings_def.h
strings/decimal.c:
Include strings_def.h
strings/do_ctype.c:
Include strings_def.h
strings/int2str.c:
Include strings_def.h
strings/is_prefix.c:
Include strings_def.h
strings/llstr.c:
Include strings_def.h
strings/longlong2str.c:
Include strings_def.h
strings/longlong2str_asm.c:
Include strings_def.h
strings/my_strchr.c:
Include strings_def.h
strings/my_strtoll10.c:
Include strings_def.h
strings/my_vsnprintf.c:
Include strings_def.h
strings/r_strinstr.c:
Include strings_def.h
strings/str2int.c:
Include strings_def.h
strings/str_alloc.c:
Include strings_def.h
strings/str_test.c:
Include strings_def.h
Fixed compiler warnings
strings/strappend.c:
Include strings_def.h
strings/strcend.c:
Include strings_def.h
strings/strcont.c:
Include strings_def.h
strings/strend.c:
Include strings_def.h
strings/strfill.c:
Include strings_def.h
strings/strinstr.c:
Include strings_def.h
strings/strmake.c:
Include strings_def.h
strings/strmov.c:
Include strings_def.h
strings/strmov_overlapp.c:
Include strings_def.h
strings/strnlen.c:
Include strings_def.h
strings/strnmov.c:
Include strings_def.h
strings/strstr.c:
Include strings_def.h
strings/strto.c:
Include strings_def.h
strings/strtod.c:
Include strings_def.h
strings/strtol.c:
Include strings_def.h
strings/strtoll.c:
Include strings_def.h
strings/strtoul.c:
Include strings_def.h
strings/strtoull.c:
Include strings_def.h
strings/strxmov.c:
Include strings_def.h
strings/strxnmov.c:
Include strings_def.h
strings/uctypedump.c:
Include strings_def.h
Fixed compiler warnings
Removed double include of m_ctype.h
strings/udiv.c:
Include strings_def.h
strings/xml.c:
Include strings_def.h