LIKE craashed with a pattern having letters in the range 128..255
(e.g. A WITH ACUTE or C WITH CARON) because of wrong cast from
signed char to unsigned int.
The length of the prefix of the pattern string in the LIKE predicate that
determined the index range to be scanned was calculated incorrectly for
multi-byte character sets.
As a result of this in 4. 1 the the scanned range was wider then necessary
if the prefix contained not only one-byte characters.
In 5.0 additionally it caused missing some rows from the result set.
Backporting a 5.0 change:
MAX_BUF was too small for Index.xml
Changeing MAX_BUF and adding assert to easier
catch the same problem in the future.
ctype-extra.c:
Regenerating ctype-extra.c with the fixed conf_to_src.
When InnoDB compares varchar field in ucs2 with given key using bin collation,
it calls my_strnncollsp_ucs2_bin() to perform comparison.
Because field length was lesser than length of key field should be padded
with trailing spaces in order to get correct result.
Because my_strnncollsp_ucs2_bin() was calling my_strnncollp_ucs2_bin(), which
doesn't pads field, wrong comparison result was returned. This results in
wrong result set.
my_strnncollsp_ucs2_bin() now compares fields like my_strnncollsp_ucs2 do,
but using binary collation.
into parts when converting to Unicode.
m_ctype.h:
Reorganizing mb_wc return codes to be able
to return "an unassigned N-byte-long character".
sql_string.cc:
Adding code to detect and properly handle
unassigned characters (i.e. the those character
which are correctly formed according to the
character specifications, but don't have Unicode
mapping).
Many files:
Fixing conversion function to return new codes.
ctype_ujis.test, ctype_gbk.test, ctype_big5.test:
Adding a test case.
ctype_ujis.result, ctype_gbk.result, ctype_big5.result:
Fixing results accordingly.
ctype-euc_kr.c:
ctype-gb2312.c:
Adding specific well_formed_length functions
for gb2312 and euckr, to allow storing characters
which are correct according to the character set
specifications but just don't have Unicode mapping.
Previously only those which have Unicode mapping
could be stored, while unassigned characters lead
to data truncation.
Many files:
new file
Fixing latin1 to cp1252, according to
recent changes in ctype-latin1.c.
Index.xml:
Marking latin1_swedish_ci as "compiled"
to avoid its generating into ctype-extra.c.
ctype-extra.c:
Bug#12076 --with-extra-charsets has no effect
All supported dnamic charsets were generated
from sql/share/charsets/*.xml under #ifdefs.
Compiling is to be activated from "configure"
by adding --with-extra-charsets.
I'm not including auto-recreating of ctype-extra.c
into build process for ease purposes.
index doesn't return correct result
item_cmpfunc.cc:
Use charset of LIKE to decide whether
to use 8bit or Unicode "escape" value.
But use charset of "escape" to scan escape character.
strings/ctype-xxx.c:
We cannot reduce "end" pointer using charpos(),
because of possible escape characters in the string.
Limit the loop using count of written characters instead.
ctype_like_escape.inc:
new file
mysql-test/t/ctype_xxx:
mysql-test/r/ctype_xxx:
Adding test case.
ctype_latin1.test, ctype_latin1.result:
adding test case
ctype-latin1.c:
Fixing ctype array to treat extended cp1252
letters as valid identifiers on server side,
and as valid "isprint" characters (e.g. on client side).
In cp932, '\' character can be the second byte in a
multi-byte character stream. This makes it difficult to use
mysql_escape_string. Added flag to indicate which languages allow
'\' as second byte of multibyte sequence so that when putting a prepared
statement into the binlog we can decide at runtime whether hex encoding
is really needed.
Changed assembler functions to not access global variables or variables in text segement
Added wrapper function in C to longlong2str() to pass _dig_vec_upper as an argument
ctype-cp932.c:
ctype-gbk.c:
ctype-mb.c:
ctype-simple.c:
ctype-sjis.c:
ctype-ucs2.c:
ctype-ujis.c:
ctype-utf8.c:
Adding explicit cast to return type
in pointer substructions to avoid
warnings from some compilers.
Bug #11987
mysql will truncate the text when the text contain GBK char:"0xA3A0" and "0xA1"
Allow to store and retrieve even unassigned GBK codes.
Like we did in Big5 earlier.
have_gbk.inc, have_gbk.require, ctype_gbk.result, ctype_gbk.test:
new file
Index.xml:
Fixing latin1 comment:
it is actually cp1252, not iso-8859-1
ctype_latin1.result:
changeing test results accordingly.
ctype-latin1.c:
Fixed to- and from-Unicode conversion maps
for better Unicode round trip of undefined
characters.
New BitKeeper file ``mysql-test/include/ctype_innodb_like.inc''
Many files:
bug#11650: LIKE pattern matching using prefix index doesn't return correct result
min and max values were too long in the case of prefix key.
Fix my_like_range functions not to exceed prefix length.
ctype_innodb_like.inc:
new file
Fixing tests accordingly.
ctype-ucs2.c:
The same fix for UCS2.
ctype-utf8.c:
Bug #9557
MyISAM utf8 table crash
The problem was that my_strnncollsp_xxx could
return big value in the range 0..0xffff.
for some constant pairs it could return 32738,
which is defined as MI_FOUND_WRONG_KEY in
myisamdef.h. As a result, table considered to
be crashed.
Fix to return -1,0 or 1.