This patch ensures that all identical character sets share the same
cs->csname.
This allows us to replace strcmp() in my_charset_same() with comparisons
of pointers. This fixes a long-standing performance issue that could cause
a strcmp() for every item sent through the protocol class to the end user.
One consequence of this patch is that we don't allow one to add a character
definition in the Index.xml file that changes the csname of an existing
character set. This is by design as changing character set names of existing
ones is extremely dangerous, especially as some storage engines just record
character set numbers.
As we now have a hash over the character sets' csname, we can in the future
use that for faster access to a specific character set. This could be done
by making the hash non-unique and using it to find the next
character set with the same csname.
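The idea of sharing one csname pointer can be sketched as follows. This is a minimal illustration, not the server code: struct demo_charset, demo_intern_csname() and demo_charset_same() are hypothetical names, and a linear scan stands in for the hash mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a character set descriptor. */
struct demo_charset
{
  unsigned number;
  const char *csname;          /* shared pointer from the table below */
};

#define DEMO_MAX_NAMES 16
static const char *demo_names[DEMO_MAX_NAMES];
static size_t demo_name_count= 0;

/*
  Return the one shared pointer for 'name', registering it on first use.
  The table stores the pointer itself, so the caller must keep the
  string alive. Returns NULL when the table is full.
*/
const char *demo_intern_csname(const char *name)
{
  size_t i;
  assert(name != NULL);
  for (i= 0; i < demo_name_count; i++)
    if (strcmp(demo_names[i], name) == 0)
      return demo_names[i];
  if (demo_name_count == DEMO_MAX_NAMES)
    return NULL;
  demo_names[demo_name_count]= name;
  return demo_names[demo_name_count++];
}

/* With interned names, "same character set" is a pointer comparison. */
int demo_charset_same(const struct demo_charset *a,
                      const struct demo_charset *b)
{
  return a->csname == b->csname;
}
```

The strcmp() cost is paid once, when a name is registered, instead of on every comparison.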
The code did not take into account that:
- U+005C (backslash) can occupy more than mbminlen characters (e.g. in sjis)
- Some character sets do not have a code for U+005C (e.g. swe7)
Adding a new function my_wc_to_printable into MY_CHARSET_HANDLER to
cover all special cases more easily.
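The two special cases above can be illustrated with a small sketch. It is not the actual my_wc_to_printable: the callback type, the toy encoders, the "\XXXX" output format and the '.' fallback are all assumptions made for illustration, and only code points up to U+FFFF are handled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical encoder callback: writes one code point, returns the
   number of bytes written, or 0 if it is not encodable or does not fit. */
typedef size_t (*demo_wc_mb_fn)(unsigned long wc, unsigned char *dst,
                                size_t dstlen);

/* Toy ASCII encoder. */
size_t demo_encode_ascii(unsigned long wc, unsigned char *dst, size_t dstlen)
{
  if (wc > 0x7F || dstlen < 1)
    return 0;
  dst[0]= (unsigned char) wc;
  return 1;
}

/* Toy 7-bit encoder that has no code for U+005C. */
size_t demo_encode_no_backslash(unsigned long wc, unsigned char *dst,
                                size_t dstlen)
{
  if (wc < 0x20 || wc > 0x7E || wc == 0x5C || dstlen < 1)
    return 0;
  dst[0]= (unsigned char) wc;
  return 1;
}

/*
  Write wc as a backslash followed by four hex digits, encoding every
  character of the escape through 'encode' so that a backslash may take
  any number of bytes. If the backslash itself is not encodable, '.' is
  used instead (an assumption of this sketch).
  Returns the number of bytes written, or 0 if the output does not fit.
*/
size_t demo_wc_to_printable(demo_wc_mb_fn encode, unsigned long wc,
                            unsigned char *dst, size_t dstlen)
{
  static const char hex[]= "0123456789ABCDEF";
  unsigned long seq[5];
  size_t i, len, total= 0;
  assert(encode != NULL && dst != NULL);

  seq[0]= 0x5C;
  seq[1]= (unsigned long) hex[(wc >> 12) & 0xF];
  seq[2]= (unsigned long) hex[(wc >> 8) & 0xF];
  seq[3]= (unsigned long) hex[(wc >> 4) & 0xF];
  seq[4]= (unsigned long) hex[wc & 0xF];

  for (i= 0; i < 5; i++)
  {
    len= encode(seq[i], dst + total, dstlen - total);
    if (len == 0 && i == 0)
      len= encode('.', dst + total, dstlen - total);
    if (len == 0)
      return 0;
    total+= len;
  }
  return total;
}
```

Routing every output character through the encoder is what lets one function serve both fixed-width and variable-width character sets.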
This is a pre-requisite patch for:
- MDEV-8433 Make field<'broken-string' use indexes
- MDEV-8625 Bad result set with ignorable characters when using a prefix key
- MDEV-8626 Bad result set with contractions when using a prefix key
Adding a new virtual function MY_CHARSET_HANDLER::copy_abort().
Moving character set specific code into the corresponding implementations
(for simple, multi-byte and mbmaxlen>1 character sets).
sql/sql_insert.cc:
CREATE ... IF NOT EXISTS may do nothing, but
it is still not a failure. Don't forget to my_ok it.
sql/sql_table.cc:
small cleanup
Added strings_def.h to the strings library to be able to have a DBUG_ASSERT() version without the _db_flush() call (as strings.a should not depend on dbug.a)
Removed include of m_string.h in all string files (as it's included by strings_def.h).
Fixed include order.
Changed "m_ctype.h" -> <m_ctype.h>
include/my_dbug.h:
Flush DBUG log in case of DBUG_ASSERT()
strings/bchange.c:
Include strings_def.h
strings/bcmp.c:
Include strings_def.h
strings/bfill.c:
Include strings_def.h
strings/bmove.c:
Include strings_def.h
strings/bmove512.c:
Include strings_def.h
strings/bmove_upp.c:
Include strings_def.h
strings/conf_to_src.c:
Include strings_def.h
Fixed copyright
strings/ctype-big5.c:
Include strings_def.h
strings/ctype-bin.c:
Include strings_def.h
strings/ctype-cp932.c:
Include strings_def.h
strings/ctype-czech.c:
Include strings_def.h
strings/ctype-euc_kr.c:
Include strings_def.h
strings/ctype-eucjpms.c:
Include strings_def.h
strings/ctype-extra.c:
Include strings_def.h
strings/ctype-gbk.c:
Include strings_def.h
strings/ctype-latin1.c:
Include strings_def.h
strings/ctype-mb.c:
Include strings_def.h
strings/ctype-simple.c:
Include strings_def.h
strings/ctype-sjis.c:
Include strings_def.h
strings/ctype-tis620.c:
Include strings_def.h
strings/ctype-uca.c:
Include strings_def.h
strings/ctype-ucs2.c:
Include strings_def.h
strings/ctype-ujis.c:
Include strings_def.h
strings/ctype-utf8.c:
Include strings_def.h
strings/ctype-win1250ch.c:
Include strings_def.h
strings/ctype.c:
Include strings_def.h
strings/decimal.c:
Include strings_def.h
strings/do_ctype.c:
Include strings_def.h
strings/int2str.c:
Include strings_def.h
strings/is_prefix.c:
Include strings_def.h
strings/llstr.c:
Include strings_def.h
strings/longlong2str.c:
Include strings_def.h
strings/longlong2str_asm.c:
Include strings_def.h
strings/my_strchr.c:
Include strings_def.h
strings/my_strtoll10.c:
Include strings_def.h
strings/my_vsnprintf.c:
Include strings_def.h
strings/r_strinstr.c:
Include strings_def.h
strings/str2int.c:
Include strings_def.h
strings/str_alloc.c:
Include strings_def.h
strings/str_test.c:
Include strings_def.h
Fixed compiler warnings
strings/strappend.c:
Include strings_def.h
strings/strcend.c:
Include strings_def.h
strings/strcont.c:
Include strings_def.h
strings/strend.c:
Include strings_def.h
strings/strfill.c:
Include strings_def.h
strings/strinstr.c:
Include strings_def.h
strings/strmake.c:
Include strings_def.h
strings/strmov.c:
Include strings_def.h
strings/strmov_overlapp.c:
Include strings_def.h
strings/strnlen.c:
Include strings_def.h
strings/strnmov.c:
Include strings_def.h
strings/strstr.c:
Include strings_def.h
strings/strto.c:
Include strings_def.h
strings/strtod.c:
Include strings_def.h
strings/strtol.c:
Include strings_def.h
strings/strtoll.c:
Include strings_def.h
strings/strtoul.c:
Include strings_def.h
strings/strtoull.c:
Include strings_def.h
strings/strxmov.c:
Include strings_def.h
strings/strxnmov.c:
Include strings_def.h
strings/uctypedump.c:
Include strings_def.h
Fixed compiler warnings
Removed double include of m_ctype.h
strings/udiv.c:
Include strings_def.h
strings/xml.c:
Include strings_def.h
Problem: The functions my_like_range_xxx() returned
badly formed maximum strings for Asian character sets,
which caused problems for storage engines.
Fix:
- Removed a number of my_like_range_xxx() implementations,
which were in fact duplicate code pieces.
- Using generic my_like_range_mb() instead.
- Setting max_sort_char member properly for Asian character sets
- Adding unittest/strings/strings-t.c,
to test that my_like_range_xxx() return well-formed
min and max strings.
Notes:
- No additional tests in mysql/t/ available.
Old tests cover the affected code well enough.
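What "well-formed maximum string" means can be sketched as below. This is an illustration only, not my_like_range_mb(): demo_fill_max() is a hypothetical helper, the two-byte value used in the usage note is arbitrary, and padding the leftover byte with a single-byte character is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/*
  Fill dst with copies of max_sort_char without ever cutting a
  two-byte character in half. A value above 0xFF is treated as a
  two-byte character (high byte first). Bytes that cannot hold a
  whole character are filled with 'pad'.
  Returns the number of bytes occupied by whole max characters.
*/
size_t demo_fill_max(unsigned char *dst, size_t dstlen,
                     unsigned max_sort_char, unsigned char pad)
{
  unsigned char hi= (unsigned char) (max_sort_char >> 8);
  unsigned char lo= (unsigned char) (max_sort_char & 0xFF);
  size_t i= 0, whole;
  assert(dst != NULL);

  if (hi == 0)
  {
    /* Single-byte maximum: every position can hold it. */
    for (i= 0; i < dstlen; i++)
      dst[i]= lo;
    return dstlen;
  }
  while (dstlen - i >= 2)
  {
    dst[i++]= hi;
    dst[i++]= lo;
  }
  whole= i;
  while (i < dstlen)
    dst[i++]= pad;               /* never emit a lone head byte */
  return whole;
}
```

For a 5-byte buffer and a two-byte maximum such as 0xF9D5, the result is two whole characters followed by one pad byte, so a storage engine never sees a truncated character at the end of the key.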
Removed compiler warnings
extra/libevent/epoll.c:
Removed compiler warnings
extra/libevent/evbuffer.c:
Removed compiler warnings
extra/libevent/event.c:
Removed compiler warnings
extra/libevent/select.c:
Removed compiler warnings
extra/libevent/signal.c:
Removed compiler warnings
include/m_ctype.h:
Define CHARSET_INFO, MY_CHARSET_HANDLER, MY_COLLATION_HANDLER, MY_UNICASE_INFO, MY_UNI_CTYPE and MY_UNI_IDX as const structures.
Declare that pointers point to const data
include/m_string.h:
Declare that pointers point to const data
include/my_sys.h:
Redefine variables and function prototypes
include/mysql.h:
Declare charset as const
include/mysql.h.pp:
Declare charset as const
include/mysql/plugin.h:
Declare charset as const
include/mysql/plugin.h.pp:
Declare charset as const
mysys/charset-def.c:
Charset definitions can't be of type CHARSET_INFO as they are changed when they are initialized.
mysys/charset.c:
Functions that change CHARSET_INFO must use 'struct charset_info_st'
Add temporary variables to avoid having to change all_charsets[] (which is now const)
sql-common/client.c:
Added cast to const
sql/item_cmpfunc.h:
Added cast to avoid compiler error.
sql/sql_class.cc:
Added cast to const
sql/sql_lex.cc:
Added cast to const
storage/maria/ma_ft_boolean_search.c:
Added cast to avoid compiler error.
storage/maria/ma_ft_parser.c:
Added cast to avoid compiler error.
storage/maria/ma_search.c:
Added cast to const
storage/myisam/ft_boolean_search.c:
Added cast to avoid compiler error
storage/myisam/ft_parser.c:
Added cast to avoid compiler error
storage/myisam/mi_search.c:
Added cast to const
storage/pbxt/src/datadic_xt.cc:
Added cast to const
storage/pbxt/src/ha_pbxt.cc:
Added cast to const
Removed compiler warning by changing prototype of XTThreadPtr()
storage/pbxt/src/myxt_xt.h:
Character sets should be const
storage/pbxt/src/xt_defs.h:
Character sets should be const
storage/xtradb/btr/btr0cur.c:
Removed compiler warning
strings/conf_to_src.c:
Added const
Functions that change CHARSET_INFO must use 'struct charset_info_st'
strings/ctype-big5.c:
Made arrays const
strings/ctype-bin.c:
Made arrays const
strings/ctype-cp932.c:
Made arrays const
strings/ctype-czech.c:
Made arrays const
strings/ctype-euc_kr.c:
Made arrays const
strings/ctype-eucjpms.c:
Made arrays const
strings/ctype-extra.c:
Made arrays const
strings/ctype-gb2312.c:
Made arrays const
strings/ctype-gbk.c:
Made arrays const
strings/ctype-latin1.c:
Made arrays const
strings/ctype-mb.c:
Made arrays const
strings/ctype-simple.c:
Made arrays const
strings/ctype-sjis.c:
Made arrays const
strings/ctype-tis620.c:
Made arrays const
strings/ctype-uca.c:
Made arrays const
strings/ctype-ucs2.c:
Made arrays const
strings/ctype-ujis.c:
Made arrays const
strings/ctype-utf8.c:
Made arrays const
strings/ctype-win1250ch.c:
Made arrays const
strings/ctype.c:
Made arrays const
Added cast to const
Functions that change CHARSET_INFO must use 'struct charset_info_st'
strings/int2str.c:
Added cast to const
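The const-ification pattern described in the file notes above can be sketched like this. All names are hypothetical stand-ins (demo_charset_info_st, DEMO_CHARSET_INFO, demo_init_charset); the sketch only shows why code that modifies a character set has to use the plain struct type while everything else sees a const typedef.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_CS_READY 1u

struct demo_charset_info_st
{
  unsigned number;
  unsigned state;
  const char *csname;
};

/* Readers only ever see the const view. */
typedef const struct demo_charset_info_st DEMO_CHARSET_INFO;

/* The definition stays non-const because initialization writes to it. */
struct demo_charset_info_st demo_charset_latin1= { 8, 0, "latin1" };

/* Code that changes a character set takes the mutable struct type
   and hands back the const view. */
DEMO_CHARSET_INFO *demo_init_charset(struct demo_charset_info_st *cs)
{
  assert(cs != NULL);
  cs->state|= DEMO_CS_READY;
  return cs;                   /* implicit conversion to pointer-to-const */
}

int demo_is_ready(DEMO_CHARSET_INFO *cs)
{
  /* Writing through cs here would be rejected by the compiler. */
  return (cs->state & DEMO_CS_READY) != 0;
}
```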
into host.loc:/home/uchum/work/5.1-bugteam
mysql-test/r/ctype_gbk.result:
Auto merged
mysql-test/r/subselect3.result:
Auto merged
mysql-test/t/subselect3.test:
Auto merged
sql/sql_select.cc:
Auto merged
strings/ctype-big5.c:
Merge with 5.0-bugteam (bug#35993).
strings/ctype-gbk.c:
Merge with 5.0-bugteam (bug#35993).
Grouping or ordering of long values in non-indexed BLOB/TEXT columns
with GBK or BIG5 charsets crashes the server.
MySQL server uses sorting (the filesort procedure) in the temporary
table to evaluate the GROUP BY clause when no suitable index exists.
That procedure takes into account only the first @max_sort_length bytes
(system variable, usually 1024) of the TEXT/BLOB sorting key string.
The my_strnxfrm_gbk and my_strnxfrm_big5 functions filled temporary keys
with the whole blob instead of only @max_sort_length bytes.
That buffer overrun has been fixed.
mysql-test/r/ctype_gbk.result:
Added test case for bug #35993.
mysql-test/t/ctype_gbk.test:
Added test case for bug #35993.
strings/ctype-big5.c:
Fixed bug #35993: memory corruption and crash with multibyte conversion.
Buffer overrun has been fixed in the my_strnxfrm_big5 function.
strings/ctype-gbk.c:
Fixed bug #35993: memory corruption and crash with multibyte conversion.
Buffer overrun has been fixed in the my_strnxfrm_gbk function.
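The kind of bound check involved can be sketched as follows. This is not the my_strnxfrm_gbk or my_strnxfrm_big5 code: demo_strnxfrm() is a hypothetical toy in which any byte >= 0x81 followed by another byte is treated as a two-byte character, the "weight" of an ASCII letter is its upper-case form, and the key is padded with spaces.

```c
#include <assert.h>
#include <stddef.h>

/*
  Build a sort key of exactly dstlen bytes. Both the source end and the
  destination end are checked on every step, so a source longer than
  the key buffer can never write past dst + dstlen.
*/
size_t demo_strnxfrm(unsigned char *dst, size_t dstlen,
                     const unsigned char *src, size_t srclen)
{
  unsigned char *d= dst;
  unsigned char *de= dst + dstlen;
  const unsigned char *se= src + srclen;
  assert(dst != NULL && src != NULL);

  while (src < se && d < de)
  {
    if (*src >= 0x81 && se - src >= 2)
    {
      if (de - d < 2)
        break;                   /* no room for a whole character */
      *d++= *src++;
      *d++= *src++;
    }
    else
    {
      unsigned char c= *src++;
      *d++= (unsigned char) ((c >= 'a' && c <= 'z') ? c - 32 : c);
    }
  }
  while (d < de)
    *d++= ' ';                   /* pad the rest of the key */
  return dstlen;
}
```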
into mysql.com:/home/ram/work/b31070/b31070.5.1
strings/ctype-big5.c:
Auto merged
strings/ctype-euc_kr.c:
Auto merged
strings/ctype-gb2312.c:
Auto merged
strings/ctype-sjis.c:
Auto merged
into mysql.com:/home/ram/work/b31070/b31070.5.0
mysql-test/r/ctype_big5.result:
Auto merged
mysql-test/r/ctype_gbk.result:
Auto merged
mysql-test/r/ctype_uca.result:
Auto merged
strings/ctype-big5.c:
Auto merged
strings/ctype-euc_kr.c:
Auto merged
strings/ctype-gb2312.c:
Auto merged
strings/ctype-sjis.c:
Auto merged
BitKeeper/deleted/.del-ctype-cp932.c:
Auto merged
Fix for bug #31069: crash in 'sounds like'
and for bug #31070: crash during conversion of charsets
Problem: passing a zero-length string to some my_mb_wc_XXX()
functions leads to a server crash due to an improper argument check.
Fix: properly check arguments passed to my_mb_wc_XXX() functions.
mysql-test/include/ctype_common.inc:
Fix for bug #31069: crash in 'sounds like'
and bug #31070: crash during conversion of charsets
- test case.
mysql-test/r/ctype_big5.result:
Fix for bug #31069: crash in 'sounds like'
and bug #31070: crash during conversion of charsets
- test result.
mysql-test/r/ctype_euckr.result:
Fix for bug #31069: crash in 'sounds like'
and bug #31070: crash during conversion of charsets
- test result.
mysql-test/r/ctype_gb2312.result:
Fix for bug #31069: crash in 'sounds like'
and bug #31070: crash during conversion of charsets
- test result.
mysql-test/r/ctype_gbk.result:
Fix for bug #31069: crash in 'sounds like'
and bug #31070: crash during conversion of charsets
- test result.
mysql-test/r/ctype_uca.result:
Fix for bug #31069: crash in 'sounds like'
and bug #31070: crash during conversion of charsets
- test result.
strings/ctype-big5.c:
Fix for bug #31069: crash in 'sounds like'
and bug #31070: crash during conversion of charsets
- check the string length before testing its first byte.
strings/ctype-cp932.c:
Fix for bug #31069: crash in 'sounds like'
and bug #31070: crash during conversion of charsets
- check the string length before testing its first byte.
strings/ctype-euc_kr.c:
Fix for bug #31069: crash in 'sounds like'
and bug #31070: crash during conversion of charsets
- check the string length before testing its first byte.
strings/ctype-gb2312.c:
Fix for bug #31069: crash in 'sounds like'
and bug #31070: crash during conversion of charsets
- check the string length before testing its first byte.
strings/ctype-sjis.c:
Fix for bug #31069: crash in 'sounds like'
and bug #31070: crash during conversion of charsets
- check the string length before testing its first byte.
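"Check the string length before testing its first byte" can be sketched as below. The function and the DEMO_ return codes are hypothetical and the decoding is a toy (any byte >= 0x80 starts a two-byte character, with no trail byte validation); only the order of the checks is the point.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_CS_TOOSMALL  (-101)   /* need at least one more byte  */
#define DEMO_CS_TOOSMALL2 (-102)   /* need at least two more bytes */

/*
  Decode one character from [s, e). Returns the number of bytes
  consumed, or a negative DEMO_CS_ code if the input is too short.
*/
int demo_mb_wc(unsigned long *pwc, const unsigned char *s,
               const unsigned char *e)
{
  unsigned char hi;
  assert(pwc != NULL);

  if (s >= e)
    return DEMO_CS_TOOSMALL;     /* must come before reading s[0] */

  hi= s[0];
  if (hi < 0x80)
  {
    *pwc= hi;
    return 1;
  }
  if (e - s < 2)
    return DEMO_CS_TOOSMALL2;    /* head byte without a trail byte */

  *pwc= ((unsigned long) hi << 8) | s[1];
  return 2;
}
```

With the length test first, a zero-length input returns an error code instead of dereferencing a pointer that may not be readable.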
into maint1.mysql.com:/data/localhome/tsmith/bk/maint/51
configure.in:
Auto merged
include/m_ctype.h:
Auto merged
mysql-test/Makefile.am:
Auto merged
mysql-test/t/innodb.test:
Auto merged
mysys/charset-def.c:
Auto merged
mysys/charset.c:
Auto merged
sql/log_event.cc:
Auto merged
sql/sql_acl.cc:
Auto merged
sql/sql_class.cc:
Auto merged
strings/ctype-big5.c:
Auto merged
strings/ctype-gbk.c:
Auto merged
strings/ctype-sjis.c:
Auto merged
strings/ctype-uca.c:
Auto merged
strings/ctype.c:
Auto merged
mysql-test/r/innodb.result:
Manual merge
mysql-test/r/multi_update.result:
Manual merge
mysql-test/t/multi_update.test:
Manual merge
sql/sql_update.cc:
SCCS merged
Problem: "SELECT INTO OUTFILE" created incorrect dumps for BLOBs,
so "LOAD DATA" later incorrectly interpreted 0x5C as the second
byte of a multi-byte sequence, instead of as an escape character.
Fix: adding escaping of multi-byte heads.
mysql-test/r/ctype_big5.result:
Adding test case
mysql-test/t/ctype_big5.test:
Adding test case
sql/sql_class.cc:
Add escape characters before multi-byte heads.
strings/ctype-big5.c:
Flagging character set as dangerous for escaping.
strings/ctype-gbk.c:
Flagging character set as dangerous for escaping.
strings/ctype-sjis.c:
Flagging character set as dangerous for escaping.
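One plausible reading of "escaping of multi-byte heads" is sketched below. It is not the sql_class.cc code: demo_escape_blob() is hypothetical, any byte >= 0x80 is treated as a possible head byte, and a head is escaped only when the next byte is the escape character 0x5C. These rules are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_ESCAPE 0x5C

/*
  Copy binary data to dst, writing the escape character before
    - every escape character, and
    - every possible head byte that is directly followed by an
      escape character,
  so that a reader which decodes multi-byte characters cannot consume
  0x5C as a trail byte. Returns the number of bytes written, or 0 if
  dst is too small (an empty input also returns 0).
*/
size_t demo_escape_blob(const unsigned char *src, size_t srclen,
                        unsigned char *dst, size_t dstlen)
{
  size_t i, o= 0;
  assert(src != NULL && dst != NULL);

  for (i= 0; i < srclen; i++)
  {
    int is_head= src[i] >= 0x80;
    int need_escape= src[i] == DEMO_ESCAPE ||
                     (is_head && i + 1 < srclen &&
                      src[i + 1] == DEMO_ESCAPE);
    if (dstlen - o < (size_t) (need_escape ? 2 : 1))
      return 0;
    if (need_escape)
      dst[o++]= DEMO_ESCAPE;
    dst[o++]= src[i];
  }
  return o;
}
```

For the input bytes A4 5C 78 the sketch writes 5C A4 5C 5C 78. A reader that sees the leading escape takes A4 literally and then reads 5C 5C as one escaped backslash, instead of pairing A4 with the first 5C.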