Commit graph

158 commits

Author SHA1 Message Date
Oleksandr Byelkin
f5c5f8e41e Merge branch '10.5' into 10.6 2022-02-03 17:01:31 +01:00
Oleksandr Byelkin
cf63eecef4 Merge branch '10.4' into 10.5 2022-02-01 20:33:04 +01:00
Oleksandr Byelkin
a576a1cea5 Merge branch '10.3' into 10.4 2022-01-30 09:46:52 +01:00
Alexander Barkov
b915f79e4e MDEV-25904 New collation functions to compare InnoDB style trimmed NO PAD strings 2022-01-21 12:16:07 +04:00
Vladislav Vaintroub
47e18af906 MDEV-27494 Rename .ic files to .inl 2022-01-17 16:41:51 +01:00
Alexander Barkov
0d68b0a2d6 MDEV-26669 Add MY_COLLATION_HANDLER functions min_str() and max_str() 2021-09-27 17:10:22 +04:00
Monty
a206658b98 Change CHARSET_INFO character set and collaction names to LEX_CSTRING
This change removed 68 explict strlen() calls from the code.

The following renames was done to ensure we don't use the old names
when merging code from earlier releases, as using the new variables
for print function could result in crashes:
- charset->csname renamed to charset->cs_name
- charset->name renamed to charset->coll_name

Almost everything where mechanical changes except:
- Changed to use the new Protocol::store(LEX_CSTRING..) when possible
- Changed to use field->store(LEX_CSTRING*, CHARSET_INFO*) when possible
- Changed to use String->append(LEX_CSTRING&) when possible

Other things:
- There where compiler issues with ensuring that all character set names
  points to the same string: gcc doesn't allow one to use integer constants
  when defining global structures (constant char * pointers works fine).
  To get around this, I declared defines for each character set name
  length.
2021-05-19 22:54:07 +02:00
Monty
dbcd3384e0 MDEV-7947 strcmp() takes 0.37% in OLTP RO
This patch ensures that all identical character sets shares the same
cs->csname.
This allows us to replace strcmp() in my_charset_same() with comparisons
of pointers. This fixes a long standing performance issue that could cause
as strcmp() for every item sent trough the protocol class to the end user.

One consequence of this patch is that we don't allow one to add a character
definition in the Index.xml file that changes the csname of an existing
character set. This is by design as changing character set names of existing
ones is extremely dangerous, especially as some storage engines just records
character set numbers.

As we now have a hash over character set's csname, we can in the future
use that for faster access to a specific character set. This could be done
by changing the hash to non unique and use the hash to find the next
character set with same csname.
2020-07-23 10:54:33 +03:00
Alexander Barkov
cfe5ee90c8 MDEV-22043 Special character leads to assertion in my_wc_to_printable_generic on 10.5.2 (debug)
The code did not take into account that:
- U+005C (backslash) can occupy more than mbminlen characters (e.g. in sjis)
- Some character sets do not have a code for U+005C (e.g. swe7)

Adding a new function my_wc_to_printable into MY_CHARSET_HANDLER to
cover all special cases easier.
2020-05-09 16:01:30 +04:00
Marko Mäkelä
26a14ee130 Merge 10.1 into 10.2 2019-05-13 17:54:04 +03:00
Vicențiu Ciorbaru
cb248f8806 Merge branch '5.5' into 10.1 2019-05-11 22:19:05 +03:00
Vicențiu Ciorbaru
5543b75550 Update FSF Address
* Update wrong zip-code
2019-05-11 21:29:06 +03:00
Alexander Barkov
5058ced5df MDEV-7769 MY_CHARSET_INFO refactoring# On branch 10.2
Part 3 (final): removing MY_CHARSET_HANDLER::well_formed_len().
2016-10-10 14:36:09 +04:00
Alexander Barkov
ee19806b8e MDEV-9711 NO PAD collations
Based on the patch from Daniil Medvedev (a Google Summer of Code task)
2016-09-06 12:50:02 +04:00
Alexander Barkov
e7ff281d2e MDEV-6353 my_ismbchar() and my_mbcharlen() refactoring 2016-05-17 15:27:10 +04:00
Alexander Barkov
e09299511e MDEV-9665 Remove cs->cset->ismbchar()
Using a more powerfull cs->cset->charlen() instead.
2016-03-16 10:55:12 +04:00
Alexander Barkov
78b80cb6ba Adding MY_CHARSET_HANDLER::native_to_mb().
This is a pre-requisite patch for:
- MDEV-8433 Make field<'broken-string' use indexes
- MDEV-8625 Bad result set with ignorable characters when using a prefix key
- MDEV-8626 Bad result set with contractions when using a prefix key
2015-08-14 18:34:41 +04:00
Alexander Barkov
4f828a1cac MDEV-8214 Asian MB2 charsets: compare broken bytes as "greater than any non-broken character" 2015-06-26 13:40:28 +04:00
Alexander Barkov
197afb413f MDEV-6566 Different INSERT behaviour on bad bytes with and without character set conversion 2015-03-13 16:51:36 +04:00
Alexander Barkov
a7ed8523e3 Adding a shared include file ctype-mb.ic and removing a number
of very similar copies of my_well_formed_len_xxx(), implemented
for big5, cp932, euckr, eucjpms, gb2312m gbk, sjis, ujis.
2015-03-04 09:16:43 +04:00
Alexander Barkov
b1b6101af2 A preparatory patch for MDEV-6566.
Adding a new virtual function MY_CHARSET_HANDLER::copy_abort().
Moving character set specific code into the correspoding implementations
(for simple, multi-byte and mbmaxlen>1 character sets).
2015-03-02 18:24:22 +04:00
Sergei Golubchik
2ae7541bcf cleanup: s/const CHARSET_INFO/CHARSET_INFO/
as CHARSET_INFO is already const, using const on it
is redundant and results in compiler warnings (on Windows)
2014-12-04 10:41:51 +01:00
Alexander Barkov
426d246f5b MDEV-5163 Merge WEIGHT_STRING function from MySQL-5.6 2013-10-23 20:25:52 +04:00
Alexander Barkov
0b6c4bb34f MDEV-4928 Merge collation customization improvements
Merging the following MySQL-5.6 changes:
- WL#5624: Collation customization improvements
  http://dev.mysql.com/worklog/task/?id=5624

- WL#4013: Unicode german2 collation
  http://dev.mysql.com/worklog/task/?id=4013

- Bug#62429 XML: ExtractValue, UpdateXML max arg length 127 chars
  http://bugs.mysql.com/bug.php?id=62429
  (required by WL#5624)
2013-10-02 15:04:07 +04:00
Sergei Golubchik
4f435bddfd 5.3 merge 2012-01-13 15:50:02 +01:00
Michael Widenius
6920457142 Merge with MariaDB 5.1 2011-11-24 18:48:58 +02:00
Michael Widenius
a8d03ab235 Initail merge with MySQL 5.1 (XtraDB still needs to be merged)
Fixed up copyright messages.
2011-11-21 19:13:14 +02:00
Sergei Golubchik
0e007344ea mysql-5.5.18 merge 2011-11-03 19:17:05 +01:00
Sergei Golubchik
76f0b94bb0 merge with 5.3
sql/sql_insert.cc:
  CREATE ... IF NOT EXISTS may do nothing, but
  it is still not a failure. don't forget to my_ok it.
  ******
  CREATE ... IF NOT EXISTS may do nothing, but
  it is still not a failure. don't forget to my_ok it.
sql/sql_table.cc:
  small cleanup
  ******
  small cleanup
2011-10-19 21:45:18 +02:00
Sergei Golubchik
9809f05199 5.5-merge 2011-07-02 22:08:51 +02:00
Kent Boortz
68f00a5686 Updated/added copyright headers 2011-06-30 17:37:13 +02:00
Kent Boortz
02e07e3b51 Updated/added copyright headers 2011-06-30 17:46:53 +02:00
Michael Widenius
1be5462d59 Merge with MariaDB 5.1 2011-05-03 19:10:10 +03:00
Michael Widenius
e415ba0fb2 Merge with MySQL 5.1.57/58
Moved some BSD string functions from Unireg
2011-05-02 20:58:45 +03:00
Michael Widenius
3358cdd504 Merge with 5.1 to get in changes from MySQL 5.1.55 2011-02-28 19:39:30 +02:00
Michael Widenius
785695e7c3 Flush DBUG log in case of DBUG_ASSERT()
Added strings_def.h into strings library to be able to have a DBUG_ASSERT() version without _db_flush() call (as strings.a should not depend on dbug.a)
Remove include of m_string.h in all string files (as it's included by string_def.h).
Fixed include order.
Changed "m_ctype.h" -> <m_ctype.h>

 

include/my_dbug.h:
  Flush DBUG log in case of DBUG_ASSERT()
strings/bchange.c:
  Include strings_def.h
strings/bcmp.c:
  Include strings_def.h
strings/bfill.c:
  Include strings_def.h
strings/bmove.c:
  Include strings_def.h
strings/bmove512.c:
  Include strings_def.h
strings/bmove_upp.c:
  Include strings_def.h
strings/conf_to_src.c:
  Include strings_def.h
  Fixed copyright
strings/ctype-big5.c:
  Include strings_def.h
strings/ctype-bin.c:
  Include strings_def.h
strings/ctype-cp932.c:
  Include strings_def.h
strings/ctype-czech.c:
  Include strings_def.h
strings/ctype-euc_kr.c:
  Include strings_def.h
strings/ctype-eucjpms.c:
  Include strings_def.h
strings/ctype-extra.c:
  Include strings_def.h
strings/ctype-gbk.c:
  Include strings_def.h
strings/ctype-latin1.c:
  Include strings_def.h
strings/ctype-mb.c:
  Include strings_def.h
strings/ctype-simple.c:
  Include strings_def.h
strings/ctype-sjis.c:
  Include strings_def.h
strings/ctype-tis620.c:
  Include strings_def.h
strings/ctype-uca.c:
  Include strings_def.h
strings/ctype-ucs2.c:
  Include strings_def.h
strings/ctype-ujis.c:
  Include strings_def.h
strings/ctype-utf8.c:
  Include strings_def.h
strings/ctype-win1250ch.c:
  Include strings_def.h
strings/ctype.c:
  Include strings_def.h
strings/decimal.c:
  Include strings_def.h
strings/do_ctype.c:
  Include strings_def.h
strings/int2str.c:
  Include strings_def.h
strings/is_prefix.c:
  Include strings_def.h
strings/llstr.c:
  Include strings_def.h
strings/longlong2str.c:
  Include strings_def.h
strings/longlong2str_asm.c:
  Include strings_def.h
strings/my_strchr.c:
  Include strings_def.h
strings/my_strtoll10.c:
  Include strings_def.h
strings/my_vsnprintf.c:
  Include strings_def.h
strings/r_strinstr.c:
  Include strings_def.h
strings/str2int.c:
  Include strings_def.h
strings/str_alloc.c:
  Include strings_def.h
strings/str_test.c:
  Include strings_def.h
  Fixed compiler warnings
strings/strappend.c:
  Include strings_def.h
strings/strcend.c:
  Include strings_def.h
strings/strcont.c:
  Include strings_def.h
strings/strend.c:
  Include strings_def.h
strings/strfill.c:
  Include strings_def.h
strings/strinstr.c:
  Include strings_def.h
strings/strmake.c:
  Include strings_def.h
strings/strmov.c:
  Include strings_def.h
strings/strmov_overlapp.c:
  Include strings_def.h
strings/strnlen.c:
  Include strings_def.h
strings/strnmov.c:
  Include strings_def.h
strings/strstr.c:
  Include strings_def.h
strings/strto.c:
  Include strings_def.h
strings/strtod.c:
  Include strings_def.h
strings/strtol.c:
  Include strings_def.h
strings/strtoll.c:
  Include strings_def.h
strings/strtoul.c:
  Include strings_def.h
strings/strtoull.c:
  Include strings_def.h
strings/strxmov.c:
  Include strings_def.h
strings/strxnmov.c:
  Include strings_def.h
strings/uctypedump.c:
  Include strings_def.h
  Fixed compiler warnings
  Removed double include of m_ctype.h
strings/udiv.c:
  Include strings_def.h
strings/xml.c:
  Include strings_def.h
2011-01-30 12:41:44 +02:00
Alexander Barkov
435289acd4 Updating Copyright information 2011-01-19 16:17:52 +03:00
Alexander Barkov
dfb7930b33 Merging Copyright update from 5.1 2011-01-19 16:31:17 +03:00
Sergei Golubchik
65ca700def merge.
checkpoint.
does not compile.
2010-11-25 18:17:28 +01:00
Sergei Golubchik
a3d80d952d merge with 5.1 2010-09-11 20:43:48 +02:00
Michael Widenius
ad6d95d3cb Merge with MySQL 5.1.50
- Changed to still use bcmp() in certain cases becasue
  - Faster for short unaligneed strings than memcmp()
  - Bettern when using valgrind
- Changed to use my_sprintf() instead of sprintf() to get higher portability for old systems
- Changed code to use MariaDB version of select->skip_record()
- Removed -%::SCCS/s.% from Makefile.am:s to remove automake warnings
2010-08-27 17:12:44 +03:00
Alexander Barkov
fc74792a6b Merging from mysql-5.1-bugteam 2010-07-26 10:47:03 +04:00
Alexander Barkov
e57a9d6fe0 Bug#45012 my_like_range_cp932 generates invalid string
Problem: The functions my_like_range_xxx() returned
badly formed maximum strings for Asian character sets,
which made problems for storage engines.

Fix: 
- Removed a number my_like_range_xxx() implementations,
  which were in fact dumplicate code pieces.
- Using generic my_like_range_mb() instead.
- Setting max_sort_char member properly for Asian character sets
- Adding unittest/strings/strings-t.c, 
  to test that my_like_range_xxx() return well-formed 
  min and max strings.

Notes:

- No additional tests in mysql/t/ available.
  Old tests cover the affected code well enough.
2010-07-26 09:06:18 +04:00
Davi Arnaut
13f7a1d244 WL#5486: Remove code for unsupported platforms
Remove MS-DOS specific code.
2010-07-15 08:16:06 -03:00
Alexander Barkov
4e4122b999 WL#3090 Japanese Character Set adjustments
added:
  @ mysql-test/include/ctype_utf8_table.inc
    Adding a share file to populate all utf8 values [U+0000..U+FFFF]
modified:
  @ include/m_ctype.h
    Introducing MB2 and MY_PUT_MB2 macros

  @ mysql-test/r/ctype_cp932_binlog_stm.result
  @ mysql-test/r/ctype_eucjpms.result
  @ mysql-test/r/ctype_sjis.result
  @ mysql-test/r/ctype_ujis.result
  @ mysql-test/t/ctype_cp932_binlog_stm.test
  @ mysql-test/t/ctype_eucjpms.test
  @ mysql-test/t/ctype_sjis.test
  @ mysql-test/t/ctype_ujis.test
    Adding test

  @ strings/ctype-cp932.c
  @ strings/ctype-eucjpms.c
  @ strings/ctype-sjis.c
  @ strings/ctype-ujis.c
    Adding new functions using Big-Table approach.
2010-02-15 09:57:24 +04:00
Alexander Barkov
8dfc3fbbab WL#4583 Case conversion in Asian character sets
modified:
  include/m_ctype.h
  - Changing type for tolower/toupper members, to store values >= 0xFFFF.
  - Adding function prototypes

  mysql-test/r/ctype_big5.result
  mysql-test/r/ctype_cp932_binlog_stm.result
  mysql-test/r/ctype_eucjpms.result*
  mysql-test/r/ctype_euckr.result
  mysql-test/r/ctype_gb2312.result
  mysql-test/r/ctype_gbk.result
  mysql-test/r/ctype_sjis.result
  mysql-test/r/ctype_ujis.result
  mysql-test/t/ctype_big5.test
  mysql-test/t/ctype_cp932_binlog_stm.test
  mysql-test/t/ctype_eucjpms.test
  mysql-test/t/ctype_euckr.test
  mysql-test/t/ctype_gb2312.test
  mysql-test/t/ctype_gbk.test
  mysql-test/t/ctype_sjis.test
  mysql-test/t/ctype_ujis.test
  -  Adding tests

  strings/ctype-big5.c
  strings/ctype-cp932.c
  strings/ctype-euc_kr.c
  strings/ctype-eucjpms.c
  strings/ctype-gb2312.c
  strings/ctype-gbk.c
  strings/ctype-sjis.c
  - Adding upper/lower case conversion data

  strings/ctype-mb.c
  - Adding handling of upper/lower conversion for multi-byte characters.

  strings/ctype-ujis.c
  - Implementing shared upper/lower conversion
    functions  for ujis and eucjpms
  - Adding upper/lower case conversion data for ujis
2010-01-14 15:17:57 +04:00
Michael Widenius
f83113df07 Applied Antony T Curtis patch for declaring many CHARSET objects as const
Removed compiler warnings

extra/libevent/epoll.c:
  Removed compiler warnings
extra/libevent/evbuffer.c:
  Removed compiler warnings
extra/libevent/event.c:
  Removed compiler warnings
extra/libevent/select.c:
  Removed compiler warnings
extra/libevent/signal.c:
  Removed compiler warnings
include/m_ctype.h:
  Define CHARSET_INFO, MY_CHARSET_HANDLER, MY_COLLATION_HANDLER, MY_UNICASE_INFO, MY_UNI_CTYPE and MY_UNI_IDX as const structures.
  Declare that pointers point to const data
include/m_string.h:
  Declare that pointers point to const data
include/my_sys.h:
  Redefine variables and function prototypes
include/mysql.h:
  Declare charset as const
include/mysql.h.pp:
  Declare charset as const
include/mysql/plugin.h:
  Declare charset as const
include/mysql/plugin.h.pp:
  Declare charset as const
mysys/charset-def.c:
  Charset can't be of type CHARSET_INFO as they are changed when they are initialized.
mysys/charset.c:
  Functions that change CHARSET_INFO must use 'struct charset_info_st'
  Add temporary variables to not have to change all_charsets[] (Which now is const)
sql-common/client.c:
  Added cast to const
sql/item_cmpfunc.h:
  Added cast to avoid compiler error.
sql/sql_class.cc:
  Added cast to const
sql/sql_lex.cc:
  Added cast to const
storage/maria/ma_ft_boolean_search.c:
  Added cast to avoid compiler error.
storage/maria/ma_ft_parser.c:
  Added cast to avoid compiler error.
storage/maria/ma_search.c:
  Added cast to const
storage/myisam/ft_boolean_search.c:
  Added cast to avoid compiler error
storage/myisam/ft_parser.c:
  Added cast to avoid compiler error
storage/myisam/mi_search.c:
  Added cast to const
storage/pbxt/src/datadic_xt.cc:
  Added cast to const
storage/pbxt/src/ha_pbxt.cc:
  Added cast to const
  Removed compiler warning by changing prototype of XTThreadPtr()
storage/pbxt/src/myxt_xt.h:
  Character sets should be const
storage/pbxt/src/xt_defs.h:
  Character sets should be const
storage/xtradb/btr/btr0cur.c:
  Removed compiler warning
strings/conf_to_src.c:
  Added const
  Functions that change CHARSET_INFO must use 'struct charset_info_st'
strings/ctype-big5.c:
  Made arrays const
strings/ctype-bin.c:
  Made arrays const
strings/ctype-cp932.c:
  Made arrays const
strings/ctype-czech.c:
  Made arrays const
strings/ctype-euc_kr.c:
  Made arrays const
strings/ctype-eucjpms.c:
  Made arrays const
strings/ctype-extra.c:
  Made arrays const
strings/ctype-gb2312.c:
  Made arrays const
strings/ctype-gbk.c:
  Made arrays const
strings/ctype-latin1.c:
  Made arrays const
strings/ctype-mb.c:
  Made arrays const
strings/ctype-simple.c:
  Made arrays const
strings/ctype-sjis.c:
  Made arrays const
strings/ctype-tis620.c:
  Made arrays const
strings/ctype-uca.c:
  Made arrays const
strings/ctype-ucs2.c:
  Made arrays const
strings/ctype-ujis.c:
  Made arrays const
strings/ctype-utf8.c:
  Made arrays const
strings/ctype-win1250ch.c:
  Made arrays const
strings/ctype.c:
  Made arrays const
  Added cast to const
  Functions that change CHARSET_INFO must use 'struct charset_info_st'
strings/int2str.c:
  Added cast to const
2010-01-06 21:20:16 +02:00
Alexander Barkov
78c65d561c After-fix for back-port of WL#3759. 2009-10-01 12:22:31 +05:00
Sergey Petrunya
29f0dcb563 Merge MySQL->MariaDB
* Finished Monty and Jani's merge
* Some InnoDB tests still fail (because it's old xtradb code run against
  newer testsuite). They are expected to go after mergning with the latest
  xtradb.
2009-09-08 00:50:10 +04:00
Michael Widenius
9db357e2bf Added MY_CS_NONASCII marker for character sets that are not compatible with latin1 for characters 0x00-0x7f
This allows us to skip and speed up some very common character converts that MySQL is doing when sending data to the client
and this gives us a nice speed increase for most queries that uses only characters in the range 0x00-0x7f.

This code is based on Alexander Barkov's code that he has done in MySQL 6.0


include/m_ctype.h:
  Added MY_CS_NONASCII marker
libmysqld/lib_sql.cc:
  Added function net_store_data(...) that takes to and from CHARSET_INFO * as arguments
mysys/charset.c:
  Mark character sets with MY_CS_NONASCII
scripts/mysql_install_db.sh:
  Fixed messages to refer to MariaDB instead of MySQL
sql/protocol.cc:
  Added function net_store_data(...) that takes to and from CHARSET_INFO * as arguments
sql/protocol.h:
  Added function net_store_data(...) that takes to and from CHARSET_INFO * as arguments
sql/sql_string.cc:
  Quicker copy of strings with no characters above 0x7f
strings/conf_to_src.c:
  Added printing of MY_CS_NONASCII
strings/ctype-extra.c:
  Mark incompatible character sets with MY_CS_NONASCII
  Removed duplicated character set geostd
strings/ctype-sjis.c:
  Mark incompatible character sets with MY_CS_NONASCII
strings/ctype-uca.c:
  Mark incompatible character sets with MY_CS_NONASCII
strings/ctype-ucs2.c:
  Mark incompatible character sets with MY_CS_NONASCII
strings/ctype-utf8.c:
  Mark incompatible character sets with MY_CS_NONASCII
strings/ctype.c:
  Added function to check if character set is compatible with latin1 in ranges 0x00-0x7f
2009-07-02 13:15:33 +03:00