mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-28 17:54:16 +01:00

Author	SHA1	Message	Date
Alexander Barkov	133446828c	MDEV-27009 Add UCA-14.0.0 collations - Added one neutral and 22 tailored (language specific) collations based on Unicode Collation Algorithm version 14.0.0. Collations were added for Unicode character sets utf8mb3, utf8mb4, ucs2, utf16, utf32. Every tailoring was added with four accent and case sensitivity flag combinations, e.g: * utf8mb4_uca1400_swedish_as_cs * utf8mb4_uca1400_swedish_as_ci * utf8mb4_uca1400_swedish_ai_cs * utf8mb4_uca1400_swedish_ai_ci and their _nopad_ variants: * utf8mb4_uca1400_swedish_nopad_as_cs * utf8mb4_uca1400_swedish_nopad_as_ci * utf8mb4_uca1400_swedish_nopad_ai_cs * utf8mb4_uca1400_swedish_nopad_ai_ci - Introducing a conception of contextually typed named collations: CREATE DATABASE db1 CHARACTER SET utf8mb4; CREATE TABLE db1.t1 (a CHAR(10) COLLATE uca1400_as_ci); The idea is that there is no a need to specify the character set prefix in the new collation names. It's enough to type just the suffix "uca1400_as_ci". The character set is taken from the context. In the above example script the context character set is utf8mb4. So the CREATE TABLE will make a column with the collation utf8mb4_uca1400_as_ci. Short collations names can be used in any parts of the SQL syntax where the COLLATE clause is understood. - New collations are displayed only one time (without character set combinations) by these statements: SELECT * FROM INFORMATION_SCHEMA.COLLATIONS; SHOW COLLATION; For example, all these collations: - utf8mb3_uca1400_swedish_as_ci - utf8mb4_uca1400_swedish_as_ci - ucs2_uca1400_swedish_as_ci - utf16_uca1400_swedish_as_ci - utf32_uca1400_swedish_as_ci have just one entry in INFORMATION_SCHEMA.COLLATIONS and SHOW COLLATION, with COLLATION_NAME equal to "uca1400_swedish_as_ci", which is the suffix without the character set name: SELECT COLLATION_NAME FROM INFORMATION_SCHEMA.COLLATIONS WHERE COLLATION_NAME LIKE '%uca1400_swedish_as_ci'; +-----------------------+ \| COLLATION_NAME \| +-----------------------+ \| uca1400_swedish_as_ci \| +-----------------------+ Note, the behaviour of old collations did not change. Non-unicode collations (e.g. latin1_swedish_ci) and old UCA-4.0.0 collations (e.g. utf8mb4_unicode_ci) are still displayed with the character set prefix, as before. - The structure of the table INFORMATION_SCHEMA.COLLATIONS was changed. The NOT NULL constraint was removed from these columns: - CHARACTER_SET_NAME - ID - IS_DEFAULT and from the corresponding columns in SHOW COLLATION. For example: SELECT COLLATION_NAME, CHARACTER_SET_NAME, ID, IS_DEFAULT FROM INFORMATION_SCHEMA.COLLATIONS WHERE COLLATION_NAME LIKE '%uca1400_swedish_as_ci'; +-----------------------+--------------------+------+------------+ \| COLLATION_NAME \| CHARACTER_SET_NAME \| ID \| IS_DEFAULT \| +-----------------------+--------------------+------+------------+ \| uca1400_swedish_as_ci \| NULL \| NULL \| NULL \| +-----------------------+--------------------+------+------------+ The NULL value in these columns now means that the collation is applicable to multiple character sets. The behavioir of old collations did not change. Make sure your client programs can handle NULL values in these columns. - The structure of the table INFORMATION_SCHEMA.COLLATION_CHARACTER_SET_APPLICABILITY was changed. Three new NOT NULL columns were added: - FULL_COLLATION_NAME - ID - IS_DEFAULT New collations have multiple entries in COLLATION_CHARACTER_SET_APPLICABILITY. The column COLLATION_NAME contains the collation name without the character set prefix. The column FULL_COLLATION_NAME contains the collation name with the character set prefix. Old collations have full collation name in both FULL_COLLATION_NAME and COLLATION_NAME. SELECT COLLATION_NAME, FULL_COLLATION_NAME, CHARACTER_SET_NAME, ID, IS_DEFAULT FROM INFORMATION_SCHEMA.COLLATION_CHARACTER_SET_APPLICABILITY WHERE FULL_COLLATION_NAME RLIKE '^(utf8mb4\|latin1).swedish.ci$'; +-----------------------------+-------------------------------------+--------------------+------+------------+ \| COLLATION_NAME \| FULL_COLLATION_NAME \| CHARACTER_SET_NAME \| ID \| IS_DEFAULT \| +-----------------------------+-------------------------------------+--------------------+------+------------+ \| latin1_swedish_ci \| latin1_swedish_ci \| latin1 \| 8 \| Yes \| \| latin1_swedish_nopad_ci \| latin1_swedish_nopad_ci \| latin1 \| 1032 \| \| \| utf8mb4_swedish_ci \| utf8mb4_swedish_ci \| utf8mb4 \| 232 \| \| \| uca1400_swedish_ai_ci \| utf8mb4_uca1400_swedish_ai_ci \| utf8mb4 \| 2368 \| \| \| uca1400_swedish_as_ci \| utf8mb4_uca1400_swedish_as_ci \| utf8mb4 \| 2370 \| \| \| uca1400_swedish_nopad_ai_ci \| utf8mb4_uca1400_swedish_nopad_ai_ci \| utf8mb4 \| 2372 \| \| \| uca1400_swedish_nopad_as_ci \| utf8mb4_uca1400_swedish_nopad_as_ci \| utf8mb4 \| 2374 \| \| +-----------------------------+-------------------------------------+--------------------+------+------------+ - Other INFORMATION_SCHEMA queries: SELECT COLLATION_NAME FROM INFORMATION_SCHEMA.COLUMNS; SELECT COLLATION_NAME FROM INFORMATION_SCHEMA.PARAMETERS; SELECT TABLE_COLLATION FROM INFORMATION_SCHEMA.TABLES; SELECT DEFAULT_COLLATION_NAME FROM INFORMATION_SCHEMA.SCHEMATA; SELECT COLLATION_NAME FROM INFORMATION_SCHEMA.ROUTINES; SELECT COLLATION_CONNECTION FROM INFORMATION_SCHEMA.EVENTS; SELECT DATABASE_COLLATION FROM INFORMATION_SCHEMA.EVENTS; SELECT COLLATION_CONNECTION FROM INFORMATION_SCHEMA.ROUTINES; SELECT DATABASE_COLLATION FROM INFORMATION_SCHEMA.ROUTINES; SELECT COLLATION_CONNECTION FROM INFORMATION_SCHEMA.TRIGGERS; SELECT DATABASE_COLLATION FROM INFORMATION_SCHEMA.TRIGGERS; SELECT COLLATION_CONNECTION FROM INFORMATION_SCHEMA.VIEWS; display full collation names, including character sets prefix, for all collations, including new collations. Corresponding SHOW commands also display full collation names in collation related columns: SHOW CREATE TABLE t1; SHOW CREATE DATABASE db1; SHOW TABLE STATUS; SHOW CREATE FUNCTION f1; SHOW CREATE PROCEDURE p1; SHOW CREATE EVENT ev1; SHOW CREATE TRIGGER tr1; SHOW CREATE VIEW; These INFORMATION_SCHEMA queries and SHOW statements may change in the future, to display show collation names.	2022-08-10 15:04:24 +02:00
Alexander Barkov	0c4c064f98	MDEV-27743 Remove Lex::charset This patch also fixes: MDEV-27690 Crash on `CHARACTER SET csname COLLATE DEFAULT` in column definition MDEV-27853 Wrong data type on column `COLLATE DEFAULT` and table `COLLATE some_non_default_collation` MDEV-28067 Multiple conflicting column COLLATE clauses are not rejected MDEV-28118 Wrong collation of `CAST(.. AS CHAR COLLATE DEFAULT)` MDEV-28119 Wrong column collation on MODIFY + CONVERT	2022-03-22 17:12:15 +04:00
Oleksandr Byelkin	f5c5f8e41e	Merge branch '10.5' into 10.6	2022-02-03 17:01:31 +01:00
Oleksandr Byelkin	cf63eecef4	Merge branch '10.4' into 10.5	2022-02-01 20:33:04 +01:00
Alexander Barkov	b915f79e4e	MDEV-25904 New collation functions to compare InnoDB style trimmed NO PAD strings	2022-01-21 12:16:07 +04:00
Alexander Barkov	0d68b0a2d6	MDEV-26669 Add MY_COLLATION_HANDLER functions min_str() and max_str()	2021-09-27 17:10:22 +04:00
Monty	a206658b98	Change CHARSET_INFO character set and collaction names to LEX_CSTRING This change removed 68 explict strlen() calls from the code. The following renames was done to ensure we don't use the old names when merging code from earlier releases, as using the new variables for print function could result in crashes: - charset->csname renamed to charset->cs_name - charset->name renamed to charset->coll_name Almost everything where mechanical changes except: - Changed to use the new Protocol::store(LEX_CSTRING..) when possible - Changed to use field->store(LEX_CSTRING, CHARSET_INFO) when possible - Changed to use String->append(LEX_CSTRING&) when possible Other things: - There where compiler issues with ensuring that all character set names points to the same string: gcc doesn't allow one to use integer constants when defining global structures (constant char * pointers works fine). To get around this, I declared defines for each character set name length.	2021-05-19 22:54:07 +02:00
Marko Mäkelä	133b4b46fe	Merge 10.4 into 10.5	2020-11-03 16:24:47 +02:00
Marko Mäkelä	533a13af06	Merge 10.3 into 10.4	2020-11-03 14:49:17 +02:00
Marko Mäkelä	8036d0a359	MDEV-22387: Do not violate __attribute__((nonnull)) This follows up commit commit `94a520ddbe` and commit `7c5519c12d`. After these changes, the default test suites on a cmake -DWITH_UBSAN=ON build no longer fail due to passing null pointers as parameters that are declared to never be null, but plenty of other runtime errors remain.	2020-11-02 14:19:21 +02:00
Monty	dbcd3384e0	MDEV-7947 strcmp() takes 0.37% in OLTP RO This patch ensures that all identical character sets shares the same cs->csname. This allows us to replace strcmp() in my_charset_same() with comparisons of pointers. This fixes a long standing performance issue that could cause as strcmp() for every item sent trough the protocol class to the end user. One consequence of this patch is that we don't allow one to add a character definition in the Index.xml file that changes the csname of an existing character set. This is by design as changing character set names of existing ones is extremely dangerous, especially as some storage engines just records character set numbers. As we now have a hash over character set's csname, we can in the future use that for faster access to a specific character set. This could be done by changing the hash to non unique and use the hash to find the next character set with same csname.	2020-07-23 10:54:33 +03:00
Alexander Barkov	cfe5ee90c8	MDEV-22043 Special character leads to assertion in my_wc_to_printable_generic on 10.5.2 (debug) The code did not take into account that: - U+005C (backslash) can occupy more than mbminlen characters (e.g. in sjis) - Some character sets do not have a code for U+005C (e.g. swe7) Adding a new function my_wc_to_printable into MY_CHARSET_HANDLER to cover all special cases easier.	2020-05-09 16:01:30 +04:00
Monty	4d61f1247a	Fixed compiler warnings from gcc 7.4.1 - Fixed possible error in rocksdb/rdb_datadic.cc	2020-01-29 23:23:55 +02:00
Marko Mäkelä	26a14ee130	Merge 10.1 into 10.2	2019-05-13 17:54:04 +03:00
Vicențiu Ciorbaru	cb248f8806	Merge branch '5.5' into 10.1	2019-05-11 22:19:05 +03:00
Vicențiu Ciorbaru	5543b75550	Update FSF Address * Update wrong zip-code	2019-05-11 21:29:06 +03:00
Marko Mäkelä	ef3070e997	Merge 10.1 into 10.2	2018-08-02 08:19:57 +03:00
Oleksandr Byelkin	cb5952b506	Merge branch '10.0' into bb-10.1-merge-sanja	2018-07-25 22:24:40 +02:00
Alexander Barkov	e2ac4098ed	Simplify caseup() and casedn() in charsets After the MDEV-13118 fix there's no code in the server that wants caseup/casedn to change the argument in place for simple charsets. Let's remove this logic and always return the result in a new string for all charsets, both simple and complex. 1. Removing the optimization that some character sets used in casedn() and caseup(), which allowed (and required) to change the case in-place, overwriting the string passed as the "src" argument. Now all CHARSET_INFO's work in the same way: non of them change the source string in-place, all of them now convert case from the source string to the destination string, leaving the source string untouched. 2. Adding "const" qualifier to the "char src" parameter to caseup() and casedn(). 3. Removing duplicate implementations in ctype-mb.c. Now both caseup() and casedn() implementations for all CJK character sets use internally the same function my_casefold_mb() (the former my_casefold_mb_varlen()). 4. Removing the "unused" attribute from parameters of some my_case{up\|dn}_xxx() implementations, as the affected parameters are now used* in the code. Previously these parameters were used only in DBUG_ASSERT().	2018-07-19 13:02:14 +04:00
Vladislav Vaintroub	9891ee5a2a	Fix and reenable Windows compiler warning C4800 (size_t conversion).	2018-01-26 10:37:46 +00:00
Alexander Barkov	5058ced5df	MDEV-7769 MY_CHARSET_INFO refactoring# On branch 10.2 Part 3 (final): removing MY_CHARSET_HANDLER::well_formed_len().	2016-10-10 14:36:09 +04:00
Alexander Barkov	ee19806b8e	MDEV-9711 NO PAD collations Based on the patch from Daniil Medvedev (a Google Summer of Code task)	2016-09-06 12:50:02 +04:00
Alexander Barkov	e7ff281d2e	MDEV-6353 my_ismbchar() and my_mbcharlen() refactoring	2016-05-17 15:27:10 +04:00
Alexander Barkov	1d73005bf3	MDEV-8360 Clean-up CHARSET_INFO: strnncollsp: diff_if_only_endspace_difference - Removing the "diff_if_only_endspace_difference" argument from MY_COLLATION_HANDLER::strnncollsp(), my_strnncollsp_simple(), as well as in the function template MY_FUNCTION_NAME(strnncollsp) in strcoll.ic - Removing the "diff_if_only_space_different" from ha_compare_text(), hp_rec_key_cmp(). - Adding a new function my_strnncollsp_padspace_bin() and reusing it instead of duplicate code pieces in my_strnncollsp_8bit_bin(), my_strnncollsp_latin1_de(), my_strnncollsp_tis620(), my_strnncollsp_utf8_cs(). - Adding more tests for better coverage of the trailing space handling. - Removing the unused definition of HA_END_SPACE_ARE_EQUAL	2016-03-31 11:04:48 +04:00
Alexander Barkov	e09299511e	MDEV-9665 Remove cs->cset->ismbchar() Using a more powerfull cs->cset->charlen() instead.	2016-03-16 10:55:12 +04:00
Alexander Barkov	3ba2a958be	MDEV-8694 Wrong result for SELECT..WHERE a NOT LIKE 'a ' AND a='a' Note, the patch for MDEV-8661 unintentionally fixed MDEV-8694 as well, as a side effect. Adding a real clear fix: implementing Item_func_like::propagate_equal_fields() with comments.	2015-08-28 17:03:09 +04:00
Alexander Barkov	78b80cb6ba	Adding MY_CHARSET_HANDLER::native_to_mb(). This is a pre-requisite patch for: - MDEV-8433 Make field<'broken-string' use indexes - MDEV-8625 Bad result set with ignorable characters when using a prefix key - MDEV-8626 Bad result set with contractions when using a prefix key	2015-08-14 18:34:41 +04:00
Alexander Barkov	197afb413f	MDEV-6566 Different INSERT behaviour on bad bytes with and without character set conversion	2015-03-13 16:51:36 +04:00
Alexander Barkov	b1b6101af2	A preparatory patch for MDEV-6566. Adding a new virtual function MY_CHARSET_HANDLER::copy_abort(). Moving character set specific code into the correspoding implementations (for simple, multi-byte and mbmaxlen>1 character sets).	2015-03-02 18:24:22 +04:00
Alexander Barkov	807934d083	MDEV-7086 main.ctype_cp932 fails in buildbot on a valgrind build Removing a redundant and wrong condition which could access beyond the pattern string range.	2014-11-18 13:07:37 +04:00
Michael Widenius	c4f5326bb7	MDEV-6255 DUPLICATE KEY Errors on SELECT .. GROUP BY that uses temporary and filesort. The problem was that my_hash_sort didn't properly delete end-space characters properly, so strings that should compare identically was seen as different strings. (Space was handled correctly, but not NBSP) This caused duplicate key errors when a heap table was converted to Aria as part of overflow in group by. Fixed by removing all characters that compares as end space when creating a hash. Other things: - Fixed that --sorted_results also works for errors in mysqltest. - Speed up hash by not comparing strings that has different hash. - Speed up many my_hash_sort functions by using registers to calculate hash instead of pointers. This was previously done for some functions, but not for all. - Made a macro of the hash function, to simplify code and to be able to experiment with new hash functions. client/mysqltest.cc: Fixed that --sorted_results also works for error messages. mysql-test/r/ctype_partitions.result: New test to ensure that partitions on hash works mysql-test/suite/multi_source/gtid.result: Updated result mysql-test/suite/multi_source/gtid.test: Test that --sorted_result works for error messages mysql-test/suite/multi_source/gtid_ignore_duplicates.result: Updated result mysql-test/suite/multi_source/gtid_ignore_duplicates.test: Updated result mysql-test/suite/multi_source/load_data.result: Updated result mysql-test/suite/multi_source/load_data.test: Updated result mysql-test/t/ctype_partitions.test: New test to ensure that partitions on hash works storage/heap/hp_write.c: Speed up hash by not comparing strings that has different hash. storage/maria/ma_check.c: Extra debug strings/ctype-bin.c: Use macro for hash function strings/ctype-latin1.c: Use macro for hash function Use registers to calculate hash (speedup) strings/ctype-mb.c: Use macro for hash function Use registers to calculate hash (speedup) strings/ctype-simple.c: Use macro for hash function Use same variable names as in other my_hash_sort functions. Update my_hash_sort_simple() to properly remove end space (patch by Bar) strings/ctype-uca.c: Ignore duplicated space inside strings and end space in my_hash_sort_uca(). This fixed MDEV-6255 Use macro for hash function Use registers to calculate hash (speedup) strings/ctype-ucs2.c: Use macro for hash function Use registers to calculate hash (speedup) strings/ctype-utf8.c: Use macro for hash function Use registers to calculate hash (speedup) strings/strings_def.h: Made a macro of the hash function, to simplify code and to be able to experiment with new hash functions.	2014-09-11 22:42:35 +03:00
Sergei Golubchik	1c6ad62a26	mysql-5.5.39 merge ~40% bugfixed() applied ~40$ bugfixed reverted (incorrect or we're not buggy) ~20% bugfixed applied, despite us being not buggy () only changes in the server code, e.g. not cmakefiles	2014-08-02 21:26:16 +02:00
Sergei Golubchik	6fb17a0601	5.5.39 merge	2014-08-07 18:06:56 +02:00
Erlend Dahl	13d4101a39	Bug#18850241 WRONG COPYRIGHT HEADER IN SOME STRINGS/CTYPE-* FILES	2014-06-23 12:11:13 +02:00
Sergey Vojtovich	b95c8ce530	MDEV-5675 - Performance: my_hash_sort_bin is called too often Reduced number of my_hash_sort_bin() calls from 4 to 1 per query. Reduced number of memory accesses done by my_hash_sort_bin(). Details: - let MDL subsystem use pre-calculated hash value for hash inserts and deletes - let table cache use pre-calculated MDL hash value - MDL namespace is excluded from hash value calculation, so that hash value can be used by table cache as is - hash value for MDL is calculated as resulting hash value + MDL namespace - extended hash implementation to accept user defined hash function	2014-03-06 16:19:12 +04:00
Sergei Golubchik	8dce8ecfda	minor cleanup	2014-03-01 10:19:42 +01:00
Alexander Barkov	426d246f5b	MDEV-5163 Merge WEIGHT_STRING function from MySQL-5.6	2013-10-23 20:25:52 +04:00
Alexander Barkov	0b6c4bb34f	MDEV-4928 Merge collation customization improvements Merging the following MySQL-5.6 changes: - WL#5624: Collation customization improvements http://dev.mysql.com/worklog/task/?id=5624 - WL#4013: Unicode german2 collation http://dev.mysql.com/worklog/task/?id=4013 - Bug#62429 XML: ExtractValue, UpdateXML max arg length 127 chars http://bugs.mysql.com/bug.php?id=62429 (required by WL#5624)	2013-10-02 15:04:07 +04:00
Sergei Golubchik	b7b5f6f1ab	10.0-monty merge includes: * remove some remnants of "Bug#14521864: MYSQL 5.1 TO 5.5 BUGS PARTITIONING" * introduce LOCK_share, now LOCK_ha_data is strictly for engines * rea_create_table() always creates .par file (even in "frm-only" mode) * fix a 5.6 bug, temp file leak on dummy ALTER TABLE	2013-07-21 16:39:19 +02:00
Sergei Golubchik	005c7e5421	mysql-5.5.32 merge	2013-07-16 19:09:54 +02:00
Sergei Golubchik	b381cf843c	mysql-5.5.31 merge	2013-05-07 13:05:09 +02:00
Michael Widenius	068c61978e	Temporary commit of 10.0-merge	2013-03-26 00:03:13 +02:00
Murthy Narkedimilli	8afe262ae5	Fix for Bug 16395495 - OLD FSF ADDRESS IN GPL HEADER	2013-03-19 15:53:48 +01:00
Neeraj Bisht	84d798a1d5	BUG#14303860 - EXECUTING A SELECT QUERY WITH TOO MANY WILDCARDS CAUSES A SEGFAULT Back port from 5.6 and trunk	2013-01-14 16:51:52 +05:30
Neeraj Bisht	99645e5be5	BUG#14303860 - EXECUTING A SELECT QUERY WITH TOO MANY WILDCARDS CAUSES A SEGFAULT Back port from 5.6 and trunk	2013-01-14 14:59:48 +05:30
Sergei Golubchik	4f435bddfd	5.3 merge	2012-01-13 15:50:02 +01:00
Michael Widenius	6920457142	Merge with MariaDB 5.1	2011-11-24 18:48:58 +02:00
Michael Widenius	a8d03ab235	Initail merge with MySQL 5.1 (XtraDB still needs to be merged) Fixed up copyright messages.	2011-11-21 19:13:14 +02:00
Sergei Golubchik	76f0b94bb0	merge with 5.3 sql/sql_insert.cc: CREATE ... IF NOT EXISTS may do nothing, but it is still not a failure. don't forget to my_ok it. **** CREATE ... IF NOT EXISTS may do nothing, but it is still not a failure. don't forget to my_ok it. sql/sql_table.cc: small cleanup **** small cleanup	2011-10-19 21:45:18 +02:00
Sergei Golubchik	9809f05199	5.5-merge	2011-07-02 22:08:51 +02:00

1 2 3 4

155 commits