mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-27 01:04:19 +01:00

Author	SHA1	Message	Date
Oleksandr Byelkin	b83c379420	Merge branch '10.5' into 10.6	2023-11-08 15:57:05 +01:00
Oleksandr Byelkin	6cfd2ba397	Merge branch '10.4' into 10.5	2023-11-08 12:59:00 +01:00
Sergei Petrunia	4941ac9192	MDEV-32113: utf8mb3_key_col=utf8mb4_value cannot be used for ref (Variant#3: Allow cross-charset comparisons, use a special CHARSET_INFO to create lookup keys. Review input addressed.) Equalities that compare utf8mb{3,4}_general_ci strings, like: WHERE ... utf8mb3_key_col=utf8mb4_value (MB3-4-CMP) can now be used to construct ref[const] access and also participate in multiple-equalities. This means that utf8mb3_key_col can be used for key-lookups when compared with an utf8mb4 constant, field or expression using '=' or '<=>' comparison operators. This is controlled by optimizer_switch='cset_narrowing=on', which is OFF by default. IMPLEMENTATION Item value comparison in (MB3-4-CMP) is done using utf8mb4_general_ci. This is valid as any utf8mb3 value is also an utf8mb4 value. When making index lookup value for utf8mb3_key_col, we do "Charset Narrowing": characters that are in the Basic Multilingual Plane (=BMP) are copied as-is, as they can be represented in utf8mb3. Characters that are outside the BMP cannot be represented in utf8mb3 and are replaced with U+FFFD, the "Replacement Character". In utf8mb4_general_ci, the Replacement Character compares as equal to any character that's not in BMP. Because of this, the constructed lookup value will find all index records that would be considered equal by the original condition (MB3-4-CMP). Approved-by: Monty <monty@mariadb.org>	2023-10-19 17:24:30 +03:00
Xiaotong Niu	8f2f8f3173	MDEV-26494 Fix buffer overflow of string lib on Arm64 In the hexlo function, the element type of the array hex_lo_digit is not explicitly declared as signed char, causing elements with a value of -1 to be converted to 255 on Arm64. The problem occurs because "char" is unsigned by default on Arm64 compiler, but signed on x86 compiler. This problem can be seen in https://godbolt.org/z/rT775xshj The above issue causes "use-after-poison" exception in my_mb_wc_filename function. The code snippet where the error occurred is shown below, copied from below link. `5fc19e7137/strings/ctype-utf8.c (L2728)` 2728 if ((byte1= hexlo(byte1)) >= 0 && 2729 (byte2= hexlo(byte2)) >= 0) { 2731 int byte3= hexlo(s[3]); … } At line 2729, when byte2 is 0, which indicates the end of the string s. (1) On x86, hexlo(0) return -1 and line 2731 is skipped, as expected. (2) On Arm64, hexlo(0) return 255 and line 2731 is executed, not as expected, accessing s[3] after the null character of string s, thus raising the "user-after-poison" error. The problem was discovered when executing the main.mysqlcheck test. Signed-off-by: Xiaotong Niu <xiaotong.niu@arm.com>	2023-10-18 20:23:27 +11:00
Marko Mäkelä	5bada1246d	Merge 10.5 into 10.6	2023-04-11 16:15:19 +03:00
Oleksandr Byelkin	ac5a534a4c	Merge remote-tracking branch '10.4' into 10.5	2023-03-31 21:32:41 +02:00
Alexander Barkov	965bdf3e66	MDEV-30746 Regression in ucs2_general_mysql500_ci 1. Adding a separate MY_COLLATION_HANDLER my_collation_ucs2_general_mysql500_ci_handler implementing a proper order for ucs2_general_mysql500_ci The problem happened because ucs2_general_mysql500_ci erroneously used my_collation_ucs2_general_ci_handler. 2. Cosmetic changes: Renaming: - plane00_mysql500 to my_unicase_mysql500_page00 - my_unicase_pages_mysql500 to my_unicase_mysql500_pages to use the same naming style with: - my_unicase_default_page00 - my_unicase_defaul_pages 3. Moving code fragments from - handler::check_collation_compatibility() in handler.cc - upgrade_collation() in table.cc into new methods in class Charset, to reuse the code easier.	2023-03-01 15:38:02 +04:00
Oleksandr Byelkin	f5c5f8e41e	Merge branch '10.5' into 10.6	2022-02-03 17:01:31 +01:00
Oleksandr Byelkin	cf63eecef4	Merge branch '10.4' into 10.5	2022-02-01 20:33:04 +01:00
Oleksandr Byelkin	a576a1cea5	Merge branch '10.3' into 10.4	2022-01-30 09:46:52 +01:00
Oleksandr Byelkin	41a163ac5c	Merge branch '10.2' into 10.3	2022-01-29 15:41:05 +01:00
Alexander Barkov	b915f79e4e	MDEV-25904 New collation functions to compare InnoDB style trimmed NO PAD strings	2022-01-21 12:16:07 +04:00
Vladislav Vaintroub	47e18af906	MDEV-27494 Rename .ic files to .inl	2022-01-17 16:41:51 +01:00
Alexander Barkov	0d68b0a2d6	MDEV-26669 Add MY_COLLATION_HANDLER functions min_str() and max_str()	2021-09-27 17:10:22 +04:00
Monty	a206658b98	Change CHARSET_INFO character set and collaction names to LEX_CSTRING This change removed 68 explict strlen() calls from the code. The following renames was done to ensure we don't use the old names when merging code from earlier releases, as using the new variables for print function could result in crashes: - charset->csname renamed to charset->cs_name - charset->name renamed to charset->coll_name Almost everything where mechanical changes except: - Changed to use the new Protocol::store(LEX_CSTRING..) when possible - Changed to use field->store(LEX_CSTRING, CHARSET_INFO) when possible - Changed to use String->append(LEX_CSTRING&) when possible Other things: - There where compiler issues with ensuring that all character set names points to the same string: gcc doesn't allow one to use integer constants when defining global structures (constant char * pointers works fine). To get around this, I declared defines for each character set name length.	2021-05-19 22:54:07 +02:00
Rucha Deodhar	2fdb556e04	MDEV-8334: Rename utf8 to utf8mb3 This patch changes the main name of 3 byte character set from utf8 to utf8mb3. New old_mode UTF8_IS_UTF8MB3 is added and set TRUE by default, so that utf8 would mean utf8mb3. If not set, utf8 would mean utf8mb4.	2021-05-19 06:48:36 +02:00
Monty	dbcd3384e0	MDEV-7947 strcmp() takes 0.37% in OLTP RO This patch ensures that all identical character sets shares the same cs->csname. This allows us to replace strcmp() in my_charset_same() with comparisons of pointers. This fixes a long standing performance issue that could cause as strcmp() for every item sent trough the protocol class to the end user. One consequence of this patch is that we don't allow one to add a character definition in the Index.xml file that changes the csname of an existing character set. This is by design as changing character set names of existing ones is extremely dangerous, especially as some storage engines just records character set numbers. As we now have a hash over character set's csname, we can in the future use that for faster access to a specific character set. This could be done by changing the hash to non unique and use the hash to find the next character set with same csname.	2020-07-23 10:54:33 +03:00
Marko Mäkelä	3dbc49f075	Merge 10.4 into 10.5	2020-06-14 10:13:53 +03:00
Marko Mäkelä	805340936a	Merge 10.3 into 10.4	2020-06-13 19:01:28 +03:00
Marko Mäkelä	d83a443250	Merge 10.2 into 10.3	2020-06-13 15:11:43 +03:00
Alexander Barkov	9b9a354da9	MDEV-22849 Reuse skip_trailing_space() in my_hash_sort_utf8mbX Replacing the slow loop in my_hash_sort_utf8mbX() to the fast skip_trailing_spaces(), which consumes 8 bytes in one iteration, and is around 8 times faster on long data. Also, renaming: - my_hash_sort_utf8() to my_hash_sort_utf8mb3() - my_hash_sort_utf8_nopad() to my_hash_sort_utf8mb3_nopad() to merge to 10.5 easier (automatically?).	2020-06-10 08:42:31 +04:00
Alexander Barkov	cfe5ee90c8	MDEV-22043 Special character leads to assertion in my_wc_to_printable_generic on 10.5.2 (debug) The code did not take into account that: - U+005C (backslash) can occupy more than mbminlen characters (e.g. in sjis) - Some character sets do not have a code for U+005C (e.g. swe7) Adding a new function my_wc_to_printable into MY_CHARSET_HANDLER to cover all special cases easier.	2020-05-09 16:01:30 +04:00
Alexander Barkov	f1e13fdc8d	MDEV-21581 Helper functions and methods for CHARSET_INFO	2020-01-28 12:29:23 +04:00
Alexander Barkov	3e7e87ddcc	MDEV-19897 Rename source code variable names from utf8 to utf8mb3	2019-06-28 12:37:04 +04:00
Oleksandr Byelkin	c07325f932	Merge branch '10.3' into 10.4	2019-05-19 20:55:37 +02:00
Marko Mäkelä	be85d3e61b	Merge 10.2 into 10.3	2019-05-14 17:18:46 +03:00
Marko Mäkelä	26a14ee130	Merge 10.1 into 10.2	2019-05-13 17:54:04 +03:00
Vicențiu Ciorbaru	cb248f8806	Merge branch '5.5' into 10.1	2019-05-11 22:19:05 +03:00
Vicențiu Ciorbaru	5543b75550	Update FSF Address * Update wrong zip-code	2019-05-11 21:29:06 +03:00
Marko Mäkelä	1f3bcff1f2	Remove unused declarations	2019-04-03 19:46:34 +03:00
Marko Mäkelä	074c684099	Merge 10.3 into 10.4	2018-11-06 16:24:16 +02:00
Marko Mäkelä	df563e0c03	Merge 10.2 into 10.3 main.derived_cond_pushdown: Move all 10.3 tests to the end, trim trailing white space, and add an "End of 10.3 tests" marker. Add --sorted_result to tests where the ordering is not deterministic. main.win_percentile: Add --sorted_result to tests where the ordering is no longer deterministic.	2018-11-06 09:40:39 +02:00
Marko Mäkelä	32062cc61c	Merge 10.1 into 10.2	2018-11-06 08:41:48 +02:00
Sergei Golubchik	44f6f44593	Merge branch '10.0' into 10.1	2018-10-30 15:10:01 +01:00
Alexander Barkov	a8efe7ab1f	MDEV-17502 MDEV-17474 Change Unicode xxx_general_ci and xxx_bin collation implementation to "inline" style	2018-10-19 14:35:01 +04:00
Alexander Barkov	6eae037c4c	MDEV-17474 Change Unicode collation implementation from "handler" to "inline" style	2018-10-17 06:44:40 +04:00
Alexander Barkov	34f8a4071e	MDEV-17064 LIKE function has error behavior on the fields in which the collation is xxx_unicode_xx Synchronizing sources in: - my_wildcmp_uca_impl() handling utf8_unicode_ci - my_wildcmp_unicode_impl() handling utf8_general_ci The latter has already had a fix for a similar MySQL bug in utf8_general_ci: Bug#11754 SET NAMES utf8 followed by SELECT "A\\" LIKE "A\\" returns 0 So fix is now propagated to utf8_unicode_ci.	2018-10-15 13:22:18 +04:00
Marko Mäkelä	05459706f2	Merge 10.2 into 10.3	2018-08-03 15:57:23 +03:00
Marko Mäkelä	ef3070e997	Merge 10.1 into 10.2	2018-08-02 08:19:57 +03:00
Oleksandr Byelkin	cb5952b506	Merge branch '10.0' into bb-10.1-merge-sanja	2018-07-25 22:24:40 +02:00
Alexander Barkov	e2ac4098ed	Simplify caseup() and casedn() in charsets After the MDEV-13118 fix there's no code in the server that wants caseup/casedn to change the argument in place for simple charsets. Let's remove this logic and always return the result in a new string for all charsets, both simple and complex. 1. Removing the optimization that some character sets used in casedn() and caseup(), which allowed (and required) to change the case in-place, overwriting the string passed as the "src" argument. Now all CHARSET_INFO's work in the same way: non of them change the source string in-place, all of them now convert case from the source string to the destination string, leaving the source string untouched. 2. Adding "const" qualifier to the "char src" parameter to caseup() and casedn(). 3. Removing duplicate implementations in ctype-mb.c. Now both caseup() and casedn() implementations for all CJK character sets use internally the same function my_casefold_mb() (the former my_casefold_mb_varlen()). 4. Removing the "unused" attribute from parameters of some my_case{up\|dn}_xxx() implementations, as the affected parameters are now used* in the code. Previously these parameters were used only in DBUG_ASSERT().	2018-07-19 13:02:14 +04:00
luz.paz	3dd01669b4	Misc. typos Found via `codespell -i 3 -w --skip="./debian/po" -I ../mariadb-server-word-whitelist.txt ./cmake/ ./debian/ ./Docs/ ./include/ ./man/ ./plugin/ ./strings/`	2018-04-05 15:26:57 +04:00
Sergei Golubchik	e0a1c745ec	Merge branch '10.1' into 10.2	2017-10-24 14:53:18 +02:00
Sergei Golubchik	9d2e2d7533	Merge branch '10.0' into 10.1	2017-10-22 13:03:41 +02:00
Sergei Golubchik	da4503e956	Merge branch '5.5' into 10.0	2017-10-18 15:14:39 +02:00
Sergei Golubchik	d76f5774fe	MDEV-13459 Warnings, when compiling with gcc-7.x mostly caused by -Wimplicit-fallthrough	2017-10-17 07:37:39 +02:00
Marko Mäkelä	70505dd45b	Merge 10.1 into 10.2	2017-05-22 09:46:51 +03:00
Marko Mäkelä	71cd205956	Silence bogus GCC 7 warnings -Wimplicit-fallthrough Do not silence uncertain cases, or fix any bugs. The only functional change should be that ha_federated::extra() is not calling DBUG_PRINT to report an unhandled case for HA_EXTRA_PREPARE_FOR_DROP.	2017-05-17 08:27:04 +03:00
Marko Mäkelä	7972da8aa1	Silence bogus GCC 7 warnings -Wimplicit-fallthrough Do not silence uncertain cases, or fix any bugs. The only functional change should be that ha_federated::extra() is not calling DBUG_PRINT to report an unhandled case for HA_EXTRA_PREPARE_FOR_DROP.	2017-05-17 08:07:02 +03:00
Vladislav Vaintroub	f2fe5cb282	Fix several compile warnings on Windows	2017-03-10 19:07:07 +00:00

1 2 3 4 5 ...

286 commits