mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-16 03:52:35 +01:00

Author	SHA1	Message	Date
Marko Mäkelä	7e0afb1c73	Merge 10.5 into 10.6	2024-10-03 09:31:39 +03:00
Alexander Barkov	9ac8172ac3	MDEV-31221 UBSAN runtime error: negation of -9223372036854775808 cannot be represented in type 'long long int' in my_strtoll10_utf32 The code in my_strtoll10_mb2 and my_strtoll10_utf32 could hit undefined behavior by negation of LONGLONG_MIN. Fixing to avoid this. Also, fixing my_strtoll10() in the same style. The previous reduction produced a redundant warning on CAST(_latin1'-9223372036854775808' AS SIGNED)	2024-09-20 13:04:57 +04:00
Alexander Barkov	841dc07ee1	MDEV-28386 UBSAN: runtime error: negation of -X cannot be represented in type 'long long int'; cast to an unsigned type to negate this value to itself in my_strntoull_8bit on SELECT ... OCT The code in my_strntoull_8bit() and my_strntoull_mb2_or_mb4() could hit undefined behavior by negating LONGLONG_MIN. Fixing the code to avoid this.	2024-09-20 11:01:31 +04:00
Oleksandr Byelkin	9af2caca33	Merge branch '10.5' into 10.6	2024-07-18 16:25:33 +02:00
Alexander Barkov	b777b749ad	MDEV-28345 ASAN: use-after-poison or unknown-crash in my_strtod_int from charset_info_st::strntod or test_if_number This patch fixes two problems: - The code inside my_strtod_int() in strings/dtoa.c could test the byte behind the end of the string when processing the mantissa. Rewriting the code to avoid this. - The code in test_if_number() in sql/sql_analyse.cc called my_atof() which is unsafe and makes the called my_strtod_int() look behind the end of the string if the input string is not 0-terminated. Fixing test_if_number() to use my_strtod() instead, passing the correct end pointer.	2024-07-17 12:17:27 +04:00
Marko Mäkelä	5ba542e9ee	Merge 10.5 into 10.6	2024-05-30 14:27:07 +03:00
Alexander Barkov	4a158ec167	MDEV-34226 On startup: UBSAN: applying zero offset to null pointer in my_copy_fix_mb from strings/ctype-mb.c and other locations nullptr+0 is UB (undefined behavior). - Fixing my_string_metadata_get_mb() to handle {nullptr,0} without UB. - Fixing THD::copy_with_error() to disallow {nullptr,0} by DBUG_ASSERT(). - Fixing parse_client_handshake_packet() to call THD::copy_with_error() with an empty string {"",0} instead of NULL string {nullptr,0}.	2024-05-27 13:19:13 +04:00
Alexander Barkov	7925326183	MDEV-30931 UBSAN: negation of -X cannot be represented in type 'long long int'; cast to an unsigned type to negate this value to itself in get_interval_value on SELECT - Fixing the code in get_interval_value() to use Longlong_hybrid_null. This allows to handle correctly: - Signed and unsigned arguments (the old code assumed the argument to be signed) - Avoid undefined negation behavior in the corner case with LONGLONG_MIN This fixes the UBSAN warning: negation of -9223372036854775808 cannot be represented in type 'long long int'; - Fixing the code in get_interval_value() to avoid overflow in the INTERVAL_QUARTER and INTERVAL_WEEK branches. This fixes the UBSAN warning: signed integer overflow: -9223372036854775808 * 7 cannot be represented in type 'long long int' - Fixing the INTERVAL_WEEK branch in date_add_interval() to handle huge numbers correctly. Before the change, huge positive numbers were treated as their negative complements. Note, some other branches still can be affected by this problem and should also be fixed eventually.	2024-05-27 13:19:13 +04:00
Alexander Barkov	7c4c082349	MDEV-28387 UBSAN: runtime error: negation of -9223372036854775808 cannot be represented in type 'long long int'; cast to an unsigned type to negate this value to itself in my_strtoll10 on SELECT Fixing the condition to raise an overflow if the ulonglong representation of the number is greater or equal to 0x8000000000000000ULL. Before this change the condition did not catch -9223372036854775808 (the smallest possible signed negative longlong number).	2024-05-23 14:18:34 +04:00
Sergei Golubchik	3f6038bc51	Merge branch '10.5' into 10.6	2024-01-31 18:04:03 +01:00
Sergei Golubchik	01f6abd1d4	Merge branch '10.4' into 10.5	2024-01-31 17:32:53 +01:00
Robin Newhouse	615f4a8c9e	MDEV-32587 Allow json exponential notation starting with zero Modify the NS_ZERO state in the JSON number parser to allow exponential notation with a zero coefficient (e.g. 0E-4). The NS_ZERO state transition on 'E' was updated to move to the NS_EX state rather than returning a syntax error. Similar change was made for the NS_ZE1 (negative zero) starter state. This allows accepted number grammar to include cases like: - 0E4 - -0E-10 which were previously disallowed. Numeric parsing remains the same for all other states. Test cases are added to func_json.test to validate parsing for various exponential numbers starting with zero coefficients. All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services.	2024-01-17 19:25:43 +05:30
Sergei Golubchik	e95bba9c58	Merge branch '10.5' into 10.6	2023-12-17 11:20:43 +01:00
Sergei Golubchik	98a39b0c91	Merge branch '10.4' into 10.5	2023-12-02 01:02:50 +01:00
Alexander Barkov	1710b6454b	MDEV-26743 InnoDB: CHAR+nopad does not work well The patch for "MDEV-25440: Indexed CHAR ... broken with NO_PAD collations" fixed these scenarios from MDEV-26743: - Basic latin letter vs equal accented letter - Two letters vs equal (but space padded) expansion However, this scenario was still broken: - Basic latin letter (but followed by an ignorable character) vs equal accented letter Fix: When processing for a NOPAD collation a string with trailing ignorable characters, like: '<non-ignorable><ignorable><ignorable>' the string gets virtually converted to: '<non-ignorable><ignorable><ignorable><space><space><space>...' After the fix the code works differently in these two cases: 1. <space> fits into the "nchars" limit 2. <space> does not fit into the "nchars" limit Details: 1. If "nchars" is large enough (4+ in this example), return weights as follows: '[weight-for-non-ignorable, 1 char] [weight-for-space-character, 3 chars]' i.e. the weight for the virtual trailing space character now indicates that it corresponds to total 3 characters: - two ignorable characters - one virtual trailing space character 2. If "nchars" is small (3), then the virtual trailing space character does not fit into the "nchars" limit, so return 0x00 as weight, e.g.: '[weight-for-non-ignorable, 1 char] [0x00, 2 chars]' Adding corresponding MTR tests and unit tests.	2023-11-10 06:17:23 +04:00
Oleksandr Byelkin	b83c379420	Merge branch '10.5' into 10.6	2023-11-08 15:57:05 +01:00
Oleksandr Byelkin	6cfd2ba397	Merge branch '10.4' into 10.5	2023-11-08 12:59:00 +01:00
Sergei Petrunia	4941ac9192	MDEV-32113: utf8mb3_key_col=utf8mb4_value cannot be used for ref (Variant#3: Allow cross-charset comparisons, use a special CHARSET_INFO to create lookup keys. Review input addressed.) Equalities that compare utf8mb{3,4}_general_ci strings, like: WHERE ... utf8mb3_key_col=utf8mb4_value (MB3-4-CMP) can now be used to construct ref[const] access and also participate in multiple-equalities. This means that utf8mb3_key_col can be used for key-lookups when compared with an utf8mb4 constant, field or expression using '=' or '<=>' comparison operators. This is controlled by optimizer_switch='cset_narrowing=on', which is OFF by default. IMPLEMENTATION Item value comparison in (MB3-4-CMP) is done using utf8mb4_general_ci. This is valid as any utf8mb3 value is also an utf8mb4 value. When making index lookup value for utf8mb3_key_col, we do "Charset Narrowing": characters that are in the Basic Multilingual Plane (=BMP) are copied as-is, as they can be represented in utf8mb3. Characters that are outside the BMP cannot be represented in utf8mb3 and are replaced with U+FFFD, the "Replacement Character". In utf8mb4_general_ci, the Replacement Character compares as equal to any character that's not in BMP. Because of this, the constructed lookup value will find all index records that would be considered equal by the original condition (MB3-4-CMP). Approved-by: Monty <monty@mariadb.org>	2023-10-19 17:24:30 +03:00
Xiaotong Niu	8f2f8f3173	MDEV-26494 Fix buffer overflow of string lib on Arm64 In the hexlo function, the element type of the array hex_lo_digit is not explicitly declared as signed char, causing elements with a value of -1 to be converted to 255 on Arm64. The problem occurs because "char" is unsigned by default on Arm64 compiler, but signed on x86 compiler. This problem can be seen in https://godbolt.org/z/rT775xshj The above issue causes "use-after-poison" exception in my_mb_wc_filename function. The code snippet where the error occurred is shown below, copied from below link. `5fc19e7137/strings/ctype-utf8.c (L2728)` 2728 if ((byte1= hexlo(byte1)) >= 0 && 2729 (byte2= hexlo(byte2)) >= 0) { 2731 int byte3= hexlo(s[3]); … } At line 2729, byte2 may be 0, which indicates the end of the string s. (1) On x86, hexlo(0) returns -1 and line 2731 is skipped, as expected. (2) On Arm64, hexlo(0) returns 255 and line 2731 is executed, not as expected, accessing s[3] after the null character of string s, thus raising the "use-after-poison" error. The problem was discovered when executing the main.mysqlcheck test. Signed-off-by: Xiaotong Niu <xiaotong.niu@arm.com>	2023-10-18 20:23:27 +11:00
Oleksandr Byelkin	6bf8483cac	Merge branch '10.5' into 10.6	2023-08-01 15:08:52 +02:00
Oleksandr Byelkin	7564be1352	Merge branch '10.4' into 10.5	2023-07-26 16:02:57 +02:00
Oleksandr Byelkin	f52954ef42	Merge commit '10.4' into 10.5	2023-07-20 11:54:52 +02:00
Alexander Barkov	03c2157dd6	MDEV-28384 UBSAN: null pointer passed as argument 1, which is declared to never be null in my_strnncoll_binary on SELECT ... COUNT or GROUP_CONCAT Also fixes: MDEV-30982 UBSAN: runtime error: null pointer passed as argument 2, which is declared to never be null in my_strnncoll_binary on DELETE Calling memcmp() with a NULL pointer is undefined behaviour according to the C standard, even if the length argument is 0. Adding tests for length==0 before calling memcmp() into: - my_strnncoll_binary() - my_strnncoll_8bit_bin	2023-07-20 11:56:19 +04:00
anson1014	1db4fc543b	Ensure that source files contain only valid UTF8 encodings (#2188) Modern software (including text editors, static analysis software, and web-based code review interfaces) often requires source code files to be interpretable via a consistent character encoding, with UTF-8 or ASCII (a strict subset of UTF-8) as the default. Several of the MariaDB source files contain bytes that are not valid in either the UTF-8 or ASCII encodings, but instead represent strings encoded in the ISO-8859-1/Latin-1 or ISO-8859-2/Latin-2 encodings. These inconsistent encodings may prevent software from correctly presenting or processing such files. Converting all source files to valid UTF8 characters will ensure correct handling. Comments written in Czech were replaced with lightly-corrected translations from Google Translate. Additionally, comments describing the proper handling of special characters were changed so that the comments are now purely UTF8. All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc. Co-authored-by: Andrew Hutchings <andrew@linuxjedi.co.uk>	2023-05-19 13:21:34 +01:00
Rucha Deodhar	b7b8a9ee43	MDEV-23187: Assorted assertion failures in json_find_path with certain collations Fix by Alexey Botchkov The 'value_len' is calculated wrong for the multibyte charsets. In the read_strn() function we get the length of the string with the final ' " ' character. So have to subtract its length from the value_len. And the length of '1' isn't correct for the ucs2 charset (must be 2).	2023-05-16 01:52:16 +05:30
Marko Mäkelä	5bada1246d	Merge 10.5 into 10.6	2023-04-11 16:15:19 +03:00
Alexander Barkov	62e137d4d7	Merge remote-tracking branch 'origin/10.4' into 10.5	2023-04-05 16:16:19 +04:00
Alexander Barkov	8020b1bd73	MDEV-30034 UNIQUE USING HASH accepts duplicate entries for tricky collations - Adding a new argument "flag" to MY_COLLATION_HANDLER::strnncollsp_nchars() and a flag MY_STRNNCOLLSP_NCHARS_EMULATE_TRIMMED_TRAILING_SPACES. The flag defines if strnncollsp_nchars() should emulate trailing spaces which were possibly trimmed earlier (e.g. in InnoDB CHAR compression). This is important for NOPAD collations. For example, with this input: - str1= 'a ' (Latin letter a followed by one space) - str2= 'a  ' (Latin letter a followed by two spaces) - nchars= 3 if the flag is given, strnncollsp_nchars() will virtually restore one trailing space to str1 up to nchars (3) characters and compare two strings as equal: - str1= 'a  ' (one extra trailing space emulated) - str2= 'a  ' (as is) If the flag is not given, strnncollsp_nchars() does not add trailing virtual spaces, so in case of a NOPAD collation, str1 will be compared as less than str2 because it is shorter. - Field_string::cmp_prefix() now passes the new flag. Field_varstring::cmp_prefix() and Field_blob::cmp_prefix() do not pass the new flag. - The branch in cmp_whole_field() in storage/innobase/rem/rem0cmp.cc (which handles the CHAR data type) now also passes the new flag. - Fixing UCA collations to respect the new flag. Other collations are possibly also affected, however I had no success in making an SQL script demonstrating the problem. Other collations will be extended to respect this flag in a separate patch later. - Changing the meaning of the last parameter of Field::cmp_prefix() from "number of bytes" (internal length) to "number of characters" (user visible length). The code calling cmp_prefix() from handler.cc was wrong. After this change, the call in handler.cc became correct. The code calling cmp_prefix() from key_rec_cmp() in key.cc was adjusted according to this change. - Old strnncollsp_nchars() related tests in unittest/strings/strings-t.c now pass the new flag. A few new tests also were added, without the flag.	2023-04-04 12:30:50 +04:00
Oleksandr Byelkin	ac5a534a4c	Merge remote-tracking branch '10.4' into 10.5	2023-03-31 21:32:41 +02:00
Christian Gonzalez	8b0f766c6c	Minimize unsafe C functions usage Replace calls to `sprintf` and `strcpy` by the safer options `snprintf` and `safe_strcpy` in the following directories: - libmysqld - mysys - sql-common - strings All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc.	2023-03-08 10:36:25 +00:00
Alexander Barkov	965bdf3e66	MDEV-30746 Regression in ucs2_general_mysql500_ci 1. Adding a separate MY_COLLATION_HANDLER my_collation_ucs2_general_mysql500_ci_handler implementing a proper order for ucs2_general_mysql500_ci The problem happened because ucs2_general_mysql500_ci erroneously used my_collation_ucs2_general_ci_handler. 2. Cosmetic changes: Renaming: - plane00_mysql500 to my_unicase_mysql500_page00 - my_unicase_pages_mysql500 to my_unicase_mysql500_pages to use the same naming style with: - my_unicase_default_page00 - my_unicase_defaul_pages 3. Moving code fragments from - handler::check_collation_compatibility() in handler.cc - upgrade_collation() in table.cc into new methods in class Charset, to reuse the code easier.	2023-03-01 15:38:02 +04:00
Marko Mäkelä	6aec87544c	Merge 10.5 into 10.6	2023-02-10 13:03:01 +02:00
Marko Mäkelä	c41c79650a	Merge 10.4 into 10.5	2023-02-10 12:02:11 +02:00
Alexander Barkov	0845bce0d9	MDEV-30556 UPPER() returns an empty string for U+0251 in Unicode-5.2.0+ collations for utf8	2023-02-03 18:18:32 +04:00
Oleksandr Byelkin	c3a5cf2b5b	Merge branch '10.5' into 10.6	2023-01-31 09:31:42 +01:00
Oleksandr Byelkin	7fa02f5c0b	Merge branch '10.4' into 10.5	2023-01-27 13:54:14 +01:00
Sergei Golubchik	0c27559994	MDEV-26817 runtime error: index 24320 out of bounds for type 'json_string_char_classes [128] and ASAN: global-buffer-overflow on address ... READ of size 4 on SELECT JSON_VALID protect from out-of-bound array access it was already done in all other places, this one was the only one missed	2023-01-20 19:43:15 +01:00
Marko Mäkelä	a8a5c8a1b8	Merge 10.5 into 10.6	2022-12-13 16:58:58 +02:00
Marko Mäkelä	1dc2f35598	Merge 10.4 into 10.5	2022-12-13 14:39:18 +02:00
Marko Mäkelä	fdf43b5c78	Merge 10.3 into 10.4	2022-12-13 11:37:33 +02:00
Marko Mäkelä	e55397a46d	Merge 10.5 into 10.6	2022-12-05 18:04:23 +02:00
Jan Lindström	4eb8e51c26	Merge 10.4 into 10.5	2022-11-30 13:10:52 +02:00
Alexander Barkov	931549ff66	MDEV-27670 Assertion `(cs->state & 0x20000) == 0' failed in my_strnncollsp_nchars_generic_8bit Also fixes: MDEV-27768 MDEV-25440: Assertion `(cs->state & 0x20000) == 0' failed in my_strnncollsp_nchars_generic_8bit The "strnncollsp_nchars" virtual function pointer for tis620_thai_nopad_ci was incorrectly initialized to a generic function my_strnncollsp_nchars_generic_8bit(), which crashed on assert. Implementing a tis620 specific function version.	2022-11-22 14:03:23 +04:00
Alexander Barkov	6216a2dfa2	MDEV-29473 UBSAN: Signed integer overflow: X * Y cannot be represented in type 'int' in strings/dtoa.c Fixing a few problems revealed by UBSAN in type_float.test - multiplication overflow in dtoa.c - uninitialized Field::geom_type (and Field::srid as well) - Wrong call-back function types used in combination with SHOW_FUNC. Changes in the mysql_show_var_func data type definition were not properly addressed all around the code by the following commits: `b4ff64568c` `18feb62fee` `0ee879ff8a` Adding a helper SHOW_FUNC_ENTRY() function and replacing all mysql_show_var_func declarations using SHOW_FUNC to SHOW_FUNC_ENTRY, to catch mysql_show_var_func in the future at compilation time.	2022-11-17 17:51:01 +04:00
anson1014	966d22b715	Ensure that source files contain only valid UTF8 encodings (#2188) Modern software (including text editors, static analysis software, and web-based code review interfaces) often requires source code files to be interpretable via a consistent character encoding, with UTF-8 or ASCII (a strict subset of UTF-8) as the default. Several of the MariaDB source files contain bytes that are not valid in either the UTF-8 or ASCII encodings, but instead represent strings encoded in the ISO-8859-1/Latin-1 or ISO-8859-2/Latin-2 encodings. These inconsistent encodings may prevent software from correctly presenting or processing such files. Converting all source files to valid UTF8 characters will ensure correct handling. Comments written in Czech were replaced with lightly-corrected translations from Google Translate. Additionally, comments describing the proper handling of special characters were changed so that the comments are now purely UTF8. All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc. Co-authored-by: Andrew Hutchings <andrew@linuxjedi.co.uk>	2022-08-30 09:21:40 +01:00
Marko Mäkelä	30914389fe	Merge 10.5 into 10.6	2022-07-27 17:52:37 +03:00
Marko Mäkelä	098c0f2634	Merge 10.4 into 10.5	2022-07-27 17:17:24 +03:00
Oleksandr Byelkin	3bb36e9495	Merge branch '10.3' into 10.4	2022-07-27 11:02:57 +02:00
Rucha Deodhar	dbe39f14fe	MDEV-28762: recursive call of some json functions without stack control Analysis: Some recursive json functions don't check for stack control Fix: Add check_stack_overrun(). The last argument is NULL because it is not used	2022-07-20 19:24:48 +05:30
Sergei Golubchik	3bc98a4ec4	Merge branch '10.5' into 10.6	2022-05-10 14:01:23 +02:00
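
The MDEV-28384 entry above notes that calling memcmp() with a NULL pointer is undefined behaviour even when the length is 0, and that the fix tests for length==0 before the call. A minimal sketch of that guard follows; `compare_binary` is a hypothetical helper written for illustration, not the actual my_strnncoll_binary() code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not MariaDB code): binary comparison of two
   length-delimited buffers that tolerates {NULL, 0} inputs. */
static int compare_binary(const unsigned char *a, size_t a_len,
                          const unsigned char *b, size_t b_len)
{
  size_t len = a_len < b_len ? a_len : b_len;
  /* memcmp() with a null pointer is undefined even for len == 0,
     so the call is skipped entirely when there is nothing to compare. */
  int cmp = len ? memcmp(a, b, len) : 0;
  if (cmp)
    return cmp;
  /* Common prefix is equal: the shorter buffer sorts first. */
  return a_len < b_len ? -1 : a_len > b_len ? 1 : 0;
}
```

With this guard, `compare_binary(NULL, 0, NULL, 0)` never reaches memcmp() and reports the two empty buffers as equal.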

1670 commits
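
Several entries in this log (MDEV-31221, MDEV-28386, MDEV-30931, MDEV-28387) concern negating LONGLONG_MIN, which cannot be represented in type 'long long int'. The sketch below shows one common way to avoid that negation by working on the unsigned magnitude, as the UBSAN message itself suggests. Both helpers are hypothetical names written for illustration and are not taken from the MariaDB patches.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper (not MariaDB code): magnitude of a signed value
   without ever evaluating -LLONG_MIN. Unsigned arithmetic wraps
   modulo 2^64, so 0 - (unsigned) v is well defined. */
static unsigned long long magnitude(long long v)
{
  return v < 0 ? 0ULL - (unsigned long long) v : (unsigned long long) v;
}

/* Hypothetical helper: combine an unsigned magnitude with a sign.
   Returns 0 on success and 1 if the result does not fit in long long. */
static int apply_sign(unsigned long long mag, int negative, long long *out)
{
  /* 0x8000000000000000: the magnitude of LLONG_MIN. */
  const unsigned long long min_mag = (unsigned long long) LLONG_MAX + 1ULL;
  assert(out != NULL);
  if (negative)
  {
    if (mag > min_mag)
      return 1;
    /* Only magnitudes up to LLONG_MAX are negated as signed values. */
    *out = (mag == min_mag) ? LLONG_MIN : -(long long) mag;
    return 0;
  }
  if (mag > (unsigned long long) LLONG_MAX)
    return 1;
  *out = (long long) mag;
  return 0;
}
```

The boundary case is handled explicitly: a negative magnitude of exactly 0x8000000000000000 maps to LLONG_MIN, while the same magnitude with a positive sign is reported as an overflow.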