Commit graph

338 commits

Author SHA1 Message Date
Oleksandr Byelkin
ea75a0b600 Merge branch '11.4' into 11.5 2024-08-05 17:50:18 +02:00
Oleksandr Byelkin
1640c9b06e Merge branch '11.2' into 11.4 2024-08-04 17:27:48 +02:00
Oleksandr Byelkin
80abd847da Merge branch '10.11' into 11.1 2024-08-03 09:32:42 +02:00
Yuchen Pei
f071b7620b
Merge branch '10.5' into 10.6 2024-07-16 15:54:22 +08:00
Alexander Barkov
2f4b0ba328 Moving a part of sql_lex.h into other *.h files
- Lex_ident_cli* into a new file sql/lex_ident_cli.h
- Lex_ident_sys* into a new file sql/lex_ident_sys.h
- Well_formed_prefix into include/m_ctype.h

This change is needed to the optimizer hint parser coming soon.
2024-07-16 09:09:38 +04:00
Oleg Smirnov
405613ebb5 MDEV-34490 get_copy() and build_clone() may return an instance of an ancestor class instead of a copy/clone
The `Item` class methods `get_copy()`, `build_clone()`, and `clone_item()`
face an issue where they may be defined in a descendant class
(e.g., `Item_func`) but not in a further descendant (e.g., `Item_func_child`).
This can lead to scenarios where `build_clone()`, when operating on an
instance of `Item_func_child` with a pointer to the base class (`Item`),
returns an instance of `Item_func` instead of `Item_func_child`.

Since this limitation cannot be resolved at compile time, this commit
introduces runtime type checks for the copy/clone operations.
A debug assertion will now trigger in case of a type mismatch.

`get_copy()`, `build_clone()`, and `clone_item()` are no more virtual,
but virtual `do_get_copy()`, `do_build_clone()`, and `do_clone_item()`
are added to the protected section of the class `Item`.

Additionally, const qualifiers have been added to certain methods
to enhance code reliability.

Reviewer: Oleksandr Byelkin <sanja@mariadb.com>
2024-07-15 18:25:57 +07:00
Oleksandr Byelkin
dd7d9d7fb1 Merge branch '11.4' into 11.5 2024-05-23 17:01:43 +02:00
Oleksandr Byelkin
99b370e023 Merge branch '11.2' into 11.4 2024-05-21 19:38:51 +02:00
Alexander Barkov
310fd6ff69 Backporting bugs fixes fixed by MDEV-31340 from 11.5
The patch for MDEV-31340 fixed the following bugs:

MDEV-33084 LASTVAL(t1) and LASTVAL(T1) do not work well with lower-case-table-names=0
MDEV-33085 Tables T1 and t1 do not work well with ENGINE=CSV and lower-case-table-names=0
MDEV-33086 SHOW OPEN TABLES IN DB1 -- is case insensitive with lower-case-table-names=0
MDEV-33088 Cannot create triggers in the database `MYSQL`
MDEV-33103 LOCK TABLE t1 AS t2 -- alias is not case sensitive with lower-case-table-names=0
MDEV-33108 TABLE_STATISTICS and INDEX_STATISTICS are case insensitive with lower-case-table-names=0
MDEV-33109 DROP DATABASE MYSQL -- does not drop SP with lower-case-table-names=0
MDEV-33110 HANDLER commands are case insensitive with lower-case-table-names=0
MDEV-33119 User is case insensitive in INFORMATION_SCHEMA.VIEWS
MDEV-33120 System log table names are case insensitive with lower-cast-table-names=0

Backporting the fixes from 11.5 to 10.5
2024-05-21 14:58:01 +04:00
Marko Mäkelä
fec2fd6add Merge 10.11 into 11.0 2024-03-28 10:51:36 +02:00
Marko Mäkelä
ccb7a1e9a1 Merge 10.5 into 10.6 2024-03-27 15:00:56 +02:00
Alexander Barkov
0fc123c595 MDEV-33772 Bad SEPARATOR value in GROUP_CONCAT on character set conversion
Item_func_group_concat::print() did not take into account
that Item_func_group_concat::separator can be of a different character set
than the "String *str" (when the printing is being done to).
Therefore, printing did not work correctly for:
- non-ASCII separators when GROUP_CONCAT is done on 8bit data
  or multi-byte data with mbminlen==1.
- all separators (even including simple ones like comma)
  when GROUP_CONCAT is done on ucs2/utf16/utf32 data (mbminlen>1).

Because of this problem, VIEW definitions did not print correctly to
their FRM files. This later led to a wrong SELECT and SHOW CREATE output.

Fix:

- Adding new String methods:

  bool append_for_single_quote_using_mb_wc(const char *str, size_t length,
                                           CHARSET_INFO *cs);

  bool append_for_single_quote_opt_convert(const char *str,
                                           size_t length,
                                           CHARSET_INFO *cs)

  which perform both escaping and character set conversion at the same time.

- Adding a new String method escaped_wc_for_single_quote(),
  to reuse the code between the old and the new methods.

- Fixing Item_func_group_concat::print() to use the new
  method append_for_single_quote_opt_convert().
2024-03-27 15:22:58 +04:00
Alexander Barkov
929c2e06aa MDEV-31531 Remove my_casedn_str() and my_caseup_str()
Under terms of MDEV 27490 we'll add support for non-BMP identifiers
and upgrade casefolding information to Unicode version 14.0.0.
In Unicode-14.0.0 conversion to lower and upper cases can increase octet length
of the string, so conversion won't be possible in-place any more.

This patch removes virtual functions performing in-place casefolding:
  - my_charset_handler_st::casedn_str()
  - my_charset_handler_st::caseup_str()
and fixes the code to use the non-inplace functions instead:
  - my_charset_handler_st::casedn()
  - my_charset_handler_st::caseup()
2024-02-28 22:20:29 +04:00
Oleksandr Byelkin
34272bd6a5 Merge branch '11.2' into 11.3 2023-11-14 18:33:03 +01:00
Oleksandr Byelkin
b83c379420 Merge branch '10.5' into 10.6 2023-11-08 15:57:05 +01:00
Oleksandr Byelkin
6cfd2ba397 Merge branch '10.4' into 10.5 2023-11-08 12:59:00 +01:00
Alexander Barkov
e2da748c29 MDEV-28835 Assertion `(length % 4) == 0' failed in my_lengthsp_utf32 on INSERT
Problem:

Item_func_date_format::val_str() and make_date_time() did not take into
account that the format string and the result string
(separately or at the same time) can be of a tricky character set
like UCS2, UTF16, UTF32. As a result, DATE_FORMAT() could generate
an ill-formed result which crashed on DBUG_ASSERTs testing well-formedness
in other parts of the code.

Fix:

1. class String changes
   Removing String::append_with_prefill(). It was not compatible with
   tricky character sets. Also it was inconvenient to use and required
   too much duplicate code on the caller side.
   Adding String::append_zerofill() instead. It's compatible with tricky
   character sets and is easier to use.
   Adding helper methods Static_binary_string::q_append_wc() and
   String::append_wc(), to append a single wide character
   (a Unicode code point in my_wc_t).

2. storage/spider changes
   Removing spider_string::append_with_prefill().
   It used String::append_with_prefix() inside, but it was unused itself.

3. Changing tricky charset incompatible code pieces in make_date_time()
   to compatible replacements:

   - Fixing the loop scanning the format string to iterate in terms
     of Unicode code points (using mb_wc()) rather than in terms
     of "char" items.
   - Using append_wc(my_wc_t) instead of append(char) to append
     a single character to the result string.
   - Using append_zerofill() instead of append_with_prefill() to
     append date/time numeric components to the result string.
2023-10-04 08:51:48 +04:00
Alexander Barkov
8951f7d940 MDEV-31992 Automatic conversion from LEX_STRING to LEX_CSTRING
- Adding automatic conversion operator from LEX_STRING to LEX_CSTRING
  Now a LEX_STRING can be passed directly to any function expecting
  a LEX_CSTRING parameter passed by value or by reference.
- Removing a number of duplicate methods accepting LEX_STRING.
  Now the code used the LEX_CSTRING version.
2023-08-23 15:30:06 +04:00
Sergei Golubchik
a4817e1520 cleanup: String::strstr() const 2023-07-04 16:37:29 +02:00
Marko Mäkelä
5bada1246d Merge 10.5 into 10.6 2023-04-11 16:15:19 +03:00
Oleksandr Byelkin
ac5a534a4c Merge remote-tracking branch '10.4' into 10.5 2023-03-31 21:32:41 +02:00
Alexander Barkov
965bdf3e66 MDEV-30746 Regression in ucs2_general_mysql500_ci
1. Adding a separate MY_COLLATION_HANDLER
   my_collation_ucs2_general_mysql500_ci_handler
   implementing a proper order for ucs2_general_mysql500_ci
   The problem happened because ucs2_general_mysql500_ci
   erroneously used my_collation_ucs2_general_ci_handler.

2. Cosmetic changes: Renaming:
   - plane00_mysql500 to my_unicase_mysql500_page00
   - my_unicase_pages_mysql500 to my_unicase_mysql500_pages
   to use the same naming style with:
   - my_unicase_default_page00
   - my_unicase_defaul_pages

3. Moving code fragments from
   - handler::check_collation_compatibility() in handler.cc
   - upgrade_collation() in table.cc
   into new methods in class Charset, to reuse the code easier.
2023-03-01 15:38:02 +04:00
Marko Mäkelä
6aec87544c Merge 10.5 into 10.6 2023-02-10 13:03:01 +02:00
Marko Mäkelä
c41c79650a Merge 10.4 into 10.5 2023-02-10 12:02:11 +02:00
Vicențiu Ciorbaru
08c852026d Apply clang-tidy to remove empty constructors / destructors
This patch is the result of running
run-clang-tidy -fix -header-filter=.* -checks='-*,modernize-use-equals-default' .

Code style changes have been done on top. The result of this change
leads to the following improvements:

1. Binary size reduction.
* For a -DBUILD_CONFIG=mysql_release build, the binary size is reduced by
  ~400kb.
* A raw -DCMAKE_BUILD_TYPE=Release reduces the binary size by ~1.4kb.

2. Compiler can better understand the intent of the code, thus it leads
   to more optimization possibilities. Additionally it enabled detecting
   unused variables that had an empty default constructor but not marked
   so explicitly.

   Particular change required following this patch in sql/opt_range.cc

   result_keys, an unused template class Bitmap now correctly issues
   unused variable warnings.

   Setting Bitmap template class constructor to default allows the compiler
   to identify that there are no side-effects when instantiating the class.
   Previously the compiler could not issue the warning as it assumed Bitmap
   class (being a template) would not be performing a NO-OP for its default
   constructor. This prevented the "unused variable warning".
2023-02-09 16:09:08 +02:00
Vicențiu Ciorbaru
c115559b66 Extend Binary_string::strstr to also take in a const char pointer
One shouldn't have to instantiate a Binary_string every time a strstr
call is needed.
2023-02-03 16:27:16 +02:00
Sergei Golubchik
900d7bf360 Merge branch '10.5' into 10.6 2022-10-02 22:14:21 +02:00
Sergei Golubchik
3a2116241b Merge branch '10.4' into 10.5 2022-10-02 14:38:13 +02:00
Alexander Barkov
1118e979c2 MDEV-29672 Add MTR tests covering key and key segment flags and types 2022-09-30 11:08:49 +04:00
Marko Mäkelä
44fd2c4b24 Merge 10.5 into 10.6 2022-09-20 16:53:20 +03:00
Marko Mäkelä
0792aff161 Merge 10.4 into 10.5 2022-09-20 13:17:02 +03:00
Marko Mäkelä
0c0a569028 Merge 10.3 into 10.4 2022-09-20 12:38:25 +03:00
Alexander Barkov
5dcc56be4d MDEV-29561 SHOW CREATE TABLE produces syntactically incorrect structure 2022-09-20 11:02:36 +04:00
Alexander Barkov
0e8544cd73 MDEV-29355 Backport templatized INET6 implementation from 10.7 to 10.6 2022-08-23 14:36:08 +04:00
Norio Akagi
84d26f98c7
MDEV-28315 Fix ASAN stack-buffer-overflow in String::copy_aligned
Starting since this commit 36cdd5c3cd
there is an ASAN stack-buffer-overflow error because we append a NULL
terminator beyond the length of memory allocated.

Reviewed by: Monty and Nayuta Yanagisawa
2022-08-01 20:27:33 +09:00
Sergei Golubchik
4c1ed54bfc fix Binary_string::c_ptr and c_ptr_safe
if the Ptr="abc", then str_length=3, and for a C ptr it needs Ptr[3]=0;
but it passes str_length+1 (=4) to realloc, and realloc allocates
arg_length+1 bytes (that is 5) and does Ptr[arg_length]= 0; (Ptr[4]=0)
2021-09-05 21:08:18 +02:00
Sergei Golubchik
3648b333c7 cleanup: formatting
also avoid an oxymoron of using `MYSQL_PLUGIN_IMPORT` under
`#ifdef MYSQL_SERVER`, and empty_clex_str is so trivial that a plugin
can define it if needed.
2021-06-11 13:02:55 +02:00
Monty
e45b54b75d Removed Static_binary_string
This did not server any real purpose and also made it too difficult to add
asserts for string memory overrwrites.

Moved all functionallity from Static_binary_string to Binary_string.

Other things:
- Added asserts to q_xxx and qs_xxx functions to check for memory overruns
- Fixed wrong test in String_buffer::set_buffer_if_not_allocated().
  The idea is to reuse allocated buffers (to avoid extra allocs), which
  the code did not do.
2021-05-19 22:54:12 +02:00
Monty
81d9bed3a4 MDEV-20017 Implement TO_CHAR() Oracle compatible function
TO_CHAR(expr, fmt)
- expr: required parameter, data/time/timestamp type expression
- fmt: optional parameter, format string, supports
  YYYY/YYY/YY/RRRR/RR/MM/MON/MONTH/MI/DD/DY/HH/HH12/HH24/SS and special
  characters. The default value is "YYYY-MM-DD HH24:MI:SS"

In Oracle, TO_CHAR() can also be used to convert numbers to strings, but
this is not supported. This will gave an error in this patch.

Other things:
- If format strings is a constant, it's evaluated only once and if there
  is any errors in it, they are given at once and the statement will abort.

Original author: woqutech
Lots of optimizations and cleanups done as part of review
2021-05-19 22:54:12 +02:00
Monty
949d10bea2 Don't reset StringBuffers in loops when not needed
- Moved out creating StringBuffers in loops and instead create them
  outside and just reset the buffer if it was not allocated (to avoid
  a possible malloc/free for every entry)

Other things related to set_buffer_if_not_allocated()
- Changed Valuebuffer to not call set_buffer_if_not_allocated() when
  it is created.
- Fixed geometry functions to reset string length before calling
  String::reserve().  This is because one should not access length()
  of an undefined.
- Added Item_func_conv_charset::save_in_field() as the item is using
  str_value to store cached values, which conflicts with
  Item::save_str_in_field().
- Changed Item_proc_string to not store the string value in sql_string
  as this clashes with Item::save_str_in_field().
- Locally store value of full_name_cstring() in analyse::end_of_records()
  as Item::save_str_in_field() may overwrite it.
- Marked some strings as set_thread_specific()
- Added String::free_buffer() to be used internally in String functions
  to just free the buffer but not reset other String values.
- Fixed uses_buffer_owned_by() to check for allocated length instead of
  strlength, which could be marked MEM_UNDEFINED().
2021-05-19 22:54:11 +02:00
Monty
a206658b98 Change CHARSET_INFO character set and collaction names to LEX_CSTRING
This change removed 68 explict strlen() calls from the code.

The following renames was done to ensure we don't use the old names
when merging code from earlier releases, as using the new variables
for print function could result in crashes:
- charset->csname renamed to charset->cs_name
- charset->name renamed to charset->coll_name

Almost everything where mechanical changes except:
- Changed to use the new Protocol::store(LEX_CSTRING..) when possible
- Changed to use field->store(LEX_CSTRING*, CHARSET_INFO*) when possible
- Changed to use String->append(LEX_CSTRING&) when possible

Other things:
- There where compiler issues with ensuring that all character set names
  points to the same string: gcc doesn't allow one to use integer constants
  when defining global structures (constant char * pointers works fine).
  To get around this, I declared defines for each character set name
  length.
2021-05-19 22:54:07 +02:00
Monty
b6ff139aa3 Reduce usage of strlen()
Changes:
- To detect automatic strlen() I removed the methods in String that
  uses 'const char *' without a length:
  - String::append(const char*)
  - Binary_string(const char *str)
  - String(const char *str, CHARSET_INFO *cs)
  - append_for_single_quote(const char *)
  All usage of append(const char*) is changed to either use
  String::append(char), String::append(const char*, size_t length) or
  String::append(LEX_CSTRING)
- Added STRING_WITH_LEN() around constant string arguments to
  String::append()
- Added overflow argument to escape_string_for_mysql() and
  escape_quotes_for_mysql() instead of returning (size_t) -1 on overflow.
  This was needed as most usage of the above functions never tested the
  result for -1 and would have given wrong results or crashes in case
  of overflows.
- Added Item_func_or_sum::func_name_cstring(), which returns LEX_CSTRING.
  Changed all Item_func::func_name()'s to func_name_cstring()'s.
  The old Item_func_or_sum::func_name() is now an inline function that
  returns func_name_cstring().str.
- Changed Item::mode_name() and Item::func_name_ext() to return
  LEX_CSTRING.
- Changed for some functions the name argument from const char * to
  to const LEX_CSTRING &:
  - Item::Item_func_fix_attributes()
  - Item::check_type_...()
  - Type_std_attributes::agg_item_collations()
  - Type_std_attributes::agg_item_set_converter()
  - Type_std_attributes::agg_arg_charsets...()
  - Type_handler_hybrid_field_type::aggregate_for_result()
  - Type_handler_geometry::check_type_geom_or_binary()
  - Type_handler::Item_func_or_sum_illegal_param()
  - Predicant_to_list_comparator::add_value_skip_null()
  - Predicant_to_list_comparator::add_value()
  - cmp_item_row::prepare_comparators()
  - cmp_item_row::aggregate_row_elements_for_comparison()
  - Cursor_ref::print_func()
- Removes String_space() as it was only used in one cases and that
  could be simplified to not use String_space(), thanks to the fixed
  my_vsnprintf().
- Added some const LEX_CSTRING's for common strings:
  - NULL_clex_str, DATA_clex_str, INDEX_clex_str.
- Changed primary_key_name to a LEX_CSTRING
- Renamed String::set_quick() to String::set_buffer_if_not_allocated() to
  clarify what the function really does.
- Rename of protocol function:
  bool store(const char *from, CHARSET_INFO *cs) to
  bool store_string_or_null(const char *from, CHARSET_INFO *cs).
  This was done to both clarify the difference between this 'store' function
  and also to make it easier to find unoptimal usage of store() calls.
- Added Protocol::store(const LEX_CSTRING*, CHARSET_INFO*)
- Changed some 'const char*' arrays to instead be of type LEX_CSTRING.
- class Item_func_units now used LEX_CSTRING for name.

Other things:
- Fixed a bug in mysql.cc:construct_prompt() where a wrong escape character
  in the prompt would cause some part of the prompt to be duplicated.
- Fixed a lot of instances where the length of the argument to
  append is known or easily obtain but was not used.
- Removed some not needed 'virtual' definition for functions that was
  inherited from the parent. I added override to these.
- Fixed Ordered_key::print() to preallocate needed buffer. Old code could
  case memory overruns.
- Simplified some loops when adding char * to a String with delimiters.
2021-05-19 22:27:48 +02:00
Monty
36cdd5c3cd Optimize usage of c_ptr(), c_ptr_quick() and String::alloc()
The problem was that when one used String::alloc() to allocate a string,
the String ensures that there is space for an extra NULL byte in the
buffer and if not, reallocates the string. This is a problem with the
String::set_int() that calls alloc(21), which forces extra
malloc/free calls to happen.

- We do not anymore re-allocate String if alloc() is called with the
  Allocated_length. This reduces number of malloc() allocations,
  especially one big re-allocation in Protocol::send_result_Set_metadata()
  for almost every query that produced a result to the connnected client.
- Avoid extra mallocs when using LONGLONG_BUFFER_SIZE
  This can now be done as alloc() doesn't increase buffers if new length is
  not bigger than old one.
- c_ptr() is redesigned to be safer (but a bit longer) than before.
- Remove wrong usage of c_ptr_quick()
  c_ptr_quick() was used in many cases to get the pointer to the used
  buffer, even when it didn't need to be \0 terminated. In this case
  ptr() is a better substitute.
  Another problem with c_ptr_quick() is that it did not guarantee that
  the string would be \0 terminated.
- item_val_str(), an API function not used currently by the server,
  now always returns a null terminated string (before it didn't always
  do that).
- Ensure that all String allocations uses STRING_PSI_MEMORY_KEY. The old
  mixed usage of performance keys caused assert's when String buffers
  where shrunk.
- Binary_string::shrink() is simplifed
- Fixed bug in String(const char *str, size_t len, CHARSET_INFO *cs) that
  used Binary_string((char *) str, len) instead of Binary_string(str,len).
- Changed argument to String() creations and String.set() functions to use
  'const char*' instead of 'char*'. This ensures that Alloced_length is
  not set, which gives safety against someone trying to change the
  original string. This also would allow us to use !Alloced_length in
  c_ptr() if needed.
- Changed string_ptr_cmp() to use memcmp() instead of c_ptr() to avoid
  a possible malloc during string comparision.
2021-05-19 22:27:27 +02:00
Varun Gupta
b4379df5b4 MDEV-21265: IN predicate conversion to IN subquery should be allowed for a broader set of datatype comparison
Allow materialization strategy when collations on the
inner and outer sides of an IN subquery are the same and the
character set of the inner side is a proper subset of the character
set on the outer side.
This allows conversion from utf8mb3 to utf8mb4
as the former is a subset of the later.
This is only allowed when IN predicate is converted to an IN subquery

Backported part of the patch (d6a00d9b18) of MDEV-17905.
2020-11-30 17:16:43 +05:30
Marko Mäkelä
03357ded17 Merge 10.4 into 10.5 2020-10-30 13:53:10 +02:00
Marko Mäkelä
cb253b8687 MDEV-22387: Static_binary_string::q_append() invokes memcpy on NULL
Invoking memcpy() on a NULL pointer is undefined behaviour
(even if the length is 0) and gives the compiler permission to
assume that the pointer is nonnull. Recent versions of GCC
(starting with version 8) are more aggressively optimizing away
checks for NULL pointers. This undefined behaviour would cause
a SIGSEGV in the test main.func_encrypt on an optimized debug build
on GCC 10.2.0.
2020-10-30 13:07:42 +02:00
Vicențiu Ciorbaru
85c686e2d1 cleanup: Static_binary_string need not take non-const double parameter
Convert the parameter to const as the function won't modify the pointer
value.
2020-10-28 11:38:14 +02:00
Marko Mäkelä
c3752cef3c Merge 10.2 into 10.3 2020-09-03 09:26:54 +03:00
Marko Mäkelä
2a93e632b1 Merge 10.1 into 10.2 2020-09-03 09:10:03 +03:00
Marko Mäkelä
94a520ddbe MDEV-22387: Do not pass null pointer to some memcpy()
Passing a null pointer to a nonnull argument is not only undefined
behaviour, but it also grants the compiler the permission to optimize
away further checks whether the pointer is null. GCC -O2 at least
starting with version 8 may do that, potentially causing SIGSEGV.

These problems were caught in a WITH_UBSAN=ON build with the
Bug#7024 test in main.view.
2020-09-03 09:05:56 +03:00