Commit graph

299 commits

Author SHA1 Message Date
Sergei Golubchik
4c1ed54bfc fix Binary_string::c_ptr and c_ptr_safe
if the Ptr="abc", then str_length=3, and for a C ptr it needs Ptr[3]=0;
but it passes str_length+1 (=4) to realloc, and realloc allocates
arg_length+1 bytes (that is 5) and does Ptr[arg_length]= 0; (Ptr[4]=0)
2021-09-05 21:08:18 +02:00
Sergei Golubchik
3648b333c7 cleanup: formatting
also avoid an oxymoron of using `MYSQL_PLUGIN_IMPORT` under
`#ifdef MYSQL_SERVER`, and empty_clex_str is so trivial that a plugin
can define it if needed.
2021-06-11 13:02:55 +02:00
Monty
e45b54b75d Removed Static_binary_string
This did not server any real purpose and also made it too difficult to add
asserts for string memory overrwrites.

Moved all functionallity from Static_binary_string to Binary_string.

Other things:
- Added asserts to q_xxx and qs_xxx functions to check for memory overruns
- Fixed wrong test in String_buffer::set_buffer_if_not_allocated().
  The idea is to reuse allocated buffers (to avoid extra allocs), which
  the code did not do.
2021-05-19 22:54:12 +02:00
Monty
81d9bed3a4 MDEV-20017 Implement TO_CHAR() Oracle compatible function
TO_CHAR(expr, fmt)
- expr: required parameter, data/time/timestamp type expression
- fmt: optional parameter, format string, supports
  YYYY/YYY/YY/RRRR/RR/MM/MON/MONTH/MI/DD/DY/HH/HH12/HH24/SS and special
  characters. The default value is "YYYY-MM-DD HH24:MI:SS"

In Oracle, TO_CHAR() can also be used to convert numbers to strings, but
this is not supported. This will gave an error in this patch.

Other things:
- If format strings is a constant, it's evaluated only once and if there
  is any errors in it, they are given at once and the statement will abort.

Original author: woqutech
Lots of optimizations and cleanups done as part of review
2021-05-19 22:54:12 +02:00
Monty
949d10bea2 Don't reset StringBuffers in loops when not needed
- Moved out creating StringBuffers in loops and instead create them
  outside and just reset the buffer if it was not allocated (to avoid
  a possible malloc/free for every entry)

Other things related to set_buffer_if_not_allocated()
- Changed Valuebuffer to not call set_buffer_if_not_allocated() when
  it is created.
- Fixed geometry functions to reset string length before calling
  String::reserve().  This is because one should not access length()
  of an undefined.
- Added Item_func_conv_charset::save_in_field() as the item is using
  str_value to store cached values, which conflicts with
  Item::save_str_in_field().
- Changed Item_proc_string to not store the string value in sql_string
  as this clashes with Item::save_str_in_field().
- Locally store value of full_name_cstring() in analyse::end_of_records()
  as Item::save_str_in_field() may overwrite it.
- Marked some strings as set_thread_specific()
- Added String::free_buffer() to be used internally in String functions
  to just free the buffer but not reset other String values.
- Fixed uses_buffer_owned_by() to check for allocated length instead of
  strlength, which could be marked MEM_UNDEFINED().
2021-05-19 22:54:11 +02:00
Monty
a206658b98 Change CHARSET_INFO character set and collaction names to LEX_CSTRING
This change removed 68 explict strlen() calls from the code.

The following renames was done to ensure we don't use the old names
when merging code from earlier releases, as using the new variables
for print function could result in crashes:
- charset->csname renamed to charset->cs_name
- charset->name renamed to charset->coll_name

Almost everything where mechanical changes except:
- Changed to use the new Protocol::store(LEX_CSTRING..) when possible
- Changed to use field->store(LEX_CSTRING*, CHARSET_INFO*) when possible
- Changed to use String->append(LEX_CSTRING&) when possible

Other things:
- There where compiler issues with ensuring that all character set names
  points to the same string: gcc doesn't allow one to use integer constants
  when defining global structures (constant char * pointers works fine).
  To get around this, I declared defines for each character set name
  length.
2021-05-19 22:54:07 +02:00
Monty
b6ff139aa3 Reduce usage of strlen()
Changes:
- To detect automatic strlen() I removed the methods in String that
  uses 'const char *' without a length:
  - String::append(const char*)
  - Binary_string(const char *str)
  - String(const char *str, CHARSET_INFO *cs)
  - append_for_single_quote(const char *)
  All usage of append(const char*) is changed to either use
  String::append(char), String::append(const char*, size_t length) or
  String::append(LEX_CSTRING)
- Added STRING_WITH_LEN() around constant string arguments to
  String::append()
- Added overflow argument to escape_string_for_mysql() and
  escape_quotes_for_mysql() instead of returning (size_t) -1 on overflow.
  This was needed as most usage of the above functions never tested the
  result for -1 and would have given wrong results or crashes in case
  of overflows.
- Added Item_func_or_sum::func_name_cstring(), which returns LEX_CSTRING.
  Changed all Item_func::func_name()'s to func_name_cstring()'s.
  The old Item_func_or_sum::func_name() is now an inline function that
  returns func_name_cstring().str.
- Changed Item::mode_name() and Item::func_name_ext() to return
  LEX_CSTRING.
- Changed for some functions the name argument from const char * to
  to const LEX_CSTRING &:
  - Item::Item_func_fix_attributes()
  - Item::check_type_...()
  - Type_std_attributes::agg_item_collations()
  - Type_std_attributes::agg_item_set_converter()
  - Type_std_attributes::agg_arg_charsets...()
  - Type_handler_hybrid_field_type::aggregate_for_result()
  - Type_handler_geometry::check_type_geom_or_binary()
  - Type_handler::Item_func_or_sum_illegal_param()
  - Predicant_to_list_comparator::add_value_skip_null()
  - Predicant_to_list_comparator::add_value()
  - cmp_item_row::prepare_comparators()
  - cmp_item_row::aggregate_row_elements_for_comparison()
  - Cursor_ref::print_func()
- Removes String_space() as it was only used in one cases and that
  could be simplified to not use String_space(), thanks to the fixed
  my_vsnprintf().
- Added some const LEX_CSTRING's for common strings:
  - NULL_clex_str, DATA_clex_str, INDEX_clex_str.
- Changed primary_key_name to a LEX_CSTRING
- Renamed String::set_quick() to String::set_buffer_if_not_allocated() to
  clarify what the function really does.
- Rename of protocol function:
  bool store(const char *from, CHARSET_INFO *cs) to
  bool store_string_or_null(const char *from, CHARSET_INFO *cs).
  This was done to both clarify the difference between this 'store' function
  and also to make it easier to find unoptimal usage of store() calls.
- Added Protocol::store(const LEX_CSTRING*, CHARSET_INFO*)
- Changed some 'const char*' arrays to instead be of type LEX_CSTRING.
- class Item_func_units now used LEX_CSTRING for name.

Other things:
- Fixed a bug in mysql.cc:construct_prompt() where a wrong escape character
  in the prompt would cause some part of the prompt to be duplicated.
- Fixed a lot of instances where the length of the argument to
  append is known or easily obtain but was not used.
- Removed some not needed 'virtual' definition for functions that was
  inherited from the parent. I added override to these.
- Fixed Ordered_key::print() to preallocate needed buffer. Old code could
  case memory overruns.
- Simplified some loops when adding char * to a String with delimiters.
2021-05-19 22:27:48 +02:00
Monty
36cdd5c3cd Optimize usage of c_ptr(), c_ptr_quick() and String::alloc()
The problem was that when one used String::alloc() to allocate a string,
the String ensures that there is space for an extra NULL byte in the
buffer and if not, reallocates the string. This is a problem with the
String::set_int() that calls alloc(21), which forces extra
malloc/free calls to happen.

- We do not anymore re-allocate String if alloc() is called with the
  Allocated_length. This reduces number of malloc() allocations,
  especially one big re-allocation in Protocol::send_result_Set_metadata()
  for almost every query that produced a result to the connnected client.
- Avoid extra mallocs when using LONGLONG_BUFFER_SIZE
  This can now be done as alloc() doesn't increase buffers if new length is
  not bigger than old one.
- c_ptr() is redesigned to be safer (but a bit longer) than before.
- Remove wrong usage of c_ptr_quick()
  c_ptr_quick() was used in many cases to get the pointer to the used
  buffer, even when it didn't need to be \0 terminated. In this case
  ptr() is a better substitute.
  Another problem with c_ptr_quick() is that it did not guarantee that
  the string would be \0 terminated.
- item_val_str(), an API function not used currently by the server,
  now always returns a null terminated string (before it didn't always
  do that).
- Ensure that all String allocations uses STRING_PSI_MEMORY_KEY. The old
  mixed usage of performance keys caused assert's when String buffers
  where shrunk.
- Binary_string::shrink() is simplifed
- Fixed bug in String(const char *str, size_t len, CHARSET_INFO *cs) that
  used Binary_string((char *) str, len) instead of Binary_string(str,len).
- Changed argument to String() creations and String.set() functions to use
  'const char*' instead of 'char*'. This ensures that Alloced_length is
  not set, which gives safety against someone trying to change the
  original string. This also would allow us to use !Alloced_length in
  c_ptr() if needed.
- Changed string_ptr_cmp() to use memcmp() instead of c_ptr() to avoid
  a possible malloc during string comparision.
2021-05-19 22:27:27 +02:00
Marko Mäkelä
03357ded17 Merge 10.4 into 10.5 2020-10-30 13:53:10 +02:00
Marko Mäkelä
cb253b8687 MDEV-22387: Static_binary_string::q_append() invokes memcpy on NULL
Invoking memcpy() on a NULL pointer is undefined behaviour
(even if the length is 0) and gives the compiler permission to
assume that the pointer is nonnull. Recent versions of GCC
(starting with version 8) are more aggressively optimizing away
checks for NULL pointers. This undefined behaviour would cause
a SIGSEGV in the test main.func_encrypt on an optimized debug build
on GCC 10.2.0.
2020-10-30 13:07:42 +02:00
Vicențiu Ciorbaru
85c686e2d1 cleanup: Static_binary_string need not take non-const double parameter
Convert the parameter to const as the function won't modify the pointer
value.
2020-10-28 11:38:14 +02:00
Marko Mäkelä
1c58748196 Merge 10.4 into 10.5 2020-08-10 21:38:55 +03:00
Marko Mäkelä
101ddc5e27 Merge mariadb-10.4.14 2020-08-10 20:37:52 +03:00
Alexander Barkov
fe555b9c5f MDEV-23415 Server crash or Assertion `dec_length <= str_length' failed in Item_func_format::val_str_ascii
Problem:

The crash happened in FORMAT(double, dec>=31, 'de_DE').

The patch for MDEV-23118 (commit 0041dacc1b)
did not take into account that String::set_real() has a limit of 31
(FLOATING_POINT_DECIMALS) fractional digits. So for the range of 31..38
digits, set_real() switches to use:
- my_fcvt() - decimal point notation, e.g. 1.9999999999
- my_gcvt() - scientific notation,    e.g. 1e22

my_gcvt() returned a shorter string than Item_func_format::val_str_ascii()
expected to get after the my_fcvt() call, so it crashed on assert.

Solution:

We cannot extend set_real() to use the my_fcvt() mode for the range of
31..38 fractional digits, because set_real() is used in a lot of places
and such a change will break everything.

Introducing String::set_fcvt() which always prints using my_fcvt()
for the whole range of decimals 0..38, supported by the FORMAT() function.
2020-08-08 09:44:31 +04:00
Sergei Golubchik
cd2924bacb MDEV-23330 Server crash or ASAN negative-size-param in my_strnncollsp_binary / SORT_FIELD_ATTR::compare_packed_varstrings
and
MDEV-23414 Assertion `res->charset() == item->collation.collation' failed in Type_handler_string_result::make_packed_sort_key_part

pack_sort_string() *must* take a collation from the Item, not from the
String value. Because when casting a string to _binary the original
String is not copied for performance reasons, it's reused but its
collation does not match Item's collation anymore.

Note, that String's collation cannot be simply changed to _binary,
because for an Item_string literal the original String must stay
unchanged for the duration of the query.

this partially reverts 61c15ebe32
2020-08-07 13:39:04 +02:00
Oleksandr Byelkin
57325e4706 Merge branch '10.3' into 10.4 2020-08-03 14:44:06 +02:00
Alexander Barkov
d63631c3fa MDEV-19632 Replication aborts with ER_SLAVE_CONVERSION_FAILED upon CREATE ... SELECT in ORACLE mode
- Adding optional qualifiers to data types:
    CREATE TABLE t1 (a schema.DATE);
  Qualifiers now work only for three pre-defined schemas:

    mariadb_schema
    oracle_schema
    maxdb_schema

  These schemas are virtual (hard-coded) for now, but may turn into real
  databases on disk in the future.

- mariadb_schema.TYPE now always resolves to a true MariaDB data
  type TYPE without sql_mode specific translations.

- oracle_schema.DATE translates to MariaDB DATETIME.

- maxdb_schema.TIMESTAMP translates to MariaDB DATETIME.

- Fixing SHOW CREATE TABLE to use a qualifier for a data type TYPE
  if the current sql_mode translates TYPE to something else.

The above changes fix the reported problem, so this script:

    SET sql_mode=ORACLE;
    CREATE TABLE t2 AS SELECT mariadb_date_column FROM t1;

is now replicated as:

    SET sql_mode=ORACLE;
    CREATE TABLE t2 (mariadb_date_column mariadb_schema.DATE);

and the slave can unambiguously treat DATE as the true MariaDB DATE
without ORACLE specific translation to DATETIME.

Similar,

    SET sql_mode=MAXDB;
    CREATE TABLE t2 AS SELECT mariadb_timestamp_column FROM t1;

is now replicated as:

    SET sql_mode=MAXDB;
    CREATE TABLE t2 (mariadb_timestamp_column mariadb_schema.TIMESTAMP);

so the slave treats TIMESTAMP as the true MariaDB TIMESTAMP
without MAXDB specific translation to DATETIME.
2020-08-01 07:43:50 +04:00
Monty
61c15ebe32 Remove String::lex_string() and String::lex_cstring()
- Better to use 'String *' directly.
- Added String::get_value(LEX_STRING*) for the few cases where we want to
  convert a String to LEX_CSTRING.

Other things:
- Use StringBuffer for some functions to avoid mallocs
2020-07-23 10:54:32 +03:00
Marko Mäkelä
fbe2712705 Merge 10.4 into 10.5
The functional changes of commit 5836191c8f
(MDEV-21168) are omitted due to MDEV-742 having addressed the issue.
2020-04-25 21:57:52 +03:00
Marko Mäkelä
af91266498 Merge 10.3 into 10.4
In main.index_merge_myisam we remove the test that was added in
commit a2d24def8c because
it duplicates the test case that was added in
commit 5af12e4635.
2020-04-16 12:12:26 +03:00
Eugene Kosov
6577a7a8f2 fix tests related to SQL comment length
tests are:
engines/funcs.jp_comment_column
engines/funcs.jp_comment_index
engines/funcs.jp_comment_table
2020-04-15 20:47:44 +03:00
Marko Mäkelä
37c14690fc Merge 10.4 into 10.5 2020-03-30 19:07:25 +03:00
Marko Mäkelä
e2f1f88fa6 Merge 10.3 into 10.4 2020-03-30 14:50:23 +03:00
Marko Mäkelä
1a9b6c4c7f Merge 10.2 into 10.3 2020-03-30 11:12:56 +03:00
Eugene Kosov
0b00c1a22f MDEV-22005 UBSAN: applying non-zero offset 2 to null pointer in my_charpos_mb()
Empty comment has a correct length.
2020-03-26 18:33:47 +03:00
Sergei Golubchik
7c58e97bf6 perfschema memory related instrumentation changes 2020-03-10 19:24:22 +01:00
Alexander Barkov
f1e13fdc8d MDEV-21581 Helper functions and methods for CHARSET_INFO 2020-01-28 12:29:23 +04:00
Alexander Barkov
d30dbaa20d A cleanup for MDEV-8844: Fixing compilation failure on Windows
Fixing lossy type conversions:
- from int64 to int
- from size_t to uint
2019-12-07 19:12:04 +04:00
Alexander Barkov
3c6065a270 MDEV-8844 Unreadable control characters printed as is in warnings 2019-12-06 18:51:05 +04:00
Aleksey Midenkov
0c05a2ed71 Merge 10.4 into 10.5 2019-11-25 17:24:09 +03:00
Marko Mäkelä
f9ceb0a67f MDEV-20190 Instant operation fails when add column and collation change on non-indexed column
We must relax too strict debug assertions. For latin1_swedish_ci,
mtype=DATA_CHAR or mtype=DATA_VARCHAR will be used instead of
mtype=DATA_MYSQL or mtype=DATA_VARMYSQL. Likewise, some changes of
dtype_get_charset_coll() do not affect the data type encoding,
but only any indexes that are defined on the column.

Charset::same_encoding(): Check whether two charset-collations have
the same character set encoding.

dict_col_t::same_encoding(): Check whether two character columns
have the same character set encoding.

dict_col_t::same_type(): Check whether two columns have a compatible
data type encoding.

dict_col_t::same_format(), dict_table_t::instant_column(): Do not
compare mtype or the charset-collation of prtype directly.
Rely on dict_col_t::same_type() instead.

dtype_get_charset_coll(): Narrow the return type to uint16_t.

This is a refined version of a fix that was developed by
Thirunarayanan Balathandayuthapani.
2019-11-25 15:26:22 +02:00
Alexander Barkov
9a833dc688 MDEV-20856 Bad values in metadata views for partitions on VARBINARY
The old code to print partition values was too complicated:
- it created new Items for character set conversion purposes.
- it mixed string conversion and partition error reporting
  in the same code blocks.

Simplifying the code as follows:

- Adding helper methods String::can_be_safely_convert_to() and
  String::append_introducer_and_hex().

- Adding DBUG_EXECUTE_IF("generate_partition_syntax_for_frm",  push_warning...)
  into generate_partition_syntax_for_frm(), to test the PARTITON
  clause written to FRM. Adding test partition_utf8-debug.test for this.

- Removing functions get_cs_converted_part_value_from_string() and
  get_cs_converted_string_value. Changing get_partition_column_description()
  to use Type_handler::partition_field_append_value() instead.
  This makes SHOW CREATE TABLE and SELECT FROM I_S.PARTITIONS
  use the same code path.

- Changing Type_handler::partition_field_append_value() not to
  call convert_charset_partition_constant(), to avoid creating a new Item
  for string conversion pursposes.
  Rewritting the code to use only String methods.

- Removing error reporting code (ER_PARTITION_FUNCTION_IS_NOT_ALLOWED)
  from Type_handler::partition_field_append_value().
  The error is correctly detected and reported on the caller level.
  So error reporting was redundant here.

Also:

- Moving methods Type_handler::partition_field_*() from sql_partition.cc
  to sql_type.cc. This fixes compilation problem with -DPLUGIN_PARTITION=NO,
  earlier introduced by the patch for MDEV-20831.
2019-10-18 10:15:17 +04:00
Marko Mäkelä
4081b7b27a Merge 10.4 into 10.5 2019-09-06 17:16:40 +03:00
Sergei Golubchik
244f0e6dd8 Merge branch '10.3' into 10.4 2019-09-06 11:53:10 +02:00
Alexander Barkov
7e08ac0b41 Merge 10.2 (up to commit ef00ac4c86) into 10.3 2019-09-04 10:19:58 +04:00
Alexander Barkov
dc719597ee MDEV-18156 Assertion 0' failed or btr_validate_index(index, 0, false)' in row_upd_sec_index_entry or error code 126: Index is corrupted upon DELETE with PAD_CHAR_TO_FULL_LENGTH
This change takes into account a column's GENERATED ALWAYS AS
expression dependcy on sql_mode's PAD_CHAR_TO_FULL_LENGTH and
NO_UNSIGNED_SUBTRACTION flags.

Indexed virtual columns as well as persistent generated columns are
now not allowed to have such dependencies to avoid inconsistent data
or index files on sql_mode changes.
So an error is now returned in cases like this:

  CREATE OR REPLACE TABLE t1
  (
    a CHAR(5),
    v VARCHAR(5) AS (a) PERSISTENT -- CHAR->VARCHAR or CHAR->TEXT = ERROR
  );

Functions RPAD() and RTRIM() can now remove dependency on
PAD_CHAR_TO_FULL_LENGTH. So this can be used instead:

  CREATE OR REPLACE TABLE t1
  (
    a CHAR(5),
    v VARCHAR(5) AS (RTRIM(a)) PERSISTENT
  );

Note, unlike CHAR->VARCHAR and CHAR->TEXT this still works,
not RPAD(a) is needed:

  CREATE OR REPLACE TABLE t1
  (
    a CHAR(5),
    v CHAR(5) AS (a) PERSISTENT -- CHAR->CHAR is OK
  );

More sql_mode flags may affect values of generated columns.
They will be addressed separately.

See comments in sql_mode.h for implementation details.
2019-09-03 05:34:53 +04:00
Alexander Barkov
1517087b54 MDEV-20042 Implement EXTRA2_FIELD_DATA_TYPE_INFO in FRM 2019-07-11 21:51:18 +04:00
Monty
1a41fc77dd Merge remote-tracking branch 'origin/10.4' into 10.5 2019-06-27 01:21:41 +03:00
Alexander Barkov
c62eaa7bdf MDEV-19843 Modify ST_FIELD_INFO to use Type_handler and LEX_CSTRING 2019-06-24 06:25:16 +04:00
Eugene Kosov
854c219a7f MDEV-17301 Change of COLLATE unnecessarily requires ALGORITHM=COPY
Patch is about two cases:
1) On some collate changes it's possible to rebuild only secondary indexes
2) For non-indexed columns collate can be changed INSTANTly

Implemented mostly in Field_{string,varstring,blob}::is_equal().
Make this method return how exactly collationa differs.
This information is later used by fill_alter_inplace_info() to pass
correct info to engine.
2019-06-22 14:09:12 +03:00
Varun Gupta
8b576616b4 MDEV-19776: Assertion `to_len >= 8' failed in convert_to_printable with optimizer trace enabled
Introduced the convert_to_printable_required_length to return the correct length(taking into
consideration of dots in the case of error messages).
2019-06-20 12:03:32 +05:30
Marko Mäkelä
984d7100cd Merge 10.4 into 10.5 2019-06-13 18:36:09 +03:00
Varun
a0cb7551a4 MDEV-18880: Optimizer trace prints date in hexadecimal
Introduced a print_key_value function to makes sure that the trace prints data in readable format
for readable characters and the rest of the characters are printed as hexadecimal.
2019-06-11 15:44:58 +05:30
Alexander Barkov
f42bda6d75 MDEV-19727 Add Type_handler::Key_part_spec_init_ft 2019-06-11 07:54:37 +04:00
Oleksandr Byelkin
c07325f932 Merge branch '10.3' into 10.4 2019-05-19 20:55:37 +02:00
Alexander Barkov
c59d6395a6 A joint patch for MDEV-19284 and MDEV-19285 (INSTANT ALTER)
This patch fixes:

- MDEV-19284 INSTANT ALTER with ucs2-to-utf16 conversion produces bad data
- MDEV-19285 INSTANT ALTER from ascii_general_ci to latin1_general_ci produces corrupt data

These regressions were introduced in 10.4.3 by:
- MDEV-15564 Avoid table rebuild in ALTER TABLE on collation or charset changes

Changes:

1. Cleanup: Adding a helper method
   Field_longstr::csinfo_change_allows_instant_alter(),
   to remove some duplicate code in field.cc.

2. Cleanup: removing Type_handler::Charsets_are_compatible() and static
   function charsets_are_compatible() and
   introducing new methods in the recently added class Charset instead:
   - encoding_allows_reinterpret_as()
   - encoding_and_order_allow_reinterpret_as()

3. Bug fix: Removing the code that allowed instant conversion for
   ascii-to->8bit and ucs2-to->utf16.
   This actually fixes MDEV-19284 and MDEV-19285.

4. Bug fix: Adding a helper method Charset::collation_specific_name().
   The old corresponding code in Type_handler::Charsets_are_compatible()
   was not safe against (badly named) user-defined collations whose
   character set name can be longer than collation name.
2019-05-16 16:20:25 +04:00
Marko Mäkelä
be85d3e61b Merge 10.2 into 10.3 2019-05-14 17:18:46 +03:00
Marko Mäkelä
26a14ee130 Merge 10.1 into 10.2 2019-05-13 17:54:04 +03:00
Vicențiu Ciorbaru
cb248f8806 Merge branch '5.5' into 10.1 2019-05-11 22:19:05 +03:00
Vicențiu Ciorbaru
5543b75550 Update FSF Address
* Update wrong zip-code
2019-05-11 21:29:06 +03:00