Commit graph

241 commits

Author SHA1 Message Date
Rucha Deodhar
25219920f5 MDEV-28762: recursive call of some json functions without stack control
Fixup to MDEV-28762. Fixes warnings about unused variable "stack_used_up"
during building with RelWithDebInfo
2022-07-29 22:09:45 +05:30
Rucha Deodhar
bfdc4ff22e MDEV-28762: recursive call of some json functions without stack control
Fixup to MDEV-28762. Fixes warnings about unused variable "stack_used_up"
during building with RelWithDebInfo
2022-07-29 16:34:57 +05:30
Oleksandr Byelkin
cbcc0101ee MDEV-29188 Crash in JSON_EXTRACT
If we have null_value set then decimal/string value/result shoud be 0 pointer.
2022-07-29 09:03:54 +02:00
Marko Mäkelä
f53f64b7b9 Merge 10.8 into 10.9 2022-07-28 10:47:33 +03:00
Marko Mäkelä
f79cebb4d0 Merge 10.7 into 10.8 2022-07-28 10:33:26 +03:00
Marko Mäkelä
742e1c727f Merge 10.6 into 10.7 2022-07-27 18:26:21 +03:00
Marko Mäkelä
30914389fe Merge 10.5 into 10.6 2022-07-27 17:52:37 +03:00
Marko Mäkelä
098c0f2634 Merge 10.4 into 10.5 2022-07-27 17:17:24 +03:00
Oleksandr Byelkin
3bb36e9495 Merge branch '10.3' into 10.4 2022-07-27 11:02:57 +02:00
Marko Mäkelä
3bf10012e0 MDEV-28762: Fixup for clang
Unlike GCC, clang could optimize away alloca() and thus the
ALLOCATE_MEM_ON_STACK() instrumentation. To make it harder, let us
invoke a non-inline function on the entire allocated buffer.
2022-07-26 11:00:11 +03:00
Rucha Deodhar
46ff660883 This commit is a fixup for MDEV-28762 2022-07-25 22:13:40 +05:30
Rucha Deodhar
d8c2eeeb59 MDEV-28762: recursive call of some json functions without stack control
This commit is fixup for MDEV-28762

Analysis: Some recursive json functions dont check for stack control
Fix: Add check_stack_overrun(). The last argument is NULL because it is not
used
2022-07-25 22:12:27 +05:30
Rucha Deodhar
95989e8211 MDEV-28762: recursive call of some json functions without stack control
This commit is a fixup for MDEV-28762

    Analysis: Some recursive json functions dont check for stack control
    Fix: Add check_stack_overrun(). The last argument is NULL because it is not
    used
2022-07-23 23:02:12 +05:30
Rucha Deodhar
5ad14ab272 MDEV-28762: recursive call of some json functions without stack control
Analysis: Some recursive json functions dont check for stack control
Fix: Add check_stack_overrun(). The last argument is NULL because it is not
used
2022-07-20 22:54:30 +05:30
Rucha Deodhar
dbe39f14fe MDEV-28762: recursive call of some json functions without stack control
Analysis: Some recursive json functions dont check for stack control
Fix: Add check_stack_overrun(). The last argument is NULL because it is not
used
2022-07-20 19:24:48 +05:30
qggcs
ba5b2e7b29 json_type should consider the charset mbmaxlen 2022-06-30 12:45:28 +05:30
Rucha Deodhar
4730a6982f MDEV-28350: Test failing on buildbot with UBSAN
Analysis: There were two kinds of failing tests on buildbot with UBSAN.
1) runtime error: signed integer overflow and
2) runtime error: load of value is not valid value for type
Signed integer overflow was occuring because addition of two integers
(size of json array + item number in array) was causing overflow in
json_path_parts_compare. This overflow happens because a->n_item_end
wasn't set.
The second error was occuring because c_path->p.types_used is not
initialized but the value is used later on to check for negative path index.
Fix: For signed integer overflow, use a->n_item_end only in case of range
so that it is set.
2022-04-26 13:59:43 +05:30
Rucha Deodhar
c781cefd8a MDEV-27911: Implement range notation for json path
Range can be thought about in similar manner as wildcard (*) where
more than one elements are processed. To implement range notation, extended
json parser to parse the 'to' keyword and added JSON_PATH_ARRAY_RANGE for
path type. If there is 'to' keyword then use JSON_PATH_ARRAY range for
path type along with existing type.
This new integer to store the end index of range is n_item_end.
When there is 'to' keyword, store the integer in n_item_end else store in
n_item.
2022-04-15 01:02:44 +05:30
Rucha Deodhar
dfcbb30a92 MDEV-22224: Support JSON Path negative index
This patch can be viewed as combination of two parts:
1) Enabling '-' in the path so that the parser does not give out a warning.
2) Setting the negative index to a correct value and returning the
   appropriate value.

1) To enable using the negative index in the path:
To make the parser not return warning when negative index is used in path
'-' needs to be allowed in json path characters. P_NEG is added
to enable this and is made recognizable by setting the 45th index of
json_path_chr_map[] to P_NEG (instead of previous P_ETC)
because 45 corresponds to '-' in unicode.
When the path is being parsed and '-' is encountered, the parser should
recognize it as parsing '-' sign, so a new json state PS_NEG is required.
When the state is PS_NEG, it means that a negative integer is
going to be parsed so set is_negative_index of current step to 1 and
n_item is set accordingly when integer is encountered after '-'.
Next proceed with parsing rest of the path and get the correct path.
Next thing is parsing the json and returning correct value.

2) Setting the negative index to a correct value and returning the value:
While parsing json if we encounter array and the path step for the array
is a negative index (n_item < 0), then we can count the number of elements
in the array and set n_item to correct corresponding value. This is done in
json_skip_array_and_count.
2022-04-13 21:16:32 +05:30
Rucha Deodhar
3eb1e11d8a MDEV-23479: Add a THD* argument to Item_func_or_sum::fix_length_and_dec()
Fix: Added THD *thd argument in Item_func_or_sum::fix_length_and_dec() and in
fix_length_and_dec() for all derived classes of Item_func_or_sum.
2022-03-30 17:00:17 +05:30
Rucha Deodhar
12abe61af4 MDEV-27990: Incorrect behavior of JSON_OVERLAPS() on warning
Analysis: In case of error while processing json document, we goto
error label which eventually return 1 instead of 0.
Fix: Return 0 in case of error instead of 1.
2022-03-30 15:09:01 +05:30
Rucha Deodhar
a653dde279 MDEV-27677: Implement JSON_OVERLAPS()
1) When at least one of the two json documents is of scalar type:
     1.a) If value and json document both are scalar, then return true
          if they have same type and value.
     1.b) If json document is scalar but other is array (or vice versa),
          then return true if array has at least one element of same type
          and value as scalar.
     1.c) If one is scalar and other is object, then return false because
          it can't be compared.

  2) When both arguments are of non-scalar type and below conditons
      are satisfied then return true:
      2.a) When both arguments are arrays:
           Iterate over the value and json document. If there exists at
           least one element in other array of same type and value as
           that of element in value.
      2.b) If both arguments are objects:
           Iterate over value and json document and if there exists at least
           one key-value pair common between two objects.
      2.c) If either of json document or value is array and other is object:
           Iterate over the array, if an element of type object is found,
           then compare it with the object (which is the other arguemnt).
           If the entire object matches i.e all they key value pairs match.
2022-03-30 15:09:01 +05:30
Oleksandr Byelkin
9ed8deb656 Merge branch '10.6' into 10.7 2022-02-04 14:11:46 +01:00
Oleksandr Byelkin
f5c5f8e41e Merge branch '10.5' into 10.6 2022-02-03 17:01:31 +01:00
Sergei Golubchik
bc10f58a58 MDEV-24909 JSON functions don't respect KILL QUERY / max_statement_time limit
pass the pointer to thd->killed down to the json library,
check it while scanning,
use thd->check_killed() to generate the proper error message
2022-01-30 12:07:31 +01:00
Sergei Golubchik
bae9fb5904 MDEV-24909 JSON functions don't respect KILL QUERY / max_statement_time limit
preallocate the string in json_nice()
to avoid reallocs on every 1-character append()
2022-01-30 12:07:16 +01:00
Sergei Golubchik
3add51b769 cleanup: move the common code into the function 2022-01-30 12:07:16 +01:00
Alexander Barkov
e4b302e436 MDEV-27018 IF and COALESCE lose "json" property
Hybrid functions (IF, COALESCE, etc) did not preserve the JSON property
from their arguments. The same problem was repeatable for single row subselects.

The problem happened because the method Item::is_json_type() was inconsistently
implemented across the Item hierarchy. For example, Item_hybrid_func
and Item_singlerow_subselect did not override is_json_type().

Solution:

- Removing Item::is_json_type()

- Implementing specific JSON type handlers:
  Type_handler_string_json
  Type_handler_varchar_json
  Type_handler_tiny_blob_json
  Type_handler_blob_json
  Type_handler_medium_blob_json
  Type_handler_long_blob_json

- Reusing the existing data type infrastructure to pass JSON
  type handlers across all item types, including classes Item_hybrid_func
  and Item_singlerow_subselect. Note, these two classes themselves do not
  need any changes!

- Extending the data type infrastructure so data types can inherit
  their properties (e.g. aggregation rules) from their base data types.
  E.g. VARCHAR/JSON acts as VARCHAR, LONGTEXT/JSON acts as LONGTEXT
  when mixed to a non-JSON data type. This is done by:
    - adding virtual method Type_handler::type_handler_base()
    - adding a helper class Type_handler_pair
    - refactoring Type_handler_hybrid_field_type methods
      aggregate_for_result(), aggregate_for_min_max(),
      aggregate_for_num_op() to use Type_handler_pair.

This change also fixes:

  MDEV-27361 Hybrid functions with JSON arguments do not send format metadata

Also, adding mtr tests for JSON replication. It was not covered yet.
And the current patch changes the replication code slightly.
2022-01-21 19:28:48 +04:00
Marko Mäkelä
71d4ecf182 Merge 10.6 into 10.7 2021-10-22 14:41:47 +03:00
Marko Mäkelä
73f5cbd0b6 Merge 10.5 into 10.6 2021-10-21 16:06:34 +03:00
Marko Mäkelä
5f8561a6bc Merge 10.4 into 10.5 2021-10-21 15:26:25 +03:00
Marko Mäkelä
489ef007be Merge 10.3 into 10.4 2021-10-21 14:57:00 +03:00
Marko Mäkelä
e4a7c15dd6 Merge 10.2 into 10.3 2021-10-21 13:41:04 +03:00
Alexey Botchkov
1a54cf62f8 MDEV-24585 Assertion `je->s.cs == nice_js->charset()' failed in json_nice.
We should set the charset in
Item_func_json_format::fix_length_and_dec().
2021-10-19 11:00:56 +04:00
Vicențiu Ciorbaru
5982734eac Fix json_normalize asan
String::c_ptr() assumes the string is null terminated, but not all
results of val_str are, so use ptr() and length() instead.
2021-08-06 11:29:56 +03:00
Eric Herman
593885f785 MDEV-23143 Add JSON_EQUALS function
This patch implements JSON_EQUALS SQL function.  The function takes
advantage of the json_normalize functionality and does the following:

norm_a = json_normalize(a)
norm_b = json_normalize(b)
return strcmp(norm_a, norm_b)

Co-authored-by: Vicențiu Ciorbaru <vicentiu@mariadb.org>
2021-07-21 16:32:11 +03:00
Eric Herman
fcde341764 MDEV-16375 Function to normalize a json value
This patch implements JSON_NORMALIZE SQL function.

Co-authored-by: Vicențiu Ciorbaru <vicentiu@mariadb.org>
2021-07-21 16:32:11 +03:00
Eric Herman
7b587fcbe7 fix json typo s/UNINITALIZED/UNINITIALIZED/ 2021-07-21 16:32:11 +03:00
Sergei Golubchik
b62672af72 MDEV-26054 Server crashes in Item_func_json_arrayagg::get_str_from_field
Revert "fix JSON_ARRAYAGG not to over-quote json in joins"
This removes 8711adb786 but keeps the test case.
A different fix is coming up.

Because args can be Item_field's that are later
replaced by Item_direct_view_ref to the actual field.
While Item_field preserved in orig_args will stay unfixed
with item->field==NULL and no metadata
2021-06-30 22:08:53 +02:00
Sergei Golubchik
8711adb786 fix JSON_ARRAYAGG not to over-quote json in joins
use metadata (in particular is_json() property) of the original
argument item, even if the actual argument was later replaced
with an Item_temptable_field
2021-06-30 09:34:27 +02:00
Marko Mäkelä
b11aa0df85 Merge 10.5 into 10.6 2021-06-29 15:18:18 +03:00
Alexey Botchkov
98c7916f0f MDEV-23004 When using GROUP BY with JSON_ARRAYAGG with joint table, the
square brackets are not included.

Item_func_json_arrayagg::copy_or_same() should be implemented.
2021-06-28 11:14:18 +04:00
Monty
b6ff139aa3 Reduce usage of strlen()
Changes:
- To detect automatic strlen() I removed the methods in String that
  uses 'const char *' without a length:
  - String::append(const char*)
  - Binary_string(const char *str)
  - String(const char *str, CHARSET_INFO *cs)
  - append_for_single_quote(const char *)
  All usage of append(const char*) is changed to either use
  String::append(char), String::append(const char*, size_t length) or
  String::append(LEX_CSTRING)
- Added STRING_WITH_LEN() around constant string arguments to
  String::append()
- Added overflow argument to escape_string_for_mysql() and
  escape_quotes_for_mysql() instead of returning (size_t) -1 on overflow.
  This was needed as most usage of the above functions never tested the
  result for -1 and would have given wrong results or crashes in case
  of overflows.
- Added Item_func_or_sum::func_name_cstring(), which returns LEX_CSTRING.
  Changed all Item_func::func_name()'s to func_name_cstring()'s.
  The old Item_func_or_sum::func_name() is now an inline function that
  returns func_name_cstring().str.
- Changed Item::mode_name() and Item::func_name_ext() to return
  LEX_CSTRING.
- Changed for some functions the name argument from const char * to
  to const LEX_CSTRING &:
  - Item::Item_func_fix_attributes()
  - Item::check_type_...()
  - Type_std_attributes::agg_item_collations()
  - Type_std_attributes::agg_item_set_converter()
  - Type_std_attributes::agg_arg_charsets...()
  - Type_handler_hybrid_field_type::aggregate_for_result()
  - Type_handler_geometry::check_type_geom_or_binary()
  - Type_handler::Item_func_or_sum_illegal_param()
  - Predicant_to_list_comparator::add_value_skip_null()
  - Predicant_to_list_comparator::add_value()
  - cmp_item_row::prepare_comparators()
  - cmp_item_row::aggregate_row_elements_for_comparison()
  - Cursor_ref::print_func()
- Removes String_space() as it was only used in one cases and that
  could be simplified to not use String_space(), thanks to the fixed
  my_vsnprintf().
- Added some const LEX_CSTRING's for common strings:
  - NULL_clex_str, DATA_clex_str, INDEX_clex_str.
- Changed primary_key_name to a LEX_CSTRING
- Renamed String::set_quick() to String::set_buffer_if_not_allocated() to
  clarify what the function really does.
- Rename of protocol function:
  bool store(const char *from, CHARSET_INFO *cs) to
  bool store_string_or_null(const char *from, CHARSET_INFO *cs).
  This was done to both clarify the difference between this 'store' function
  and also to make it easier to find unoptimal usage of store() calls.
- Added Protocol::store(const LEX_CSTRING*, CHARSET_INFO*)
- Changed some 'const char*' arrays to instead be of type LEX_CSTRING.
- class Item_func_units now used LEX_CSTRING for name.

Other things:
- Fixed a bug in mysql.cc:construct_prompt() where a wrong escape character
  in the prompt would cause some part of the prompt to be duplicated.
- Fixed a lot of instances where the length of the argument to
  append is known or easily obtain but was not used.
- Removed some not needed 'virtual' definition for functions that was
  inherited from the parent. I added override to these.
- Fixed Ordered_key::print() to preallocate needed buffer. Old code could
  case memory overruns.
- Simplified some loops when adding char * to a String with delimiters.
2021-05-19 22:27:48 +02:00
Monty
6079b46d8d Split item->flags into base_flags and with_flags
This was done to simplify copying of with_* flags

Other things:
- Changed Flags to C++ enums, which enables gdb to print
  out bit values for the flags. This also enables compiler
  errors if one tries to manipulate a non existing bit in
  a variable.
- Added set_maybe_null() as a shortcut as setting the
  MAYBE_NULL flags was used in a LOT of places.
- Renamed PARAM flag to SP_VAR to ensure it's not confused with persistent
  statement parameters.
2021-05-19 22:27:28 +02:00
Michael Widenius
3105c9e7a5 Change bitfields in Item to an uint16
The reason for the change is that neither clang or gcc can do efficient
code when several bit fields are change at the same time or when copying
one or more bits between identical bit fields.
Updated bits explicitely with & and | is MUCH more efficient than what
current compilers can do.
2021-05-19 22:27:28 +02:00
Michael Widenius
189d03dac5 Revert MDEV-14517 Cleanup for Item::with_subselect
Added back variable 'with_subquery' to Item class as a bit field.

This made the code shorter, faster (removed some virtual methods,
less code to create an initialized item etc) and made many Item's 7 bytes
smaller.

This is the last set of my patches the decreases the size of Item.

Some examples from gdb:
sizeof(Item):        144 -> 120
sizeof(Item_func)    208 -> 184
sizeof(Item_sum_max) 368 -> 344
2021-05-19 22:27:28 +02:00
Marko Mäkelä
4930f9c94b Merge 10.5 into 10.6 2021-04-21 11:45:00 +03:00
Alexey Botchkov
e9fd327ee3 MDEV-17399 Add support for JSON_TABLE.
The specific table handler for the table functions was introduced,
and used to implement JSON_TABLE.
2021-04-21 10:21:43 +04:00
Marko Mäkelä
80ed136e6d Merge 10.4 into 10.5 2021-04-21 09:01:01 +03:00
Monty
031f11717d Fix all warnings given by UBSAN
The easiest way to compile and test the server with UBSAN is to run:
./BUILD/compile-pentium64-ubsan
and then run mysql-test-run.
After this commit, one should be able to run this without any UBSAN
warnings. There is still a few compiler warnings that should be fixed
at some point, but these do not expose any real bugs.

The 'special' cases where we disable, suppress or circumvent UBSAN are:
- ref10 source (as here we intentionally do some shifts that UBSAN
  complains about.
- x86 version of optimized int#korr() methods. UBSAN do not like unaligned
  memory access of integers.  Fixed by using byte_order_generic.h when
  compiling with UBSAN
- We use smaller thread stack with ASAN and UBSAN, which forced me to
  disable a few tests that prints the thread stack size.
- Verifying class types does not work for shared libraries. I added
  suppression in mysql-test-run.pl for this case.
- Added '#ifdef WITH_UBSAN' when using integer arithmetic where it is
  safe to have overflows (two cases, in item_func.cc).

Things fixed:
- Don't left shift signed values
  (byte_order_generic.h, mysqltest.c, item_sum.cc and many more)
- Don't assign not non existing values to enum variables.
- Ensure that bool and enum values are properly initialized in
  constructors.  This was needed as UBSAN checks that these types has
  correct values when one copies an object.
  (gcalc_tools.h, ha_partition.cc, item_sum.cc, partition_element.h ...)
- Ensure we do not called handler functions on unallocated objects or
  deleted objects.
  (events.cc, sql_acl.cc).
- Fixed bugs in Item_sp::Item_sp() where we did not call constructor
  on Query_arena object.
- Fixed several cast of objects to an incompatible class!
  (Item.cc, Item_buff.cc, item_timefunc.cc, opt_subselect.cc, sql_acl.cc,
   sql_select.cc ...)
- Ensure we do not do integer arithmetic that causes over or underflows.
  This includes also ++ and -- of integers.
  (Item_func.cc, Item_strfunc.cc, item_timefunc.cc, sql_base.cc ...)
- Added JSON_VALUE_UNITIALIZED to json_value_types and ensure that
  value_type is initialized to this instead of to -1, which is not a valid
  enum value for json_value_types.
- Ensure we do not call memcpy() when second argument could be null.
- Fixed that Item_func_str::make_empty_result() creates an empty string
  instead of a null string (safer as it ensures we do not do arithmetic
  on null strings).

Other things:

- Changed struct st_position to an OBJECT and added an initialization
  function to it to ensure that we do not copy or use uninitialized
  members. The change to a class was also motived that we used "struct
  st_position" and POSITION randomly trough the code which was
  confusing.
- Notably big rewrite in sql_acl.cc to avoid using deleted objects.
- Changed in sql_partition to use '^' instead of '-'. This is safe as
  the operator is either 0 or 0x8000000000000000ULL.
- Added check for select_nr < INT_MAX in JOIN::build_explain() to
  avoid bug when get_select() could return NULL.
- Reordered elements in POSITION for better alignment.
- Changed sql_test.cc::print_plan() to use pointers instead of objects.
- Fixed bug in find_set() where could could execute '1 << -1'.
- Added variable have_sanitizer, used by mtr.  (This variable was before
  only in 10.5 and up).  It can now have one of two values:
  ASAN or UBSAN.
- Moved ~Archive_share() from ha_archive.cc to ha_archive.h and marked
  it virtual. This was an effort to get UBSAN to work with loaded storage
  engines. I kept the change as the new place is better.
- Added in CONNECT engine COLBLK::SetName(), to get around a wrong cast
  in tabutil.cpp.
- Added HAVE_REPLICATION around usage of rgi_slave, to get embedded
  server to compile with UBSAN. (Patch from Marko).
- Added #ifdef for powerpc64 to avoid a bug in old gcc versions related
  to integer arithmetic.

Changes that should not be needed but had to be done to suppress warnings
from UBSAN:

- Added static_cast<<uint16_t>> around shift to get rid of a LOT of
  compiler warnings when using UBSAN.
- Had to change some '/' of 2 base integers to shift to get rid of
  some compile time warnings.

Reviewed by:
- Json changes: Alexey Botchkov
- Charset changes in ctype-uca.c: Alexander Barkov
- InnoDB changes & Embedded server: Marko Mäkelä
- sql_acl.cc changes: Vicențiu Ciorbaru
- build_explain() changes: Sergey Petrunia
2021-04-20 12:30:09 +03:00