When values of different types are compared, they're converted to a type that
allows correct comparison. This conversion is done for each comparison and
takes some time. When a constant is being compared, its converted value can
be cached to speed up later comparisons. In some cases (large dataset,
complex WHERE condition with many type conversions) the query might be
executed 7% faster.
A test case isn't provided because all changes are internal and aren't
visible from outside.
The behavior of Item_cache is changed to cache values on the first request
for the cached value rather than at the moment the item to be cached is
stored.
A flag named value_cached is added to the Item_cache class. It's set to TRUE
when the cache holds the value of the last stored item.
A function named cache_value() is added to the Item_cache class and derived
classes. This function actually caches the value of the saved item.
The Item_cache_xxx::store functions now only store the item to be cached and
set the value_cached flag to FALSE.
The Item_cache_xxx::val_xxx functions are changed to call the cache_value()
function before returning the cached value if value_cached is FALSE.
The Arg_comparator::set_cmp_func function now calls cache_converted_constant
to cache constants if they need a type conversion.
The Item_cache::get_cache function is overloaded to allow setting of the
cache type.
The cache_converted_constant function is added to the Arg_comparator class.
It checks whether a value can and should be cached and if so caches it.
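As an illustration only, here is a minimal C++ sketch of this lazy caching
pattern, with simplified, hypothetical names (it is not the actual server
code):

  #include <cstdio>

  // Stand-in for the server's Item hierarchy: anything with a val_int().
  struct Item {
    long long v;
    long long val_int() const { return v; }  // imagine a costly conversion
  };

  // Lazy caching: store() only remembers the source item; the value is
  // materialized on the first read and reused afterwards.
  class Item_cache_int_sketch {
    Item *example = nullptr;    // item whose value will be cached
    long long value = 0;
    bool value_cached = false;  // TRUE once 'value' holds example's value
  public:
    void store(Item *item) {    // cheap: no conversion happens here
      example = item;
      value_cached = false;
    }
    void cache_value() {        // performs the (possibly expensive) conversion
      value = example->val_int();
      value_cached = true;
    }
    long long val_int() {       // convert once, reuse on later comparisons
      if (!value_cached)
        cache_value();
      return value;
    }
  };

  int main() {
    Item constant{42};
    Item_cache_int_sketch cache;
    cache.store(&constant);                  // no conversion yet
    std::printf("%lld\n", cache.val_int());  // first read caches the value
    std::printf("%lld\n", cache.val_int());  // later reads reuse it
    return 0;
  }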
When a query used a DATE or DATETIME value formatted with any separator
character other than the hyphen '-' in a greater-or-equal '>=' condition
matching only the greatest value in an indexed column, the result was empty
if an index range scan was employed.
The range optimizer got a new feature between 5.1.38 and 5.1.39 that changes
a greater-or-equal condition to a greater-than condition if the value in the
query is not present in the table. But the value comparison function
compared the dates as strings instead of as dates.
The bug was fixed by splitting the function get_date_from_str in two: one
part that parses and does error checking, which is now visible outside the
module. The old get_date_from_str now calls the new function.
The problem was that the partition containing NULL values was pruned away,
since '2001-01-01' < '2001-02-00' but TO_DAYS('2001-02-00') is NULL.
The fix makes the NULL partition be scanned too for RANGE/LIST partitioning
on the TO_DAYS() function.
Also fixed a bug that added ALLOW_INVALID_DATES to sql_mode
(SELECT * FROM t WHERE date_col < '1999-99-99' on a RANGE/LIST
partitioned table would add it).
There was a problem since pruning uses the field for comparison (while
evaluate_join_record uses longlong), resulting in pruning failures when
comparing DATE to DATETIME.
The fix is to always compare DATE vs DATETIME as DATETIME, by appending
' 00:00:00' to the DATE string.
An optimization was also added for comparisons with 23:59:59, so that
DATETIME_col > '2001-02-03 23:59:59' ->
TO_DAYS(DATETIME_col) > TO_DAYS('2001-02-03 23:59:59') instead
of '>='.
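As an illustration, a tiny C++ sketch (hypothetical helpers, not the server
code) of the two rewrites above: a DATE literal is widened with ' 00:00:00'
so it compares as a DATETIME, and a strict '>' is kept for the TO_DAYS()
rewrite only when the DATETIME literal ends in 23:59:59, since no later time
exists on that day:

  #include <iostream>
  #include <string>

  // Widen a DATE literal so it can be compared against a DATETIME column.
  std::string date_as_datetime(const std::string &date)
  {
    return date + " 00:00:00";
  }

  // Pick the operator for the TO_DAYS() rewrite of "DATETIME_col > literal":
  // '>' is only safe when the literal's time part is the last second of the
  // day; otherwise the rewrite must fall back to '>='.
  const char *to_days_operator(const std::string &datetime)
  {
    bool end_of_day = datetime.size() >= 8 &&
        datetime.compare(datetime.size() - 8, 8, "23:59:59") == 0;
    return end_of_day ? ">" : ">=";
  }

  int main() {
    std::cout << date_as_datetime("2001-02-03") << "\n";          // + 00:00:00
    std::cout << to_days_operator("2001-02-03 23:59:59") << "\n"; // >
    std::cout << to_days_operator("2001-02-03 12:00:00") << "\n"; // >=
    return 0;
  }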
The problem was that creating a DECIMAL column from a decimal
value could lead to a failed assertion as decimal values can
have a higher precision than those attached to a table. The
assert could be triggered by creating a table from a decimal
with a large (> 30) scale. Also, there was a problem in
calculating the number of digits in the integral and fractional
parts if both exceeded the maximum number of digits permitted
by the new decimal type.
The solution is to ensure that the truncation procedure is executed when
deducing a DECIMAL column from a decimal value of higher precision. If the
integer part is equal to or bigger than the maximum precision for the
DECIMAL type (65), the integer part is truncated to fit and the fractional
part becomes zero. Otherwise, the fractional part is truncated to fit into
the space left after the integer part is copied.
This patch borrows code and ideas from Martin Hansson's patch.
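For illustration, a hedged C++ sketch of the truncation rule described
above, using hypothetical names and a maximum precision of 65 digits:

  #include <algorithm>
  #include <cstdio>

  // Fit (integer digits, fraction digits) of a decimal value into a DECIMAL
  // column whose total precision cannot exceed 65 digits.
  void fit_decimal(int int_digits, int frac_digits,
                   int *col_precision, int *col_scale)
  {
    const int MAX_PRECISION = 65;
    if (int_digits >= MAX_PRECISION) {
      // The integer part alone fills (or overflows) the type: truncate it
      // and drop the fractional part entirely.
      *col_precision = MAX_PRECISION;
      *col_scale = 0;
    } else {
      // Keep the whole integer part; truncate the fraction to what is left.
      *col_scale = std::min(frac_digits, MAX_PRECISION - int_digits);
      *col_precision = int_digits + *col_scale;
    }
  }

  int main() {
    int precision, scale;
    fit_decimal(10, 60, &precision, &scale);   // fraction truncated to 55
    std::printf("DECIMAL(%d,%d)\n", precision, scale);
    return 0;
  }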
HAVING
When calculating GROUP BY the server caches some expressions. It does
that by allocating a string slot (Item_copy_string) and assigning the
value of the expression to it. This effectively means that the result
type of the expression can be changed from whatever it was to a string.
As this substitution takes place after the compile-time result type
calculation for IN but before the run-time type calculations,
it causes the type calculations in the IN function done at run time
to get unexpected results different from what was prepared at compile time.
In the CASE ... WHEN ... THEN ... statement there was a similar problem,
which was solved by artificially adding a STRING argument to the set of
types of the IN/CASE arguments at compile time, so that if any of the
arguments of the CASE function changes its type to a string it will still be
covered by the information prepared at compile time.
In the case of a ROW item, each compared pair does not check whether
argument collations can be aggregated, and thus the appropriate item
conversion does not happen.
The fix is to add the check and conversion for ROW pairs.
- Remove bothersome warning messages. This change focuses on the warnings
that are covered by the ignore file: support-files/compiler_warnings.supp.
- Strings are guaranteed to be max uint in length
IS NULL was not checking the correct row in a HAVING context.
At the first row of a new group (where the HAVING clause is evaluated)
the column and SELECT list references in the HAVING clause should
refer to the last row of the previous group and not to the current one.
This was not done for IS NULL, because it was using Item::is_null(), which
doesn't have an is_null_result() counterpart to access the data from the
last row of the previous group. Note that all the Item::val_xxx() functions
(e.g. Item::val_int()) have their _result counterparts
(e.g. Item::val_int_result()).
Fixed by implementing is_null_result() (similarly to val_int_result()) and
calling it instead of is_null() for column and SELECT list references inside
the HAVING clause.
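As a simplified illustration (hypothetical names, not the server code), the
val_xxx()/val_xxx_result() split and the new is_null_result() counterpart
can be sketched like this:

  #include <cstdio>

  // The plain accessors read the current row; the *_result() accessors read
  // the values saved for the last row of the previous group, which is what
  // HAVING must see. is_null_result() follows the same pattern.
  class GroupColumnRef {
    long long current_value;  bool current_is_null;
    long long result_value;   bool result_is_null;  // previous group's row
  public:
    GroupColumnRef() : current_value(0), current_is_null(true),
                       result_value(0), result_is_null(true) {}
    void read_row(long long v, bool is_null) {     // called for every row
      current_value = v; current_is_null = is_null;
    }
    void save_group_row() {                        // called at group boundary
      result_value = current_value; result_is_null = current_is_null;
    }
    long long val_int()        const { return current_value; }
    bool      is_null()        const { return current_is_null; }
    long long val_int_result() const { return result_value; }
    bool      is_null_result() const { return result_is_null; }
  };

  int main() {
    GroupColumnRef col;
    col.read_row(7, false);       // last row of the previous group
    col.save_group_row();
    col.read_row(0, true);        // first row of the new group
    // HAVING ... IS NULL must look at the previous group's row:
    std::printf("%d\n", col.is_null_result());   // prints 0 (not NULL)
    return 0;
  }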
The code that reads the value of a system variable was extracting the value
at the PREPARE stage and substituting it (as a constant) into the parse tree.
Note that this must be a reversible transformation, i.e. it must be reversed
before each re-execution.
Unfortunately this cannot be reliably done using the current code, because
there are other non-reversible parse tree transformations that can interfere
with this reversible transformation.
Fixed by not resolving the value at PREPARE, but at EXECUTE (as the rest of
the functions do). Added a cache of the value (so that it's constant
throughout the execution of the query). Note that the cache also caches NULL
values.
Updated an obsolete related test suite (variables-big) and the code to test
the result type of system variables (as per bug 74).
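For illustration, a small C++ sketch (hypothetical names) of a per-execution
value cache that also remembers a NULL result:

  #include <cstdio>

  // Resolve a system variable once per execution and remember the result,
  // including a NULL result, so the value stays constant for the statement.
  struct SysVarCache {
    bool cached = false;    // value fetched during this execution?
    bool is_null = false;   // the cached value was SQL NULL
    long long value = 0;

    void reset() { cached = false; }  // done at the start of each execution

    long long get(long long (*read_variable)(bool *null_out)) {
      if (!cached) {                  // resolve at EXECUTE, not at PREPARE
        value = read_variable(&is_null);
        cached = true;                // NULL results are cached as well
      }
      return is_null ? 0 : value;
    }
  };

  static long long read_some_variable(bool *null_out) {
    *null_out = false;
    return 151;                       // placeholder value
  }

  int main() {
    SysVarCache cache;
    cache.reset();                                         // execution begins
    std::printf("%lld\n", cache.get(read_some_variable));  // reads and caches
    std::printf("%lld\n", cache.get(read_some_variable));  // reuses the cache
    return 0;
  }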
The optimizer pulls up aggregate functions which should be aggregated in
an outer select. At some point it may substitute such a function for a field
in the temporary table. The setup_copy_fields function doesn't take this
into account and may overrun the copy_field buffer.
Fixed by filtering out the fields referenced through the specialized
reference for aggregates (Item_aggregate_ref).
Added an assertion to make sure bugs that cause similar discrepancy
don't go undetected.
The Length value is the length of the field, and Max_length is the length of
the field value, so Max_length cannot be more than Length.
The fix: corrected the calculation of the Item_empty_string item length.
(Patch applied and queued on demand of Trudy/Davi.)
This fix is for 5.0 only: back-porting the 6.0 patch manually.
The parser code in sql/sql_yacc.yy needs to be more robust to out-of-memory
conditions, so that when parsing a query fails due to OOM, the thread
gracefully returns an error.
Before this fix, a new/alloc returning NULL could:
- cause a crash, if dereferencing the NULL pointer,
- produce a corrupted parsed tree, containing NULL nodes,
- alter the semantics of a query, by silently dropping token values or nodes
With this fix:
- C++ constructors are *not* executed with a NULL "this" pointer
when operator new fails.
This is achieved by declaring "operator new" with a "throw ()" clause,
so that a failed new gracefully returns NULL on OOM conditions.
- calls to new/alloc are tested for a NULL result,
- The thread diagnostic area is set to an error status when OOM occurs.
This ensures that a request failing in the server properly returns an
ER_OUT_OF_RESOURCES error to the client.
- OOM conditions cause the parser to stop immediately (MYSQL_YYABORT).
This prevents further crashes caused by using a partially built parsed
tree in later parser rules.
No test scripts are provided, since automating OOM failures is not
instrumented in the server.
Tested under the debugger, to verify that an error in alloc_root causes the
thread to return gracefully all the way to the client application, with
an ER_OUT_OF_RESOURCES error.
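For illustration, a self-contained C++ sketch of the language rule this fix
relies on: when an allocation function is declared non-throwing and returns
NULL, the new-expression itself yields NULL and the constructor is not run,
so the caller can simply test the pointer (hypothetical class, not the
server code):

  #include <cstddef>
  #include <cstdio>
  #include <cstdlib>

  struct Parse_node {
    int token;
    explicit Parse_node(int t) : token(t) {}

    // Non-throwing allocation: on OOM the new-expression evaluates to NULL
    // and the constructor above is never entered with a NULL "this".
    static void *operator new(std::size_t size) throw() {
      return std::malloc(size);              // may return NULL
    }
    static void operator delete(void *ptr) throw() { std::free(ptr); }
  };

  int main() {
    Parse_node *node = new Parse_node(42);   // NULL instead of a crash on OOM
    if (node == NULL) {
      // In the parser this is where the error status would be set and
      // MYSQL_YYABORT issued; here we just report and stop.
      std::fprintf(stderr, "out of memory\n");
      return 1;
    }
    std::printf("token=%d\n", node->token);
    delete node;
    return 0;
  }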
WL#4165 Prepared statements: validation
WL#4166 Prepared statements: automatic re-prepare
Fixes
Bug#27430 Crash in subquery code when in PS and table DDL changed after PREPARE
Bug#27690 Re-execution of prepared statement after table was replaced with a view crashes
Bug#27420 A combination of PS and view operations cause error + assertion on shutdown
The basic idea of the patch is to keep track of table metadata between
prepared statement prepare and execute. If some table used in the statement
has changed, the prepared statement is re-prepared before execution.
See WL#4165 and WL#4166 contents and comments in the code for details
of the implementation.
Assertion `0' failed
If a ROW item is part of an expression that also has
aggregate function calls (COUNT/SUM/AVG...), a
"splitting" with the Item::split_sum_func2 function
is applied to that ROW item.
The current implementation of Item::split_sum_func2
replaces this Item_row with a newly created
Item_aggregate_ref reference to it.
Then the row cache tries to work with the
Item_aggregate_ref object as with the Item_row object:
the row cache calls row-emulation methods such as cols and
element_index. Item_aggregate_ref (like its parent
Item_ref) inherits dummy implementations of those
methods from the hierarchy root Item, and calls to
them lead to failed assertions and wrong data
output.
The row-emulation virtual functions (cols, element_index, addr,
check_cols, null_inside and bring_value) of Item_ref have
been overridden to forward calls to the underlying item
reference.
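For illustration, a simplified C++ sketch (not the server code) of the
forwarding idea:

  #include <cstdio>

  // A reference wrapper overrides the row-emulation virtuals and forwards
  // them to the item it refers to, instead of inheriting the base-class
  // dummies.
  struct ItemBase {
    virtual ~ItemBase() {}
    virtual unsigned cols() const { return 1; }   // scalar by default
    virtual long long element_index(unsigned) const { return 0; }
  };

  struct RowItem : ItemBase {
    long long values[2];
    RowItem(long long a, long long b) { values[0] = a; values[1] = b; }
    unsigned cols() const override { return 2; }
    long long element_index(unsigned i) const override { return values[i]; }
  };

  struct ItemRef : ItemBase {
    ItemBase *ref;                                // the referenced item
    explicit ItemRef(ItemBase *r) : ref(r) {}
    // Forward row-emulation calls to the underlying item.
    unsigned cols() const override { return ref->cols(); }
    long long element_index(unsigned i) const override {
      return ref->element_index(i);
    }
  };

  int main() {
    RowItem row(10, 20);
    ItemRef ref(&row);
    std::printf("%u %lld\n", ref.cols(), ref.element_index(1));  // 2 20
    return 0;
  }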
The problem is that passing anything other than an integer to a LIMIT
clause in a prepared statement would fail. This limitation was introduced
to avoid replication problems (e.g. replicating the statement with a
string argument would cause a parse failure on the slave).
The solution is to convert arguments to the LIMIT clause to an integer
value and use this converted value when writing the query to the log.
NAME_CONST('whatever', -1) * MAX(whatever) bombed since -1 was
not seen as a constant, but as FUNCTION_UNARY_MINUS(constant),
while we were at the same time pretending it was a basic constant
item. This confused the aggregate handlers in exciting ways.
We now make NAME_CONST() behave more consistently.
between 5.0 and 5.1.
The problem was that in the patch for Bug#11986 it was decided
to store the original query in UTF8 encoding for the INFORMATION_SCHEMA.
This approach, however, turned out to be quite difficult to implement
properly. The main problem is preserving the same IS output after
dump/restore.
So, the fix is to roll back to the previous functionality, but also
to fix it to support multi-character-set queries properly. The idea
is to generate the INFORMATION_SCHEMA query from the item tree after
parsing the view declaration. The IS query should:
- be completely in UTF8;
- not contain character set introducers.
For more information, see WL4052.
The problem is that CREATE VIEW statements inside prepared statements
weren't being expanded during the prepare phase, which leads to objects
not being allocated in the appropriate memory arenas.
The solution is to perform the validation of CREATE VIEW statements
during the prepare phase of a prepared statement. The validation
during the prepare phase assures that transformations of the parsed
tree will use the permanent arena of the prepared statement.
but not collation.
The problem here was that text literals in a view were always
dumped with a character set introducer. That led to losing
collation information.
The fix is to dump the character set introducer only if it was
present in the original query. That is now possible because there
is no longer a problem with losing the character set of string
literals in views -- after WL#4052 the view is dumped
in the original character set.
Two disjuncts containing equalities of the form key=const1 and key=const2
can be merged into one if const1 is equal to const2. To check this, the
common collation of the constants was used rather than the collation of the
key field. For example, when the default collation of the constants was
case-insensitive while the collation of the field was case-sensitive, the
two or-ed equality predicates key='b' and key='B' were incorrectly merged
into one, key='b'. As a result, ref access was used instead of range access
and wrong result sets were returned in many cases.
Fixed the problem by comparing the constants in the or-ed predicates using
the collation of the key field.
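For illustration, a small C++ sketch (hypothetical helper, using the POSIX
strcasecmp) of why the key field's comparison rules decide whether the two
constants are equal:

  #include <cstdio>
  #include <cstring>
  #include <strings.h>   // strcasecmp (POSIX)

  // Whether key='b' OR key='B' can be collapsed into a single equality
  // depends on the key field's collation, not on the default collation of
  // the string literals.
  bool constants_equal_for_key(const char *c1, const char *c2,
                               bool key_is_case_sensitive)
  {
    return key_is_case_sensitive ? std::strcmp(c1, c2) == 0
                                 : strcasecmp(c1, c2) == 0;
  }

  int main() {
    // Case-sensitive key: 'b' and 'B' differ, so the OR must be kept.
    std::printf("%d\n", constants_equal_for_key("b", "B", true));   // 0
    // Case-insensitive key: the two disjuncts can be merged.
    std::printf("%d\n", constants_equal_for_key("b", "B", false));  // 1
    return 0;
  }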
Problem: passing a non-constant name to the NAME_CONST function results in a
crash.
Fix: check the NAME_CONST name argument; return a fake item type if we got
non-constant argument(s).