Handling of large signed/unsigned values was not consistent, so some string functions could return bogus results.
The current fix is to simply patch up the val_str() methods for those string items.
It would be good clean this code up in general, to make similar problems much harder to make. This is left as an exercise for the reader.
- Removed not used variables and functions
- Added #ifdef around code that is not used
- Renamed variables and functions to avoid conflicts
- Removed some not used arguments
Fixed some class/struct warnings in ndb
Added define IS_LONGDATA() to simplify code in libmysql.c
I did run gcov on the changes and added 'purecov' comments on almost all lines that was not just variable name changes
Fixed compiler warnings (detected by VC++):
- Removed not used variables
- Added casts
- Fixed wrong assignments to bool
- Fixed wrong calls with bool arguments
- Added missing argument to store(longlong), which caused wrong store method to be called.
- Removed not used variables
- Changed some ulong parameters/variables to ulonglong (possible serious bug)
- Added casts to get rid of safe assignment from longlong to long (and similar)
- Added casts to function parameters
- Fixed signed/unsigned compares
- Added some constructores to structures
- Removed some not portable constructs
Better fix for bug Bug #21428 "skipped 9 bytes from file: socket (3)" on "mysqladmin shutdown"
(Added new parameter to net_clear() to define when we want the communication buffer to be emptied)
Before this change, the functions BENCHMARK, ENCODE, DECODE and FORMAT could
only accept a constant for some parameters.
After this change, this restriction has been removed. An implication is that
these functions can also be used in prepared statements.
The change consist of changing the following classes:
- Item_func_benchmark
- Item_func_encode
- Item_func_decode
- Item_func_format
to:
- only accept Item* in the constructor,
- and evaluate arguments during calls to val_xxx()
which fits the general design of all the other functions.
The 'TODO' items identified in item_create.cc during the work done for
Bug 21114 are addressed by this fix, as a natural consequence of aligning
the design.
In the 'func_str' test, a single very long test line involving an explain
extended select with many functions has been rewritten into multiple
separate tests, to improve maintainability.
The result of explain extended select decode(encode(...)) has changed,
since the encode and decode functions now print all their parameters.
on large length
Problem: Most (all) of the numeric inputs were being coerced into
int (32 bit) sized variables. Works OK for sane inputs; any input
larger than 2^32 (or 2^31 for signed vars) exihibited predictable
wrapping behavior (up to about 10^18) and then started having really
strange behaviour past that point (since the conversion to 64 bit int
from the DECIMAL type can do weird things on out of range numbers).
Solution: 1) Add many tests. 2) Convert input from (u)long type to
(u)longlong. 3) Do (sometimes multiple) sanity checks on input,
keeping in mind that sometimes a negative longlong is not a negative
longlong (if the unsigned_flag is set). 4) Emulate existing behavior
w/rt negative and "small" out-of-bounds values.
Item_substr's results are improperly stored in a temporary table due to
wrongly calculated max_length value for multi-byte charsets if two
arguments specified.
- Fix typo in Item_func_export_set::fix_length_and_dec() which caused character set aggregation to fail
- Remove default argument from last arg of agg_arg_charsets() function, to reduce potential errors
Converting BIT to a string (an intermediate step in conversion) does
not yield an ASCII numeric string, so we skip that step for BIT and
get the integer value directly from the item.
This site in sql/item_strfunc.cc may be ripe for refactoring for
other types as well, where converting to a string is a waste of time.
Make the encryption functions MD5(), SHA1() and ENCRYPT() return binary results.
Make MAKE_SET() and EXPORT_SET() use the correct character set for their default separator strings.
for class Item_func_trim.
For 4.1 it caused wrong output for EXPLAIN EXTENDED commands
if expressions with the TRIM function of two arguments were used.
For 5.0 it caused an error message when trying to select
from a view with the TRIM function of two arguments.
This unexpected error message was due to the fact that the
print method for the class Item_func_trim was inherited from
the class Item_func. Yet the TRIM function does not take a list
of its arguments. Rather it takes the arguments in the form:
[{BOTH | LEADING | TRAILING} [remstr] FROM] str) |
[remstr FROM] str
- Add a check that length of field to uncompress is longer than 4 bytes.
This can be dones as the length of uncompressed data is written as
first four bytes of field and thus it can't be valid compressed data.
Adding test case.
item_strfunc.cc:
bug#11728 string function LEFT, strange undocumented behaviour
Fixing LEFT and RIGHT return NULL if the second
argument is NULL.
The implementation of the method Item_func_reverse::val_str
for the REVERSE function modified the argument of the function.
This led to wrong results for expressions that contained
REVERSE(ref) if ref occurred somewhere else in the expressions.
invoker name
The bug was fixed similar to how context switch is handled in
Item_func_sp::execute_impl(): we store pointer to current
Name_resolution_context in Item_func_current_user class, and use
its Security_context in Item_func_current_user::fix_fields().
In some functions dealing with strings and character sets, the wrong
pointers were saved for restoration in THD::rollback_item_tree_changes().
This could potentially cause random corruption or crashes.
Fixed by passing the original Item ** locations, not local stack copies.
Also remove unnecessary use of default arguments.
Bug#19022 "Memory bug when switching db during trigger execution"
Bug#17199 "Problem when view calls function from another database."
Bug#18444 "Fully qualified stored function names don't work correctly in
SELECT statements"
Documentation note: this patch introduces a change in behaviour of prepared
statements.
This patch adds a few new invariants with regard to how THD::db should
be used. These invariants should be preserved in future:
- one should never refer to THD::db by pointer and always make a deep copy
(strmake, strdup)
- one should never compare two databases by pointer, but use strncmp or
my_strncasecmp
- TABLE_LIST object table->db should be always initialized in the parser or
by creator of the object.
For prepared statements it means that if the current database is changed
after a statement is prepared, the database that was current at prepare
remains active. This also means that you can not prepare a statement that
implicitly refers to the current database if the latter is not set.
This is not documented, and therefore needs documentation. This is NOT a
change in behavior for almost all SQL statements except:
- ALTER TABLE t1 RENAME t2
- OPTIMIZE TABLE t1
- ANALYZE TABLE t1
- TRUNCATE TABLE t1 --
until this patch t1 or t2 could be evaluated at the first execution of
prepared statement.
CURRENT_DATABASE() still works OK and is evaluated at every execution
of prepared statement.
Note, that in stored routines this is not an issue as the default
database is the database of the stored procedure and "use" statement
is prohibited in stored routines.
This patch makes obsolete the use of check_db_used (it was never used in the
old code too) and all other places that check for table->db and assign it
from THD::db if it's NULL, except the parser.
How this patch was created: THD::{db,db_length} were replaced with a
LEX_STRING, THD::db. All the places that refer to THD::{db,db_length} were
manually checked and:
- if the place uses thd->db by pointer, it was fixed to make a deep copy
- if a place compared two db pointers, it was fixed to compare them by value
(via strcmp/my_strcasecmp, whatever was approproate)
Then this intermediate patch was used to write a smaller patch that does the
same thing but without a rename.
TODO in 5.1:
- remove check_db_used
- deploy THD::set_db in mysql_change_db
See also comments to individual files.
Fix for bug#16716 for --ps-protocol mode.
item_cmpfunc.cc:
Fix for a memory allocation/freeing problem in agg_cmp_type() after fix
for bug#16377. Few language corrections.
To calculate its max_length the CONCAT() function is simply sums max_lengths
of its arguments but when the collation of an argument differs from the
collation of the CONCAT() max_length will be wrong. This may lead to a data
truncation when a tmp table is used, in UNIONS for example.
The Item_func_concat::fix_length_and_dec() function now recalculates the
max_length of an argument when the mbmaxlen of the argument differs from the
mbmaxlen of the CONCAT().
argument can lead to a wrong result.
md5() and sha() functions treat their arguments as case sensitive strings.
But when they are compared their arguments were compared as a case
insensitive strings which leads to two functions with different arguments
and thus different results to being identical. This can lead to a wrong
decision made in the range optimizer and thus lead to a wrong result set.
Item_func_md5::fix_length_and_dec() and Item_func_sha::fix_length_and_dec()
functions now set binary collation on their arguments.
The Item_func_concat::val_str() function tries to make as less re-allocations
as possible. This results in appending strings returned by 2nd and next
arguments to the string returned by 1st argument if the buffer for the first
argument has enough free space. A constant subselect is evaluated only once
and its result is stored in an Item_cache_str. In the case when the first
argument of the concat() function is such a subselect Item_cache_str returns
the stored value and Item_func_concat::val_str() append values of other
arguments to it. But for the next row the value in the Item_cache_str isn't
restored because the subselect is a constant one and it isn't evaluated second
time. This results in appending string values of 2nd and next arguments to the
result of the previous Item_func_concat::val_str() call.
The Item_func_concat::val_str() function now checks whether the first argument
is a constant one and if so it doesn't append values of 2nd and next arguments
to the string value returned by it.
- detect the need for row-based binlogging not at execution stage but earlier at parsing stage; needed for example for CREATE TABLE SELECT UUID().
- more tests of this mixed mode.
and new binlog format called "mixed" (which is statement-based except if only row-based is correct,
in this cset it means if UDF or UUID is used; more cases could be added in later 5.1 release):
SET GLOBAL|SESSION BINLOG_FORMAT=row|statement|mixed|default;
the global default is statement unless cluster is enabled (then it's row) as in 5.1-alpha.
It's not possible to use SET on this variable if a session is currently in row-based mode and has open temporary tables (because CREATE
TEMPORARY TABLE was not binlogged so temp table is not known on slave), or if NDB is enabled (because
NDB does not support such change on-the-fly, though it will later), of if in a stored function (see below).
The added tests test the possibility or impossibility to SET, their effects, and the mixed mode,
including in prepared statements and in stored procedures and functions.
Caveats:
a) The mixed mode will not work for stored functions: in mixed mode, a stored function will
always be binlogged as one call and in a statement-based way (e.g. INSERT VALUES(myfunc()) or SELECT myfunc()).
b) for the same reason, changing the thread's binlog format inside a stored function is
refused with an error message.
c) the same problems apply to triggers; implementing b) for triggers will be done later (will ask
Dmitri).
Additionally, as the binlog format is now changeable by each user for his session, I remove the implication
which was done at startup, where row-based automatically set log-bin-trust-routine-creators to 1
(not possible anymore as a user can now switch to stmt-based and do nasty things again), and automatically
set --innodb-locks-unsafe-for-binlog to 1 (was anyway theoretically incorrect as it disabled
phantom protection).
Plus fixes for compiler warnings.