Main changes:
- Changing the constructor to accept a CHARSET_INFO pointer, instead of an Item pointer
- Updating the bison grammar accordingly
Additional cleanups:
- Simplifying Item_func_set_collation::eq() by reusing Item_func::eq()
- Removing unused binary_keyword
TODO:
- Make get_thd_memroot() inline
- To do this, we need to reduce dependence of include files, especially
so that sql_class.h is not depending in item.h
It's better to prohibit pushdown of conditions that involve
regexp_substr() and regexp_replace() into materialized derived
tables / views until proper implementations of the get_copy()
virtual method are provided for those functions.
- Implementing stricter data type control for Item_long_func descendants
- Cleanup: renaming Type_handler::can_return_str_ascii() to can_return_text()
(a better name).
1. use Regexp_processor_pcre::set_recursion_limit() to set the
recursion limit depending on the current available stack size
2. make pcre stack frame to be estimated no less than 500 bytes.
sometimes pcre estimates it too low, even though the manual
says 500+16 bytes (it was estimated only 188 for me, actual
frame size was 512).
3. do it for embedded too
The bug happens because of a combination of unfortunate circumstances:
1. Arguments args[0] and args[2] of Item_func_concat point recursively
(through Item_direct_view_ref's) to the same Item_func_conv_charset.
Both args[0]->args[0]->ref[0] and args[2]->args[0]->ref[0] refer to
this Item_func_conv_charset.
2. When Item_func_concat::args[0]->val_str() is called,
Item_func_conv_charset::val_str() writes its result to
Item_func_conc_charset::tmp_value.
3. Then, for optimization purposes (to avoid copying),
Item_func_substr::val_str() initializes Item_func_substr::tmp_value
to point to the buffer fragment owned by Item_func_conv_charset::tmp_value
Item_func_substr::tmp_value is returned as a result of
Item_func_concat::args[0]->val_str().
4. Due to optimization to avoid memory reallocs,
Item_func_concat::val_str() remembers the result of args[0]->val_str()
in "res" and further uses "res" to collect the return value.
5. When Item_func_concat::args[2]->val_str() is called,
Item_func_conv_charset::tmp_value gets overwritten (see #1),
which effectively overwrites args[0]'s Item_func_substr::tmp_value (see #3),
which effectively overwrites "res" (see #4).
This patch does the following:
a. Changes Item_func_conv_charset::val_str(String *str) to use
tmp_value and str the other way around. After this change tmp_value
is used to store a temporary result, while str is used to return the value.
The fixes the second problem (without SUBSTR):
SELECT CONCAT(t2,'-',t2) c2
FROM (SELECT CONVERT(t USING latin1) t2 FROM t1) sub;
As Item_func_concat::val_str() supplies two different buffers when calling
args[0]->val_str() and args[2]->val_str(), in the new reduction the result
created during args[0]->val_str() does not get overwritten by
args[2]->val_str().
b. Fixing the same problem in val_str() for similar classes
Item_func_to_base64
Item_func_from_base64
Item_func_weight_string
Item_func_hex
Item_func_unhex
Item_func_quote
Item_func_compress
Item_func_uncompress
Item_func_des_encrypt
Item_func_des_decrypt
Item_func_conv_charset
Item_func_reverse
Item_func_soundex
Item_func_aes_encrypt
Item_func_aes_decrypt
Item_func_buffer
c. Fixing Item_func::val_str_from_val_str_ascii() the same way.
Now Item_str_ascii_func::ascii_buff is used for temporary value,
while the parameter passed to val_str() is used to return the result.
This fixes the same problem when conversion (from ASCII to e.g. UCS2)
takes place. See the ctype_ucs.test for example queries that returned
wrong results before the fix.
d. Some Item_func descendand classes had temporary String buffers
(tmp_value and tmp_str), but did not really use them.
Removing these temporary buffers from:
Item_func_decode_histogram
Item_func_format
Item_func_binlog_gtid_pos
Item_func_spatial_collection:
e. Removing Item_func_buffer::tmp_value, because it's not used any more.
f. Renaming Item_func_[un]compress::buffer to "tmp_value",
for consistency with other classes.
Note, this patch does not fix the following classes
(although they have a similar problem):
Item_str_conv
Item_func_make_set
Item_char_typecast
They have a complex implementations and simple swapping between "tmp_value"
and "str" won't work. These classes will be fixed separately.
The problem lies in how CURRENT_ROLE is defined. The
Item_func_current_role inherits from Item_func_sysconst, which defines
a safe_charset_converter to be a const_charset_converter.
During view creation, if there is no role previously set, the current_role()
function returns NULL.
This is captured on item instantiation and the
const_charset_converter call subsequently returns an Item_null.
In turn, the function is replaced with Item_null and the view is
then created with an Item_null instead of Item_func_current_role.
Without this patch, the first SHOW CREATE VIEW from the testcase would
have a where clause of WHERE role_name = NULL, while the second SHOW
CREATE VIEW would show a correctly created view.
The same applies for the DATABASE function, as it can change as well.
There is an additional problem with CURRENT_ROLE() when used in a
prepared statement. During prepared statement creation we used to set
the string_value of the function to the current role as well as the
null_value flag. During execution, if CURRENT_ROLE was not null, the
null_value flag was never set to not-null during fix_fields.
Item_func_current_user however can never be NULL so it did not show this
problem in a view before. At the same time, the CURRENT_USER() can not
be changed between prepared statement execution and creation so the
implementation where the value is stored during fix_fields is
sufficient.
Note also that DATABASE() function behaves differently during prepared
statements. See bug 25843 for details or commit
7e0ad09edf