In sql_class.cc, 'row_count', of type 'ha_rows', was used as last argument for
ER_TRUNCATED_WRONG_VALUE_FOR_FIELD which is
"Incorrect %-.32s value: '%-.128s' for column '%.192s' at row %ld".
So 'ha_rows' was used as 'long'.
On SPARC32 Solaris builds, 'long' is 4 bytes and 'ha_rows' is 'longlong' i.e. 8 bytes.
So the printf-like code was reading only the first 4 bytes.
Because the CPU is big-endian, 1LL is 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x01
so the first four bytes yield 0. So the warning message had "row 0" instead of
"row 1" in test outfile_loaddata.test:
-Warning 1366 Incorrect string value: '\xE1\xE2\xF7' for column 'b' at row 1
+Warning 1366 Incorrect string value: '\xE1\xE2\xF7' for column 'b' at row 0
All error-messaging functions which internally invoke some printf-life function
are potential candidate for such mistakes.
One apparently easy way to catch such mistakes is to use
ATTRIBUTE_FORMAT (from my_attribute.h).
But this works only when call site has both:
a) the format as a string literal
b) the types of arguments.
So:
func(ER(ER_BLAH), 10);
will silently not be checked, because ER(ER_BLAH) is not known at
compile time (it is known at run-time, and depends on the chosen
language).
And
func("%s", a va_list argument);
has the same problem, as the *real* type of arguments is not
known at this site at compile time (it's known in some caller).
Moreover,
func(ER(ER_BLAH));
though possibly correct (if ER(ER_BLAH) has no '%' markers), will not
compile (gcc says "error: format not a string literal and no format
arguments").
Consequences:
1) ATTRIBUTE_FORMAT is here added only to functions which in practice
take "string literal" formats: "my_error_reporter" and "print_admin_msg".
2) it cannot be added to the other functions: my_error(),
push_warning_printf(), Table_check_intact::report_error(),
general_log_print().
To do a one-time check of functions listed in (2), the following
"static code analysis" has been done:
1) replace
my_error(ER_xxx, arguments for substitution in format)
with the equivalent
my_printf_error(ER_xxx,ER(ER_xxx), arguments for substitution in
format),
so that we have ER(ER_xxx) and the arguments *in the same call site*
2) add ATTRIBUTE_FORMAT to push_warning_printf(),
Table_check_intact::report_error(), general_log_print()
3) replace ER(xxx) with the hard-coded English text found in
errmsg.txt (like: ER(ER_UNKNOWN_ERROR) is replaced with
"Unknown error"), so that a call site has the format as string literal
4) this way, ATTRIBUTE_FORMAT can effectively do its job
5) compile, fix errors detected by ATTRIBUTE_FORMAT
6) revert steps 1-2-3.
The present patch has no compiler error when submitted again to the
static code analysis above.
It cannot catch all problems though: see Field::set_warning(), in
which a call to push_warning_printf() has a variable error
(thus, not replacable by a string literal); I checked set_warning() calls
by hand though.
See also WL 5883 for one proposal to avoid such bugs from appearing
again in the future.
The issues fixed in the patch are:
a) mismatch in types (like 'int' passed to '%ld')
b) more arguments passed than specified in the format.
This patch resolves mismatches by changing the type/number of arguments,
not by changing error messages of sql/share/errmsg.txt. The latter would be wrong,
per the following old rule: errmsg.txt must be as stable as possible; no insertions
or deletions of messages, no changes of type or number of printf-like format specifiers,
are allowed, as long as the change impacts a message already released in a GA version.
If this rule is not followed:
- Connectors, which use error message numbers, will be confused (by insertions/deletions
of messages)
- using errmsg.sys of MySQL 5.1.n with mysqld of MySQL 5.1.(n+1)
could produce wrong messages or crash; such usage can easily happen if
installing 5.1.(n+1) while /etc/my.cnf still has --language=/path/to/5.1.n/xxx;
or if copying mysqld from 5.1.(n+1) into a 5.1.n installation.
When fixing b), I have verified that the superfluous arguments were not used in the format
in the first 5.1 GA (5.1.30 'bteam@astra04-20081114162938-z8mctjp6st27uobm').
Had they been used, then passing them today, even if the message doesn't use them
anymore, would have been necessary, as explained above.
Part 2. Function QUOTE() was not multi-byte safe.
@ mysql-test/r/ctype_ucs.result
@ mysql-test/t/ctype_ucs.test
Adding tests
@ sql/item_strfunc.cc
Fixing Item_func_quote::val_str to be multi-byte safe.
@ sql/item_strfunc.h
Multiple size needed for quote characters to mbmaxlen
Problem: wrong character set pointer was passed to my_strtoll10_mb2,
which led to DBUG_ASSERT failure in some cases.
@ mysql-test/r/func_encrypt_ucs2.result
@ mysql-test/t/func_encrypt_ucs2.test
@ mysql-test/r/ctype_ucs.result
@ mysql-test/t/ctype_ucs.test
Adding tests
@ sql/item_func.cc
"cs" initialization was wrong (res does not necessarily point to &str_value)
@ sql/item_strfunc.cc
Item_func_dec_encrypt::val_str() and Item_func_des_descrypt::val_str()
did not set character set for tmp_value (the returned value),
so the old value, which was previously copied from args[1]->val_str(),
was incorrectly returned with tmp_value.
Introduced by the fix for bug#44766.
Problem: it's not correct to use args[0]->str_value as a buffer,
because args[0] may need this buffer for its own purposes.
Fix: adding a new class member tmp_value to use as return value.
@ mysql-test/r/ctype_many.result
@ mysql-test/t/ctype_many.test
Adding tests
@ sql/item_strfunc.cc
Changing code into traditional style:
use "str" as a buffer for the argument and tmp_value for the result value.
@ sql/item_strfunc.h
Adding tmp_value
other crashes
Some string manipulating SQL functions use a shared string object intended to
contain an immutable empty string. This object was used by the SQL function
SUBSTRING_INDEX() to return an empty string when one argument was of the wrong
datatype. If the string object was then modified by the sql function INSERT(),
undefined behavior ensued.
Fixed by instead modifying the string object representing the function's
result value whenever string manipulating SQL functions return an empty
string.
Relevant code has also been documented.
Bug#57820 extractvalue crashes
Problem: ExtractValue and Replace crashed in some cases
due to invalid handling of empty and NULL arguments.
Per file comments:
@mysql-test/r/ctype_ujis.result
@mysql-test/r/xml.result
@mysql-test/t/ctype_ujis.test
@mysql-test/t/xml.test
Adding tests
@sql/item_strfunc.cc
Make sure Item_func_replace::val_str safely handles empty strings.
@sql/item_xmlfunc.cc
set null_value if nodeset_func returned NULL,
which is possible when the second argument is an
unset user variable.
Problem: if multibyte and binary string arguments passed to
RPAD(), LPAD() or INSERT() functions, they might return
wrong results or even lead to a server crash due to missed
character set convertion.
Fix: perform the convertion if necessary.
Iterative patch improvement. Previously committed patch
caused wrong result on Windows. The previous patch also
broke secure_file_priv for symlinks since not all file
paths which must be compared against this variable are
normalized using the same norm.
The server variable opt_secure_file_priv wasn't
normalized properly and caused the operations
LOAD DATA INFILE .. INTO TABLE ..
and
SELECT load_file(..)
to do different interpretations of the
--secure-file-priv option.
The patch moves code to the server initialization
routines so that the path always is normalized
once and only once.
It was also intended that setting the option
to an empty string should be equal to
lifting all previously set restrictions. This
is also fixed by this patch.
Procedure, while DECIMAL works
Selecting of the CONCAT(...<SP variable>...) result into
a user variable may return wrong data.
Item_func_concat::val_str contains a number of memory
allocation-saving tricks. One of them concatenates
strings inplace inserting the value of one string
at the beginning of the other string. However,
this trick didn't care about strings those points
to the same data buffer: this is possible when
a CONCAT() parameter is a stored procedure variable -
Item_sp_variable::val_str() uses the intermediate
Item_sp_variable::str_value field, where it may
store a reference to an external buffer.
The Item_func_concat::val_str function has been
modified to take into account val_str functions
(such as Item_sp_variable::val_str) that return
a pointer to an internal Item member variable
that may reference to a buffer provided.
Rename method as to not hide a base.
Reorder attributes initialization.
Remove unused variable.
Rework code to silence a warning due to assignment used as truth value.
MySQL's hash functions MD5 and SHA relied on the somewhat slow
sprintf function to convert the digests to hex representations.
This patch replaces the sprintf with a specific and inline hex
conversion function.
Patch contributed by Jan Steemann.
Selecting of the CONCAT_WS(...<PS parameter>...) result into
a user variable may return wrong data.
Item_func_concat_ws::val_str contains a number of memory
allocation-saving optimization tricks. After the fix
for bug 46815 the control flow has been changed to a
branch that is commented as "This is quite uncommon!":
one of places where we are trying to concatenate
strings inplace. However, that "uncommon" place
didn't care about PS parameters, that have another
trick in Item_sp_variable::val_str(): they use the
intermediate Item_sp_variable::str_value field,
where they may store a reference to an external
argument's buffer.
The Item_func_concat_ws::val_str function has been
modified to take into account val_str functions
(such as Item_sp_variable::val_str) that return a
pointer to an internal Item member variable that
may reference to a buffer provided.
The problem was that the multiple evaluations of a ENCODE or
DECODE function within a single statement caused the random
generator to be reinitialized at each evaluation, even though
the parameters were constants.
The solution is to initialize the random generator only once
if the password (seed) parameter is constant.
This patch borrows code and ideas from Georgi Kodinov's patch.
Problem: Some system functions that could return different values on
master and slave were not marked unsafe. In particular:
GET_LOCK
IS_FREE_LOCK
IS_USED_LOCK
MASTER_POS_WAIT
RELEASE_LOCK
SLEEP
SYSDATE
VERSION
Fix: Mark these functions unsafe.
The problem is that argument buffer can be used as result buffer
and it leads to argument value change.
The fix is to use 'old buffer' as result buffer only
if first argument is not constant item.
The assertion in String::copy was added in order to avoid
valgrind errors when the destination was the same as the source.
Eased restriction to allow for the case when str == NULL.
old_password() functions
The PASSWORD() and OLD_PASSWORD() functions could lead to
memory reads outside of an internal buffer when used with BLOB
arguments.
String::c_ptr() assumes there is at least one extra byte
in the internally allocated buffer when adding the trailing
'\0'. This, however, may not be the case when a String object
was initialized with externally allocated buffer.
The bug was fixed by adding an additional "length" argument to
make_scrambled_password_323() and make_scrambled_password() in
order to avoid String::c_ptr() calls for
PASSWORD()/OLD_PASSWORD().
However, since the make_scrambled_password[_323] functions are
a part of the client library ABI, the functions with the new
interfaces were implemented with the 'my_' prefix in their
names, with the old functions changed to be wrappers around
the new ones to maintain interface compatibility.
bug#44766: valgrind error when using convert() in a subquery
Problem: input and output buffers may be the same
converting a string to some charset.
That may lead to wrong results/valgrind warnings.
Fix: use different buffers.
warnings after uncompressed_length
UNCOMPRESSED_LENGTH() did not validate its argument. In
particular, if the argument length was less than 4 bytes,
an uninitialized memory value was returned as a result.
Since the result of COMPRESS() is either an empty string or
a 4-byte length prefix followed by compressed data, the bug was
fixed by ensuring that the argument of UNCOMPRESSED_LENGTH() is
either an empty string or contains at least 5 bytes (as done in
UNCOMPRESS()). This is the best we can do to validate input
without decompressing.
Problem: using LOAD_FILE() in some cases we pass a file name string
without a trailing '\0' to fn_format() which relies on that however.
That may lead to valgrind warnings.
Fix: add a trailing '\0' to the file name passed to fn_format().
The warning happens because string argument is not zero ended.
The fix is to add new parameter 'length' to SQL_CRYPT() and
use ptr() instead of c_ptr().
to wrong results
3 problems found with DES_ENCRYPT/DES_DECRYPT :
1. The max length was not calculated properly. Fixed in fix_length_and_dec()
2. DES_ENCRYPT had a side effect of sometimes reallocating and changing
the value of its argument. Fixed by explicitly pre-allocating the necessary
space to pad the argument with trailing '*' (stars) when calculating the
DES digest.
3. in DES_ENCRYPT the string buffer for the result value was not
reallocated to the correct size and only string length was assigned to it.
Fixed by making sure there's enough space to hold the result.
Took the Xfree implementation (based on the same rewrite as the NDB one)
and added it instead of the current implementation.
Added a macro to make the calls to MD5 more streamlined.
- Remove bothersome warning messages. This change focuses on the warnings
that are covered by the ignore file: support-files/compiler_warnings.supp.
- Strings are guaranteed to be max uint in length
- Remove bothersome warning messages. This change focuses on the warnings
that are covered by the ignore file: support-files/compiler_warnings.supp.
- Strings are guaranteed to be max uint in length
When substituting system constant functions with a constant result
the server was not expecting that the function may return NULL.
Fixed by checking for NULL and returning Item_null (in the relevant
collation) if the result of the system constant function was NULL.
offset for time part in UUIDs was 1/1000 of what it
should be. In other words, offset was off.
Also handle the case where we count into the future
when several UUIDs are generated in one "tick", and
then the next call is late enough for us to unwind
some but not all of those borrowed ticks.
Lastly, handle the case where we keep borrowing and
borrowing until the tick-counter overflows by also
changing into a new "numberspace" by creating a new
random suffix.
offset for time part in UUIDs was 1/1000 of what it
should be. In other words, offset was off.
Also handle the case where we count into the future
when several UUIDs are generated in one "tick", and
then the next call is late enough for us to unwind
some but not all of those borrowed ticks.
Lastly, handle the case where we keep borrowing and
borrowing until the tick-counter overflows by also
changing into a new "numberspace" by creating a new
random suffix.