AS A INNODB PARTITTION.
PROBLEM
-------
The correct engine_type was not being set during
rebuild of the partition due to which the handler
was always created with the default engine,
which is innodb for 5.5+ ,therefore even if the
table was myisam, after rebuilding the partitions
ended up as innodb partitions.
FIX
---
Set the correct engine type during rebuild.
[Approved by mattiasj #rb3599]
INCLUDES FIRST PARTITION WHEN PRUNING
PROBLEM
-------
TO_DAYS()/TO_SECONDS() can return NULL for invalid dates which
was stored in the first partition ,therefore the first partition
was always included for the scan when range was specified.
FIX
---
The fix is a small optimization which we have included ,which will
prune the scanning of NULL/first partition if the dates specified
in the range are valid and in the same year and month . TO_SECONDS()
function is not supported in 5.1 so removed it from the fix and test
scripts for mysql-5.1 version.
In sql_class.cc, 'row_count', of type 'ha_rows', was used as last argument for
ER_TRUNCATED_WRONG_VALUE_FOR_FIELD which is
"Incorrect %-.32s value: '%-.128s' for column '%.192s' at row %ld".
So 'ha_rows' was used as 'long'.
On SPARC32 Solaris builds, 'long' is 4 bytes and 'ha_rows' is 'longlong' i.e. 8 bytes.
So the printf-like code was reading only the first 4 bytes.
Because the CPU is big-endian, 1LL is 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x01
so the first four bytes yield 0. So the warning message had "row 0" instead of
"row 1" in test outfile_loaddata.test:
-Warning 1366 Incorrect string value: '\xE1\xE2\xF7' for column 'b' at row 1
+Warning 1366 Incorrect string value: '\xE1\xE2\xF7' for column 'b' at row 0
All error-messaging functions which internally invoke some printf-life function
are potential candidate for such mistakes.
One apparently easy way to catch such mistakes is to use
ATTRIBUTE_FORMAT (from my_attribute.h).
But this works only when call site has both:
a) the format as a string literal
b) the types of arguments.
So:
func(ER(ER_BLAH), 10);
will silently not be checked, because ER(ER_BLAH) is not known at
compile time (it is known at run-time, and depends on the chosen
language).
And
func("%s", a va_list argument);
has the same problem, as the *real* type of arguments is not
known at this site at compile time (it's known in some caller).
Moreover,
func(ER(ER_BLAH));
though possibly correct (if ER(ER_BLAH) has no '%' markers), will not
compile (gcc says "error: format not a string literal and no format
arguments").
Consequences:
1) ATTRIBUTE_FORMAT is here added only to functions which in practice
take "string literal" formats: "my_error_reporter" and "print_admin_msg".
2) it cannot be added to the other functions: my_error(),
push_warning_printf(), Table_check_intact::report_error(),
general_log_print().
To do a one-time check of functions listed in (2), the following
"static code analysis" has been done:
1) replace
my_error(ER_xxx, arguments for substitution in format)
with the equivalent
my_printf_error(ER_xxx,ER(ER_xxx), arguments for substitution in
format),
so that we have ER(ER_xxx) and the arguments *in the same call site*
2) add ATTRIBUTE_FORMAT to push_warning_printf(),
Table_check_intact::report_error(), general_log_print()
3) replace ER(xxx) with the hard-coded English text found in
errmsg.txt (like: ER(ER_UNKNOWN_ERROR) is replaced with
"Unknown error"), so that a call site has the format as string literal
4) this way, ATTRIBUTE_FORMAT can effectively do its job
5) compile, fix errors detected by ATTRIBUTE_FORMAT
6) revert steps 1-2-3.
The present patch has no compiler error when submitted again to the
static code analysis above.
It cannot catch all problems though: see Field::set_warning(), in
which a call to push_warning_printf() has a variable error
(thus, not replacable by a string literal); I checked set_warning() calls
by hand though.
See also WL 5883 for one proposal to avoid such bugs from appearing
again in the future.
The issues fixed in the patch are:
a) mismatch in types (like 'int' passed to '%ld')
b) more arguments passed than specified in the format.
This patch resolves mismatches by changing the type/number of arguments,
not by changing error messages of sql/share/errmsg.txt. The latter would be wrong,
per the following old rule: errmsg.txt must be as stable as possible; no insertions
or deletions of messages, no changes of type or number of printf-like format specifiers,
are allowed, as long as the change impacts a message already released in a GA version.
If this rule is not followed:
- Connectors, which use error message numbers, will be confused (by insertions/deletions
of messages)
- using errmsg.sys of MySQL 5.1.n with mysqld of MySQL 5.1.(n+1)
could produce wrong messages or crash; such usage can easily happen if
installing 5.1.(n+1) while /etc/my.cnf still has --language=/path/to/5.1.n/xxx;
or if copying mysqld from 5.1.(n+1) into a 5.1.n installation.
When fixing b), I have verified that the superfluous arguments were not used in the format
in the first 5.1 GA (5.1.30 'bteam@astra04-20081114162938-z8mctjp6st27uobm').
Had they been used, then passing them today, even if the message doesn't use them
anymore, would have been necessary, as explained above.
multiple columns in the partition key
ndb crash if duplicate columns in the partitioning key.
Backport from mysql-5.1-telco-7.0, see bug#53354.
Changed from case sensitive field name comparision
to non case sensitive too.
multiline inserts into partition
Bug#57071: EXTRACT(WEEK from date_col) cannot be
allowed as partitioning function
Renamed function according to reviewers comments.
Bug#57071: EXTRACT(WEEK from date_col) cannot be allowed as partitioning function
There were functions allowed as partitioning functions
that implicit allowed cast. That could result in unacceptable
behaviour.
Solution was to check that the arguments of date and time functions
have allowed types (field and date/datetime/time depending on function).
Fix assorted warnings that are generated in optimized builds.
Most of it is silencing variables that are set but unused.
This patch also introduces the MY_ASSERT_UNREACHABLE macro
which helps the compiler to deduce that a certain piece of
code is unreachable.
In case of failure in ALTER ... PARTITION under LOCK TABLE
the server could crash, due to it had modified the locked
table object, which was not reverted in case of failure,
resulting in a bad table definition used after the failed
command.
Solved by always closing the LOCKED TABLE, even in case
of error.
Note: this is a 5.1-only fix, bug#56172 fixed it in 5.5+
/*![:version:] Query Code */, where [:version:] is a sequence of 5
digits representing the mysql server version(e.g /*!50200 ... */),
is a special comment that the query in it can be executed on those
servers whose versions are larger than the version appearing in the
comment. It leads to a security issue when slave's version is larger
than master's. A malicious user can improve his privileges on slaves.
Because slave SQL thread is running with SUPER privileges, so it can
execute queries that he/she does not have privileges on master.
This bug is fixed with the logic below:
- To replace '!' with ' ' in the magic comments which are not applied on
master. So they become common comments and will not be applied on slave.
- Example:
'INSERT INTO t1 VALUES (1) /*!10000, (2)*/ /*!99999 ,(3)*/
will be binlogged as
'INSERT INTO t1 VALUES (1) /*!10000, (2)*/ /* 99999 ,(3)*/
Item_hex_string::Item_hex_string
The status of memory allocation in the Lex_input_stream (called
from the Parser_state constructor) was not checked which led to
a parser crash in case of the out-of-memory error.
The solution is to introduce new init() member function in
Parser_state and Lex_input_stream so that status of memory
allocation can be returned to the caller.
concurrent I_S query
There were two problem:
1) MYSQL_LOCK_IGNORE_FLUSH also ignored name locks
2) there was a race between abort_and_upgrade_locks and
alter_close_tables
(i.e. remove_table_from_cache and
close_data_files_and_morph_locks)
Which allowed the table to be opened with MYSQL_LOCK_IGNORE_FLUSH flag
resulting in renaming a partition that was already in use,
which could cause the table to be unusable.
Solution was to not allow IGNORE_FLUSH to skip waiting for
a named locked table.
And to not release the LOCK_open mutex between the
calls to remove_table_from_cache and
close_data_files_and_morph_locks by merging the functions
abort_and_upgrade_locks and alter_close_tables.
(regression)
Problem was that partition pruning did not exclude the
last partition if the range was beyond it
(i.e. not using MAXVALUE)
Fix was to not include the last partition if the
partitioning function value was not within the partition
range.
REORGANIZE PARTITION
There were several problems which lead to this this,
all related to bad error handling.
1) There was several bugs preventing the ddl-log to be used for
cleaning up created files on error.
2) The error handling after the copy partition rows did not close
and unlock the tables, resulting in deletion of partitions
which were in use, which lead InnoDB to put the partition to
drop in a background queue.
Problem was when calculating the range of partitions for
pruning.
Solution was to get the calculation correct. I also simplified
it a bit for easier understanding.
timestamp primary key
Since TIMESTAMP values are adjusted by the current time zone
settings in both numeric and string contexts, using any
expressions involving TIMESTAMP values as a
(sub)partitioning function leads to undeterministic behavior of
partitioned tables. The effect may vary depending on a storage
engine, it can be either incorrect data being retrieved or
stored, or an assertion failure. The root cause of this is the
fact that the calculated partition ID may differ from a
previously calculated ID for the same data due to timezone
adjustments of the partitioning expression value.
Fixed by disabling any expressions involving TIMESTAMP values
to be used in partitioning functions with the follwing two
exceptions:
1. Creating or altering into a partitioned table that violates
the above rule is not allowed, but opening existing such tables
results in a warning rather than an error so that such tables
could be fixed.
2. UNIX_TIMESTAMP() is the only way to get a
timezone-independent value from a TIMESTAMP column, because it
returns the internal representation (a time_t value) of a
TIMESTAMP argument verbatim. So UNIX_TIMESTAMP(timestamp_column)
is allowed and should be used to fix existing tables if one
wants to use TIMESTAMP columns with partitioning.
ONLY_FULL_GROUP_BY
Problem was that during checking and preparation of the
partitioining function as a side effect in fix_fields
the full_group_by_flag was changed.
Solution was to set it back to its original value after
calling fix_fields.
Updated patch, to also exclude allow_sum_func from being
affected of fix_fields, as requested by reviewer.
Implemented the server infrastructure for the fix:
1. Added a function LEX_STRING *thd_query_string(THD) to return
a LEX_STRING structure instead of char *.
This is the function that must be called in innodb instead of
thd_query()
2. Did some encapsulation in THD : aggregated thd_query and
thd_query_length into a LEX_STRING and made accessor and mutator
methods for easy code updating.
3. Updated the server code to use the new methods where applicable.
Problem was that the partition containing NULL values
was pruned away, since '2001-01-01' < '2001-02-00' but
TO_DAYS('2001-02-00') is NULL.
Added the NULL partition for RANGE/LIST partitioning on TO_DAYS()
function to be scanned too.
Also fixed a bug that added ALLOW_INVALID_DATES to sql_mode
(SELECT * FROM t WHERE date_col < '1999-99-99' on a RANGE/LIST
partitioned table would add it).
when partition is reoganized.
Problem was that table->timestamp_field_type was not changed
before copying rows between partitions.
fixed by setting it to TIMESTAMP_NO_AUTO_SET as the first thing
in fast_alter_partition_table, so that all if-branches is covered.
We disallow the partitioning of a log table. You could however
partition a table first, and then point logging to it. This is
not only against the docs, it also crashes the server.
We catch this case now.
contains ONLY_FULL_GROUP_BY
The partitioning code needs to issue a Item::fix_fields()
on the partitioning expression in order to prepare
it for being evaluated.
It does this by creating a special table and a table list
for the scope of the partitioning expression.
But when checking ONLY_FULL_GROUP_BY the
Item_field::fix_fields() was relying that there always be
cached_table set and was trying to use it to get the
select_lex of the SELECT the field's table is in.
But the cached_table was not set by the partitioning code
that creates the artificial TABLE_LIST used to resolve the
partitioning expression and this resulted in a crash.
Fixed by rectifying the following errors :
1. Item_field::fix_fields() : the code that check for
ONLY_FULL_GROUP_BY relies on having tables with
cacheable_table set. This is mostly true, the only
two exceptions being the partitioning context table
and the trigger context table.
Fixed by taking the current parsing context if no pointer
to the TABLE_LIST instance is present in the cached_table.
2. fix_fields_part_func() :
2a. The code that adds the table being created to the
scope for the partitioning expression is mostly a copy
of the add_table_to_list and friends with one exception :
it was not marking the table as cacheable (something that
normal add_table_to_list is doing). This caused the
problem in the check for ONLY_FULL_GROUP_BY in
Item_field::fix_fields() to appear.
Fixed by setting the correct members to make the table
cacheable.
The ideal structural fix for this is to use a unified
interface for adding a table to a table list
(add_table_to_list?) : noted in a TODO comment
2b. The Item::fix_fields() was called with a NULL destination
pointer. This causes uninitalized memory reads in the
overloaded ::fix_fields() function (namely
Item_field::fix_fields()) as it expects a non-zero pointer
there. Fixed by passing the source pointer similarly to how
it's done in JOIN::prepare().
The problem: described in the bug report.
The fix:
--increase buffers where it's necessary
(buffers which are used in stxnmov)
--decrease buffer lengths which are used
Problem was an errornous date that lead to end partition
was before the start, leading to a crash.
Solution was to check greater or equal instead of only
equal between start and end partition.
NOTE: partitioning pruning handles incorrect dates
differently than index lookup, which can give different
results in a partitioned table versus a non partitioned
table for queries having 'bad' dates in the where clause.