Also, implement MDEV-11027 a little differently from 5.5 and 10.0:
recv_apply_hashed_log_recs(): Change the return type back to void
(DB_SUCCESS was always returned).
Report progress also via systemd using sd_notifyf().
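A minimal sketch of this kind of progress reporting, assuming libsystemd is
available (link with -lsystemd); the function name and message text below are
illustrative only, not the actual InnoDB recovery code:

    // Illustrative sketch: report recovery progress both to stderr and to
    // systemd.  sd_notifyf() formats a notification message; "STATUS=..."
    // shows up in `systemctl status`.
    #include <systemd/sd-daemon.h>
    #include <cstdio>

    static void report_progress(unsigned long long done, unsigned long long total)
    {
      sd_notifyf(0, "STATUS=Applying batch %llu/%llu of redo log records",
                 done, total);
      std::fprintf(stderr, "Progress: %llu/%llu batches applied\n", done, total);
    }

    int main()
    {
      for (unsigned long long batch = 1; batch <= 5; batch++)
        report_progress(batch, 5);
      return 0;
    }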
ha_partition::init_record_priority_queue()
Cherry-pick rev.6b0ee0c795499cee7f9deb649fb010801e0be4c2 from mysql-5.6.
Bug #18305270 BACKPORT BUG#18694052 FIX
FOR ASSERTION `!M_ORDERED_REC_BUFFER'
FAILED TO 5.6
PROBLEM
-------
The record priority queue was not removed when
init_index failed for a partition, which
caused the crash.
FIX
---
Remove the priority queue if init_index fails
for a partition.
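A minimal sketch of the cleanup pattern described in this fix; the class and
member names are illustrative stand-ins for the ha_partition internals, not
the actual server code:

    #include <memory>
    #include <queue>
    #include <vector>

    // Stand-in for one underlying partition; init_index() may fail in practice.
    struct Partition {
      bool ok = true;
      bool init_index() const { return ok; }
    };

    class PartitionHandler {
      // Stand-in for the ordered-record buffer / priority queue of ha_partition.
      std::unique_ptr<std::priority_queue<int>> m_queue;
      std::vector<Partition> m_parts = std::vector<Partition>(4);

    public:
      int index_init_ordered()
      {
        m_queue.reset(new std::priority_queue<int>());  // set up the priority queue
        for (const Partition &part : m_parts)
        {
          if (!part.init_index())
          {
            m_queue.reset();   // the fix: tear the queue down before returning
            return 1;          // propagate the error to the caller
          }
        }
        return 0;
      }
    };

    int main()
    {
      PartitionHandler h;
      return h.index_init_ordered();
    }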
PROBLEM
Create time is calculated as the last status change time of the .frm file.
The first problem was that InnoDB passed the file name as
"table_name#po#p0.frm" to the stat() call which calculates the create time.
Since no .frm file with this name exists, create_time was stored as NULL.
The second problem is that ha_partition::info() updates the create time
statistic when the HA_STATUS_CONST flag is set, whereas InnoDB calculates
this statistic when HA_STATUS_TIME is set, which also causes create_time
to be set to NULL.
Fix
Pass the proper .frm name to the stat() call and calculate the create time
when the HA_STATUS_CONST flag is set.
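A minimal sketch of the two-part fix, assuming a POSIX stat(); the path
handling, flag names, and helper functions are illustrative stand-ins for the
handler code:

    // 1) stat() the shared "table_name.frm" (not "table_name#P#p0.frm"),
    // 2) fill create_time when the HA_STATUS_CONST-style flag is requested.
    #include <sys/stat.h>
    #include <cstdio>
    #include <ctime>
    #include <string>

    enum { STATUS_CONST = 1, STATUS_TIME = 2 };   // stand-ins for HA_STATUS_* flags

    static std::string frm_path_for(const std::string &partition_path)
    {
      // Strip the "#P#..." partition suffix so we stat the shared .frm file.
      std::string base = partition_path.substr(0, partition_path.find("#P#"));
      return base + ".frm";
    }

    static time_t create_time_from_frm(const std::string &partition_path)
    {
      struct stat st;
      if (stat(frm_path_for(partition_path).c_str(), &st) != 0)
        return 0;                        // would have stayed NULL before the fix
      return st.st_ctime;                // last status change time of the .frm
    }

    static void fill_stats(unsigned flags, const std::string &partition_path)
    {
      if (flags & STATUS_CONST)          // the flag the partition handler uses
        std::printf("create_time=%ld\n",
                    (long) create_time_from_frm(partition_path));
    }

    int main() { fill_stats(STATUS_CONST, "./test/t1#P#p0"); return 0; }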
Problem: the test tried to verify that no files were left behind
in the test database.
Fix: there is no need to list other file types; it is enough to
list only *.par files.
MySQL Bug#71095: Wrong results with PARTITION BY LIST COLUMNS()
MySQL Bug#72803: Wrong "Impossible where" with LIST partitioning
MDEV-6240: Wrong "Impossible where" with LIST partitioning
- Backport the fix from MySQL Bug#71095.
(This is the second attempt at the fix; re-committed with a typo fixed.)
- Moved the testcase from partition_test to partition_innodb.test where it can really work.
- Made ordered index scans over ha_partition tables satisfy the ROR property for
the case where the underlying table uses extended keys.
- Backport MySQL's fix: do set ha_partition::m_pkey_is_clustered for ha_partition
objects created with the handler->clone() call (see the sketch after this list).
- Also, include a testcase.
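A minimal sketch of the clone() part of the fix; the class is an illustrative
stand-in for ha_partition, not the actual handler code:

    // A cloned partition handler must inherit the "primary key is clustered"
    // flag from the original object, otherwise rowid-ordered retrieval (ROR)
    // plans make wrong assumptions about the clone.
    #include <memory>

    class PartitionHandler {
      bool m_pkey_is_clustered = false;   // stand-in for ha_partition::m_pkey_is_clustered

    public:
      void open(bool clustered_pk) { m_pkey_is_clustered = clustered_pk; }
      bool primary_key_is_clustered() const { return m_pkey_is_clustered; }

      std::unique_ptr<PartitionHandler> clone() const
      {
        auto copy = std::make_unique<PartitionHandler>();
        copy->m_pkey_is_clustered = m_pkey_is_clustered;  // the backported fix
        return copy;
      }
    };

    int main()
    {
      PartitionHandler h;
      h.open(true);                       // e.g. the underlying engine is InnoDB
      auto c = h.clone();
      return c->primary_key_is_clustered() ? 0 : 1;
    }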
includes:
* remove some remnants of "Bug#14521864: MYSQL 5.1 TO 5.5 BUGS PARTITIONING"
* introduce LOCK_share, now LOCK_ha_data is strictly for engines
* rea_create_table() always creates .par file (even in "frm-only" mode)
* fix a 5.6 bug, temp file leak on dummy ALTER TABLE
from MariaDB 10.0.
The bug in mdev-3948 was an instance of the problem fixed by Sergey's patch
in 10.0 - namely that the range optimizer could change table->[read | write]_set,
and not restore it.
revno: 3471
committer: Sergey Petrunya <psergey@askmonty.org>
branch nick: 10.0-serg-fix-imerge
timestamp: Sat 2012-11-03 12:24:36 +0400
message:
# MDEV-3817: Wrong result with index_merge+index_merge_intersection, InnoDB table, join, AND and OR conditions
Reconcile the fixes from:
#
# guilhem.bichot@oracle.com-20110805143029-ywrzuz15uzgontr0
# Fix for BUG#12698916 - "JOIN QUERY GIVES WRONG RESULT AT 2ND EXEC. OR
# AFTER FLUSH TABLES [-INT VS NULL]"
#
# guilhem.bichot@oracle.com-20111209150650-tzx3ldzxe1yfwji6
# Fix for BUG#12912171 - ASSERTION FAILED: QUICK->HEAD->READ_SET == SAVE_READ_SET
# and
#
and related fixes from: BUG#1006164, MDEV-376:
Now, ROR-merged QUICK_RANGE_SELECT objects make no assumptions about the values
of table->read_set and table->write_set.
Each QUICK_ROR_SELECT has (and had before) its own column bitmap, but now all
QUICK_ROR_SELECT functions that care about it: reset(), init_ror_merged_scan(), and
get_next(), set table->read_set when invoked and restore it to its previous
value before they return.
This avoids problems when something else modifies table->read_set for
some reason.
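A minimal sketch of the save/switch/restore pattern described above, with
simple stand-ins for TABLE::read_set and the merged scan; this is not the
actual QUICK_ROR_* code:

    #include <bitset>
    #include <cassert>

    struct Table { std::bitset<64> *read_set = nullptr; };  // stand-in for TABLE::read_set

    class RorMergedScan {
      Table *m_table;
      std::bitset<64> m_own_columns;                        // the scan's private bitmap

    public:
      RorMergedScan(Table *t, const std::bitset<64> &cols)
        : m_table(t), m_own_columns(cols) {}

      int get_next()
      {
        std::bitset<64> *saved = m_table->read_set;         // remember the caller's bitmap
        m_table->read_set = &m_own_columns;                 // install our own for this call
        int rc = 0;                                         // ... fetch a row here ...
        m_table->read_set = saved;                          // restore before returning
        return rc;
      }
    };

    int main()
    {
      std::bitset<64> caller_cols, scan_cols;
      Table t;
      t.read_set = &caller_cols;
      RorMergedScan scan(&t, scan_cols);
      scan.get_next();
      assert(t.read_set == &caller_cols);                   // the caller's bitmap is intact
      return 0;
    }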
PARTITION STATISTICS
The problem was the fix for bug#11756867; it always used the first
partitions, and stopped after it had checked 10 [sub]partitions
(or until it found a partition which would contain a match).
This results in bad statistics for tables where the first 10 partitions
do not represent the majority of the data (for example when the first 10
partitions only contain a few rows in total).
The solution is to take statistics from the partitions containing
the most rows instead:
Added an array of partition ids sorted by number of records
in descending order.
This array is used in records_in_range to cover as many records as
possible in as few calls as possible.
Also changed the limit on how many partitions to use for the statistics
from a static maximum of 10 partitions to a dynamic model:
the maximum number of partitions is now log2(total number of partitions),
taken from the ordered array.
It will continue calling records_in_range on partitions until it has
checked:
(total rows in matching partitions) * (maximum number of partitions)
/ (number of used partitions)
Also reverted the changes to ha_partition::scan_time() and
ha_partition::estimate_rows_upper_bound() to their state before
the fix of bug#11756867, since they are not as slow as
records_in_range.
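A minimal sketch of the sampling scheme described above; the per-partition
estimate and the stopping rule are a literal reading of the text, and the
actual server logic may differ in detail:

    // Order partitions by record count (descending), use at most log2(N) of
    // them, and stop once enough rows are covered.
    #include <algorithm>
    #include <cmath>
    #include <cstdint>
    #include <vector>

    struct Part { uint32_t id; uint64_t records; };

    static uint64_t estimate_records_in_range(std::vector<Part> parts,
                                              uint64_t rows_in_matching_parts)
    {
      // Partition ids sorted by number of records, largest first.
      std::sort(parts.begin(), parts.end(),
                [](const Part &a, const Part &b) { return a.records > b.records; });

      // Dynamic limit: log2(total number of partitions), at least 1.
      const size_t max_parts =
          std::max<size_t>(1, static_cast<size_t>(std::log2(parts.size())));

      uint64_t estimate = 0, checked_rows = 0;
      size_t used = 0;
      for (const Part &p : parts)
      {
        estimate += p.records / 2;   // stand-in for a per-partition records_in_range() call
        checked_rows += p.records;
        used++;
        // Literal reading of the stopping rule quoted above; the real code may differ.
        if (used >= max_parts ||
            checked_rows >= rows_in_matching_parts * max_parts / used)
          break;
      }
      return estimate;
    }

    int main()
    {
      std::vector<Part> parts{{0, 10}, {1, 5000}, {2, 7}, {3, 9000}};
      return estimate_records_in_range(parts, 14017) > 0 ? 0 : 1;
    }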
Fix for bug #57985 "ONLINE/FAST ALTER PARTITION can fail and
leave the table unusable".
A failing ALTER statement on a partitioned table could have left
the table in an unusable state. This happened in cases
when ALTER was executed using the "fast" algorithm, which doesn't
involve copying data between the old and new versions of the table,
and the resulting new table was incompatible with the partitioning
function in some way.
The problem stems from the fact that discrepancies between the new
table definition and the partitioning function are discovered only
when the table is opened. In the case of the "fast" algorithm this
happened too late during ALTER's execution, at the moment when
all changes were already done and could not have been reverted.
In cases when the "slow" algorithm, which copies data, is used,
such discrepancies are detected at the moment the new table
definition is implicitly opened when the new version of the table
is created in the storage engine. As a result, ALTER is aborted
before any changes to the table have been done.
This fix tries to address the issue by ensuring that the "fast"
algorithm behaves similarly to the "slow" algorithm and checks
compatibility between the new definition and the partitioning function
by trying to open the new definition after the .FRM file for it has
been created.
Long term we should probably implement some way to check
compatibility between the partitioning function and the new table
definition which does not involve opening it, as this should
allow a much cleaner fix for this problem.
mysql-test/r/partition_innodb.result:
Added test for bug #57985 "ONLINE/FAST ALTER PARTITION can
fail and leave the table unusable".
mysql-test/t/partition_innodb.test:
Added test for bug #57985 "ONLINE/FAST ALTER PARTITION can
fail and leave the table unusable".
sql/sql_table.cc:
Ensure that in cases when the .FRM for a partitioned table is
created without creating the table in the storage engine (e.g.
during "fast" ALTER TABLE) we still open the table definition.
This allows checking that the definition of the created table/.FRM
is compatible with its partitioning function.
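A minimal sketch of the validation idea described above; all names are
illustrative stand-ins for the server internals, not the actual sql_table.cc
code:

    // After writing the new .FRM for a "fast" ALTER, open the new definition
    // once as a validation step, and abort the ALTER before anything
    // irreversible has been done if the definition and the partitioning
    // function are incompatible.
    #include <cstdio>
    #include <string>

    struct TableDefinition {
      std::string frm_path;
      bool partitioning_compatible;   // illustrative: result of checking the function
    };

    // Stand-in for writing the new .FRM without touching the storage engine.
    static TableDefinition write_new_frm(const std::string &path, bool compatible)
    {
      return {path, compatible};
    }

    // Stand-in for opening the definition; fails if the partitioning function
    // does not agree with the new column layout.
    static bool open_definition(const TableDefinition &def)
    {
      return def.partitioning_compatible;
    }

    static bool fast_alter_table(const std::string &new_frm_path, bool compatible)
    {
      TableDefinition def = write_new_frm(new_frm_path, compatible);
      if (!open_definition(def))      // the added validation step
      {
        std::fprintf(stderr, "ALTER aborted: incompatible partitioning function\n");
        return false;                 // nothing irreversible has happened yet
      }
      // ... proceed with the in-place metadata changes ...
      return true;
    }

    int main()
    {
      // An incompatible definition is rejected before any changes are made.
      return fast_alter_table("./test/t1.frm", /* compatible= */ false) ? 1 : 0;
    }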