SECONDARY INDEX IN INNODB
The patches for Bug#11751388 and Bug#11784056 enabled concurrent
reads while creating secondary indexes in InnoDB. However, they
introduced a regression. This regression occurred if ALTER TABLE
failed after the index had been added, for example during the
lock upgrade needed to update the .FRM file. If this happened,
InnoDB and the server got out of sync with regard to which
indexes actually existed. Therefore the patch for Bug#11815600 again
disabled concurrent reads.
This patch re-enables concurrent reads. The original regression
is fixed by splitting the ADD INDEX operation into two parts.
First the new index is created but not made active. This is
done while concurrent reads are allowed. The second part of
the operation makes the index active (or reverts the change).
This is done after lock upgrade, which prevents the original
regression.
In order to implement this change, the patch changes the storage
API for in-place index creation. handler::add_index() is split
into two functions, handler_add_index() and
handler::final_add_index(). The former creates indexes without
making them visible; the latter commits (i.e. makes visible)
the new indexes or reverts the changes.
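For illustration, a minimal self-contained C++ sketch of the
two-phase flow; the type and function names below are stand-ins,
not the actual handler API.

  // Sketch only: IndexDef, AddIndexCtx and the method names are
  // illustrative stand-ins, not the real server interface.
  #include <memory>
  #include <string>
  #include <utility>
  #include <vector>

  struct IndexDef { std::string name; };

  // Per-operation state handed from phase one to phase two.
  struct AddIndexCtx {
    std::vector<IndexDef> pending;  // built, but not yet visible
  };

  class HandlerSketch {
   public:
    // Phase 1: build the index while concurrent reads are allowed.
    std::unique_ptr<AddIndexCtx> add_index_begin(std::vector<IndexDef> defs) {
      auto ctx = std::make_unique<AddIndexCtx>();
      ctx->pending = std::move(defs);  // real code builds the index here
      return ctx;
    }

    // Phase 2: called after the lock upgrade; publish the indexes on
    // commit, or drop the half-built ones on rollback.
    void add_index_final(std::unique_ptr<AddIndexCtx> ctx, bool commit) {
      if (commit)
        for (auto &def : ctx->pending) visible_.push_back(std::move(def));
      // On rollback, ctx (and the pending indexes) are simply discarded.
    }

   private:
    std::vector<IndexDef> visible_;  // indexes queries may actually use
  };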
Large parts of this patch were written by Marko Mäkelä.
Test case added to innodb_mysql_lock.test.
Update for previous patch according to reviewers' comments.
Updated the constructors for ha_partition to use the common
init_handler_variables function.
Added defines for sizes and offsets to improve the readability of
the code that reads and writes the .par file. Also refactored the
get_from_handler_file function.
When executing a row-ordered-retrieval index merge, the handler
was cloned, but it used the wrong memory root: instead of
allocating memory on the thread/query's mem_root, it used the
table's mem_root. The memory was therefore tied to the table
object and was not freed until the table was closed.
Solution was to ensure that memory used during cloning of a
handler is allocated from the correct memory root.
This was implemented by changing handler::clone() to also take a
name argument, so that it can be used with partitioning, and by
having ha_partition::clone() allocate only the ha_partition's own
ref, then clone each of the original ha_partition's underlying
partition handlers and attach the clones to the new object.
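A rough C++ sketch of that idea; MemRoot and the clone()
signature here are simplified stand-ins for MEM_ROOT and the real
handler interface.

  // Sketch only: MemRoot and clone() are simplified stand-ins.
  #include <cstddef>
  #include <memory>
  #include <string>
  #include <utility>
  #include <vector>

  struct MemRoot {  // stand-in for the server's MEM_ROOT
    std::vector<std::unique_ptr<unsigned char[]>> blocks;
    unsigned char *alloc(std::size_t n) {
      blocks.emplace_back(new unsigned char[n]);
      return blocks.back().get();
    }
  };

  class HandlerSketch {
   public:
    virtual ~HandlerSketch() = default;
    // The fix: clone() receives the table name and the memory root to
    // use, so a clone made for index merge allocates from the query's
    // root instead of the table's root.
    virtual HandlerSketch *clone(const std::string &name, MemRoot *root) = 0;
  };

  class PartitionHandlerSketch : public HandlerSketch {
   public:
    explicit PartitionHandlerSketch(std::vector<HandlerSketch *> parts)
        : parts_(std::move(parts)) {}

    HandlerSketch *clone(const std::string &name, MemRoot *root) override {
      // Allocate only this handler's own row reference buffer from the
      // given root, then clone each underlying partition handler and
      // attach the clones to the new object.
      auto *copy = new PartitionHandlerSketch({});
      copy->ref_ = root->alloc(ref_length_);
      for (HandlerSketch *p : parts_)
        copy->parts_.push_back(p->clone(name, root));
      return copy;
    }

   private:
    std::vector<HandlerSketch *> parts_;
    unsigned char *ref_ = nullptr;
    std::size_t ref_length_ = 16;  // illustrative value
  };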
Fix of .bzrignore on Windows with VS 2010
but the statement is written to binlog
TRUNCATE PARTITION was written to the binlog
even if it failed before calling any partition's
truncate function.
Solved by adding an argument to truncate_partition that flags
whether the statement should be written to the binlog or not.
It should be written to the binlog only when a call to some
partition's truncate function has actually been made.
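A small C++ sketch of that control flow; the names and types are
illustrative, not the real ha_partition interface.

  // Sketch only: names and types are illustrative.
  #include <vector>

  struct PartitionSketch {
    bool used = true;             // partition matched by the statement
    int truncate() { return 0; }  // engine-level truncate, 0 on success
  };

  // The extra flag tells the caller whether anything was actually
  // handed to a partition's truncate function; only then should the
  // statement be written to the binary log.
  int truncate_partition_sketch(std::vector<PartitionSketch> &parts,
                                bool *binlog_stmt) {
    *binlog_stmt = false;
    for (PartitionSketch &p : parts) {
      if (!p.used) continue;
      *binlog_stmt = true;        // at least one truncate call was made
      if (int err = p.truncate()) return err;
    }
    return 0;
  }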
Bug#54678: InnoDB, TRUNCATE, ALTER, I_S SELECT, crash or deadlock
- Incompatible change: truncate no longer resorts to a row by
row delete if the storage engine does not support the truncate
method. Consequently, the count of affected rows does not, in
any case, reflect the actual number of rows.
- Incompatible change: it is no longer possible to truncate a
table that participates as a parent in a foreign key constraint,
unless it is a self-referencing constraint (both parent and child
are in the same table). To work around this incompatible change
and still be able to truncate such tables, disable foreign key checks
with SET foreign_key_checks=0 before truncate. Alternatively, if
foreign key checks are necessary, please use a DELETE statement
without a WHERE condition.
Problem description:
The problem was that for storage engines that do not support
truncate table via an external drop and recreate, such as InnoDB,
which implements truncate via an internal drop and recreate, the
delete_all_rows method could be invoked with a shared metadata
lock, causing problems if the engine needed exclusive access
to some internal metadata. This problem originated in the
fact that there was no truncate-specific handler method, which
ended up leading to an abuse of the delete_all_rows method that
is primarily used for delete operations without a condition.
Solution:
The solution is to introduce a truncate handler method that is
invoked when the engine does not support truncation via a table
drop and recreate. This method is invoked under an exclusive
metadata lock, so that there is only a single instance of the
table when the method is invoked.
Also, the method is not invoked and an error is thrown if
the table is a parent in a non-self-referencing foreign key
relationship. This was necessary to avoid inconsistency, as
some integrity checks are bypassed. This is in line with the
fact that truncate is primarily a DDL operation that was
designed to quickly remove all data from a table.
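A condensed C++ sketch of the dispatch logic described above; all
of the names below are illustrative stand-ins for the real server
code.

  // Sketch only: illustrative types, not the actual handler API.
  #include <stdexcept>

  struct EngineSketch {
    bool can_recreate = false;          // truncate via drop + recreate
    int recreate_table() { return 0; }  // drop and recreate the table
    int truncate() { return 0; }        // new truncate handler method
  };

  struct TableSketch {
    EngineSketch engine;
    bool parent_in_non_self_ref_fk = false;  // parent in a foreign key
  };

  int truncate_table_sketch(TableSketch &t) {
    // Truncating a FK parent would bypass integrity checks, so it is
    // refused unless the constraint is self-referencing.
    if (t.parent_in_non_self_ref_fk)
      throw std::runtime_error("Cannot truncate a table referenced in "
                               "a foreign key constraint");
    if (t.engine.can_recreate)
      return t.engine.recreate_table();
    // Otherwise take an exclusive metadata lock (not shown) and let the
    // engine truncate in place via the new handler method.
    return t.engine.truncate();
  }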
LOAD DATA into partitioned MyISAM table
Problem was that both partitioning and MyISAM
used the same table_share->mutex for different protections
(auto_increment and repair).
Solved by adding a specific mutex for the partitioning
auto_increment.
Also added destruction of the ha_data structure in
free_table_share (which is to be propagated
into 5.5).
This is a 5.1 ONLY patch, already fixed in 5.5+.
Problem was that the handler call ::extra(HA_EXTRA_CACHE) was cached
but the ::extra(HA_EXTRA_PREPARE_FOR_UPDATE) was not.
Solution was to also cache the other call and forward it when moving
to a new partition to scan.
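A short C++ sketch of the caching idea; only the flag names come
from the text above, the surrounding types are illustrative.

  // Sketch only: illustrative types around the two extra() flags.
  #include <vector>

  enum ExtraOp { EXTRA_CACHE, EXTRA_PREPARE_FOR_UPDATE };

  struct PartitionSketch {
    int extra(ExtraOp) { return 0; }  // engine-level extra() call
  };

  class PartitionScanSketch {
   public:
    // Remember every extra() call seen during the scan ...
    int extra(ExtraOp op) {
      cached_.push_back(op);
      return current_ ? current_->extra(op) : 0;
    }

    // ... and replay all of them when moving to the next partition.
    void switch_to(PartitionSketch *next) {
      current_ = next;
      for (ExtraOp op : cached_) current_->extra(op);
    }

   private:
    PartitionSketch *current_ = nullptr;
    std::vector<ExtraOp> cached_;
  };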
In bug-28430, HA_PRIMARY_KEY_REQUIRED_FOR_POSITION
was disabled in the partitioning engine by the first patch.
That bug was later fixed a second time, but the disabling of the
flag was not removed.
There is no need to disable this flag, as doing so leads to bad
choices in row-based replication.
The handler function for reading one row from a specific index
was not optimized in the partitioning handler since it
used the default implementation.
No test case is added since the change is performance-only; verified by hand.
This patch:
- Moves all definitions from the mysql_priv.h file into
header files for the component where the variable is
defined
- Creates header files if the component lacks one
- Eliminates all include directives from mysql_priv.h
- Eliminates all circular include cycles
- Renames time.cc to sql_time.cc
- Renames mysql_priv.h to sql_priv.h
into partitioned MyISAM table
Problem was that the ha_data structure was introduced in 5.1
and at first used only for partitioning, but with the intention
of being useful for other engines as well; when used by another
engine it would clash if the table was also partitioned.
Solution is to move the partitioning-specific data to a separate
structure, with its own mutex (which is used for auto_increment).
Also renamed PARTITION_INFO to PARTITION_STATS since a class
named partition_info already exists, and cleaned up some related
variables.
-------------------------------------------------------------------
revno: 2630.6.6
committer: Konstantin Osipov <konstantin@mysql.com>
branch nick: mysql-6.0-3726
timestamp: Tue 2008-05-27 16:15:44 +0400
message:
Implement code review fixes for WL#3726 "DDL locking for all
metadata objects": cleanup the code from share->mutex
acquisitions, which are now obsolete.
Problem was that ha_partition::records_in_range called
records_in_range for all non-pruned partitions, even when
only an estimate was needed.
Solution is to use only 1/3 of the partitions (up to 10) for
records_in_range and estimate the total from this subset.
(And continue until one of the called partitions'
records_in_range returns a non-zero value, since 0 means no rows
will match.)
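A C++ sketch of that sampling rule; the types are illustrative
and the scaling is a plain proportional estimate.

  // Sketch only: illustrative types around the sampling rule above.
  #include <algorithm>
  #include <cstddef>
  #include <cstdint>
  #include <vector>

  struct PartitionSketch {
    std::uint64_t records_in_range() { return 0; }  // per-partition estimate
  };

  std::uint64_t estimate_records_in_range(
      std::vector<PartitionSketch *> &parts) {
    // Sample roughly a third of the non-pruned partitions, capped at
    // 10, but keep going while the sampled partitions return 0, since
    // 0 means "no rows will match".
    std::size_t sample =
        std::min<std::size_t>(10, std::max<std::size_t>(1, parts.size() / 3));
    std::uint64_t sum = 0;
    std::size_t used = 0;
    for (PartitionSketch *p : parts) {
      sum += p->records_in_range();
      ++used;
      if (used >= sample && sum > 0) break;
    }
    if (sum == 0) return 0;  // no partition will return any rows
    // Scale the sampled total up to the full set of partitions.
    return sum * parts.size() / used;
  }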
Conflicts
=========
Text conflict in .bzr-mysql/default.conf
Text conflict in libmysqld/CMakeLists.txt
Text conflict in libmysqld/Makefile.am
Text conflict in mysql-test/collections/default.experimental
Text conflict in mysql-test/extra/rpl_tests/rpl_row_sp006.test
Text conflict in mysql-test/suite/binlog/r/binlog_tmp_table.result
Text conflict in mysql-test/suite/rpl/r/rpl_loaddata.result
Text conflict in mysql-test/suite/rpl/r/rpl_loaddata_fatal.result
Text conflict in mysql-test/suite/rpl/r/rpl_row_create_table.result
Text conflict in mysql-test/suite/rpl/r/rpl_row_sp006_InnoDB.result
Text conflict in mysql-test/suite/rpl/r/rpl_stm_log.result
Text conflict in mysql-test/suite/rpl_ndb/r/rpl_ndb_circular_simplex.result
Text conflict in mysql-test/suite/rpl_ndb/r/rpl_ndb_sp006.result
Text conflict in mysql-test/t/mysqlbinlog.test
Text conflict in sql/CMakeLists.txt
Text conflict in sql/Makefile.am
Text conflict in sql/log_event_old.cc
Text conflict in sql/rpl_rli.cc
Text conflict in sql/slave.cc
Text conflict in sql/sql_binlog.cc
Text conflict in sql/sql_lex.h
21 conflicts encountered.
NOTE
====
mysql-5.1-rpl-merge has been made a mirror of mysql-next-mr:
- "mysql-5.1-rpl-merge$ bzr pull ../mysql-next-mr"
This is the first cset (merge/...) committed after pulling
from mysql-next-mr.
get_table_share, drop_open_table
In the partition handler code, LOCK_open and share->LOCK_ha_data
are acquired in the wrong order in certain cases. When doing a
multi-row INSERT (i.e. an INSERT..SELECT) into a table with auto-
increment column(s), the increments must form a monotonically
increasing, continuous sequence (i.e. it can't have "holes"). To
achieve this, a lock is held for the duration of the operation;
share->LOCK_ha_data was used for this purpose.
Whenever there was a need to open a view _during_ the operation
(views are not currently pre-opened the way tables are) and
LOCK_open was grabbed, a deadlock could occur, because
share->LOCK_ha_data is used elsewhere _while_ holding LOCK_open.
A new mutex was introduced in the HA_DATA_PARTITION structure,
for exclusive use of the autoincrement data fields, so we don't
need to overload the use of LOCK_ha_data here.
A module test case has not been supplied, since the problem occurs
as a result of a race condition, and testing for this condition
is thus not deterministic. Testing for it could be done by
setting up a test case as described in the bug report.
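A small C++ sketch of the dedicated mutex; the struct and member
names are illustrative, not the real HA_DATA_PARTITION
definition.

  // Sketch only: illustrative stand-in for the partition share data.
  #include <cstdint>
  #include <mutex>

  struct PartitionShareSketch {
    std::mutex auto_inc_mutex;        // used only for the fields below
    std::uint64_t next_auto_inc = 1;
  };

  // Reserve a continuous block of auto-increment values for a
  // multi-row insert without touching LOCK_open or LOCK_ha_data.
  std::uint64_t reserve_auto_inc(PartitionShareSketch &share,
                                 std::uint64_t rows) {
    std::lock_guard<std::mutex> guard(share.auto_inc_mutex);
    std::uint64_t first = share.next_auto_inc;
    share.next_auto_inc += rows;
    return first;
  }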
"insert into.. select * from"
When inserting into a partitioned table using 'insert into
<target> select * from <src>', read_buffer_size bytes of memory
are allocated for each partition in the target table.
This resulted in large memory consumption when the number of
partitions is high.
This patch introduces a new method which tries to estimate the
buffer size required for each partition and limits the maximum
buffer size used to at most 10 * read_buffer_size
(11 * read_buffer_size in the case of monotonic partition functions).
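One possible C++ sketch of such a cap, assuming the limit is
shared across the partitions actually written; the function and
parameter names are illustrative.

  // Sketch only: illustrative names; assumes the cap is spread over
  // the partitions actually written to.
  #include <algorithm>
  #include <cstddef>

  std::size_t partition_buffer_size(std::size_t read_buffer_size,
                                    std::size_t partitions_used,
                                    bool monotonic_part_func) {
    // Allow at most 10 * read_buffer_size in total (11 * for monotonic
    // partition functions), spread over the partitions being written.
    std::size_t cap = (monotonic_part_func ? 11 : 10) * read_buffer_size;
    std::size_t per_partition =
        cap / std::max<std::size_t>(1, partitions_used);
    return std::min(read_buffer_size, per_partition);
  }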
(Backport)
Problem is that an insert into a partitioned table
(ha_start_bulk_insert) will call ha_start_bulk_insert for every
partition, used or not.
Solution is to delay the call to a partition's
ha_start_bulk_insert until the first row is to be inserted into
that partition.
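A small C++ sketch of the lazy start; the types and member names
are illustrative.

  // Sketch only: illustrative stand-ins for the partition handlers.
  #include <cstddef>
  #include <vector>

  struct PartitionSketch {
    bool bulk_started = false;
    void start_bulk_insert() { bulk_started = true; }  // engine call
    void write_row(int /*row*/) {}
  };

  // Instead of starting bulk insert in every partition up front, start
  // it lazily the first time a row actually lands in that partition.
  void write_row_sketch(std::vector<PartitionSketch> &parts,
                        std::size_t part_id, int row) {
    PartitionSketch &p = parts[part_id];
    if (!p.bulk_started) p.start_bulk_insert();
    p.write_row(row);
  }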
Inserting a negative value in the autoincrement column of a
partitioned InnoDB table was causing the value of the auto
increment counter to wrap around into a very large positive
value. The consequences are the same as if a very large positive
value was inserted into a column, e.g. reduced autoincrement
range, failure to read autoincrement counter.
The current patch ensures that before calculating the next
auto increment value, the current value is within the positive
maximum allowed limit.
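A simplified C++ sketch of the guard; the function and parameter
names are illustrative.

  // Sketch only: a simplified version of the guard described above.
  #include <cstdint>

  // Before computing the next value, make sure the current counter has
  // not wrapped past the largest positive value the column allows.
  std::uint64_t next_auto_inc_value(std::uint64_t current,
                                    std::uint64_t col_max) {
    if (current > col_max) current = col_max;  // clamp a wrapped counter
    return current < col_max ? current + 1 : col_max;
  }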