If we meet DB_TOO_MANY_CONCURRENT_TRXS during the execution tab_create_graph from row_create_table_for_mysql(), .ibd file for the table should be created already but was not deleted for the error handling.
rb:875 approved by Jimmy Yang
If we meet DB_TOO_MANY_CONCURRENT_TRXS during the execution tab_create_graph from row_create_table_for_mysql(), .ibd file for the table should be created already but was not deleted for the error handling.
rb:875 approved by Jimmy Yang
If we meet DB_TOO_MANY_CONCURRENT_TRXS during the execution tab_create_graph from row_create_table_for_mysql(), .ibd file for the table should be created already but was not deleted for the error handling.
rb:875 approved by Jimmy Yang
The counter handler_read_key (SSV::ha_read_key_count) is incremented
incorrectly.
The mysql server maintains a per thread system_status_var (SSV)
object. This object contains among other things the counter
SSV::ha_read_key_count. The purpose of this counter is to measure the
number of requests to read a row based on a key (or the number of
index lookups).
This counter was wrongly incremented in the
ha_innobase::innobase_get_index(). The fix removes
this increment statement (for both innodb and innodb_plugin).
The various callers of the innobase_get_index() was checked to
determine if anybody must increment this counter (if they first call
innobase_get_index() and then perform an index lookup). It was found
that no caller of innobase_get_index() needs to worry about the
SSV::ha_read_key_count counter.
The counter handler_read_key (SSV::ha_read_key_count) is incremented
incorrectly.
The mysql server maintains a per thread system_status_var (SSV)
object. This object contains among other things the counter
SSV::ha_read_key_count. The purpose of this counter is to measure the
number of requests to read a row based on a key (or the number of
index lookups).
This counter was wrongly incremented in the
ha_innobase::innobase_get_index(). The fix removes
this increment statement (for both innodb and innodb_plugin).
The various callers of the innobase_get_index() was checked to
determine if anybody must increment this counter (if they first call
innobase_get_index() and then perform an index lookup). It was found
that no caller of innobase_get_index() needs to worry about the
SSV::ha_read_key_count counter.
I manually checked that all the conflicting InnoDB changes are in 5.5 already.
Two things I am not sure about - I commented them with XXX in this patch.
I will further check with the authors of the changesets whether these things
should be present or not.
I manually checked that all the conflicting InnoDB changes are in 5.5 already.
Two things I am not sure about - I commented them with XXX in this patch.
I will further check with the authors of the changesets whether these things
should be present or not.
a.k.a. Bug#7975 deadlock without any locking, simple select and update
Bug#7975 was reintroduced when the storage engine API was made
pluggable in MySQL 5.1. Instead of looking at thd->lex directly, we
rely on handler::extra(). But, we were looking at the wrong extra()
flag, and we were ignoring the TRX_DUP_REPLACE flag in places where we
should obey it.
innodb_replace.test: Add tests for hopefully all affected statement
types, so that bug should never ever resurface. This kind of tests
should have been added when fixing Bug#7975 in MySQL 5.0.3 in the
first place.
rb:806 approved by Sunny Bains
a.k.a. Bug#7975 deadlock without any locking, simple select and update
Bug#7975 was reintroduced when the storage engine API was made
pluggable in MySQL 5.1. Instead of looking at thd->lex directly, we
rely on handler::extra(). But, we were looking at the wrong extra()
flag, and we were ignoring the TRX_DUP_REPLACE flag in places where we
should obey it.
innodb_replace.test: Add tests for hopefully all affected statement
types, so that bug should never ever resurface. This kind of tests
should have been added when fixing Bug#7975 in MySQL 5.0.3 in the
first place.
rb:806 approved by Sunny Bains
In the ON UPDATE CASCADE clause of FOREIGN KEY constraints, the
calculated update vector was not fully initialized. This bug was
introduced in the InnoDB Plugin when implementing support for
ROW_FORMAT=DYNAMIC.
Additionally, the data type information was not initialized, but
apparently it has never been needed in this case. Nevertheless, it is
not good programming practice to pass uninitialized values around.
calc_row_difference(): Declare the update field uninitialized in
Valgrind. Copy the data type information as well, except when the
field is SQL NULL. In the built-in InnoDB, initialize
ufield->extern_storage = FALSE (an initialization bug that had gone
unnoticed this far). The InnoDB Plugin and later have this flag to
dfield_t and have always initialized it properly.
row_ins_cascade_calc_update_vec(): Reduce the scope of some
pointers. Initialize orig_len. (This caused the bug in InnoDB Plugin
and later.)
row_ins_foreign_check_on_constraint(): Simplify a condition. Declare
the update vector uninitialized.
rb:771 approved by Jimmy Yang
In the ON UPDATE CASCADE clause of FOREIGN KEY constraints, the
calculated update vector was not fully initialized. This bug was
introduced in the InnoDB Plugin when implementing support for
ROW_FORMAT=DYNAMIC.
Additionally, the data type information was not initialized, but
apparently it has never been needed in this case. Nevertheless, it is
not good programming practice to pass uninitialized values around.
calc_row_difference(): Declare the update field uninitialized in
Valgrind. Copy the data type information as well, except when the
field is SQL NULL. In the built-in InnoDB, initialize
ufield->extern_storage = FALSE (an initialization bug that had gone
unnoticed this far). The InnoDB Plugin and later have this flag to
dfield_t and have always initialized it properly.
row_ins_cascade_calc_update_vec(): Reduce the scope of some
pointers. Initialize orig_len. (This caused the bug in InnoDB Plugin
and later.)
row_ins_foreign_check_on_constraint(): Simplify a condition. Declare
the update vector uninitialized.
rb:771 approved by Jimmy Yang
sql/sql_insert.cc:
CREATE ... IF NOT EXISTS may do nothing, but
it is still not a failure. don't forget to my_ok it.
******
CREATE ... IF NOT EXISTS may do nothing, but
it is still not a failure. don't forget to my_ok it.
sql/sql_table.cc:
small cleanup
******
small cleanup
This fix was accidentally pushed to mysql-5.1 after the 5.1.59 clone-off in
bzr revision id marko.makela@oracle.com-20110829081642-z0w992a0mrc62s6w
with the fix of Bug#12704861 Corruption after a crash during BLOB update
but not merged to mysql-5.5 and upwards.
In the Barracuda formats, the clustered index record no longer
contains a prefix of off-page columns. Because of this, the undo log
must contain these prefixes, so that purge and multi-versioning will
continue to work. However, this also means that an undo log record can
become too big to fit in an undo log page. (It is a limitation of the
undo log that undo records cannot span across multiple pages.)
In case the checks for undo log size fail when CREATE TABLE or CREATE
INDEX is executed, we need a fallback that blocks a modification
operation when the undo log record would exceed the maximum size.
trx_undo_free_last_page_func(): Renamed from trx_undo_free_page_in_rollback().
Define the trx_t parameter only in debug builds.
trx_undo_free_last_page(): Wrapper for trx_undo_free_last_page_func().
Pass the trx_t parameter only in debug builds.
trx_undo_truncate_end_func(): Renamed from trx_undo_truncate_end().
Define the trx_t parameter only in debug builds. Rewrite a for(;;) loop
as a while loop for clarity.
trx_undo_truncate_end(): Wrapper for from trx_undo_truncate_end_func().
Pass the trx_t parameter only in debug builds.
trx_undo_erase_page_end(): Return TRUE if the page was non-empty
to begin with. Refuse to erase empty pages.
trx_undo_report_row_operation(): If the page for which the undo log
was too big was empty, free the undo page and return DB_TOO_BIG_RECORD.
rb:749 approved by Inaam Rana
This fix was accidentally pushed to mysql-5.1 after the 5.1.59 clone-off in
bzr revision id marko.makela@oracle.com-20110829081642-z0w992a0mrc62s6w
with the fix of Bug#12704861 Corruption after a crash during BLOB update
but not merged to mysql-5.5 and upwards.
In the Barracuda formats, the clustered index record no longer
contains a prefix of off-page columns. Because of this, the undo log
must contain these prefixes, so that purge and multi-versioning will
continue to work. However, this also means that an undo log record can
become too big to fit in an undo log page. (It is a limitation of the
undo log that undo records cannot span across multiple pages.)
In case the checks for undo log size fail when CREATE TABLE or CREATE
INDEX is executed, we need a fallback that blocks a modification
operation when the undo log record would exceed the maximum size.
trx_undo_free_last_page_func(): Renamed from trx_undo_free_page_in_rollback().
Define the trx_t parameter only in debug builds.
trx_undo_free_last_page(): Wrapper for trx_undo_free_last_page_func().
Pass the trx_t parameter only in debug builds.
trx_undo_truncate_end_func(): Renamed from trx_undo_truncate_end().
Define the trx_t parameter only in debug builds. Rewrite a for(;;) loop
as a while loop for clarity.
trx_undo_truncate_end(): Wrapper for from trx_undo_truncate_end_func().
Pass the trx_t parameter only in debug builds.
trx_undo_erase_page_end(): Return TRUE if the page was non-empty
to begin with. Refuse to erase empty pages.
trx_undo_report_row_operation(): If the page for which the undo log
was too big was empty, free the undo page and return DB_TOO_BIG_RECORD.
rb:749 approved by Inaam Rana
Also addressed issues in bug #11745133, where we could mark a table
corrupted instead of crashing the server when found a corrupted buffer/page
if the table created with innodb_file_per_table on.
Also addressed issues in bug #11745133, where we could mark a table
corrupted instead of crashing the server when found a corrupted buffer/page
if the table created with innodb_file_per_table on.
The patch replaces the use of the POSIX I/O interfaces in mysys on Windows with
the Win32 API calls (CreateFile, WriteFile, etc). The Windows HANDLE for the open
file is stored in the my_file_info struct, along with a flag for append mode
(because the Windows API does not support opening files in append mode in all cases)
The default max open files has been increased to 16384 and can be increased further
by setting --max-open-files=<value> during the server start.
Noteworthy benefit of this patch is that it removes limits from the table_cache size -
allowing for more simultaneus users
SECONDARY INDEX IN INNODB
The patches for Bug#11751388 and Bug#11784056 enabled concurrent
reads while creating secondary indexes in InnoDB. However, they
introduced a regression. This regression occured if ALTER TABLE
failed after the index had been added, for example during the
lock upgrade needed to update .FRM. If this happened, InnoDB
and the server got out of sync with regards to which indexes
actually existed. Therefore the patch for Bug#11815600 again
disabled concurrent reads.
This patch re-enables concurrent reads. The original regression
is fixed by splitting the ADD INDEX operation into two parts.
First the new index is created but not made active. This is
done while concurrent reads are allowed. The second part of
the operation makes the index active (or reverts the change).
This is done after lock upgrade, which prevents the original
regression.
In order to implement this change, the patch changes the storage
API for in-place index creation. handler::add_index() is split
into two functions, handler_add_index() and
handler::final_add_index(). The former for creating indexes without
making them visible and the latter for commiting (i.e. making
visible) new indexes or reverting the changes.
Large parts of this patch were written by Marko Mäkelä.
Test case added to innodb_mysql_lock.test.
With this change, the index prefix column length lifted from 767 bytes
to 3072 bytes if "innodb_large_prefix" is set to "true".
rb://603 approved by Marko
A lot of small fixes and new test cases.
client/mysqlbinlog.cc:
Cast removed
client/mysqltest.cc:
Added missing DBUG_RETURN
include/my_pthread.h:
set_timespec_time_nsec() now only takes one argument
mysql-test/t/date_formats.test:
Remove --disable_ps_protocl as now also ps supports microseconds
mysys/my_uuid.c:
Changed to use my_interval_timer() instead of my_getsystime()
mysys/waiting_threads.c:
Changed to use my_hrtime()
sql/field.h:
Added bool special_const_compare() for fields that may convert values before compare (like year)
sql/field_conv.cc:
Added test to get optimal copying of identical temporal values.
sql/item.cc:
Return that item_int is equal if it's positive, even if unsigned flag is different.
Fixed Item_cache_str::save_in_field() to have identical null check as other similar functions
Added proper NULL check to Item_cache_int::save_in_field()
sql/item_cmpfunc.cc:
Don't call convert_constant_item() if there is nothing that is worth converting.
Simplified test when years should be converted
sql/item_sum.cc:
Mark cache values in Item_sum_hybrid as not constants to ensure they are not replaced by other cache values in compare_datetime()
sql/item_timefunc.cc:
Changed sec_to_time() to take a my_decimal argument to ensure we don't loose any sub seconds.
Added Item_temporal_func::get_time() (This simplifies some things)
sql/mysql_priv.h:
Added Lazy_string_decimal()
sql/mysqld.cc:
Added my_decimal constants max_seconds_for_time_type, time_second_part_factor
sql/table.cc:
Changed expr_arena to be of type CONVENTIONAL_EXECUTION to ensure that we don't loose any items that are created by fix_fields()
sql/tztime.cc:
TIME_to_gmt_sec() now sets *in_dst_time_gap in case of errors
This is needed to be able to detect if timestamp is 0
storage/maria/lockman.c:
Changed from my_getsystime() to set_timespec_time_nsec()
storage/maria/ma_loghandler.c:
Changed from my_getsystime() to my_hrtime()
storage/maria/ma_recovery.c:
Changed from my_getsystime() to mmicrosecond_interval_timer()
storage/maria/unittest/trnman-t.c:
Changed from my_getsystime() to mmicrosecond_interval_timer()
storage/xtradb/handler/ha_innodb.cc:
Added support for new time,datetime and timestamp
unittest/mysys/thr_template.c:
my_getsystime() -> my_interval_timer()
unittest/mysys/waiting_threads-t.c:
my_getsystime() -> my_interval_timer()
HA_INNOBASE::UPDATE_ROW, TEMPORARY TABLE, TABLE LOCK".
Attempt to update an InnoDB temporary table under LOCK TABLES
led to assertion failure in both debug and production builds
if this temporary table was explicitly locked for READ. The
same scenario works fine for MyISAM temporary tables.
The assertion failure was caused by discrepancy between lock
that was requested on the rows of temporary table at LOCK TABLES
time and by update operation. Since SQL-layer requested a
read-lock at LOCK TABLES time InnoDB engine assumed that upcoming
statements which are going to be executed under LOCK TABLES will
only read table and therefore should acquire only S-lock.
An update operation broken this assumption by requesting X-lock.
Possible approaches to fixing this problem are:
1) Skip locking of temporary tables as locking doesn't make any
sense for connection-local objects.
2) Prohibit changing of temporary table locked by LOCK TABLES ...
READ.
Unfortunately both of these approaches have drawbacks which make
them unviable for stable versions of server.
So this patch takes another approach and changes code in such way
that LOCK TABLES for a temporary table will always request write
lock. In 5.1 version of this patch switch from read lock to write
lock is done inside of InnoDBs handler methods as doing it on
SQL-layer causes compatibility troubles with FLUSH TABLES WITH
READ LOCK.
mysql-test/suite/innodb/r/innodb_mysql.result:
Added test for bug #11762012 - "54553: INNODB ASSERTS IN
HA_INNOBASE::UPDATE_ROW, TEMPORARY TABLE, TABLE LOCK".
mysql-test/suite/innodb/t/innodb_mysql.test:
Added test for bug #11762012 - "54553: INNODB ASSERTS IN
HA_INNOBASE::UPDATE_ROW, TEMPORARY TABLE, TABLE LOCK".
mysql-test/suite/innodb_plugin/r/innodb_mysql.result:
Added test for bug #11762012 - "54553: INNODB ASSERTS IN
HA_INNOBASE::UPDATE_ROW, TEMPORARY TABLE, TABLE LOCK".
mysql-test/suite/innodb_plugin/t/innodb_mysql.test:
Added test for bug #11762012 - "54553: INNODB ASSERTS IN
HA_INNOBASE::UPDATE_ROW, TEMPORARY TABLE, TABLE LOCK".
storage/innobase/handler/ha_innodb.cc:
Assume that a temporary table locked by LOCK TABLES can be updated
even if it was only locked for read and therefore an X-lock should
be always requested for such tables.
storage/innodb_plugin/handler/ha_innodb.cc:
Assume that a temporary table locked by LOCK TABLES can be updated
even if it was only locked for read and therefore an X-lock should
be always requested for such tables.
HA_INNOBASE::UPDATE_ROW, TEMPORARY TABLE, TABLE LOCK".
Attempt to update an InnoDB temporary table under LOCK TABLES
led to assertion failure in both debug and production builds
if this temporary table was explicitly locked for READ. The
same scenario works fine for MyISAM temporary tables.
The assertion failure was caused by discrepancy between lock
that was requested on the rows of temporary table at LOCK TABLES
time and by update operation. Since SQL-layer requested a
read-lock at LOCK TABLES time InnoDB engine assumed that upcoming
statements which are going to be executed under LOCK TABLES will
only read table and therefore should acquire only S-lock.
An update operation broken this assumption by requesting X-lock.
Possible approaches to fixing this problem are:
1) Skip locking of temporary tables as locking doesn't make any
sense for connection-local objects.
2) Prohibit changing of temporary table locked by LOCK TABLES ...
READ.
Unfortunately both of these approaches have drawbacks which make
them unviable for stable versions of server.
So this patch takes another approach and changes code in such way
that LOCK TABLES for a temporary table will always request write
lock. In 5.1 version of this patch switch from read lock to write
lock is done inside of InnoDBs handler methods as doing it on
SQL-layer causes compatibility troubles with FLUSH TABLES WITH
READ LOCK.
- Added a lot of code comments
- Updated get_best_ror_intersec() to prefer index scan on not clustered keys before clustered keys.
- Use HA_CLUSTERED_INDEX to define if one should use HA_MRR_INDEX_ONLY
- For test of using index or filesort to resolve ORDER BY, use HA_CLUSTERED_INDEX flag instead of primary_key_is_clustered()
- Use HA_TABLE_SCAN_ON_INDEX instead of primary_key_is_clustered() to decide if ALTER TABLE ... ORDER BY will have any effect.
sql/ha_partition.h:
Added comment with warning for code unsafe to use with multiple storage engines at the same time
sql/handler.h:
Added HA_CLUSTERED_INDEX.
Documented primary_key_is_clustered()
sql/opt_range.cc:
Added code comments
Updated get_best_ror_intersec() to ignore clustered keys.
Optimized away cpk_scan_used and one instance of current_thd (Simpler code)
Use HA_CLUSTERED_INDEX to define if one should use HA_MRR_INDEX_ONLY
sql/sql_select.cc:
Changed comment to #ifdef
For test of using index or filesort to resolve ORDER BY, use HA_CLUSTERED_INDEX flag instead of primary_key_is_clustered()
(Change is smaller than what it looks beause of indentation change)
sql/sql_table.cc:
Use HA_TABLE_SCAN_ON_INDEX instead of primary_key_is_clustered() to decide if ALTER TABLE ... ORDER BY will have any effect.
storage/innobase/handler/ha_innodb.h:
Added support for HA_CLUSTERED_INDEX
storage/innodb_plugin/handler/ha_innodb.cc:
Added support for HA_CLUSTERED_INDEX
storage/xtradb/handler/ha_innodb.cc:
Added support for HA_CLUSTERED_INDEX
DB_COL_APPEARS_TWICE_IN_INDEX: Remove. This condition is already
checked and reported by MySQL before passing the index definition to
the storage engine.
row_create_index_for_mysql(): Remove the redundant check for
DB_COL_APPEARS_TWICE_IN_INDEX. When enforcing the column prefix index
limit, invoke dict_mem_index_free(index) to plug the memory leak. In
the loop, use index->n_def instead of dict_index_get_n_fields(index),
because the latter would be 0 for indexes that have not been copied to
the data dictionary cache.
innodb-use-sys-malloc.test:
Add test cases for attempting to trigger the error checks in
row_create_index_for_mysql(). Before MySQL 5.5 and WL#5743, the leak
is only reproducible if ha_innobase::max_supported_key_part_length()
returned a higher limit than the one used in
row_create_index_for_mysql().
In MySQL 5.5 and later, the leak is reproducible with
innodb_large_prefix=true.
rb:688 approved by Jimmy Yang
DB_COL_APPEARS_TWICE_IN_INDEX: Remove. This condition is already
checked and reported by MySQL before passing the index definition to
the storage engine.
row_create_index_for_mysql(): Remove the redundant check for
DB_COL_APPEARS_TWICE_IN_INDEX. When enforcing the column prefix index
limit, invoke dict_mem_index_free(index) to plug the memory leak. In
the loop, use index->n_def instead of dict_index_get_n_fields(index),
because the latter would be 0 for indexes that have not been copied to
the data dictionary cache.
innodb-use-sys-malloc.test:
Add test cases for attempting to trigger the error checks in
row_create_index_for_mysql(). Before MySQL 5.5 and WL#5743, the leak
is only reproducible if ha_innobase::max_supported_key_part_length()
returned a higher limit than the one used in
row_create_index_for_mysql().
In MySQL 5.5 and later, the leak is reproducible with
innodb_large_prefix=true.
rb:688 approved by Jimmy Yang
SECONDARY INDEX IN INNODB
The patches for Bug#11751388 and Bug#11784056 enabled concurrent
reads while creating secondary indexes in InnoDB. However, they
introduced a regression. This regression occured if ALTER TABLE
failed after the index had been added, for example during the
lock upgrade needed to update .FRM. If this happened, InnoDB
and the server got out of sync with regards to which indexes
actually existed. Therefore the patch for Bug#11815600 again
disabled concurrent reads.
This patch re-enables concurrent reads. The original regression
is fixed by splitting the ADD INDEX operation into two parts.
First the new index is created but not made active. This is
done while concurrent reads are allowed. The second part of
the operation makes the index active (or reverts the change).
This is done after lock upgrade, which prevents the original
regression.
In order to implement this change, the patch changes the storage
API for in-place index creation. handler::add_index() is split
into two functions, handler_add_index() and
handler::final_add_index(). The former for creating indexes without
making them visible and the latter for commiting (i.e. making
visible) new indexes or reverting the changes.
Large parts of this patch were written by Marko Mäkelä.
Test case added to innodb_mysql_lock.test.
With this change, the index prefix column length lifted from 767 bytes
to 3072 bytes if "innodb_large_prefix" is set to "true".
rb://603 approved by Marko
The innoDB global variable srv_lower_case_table_names is set to the value of lower_case_table_names declared in mysqld.h server in ha_innodb.cc. Since this variable can change at runtime, it is reset for each handler call to ::create, ::open, ::rename_table & ::delete_table.
But it is possible for tables to be implicitly opened before an explicit handler call is made when an engine is first started or restarted. I was able to reproduce that with the testcase in this patch on a version of InnoDB from 2 weeks ago. It seemed like the change buffer entries for the secondary key was getting put into pages after the restart. (But I am not sure, I did not write down the call stack while it was reproducing.) In the current code, the implicit open, which is actually a call to dict_load_foreigns(), does not occur with this testcase.
The change is to replace srv_lower_case_table_names by an interface function in innodb.cc that retrieves the server global variable when it is needed.
The innoDB global variable srv_lower_case_table_names is set to the value of lower_case_table_names declared in mysqld.h server in ha_innodb.cc. Since this variable can change at runtime, it is reset for each handler call to ::create, ::open, ::rename_table & ::delete_table.
But it is possible for tables to be implicitly opened before an explicit handler call is made when an engine is first started or restarted. I was able to reproduce that with the testcase in this patch on a version of InnoDB from 2 weeks ago. It seemed like the change buffer entries for the secondary key was getting put into pages after the restart. (But I am not sure, I did not write down the call stack while it was reproducing.) In the current code, the implicit open, which is actually a call to dict_load_foreigns(), does not occur with this testcase.
The change is to replace srv_lower_case_table_names by an interface function in innodb.cc that retrieves the server global variable when it is needed.
In ha_innobase::create(), we check some things while holding an
exclusive lock on the data dictionary. Defer the locking and the
creation of transactions until after the checks have passed. The
THDVAR could hang due to a mutex wait (see Bug #11750569 - 41163:
deadlock in mysqld: LOCK_global_system_variables and LOCK_open), and
we want to avoid waiting while holding InnoDB mutexes.
innobase_index_name_is_reserved(): Replace the parameter trx_t with
THD, so that the test can be performed before starting an InnoDB
transaction. We only needed trx->mysql_thd.
ha_innobase::create(): Create transaction and lock the data dictionary
only after passing the basic tests.
create_table_def(): Move the IS_MAGIC_TABLE_AND_USER_DENIED_ACCESS
check to ha_innobase::create(). Assign to srv_lower_case_table_names
while holding dict_sys->mutex.
ha_innobase::delete_table(), ha_innobase::rename_table(),
innobase_rename_table(): Assign srv_lower_case_table_names as late as
possible. Here, the variable is not necessarily protected by
dict_sys->mutex.
ha_innobase::add_index(): Invoke innobase_index_name_is_reserved() and
innobase_check_index_keys() before allocating anything.
rb:618 approved by Jimmy Yang
In ha_innobase::create(), we check some things while holding an
exclusive lock on the data dictionary. Defer the locking and the
creation of transactions until after the checks have passed. The
THDVAR could hang due to a mutex wait (see Bug #11750569 - 41163:
deadlock in mysqld: LOCK_global_system_variables and LOCK_open), and
we want to avoid waiting while holding InnoDB mutexes.
innobase_index_name_is_reserved(): Replace the parameter trx_t with
THD, so that the test can be performed before starting an InnoDB
transaction. We only needed trx->mysql_thd.
ha_innobase::create(): Create transaction and lock the data dictionary
only after passing the basic tests.
create_table_def(): Move the IS_MAGIC_TABLE_AND_USER_DENIED_ACCESS
check to ha_innobase::create(). Assign to srv_lower_case_table_names
while holding dict_sys->mutex.
ha_innobase::delete_table(), ha_innobase::rename_table(),
innobase_rename_table(): Assign srv_lower_case_table_names as late as
possible. Here, the variable is not necessarily protected by
dict_sys->mutex.
ha_innobase::add_index(): Invoke innobase_index_name_is_reserved() and
innobase_check_index_keys() before allocating anything.
rb:618 approved by Jimmy Yang
causes future shutdown hang
InnoDB would hang on shutdown if any XA transactions exist in the
system in the PREPARED state. This has been masked by the fact that
MySQL would roll back any PREPARED transaction on shutdown, in the
spirit of Bug #12161 Xa recovery and client disconnection.
[mysql-test-run] do_shutdown_server: Interpret --shutdown_server 0 as
a request to kill the server immediately without initiating a
shutdown procedure.
xid_cache_insert(): Initialize XID_STATE::rm_error in order to avoid a
bogus error message on XA ROLLBACK of a recovered PREPARED transaction.
innobase_commit_by_xid(), innobase_rollback_by_xid(): Free the InnoDB
transaction object after rolling back a PREPARED transaction.
trx_get_trx_by_xid(): Only consider transactions whose
trx->is_prepared flag is set. The MySQL layer seems to prevent
attempts to roll back connected transactions that are in the PREPARED
state from another connection, but it is better to play it safe. The
is_prepared flag was introduced in the InnoDB Plugin.
trx_n_prepared: A new counter, counting the number of InnoDB
transactions in the PREPARED state.
logs_empty_and_mark_files_at_shutdown(): On shutdown, allow
trx_n_prepared transactions to exist in the system.
trx_undo_free_prepared(), trx_free_prepared(): New functions, to free
the memory objects of PREPARED transactions on shutdown. This is not
needed in the built-in InnoDB, because it would collect all allocated
memory on shutdown. The InnoDB Plugin needs this because of
innodb_use_sys_malloc.
trx_sys_close(): Invoke trx_free_prepared() on all remaining
transactions.
causes future shutdown hang
InnoDB would hang on shutdown if any XA transactions exist in the
system in the PREPARED state. This has been masked by the fact that
MySQL would roll back any PREPARED transaction on shutdown, in the
spirit of Bug #12161 Xa recovery and client disconnection.
[mysql-test-run] do_shutdown_server: Interpret --shutdown_server 0 as
a request to kill the server immediately without initiating a
shutdown procedure.
xid_cache_insert(): Initialize XID_STATE::rm_error in order to avoid a
bogus error message on XA ROLLBACK of a recovered PREPARED transaction.
innobase_commit_by_xid(), innobase_rollback_by_xid(): Free the InnoDB
transaction object after rolling back a PREPARED transaction.
trx_get_trx_by_xid(): Only consider transactions whose
trx->is_prepared flag is set. The MySQL layer seems to prevent
attempts to roll back connected transactions that are in the PREPARED
state from another connection, but it is better to play it safe. The
is_prepared flag was introduced in the InnoDB Plugin.
trx_n_prepared: A new counter, counting the number of InnoDB
transactions in the PREPARED state.
logs_empty_and_mark_files_at_shutdown(): On shutdown, allow
trx_n_prepared transactions to exist in the system.
trx_undo_free_prepared(), trx_free_prepared(): New functions, to free
the memory objects of PREPARED transactions on shutdown. This is not
needed in the built-in InnoDB, because it would collect all allocated
memory on shutdown. The InnoDB Plugin needs this because of
innodb_use_sys_malloc.
trx_sys_close(): Invoke trx_free_prepared() on all remaining
transactions.
ibuf_inside(), ibuf_enter(), ibuf_exit(): Add the parameter mtr. The
flag is no longer kept in the thread-local storage but in the
mini-transaction (mtr->inside_ibuf).
mtr_start(): Clean up the comment and remove the unused return value.
mtr_commit(): Assert !ibuf_inside(mtr) in debug builds.
ibuf_mtr_start(): Like mtr_start(), but sets the flag.
ibuf_mtr_commit(), ibuf_btr_pcur_commit_specify_mtr(): Wrappers that
assert ibuf_inside().
buf_page_get_zip(), buf_page_init_for_read(),
buf_read_ibuf_merge_pages(), fil_io(), ibuf_free_excess_pages(),
ibuf_contract_ext(): Remove assertions on ibuf_inside(), because a
mini-transaction is not available.
buf_read_ahead_linear(): Add the parameter inside_ibuf.
ibuf_restore_pos(): When this function returns FALSE, it commits mtr
and must therefore do ibuf_exit(mtr).
ibuf_delete_rec(): This function commits mtr and must therefore do
ibuf_exit(mtr).
ibuf_rec_get_page_no(), ibuf_rec_get_space(), ibuf_rec_get_info(),
ibuf_rec_get_op_type(), ibuf_build_entry_from_ibuf_rec(),
ibuf_rec_get_volume(), ibuf_get_merge_page_nos(),
ibuf_get_volume_buffered_count(), ibuf_get_entry_counter_low(): Add
the parameter mtr in debug builds, for asserting ibuf_inside(mtr).
rb:585 approved by Sunny Bains
ibuf_inside(), ibuf_enter(), ibuf_exit(): Add the parameter mtr. The
flag is no longer kept in the thread-local storage but in the
mini-transaction (mtr->inside_ibuf).
mtr_start(): Clean up the comment and remove the unused return value.
mtr_commit(): Assert !ibuf_inside(mtr) in debug builds.
ibuf_mtr_start(): Like mtr_start(), but sets the flag.
ibuf_mtr_commit(), ibuf_btr_pcur_commit_specify_mtr(): Wrappers that
assert ibuf_inside().
buf_page_get_zip(), buf_page_init_for_read(),
buf_read_ibuf_merge_pages(), fil_io(), ibuf_free_excess_pages(),
ibuf_contract_ext(): Remove assertions on ibuf_inside(), because a
mini-transaction is not available.
buf_read_ahead_linear(): Add the parameter inside_ibuf.
ibuf_restore_pos(): When this function returns FALSE, it commits mtr
and must therefore do ibuf_exit(mtr).
ibuf_delete_rec(): This function commits mtr and must therefore do
ibuf_exit(mtr).
ibuf_rec_get_page_no(), ibuf_rec_get_space(), ibuf_rec_get_info(),
ibuf_rec_get_op_type(), ibuf_build_entry_from_ibuf_rec(),
ibuf_rec_get_volume(), ibuf_get_merge_page_nos(),
ibuf_get_volume_buffered_count(), ibuf_get_entry_counter_low(): Add
the parameter mtr in debug builds, for asserting ibuf_inside(mtr).
rb:585 approved by Sunny Bains
KEY NO 0 FOR TABLE IN ERROR LOG
With the changes made by the patches for Bug#11751388 and
Bug#11784056, concurrent reads are allowed while secondary
indexes are created in InnoDB. This means that the metadata
lock on the affected table is not upgraded to exclusive
until the .FRM is updated at the end of ALTER TABLE processing.
The problem was that if this lock upgrade failed for some
reason (e.g. timeout), the index information in the server
and inside InnoDB would be out of sync. This would happen
since the add index operation already was committed inside
InnoDB but the table metadata inside the server had not been
updated yet.
This patch fixes the problem by (for now) reverting the
effects of the patches for Bug#11751388 and Bug#11784056.
Concurrent reads will now again be blocked during creation
of secondary indexes in InnoDB.
Test case added to innodb_mysql_lock.test.
KEY NO 0 FOR TABLE IN ERROR LOG
With the changes made by the patches for Bug#11751388 and
Bug#11784056, concurrent reads are allowed while secondary
indexes are created in InnoDB. This means that the metadata
lock on the affected table is not upgraded to exclusive
until the .FRM is updated at the end of ALTER TABLE processing.
The problem was that if this lock upgrade failed for some
reason (e.g. timeout), the index information in the server
and inside InnoDB would be out of sync. This would happen
since the add index operation already was committed inside
InnoDB but the table metadata inside the server had not been
updated yet.
This patch fixes the problem by (for now) reverting the
effects of the patches for Bug#11751388 and Bug#11784056.
Concurrent reads will now again be blocked during creation
of secondary indexes in InnoDB.
Test case added to innodb_mysql_lock.test.
NON-PRIMARY UNIQUE INDEX USING INNODB
This patch adds the HA_INPLACE_ADD_UNIQUE_INDEX_NO_WRITE
capability flag to InnoDB, indicating that concurrent reads
can be allowed while non-primary unique indexes are created.
This is an follow-up to Bug #11751388 which enabled concurrent
reads when creating non-primary non-unique indexes.
Test case added to innodb_mysql_sync.test.
NON-PRIMARY UNIQUE INDEX USING INNODB
This patch adds the HA_INPLACE_ADD_UNIQUE_INDEX_NO_WRITE
capability flag to InnoDB, indicating that concurrent reads
can be allowed while non-primary unique indexes are created.
This is an follow-up to Bug #11751388 which enabled concurrent
reads when creating non-primary non-unique indexes.
Test case added to innodb_mysql_sync.test.
Bug #11766501: Multiple RBS break the get rseg with mininum trx_t::no code during purge
Bug# 59291 changes:
Main problem is that truncating the UNDO log at the completion of every
trx_purge() call is expensive as the number of rollback segments is increased.
We truncate after a configurable amount of pages. The innodb_purge_batch_size
parameter is used to control when InnoDB does the actual truncate. The truncate
is done once after 128 (or TRX_SYS_N_RSEGS iterations). In other words we
truncate after purge 128 * innodb_purge_batch_size. The smaller the batch
size the quicker we truncate.
Introduce a new parameter that allows how many rollback segments to use for
storing REDO information. This is really step 1 in allowing complete control
to the user over rollback space management.
New parameters:
i) innodb_rollback_segments = number of rollback_segments to use
(default is now 128) dynamic parameter, can be changed anytime.
Currently there is little benefit in changing it from the default.
Optimisations in the patch.
i. Change the O(n) behaviour of trx_rseg_get_on_id() to O(log n)
Backported from 5.6. Refactor some of the binary heap code.
Create a new include/ut0bh.ic file.
ii. Avoid truncating the rollback segments after every purge.
Related changes that were moved to a separate patch:
i. Purge should not do any flushing, only wait for space to be free so that
it only does purging of records unless it is held up by a long running
transaction that is preventing it from progressing.
ii. Give the purge thread preference over transactions when acquiring the
rseg->mutex during commit. This to avoid purge blocking unnecessarily
when getting the next rollback segment to purge.
Bug #11766501 changes:
Add the rseg to the min binary heap under the cover of the kernel mutex and
the binary heap mutex. This ensures the ordering of the min binary heap.
The two changes have to be committed together because they share the same
that fixes both issues.
rb://567 Approved by: Inaam Rana.
Bug #11766501: Multiple RBS break the get rseg with mininum trx_t::no code during purge
Bug# 59291 changes:
Main problem is that truncating the UNDO log at the completion of every
trx_purge() call is expensive as the number of rollback segments is increased.
We truncate after a configurable amount of pages. The innodb_purge_batch_size
parameter is used to control when InnoDB does the actual truncate. The truncate
is done once after 128 (or TRX_SYS_N_RSEGS iterations). In other words we
truncate after purge 128 * innodb_purge_batch_size. The smaller the batch
size the quicker we truncate.
Introduce a new parameter that allows how many rollback segments to use for
storing REDO information. This is really step 1 in allowing complete control
to the user over rollback space management.
New parameters:
i) innodb_rollback_segments = number of rollback_segments to use
(default is now 128) dynamic parameter, can be changed anytime.
Currently there is little benefit in changing it from the default.
Optimisations in the patch.
i. Change the O(n) behaviour of trx_rseg_get_on_id() to O(log n)
Backported from 5.6. Refactor some of the binary heap code.
Create a new include/ut0bh.ic file.
ii. Avoid truncating the rollback segments after every purge.
Related changes that were moved to a separate patch:
i. Purge should not do any flushing, only wait for space to be free so that
it only does purging of records unless it is held up by a long running
transaction that is preventing it from progressing.
ii. Give the purge thread preference over transactions when acquiring the
rseg->mutex during commit. This to avoid purge blocking unnecessarily
when getting the next rollback segment to purge.
Bug #11766501 changes:
Add the rseg to the min binary heap under the cover of the kernel mutex and
the binary heap mutex. This ensures the ordering of the min binary heap.
The two changes have to be committed together because they share the same
that fixes both issues.
rb://567 Approved by: Inaam Rana.
that implement add_index
The problem was that ALTER TABLE blocked reads on an InnoDB table
while adding a secondary index, even if this was not needed. It is
only needed for the final step where the .frm file is updated.
The reason queries were blocked, was that ALTER TABLE upgraded the
metadata lock from MDL_SHARED_NO_WRITE (which blocks writes) to
MDL_EXCLUSIVE (which blocks all accesses) before index creation.
The way the server handles index creation, is that storage engines
publish their capabilities to the server and the server determines
which of the following three ways this can be handled: 1) build a
new version of the table; 2) change the existing table but with
exclusive metadata lock; 3) change the existing table but without
metadata lock upgrade.
For InnoDB and secondary index creation, option 3) should have been
selected. However this failed for two reasons. First, InnoDB did
not publish this capability properly.
Second, the ALTER TABLE code failed to made proper use of the
information supplied by the storage engine. A variable
need_lock_for_indexes was set accordingly, but was not later used.
This patch fixes this problem by only doing metadata lock upgrade
before index creation/deletion if this variable has been set.
This patch also changes some of the related terminology used
in the code. Specifically the use of "fast" and "online" with
respect to ALTER TABLE. "Fast" was used to indicate that an
ALTER TABLE operation could be done without involving a
temporary table. "Fast" has been renamed "in-place" to more
accurately describe the behavior.
"Online" meant that the operation could be done without taking
a table lock. However, in the current implementation writes
are always prohibited during ALTER TABLE and an exclusive
metadata lock is held while updating the .frm, so ALTER TABLE
is not completely online. This patch replaces "online" with
"in-place", with additional comments indicating if concurrent
reads are allowed during index creation/deletion or not.
An important part of this update of terminology is renaming
of the handler flags used by handlers to indicate if index
creation/deletion can be done in-place and if concurrent reads
are allowed. For example, the HA_ONLINE_ADD_INDEX_NO_WRITES
flag has been renamed to HA_INPLACE_ADD_INDEX_NO_READ_WRITE,
while HA_ONLINE_ADD_INDEX is now HA_INPLACE_ADD_INDEX_NO_WRITE.
Note that this is a rename to clarify current behavior, the
flag values have not changed and no flags have been removed or
added.
Test case added to innodb_mysql_sync.test.
that implement add_index
The problem was that ALTER TABLE blocked reads on an InnoDB table
while adding a secondary index, even if this was not needed. It is
only needed for the final step where the .frm file is updated.
The reason queries were blocked, was that ALTER TABLE upgraded the
metadata lock from MDL_SHARED_NO_WRITE (which blocks writes) to
MDL_EXCLUSIVE (which blocks all accesses) before index creation.
The way the server handles index creation, is that storage engines
publish their capabilities to the server and the server determines
which of the following three ways this can be handled: 1) build a
new version of the table; 2) change the existing table but with
exclusive metadata lock; 3) change the existing table but without
metadata lock upgrade.
For InnoDB and secondary index creation, option 3) should have been
selected. However this failed for two reasons. First, InnoDB did
not publish this capability properly.
Second, the ALTER TABLE code failed to made proper use of the
information supplied by the storage engine. A variable
need_lock_for_indexes was set accordingly, but was not later used.
This patch fixes this problem by only doing metadata lock upgrade
before index creation/deletion if this variable has been set.
This patch also changes some of the related terminology used
in the code. Specifically the use of "fast" and "online" with
respect to ALTER TABLE. "Fast" was used to indicate that an
ALTER TABLE operation could be done without involving a
temporary table. "Fast" has been renamed "in-place" to more
accurately describe the behavior.
"Online" meant that the operation could be done without taking
a table lock. However, in the current implementation writes
are always prohibited during ALTER TABLE and an exclusive
metadata lock is held while updating the .frm, so ALTER TABLE
is not completely online. This patch replaces "online" with
"in-place", with additional comments indicating if concurrent
reads are allowed during index creation/deletion or not.
An important part of this update of terminology is renaming
of the handler flags used by handlers to indicate if index
creation/deletion can be done in-place and if concurrent reads
are allowed. For example, the HA_ONLINE_ADD_INDEX_NO_WRITES
flag has been renamed to HA_INPLACE_ADD_INDEX_NO_READ_WRITE,
while HA_ONLINE_ADD_INDEX is now HA_INPLACE_ADD_INDEX_NO_WRITE.
Note that this is a rename to clarify current behavior, the
flag values have not changed and no flags have been removed or
added.
Test case added to innodb_mysql_sync.test.
"rows examined" estimates". This change implements "innodb_stats_method"
with options of "nulls_equal", "nulls_unequal" and "null_ignored".
rb://553 approved by Marko
"rows examined" estimates". This change implements "innodb_stats_method"
with options of "nulls_equal", "nulls_unequal" and "null_ignored".
rb://553 approved by Marko
This adds 64 new rows to performance_schema.rwlock_instances.
This patch will make perfschema.binlog_mix perfschema.binlog_row tests fail,
but they will be fixed by http://lists.mysql.com/commits/127862
Approved by: Jimmy (rb://554)
This adds 64 new rows to performance_schema.rwlock_instances.
This patch will make perfschema.binlog_mix perfschema.binlog_row tests fail,
but they will be fixed by http://lists.mysql.com/commits/127862
Approved by: Jimmy (rb://554)
InnoDB does not attempt to handle lower_case_table_names == 2 when looking
up foreign table names and referenced table name. It turned that server
variable into a boolean and ignored the possibility of it being '2'.
The setting lower_case_table_names == 2 means that it should be stored and
displayed in mixed case as given, but compared internally in lower case.
Normally the server deals with this since it stores table names. But
InnoDB stores referential constraints for the server, so it needs to keep
track of both lower case and given names.
This solution creates two table name pointers for each foreign and referenced
table name. One to display the name, and one to look it up. Both pointers
point to the same allocated string unless this setting is 2. So the overhead
added is not too much.
Two functions are created in dict0mem.c to populate the ..._lookup versions
of these pointers. Both dict_mem_foreign_table_name_lookup_set() and
dict_mem_referenced_table_name_lookup_set() are called 5 times each.
InnoDB does not attempt to handle lower_case_table_names == 2 when looking
up foreign table names and referenced table name. It turned that server
variable into a boolean and ignored the possibility of it being '2'.
The setting lower_case_table_names == 2 means that it should be stored and
displayed in mixed case as given, but compared internally in lower case.
Normally the server deals with this since it stores table names. But
InnoDB stores referential constraints for the server, so it needs to keep
track of both lower case and given names.
This solution creates two table name pointers for each foreign and referenced
table name. One to display the name, and one to look it up. Both pointers
point to the same allocated string unless this setting is 2. So the overhead
added is not too much.
Two functions are created in dict0mem.c to populate the ..._lookup versions
of these pointers. Both dict_mem_foreign_table_name_lookup_set() and
dict_mem_referenced_table_name_lookup_set() are called 5 times each.
Code cleanup after changes for Bug 56628. The general approach for
InnoDB is to make a reference to each enum value whenever it is used in a
switch statement. In addition, no default case should be used for switch
statements on enum types. This assures that if there is ever any change
in the enum values, the switch will need to change to reflect it since a
compiler warning will occur. In this case, the enum row_type is declared
in handler.h and could be changed for another storage engine. If so, a
warning will occur in the InnoDB build.
Other changes;
* This patch uses 2 macros to help consolidate warning messages that
need to occur twice in the single switch for row_format.
* Using row_format as the variable name to distinguish it from the enum
type.
* Function declaration format correction.
Code cleanup after changes for Bug 56628. The general approach for
InnoDB is to make a reference to each enum value whenever it is used in a
switch statement. In addition, no default case should be used for switch
statements on enum types. This assures that if there is ever any change
in the enum values, the switch will need to change to reflect it since a
compiler warning will occur. In this case, the enum row_type is declared
in handler.h and could be changed for another storage engine. If so, a
warning will occur in the InnoDB build.
Other changes;
* This patch uses 2 macros to help consolidate warning messages that
need to occur twice in the single switch for row_format.
* Using row_format as the variable name to distinguish it from the enum
type.
* Function declaration format correction.
Silence a warning about old table name when InnoDB tests whether the
format has changed using a nonexistent table name.
Reviewed by: bar@mysql.com, marko.makela@oracle.com
Silence a warning about old table name when InnoDB tests whether the
format has changed using a nonexistent table name.
Reviewed by: bar@mysql.com, marko.makela@oracle.com
removed and replaced by the comprehensive innodb-create-options.test.
It uses the rules listed in the comments at the top of that test.
This patch introduces these differences from previous behavior;
1) KEY_BLOCK_SIZE=0 is allowed by Innodb in both strict and non-strict mode
with no errors or warnings. It was previously used by the server to set
KEY_BLOCK_SIZE to undefined. (Bug#56628)
2) An explicit valid non-DEFAULT ROW_FORMAT always takes priority over a
valid KEY_BLOCK_SIZE. (bug#56632)
3) Automatic use of COMPRESSED row format is only done if the ROW_FORMAT
is DEFAULT or unspecified.
4) ROW_FORMAT=FIXED is prevented in strict mode.
This patch also includes various formatting changes for consistency with
InnoDB coding standards.
Related Bugs
Bug#54679: ALTER TABLE causes compressed row_format to revert to compact
Bug#56628: ALTER TABLE .. KEY_BLOCK_SIZE=0 produces untrue warning or unnecessary error
Bug#56632: ALTER TABLE implicitly changes ROW_FORMAT to COMPRESSED
removed and replaced by the comprehensive innodb-create-options.test.
It uses the rules listed in the comments at the top of that test.
This patch introduces these differences from previous behavior;
1) KEY_BLOCK_SIZE=0 is allowed by Innodb in both strict and non-strict mode
with no errors or warnings. It was previously used by the server to set
KEY_BLOCK_SIZE to undefined. (Bug#56628)
2) An explicit valid non-DEFAULT ROW_FORMAT always takes priority over a
valid KEY_BLOCK_SIZE. (bug#56632)
3) Automatic use of COMPRESSED row format is only done if the ROW_FORMAT
is DEFAULT or unspecified.
4) ROW_FORMAT=FIXED is prevented in strict mode.
This patch also includes various formatting changes for consistency with
InnoDB coding standards.
Related Bugs
Bug#54679: ALTER TABLE causes compressed row_format to revert to compact
Bug#56628: ALTER TABLE .. KEY_BLOCK_SIZE=0 produces untrue warning or unnecessary error
Bug#56632: ALTER TABLE implicitly changes ROW_FORMAT to COMPRESSED
CMakeLists.txt: Remove the checks for mysql_storage_engine.cmake
and MYSQL_VERSION_ID.
ha_innodb.cc, ha_innodb.h: Remove the checks for MYSQL_VERSION_ID.
CMakeLists.txt: Remove the checks for mysql_storage_engine.cmake
and MYSQL_VERSION_ID.
ha_innodb.cc, ha_innodb.h: Remove the checks for MYSQL_VERSION_ID.
Remove trx_t::active_trans. Split into two separate fields with distinct
responsibilities. trx_t::is_registered and trx_t::owns_prepare_mutex.
There are wrapper functions for using this fields in ha_innodb.cc. The
wrapper functions check for invariants.
Fix some formatting to conform to InnoDB guidelines.
Remove innobase_register_stmt() and innobase_register_trx_and_stmt().
Add:
trx_is_started()
trx_deregister_from_2pc()
trx_register_for_2pc()
trx_is_registered_for_2pc()
trx_owns_prepare_commit_mutex_set()
trx_has_prepare_commit_mutex()
rb://479, Approved by Jimmy Yang.
Remove trx_t::active_trans. Split into two separate fields with distinct
responsibilities. trx_t::is_registered and trx_t::owns_prepare_mutex.
There are wrapper functions for using this fields in ha_innodb.cc. The
wrapper functions check for invariants.
Fix some formatting to conform to InnoDB guidelines.
Remove innobase_register_stmt() and innobase_register_trx_and_stmt().
Add:
trx_is_started()
trx_deregister_from_2pc()
trx_register_for_2pc()
trx_is_registered_for_2pc()
trx_owns_prepare_commit_mutex_set()
trx_has_prepare_commit_mutex()
rb://479, Approved by Jimmy Yang.
These files were needed when InnoDB Plugin was maintained and distributed
separately from the MySQL 5.1 source tree. They have never been needed in
MySQL 5.5.
storage/innobase/mysql-test:
Patches to the test suite.
storage/innobase/handler/mysql_addons.cc:
Wrappers for private MySQL functions.
These files were needed when InnoDB Plugin was maintained and distributed
separately from the MySQL 5.1 source tree. They have never been needed in
MySQL 5.5.
storage/innobase/mysql-test:
Patches to the test suite.
storage/innobase/handler/mysql_addons.cc:
Wrappers for private MySQL functions.
Additional fixes in 5.5:
ibuf_set_del_mark(): Add diagnostics when setting a buffered delete-mark fails.
ibuf_delete(): Correct a misleading comment about non-found records.
rec_print(): Add a const qualifier to the index parameter.
Bug #56680 wrong InnoDB results from a case-insensitive covering index
row_search_for_mysql(): When a secondary index record might not be
visible in the current transaction's read view and we consult the
clustered index and optionally some undo log records, return the
relevant columns of the clustered index record to MySQL instead of the
secondary index record.
ibuf_insert_to_index_page_low(): New function, refactored from
ibuf_insert_to_index_page().
ibuf_insert_to_index_page(): When we are inserting a record in place
of a delete-marked record and some fields of the record differ, update
that record just like row_ins_sec_index_entry_by_modify() would do.
btr_cur_update_alloc_zip(): Make the function public.
mysql_row_templ_t: Add clust_rec_field_no.
row_sel_store_mysql_rec(), row_sel_push_cache_row_for_mysql(): Add the
flag rec_clust, for returning data at clust_rec_field_no instead of
rec_field_no. Resurrect the debug assertion that the record not be
marked for deletion. (Bug #55626)
[UNIV_DEBUG || UNIV_IBUF_DEBUG] ibuf_debug, buf_page_get_gen(),
buf_flush_page_try():
Implement innodb_change_buffering_debug=1 for evicting pages from the
buffer pool, so that change buffering will be attempted more
frequently.
Additional fixes in 5.5:
ibuf_set_del_mark(): Add diagnostics when setting a buffered delete-mark fails.
ibuf_delete(): Correct a misleading comment about non-found records.
rec_print(): Add a const qualifier to the index parameter.
Bug #56680 wrong InnoDB results from a case-insensitive covering index
row_search_for_mysql(): When a secondary index record might not be
visible in the current transaction's read view and we consult the
clustered index and optionally some undo log records, return the
relevant columns of the clustered index record to MySQL instead of the
secondary index record.
ibuf_insert_to_index_page_low(): New function, refactored from
ibuf_insert_to_index_page().
ibuf_insert_to_index_page(): When we are inserting a record in place
of a delete-marked record and some fields of the record differ, update
that record just like row_ins_sec_index_entry_by_modify() would do.
btr_cur_update_alloc_zip(): Make the function public.
mysql_row_templ_t: Add clust_rec_field_no.
row_sel_store_mysql_rec(), row_sel_push_cache_row_for_mysql(): Add the
flag rec_clust, for returning data at clust_rec_field_no instead of
rec_field_no. Resurrect the debug assertion that the record not be
marked for deletion. (Bug #55626)
[UNIV_DEBUG || UNIV_IBUF_DEBUG] ibuf_debug, buf_page_get_gen(),
buf_flush_page_try():
Implement innodb_change_buffering_debug=1 for evicting pages from the
buffer pool, so that change buffering will be attempted more
frequently.
row_search_for_mysql(): When a secondary index record might not be
visible in the current transaction's read view and we consult the
clustered index and optionally some undo log records, return the
relevant columns of the clustered index record to MySQL instead of the
secondary index record.
REC_INFO_DELETED_FLAG: Move the definition from rem0rec.ic to rem0rec.h.
ibuf_insert_to_index_page_low(): New function, refactored from
ibuf_insert_to_index_page().
ibuf_insert_to_index_page(): When we are inserting a record in place
of a delete-marked record and some fields of the record differ, update
that record just like row_ins_sec_index_entry_by_modify() would do.
mysql_row_templ_t: Add clust_rec_field_no.
row_sel_store_mysql_rec(), row_sel_push_cache_row_for_mysql(): Add the
flag rec_clust, for returning data at clust_rec_field_no instead of
rec_field_no. Resurrect the debug assertion that the record not be
marked for deletion. (Bug #55626)
buf_LRU_free_block(): Refactored from
buf_LRU_search_and_free_block(). This is needed for the
innodb_change_buffering_debug diagnostics.
[UNIV_DEBUG || UNIV_IBUF_DEBUG] ibuf_debug, buf_page_get_gen(),
buf_flush_page_try():
Implement innodb_change_buffering_debug=1 for evicting pages from the
buffer pool, so that change buffering will be attempted more
frequently.
row_search_for_mysql(): When a secondary index record might not be
visible in the current transaction's read view and we consult the
clustered index and optionally some undo log records, return the
relevant columns of the clustered index record to MySQL instead of the
secondary index record.
REC_INFO_DELETED_FLAG: Move the definition from rem0rec.ic to rem0rec.h.
ibuf_insert_to_index_page_low(): New function, refactored from
ibuf_insert_to_index_page().
ibuf_insert_to_index_page(): When we are inserting a record in place
of a delete-marked record and some fields of the record differ, update
that record just like row_ins_sec_index_entry_by_modify() would do.
mysql_row_templ_t: Add clust_rec_field_no.
row_sel_store_mysql_rec(), row_sel_push_cache_row_for_mysql(): Add the
flag rec_clust, for returning data at clust_rec_field_no instead of
rec_field_no. Resurrect the debug assertion that the record not be
marked for deletion. (Bug #55626)
buf_LRU_free_block(): Refactored from
buf_LRU_search_and_free_block(). This is needed for the
innodb_change_buffering_debug diagnostics.
[UNIV_DEBUG || UNIV_IBUF_DEBUG] ibuf_debug, buf_page_get_gen(),
buf_flush_page_try():
Implement innodb_change_buffering_debug=1 for evicting pages from the
buffer pool, so that change buffering will be attempted more
frequently.
In order to fix this bug we need to distinguish whether ha_innobase::info()
has been called from ::analyze() or not. Rename ::info() to ::info_low()
and add a boolean parameter that tells whether the call is from ::analyze()
or not. Create a new simple ::info() that just calls
::info_low(false => not called from analyze). From ::analyze() instead of
::info() call ::info_low(true => called from analyze).
Approved by: Jimmy (rb://487)
In order to fix this bug we need to distinguish whether ha_innobase::info()
has been called from ::analyze() or not. Rename ::info() to ::info_low()
and add a boolean parameter that tells whether the call is from ::analyze()
or not. Create a new simple ::info() that just calls
::info_low(false => not called from analyze). From ::analyze() instead of
::info() call ::info_low(true => called from analyze).
Approved by: Jimmy (rb://487)
Just remove the check whether the file is "too big".
A similar code exists in ha_innobase::update_table_comment() but that
method does not seem to be used.
Just remove the check whether the file is "too big".
A similar code exists in ha_innobase::update_table_comment() but that
method does not seem to be used.
Bug#54678: InnoDB, TRUNCATE, ALTER, I_S SELECT, crash or deadlock
- Incompatible change: truncate no longer resorts to a row by
row delete if the storage engine does not support the truncate
method. Consequently, the count of affected rows does not, in
any case, reflect the actual number of rows.
- Incompatible change: it is no longer possible to truncate a
table that participates as a parent in a foreign key constraint,
unless it is a self-referencing constraint (both parent and child
are in the same table). To work around this incompatible change
and still be able to truncate such tables, disable foreign checks
with SET foreign_key_checks=0 before truncate. Alternatively, if
foreign key checks are necessary, please use a DELETE statement
without a WHERE condition.
Problem description:
The problem was that for storage engines that do not support
truncate table via a external drop and recreate, such as InnoDB
which implements truncate via a internal drop and recreate, the
delete_all_rows method could be invoked with a shared metadata
lock, causing problems if the engine needed exclusive access
to some internal metadata. This problem originated with the
fact that there is no truncate specific handler method, which
ended up leading to a abuse of the delete_all_rows method that
is primarily used for delete operations without a condition.
Solution:
The solution is to introduce a truncate handler method that is
invoked when the engine does not support truncation via a table
drop and recreate. This method is invoked under a exclusive
metadata lock, so that there is only a single instance of the
table when the method is invoked.
Also, the method is not invoked and a error is thrown if
the table is a parent in a non-self-referencing foreign key
relationship. This was necessary to avoid inconsistency as
some integrity checks are bypassed. This is inline with the
fact that truncate is primarily a DDL operation that was
designed to quickly remove all data from a table.
mysql-test/suite/innodb/t/innodb-truncate.test:
Add test cases for truncate and foreign key checks.
Also test that InnoDB resets auto-increment on truncate.
mysql-test/suite/innodb/t/innodb.test:
FK is not necessary, test is related to auto-increment.
Update error number, truncate is no longer invoked if
table is parent in a FK relationship.
mysql-test/suite/innodb/t/innodb_mysql.test:
Update error number, truncate is no longer invoked if
table is parent in a FK relationship.
Use delete instead of truncate, test is used to check
the interaction of FKs, triggers and delete.
mysql-test/suite/parts/inc/partition_check.inc:
Fix typo.
mysql-test/suite/sys_vars/t/foreign_key_checks_func.test:
Update error number, truncate is no longer invoked if
table is parent in a FK relationship.
mysql-test/t/mdl_sync.test:
Modify test case to reflect and ensure that truncate takes
a exclusive metadata lock.
mysql-test/t/trigger-trans.test:
Update error number, truncate is no longer invoked if
table is parent in a FK relationship.
sql/ha_partition.cc:
Reorganize the various truncate methods. delete_all_rows is now
passed directly to the underlying engines, so as truncate. The
code responsible for truncating individual partitions is moved
to ha_partition::truncate_partition, which is invoked when a
ALTER TABLE t1 TRUNCATE PARTITION p statement is executed.
Since the partition truncate no longer can be invoked via
delete, the bitmap operations are not necessary anymore. The
explicit reset of the auto-increment value is also removed
as the underlying engines are now responsible for reseting
the value.
sql/handler.cc:
Wire up the handler truncate method.
sql/handler.h:
Introduce and document the truncate handler method. It assumes
certain use cases of delete_all_rows.
Add method to retrieve the list of foreign keys referencing a
table. Method is used to avoid truncating tables that are
parent in a foreign key relationship.
sql/share/errmsg-utf8.txt:
Add error message for truncate and FK.
sql/sql_lex.h:
Introduce a flag so that the partition engine can detect when
a partition is being truncated. Used to give a special error.
sql/sql_parse.cc:
Function mysql_truncate_table no longer exists.
sql/sql_partition_admin.cc:
Implement the TRUNCATE PARTITION statement.
sql/sql_truncate.cc:
Change the truncate table implementation to use the new truncate
handler method and to not rely on row-by-row delete anymore.
The truncate handler method is always invoked with a exclusive
metadata lock. Also, it is no longer possible to truncate a
table that is parent in some non-self-referencing foreign key.
storage/archive/ha_archive.cc:
Rename method as the description indicates that in the future
this could be a truncate operation.
storage/blackhole/ha_blackhole.cc:
Implement truncate as no operation for the blackhole engine in
order to remain compatible with older releases.
storage/federated/ha_federated.cc:
Introduce truncate method that invokes delete_all_rows.
This is required to support partition truncate as this
form of truncate does not implement the drop and recreate
protocol.
storage/heap/ha_heap.cc:
Introduce truncate method that invokes delete_all_rows.
This is required to support partition truncate as this
form of truncate does not implement the drop and recreate
protocol.
storage/ibmdb2i/ha_ibmdb2i.cc:
Introduce truncate method that invokes delete_all_rows.
This is required to support partition truncate as this
form of truncate does not implement the drop and recreate
protocol.
storage/innobase/handler/ha_innodb.cc:
Rename delete_all_rows to truncate. InnoDB now does truncate
under a exclusive metadata lock.
Introduce and reorganize methods used to retrieve the list
of foreign keys referenced by a or referencing a table.
storage/myisammrg/ha_myisammrg.cc:
Introduce truncate method that invokes delete_all_rows.
This is required in order to remain compatible with earlier
releases where truncate would resort to a row-by-row delete.
Bug#54678: InnoDB, TRUNCATE, ALTER, I_S SELECT, crash or deadlock
- Incompatible change: truncate no longer resorts to a row by
row delete if the storage engine does not support the truncate
method. Consequently, the count of affected rows does not, in
any case, reflect the actual number of rows.
- Incompatible change: it is no longer possible to truncate a
table that participates as a parent in a foreign key constraint,
unless it is a self-referencing constraint (both parent and child
are in the same table). To work around this incompatible change
and still be able to truncate such tables, disable foreign checks
with SET foreign_key_checks=0 before truncate. Alternatively, if
foreign key checks are necessary, please use a DELETE statement
without a WHERE condition.
Problem description:
The problem was that for storage engines that do not support
truncate table via a external drop and recreate, such as InnoDB
which implements truncate via a internal drop and recreate, the
delete_all_rows method could be invoked with a shared metadata
lock, causing problems if the engine needed exclusive access
to some internal metadata. This problem originated with the
fact that there is no truncate specific handler method, which
ended up leading to a abuse of the delete_all_rows method that
is primarily used for delete operations without a condition.
Solution:
The solution is to introduce a truncate handler method that is
invoked when the engine does not support truncation via a table
drop and recreate. This method is invoked under a exclusive
metadata lock, so that there is only a single instance of the
table when the method is invoked.
Also, the method is not invoked and a error is thrown if
the table is a parent in a non-self-referencing foreign key
relationship. This was necessary to avoid inconsistency as
some integrity checks are bypassed. This is inline with the
fact that truncate is primarily a DDL operation that was
designed to quickly remove all data from a table.
parameters don't match
Revert the changes of the default values of innodb_file_per_table
and innobase_file_format in 5.5, until WL#5135 is implemented.
parameters don't match
Revert the changes of the default values of innodb_file_per_table
and innobase_file_format in 5.5, until WL#5135 is implemented.
Fix compiler warning:
handler/ha_innodb.cc: In function 'bool innodb_show_status(handlerton*, THD*, bool (*)(THD*, const char*, uint, const char*, uint, const char*, uint))':
handler/ha_innodb.cc:7539:7: error: variable 'result' set but not used [-Werror=unused-but-set-variable]
Fix compiler warning:
handler/ha_innodb.cc: In function 'bool innodb_show_status(handlerton*, THD*, bool (*)(THD*, const char*, uint, const char*, uint, const char*, uint))':
handler/ha_innodb.cc:7539:7: error: variable 'result' set but not used [-Werror=unused-but-set-variable]
Fix compiler warning:
handler/ha_innodb.cc: In function 'void innobase_drop_database(handlerton*, char*)':
handler/ha_innodb.cc:5969:6: error: variable 'error' set but not used [-Werror=unused-but-set-variable]
Fix compiler warning:
handler/ha_innodb.cc: In function 'void innobase_drop_database(handlerton*, char*)':
handler/ha_innodb.cc:5969:6: error: variable 'error' set but not used [-Werror=unused-but-set-variable]
------------------------------------------------------------
revno: 3550
revision-id: marko.makela@oracle.com-20100824081003-v4ecy0tga99cpxw2
parent: marko.makela@oracle.com-20100823102854-t1clrojqis2ley36
committer: Marko Mäkelä <marko.makela@oracle.com>
branch nick: 5.1-innodb
timestamp: Tue 2010-08-24 11:10:03 +0300
message:
Bug#55832: selects crash too easily when innodb_force_recovery>3
dict_update_statistics_low(): Create bogus statistics for those
indexes that cannot be accessed because of the innodb_force_recovery
setting.
ha_innobase::info(): Calculate statistics for each index, even if
innodb_force_recovery is set. Fill in bogus data for those indexes
that are not accessed because of the innodb_force_recovery setting.
------------------------------------------------------------
revno: 3550
revision-id: marko.makela@oracle.com-20100824081003-v4ecy0tga99cpxw2
parent: marko.makela@oracle.com-20100823102854-t1clrojqis2ley36
committer: Marko Mäkelä <marko.makela@oracle.com>
branch nick: 5.1-innodb
timestamp: Tue 2010-08-24 11:10:03 +0300
message:
Bug#55832: selects crash too easily when innodb_force_recovery>3
dict_update_statistics_low(): Create bogus statistics for those
indexes that cannot be accessed because of the innodb_force_recovery
setting.
ha_innobase::info(): Calculate statistics for each index, even if
innodb_force_recovery is set. Fill in bogus data for those indexes
that are not accessed because of the innodb_force_recovery setting.
dict_update_statistics_low(): Create bogus statistics for those
indexes that cannot be accessed because of the innodb_force_recovery
setting.
ha_innobase::info(): Calculate statistics for each index, even if
innodb_force_recovery is set. Fill in bogus data for those indexes
that are not accessed because of the innodb_force_recovery setting.
dict_update_statistics_low(): Create bogus statistics for those
indexes that cannot be accessed because of the innodb_force_recovery
setting.
ha_innobase::info(): Calculate statistics for each index, even if
innodb_force_recovery is set. Fill in bogus data for those indexes
that are not accessed because of the innodb_force_recovery setting.
This patch doesn't get rid of the need to acquire the dict_sys->mutex but
reduces the need to keep the mutex locked for the duration of the query
to fsp_get_available_space_in_free_extents() from ha_innobase::info().
rb://390.
This patch doesn't get rid of the need to acquire the dict_sys->mutex but
reduces the need to keep the mutex locked for the duration of the query
to fsp_get_available_space_in_free_extents() from ha_innobase::info().
rb://390.
Handle overflow when reading value from SELECT MAX(C) FROM T;
Call ha_innobase::info() after initializing the autoinc value
in ha_innobase::open().
Fix for both the builtin and plugin.
rb://402
Merge from mysql-5.1-security.
Handle overflow when reading value from SELECT MAX(C) FROM T;
Call ha_innobase::info() after initializing the autoinc value
in ha_innobase::open().
Fix for both the builtin and plugin.
rb://402
Merge from mysql-5.1-security.
Handle overflow when reading value from SELECT MAX(C) FROM T;
Call ha_innobase::info() after initializing the autoinc value
in ha_innobase::open().
Fix for both the builtin and plugin.
rb://402
Handle overflow when reading value from SELECT MAX(C) FROM T;
Call ha_innobase::info() after initializing the autoinc value
in ha_innobase::open().
Fix for both the builtin and plugin.
rb://402
Essentially, the problem is that safemalloc is excruciatingly
slow as it checks all allocated blocks for overrun at each
memory management primitive, yielding a almost exponential
slowdown for the memory management functions (malloc, realloc,
free). The overrun check basically consists of verifying some
bytes of a block for certain magic keys, which catches some
simple forms of overrun. Another minor problem is violation
of aliasing rules and that its own internal list of blocks
is prone to corruption.
Another issue with safemalloc is rather the maintenance cost
as the tool has a significant impact on the server code.
Given the magnitude of memory debuggers available nowadays,
especially those that are provided with the platform malloc
implementation, maintenance of a in-house and largely obsolete
memory debugger becomes a burden that is not worth the effort
due to its slowness and lack of support for detecting more
common forms of heap corruption.
Since there are third-party tools that can provide the same
functionality at a lower or comparable performance cost, the
solution is to simply remove safemalloc. Third-party tools
can provide the same functionality at a lower or comparable
performance cost.
The removal of safemalloc also allows a simplification of the
malloc wrappers, removing quite a bit of kludge: redefinition
of my_malloc, my_free and the removal of the unused second
argument of my_free. Since free() always check whether the
supplied pointer is null, redudant checks are also removed.
Also, this patch adds unit testing for my_malloc and moves
my_realloc implementation into the same file as the other
memory allocation primitives.
client/mysqldump.c:
Pass my_free directly as its signature is compatible with the
callback type -- which wasn't the case for free_table_ent.
Essentially, the problem is that safemalloc is excruciatingly
slow as it checks all allocated blocks for overrun at each
memory management primitive, yielding a almost exponential
slowdown for the memory management functions (malloc, realloc,
free). The overrun check basically consists of verifying some
bytes of a block for certain magic keys, which catches some
simple forms of overrun. Another minor problem is violation
of aliasing rules and that its own internal list of blocks
is prone to corruption.
Another issue with safemalloc is rather the maintenance cost
as the tool has a significant impact on the server code.
Given the magnitude of memory debuggers available nowadays,
especially those that are provided with the platform malloc
implementation, maintenance of a in-house and largely obsolete
memory debugger becomes a burden that is not worth the effort
due to its slowness and lack of support for detecting more
common forms of heap corruption.
Since there are third-party tools that can provide the same
functionality at a lower or comparable performance cost, the
solution is to simply remove safemalloc. Third-party tools
can provide the same functionality at a lower or comparable
performance cost.
The removal of safemalloc also allows a simplification of the
malloc wrappers, removing quite a bit of kludge: redefinition
of my_malloc, my_free and the removal of the unused second
argument of my_free. Since free() always check whether the
supplied pointer is null, redudant checks are also removed.
Also, this patch adds unit testing for my_malloc and moves
my_realloc implementation into the same file as the other
memory allocation primitives.
Merge up to sunny.bains@oracle.com-20100625081841-ppulnkjk1qlazh82 .
There are 8 more changesets in mysql-5.1-innodb, but PB2 shows a
failure for a test added in one of them. If that is resolved quickly
then those 8 more changesets will be merged too.
Merge up to sunny.bains@oracle.com-20100625081841-ppulnkjk1qlazh82 .
There are 8 more changesets in mysql-5.1-innodb, but PB2 shows a
failure for a test added in one of them. If that is resolved quickly
then those 8 more changesets will be merged too.