sql/sql_insert.cc:
CREATE ... IF NOT EXISTS may do nothing, but
it is still not a failure. Don't forget to my_ok it.
sql/sql_table.cc:
small cleanup
This fix was accidentally pushed to mysql-5.1 after the 5.1.59 clone-off in
bzr revision id marko.makela@oracle.com-20110829081642-z0w992a0mrc62s6w
with the fix of Bug#12704861 Corruption after a crash during BLOB update
but not merged to mysql-5.5 and upwards.
In the Barracuda formats, the clustered index record no longer
contains a prefix of off-page columns. Because of this, the undo log
must contain these prefixes, so that purge and multi-versioning will
continue to work. However, this also means that an undo log record can
become too big to fit in an undo log page. (It is a limitation of the
undo log that undo records cannot span across multiple pages.)
In case the checks for undo log size fail when CREATE TABLE or CREATE
INDEX is executed, we need a fallback that blocks a modification
operation when the undo log record would exceed the maximum size.
trx_undo_free_last_page_func(): Renamed from trx_undo_free_page_in_rollback().
Define the trx_t parameter only in debug builds.
trx_undo_free_last_page(): Wrapper for trx_undo_free_last_page_func().
Pass the trx_t parameter only in debug builds.
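For illustration, the debug-only-argument idiom used by these wrappers
looks roughly like this (a sketch only, not the actual trx0undo.h
declarations; the real argument lists differ):

    /* The trx_t argument is only needed for debug assertions, so the
       wrapper passes it to the _func() variant only under UNIV_DEBUG. */
    #ifdef UNIV_DEBUG
    # define trx_undo_free_last_page(trx, undo, mtr) \
            trx_undo_free_last_page_func(trx, undo, mtr)
    #else /* UNIV_DEBUG */
    # define trx_undo_free_last_page(trx, undo, mtr) \
            trx_undo_free_last_page_func(undo, mtr)
    #endif /* UNIV_DEBUG */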
trx_undo_truncate_end_func(): Renamed from trx_undo_truncate_end().
Define the trx_t parameter only in debug builds. Rewrite a for(;;) loop
as a while loop for clarity.
trx_undo_truncate_end(): Wrapper for trx_undo_truncate_end_func().
Pass the trx_t parameter only in debug builds.
trx_undo_erase_page_end(): Return TRUE if the page was non-empty
to begin with. Refuse to erase empty pages.
trx_undo_report_row_operation(): If the page for which the undo log
was too big was empty, free the undo page and return DB_TOO_BIG_RECORD.
rb:749 approved by Inaam Rana
Also addressed issues in bug #11745133, where we could mark a table as
corrupted instead of crashing the server when a corrupted buffer/page is
found, if the table was created with innodb_file_per_table on.
The patch replaces the use of the POSIX I/O interfaces in mysys on Windows with
the Win32 API calls (CreateFile, WriteFile, etc). The Windows HANDLE for the open
file is stored in the my_file_info struct, along with a flag for append mode
(because the Windows API does not support opening files in append mode in all cases).
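As an illustration of the append-mode emulation (a sketch under the
assumption that the handle could not be opened with native append
semantics; this is not the actual mysys code):

    #include <windows.h>

    /* Emulate O_APPEND: move the file pointer to end-of-file before
       every write on a handle that lacks append semantics. */
    static BOOL append_write(HANDLE h, const void *buf, DWORD count,
                             DWORD *written)
    {
      LARGE_INTEGER zero;
      zero.QuadPart= 0;
      if (!SetFilePointerEx(h, zero, NULL, FILE_END))
        return FALSE;                 /* could not seek to EOF */
      return WriteFile(h, buf, count, written, NULL);
    }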
The default max open files has been increased to 16384 and can be increased further
by setting --max-open-files=<value> during the server start.
A noteworthy benefit of this patch is that it removes limits on the table_cache size,
allowing for more simultaneous users.
SECONDARY INDEX IN INNODB
The patches for Bug#11751388 and Bug#11784056 enabled concurrent
reads while creating secondary indexes in InnoDB. However, they
introduced a regression. This regression occurred if ALTER TABLE
failed after the index had been added, for example during the
lock upgrade needed to update .FRM. If this happened, InnoDB
and the server got out of sync with regards to which indexes
actually existed. Therefore the patch for Bug#11815600 again
disabled concurrent reads.
This patch re-enables concurrent reads. The original regression
is fixed by splitting the ADD INDEX operation into two parts.
First the new index is created but not made active. This is
done while concurrent reads are allowed. The second part of
the operation makes the index active (or reverts the change).
This is done after lock upgrade, which prevents the original
regression.
In order to implement this change, the patch changes the storage
API for in-place index creation. handler::add_index() is split
into two functions, handler_add_index() and
handler::final_add_index(). The former for creating indexes without
making them visible and the latter for committing (i.e. making
visible) new indexes or reverting the changes.
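A rough sketch of the intended calling sequence (the exact signatures
are assumptions based on the description above, not the handler.h
prototypes):

    /* Phase 1: build the index while concurrent reads are still
       allowed. The engine keeps the new index invisible and hands back
       a context object describing the pending change. */
    handler_add_index *add= NULL;
    if (table->file->add_index(table, key_info, num_of_keys, &add))
      goto err;

    /* ... upgrade the metadata lock and update the .FRM ... */

    /* Phase 2: after the lock upgrade, commit (make visible) the new
       index; passing false instead would revert the change. */
    if (table->file->final_add_index(add, true))
      goto err;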
Large parts of this patch were written by Marko Mäkelä.
Test case added to innodb_mysql_lock.test.
With this change, the index prefix column length is lifted from 767 bytes
to 3072 bytes if "innodb_large_prefix" is set to "true".
rb://603 approved by Marko
A lot of small fixes and new test cases.
client/mysqlbinlog.cc:
Cast removed
client/mysqltest.cc:
Added missing DBUG_RETURN
include/my_pthread.h:
set_timespec_time_nsec() now only takes one argument
mysql-test/t/date_formats.test:
Remove --disable_ps_protocol as the ps-protocol now also supports microseconds
mysys/my_uuid.c:
Changed to use my_interval_timer() instead of my_getsystime()
mysys/waiting_threads.c:
Changed to use my_hrtime()
sql/field.h:
Added bool special_const_compare() for fields that may convert values before compare (like year)
sql/field_conv.cc:
Added test to get optimal copying of identical temporal values.
sql/item.cc:
Return that item_int is equal if it's positive, even if unsigned flag is different.
Fixed Item_cache_str::save_in_field() to have identical null check as other similar functions
Added proper NULL check to Item_cache_int::save_in_field()
sql/item_cmpfunc.cc:
Don't call convert_constant_item() if there is nothing that is worth converting.
Simplified test when years should be converted
sql/item_sum.cc:
Mark cache values in Item_sum_hybrid as not constants to ensure they are not replaced by other cache values in compare_datetime()
sql/item_timefunc.cc:
Changed sec_to_time() to take a my_decimal argument to ensure we don't lose any sub-seconds.
Added Item_temporal_func::get_time() (This simplifies some things)
sql/mysql_priv.h:
Added Lazy_string_decimal()
sql/mysqld.cc:
Added my_decimal constants max_seconds_for_time_type, time_second_part_factor
sql/table.cc:
Changed expr_arena to be of type CONVENTIONAL_EXECUTION to ensure that we don't lose any items that are created by fix_fields()
sql/tztime.cc:
TIME_to_gmt_sec() now sets *in_dst_time_gap in case of errors
This is needed to be able to detect if timestamp is 0
storage/maria/lockman.c:
Changed from my_getsystime() to set_timespec_time_nsec()
storage/maria/ma_loghandler.c:
Changed from my_getsystime() to my_hrtime()
storage/maria/ma_recovery.c:
Changed from my_getsystime() to microsecond_interval_timer()
storage/maria/unittest/trnman-t.c:
Changed from my_getsystime() to microsecond_interval_timer()
storage/xtradb/handler/ha_innodb.cc:
Added support for new time, datetime and timestamp
unittest/mysys/thr_template.c:
my_getsystime() -> my_interval_timer()
unittest/mysys/waiting_threads-t.c:
my_getsystime() -> my_interval_timer()
HA_INNOBASE::UPDATE_ROW, TEMPORARY TABLE, TABLE LOCK".
Attempt to update an InnoDB temporary table under LOCK TABLES
led to assertion failure in both debug and production builds
if this temporary table was explicitly locked for READ. The
same scenario works fine for MyISAM temporary tables.
The assertion failure was caused by a discrepancy between the lock
that was requested on the rows of the temporary table at LOCK TABLES
time and the lock requested by the update operation. Since the SQL
layer requested a read lock at LOCK TABLES time, the InnoDB engine
assumed that upcoming statements which are going to be executed under
LOCK TABLES will only read the table and therefore should acquire only
an S-lock. An update operation broke this assumption by requesting an
X-lock.
Possible approaches to fixing this problem are:
1) Skip locking of temporary tables as locking doesn't make any
sense for connection-local objects.
2) Prohibit changing of temporary table locked by LOCK TABLES ...
READ.
Unfortunately both of these approaches have drawbacks which make
them unviable for stable versions of server.
So this patch takes another approach and changes the code in such a way
that LOCK TABLES for a temporary table will always request a write
lock. In the 5.1 version of this patch the switch from read lock to
write lock is done inside InnoDB's handler methods, as doing it on the
SQL layer causes compatibility troubles with FLUSH TABLES WITH
READ LOCK.
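The idea can be sketched as follows (illustration only, not the actual
ha_innodb.cc change; the temporary-table predicate is made up):

    /* In the handler: if this temporary table is being locked by
       LOCK TABLES, request a write lock even when only a read lock was
       asked for, since the same connection may still update it. */
    if (lock_type == TL_READ
        && thd_in_lock_tables(thd)
        && table_is_temporary)          /* illustrative predicate */
      lock_type= TL_WRITE;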
mysql-test/suite/innodb/r/innodb_mysql.result:
Added test for bug #11762012 - "54553: INNODB ASSERTS IN
HA_INNOBASE::UPDATE_ROW, TEMPORARY TABLE, TABLE LOCK".
mysql-test/suite/innodb/t/innodb_mysql.test:
Added test for bug #11762012 - "54553: INNODB ASSERTS IN
HA_INNOBASE::UPDATE_ROW, TEMPORARY TABLE, TABLE LOCK".
mysql-test/suite/innodb_plugin/r/innodb_mysql.result:
Added test for bug #11762012 - "54553: INNODB ASSERTS IN
HA_INNOBASE::UPDATE_ROW, TEMPORARY TABLE, TABLE LOCK".
mysql-test/suite/innodb_plugin/t/innodb_mysql.test:
Added test for bug #11762012 - "54553: INNODB ASSERTS IN
HA_INNOBASE::UPDATE_ROW, TEMPORARY TABLE, TABLE LOCK".
storage/innobase/handler/ha_innodb.cc:
Assume that a temporary table locked by LOCK TABLES can be updated
even if it was only locked for read and therefore an X-lock should
always be requested for such tables.
storage/innodb_plugin/handler/ha_innodb.cc:
Assume that a temporary table locked by LOCK TABLES can be updated
even if it was only locked for read and therefore an X-lock should
always be requested for such tables.
- Added a lot of code comments
- Updated get_best_ror_intersec() to prefer index scans on non-clustered keys over clustered keys.
- Use HA_CLUSTERED_INDEX to define if one should use HA_MRR_INDEX_ONLY
- For test of using index or filesort to resolve ORDER BY, use HA_CLUSTERED_INDEX flag instead of primary_key_is_clustered()
- Use HA_TABLE_SCAN_ON_INDEX instead of primary_key_is_clustered() to decide if ALTER TABLE ... ORDER BY will have any effect.
sql/ha_partition.h:
Added comment with warning for code unsafe to use with multiple storage engines at the same time
sql/handler.h:
Added HA_CLUSTERED_INDEX.
Documented primary_key_is_clustered()
sql/opt_range.cc:
Added code comments
Updated get_best_ror_intersec() to ignore clustered keys.
Optimized away cpk_scan_used and one instance of current_thd (Simpler code)
Use HA_CLUSTERED_INDEX to define if one should use HA_MRR_INDEX_ONLY
sql/sql_select.cc:
Changed comment to #ifdef
For test of using index or filesort to resolve ORDER BY, use HA_CLUSTERED_INDEX flag instead of primary_key_is_clustered()
(The change is smaller than it looks because of an indentation change)
sql/sql_table.cc:
Use HA_TABLE_SCAN_ON_INDEX instead of primary_key_is_clustered() to decide if ALTER TABLE ... ORDER BY will have any effect.
storage/innobase/handler/ha_innodb.h:
Added support for HA_CLUSTERED_INDEX
storage/innodb_plugin/handler/ha_innodb.cc:
Added support for HA_CLUSTERED_INDEX
storage/xtradb/handler/ha_innodb.cc:
Added support for HA_CLUSTERED_INDEX
DB_COL_APPEARS_TWICE_IN_INDEX: Remove. This condition is already
checked and reported by MySQL before passing the index definition to
the storage engine.
row_create_index_for_mysql(): Remove the redundant check for
DB_COL_APPEARS_TWICE_IN_INDEX. When enforcing the column prefix index
limit, invoke dict_mem_index_free(index) to plug the memory leak. In
the loop, use index->n_def instead of dict_index_get_n_fields(index),
because the latter would be 0 for indexes that have not been copied to
the data dictionary cache.
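A simplified sketch of the fixed rejection path (the error code and the
surrounding details are assumptions):

    for (i = 0; i < index->n_def; i++) {
      /* Use n_def: the index has not been added to the dictionary
         cache yet, so dict_index_get_n_fields() would report 0. */
      if (dict_index_get_nth_field(index, i)->prefix_len
          > DICT_MAX_INDEX_COL_LEN) {
        err = DB_TOO_BIG_RECORD;      /* assumed error code */
        dict_mem_index_free(index);   /* plug the leak before bailing out */
        goto error_handling;
      }
    }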
innodb-use-sys-malloc.test:
Add test cases for attempting to trigger the error checks in
row_create_index_for_mysql(). Before MySQL 5.5 and WL#5743, the leak
is only reproducible if ha_innobase::max_supported_key_part_length()
returned a higher limit than the one used in
row_create_index_for_mysql().
In MySQL 5.5 and later, the leak is reproducible with
innodb_large_prefix=true.
rb:688 approved by Jimmy Yang
The InnoDB global variable srv_lower_case_table_names is set in
ha_innodb.cc to the value of lower_case_table_names declared in mysqld.h.
Since this variable can change at runtime, it is reset for each handler
call to ::create, ::open, ::rename_table & ::delete_table.
But it is possible for tables to be implicitly opened before an explicit
handler call is made when an engine is first started or restarted. I was
able to reproduce that with the test case in this patch on a version of
InnoDB from 2 weeks ago. It seemed like the change buffer entries for the
secondary key were getting put into pages after the restart. (But I am
not sure, I did not write down the call stack while it was reproducing.)
In the current code, the implicit open, which is actually a call to
dict_load_foreigns(), does not occur with this test case.
The change is to replace srv_lower_case_table_names by an interface
function in ha_innodb.cc that retrieves the server global variable when
it is needed.
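A minimal sketch of such an interface function (the actual name and
signature in ha_innodb.cc may differ):

    /* Read the server global each time it is needed instead of caching
       it in an InnoDB global that can go stale across implicit opens. */
    extern "C" ulint
    innobase_get_lower_case_table_names(void)
    {
      return((ulint) lower_case_table_names);
    }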
In ha_innobase::create(), we check some things while holding an
exclusive lock on the data dictionary. Defer the locking and the
creation of transactions until after the checks have passed. The
THDVAR could hang due to a mutex wait (see Bug #11750569 - 41163:
deadlock in mysqld: LOCK_global_system_variables and LOCK_open), and
we want to avoid waiting while holding InnoDB mutexes.
innobase_index_name_is_reserved(): Replace the parameter trx_t with
THD, so that the test can be performed before starting an InnoDB
transaction. We only needed trx->mysql_thd.
ha_innobase::create(): Create transaction and lock the data dictionary
only after passing the basic tests.
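Sketched order of operations after the change (heavily simplified; not
the actual function body):

    /* 1. Cheap sanity checks first: no InnoDB mutexes, no transaction. */
    if (innobase_index_name_is_reserved(thd, form->key_info, form->s->keys))
      DBUG_RETURN(-1);

    /* 2. Only after the checks pass: start the transaction and take the
       exclusive data dictionary lock. */
    trx= innobase_trx_allocate(thd);
    row_mysql_lock_data_dictionary(trx);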
create_table_def(): Move the IS_MAGIC_TABLE_AND_USER_DENIED_ACCESS
check to ha_innobase::create(). Assign to srv_lower_case_table_names
while holding dict_sys->mutex.
ha_innobase::delete_table(), ha_innobase::rename_table(),
innobase_rename_table(): Assign srv_lower_case_table_names as late as
possible. Here, the variable is not necessarily protected by
dict_sys->mutex.
ha_innobase::add_index(): Invoke innobase_index_name_is_reserved() and
innobase_check_index_keys() before allocating anything.
rb:618 approved by Jimmy Yang
causes future shutdown hang
InnoDB would hang on shutdown if any XA transactions exist in the
system in the PREPARED state. This has been masked by the fact that
MySQL would roll back any PREPARED transaction on shutdown, in the
spirit of Bug #12161 Xa recovery and client disconnection.
[mysql-test-run] do_shutdown_server: Interpret --shutdown_server 0 as
a request to kill the server immediately without initiating a
shutdown procedure.
xid_cache_insert(): Initialize XID_STATE::rm_error in order to avoid a
bogus error message on XA ROLLBACK of a recovered PREPARED transaction.
innobase_commit_by_xid(), innobase_rollback_by_xid(): Free the InnoDB
transaction object after rolling back a PREPARED transaction.
trx_get_trx_by_xid(): Only consider transactions whose
trx->is_prepared flag is set. The MySQL layer seems to prevent
attempts to roll back connected transactions that are in the PREPARED
state from another connection, but it is better to play it safe. The
is_prepared flag was introduced in the InnoDB Plugin.
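A simplified sketch of the guard in trx_get_trx_by_xid() (the XID
comparison helper is illustrative, not the real code):

    /* Caller holds the kernel mutex while scanning the transaction list. */
    for (trx = UT_LIST_GET_FIRST(trx_sys->trx_list);
         trx != NULL;
         trx = UT_LIST_GET_NEXT(trx_list, trx)) {
      /* Only transactions in the PREPARED state may be committed or
         rolled back by XID; skip everything else even on an XID match. */
      if (trx->is_prepared && xid_equal(xid, &trx->xid)) { /* illustrative */
        break;
      }
    }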
trx_n_prepared: A new counter, counting the number of InnoDB
transactions in the PREPARED state.
logs_empty_and_mark_files_at_shutdown(): On shutdown, allow
trx_n_prepared transactions to exist in the system.
trx_undo_free_prepared(), trx_free_prepared(): New functions, to free
the memory objects of PREPARED transactions on shutdown. This is not
needed in the built-in InnoDB, because it would collect all allocated
memory on shutdown. The InnoDB Plugin needs this because of
innodb_use_sys_malloc.
trx_sys_close(): Invoke trx_free_prepared() on all remaining
transactions.
ibuf_inside(), ibuf_enter(), ibuf_exit(): Add the parameter mtr. The
flag is no longer kept in the thread-local storage but in the
mini-transaction (mtr->inside_ibuf).
mtr_start(): Clean up the comment and remove the unused return value.
mtr_commit(): Assert !ibuf_inside(mtr) in debug builds.
ibuf_mtr_start(): Like mtr_start(), but sets the flag.
ibuf_mtr_commit(), ibuf_btr_pcur_commit_specify_mtr(): Wrappers that
assert ibuf_inside().
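What such wrappers look like, roughly (a sketch; the real definitions
in ibuf0ibuf.ic differ in detail):

    UNIV_INLINE void ibuf_mtr_start(mtr_t* mtr)
    {
      mtr_start(mtr);
      mtr->inside_ibuf = TRUE;   /* flag lives in the mtr, not in TLS */
    }

    UNIV_INLINE void ibuf_mtr_commit(mtr_t* mtr)
    {
      ut_ad(ibuf_inside(mtr));   /* must have been started via ibuf_mtr_start */
      mtr->inside_ibuf = FALSE;  /* clear before the !ibuf_inside() check
                                    asserted by mtr_commit() */
      mtr_commit(mtr);
    }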
buf_page_get_zip(), buf_page_init_for_read(),
buf_read_ibuf_merge_pages(), fil_io(), ibuf_free_excess_pages(),
ibuf_contract_ext(): Remove assertions on ibuf_inside(), because a
mini-transaction is not available.
buf_read_ahead_linear(): Add the parameter inside_ibuf.
ibuf_restore_pos(): When this function returns FALSE, it commits mtr
and must therefore do ibuf_exit(mtr).
ibuf_delete_rec(): This function commits mtr and must therefore do
ibuf_exit(mtr).
ibuf_rec_get_page_no(), ibuf_rec_get_space(), ibuf_rec_get_info(),
ibuf_rec_get_op_type(), ibuf_build_entry_from_ibuf_rec(),
ibuf_rec_get_volume(), ibuf_get_merge_page_nos(),
ibuf_get_volume_buffered_count(), ibuf_get_entry_counter_low(): Add
the parameter mtr in debug builds, for asserting ibuf_inside(mtr).
rb:585 approved by Sunny Bains
KEY NO 0 FOR TABLE IN ERROR LOG
With the changes made by the patches for Bug#11751388 and
Bug#11784056, concurrent reads are allowed while secondary
indexes are created in InnoDB. This means that the metadata
lock on the affected table is not upgraded to exclusive
until the .FRM is updated at the end of ALTER TABLE processing.
The problem was that if this lock upgrade failed for some
reason (e.g. timeout), the index information in the server
and inside InnoDB would be out of sync. This would happen
since the add index operation already was committed inside
InnoDB but the table metadata inside the server had not been
updated yet.
This patch fixes the problem by (for now) reverting the
effects of the patches for Bug#11751388 and Bug#11784056.
Concurrent reads will now again be blocked during creation
of secondary indexes in InnoDB.
Test case added to innodb_mysql_lock.test.
NON-PRIMARY UNIQUE INDEX USING INNODB
This patch adds the HA_INPLACE_ADD_UNIQUE_INDEX_NO_WRITE
capability flag to InnoDB, indicating that concurrent reads
can be allowed while non-primary unique indexes are created.
This is a follow-up to Bug #11751388 which enabled concurrent
reads when creating non-primary non-unique indexes.
Test case added to innodb_mysql_sync.test.
Bug #11766501: Multiple RBS break the get rseg with mininum trx_t::no code during purge
Bug# 59291 changes:
The main problem is that truncating the UNDO log at the completion of every
trx_purge() call is expensive as the number of rollback segments is increased.
We now truncate after a configurable number of pages. The innodb_purge_batch_size
parameter is used to control when InnoDB does the actual truncate. The truncate
is done once every 128 (TRX_SYS_N_RSEGS) iterations, in other words after
purging 128 * innodb_purge_batch_size pages (for example, with the default
innodb_purge_batch_size of 20, roughly once per 2560 pages). The smaller the
batch size, the sooner we truncate.
Introduce a new parameter that controls how many rollback segments to use for
storing UNDO information. This is really step 1 in giving the user complete
control over rollback space management.
New parameters:
i) innodb_rollback_segments = number of rollback segments to use
(default is now 128). Dynamic parameter, can be changed at any time.
Currently there is little benefit in changing it from the default.
Optimisations in the patch.
i. Change the O(n) behaviour of trx_rseg_get_on_id() to O(log n)
Backported from 5.6. Refactor some of the binary heap code.
Create a new include/ut0bh.ic file.
ii. Avoid truncating the rollback segments after every purge.
Related changes that were moved to a separate patch:
i. Purge should not do any flushing; it should only wait for space to be
free, so that it only purges records unless it is held up by a long-running
transaction that prevents it from progressing.
ii. Give the purge thread preference over transactions when acquiring the
rseg->mutex during commit. This is to avoid purge blocking unnecessarily
when getting the next rollback segment to purge.
Bug #11766501 changes:
Add the rseg to the min binary heap under the cover of the kernel mutex and
the binary heap mutex. This ensures the ordering of the min binary heap.
The two changes have to be committed together because they share the same
code that fixes both issues.
rb://567 Approved by: Inaam Rana.
that implement add_index
The problem was that ALTER TABLE blocked reads on an InnoDB table
while adding a secondary index, even if this was not needed. It is
only needed for the final step where the .frm file is updated.
The reason queries were blocked, was that ALTER TABLE upgraded the
metadata lock from MDL_SHARED_NO_WRITE (which blocks writes) to
MDL_EXCLUSIVE (which blocks all accesses) before index creation.
The way the server handles index creation is that storage engines
publish their capabilities to the server and the server determines
which of the following three ways this can be handled: 1) build a
new version of the table; 2) change the existing table but with
exclusive metadata lock; 3) change the existing table but without
metadata lock upgrade.
For InnoDB and secondary index creation, option 3) should have been
selected. However this failed for two reasons. First, InnoDB did
not publish this capability properly.
Second, the ALTER TABLE code failed to make proper use of the
information supplied by the storage engine. A variable
need_lock_for_indexes was set accordingly, but was not later used.
This patch fixes this problem by only doing metadata lock upgrade
before index creation/deletion if this variable has been set.
This patch also changes some of the related terminology used
in the code. Specifically the use of "fast" and "online" with
respect to ALTER TABLE. "Fast" was used to indicate that an
ALTER TABLE operation could be done without involving a
temporary table. "Fast" has been renamed "in-place" to more
accurately describe the behavior.
"Online" meant that the operation could be done without taking
a table lock. However, in the current implementation writes
are always prohibited during ALTER TABLE and an exclusive
metadata lock is held while updating the .frm, so ALTER TABLE
is not completely online. This patch replaces "online" with
"in-place", with additional comments indicating if concurrent
reads are allowed during index creation/deletion or not.
An important part of this update of terminology is renaming
of the handler flags used by handlers to indicate if index
creation/deletion can be done in-place and if concurrent reads
are allowed. For example, the HA_ONLINE_ADD_INDEX_NO_WRITES
flag has been renamed to HA_INPLACE_ADD_INDEX_NO_READ_WRITE,
while HA_ONLINE_ADD_INDEX is now HA_INPLACE_ADD_INDEX_NO_WRITE.
Note that this is a rename to clarify current behavior, the
flag values have not changed and no flags have been removed or
added.
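A simplified sketch of how the server can use these flags together with
need_lock_for_indexes (illustrative only; the real logic in sql_table.cc
is considerably more involved):

    bool need_lock_for_indexes= true;

    /* If the engine can add the index in-place while still serving
       reads (HA_INPLACE_ADD_INDEX_NO_WRITE), defer the upgrade to
       MDL_EXCLUSIVE until the final .FRM update. */
    if (table->file->alter_table_flags(alter_flags)  /* requested ALTER flags */
        & HA_INPLACE_ADD_INDEX_NO_WRITE)
      need_lock_for_indexes= false;

    if (need_lock_for_indexes
        && wait_while_table_is_used(thd, table, HA_EXTRA_FORCE_REOPEN))
      goto err;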
Test case added to innodb_mysql_sync.test.