In ha_innobase::create(), we check some things while holding an
exclusive lock on the data dictionary. Defer the locking and the
creation of transactions until after the checks have passed. The
THDVAR could hang due to a mutex wait (see Bug #11750569 - 41163:
deadlock in mysqld: LOCK_global_system_variables and LOCK_open), and
we want to avoid waiting while holding InnoDB mutexes.
innobase_index_name_is_reserved(): Replace the parameter trx_t with
THD, so that the test can be performed before starting an InnoDB
transaction. We only needed trx->mysql_thd.
ha_innobase::create(): Create transaction and lock the data dictionary
only after passing the basic tests.
create_table_def(): Move the IS_MAGIC_TABLE_AND_USER_DENIED_ACCESS
check to ha_innobase::create(). Assign to srv_lower_case_table_names
while holding dict_sys->mutex.
ha_innobase::delete_table(), ha_innobase::rename_table(),
innobase_rename_table(): Assign srv_lower_case_table_names as late as
possible. Here, the variable is not necessarily protected by
dict_sys->mutex.
ha_innobase::add_index(): Invoke innobase_index_name_is_reserved() and
innobase_check_index_keys() before allocating anything.
rb:618 approved by Jimmy Yang
causes future shutdown hang
InnoDB would hang on shutdown if any XA transactions exist in the
system in the PREPARED state. This has been masked by the fact that
MySQL would roll back any PREPARED transaction on shutdown, in the
spirit of Bug #12161 Xa recovery and client disconnection.
[mysql-test-run] do_shutdown_server: Interpret --shutdown_server 0 as
a request to kill the server immediately without initiating a
shutdown procedure.
xid_cache_insert(): Initialize XID_STATE::rm_error in order to avoid a
bogus error message on XA ROLLBACK of a recovered PREPARED transaction.
innobase_commit_by_xid(), innobase_rollback_by_xid(): Free the InnoDB
transaction object after rolling back a PREPARED transaction.
trx_get_trx_by_xid(): Only consider transactions whose
trx->is_prepared flag is set. The MySQL layer seems to prevent
attempts to roll back connected transactions that are in the PREPARED
state from another connection, but it is better to play it safe. The
is_prepared flag was introduced in the InnoDB Plugin.
trx_n_prepared: A new counter, counting the number of InnoDB
transactions in the PREPARED state.
logs_empty_and_mark_files_at_shutdown(): On shutdown, allow
trx_n_prepared transactions to exist in the system.
trx_undo_free_prepared(), trx_free_prepared(): New functions, to free
the memory objects of PREPARED transactions on shutdown. This is not
needed in the built-in InnoDB, because it would collect all allocated
memory on shutdown. The InnoDB Plugin needs this because of
innodb_use_sys_malloc.
trx_sys_close(): Invoke trx_free_prepared() on all remaining
transactions.
On shutdown, do not exit threads in os_event_wait(). This method of
exiting was only used by the I/O handler threads. Exit them on a
higher level.
os_event_wait_low(), os_event_wait_time_low(): Do not exit on shutdown.
os_thread_exit(), ut_dbg_assertion_failed(), ut_print_timestamp(): Add
attribute cold, so that GCC knows that these functions are rarely
invoked and can be optimized for size.
os_aio_linux_collect(): Return on shutdown.
os_aio_linux_handle(), os_aio_simulated_handle(), os_aio_windows_handle():
Set *message1 = *message2 = NULL and return TRUE on shutdown.
fil_aio_wait(): Return on shutdown.
logs_empty_and_mark_files_at_shutdown(): Even in very fast shutdown
(innodb_fast_shutdown=2), allow the background threads to exit, but
skip the flushing and log checkpointing.
innobase_shutdown_for_mysql(): Always wait for all the threads to exit.
rb:633 approved by Sunny Bains
Remove most references to thread id in InnoDB. Three references
remain: the current holder of a mutex, and the current x-lock holder
of a rw-lock, and some references in UNIV_SYNC_DEBUG checks. This
allows MySQL to change the thread associated to a client connection.
Tighten the UNIV_SYNC_DEBUG checks, trying to ensure that no InnoDB
mutex or x-lock is being held when returning control to MySQL. The
only semaphore that may be held is the btr_search_latch in shared mode.
sync_thread_levels_empty_except_dict(): A wrapper for
sync_thread_levels_empty_gen(TRUE).
sync_thread_levels_nonempty_trx(): Check that the current thread is
not holding any InnoDB semaphores, except btr_search_latch if
trx->has_search_latch.
sync_thread_levels_empty(): Unused function; remove.
trx_t: Remove mysql_thread_id and mysql_process_no.
srv_slot_t: Remove id and handle.
row_search_for_mysql(), srv_conc_enter_innodb(),
srv_conc_force_enter_innodb(), srv_conc_force_exit_innodb(),
srv_conc_exit_innodb(), srv_suspend_mysql_thread: Assert
!sync_thread_levels_nonempty_trx().
rb:634 approved by Sunny Bains
sync_array_print_long_waits(): Return the longest waiting thread ID
and the longest waited-for lock. Only if those remain unchanged
between calls in srv_error_monitor_thread(), increment
fatal_cnt. Otherwise, reset fatal_cnt.
Background: There is a built-in watchdog in InnoDB whose purpose is to
kill the server when some thread is stuck waiting for a mutex or
rw-lock. Before this fix, the logic was flawed.
The function sync_array_print_long_waits() returns TRUE if it finds a
lock wait that exceeds 10 minutes (srv_fatal_semaphore_wait_threshold).
The function srv_error_monitor_thread() will kill the server if this
happens 10 times in a row (fatal_cnt reaches 10), checked every 30
seconds. This is wrong, because this situation does not mean that the
server is hung. If the server is very busy for a little over 15
minutes, it will be killed.
Consider this example. Thread T1 is waiting for mutex M. Some time
later, threads T2..Tn start waiting for the same mutex M. If T1 keeps
waiting for 600 seconds, fatal_cnt will be incremented to 1. So far,
so good. Now, if M is granted to T1, the server was obviously not
stuck. But, T2..Tn keeps waiting, and their wait time will be longer
than 600 seconds. If 5 minutes later, some Tn has still been waiting
for more than 10 minutes for the mutex M, the server can be killed,
even though it is not stuck.
rb:622 approved by Jimmy Yang
The LGPL license is used in some legacy code, and to
adhere to current licensing polity, we remove those
files that are no longer used, and reorganize the
remaining LGPL code so it will be GPL licensed from
now on.
Note: This patch only removed LGPL licensed files
in MySQL 5.5 and later, and is the third of a
set of patches to remove LGPL from all trees.
(See Bug# 11840513 for details)
The LGPL license is used in some legacy code, and to
adhere to current licensing polity, we remove those
files that are no longer used, and reorganize the
remaining LGPL code so it will be GPL licensed from
now on.
Note: This patch only removed LGPL licensed files
in MySQL 5.1, and is the second of a set of
patches to remove LGPL from all trees.
(See Bug# 11840513 for details)
This problem was introduced in
marko.makela@oracle.com-20100514130815-ym7j7cfu88ro6km4
and is probably the reason for the following valgrind warning:
from http://bugs.mysql.com/52691 , http://bugs.mysql.com/file.php?id=16880 :
Version: '5.6.3-m5-valgrind-max-debug' socket: '/tmp/mysql.sock' port: 3306 Source distribution
==14947== Thread 18:
==14947== Conditional jump or move depends on uninitialised value(s)
==14947== at 0x4A06318: __GI_strlen (mc_replace_strmem.c:284)
==14947== by 0x9F3D7A: fill_innodb_trx_from_cache(trx_i_s_cache_struct*, THD*, TABLE*) (i_s.cc:591)
==14947== by 0x9F4D7D: trx_i_s_common_fill_table(THD*, TABLE_LIST*, Item*) (i_s.cc:1238)
==14947== by 0x7689F3: get_schema_tables_result(JOIN*, enum_schema_table_state) (sql_show.cc:6745)
==14947== by 0x715A75: JOIN::exec() (sql_select.cc:2861)
==14947== by 0x7185BD: mysql_select(THD*, Item***, TABLE_LIST*, unsigned int, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*) (sql_select.cc:3609)
==14947== by 0x70E823: handle_select(THD*, LEX*, select_result*, unsigned long) (sql_select.cc:319)
==14947== by 0x6F2305: execute_sqlcom_select(THD*, TABLE_LIST*) (sql_parse.cc:4557)
==14947== by 0x6EAED4: mysql_execute_command(THD*) (sql_parse.cc:2135)
==14947== by 0x6F44C9: mysql_parse(THD*, char*, unsigned int, Parser_state*) (sql_parse.cc:5597)
==14947== by 0x6E864B: dispatch_command(enum_server_command, THD*, char*, unsigned int) (sql_parse.cc:1093)
==14947== by 0x6E785E: do_command(THD*) (sql_parse.cc:815)
==14947== by 0x6C18DD: do_handle_one_connection(THD*) (sql_connect.cc:771)
==14947== by 0x6C146E: handle_one_connection (sql_connect.cc:707)
==14947== by 0x30E1807760: start_thread (pthread_create.c:301)
==14947== by 0x35EA670F: ???
==14947== Uninitialised value was created by a heap allocation
==14947== at 0x4A0515D: malloc (vg_replace_malloc.c:195)
==14947== by 0xB4B948: mem_area_alloc (mem0pool.c:385)
==14947== by 0xB4A27C: mem_heap_create_block (mem0mem.c:333)
==14947== by 0xB4A530: mem_heap_add_block (mem0mem.c:446)
==14947== by 0xB0D2A4: mem_heap_alloc (mem0mem.ic:186)
==14947== by 0xB0D9C2: ha_storage_put_memlim (ha0storage.c:118)
==14947== by 0xA479D8: fill_trx_row (trx0i_s.c:521)
==14947== by 0xA490E9: fetch_data_into_cache (trx0i_s.c:1319)
==14947== by 0xA491BA: trx_i_s_possibly_fetch_data_into_cache (trx0i_s.c:1352)
==14947== by 0x9F4CE7: trx_i_s_common_fill_table(THD*, TABLE_LIST*, Item*) (i_s.cc:1221)
==14947== by 0x7689F3: get_schema_tables_result(JOIN*, enum_schema_table_state) (sql_show.cc:6745)
==14947== by 0x715A75: JOIN::exec() (sql_select.cc:2861)
==14947== by 0x7185BD: mysql_select(THD*, Item***, TABLE_LIST*, unsigned int, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*) (sql_select.cc:3609)
==14947== by 0x70E823: handle_select(THD*, LEX*, select_result*, unsigned long) (sql_select.cc:319)
==14947== by 0x6F2305: execute_sqlcom_select(THD*, TABLE_LIST*) (sql_parse.cc:4557)
==14947== by 0x6EAED4: mysql_execute_command(THD*) (sql_parse.cc:2135)
==14947== by 0x6F44C9: mysql_parse(THD*, char*, unsigned int, Parser_state*) (sql_parse.cc:5597)
==14947== by 0x6E864B: dispatch_command(enum_server_command, THD*, char*, unsigned int) (sql_parse.cc:1093)
==14947== by 0x6E785E: do_command(THD*) (sql_parse.cc:815)
==14947== by 0x6C18DD: do_handle_one_connection(THD*) (sql_connect.cc:771)
==14947== by 0x6C146E: handle_one_connection (sql_connect.cc:707)
==14947== by 0x30E1807760: start_thread (pthread_create.c:301)
==14947== by 0x35EA670F: ???
(gdb) bt
#0 0x0000000004a06318 in _vgrZU_libcZdsoZa___GI_strlen (str=0x3026bfa0 "insert into `blobtest` set `data`='pkefxxpkalpabzgrczlxefkreqljeqbvzrcnhvhsjsfnvxzjsltfuincffigdkmhvvcmnseluzgbtedrfmxvnrdmzesbinjgwvharkpgjplrlnqudfidbqwgbykupycxzyikzqincnsjrxgncqzlgyqwjdbjulztgsffxpjgymsnntdibvklwqylmwhsmdskmllxuwafabdjnwlyofknwuixiyrgnplmerfdewgizkdhznitesfqepsqbbwkdepkmjoseyxjofmmjaqdipwopfrwidmhqbtovdslvayxcnpewzhppeetblccppniamezibuoinvlxkafpcmozawtplfpepxwlwhymsuraezcwvjqzwogsozodlsfzjiyrcaljjhqwdrcjawvelhefzzaexvcbyorlcyupqwgjuamiqpiputtndjwcsuyzdfhuxswuowhrzdvriwrxqmcqthvzzzvivbabbnhdbtcfdtgssvmirrcddnytnctcvqplwytxxzxelldhwahalzxvgynaiwjyezhxqhlsqudngekocfvlbqprxqhyhwbaomgqiwkpfguohuvlnhtrsszgacxhhzeppyqwfwabiqzgyzkperiidyunrykopysvlcxwhrcboetjltawdjergalsfvaxncmzoznryumrjmncvhvxqvqhhbznnifkguuiffmlrbmgwtzvnuwlaguixqadkupfhasbbxnwkrvsfhrqanfmvjtzfqodtutkjlxfcogtsjywrdgmzgszjtsmimaelsveayqrwviqwwefeziuaqsqpauxpnzhaxjtkdfvvodniwezskbxfxszyniyzkzxngcfwgjlyrlskmrzxqnptwlilsxybuguafxxkvryyjrnkhhcmxuusitaflaiuxjhyfnzkahlgmaszujqmfdhyppdnpweqanmvzgjfyzjolbmprhnuuxextcaxzicfvsuochprmlf"...) at mc_replace_strmem.c:284
#1 0x00000000009f3d7b in fill_innodb_trx_from_cache (cache=0x1462440, thd=0x2a495000, table=0x2a422500) at /home/sbester/build/bzr/mysql-trunk/storage/innobase/handler/i_s.cc:591
#2 0x00000000009f4d7e in trx_i_s_common_fill_table (thd=0x2a495000, tables=0x2a4c3ec0) at /home/sbester/build/bzr/mysql-trunk/storage/innobase/handler/i_s.cc:1238
#3 0x00000000007689f4 in get_schema_tables_result (join=0x30f90c40, executed_place=PROCESSED_BY_JOIN_EXEC) at /home/sbester/build/bzr/mysql-trunk/sql/sql_show.cc:6745
#4 0x0000000000715a76 in JOIN::exec (this=0x30f90c40) at /home/sbester/build/bzr/mysql-trunk/sql/sql_select.cc:2861
#5 0x00000000007185be in mysql_select (thd=0x2a495000, rref_pointer_array=0x2a497590, tables=0x2a4c3ec0, wild_num=1, fields=..., conds=0x0, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2684619520, result=0x30319720, unit=0x2a496d28, select_lex=0x2a497378) at /home/sbester/build/bzr/mysql-trunk/sql/sql_select.cc:3609
#6 0x000000000070e824 in handle_select (thd=0x2a495000, lex=0x2a496c78, result=0x30319720, setup_tables_done_option=0) at /home/sbester/build/bzr/mysql-trunk/sql/sql_select.cc:319
#7 0x00000000006f2306 in execute_sqlcom_select (thd=0x2a495000, all_tables=0x2a4c3ec0) at /home/sbester/build/bzr/mysql-trunk/sql/sql_parse.cc:4557
#8 0x00000000006eaed5 in mysql_execute_command (thd=0x2a495000) at /home/sbester/build/bzr/mysql-trunk/sql/sql_parse.cc:2135
#9 0x00000000006f44ca in mysql_parse (thd=0x2a495000, rawbuf=0x30d80060 "select * from innodb_trx", length=24, parser_state=0x35ea5540) at /home/sbester/build/bzr/mysql-trunk/sql/sql_parse.cc:5597
#10 0x00000000006e864c in dispatch_command (command=COM_QUERY, thd=0x2a495000, packet=0x30bb4e31 "select * from innodb_trx", packet_length=24) at /home/sbester/build/bzr/mysql-trunk/sql/sql_parse.cc:1093
#11 0x00000000006e785f in do_command (thd=0x2a495000) at /home/sbester/build/bzr/mysql-trunk/sql/sql_parse.cc:815
#12 0x00000000006c18de in do_handle_one_connection (thd_arg=0x2a495000) at /home/sbester/build/bzr/mysql-trunk/sql/sql_connect.cc:771
#13 0x00000000006c146f in handle_one_connection (arg=0x2a495000) at /home/sbester/build/bzr/mysql-trunk/sql/sql_connect.cc:707
#14 0x00000030e1807761 in start_thread (arg=0x35ea6710) at pthread_create.c:301
#15 0x00000030e14e14ed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
(gdb) frame 1
#1 0x00000000009f3d7b in fill_innodb_trx_from_cache (cache=0x1462440, thd=0x2a495000, table=0x2a422500) at /home/sbester/build/bzr/mysql-trunk/storage/innobase/handler/i_s.cc:591
591 row->trx_query_cs);
(gdb) list
586 if (row->trx_query) {
587 /* store will do appropriate character set
588 conversion check */
589 fields[IDX_TRX_QUERY]->store(
590 row->trx_query, strlen(row->trx_query),
591 row->trx_query_cs);
592 fields[IDX_TRX_QUERY]->set_notnull();
593 } else {
594 fields[IDX_TRX_QUERY]->set_null();
595 }
ibuf_inside(), ibuf_enter(), ibuf_exit(): Add the parameter mtr. The
flag is no longer kept in the thread-local storage but in the
mini-transaction (mtr->inside_ibuf).
mtr_start(): Clean up the comment and remove the unused return value.
mtr_commit(): Assert !ibuf_inside(mtr) in debug builds.
ibuf_mtr_start(): Like mtr_start(), but sets the flag.
ibuf_mtr_commit(), ibuf_btr_pcur_commit_specify_mtr(): Wrappers that
assert ibuf_inside().
buf_page_get_zip(), buf_page_init_for_read(),
buf_read_ibuf_merge_pages(), fil_io(), ibuf_free_excess_pages(),
ibuf_contract_ext(): Remove assertions on ibuf_inside(), because a
mini-transaction is not available.
buf_read_ahead_linear(): Add the parameter inside_ibuf.
ibuf_restore_pos(): When this function returns FALSE, it commits mtr
and must therefore do ibuf_exit(mtr).
ibuf_delete_rec(): This function commits mtr and must therefore do
ibuf_exit(mtr).
ibuf_rec_get_page_no(), ibuf_rec_get_space(), ibuf_rec_get_info(),
ibuf_rec_get_op_type(), ibuf_build_entry_from_ibuf_rec(),
ibuf_rec_get_volume(), ibuf_get_merge_page_nos(),
ibuf_get_volume_buffered_count(), ibuf_get_entry_counter_low(): Add
the parameter mtr in debug builds, for asserting ibuf_inside(mtr).
rb:585 approved by Sunny Bains
Remove the slot_no member of struct thr_local_struct.
enum srv_thread_type: Remove unused thread types.
srv_get_thread_type(): Unused function, remove.
thr_local_get_slot_no(), thr_local_set_slot_no(): Remove.
srv_thread_type_validate(), srv_slot_get_type(): New functions, for debugging.
srv_table_reserve_slot(): Return the srv_slot_t* directly. Do not create
thread-local storage.
srv_suspend_thread(): Get the srv_slot_t* as parameter. Return void;
the caller knows slot->event already.
srv_thread_has_reserved_slot(), srv_release_threads(): Assert
srv_thread_type_validate(type).
srv_init(): Use mem_zalloc() instead of mem_alloc(). Replace
srv_table_get_nth_slot(), because it now asserts that the kernel_mutex
is being held.
srv_master_thread(), srv_purge_thread(): Remember the slot from
srv_table_reserve_slot().
rb:629 approved by Inaam Rana
According to the zlib documentation, next_in and avail_in
must be initialized before invoking inflateInit or inflateInit2.
Furthermore, the zalloc function must clear the allocated memory.
btr_copy_zblob_prefix(): Replace the d_stream parameter with buf,len
and return the copied length.
page_zip_decompress(): Invoke inflateInit2 a little later.
page_zip_zalloc(): Rename from page_zip_alloc().
Invoke mem_heap_zalloc() instead of mem_heap_alloc().
rb:619 approved by Jimmy Yang
Before this fix, two performance schema unit tests crashed on windows.
The problem was a missing initialization to PFS_atomics,
which caused the crash only for platform not compiled with native atomics.
This fix adds the missing initialization in the unit tests.
No production code was changed, this is a unit test bug only.
MAP 'REPAIR TABLE' TO RECREATE +ANALYZE FOR ENGINES NOT
SUPPORTING NATIVE REPAIR
Executing 'mysqlcheck --check-upgrade --auto-repair ...' will first issue
'CHECK TABLE FOR UPGRADE' for all tables in the database in order to check if the
tables are compatible with the current version of MySQL. Any tables that are
found incompatible are then upgraded using 'REPAIR TABLE'.
The problem was that some engines (e.g. InnoDB) do not support 'REPAIR TABLE'.
This caused any such tables to be left incompatible. As a result such tables were
not properly fixed by the mysql_upgrade tool.
This patch fixes the problem by first changing 'CHECK TABLE FOR UPGRADE' to return
a different error message if the engine does not support REPAIR. Instead of
"Table upgrade required. Please do "REPAIR TABLE ..." it will report
"Table rebuild required. Please do "ALTER TABLE ... FORCE ..."
Second, the patch changes mysqlcheck to do 'ALTER TABLE ... FORCE' instead of
'REPAIR TABLE' in these cases.
This patch also fixes 'ALTER TABLE ... FORCE' to actually rebuild the table.
This change should be reflected in the documentation. Before this patch,
'ALTER TABLE ... FORCE' was unused (See Bug#11746162)
Test case added to mysqlcheck.test
Setting lowercase_table_names to 2 on Windows causing Foreign Key problems
This problem was exposed by the fix for Bug#55222. There was a codepath in dict0load.c,
dict_load_foreigns() that made sure the table name matched case sensitive in order to
load a referenced table into the dictionary as needed. If an engine is rebooted which
accesses a table with foreign keys, and lower_case_table_names=2, then the table with
foreign keys will get an error when it is changed (insert/updated/delete).
Once the referenced tables are loaded into the dictionary cache by a select statement
on those tables, the same change would succeed because the affected code path would
not get followed.
NON-PRIMARY UNIQUE INDEX USING INNODB
This patch adds the HA_INPLACE_ADD_UNIQUE_INDEX_NO_WRITE
capability flag to InnoDB, indicating that concurrent reads
can be allowed while non-primary unique indexes are created.
This is an follow-up to Bug #11751388 which enabled concurrent
reads when creating non-primary non-unique indexes.
Test case added to innodb_mysql_sync.test.
and compressed tables
buf_LRU_drop_page_hash_for_tablespace(): after releasing and
reacquiring the buffer pool mutex, do not dereference any block
descriptor pointer that is not known to be a pointer to an
uncompressed page frame (type buf_block_t; state ==
BUF_BLOCK_FILE_PAGE). Also, defer the acquisition of the block_mutex
until it is needed.
buf_page_get_gen(): Add mode == BUF_GET_IF_IN_POOL_PEEK for
buffer-fixing a block without making it young in the LRU list.
buf_page_get_gen(), buf_page_init(), buf_LRU_block_remove_hashed_page():
Set bpage->state = BUF_BLOCK_ZIP_FREE before buf_buddy_free(bpage),
so that similar race conditions might be detected a little easier.
btr_search_drop_page_hash_when_freed(): Use BUF_GET_IF_IN_POOL_PEEK
when dropping the hash indexes.
rb://528 approved by Jimmy Yang
IN OVERFLOW
Do not assign the result of the difference to a signed variable and
checking whether it is negative afterwards because this limits the max diff
to 2G on 32 bit systems. E.g. "signed = 3.5G - 1G" would be negative and the
code would assume that 3.5G < 1G. Instead compare the two variables directly
and assign to unsigned only if we know that the result of the subtraction
will be positive.
Discussed with: Jimmy and Sunny (via IRC)
Bug #11766501: Multiple RBS break the get rseg with mininum trx_t::no code during purge
Bug# 59291 changes:
Main problem is that truncating the UNDO log at the completion of every
trx_purge() call is expensive as the number of rollback segments is increased.
We truncate after a configurable amount of pages. The innodb_purge_batch_size
parameter is used to control when InnoDB does the actual truncate. The truncate
is done once after 128 (or TRX_SYS_N_RSEGS iterations). In other words we
truncate after purge 128 * innodb_purge_batch_size. The smaller the batch
size the quicker we truncate.
Introduce a new parameter that allows how many rollback segments to use for
storing REDO information. This is really step 1 in allowing complete control
to the user over rollback space management.
New parameters:
i) innodb_rollback_segments = number of rollback_segments to use
(default is now 128) dynamic parameter, can be changed anytime.
Currently there is little benefit in changing it from the default.
Optimisations in the patch.
i. Change the O(n) behaviour of trx_rseg_get_on_id() to O(log n)
Backported from 5.6. Refactor some of the binary heap code.
Create a new include/ut0bh.ic file.
ii. Avoid truncating the rollback segments after every purge.
Related changes that were moved to a separate patch:
i. Purge should not do any flushing, only wait for space to be free so that
it only does purging of records unless it is held up by a long running
transaction that is preventing it from progressing.
ii. Give the purge thread preference over transactions when acquiring the
rseg->mutex during commit. This to avoid purge blocking unnecessarily
when getting the next rollback segment to purge.
Bug #11766501 changes:
Add the rseg to the min binary heap under the cover of the kernel mutex and
the binary heap mutex. This ensures the ordering of the min binary heap.
The two changes have to be committed together because they share the same
that fixes both issues.
rb://567 Approved by: Inaam Rana.