InnoDB maintains an internal persistent sequence of transaction
identifiers. This sequence is used for assigning both transaction
start identifiers (DB_TRX_ID=trx->id) and end identifiers (trx->no)
as well as end identifiers for the mysql.transaction_registry table
that was introduced in MDEV-12894.
TRX_SYS_TRX_ID_WRITE_MARGIN: Remove. After this many updates of
the sequence we used to update the TRX_SYS page. We can avoid accessing
the TRX_SYS page if we modify the InnoDB startup so that it resurrects
the sequence from other pages of the transaction system.
TRX_SYS_TRX_ID_STORE: Deprecate. The field only exists for the purpose
of upgrading from an earlier version of MySQL or MariaDB.
Starting with this fix, MariaDB will rely on the fields
TRX_UNDO_TRX_ID, TRX_UNDO_TRX_NO in the undo log header page of
each non-committed transaction, and on the new field
TRX_RSEG_MAX_TRX_ID in rollback segment header pages.
Because of this change, setting innodb_force_recovery=5 or 6 may cause
the system to recover with trx_sys.get_max_trx_id()==0. We must adjust
checks for invalid DB_TRX_ID and PAGE_MAX_TRX_ID accordingly.
We will change the startup and shutdown messages to display the
trx_sys.get_max_trx_id() in addition to the log sequence number.
trx_sys_t::flush_max_trx_id(): Remove.
trx_undo_mem_create_at_db_start(), trx_undo_lists_init():
Add an output parameter max_trx_id, to be updated from
TRX_UNDO_TRX_ID, TRX_UNDO_TRX_NO.
TRX_RSEG_MAX_TRX_ID: New field, for persisting
trx_sys.get_max_trx_id() at the time of the latest transaction commit.
Startup does not read the undo log pages of committed transactions.
We want to avoid additional page accesses on startup, as well as
trouble when all undo logs have been emptied.
On startup, we will simply determine the maximum value from all pages
that are being read anyway.
TRX_RSEG_FORMAT: Redefined from TRX_RSEG_MAX_SIZE.
Old versions of InnoDB wrote uninitialized garbage to unused data fields.
Because of this, we cannot simply introduce a new field in the
rollback segment pages and expect it to be always zero, as it would be
if the database had been created by a recent enough InnoDB version.
Luckily, it looks like the field TRX_RSEG_MAX_SIZE was always written
as 0xfffffffe. We will indicate a new subformat of the page by writing
0 to this field. This has the nice side effect that after a downgrade
to older versions of InnoDB, transactions should fail to allocate any
undo log, that is, writes will be blocked. So, there is no problem of
getting corrupted transaction identifiers after downgrading.
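The subformat check described above could look roughly like the
following sketch. The struct and function names are hypothetical
stand-ins for the on-page fields, not the actual InnoDB code; only the
values 0 and 0xfffffffe come from the description above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical in-memory picture of the two header fields discussed above.
struct RsegHeaderFields {
  uint32_t format;      // TRX_RSEG_FORMAT (formerly TRX_RSEG_MAX_SIZE)
  uint64_t max_trx_id;  // TRX_RSEG_MAX_TRX_ID; may be garbage in the old format
};

const uint32_t RSEG_OLD_MAX_SIZE = 0xfffffffe;  // value written by old versions
const uint32_t RSEG_NEW_FORMAT = 0;             // marks the new subformat

// Copies max_trx_id to *out and returns true only if the page is in the
// new subformat. Otherwise *out is left untouched, because the field
// contents cannot be trusted.
bool read_max_trx_id(const RsegHeaderFields& h, uint64_t* out) {
  assert(out != nullptr);
  if (h.format != RSEG_NEW_FORMAT) return false;
  *out = h.max_trx_id;
  return true;
}
```

A page that still carries RSEG_OLD_MAX_SIZE in the format field is
simply skipped as a source of the maximum identifier.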
trx_rseg_t::max_size: Remove.
trx_rseg_header_create(): Remove the parameter max_size=ULINT_MAX.
trx_purge_add_undo_to_history(): Update TRX_RSEG_MAX_TRX_ID
(and TRX_RSEG_FORMAT if needed). This is invoked on transaction commit.
trx_rseg_mem_restore(): If TRX_RSEG_FORMAT contains 0,
read TRX_RSEG_MAX_TRX_ID.
trx_rseg_array_init(): Invoke trx_sys.init_max_trx_id(max_trx_id + 1)
where max_trx_id was the maximum that was encountered in the rollback
segment pages and the undo log pages of recovered active, XA PREPARE,
or some committed transactions. (See trx_purge_add_undo_to_history()
which invokes trx_rsegf_set_nth_undo(..., FIL_NULL, ...);
not all committed transactions will be immediately detached from the
rollback segment header.)
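As a rough illustration of the startup computation described above,
the following sketch takes the maximum over the rollback segment
headers and the undo log headers that were read, and returns the value
one past it. The types and the function next_trx_id() are hypothetical
simplifications, not the actual InnoDB data structures.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified view of what startup reads anyway.
struct UndoHeader {      // one per undo log still attached to the rseg
  uint64_t trx_id;       // stands in for TRX_UNDO_TRX_ID
  uint64_t trx_no;       // stands in for TRX_UNDO_TRX_NO
};

struct RsegHeader {      // one per rollback segment
  uint64_t max_trx_id;   // stands in for TRX_RSEG_MAX_TRX_ID
  std::vector<UndoHeader> undo;
};

// Returns the value to initialize the sequence with: one past the
// maximum identifier seen. With no input the maximum is taken as 0.
uint64_t next_trx_id(const std::vector<RsegHeader>& rsegs) {
  uint64_t max_trx_id = 0;
  for (const RsegHeader& r : rsegs) {
    max_trx_id = std::max(max_trx_id, r.max_trx_id);
    for (const UndoHeader& u : r.undo)
      max_trx_id = std::max({max_trx_id, u.trx_id, u.trx_no});
  }
  return max_trx_id + 1;
}
```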
trx_rseg_mem_restore(): Update the max_trx_id from the undo log pages.
trx_sys_init_at_db_start(): Remove; merge with trx_lists_init_at_db_start().
trx_undo_lists_init(): Move to the only calling module, trx0rseg.cc.
trx_undo_mem_create_at_db_start(): Declare globally. Return the number
of pages.
trx_undo_page_get_prev_rec(), trx_undo_page_get_last_rec(),
trx_undo_page_get_first_rec(), trx_undo_page_get_start():
Move to the only caller, trx0undo.cc.
Add some const qualifiers.
trx_sysf_t: Remove.
trx_sysf_get(): Return the TRX_SYS page, not a pointer within it.
trx_sysf_rseg_get_space(), trx_sysf_rseg_get_page_no():
Remove a parameter, and merge the declaration and definition.
Take the TRX_SYS page as a parameter.
TRX_SYS_N_RSEGS: Correct the comment.
trx_sysf_rseg_find_free(), trx_sys_update_mysql_binlog_offset(),
trx_sys_update_wsrep_checkpoint(): Take the TRX_SYS page as a parameter.
trx_rseg_header_create(): Add a parameter for the TRX_SYS page.
trx_sysf_rseg_set_space(), trx_sysf_rseg_set_page_no(): Remove;
merge to the only caller, trx_rseg_header_create().
srv_init_abort_low(): Call srv_shutdown_bg_undo_sources() so that if
startup aborts while creating InnoDB system tables, the shutdown will
proceed correctly.
- When appending a LEX_CSTRING to a String, we now check that the
string is \0 terminated (as a LEX_CSTRING should normally be
usable with printf()). To avoid the check, one can use
String->append(ptr, length) instead of
String->append(LEX_CSTRING*).
This preserves const str for constant strings
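The termination check on append could be sketched as follows. This
uses std::string and a simplified LexCString struct as stand-ins; the
real String class and LEX_CSTRING are assumed to behave similarly but
are not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Simplified stand-in for LEX_CSTRING: a pointer plus an explicit length.
struct LexCString {
  const char* str;
  size_t length;
};

// Unchecked variant: the caller supplies pointer and length, and the
// bytes need not be followed by a terminator.
void append(std::string& dst, const char* ptr, size_t length) {
  dst.append(ptr, length);
}

// Checked variant: in debug builds, verify that the byte after the
// string is '\0', so that str is also safe to pass to printf("%s").
void append(std::string& dst, const LexCString* s) {
  assert(s->str[s->length] == '\0');
  dst.append(s->str, s->length);
}
```

A caller that only wants a prefix of a longer buffer would use the
(ptr, length) overload, since such a slice is not terminated at its
own length.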
Other things
- A few variables were changed from LEX_STRING to LEX_CSTRING
- Incident_log_event::Incident_log_event and record_incident were
changed to take LEX_CSTRING* as an argument instead of LEX_STRING
This was done in, among other things:
- thd->db and thd->db_length
- TABLE_LIST tablename, db, alias and schema_name
- Audit plugin database name
- lex->db
- All db and table names in Alter_table_ctx
- st_select_lex db
Other things:
- Changed a lot of functions to take const LEX_CSTRING* as argument
for db, table_name and alias. See init_one_table() as an example.
- Changed some function arguments from LEX_CSTRING to const LEX_CSTRING
- Changed some lists from LEX_STRING to LEX_CSTRING
- threads_mysql.result changed because processlist_db wasn't always
correctly updated
- New append_identifier() function that takes LEX_CSTRING* as arguments
- Added new element tmp_buff to Alter_table_ctx to separate temp name
handling from temporary space
- Ensure we store the length after my_casedn_str() of table/db names
- Removed unused version of rename_table_in_stat_tables()
- Changed Natural_join_column::table_name and db_name() to never return
NULL (used for print)
- thd->get_db() now returns db as a printable string (thd->db.str or "")
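The "never return NULL" convention from the last two items can be
illustrated with a minimal sketch. Session and LexCString below are
hypothetical stand-ins for THD and LEX_CSTRING; only the
"db.str or empty string" behaviour is taken from the description.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-in for LEX_CSTRING.
struct LexCString {
  const char* str;
  size_t length;
};

// Hypothetical stand-in for the session object that owns the db name.
struct Session {
  LexCString db;

  // Always returns a printable, \0 terminated string: the database
  // name if one is set, otherwise the empty string.
  const char* get_db() const { return db.str ? db.str : ""; }
};
```

Callers can then pass the result straight to a "%s" format without a
separate NULL check.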
MDEV-11415 Remove excessive undo logging during ALTER TABLE…ALGORITHM=COPY
Move a test from innodb.rename_table_debug to innodb.alter_copy.
ha_innobase::extra(HA_EXTRA_BEGIN_ALTER_COPY): Register id-versioned
tables so that mysql.transaction_registry will be updated, even for
empty tables that are subjected to ALTER TABLE…ALGORITHM=COPY.
Whenever one copies an IO_CACHE struct, one must remember to call
setup_io_cache. Otherwise, the IO_CACHE's current_pos and end_pos
self-references will point to the previous struct's memory, which
could go out of scope. Commit 9003869390
fixes this problem in a more general fashion by removing the
self-references altogether, but for 5.5 we'll keep the old behaviour.
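The hazard can be shown with a reduced sketch. Cache is a hypothetical
struct with a single self-reference, and setup_self_refs() plays the
role that setup_io_cache plays for IO_CACHE; the real struct has more
fields and is not reproduced here.

```cpp
#include <cassert>

// Simplified stand-in for IO_CACHE: current_pos points at one of the
// other members of the same object.
struct Cache {
  char* read_pos;
  char* write_pos;
  char** current_pos;  // must point at read_pos or write_pos of *this*
  bool writing;
};

// Re-point the self-reference at this object's own members.
void setup_self_refs(Cache* c) {
  assert(c != nullptr);
  c->current_pos = c->writing ? &c->write_pos : &c->read_pos;
}

// A plain struct copy duplicates the pointer value, so right after the
// assignment dst->current_pos still points into src. The repair step
// is what makes the copy independent of src's lifetime.
void copy_cache(Cache* dst, const Cache& src) {
  *dst = src;
  setup_self_refs(dst);
}
```

Without the call to setup_self_refs() in copy_cache(), the copy would
keep dereferencing memory owned by the source object.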
If a crash occurs during ALTER TABLE…ALGORITHM=COPY, InnoDB would spend
a lot of time rolling back writes to the intermediate copy of the table.
To reduce the amount of busy work done, a work-around was introduced in
commit fd069e2bb3 in MySQL 4.1.8 and 5.0.2,
to commit the transaction after every 10,000 inserted rows.
A proper fix would have been to disable the undo logging altogether and
to simply drop the intermediate copy of the table on subsequent server
startup. This is what happens in MariaDB 10.3 with MDEV-14717, MDEV-14585.
In MariaDB 10.2, the intermediate copy of the table would be left behind
with a name starting with the string #sql.
This is a backport of a bug fix from MySQL 8.0.0 to MariaDB,
contributed by jixianliang <271365745@qq.com>.
Unlike recent MySQL, MariaDB supports ALTER IGNORE. For that operation
InnoDB must for now keep the undo logging enabled, so that the latest
row can be rolled back in case of an error.
In Galera cluster, the LOAD DATA statement will retain the existing
behaviour and commit the transaction after every 10,000 rows if
the parameter wsrep_load_data_splitting=ON is set. The logic to do
so (the wsrep_load_data_split() function and the call
handler::extra(HA_EXTRA_FAKE_START_STMT)) is joint work
by Ji Xianliang and Marko Mäkelä.
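The retained splitting behaviour amounts to a row counter that
triggers an intermediate commit. The sketch below only illustrates
that idea; RowLoader and its members are hypothetical and do not
reflect how wsrep_load_data_split() is actually implemented.

```cpp
#include <cassert>

const unsigned long SPLIT_ROWS = 10000;  // rows per intermediate commit

// Hypothetical model of a bulk load that optionally commits in chunks.
class RowLoader {
 public:
  explicit RowLoader(bool splitting)
      : splitting_(splitting), rows_in_trx_(0), commits_(0) {}

  // Called once per inserted row. With splitting enabled, every
  // SPLIT_ROWS rows the current transaction is counted as committed
  // and a new one starts.
  void row_inserted() {
    ++rows_in_trx_;
    if (splitting_ && rows_in_trx_ == SPLIT_ROWS) {
      ++commits_;
      rows_in_trx_ = 0;
    }
  }

  unsigned long commits() const { return commits_; }
  unsigned long pending_rows() const { return rows_in_trx_; }

 private:
  bool splitting_;
  unsigned long rows_in_trx_;
  unsigned long commits_;
};
```

With splitting off, all rows stay in one transaction, which is the
behaviour ALTER TABLE…ALGORITHM=COPY now has unconditionally.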
The original fix:
Author: Thirunarayanan Balathandayuthapani <thirunarayanan.balathandayuth@oracle.com>
Date: Wed Dec 2 16:09:15 2015 +0530
Bug#17479594 AVOID INTERMEDIATE COMMIT WHILE DOING ALTER TABLE ALGORITHM=COPY
Problem:
During ALTER TABLE, we commit and restart the transaction for every
10,000 rows, so that the rollback after recovery would not take so long.
Fix:
Suppress the undo logging during copy alter operation. If fts_index is
present then insert directly into fts auxiliary table rather
than doing at commit time.
ha_innobase::num_write_row: Remove the variable.
ha_innobase::write_row(): Remove the hack for committing every 10000 rows.
row_lock_table_for_mysql(): Remove the extra 2 parameters.
lock_get_src_table(), lock_is_table_exclusive(): Remove.
Reviewed-by: Marko Mäkelä <marko.makela@oracle.com>
Reviewed-by: Shaohua Wang <shaohua.wang@oracle.com>
Reviewed-by: Jon Olav Hauglid <jon.hauglid@oracle.com>
- Galera tests that were not updated with connection change
messages
- Test where the out of memory error was changed (we are now using the
standard out of memory error in most places)
- Removed TokuDB tests that use include files that don't exist
in MariaDB
- Removed an unsupported mariadb startup option from an option file
- Galera tests that were not updated with connection change
messages
- Disabled some TokuDB tests that always timed out.
These should be enabled again when we have an option to
specify timeouts per test.
The thd->lex->part_info should be kept intact during PS
execution. Otherwise, the second execution gets the modified part_info.
Let's modify thd->work_part_info instead.
Item_xml_str_func::fix_fields() used a local "String tmp" as a buffer
for args[1]->val_str(). "tmp" was freed at the end of fix_fields(),
while Items created during my_xpath_parse() still pointed to its fragments.
Add a new member Item_xml_str_func::m_xpath_query and store the result
of args[1]->val_str() into it.
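The lifetime problem and its fix can be sketched in isolation.
XPathFunc and Fragment are hypothetical reductions of
Item_xml_str_func and the parsed Items; the parsing itself is replaced
by a trivial "last path step" lookup for the sake of the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// A parsed fragment that refers to a slice of the query text by pointer.
struct Fragment {
  const char* ptr;
  size_t len;
};

class XPathFunc {
 public:
  XPathFunc() : m_last_step{nullptr, 0} {}

  // The evaluated argument is copied into a member first, and only
  // then parsed. Fragments point into m_query, so they stay valid
  // after fix() returns. Pointing them at a local buffer instead would
  // leave them dangling once that buffer is freed.
  void fix(const std::string& evaluated_arg) {
    m_query = evaluated_arg;
    size_t slash = m_query.rfind('/');
    size_t start = slash == std::string::npos ? 0 : slash + 1;
    m_last_step.ptr = m_query.data() + start;
    m_last_step.len = m_query.size() - start;
  }

  std::string last_step() const {
    return std::string(m_last_step.ptr, m_last_step.len);
  }

 private:
  std::string m_query;   // plays the role of m_xpath_query
  Fragment m_last_step;  // points into m_query
};
```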
Do not SET DEBUG_DBUG=-d,... in tests. To disable debug instrumentation,
save and restore the original value of the variable DEBUG_DBUG.
Assigning -d,... will enable the output of a lot of unrelated DBUG
messages to the server error log.