branches/innodb+: Merge revisions 4150:4528 from branches/zip:
------------------------------------------------------------------------
r4152 | marko | 2009-02-10 12:52:27 +0200 (Tue, 10 Feb 2009) | 12 lines
branches/zip: When innodb_use_sys_malloc is set, ignore
innodb_additional_mem_pool_size, because nothing will
be allocated from mem_comm_pool.
mem_pool_create(): Remove the assertion about size. The function will
work with any size. However, an assertion would fail in ut_malloc_low()
when size==0.
mem_init(): When srv_use_sys_malloc is set, pass size=1 to mem_pool_create().
mem0mem.c: Add #include "srv0srv.h" that is needed by mem0dbg.c.
------------------------------------------------------------------------
r4153 | vasil | 2009-02-10 22:58:17 +0200 (Tue, 10 Feb 2009) | 14 lines
branches/zip:
(followup to r4145) Non-functional change:
Change the os_atomic_increment() and os_compare_and_swap() functions
to macros to avoid artificial limitations on the types of those
functions' arguments. As a consequence typecasts from the source
code can be removed.
Also remove Google's copyright from os0sync.ic because that file no longer
contains code from Google.
Approved by: Marko (rb://88), also ok from Inaam via IM
------------------------------------------------------------------------
r4163 | marko | 2009-02-12 00:14:19 +0200 (Thu, 12 Feb 2009) | 4 lines
branches/zip: Make innodb_thread_concurrency=0 the default.
The old default was 8.
------------------------------------------------------------------------
r4169 | calvin | 2009-02-12 10:37:10 +0200 (Thu, 12 Feb 2009) | 3 lines
branches/zip: Adjust the result file of innodb_thread_concurrency_basic
test. The default value of innodb_thread_concurrency is changed to 0
(from 8) via r4163.
------------------------------------------------------------------------
r4174 | vasil | 2009-02-12 17:38:27 +0200 (Thu, 12 Feb 2009) | 4 lines
branches/zip:
Fix pathname of the file to patch.
------------------------------------------------------------------------
r4176 | vasil | 2009-02-13 10:06:31 +0200 (Fri, 13 Feb 2009) | 7 lines
branches/zip:
Fix the failing mysql-test partition_innodb, which failed only if run after
innodb_trx_weight (or other test that would leave LATEST DEADLOCK ERROR into
the output of SHOW ENGINE INNODB STATUS). Find further explanation for the
failure at the top of the added patch partition_innodb.diff.
------------------------------------------------------------------------
r4198 | vasil | 2009-02-17 09:06:07 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
Add the full text of the GPLv2 license into the root directory of the
plugin. In previous releases this file was copied from an external source
(https://svn.innodb.com/svn/plugin/trunk/support/COPYING) "manually" when
creating the source and binary archives. It is less confusing to have this
present in the root directory of the SVN branch.
------------------------------------------------------------------------
r4199 | vasil | 2009-02-17 09:11:58 +0200 (Tue, 17 Feb 2009) | 4 lines
branches/zip:
Add Google's license into COPYING.Google.
------------------------------------------------------------------------
r4200 | vasil | 2009-02-17 09:56:33 +0200 (Tue, 17 Feb 2009) | 11 lines
branches/zip:
To the files touched by the Google patch from c4144 (excluding
include/os0sync.ic because later we removed Google code from that file):
* Remove the Google license
* Remove old Innobase copyright lines
* Add a reference to the Google license and to the GPLv2 license at the top,
as recommended by the lawyers at Oracle Legal.
------------------------------------------------------------------------
r4201 | vasil | 2009-02-17 10:12:02 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 1/28]
------------------------------------------------------------------------
r4202 | vasil | 2009-02-17 10:15:06 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 2/28]
------------------------------------------------------------------------
r4203 | vasil | 2009-02-17 10:25:45 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 3/28]
------------------------------------------------------------------------
r4204 | vasil | 2009-02-17 10:55:41 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 4/28]
------------------------------------------------------------------------
r4205 | vasil | 2009-02-17 10:59:22 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 5/28]
------------------------------------------------------------------------
r4206 | vasil | 2009-02-17 11:02:27 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 6/28]
------------------------------------------------------------------------
r4207 | vasil | 2009-02-17 11:04:28 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 7/28]
------------------------------------------------------------------------
r4208 | vasil | 2009-02-17 11:06:49 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 8/28]
------------------------------------------------------------------------
r4209 | vasil | 2009-02-17 11:10:18 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 9/28]
------------------------------------------------------------------------
r4210 | vasil | 2009-02-17 11:12:41 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 10/28]
------------------------------------------------------------------------
r4211 | vasil | 2009-02-17 11:14:40 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 11/28]
------------------------------------------------------------------------
r4212 | vasil | 2009-02-17 11:18:35 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 12/28]
------------------------------------------------------------------------
r4213 | vasil | 2009-02-17 11:24:40 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 13/28]
------------------------------------------------------------------------
r4214 | vasil | 2009-02-17 11:27:31 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 13/28]
------------------------------------------------------------------------
r4215 | vasil | 2009-02-17 11:29:55 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 15/28]
------------------------------------------------------------------------
r4216 | vasil | 2009-02-17 11:33:38 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 16/28]
------------------------------------------------------------------------
r4217 | vasil | 2009-02-17 11:36:44 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 17/28]
------------------------------------------------------------------------
r4218 | vasil | 2009-02-17 11:39:11 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 18/28]
------------------------------------------------------------------------
r4219 | vasil | 2009-02-17 11:41:24 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 19/28]
------------------------------------------------------------------------
r4220 | vasil | 2009-02-17 11:43:50 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 20/28]
------------------------------------------------------------------------
r4221 | vasil | 2009-02-17 11:46:52 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 21/28]
------------------------------------------------------------------------
r4222 | vasil | 2009-02-17 11:50:12 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 22/28]
------------------------------------------------------------------------
r4223 | vasil | 2009-02-17 11:53:58 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 23/28]
------------------------------------------------------------------------
r4224 | vasil | 2009-02-17 12:01:41 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 24/28]
------------------------------------------------------------------------
r4225 | vasil | 2009-02-17 12:05:45 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 25/28]
------------------------------------------------------------------------
r4226 | vasil | 2009-02-17 12:09:16 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 26/28]
------------------------------------------------------------------------
r4227 | vasil | 2009-02-17 12:12:56 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 27/28]
------------------------------------------------------------------------
r4228 | vasil | 2009-02-17 12:14:04 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 28/28]
------------------------------------------------------------------------
r4229 | vasil | 2009-02-17 12:30:55 +0200 (Tue, 17 Feb 2009) | 4 lines
branches/zip:
Add the copyright notice to the non C files.
------------------------------------------------------------------------
r4231 | marko | 2009-02-17 14:26:53 +0200 (Tue, 17 Feb 2009) | 12 lines
Minor cleanup of the Google SMP patch.
sync_array_object_signalled(): Add a (void) cast to eliminate a gcc warning
about the return value of os_atomic_increment() being ignored.
rw_lock_create_func(): Properly indent the preprocessor directives.
rw_lock_x_lock_low(), rw_lock_x_lock_func_nowait(): Split lines correctly.
rw_lock_set_writer_id_and_recursion_flag(): Silence a Valgrind warning.
Do not mix statements and variable declarations.
------------------------------------------------------------------------
r4232 | marko | 2009-02-17 14:59:54 +0200 (Tue, 17 Feb 2009) | 3 lines
branches/zip: When assigning lock->recursive = FALSE, also flag
lock->writer_thread invalid, so that Valgrind will catch more errors.
This is related to Issue #175.
------------------------------------------------------------------------
r4242 | marko | 2009-02-18 17:01:09 +0200 (Wed, 18 Feb 2009) | 2 lines
branches/zip: UT_DBG_STOP: Use do{} while(0) to silence a g++-4.3.2 warning
about a while(0); statement. This should fix (part of) Issue #176.
------------------------------------------------------------------------
r4243 | marko | 2009-02-18 17:04:03 +0200 (Wed, 18 Feb 2009) | 3 lines
branches/zip: buf_buddy_get_slot(): Fix a gcc 4.3.2 warning
about an empty body of a "for" statement.
This fixes part of Issue #176.
------------------------------------------------------------------------
r4244 | marko | 2009-02-18 17:25:45 +0200 (Wed, 18 Feb 2009) | 11 lines
branches/zip: Protect ut_total_allocated_memory with ut_list_mutex.
Unprotected updates to ut_total_allocated_memory in
os_mem_alloc_large() and os_mem_free_large(), called during
fast index creation, may corrupt the variable and cause assertion failures.
Also, add UNIV_MEM_ALLOC() and UNIV_MEM_FREE() instrumentation around
os_mem_alloc_large() and os_mem_free_large(), so that Valgrind can
detect more errors.
rb://90 approved by Heikki Tuuri. This addresses Issue #177.
------------------------------------------------------------------------
r4248 | marko | 2009-02-19 11:52:39 +0200 (Thu, 19 Feb 2009) | 2 lines
branches/zip: page_zip_set_size(): Fix a g++ 4.3.2 warning
about an empty body in a "for" statement. This closes Issue #176.
------------------------------------------------------------------------
r4251 | inaam | 2009-02-19 15:46:27 +0200 (Thu, 19 Feb 2009) | 8 lines
branches/zip: Issue #178 rb://91
Change plug.in to have same CXXFLAGS as CFLAGS. This is to ensure that
both .c and .cc files get compiled with same flags. To fix the issue
where UNIV_LINUX was defined only in .c files.
Approved by: Marko
------------------------------------------------------------------------
r4258 | vasil | 2009-02-20 11:52:19 +0200 (Fri, 20 Feb 2009) | 7 lines
branches/zip:
Cleanup in ChangeLog:
* Wrap lines at 78 characters
* Changed files are listed alphabetically
* White-space cleanup
------------------------------------------------------------------------
r4259 | vasil | 2009-02-20 11:59:42 +0200 (Fri, 20 Feb 2009) | 6 lines
branches/zip:
ChangeLog: Remove include/os0sync.ic from the entry about the google patch,
this file was modified later to not include Google's code.
------------------------------------------------------------------------
r4262 | vasil | 2009-02-20 14:56:59 +0200 (Fri, 20 Feb 2009) | 373 lines
branches/zip:
Merge revisions 4035:4261 from branches/5.1:
------------------------------------------------------------------------
r4065 | sunny | 2009-01-29 16:01:36 +0200 (Thu, 29 Jan 2009) | 8 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: In the last round of AUTOINC cleanup we assumed that AUTOINC
is only defined for integer columns. This caused an assertion failure when
we checked for the maximum value of a column type. We now calculate the
max value for floating-point autoinc columns too.
Fix Bug#42400 - InnoDB autoinc code can't handle floating-point columns
rb://84 and Mantis issue://162
------------------------------------------------------------------------
r4111 | sunny | 2009-02-03 22:06:52 +0200 (Tue, 03 Feb 2009) | 2 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: Add the ULL suffix otherwise there is an overflow.
------------------------------------------------------------------------
r4128 | vasil | 2009-02-08 21:36:45 +0200 (Sun, 08 Feb 2009) | 18 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1:
Merge a change from MySQL:
------------------------------------------------------------
revno: 2709.20.31
committer: Timothy Smith <timothy.smith@sun.com>
branch nick: 51
timestamp: Fri 2008-12-19 01:28:51 +0100
message:
Disable part of innodb-autoinc.test, because the MySQL server asserts when
compiled --with-debug, due to bug 39828, "autoinc wraps around when offset and
increment > 1". This change should be reverted when that bug is fixed (and a
a few other minor changes to the test as described in comments).
modified:
mysql-test/r/innodb-autoinc.result
mysql-test/t/innodb-autoinc.test
------------------------------------------------------------------------
r4129 | vasil | 2009-02-08 21:54:25 +0200 (Sun, 08 Feb 2009) | 310 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1:
Merge a change from MySQL:
[looks like the changes to innodb-autoinc.test were made as part of
the following huge merge, but we are merging only changes to that file]
------------------------------------------------------------
revno: 2546.47.1
committer: Luis Soares <luis.soares@sun.com>
branch nick: 5.1-rpl
timestamp: Fri 2009-01-23 13:22:05 +0100
message:
merge: 5.1 -> 5.1-rpl
conflicts:
Text conflict in client/mysqltest.cc
Text conflict in mysql-test/include/wait_until_connected_again.inc
Text conflict in mysql-test/lib/mtr_report.pm
Text conflict in mysql-test/mysql-test-run.pl
Text conflict in mysql-test/r/events_bugs.result
Text conflict in mysql-test/r/log_state.result
Text conflict in mysql-test/r/myisam_data_pointer_size_func.result
Text conflict in mysql-test/r/mysqlcheck.result
Text conflict in mysql-test/r/query_cache.result
Text conflict in mysql-test/r/status.result
Text conflict in mysql-test/suite/binlog/r/binlog_index.result
Text conflict in mysql-test/suite/binlog/r/binlog_innodb.result
Text conflict in mysql-test/suite/rpl/r/rpl_packet.result
Text conflict in mysql-test/suite/rpl/t/rpl_packet.test
Text conflict in mysql-test/t/disabled.def
Text conflict in mysql-test/t/events_bugs.test
Text conflict in mysql-test/t/log_state.test
Text conflict in mysql-test/t/myisam_data_pointer_size_func.test
Text conflict in mysql-test/t/mysqlcheck.test
Text conflict in mysql-test/t/query_cache.test
Text conflict in mysql-test/t/rpl_init_slave_func.test
Text conflict in mysql-test/t/status.test
removed:
mysql-test/suite/parts/r/partition_bit_ndb.result
mysql-test/suite/parts/t/partition_bit_ndb.test
mysql-test/suite/parts/t/partition_sessions.test
mysql-test/suite/sys_vars/inc/tmp_table_size_basic.inc
mysql-test/suite/sys_vars/r/tmp_table_size_basic_32.result
mysql-test/suite/sys_vars/r/tmp_table_size_basic_64.result
mysql-test/suite/sys_vars/t/tmp_table_size_basic_32.test
mysql-test/suite/sys_vars/t/tmp_table_size_basic_64.test
mysql-test/t/log_bin_trust_function_creators_func-master.opt
mysql-test/t/rpl_init_slave_func-slave.opt
added:
mysql-test/include/check_events_off.inc
mysql-test/include/cleanup_fake_relay_log.inc
mysql-test/include/have_simple_parser.inc
mysql-test/include/no_running_event_scheduler.inc
mysql-test/include/no_running_events.inc
mysql-test/include/running_event_scheduler.inc
mysql-test/include/setup_fake_relay_log.inc
mysql-test/include/wait_condition_sp.inc
mysql-test/r/fulltext_plugin.result
mysql-test/r/have_simple_parser.require
mysql-test/r/innodb_bug38231.result
mysql-test/r/innodb_bug39438.result
mysql-test/r/innodb_mysql_rbk.result
mysql-test/r/partition_innodb_semi_consistent.result
mysql-test/r/query_cache_28249.result
mysql-test/r/status2.result
mysql-test/std_data/bug40482-bin.000001
mysql-test/suite/binlog/r/binlog_innodb_row.result
mysql-test/suite/binlog/t/binlog_innodb_row.test
mysql-test/suite/rpl/r/rpl_binlog_corruption.result
mysql-test/suite/rpl/t/rpl_binlog_corruption-master.opt
mysql-test/suite/rpl/t/rpl_binlog_corruption.test
mysql-test/suite/sys_vars/r/tmp_table_size_basic.result
mysql-test/suite/sys_vars/t/tmp_table_size_basic.test
mysql-test/t/fulltext_plugin-master.opt
mysql-test/t/fulltext_plugin.test
mysql-test/t/innodb_bug38231.test
mysql-test/t/innodb_bug39438-master.opt
mysql-test/t/innodb_bug39438.test
mysql-test/t/innodb_mysql_rbk-master.opt
mysql-test/t/innodb_mysql_rbk.test
mysql-test/t/partition_innodb_semi_consistent-master.opt
mysql-test/t/partition_innodb_semi_consistent.test
mysql-test/t/query_cache_28249.test
mysql-test/t/status2.test
renamed:
mysql-test/suite/funcs_1/r/is_collation_character_set_applicability.result => mysql-test/suite/funcs_1/r/is_coll_char_set_appl.result
mysql-test/suite/funcs_1/t/is_collation_character_set_applicability.test => mysql-test/suite/funcs_1/t/is_coll_char_set_appl.test
modified:
.bzr-mysql/default.conf
CMakeLists.txt
client/mysql.cc
client/mysql_upgrade.c
client/mysqlcheck.c
client/mysqltest.cc
configure.in
extra/resolve_stack_dump.c
extra/yassl/include/openssl/ssl.h
include/config-win.h
include/m_ctype.h
include/my_global.h
mysql-test/extra/binlog_tests/database.test
mysql-test/extra/rpl_tests/rpl_auto_increment.test
mysql-test/include/commit.inc
mysql-test/include/have_32bit.inc
mysql-test/include/have_64bit.inc
mysql-test/include/index_merge1.inc
mysql-test/include/linux_sys_vars.inc
mysql-test/include/windows_sys_vars.inc
mysql-test/lib/mtr_report.pm
mysql-test/mysql-test-run.pl
mysql-test/r/alter_table.result
mysql-test/r/commit_1innodb.result
mysql-test/r/create.result
mysql-test/r/csv.result
mysql-test/r/ctype_ucs.result
mysql-test/r/date_formats.result
mysql-test/r/events_bugs.result
mysql-test/r/events_scheduling.result
mysql-test/r/fulltext.result
mysql-test/r/func_if.result
mysql-test/r/func_in.result
mysql-test/r/func_str.result
mysql-test/r/func_time.result
mysql-test/r/grant.result
mysql-test/r/index_merge_myisam.result
mysql-test/r/information_schema.result
mysql-test/r/innodb-autoinc.result
mysql-test/r/innodb.result
mysql-test/r/innodb_mysql.result
mysql-test/r/log_bin_trust_function_creators_func.result
mysql-test/r/log_state.result
mysql-test/r/myisampack.result
mysql-test/r/mysql.result
mysql-test/r/mysqlcheck.result
mysql-test/r/partition_datatype.result
mysql-test/r/partition_mgm.result
mysql-test/r/partition_pruning.result
mysql-test/r/query_cache.result
mysql-test/r/read_buffer_size_basic.result
mysql-test/r/read_rnd_buffer_size_basic.result
mysql-test/r/rpl_init_slave_func.result
mysql-test/r/select.result
mysql-test/r/status.result
mysql-test/r/strict.result
mysql-test/r/temp_table.result
mysql-test/r/type_bit.result
mysql-test/r/type_date.result
mysql-test/r/type_float.result
mysql-test/r/warnings_engine_disabled.result
mysql-test/r/xml.result
mysql-test/suite/binlog/r/binlog_database.result
mysql-test/suite/binlog/r/binlog_index.result
mysql-test/suite/binlog/r/binlog_innodb.result
mysql-test/suite/binlog/r/binlog_row_mix_innodb_myisam.result
mysql-test/suite/binlog/t/binlog_innodb.test
mysql-test/suite/funcs_1/r/is_columns_is.result
mysql-test/suite/funcs_1/r/is_engines.result
mysql-test/suite/funcs_1/r/storedproc.result
mysql-test/suite/funcs_1/storedproc/param_check.inc
mysql-test/suite/funcs_2/t/disabled.def
mysql-test/suite/ndb/t/disabled.def
mysql-test/suite/parts/r/partition_bit_innodb.result
mysql-test/suite/parts/r/partition_bit_myisam.result
mysql-test/suite/parts/r/partition_special_innodb.result
mysql-test/suite/parts/t/disabled.def
mysql-test/suite/parts/t/partition_special_innodb.test
mysql-test/suite/parts/t/partition_value_innodb.test
mysql-test/suite/parts/t/partition_value_myisam.test
mysql-test/suite/parts/t/partition_value_ndb.test
mysql-test/suite/rpl/r/rpl_auto_increment.result
mysql-test/suite/rpl/r/rpl_packet.result
mysql-test/suite/rpl/r/rpl_row_create_table.result
mysql-test/suite/rpl/r/rpl_slave_skip.result
mysql-test/suite/rpl/r/rpl_trigger.result
mysql-test/suite/rpl/t/disabled.def
mysql-test/suite/rpl/t/rpl_packet.test
mysql-test/suite/rpl/t/rpl_row_create_table.test
mysql-test/suite/rpl/t/rpl_slave_skip.test
mysql-test/suite/rpl/t/rpl_trigger.test
mysql-test/suite/rpl_ndb/t/disabled.def
mysql-test/suite/sys_vars/inc/key_buffer_size_basic.inc
mysql-test/suite/sys_vars/inc/sort_buffer_size_basic.inc
mysql-test/suite/sys_vars/r/key_buffer_size_basic_32.result
mysql-test/suite/sys_vars/r/key_buffer_size_basic_64.result
mysql-test/suite/sys_vars/r/sort_buffer_size_basic_32.result
mysql-test/suite/sys_vars/r/sort_buffer_size_basic_64.result
mysql-test/t/alter_table.test
mysql-test/t/create.test
mysql-test/t/csv.test
mysql-test/t/ctype_ucs.test
mysql-test/t/date_formats.test
mysql-test/t/disabled.def
mysql-test/t/events_bugs.test
mysql-test/t/events_scheduling.test
mysql-test/t/fulltext.test
mysql-test/t/func_if.test
mysql-test/t/func_in.test
mysql-test/t/func_str.test
mysql-test/t/func_time.test
mysql-test/t/grant.test
mysql-test/t/information_schema.test
mysql-test/t/innodb-autoinc.test
mysql-test/t/innodb.test
mysql-test/t/innodb_mysql.test
mysql-test/t/log_bin_trust_function_creators_func.test
mysql-test/t/log_state.test
mysql-test/t/myisam_data_pointer_size_func.test
mysql-test/t/myisampack.test
mysql-test/t/mysql.test
mysql-test/t/mysqlcheck.test
mysql-test/t/partition_innodb_stmt.test
mysql-test/t/partition_mgm.test
mysql-test/t/partition_pruning.test
mysql-test/t/query_cache.test
mysql-test/t/rpl_init_slave_func.test
mysql-test/t/select.test
mysql-test/t/status.test
mysql-test/t/strict.test
mysql-test/t/temp_table.test
mysql-test/t/type_bit.test
mysql-test/t/type_date.test
mysql-test/t/type_float.test
mysql-test/t/warnings_engine_disabled.test
mysql-test/t/xml.test
mysys/my_getopt.c
mysys/my_init.c
scripts/mysql_install_db.sh
sql-common/my_time.c
sql/field.cc
sql/field.h
sql/filesort.cc
sql/ha_partition.cc
sql/ha_partition.h
sql/item.cc
sql/item_cmpfunc.cc
sql/item_func.h
sql/item_strfunc.cc
sql/item_sum.cc
sql/item_timefunc.cc
sql/item_timefunc.h
sql/log.cc
sql/log.h
sql/log_event.cc
sql/log_event.h
sql/mysql_priv.h
sql/mysqld.cc
sql/opt_range.cc
sql/partition_info.cc
sql/repl_failsafe.cc
sql/rpl_constants.h
sql/set_var.cc
sql/slave.cc
sql/spatial.h
sql/sql_acl.cc
sql/sql_base.cc
sql/sql_binlog.cc
sql/sql_class.h
sql/sql_cursor.cc
sql/sql_delete.cc
sql/sql_lex.cc
sql/sql_lex.h
sql/sql_locale.cc
sql/sql_parse.cc
sql/sql_partition.cc
sql/sql_plugin.cc
sql/sql_plugin.h
sql/sql_profile.cc
sql/sql_repl.cc
sql/sql_select.cc
sql/sql_select.h
sql/sql_show.cc
sql/sql_table.cc
sql/sql_trigger.cc
sql/sql_trigger.h
sql/table.cc
sql/table.h
sql/unireg.cc
storage/csv/ha_tina.cc
storage/federated/ha_federated.cc
storage/heap/ha_heap.cc
storage/innobase/Makefile.am
storage/innobase/btr/btr0sea.c
storage/innobase/buf/buf0lru.c
storage/innobase/dict/dict0dict.c
storage/innobase/dict/dict0mem.c
storage/innobase/handler/ha_innodb.cc
storage/innobase/handler/ha_innodb.h
storage/innobase/include/btr0sea.h
storage/innobase/include/dict0dict.h
storage/innobase/include/dict0mem.h
storage/innobase/include/ha_prototypes.h
storage/innobase/include/lock0lock.h
storage/innobase/include/row0mysql.h
storage/innobase/include/sync0sync.ic
storage/innobase/include/ut0ut.h
storage/innobase/lock/lock0lock.c
storage/innobase/os/os0file.c
storage/innobase/plug.in
storage/innobase/row/row0mysql.c
storage/innobase/row/row0sel.c
storage/innobase/srv/srv0srv.c
storage/innobase/srv/srv0start.c
storage/innobase/ut/ut0ut.c
storage/myisam/ft_boolean_search.c
strings/ctype.c
strings/xml.c
tests/mysql_client_test.c
win/configure.js
mysql-test/suite/funcs_1/t/is_coll_char_set_appl.test
------------------------------------------------------------------------
r4165 | calvin | 2009-02-12 01:34:27 +0200 (Thu, 12 Feb 2009) | 1 line
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: minor non-functional changes.
------------------------------------------------------------------------
------------------------------------------------------------------------
r4263 | vasil | 2009-02-20 15:00:46 +0200 (Fri, 20 Feb 2009) | 4 lines
branches/zip:
Add a ChangeLog entry for a change in r4262.
------------------------------------------------------------------------
r4265 | marko | 2009-02-20 22:31:03 +0200 (Fri, 20 Feb 2009) | 5 lines
branches/zip: Make innodb_use_sys_malloc=ON the default.
Replace srv_use_sys_malloc with UNIV_LIKELY(srv_use_sys_malloc)
to improve branch prediction in the default case.
Approved by Ken over the IM.
------------------------------------------------------------------------
r4266 | vasil | 2009-02-20 23:29:32 +0200 (Fri, 20 Feb 2009) | 7 lines
branches/zip:
Add a sentence at the top of COPYING.Google to clarify that this license
does not apply to the whole InnoDB.
Suggested by: Ken
------------------------------------------------------------------------
r4268 | marko | 2009-02-23 12:43:51 +0200 (Mon, 23 Feb 2009) | 9 lines
branches/zip: Initialize ut_list_mutex at startup. Without this fix,
ut_list_mutex would be used uninitialized when innodb_use_sys_malloc=1.
This fix addresses Issue #181.
ut_mem_block_list_init(): Rename to ut_mem_init() and make public.
ut_malloc_low(), ut_free_all_mem(): Add ut_a(ut_mem_block_list_inited).
mem_init(): Call ut_mem_init().
------------------------------------------------------------------------
r4269 | marko | 2009-02-23 15:09:49 +0200 (Mon, 23 Feb 2009) | 7 lines
branches/zip: When freeing an uncompressed BLOB page, tolerate garbage in
FIL_PAGE_TYPE. (Bug #43043, Issue #182)
btr_check_blob_fil_page_type(): New function.
btr_free_externally_stored_field(), btr_copy_blob_prefix():
Call btr_check_blob_fil_page_type() to check FIL_PAGE_TYPE.
------------------------------------------------------------------------
r4272 | marko | 2009-02-23 23:10:18 +0200 (Mon, 23 Feb 2009) | 8 lines
branches/zip: Adjust the fix of Issue #182 in r4269 per Inaam's suggestion.
btr_check_blob_fil_page_type(): Replace the parameter
const char* op
with
ibool read. Do not print anything about page type mismatch
when reading a BLOB page in Antelope format.
Print space id before page number.
------------------------------------------------------------------------
r4273 | marko | 2009-02-24 00:11:11 +0200 (Tue, 24 Feb 2009) | 1 line
branches/zip: ut_mem_init(): Add the assertion !ut_mem_block_list_inited.
------------------------------------------------------------------------
r4274 | marko | 2009-02-24 00:14:38 +0200 (Tue, 24 Feb 2009) | 12 lines
branches/zip: Fix bugs in the fix of Issue #181. Tested inside and
outside Valgrind, with innodb_use_sys_malloc set to 0 and 1.
mem_init(): Invoke ut_mem_init() before mem_pool_create(), because
the latter one will invoke ut_malloc().
srv_general_init(): Do not initialize the memory subsystem (mem_init()).
innobase_init(): Initialize the memory subsystem (mem_init()) before
calling srv_parse_data_file_paths_and_sizes(), which needs ut_malloc().
Call ut_free_all_mem() in error handling to clean up after the mem_init().
------------------------------------------------------------------------
r4280 | marko | 2009-02-24 15:14:59 +0200 (Tue, 24 Feb 2009) | 1 line
branches/zip: Remove unused function os_mem_alloc_nocache().
------------------------------------------------------------------------
r4281 | marko | 2009-02-24 16:02:48 +0200 (Tue, 24 Feb 2009) | 1 line
branches/zip: Remove the unused function dict_index_get_type().
------------------------------------------------------------------------
r4283 | marko | 2009-02-24 23:06:56 +0200 (Tue, 24 Feb 2009) | 1 line
branches/zip: srv0start.c: Remove unnecessary #include "mem0pool.h".
------------------------------------------------------------------------
r4284 | marko | 2009-02-24 23:26:38 +0200 (Tue, 24 Feb 2009) | 1 line
branches/zip: mem0mem.c: Remove unnecessary #include "mach0data.h".
------------------------------------------------------------------------
r4288 | vasil | 2009-02-25 10:48:07 +0200 (Wed, 25 Feb 2009) | 21 lines
branches/zip: Merge revisions 4261:4287 from branches/5.1:
------------------------------------------------------------------------
r4287 | sunny | 2009-02-25 05:32:01 +0200 (Wed, 25 Feb 2009) | 10 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: Fix Bug#42714 AUTO_INCREMENT errors in 5.1.31. There are two
changes to the autoinc handling.
1. To fix the immediate problem from the bug report, we must ensure that the
value written to the table is always less than the max value stored in
dict_table_t.
2. The second related change is that according to MySQL documentation when
the offset is greater than the increment, we should ignore the offset.
------------------------------------------------------------------------
------------------------------------------------------------------------
r4289 | vasil | 2009-02-25 10:53:51 +0200 (Wed, 25 Feb 2009) | 4 lines
branches/zip:
Add ChangeLog entry for the fix in r4288.
------------------------------------------------------------------------
r4290 | vasil | 2009-02-25 11:05:44 +0200 (Wed, 25 Feb 2009) | 11 lines
branches/zip:
Make ChangeLog entries for bugs in bugs.mysql.com in the form:
Fix Bug#12345 bug title
(for bugs after 1.0.2 was released and the ChangeLog published)
There is no need to bloat the ChangeLog with information that is available
via bugs.mysql.com.
Discussed with: Marko
------------------------------------------------------------------------
r4291 | vasil | 2009-02-25 11:08:32 +0200 (Wed, 25 Feb 2009) | 4 lines
branches/zip:
Fix Bug synopsis and remove explanation
------------------------------------------------------------------------
r4292 | marko | 2009-02-25 12:09:15 +0200 (Wed, 25 Feb 2009) | 25 lines
branches/zip: Correct the initialization of the memory subsystem once
again, to finally put Issue #181 to rest.
Revert some parts of r4274. It is best not to call ut_malloc() before
srv_general_init().
mem_init(): Do not call ut_mem_init().
srv_general_init(): Initialize the memory subsystem in two phases:
first ut_mem_init(), then mem_init(). This is because os_sync_init()
and sync_init() depend on ut_mem_init() and mem_init() depends on
os_sync_init() or sync_init().
srv_parse_data_file_paths_and_sizes(),
srv_parse_log_group_home_dirs(): Remove the output parameters. Assign
to the global variables directly. Allocate memory with malloc()
instead of ut_malloc(), because these functions will be called before
srv_general_init().
srv_free_paths_and_sizes(): New function, for cleaning up after
srv_parse_data_file_paths_and_sizes() and
srv_parse_log_group_home_dirs().
rb://92 approved by Sunny Bains
------------------------------------------------------------------------
r4297 | vasil | 2009-02-25 17:19:19 +0200 (Wed, 25 Feb 2009) | 4 lines
branches/zip:
White-space cleanup in the ChangeLog
------------------------------------------------------------------------
r4301 | vasil | 2009-02-25 21:33:32 +0200 (Wed, 25 Feb 2009) | 5 lines
branches/zip:
Do not output the commands that restore the environment because they depend
on the state of the environment before the test starts executing.
------------------------------------------------------------------------
r4315 | vasil | 2009-02-26 09:21:20 +0200 (Thu, 26 Feb 2009) | 5 lines
branches/zip:
Apply any necessary patches to the mysql tree at the end of setup.sh
This step was previously done manually (and sometimes forgotten).
------------------------------------------------------------------------
r4319 | marko | 2009-02-26 23:27:51 +0200 (Thu, 26 Feb 2009) | 6 lines
branches/zip: btr_check_blob_fil_page_type(): Do not report
FIL_PAGE_TYPE mismatch even when purging a BLOB.
Heavy users may have large data files created with MySQL 5.0 or earlier,
and they don not want to have the error log flooded with such messages.
This fixes Issue #182.
------------------------------------------------------------------------
r4320 | inaam | 2009-02-27 02:13:19 +0200 (Fri, 27 Feb 2009) | 8 lines
branches/zip
This is to revert the changes made to the plug.in (r4251) as a fix for
issue# 178. Changes to plug.in will not propogate to a plugin
installation unless autotools are rerun which is unacceptable.
A fix for issue# 178 will be committed in a separate commit.
------------------------------------------------------------------------
r4321 | inaam | 2009-02-27 02:16:46 +0200 (Fri, 27 Feb 2009) | 6 lines
branches/zip
This is a fix for issue#178. Instead of using UNIV_LINUX which is
defined through CFLAGS we use compiler generated define __linux__
that is effective for both .c and .cc files.
------------------------------------------------------------------------
r4324 | vasil | 2009-02-27 13:27:18 +0200 (Fri, 27 Feb 2009) | 39 lines
branches/zip:
Add FreeBSD to the list of the operating systems that have
sizeof(pthread_t) == sizeof(void*) (i.e. word size).
On FreeBSD pthread_t is defined like:
/usr/include/sys/_pthreadtypes.h:
typedef struct pthread *pthread_t;
I did the following tests (per Inaam's recommendation):
a) appropriate version of GCC is available on that platform (4.1.2 or
higher for atomics to be available)
On FreeBSD 6.x the default compiler is 3.4.6, on FreeBSD 7.x the default
one is 4.2.1. One can always install the version of choice from the ports
collection. If gcc 3.x is used then HAVE_GCC_ATOMIC_BUILTINS will not be
defined and thus the change I am committing will make no difference.
b) find out if sizeof(pthread_t) == sizeof(long)
On 32 bit both are 4 bytes, on 64 bit both are 8 bytes.
c) find out the compiler generated platform define (e.g.: __aix, __sunos__
etc.)
The macro is __FreeBSD__.
d) patch univ.i with the appropriate platform define
e) build the mysql
f) ensure it is using atomic builtins (look at the err.log message at
system startup. It should say we are using atomics for both mutexes and
rw-locks)
g) do sanity testing (keeping in view the smp changes)
I ran the mysql-test suite. All tests pass.
------------------------------------------------------------------------
r4353 | vasil | 2009-03-05 09:27:29 +0200 (Thu, 05 Mar 2009) | 6 lines
branches/zip:
As suggested by Ken, print a message that says that the Google SMP patch
(GCC atomics) is disabled if it is. Also extend the message when the patch
is partially enabled to make it clear that it is partially enabled.
------------------------------------------------------------------------
r4356 | vasil | 2009-03-05 13:49:51 +0200 (Thu, 05 Mar 2009) | 4 lines
branches/zip:
Fix typo made in r4353.
------------------------------------------------------------------------
r4357 | vasil | 2009-03-05 16:38:59 +0200 (Thu, 05 Mar 2009) | 23 lines
branches/zip:
Implement a check whether pthread_t objects can be used by GCC atomic
builtin functions. This check is implemented in plug.in and defines the
macro HAVE_ATOMIC_PTHREAD_T. This macro is checked in univ.i and the
relevant part of the code enabled (the one that uses GCC atomics against
pthread_t objects).
In addition to this, the same program that is compiled as part of the
plug.in check is added in ut/ut0auxconf.c. In the InnoDB Plugin source
archives that are shipped to the users, a generated Makefile.in is added.
That Makefile.in will be modified to compile ut/ut0auxconf.c and define
the macro HAVE_ATOMIC_PTHREAD_T if the compilation succeeds. I.e.
Makefile.in will emulate the work that is done by plug.in. This is done in
order to make the check happen and HAVE_ATOMIC_PTHREAD_T eventually
defined without regenerating MySQL's ./configure from
./storage/innobase/plug.in. The point is not to ask users to install the
autotools and regenerate ./configure.
rb://95
Approved by: Marko
------------------------------------------------------------------------
r4360 | vasil | 2009-03-05 22:23:17 +0200 (Thu, 05 Mar 2009) | 21 lines
branches/zip: Merge revisions 4287:4357 from branches/5.1:
------------------------------------------------------------------------
r4325 | sunny | 2009-03-02 02:28:52 +0200 (Mon, 02 Mar 2009) | 10 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: Bug#43203: Overflow from auto incrementing causes server segv
It was not a SIGSEGV but an assertion failure. The assertion was checking
the invariant that *first_value passed in by MySQL doesn't contain a value
that is greater than the max value for that type. The assertion has been
changed to a check and if the value is greater than the max we report a
generic AUTOINC failure.
rb://93
Approved by Heikki
------------------------------------------------------------------------
------------------------------------------------------------------------
r4361 | vasil | 2009-03-05 22:27:54 +0200 (Thu, 05 Mar 2009) | 30 lines
branches/zip: Merge revision 4358 from branches/5.1 (resolving a conflict):
------------------------------------------------------------------------
r4358 | vasil | 2009-03-05 21:21:10 +0200 (Thu, 05 Mar 2009) | 21 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1:
Merge a change from MySQL:
------------------------------------------------------------
revno: 2728.19.1
committer: Alfranio Correia <alfranio.correia@sun.com>
branch nick: mysql-5.1-bugteam
timestamp: Tue 2009-02-03 11:36:46 +0000
message:
BUG#42445 Warning messages in innobase/handler/ha_innodb.cc
There was a type casting problem in the storage/innobase/handler/ha_innodb.cc,
(int ha_innobase::write_row(...)). Innobase uses has an internal error variable
of type 'ulint' while mysql uses an 'int'.
To fix the problem the function manipulates an error variable of
type 'ulint' and only casts it into 'int' when needs to return the value.
modified:
storage/innobase/handler/ha_innodb.cc
------------------------------------------------------------------------
------------------------------------------------------------------------
r4362 | vasil | 2009-03-05 22:29:07 +0200 (Thu, 05 Mar 2009) | 23 lines
branches/zip: Merge revision 4359 from branches/5.1:
------------------------------------------------------------------------
r4359 | vasil | 2009-03-05 21:42:01 +0200 (Thu, 05 Mar 2009) | 14 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1:
Merge a change from MySQL:
------------------------------------------------------------
revno: 2747
committer: Timothy Smith <timothy.smith@sun.com>
branch nick: 51
timestamp: Fri 2009-01-16 17:49:07 +0100
message:
Add another cast to ignore int/ulong difference in error types, silence warning on Win64
modified:
storage/innobase/handler/ha_innodb.cc
------------------------------------------------------------------------
------------------------------------------------------------------------
r4363 | vasil | 2009-03-05 22:31:37 +0200 (Thu, 05 Mar 2009) | 4 lines
branches/zip:
Add ChangeLog entry for the bugfix in c4360.
------------------------------------------------------------------------
r4378 | calvin | 2009-03-09 10:10:17 +0200 (Mon, 09 Mar 2009) | 7 lines
branches/zip: remove compile flag MYSQL_SERVER for dynamic plugin
The dynamic plugin on Windows used to be built with MYSQL_SERVER
compile flag, while it is not the case for other platforms.
r3797 assumed MYSQL_SERVER was not defined for dynamic plugin,
which introduced the engine crash during dropping a database.
------------------------------------------------------------------------
r4396 | marko | 2009-03-12 09:22:27 +0200 (Thu, 12 Mar 2009) | 3 lines
branches/zip: btr_store_big_rec_extern_fields(): Initialize FIL_PAGE_TYPE
in a separate redo log entry. This will make ibbackup --apply-log
debugging easier.
------------------------------------------------------------------------
r4397 | marko | 2009-03-12 09:26:11 +0200 (Thu, 12 Mar 2009) | 3 lines
branches/zip: trx_sys_create_doublewrite_buf(): As the dummy change,
initialize FIL_PAGE_TYPE. This will make it easier to write the debug
assertions for ibbackup --apply-log.
------------------------------------------------------------------------
r4401 | marko | 2009-03-12 10:26:40 +0200 (Thu, 12 Mar 2009) | 19 lines
branches/zip: Merge revisions 4359:4400 from branches/5.1:
------------------------------------------------------------------------
r4399 | marko | 2009-03-12 09:38:05 +0200 (Thu, 12 Mar 2009) | 2 lines
branches/5.1: row_sel_get_clust_rec_for_mysql(): Store the cursor position
also for unlock_row(). (Bug #39320)
------------------------------------------------------------------------
r4400 | marko | 2009-03-12 10:06:44 +0200 (Thu, 12 Mar 2009) | 5 lines
branches/5.1: Fix a bug in multi-table semi-consistent reads.
Remember the acquired record locks per table handle (row_prebuilt_t)
rather than per transaction (trx_t), so that unlock_row should successfully
unlock all non-matching rows in multi-table operations.
This deficiency was found while investigating Bug #39320.
------------------------------------------------------------------------
These were submitted as rb://94 and rb://96 and approved by Heikki Tuuri.
------------------------------------------------------------------------
r4455 | marko | 2009-03-16 11:43:34 +0200 (Mon, 16 Mar 2009) | 2 lines
branches/zip: UT_LIST_VALIDATE(): Add the parameter ASSERTION and
adjust all callers.
------------------------------------------------------------------------
r4456 | marko | 2009-03-16 12:59:25 +0200 (Mon, 16 Mar 2009) | 6 lines
branches/zip: UT_LIST_VALIDATE(): Assert that the link is non-NULL
before dereferencing it. In this way, ut_list_node_313 will be
pointing to the last non-NULL list item at the time of the assertion
failure. (gcc-4.3.2 -O3 seems to optimize the common subexpressions
and make the variable NULL, though.)
------------------------------------------------------------------------
r4457 | marko | 2009-03-16 14:12:02 +0200 (Mon, 16 Mar 2009) | 2 lines
branches/zip: sync_thread_add_level(): Make the assertions about
level == SYNC_BUF_BLOCK more readable.
------------------------------------------------------------------------
r4461 | vasil | 2009-03-17 09:38:19 +0200 (Tue, 17 Mar 2009) | 6 lines
branches/zip:
Remove mysql-test/patches/bug32625.diff because that bug was fixed in
the mysql repository (1 year and 4 months after sending them the simple
patch!). See http://bugs.mysql.com/32625
------------------------------------------------------------------------
r4465 | marko | 2009-03-17 12:34:19 +0200 (Tue, 17 Mar 2009) | 1 line
branches/zip: buf0buddy.c: Add and adjust some debug assertions.
------------------------------------------------------------------------
r4473 | vasil | 2009-03-17 15:50:30 +0200 (Tue, 17 Mar 2009) | 5 lines
branches/zip:
Increment the InnoDB Plugin version from 1.0.3 to 1.0.4 now that
1.0.3 has been released.
------------------------------------------------------------------------
r4478 | vasil | 2009-03-18 11:53:53 +0200 (Wed, 18 Mar 2009) | 5 lines
branches/zip:
Remove mysql-test/patches/bug41893.diff because that bug has been fixed
in the MySQL repository, see http://bugs.mysql.com/41893.
------------------------------------------------------------------------
r4479 | marko | 2009-03-18 12:43:54 +0200 (Wed, 18 Mar 2009) | 2 lines
branches/zip: buf_LRU_block_remove_hashed_page(): Add some debug assertions.
------------------------------------------------------------------------
r4480 | marko | 2009-03-18 14:32:13 +0200 (Wed, 18 Mar 2009) | 1 line
branches/zip: buf_buddy_free_low(): Correct the function comment.
------------------------------------------------------------------------
r4482 | marko | 2009-03-19 15:23:32 +0200 (Thu, 19 Mar 2009) | 12 lines
branches/zip: Merge revisions 4400:4481 from branches/5.1:
------------------------------------------------------------------------
r4481 | marko | 2009-03-19 15:01:48 +0200 (Thu, 19 Mar 2009) | 6 lines
branches/5.1: row_unlock_for_mysql(): Do not unlock records that were
modified by the current transaction. This bug was introduced or unmasked
in r4400.
rb://97 approved by Heikki Tuuri
------------------------------------------------------------------------
------------------------------------------------------------------------
r4490 | marko | 2009-03-20 12:33:33 +0200 (Fri, 20 Mar 2009) | 4 lines
branches/zip: Non-functional change for reducing dependencies in InnoDB Hot Backup:
Replace srv_sys->dummy_ind1 and srv_sys->dummy_ind2 with
dict_ind_redundant and dict_ind_compact, initialized in dict_init().
------------------------------------------------------------------------
r4491 | marko | 2009-03-20 12:45:18 +0200 (Fri, 20 Mar 2009) | 2 lines
branches/zip: Add const qualifiers or in/out comments to some function
parameters in log0log.
------------------------------------------------------------------------
r4492 | marko | 2009-03-20 12:52:14 +0200 (Fri, 20 Mar 2009) | 5 lines
branches/zip: page_validate(): Always report the space id and the
name of the index.
In Hot Backup, do not invoke comparison functions, as MySQL collations
will be unavailable.
------------------------------------------------------------------------
r4493 | marko | 2009-03-20 13:24:06 +0200 (Fri, 20 Mar 2009) | 1 line
branches/zip: Replace fil_get_space_for_id_low() with fil_space_get_by_id().
------------------------------------------------------------------------
r4494 | marko | 2009-03-20 13:51:35 +0200 (Fri, 20 Mar 2009) | 3 lines
branches/zip: fil0fil.c: Refer to fil_system directly, not via local vars.
This eliminates some "unused variable" warnings when building
InnoDB Hot Backup in such a way that all mutex operations are no-ops.
------------------------------------------------------------------------
r4495 | marko | 2009-03-20 14:15:52 +0200 (Fri, 20 Mar 2009) | 1 line
branches/zip: innobase_get_at_most_n_mbchars(): Declare in ha_prototypes.h.
------------------------------------------------------------------------
r4496 | marko | 2009-03-20 14:48:26 +0200 (Fri, 20 Mar 2009) | 1 line
branches/zip: recv_recover_page(): Remove compile-time constant parameters.
------------------------------------------------------------------------
r4497 | marko | 2009-03-20 14:56:19 +0200 (Fri, 20 Mar 2009) | 1 line
branches/zip: recv_sys_init(): Remove a compile-time constant parameter.
------------------------------------------------------------------------
r4498 | marko | 2009-03-20 15:08:05 +0200 (Fri, 20 Mar 2009) | 4 lines
branches/zip: Non-functional change: Add const qualifiers.
log_block_checksum_is_ok_or_old_format(), recv_sys_add_to_parsing_buf():
The log block is read-only. Make it const.
------------------------------------------------------------------------
r4499 | marko | 2009-03-20 15:10:25 +0200 (Fri, 20 Mar 2009) | 1 line
branches/zip: recv_scan_log_recs(): Remove a compile-time constant parameter.
------------------------------------------------------------------------
r4500 | marko | 2009-03-20 15:47:17 +0200 (Fri, 20 Mar 2009) | 1 line
branches/zip: fil_init(): Add the parameter hash_size.
------------------------------------------------------------------------
r4501 | vasil | 2009-03-20 16:50:41 +0200 (Fri, 20 Mar 2009) | 4 lines
branches/zip:
Add any entry about the release of 1.0.3 in the ChangeLog.
------------------------------------------------------------------------
r4515 | marko | 2009-03-23 10:49:53 +0200 (Mon, 23 Mar 2009) | 1 line
branches/zip: hash_table_t: adaptive: Remove from UNIV_HOTBACKUP builds.
------------------------------------------------------------------------
r4516 | marko | 2009-03-23 10:57:16 +0200 (Mon, 23 Mar 2009) | 2 lines
branches/zip: Define and use ASSERT_HASH_MUTEX_OWN.
Make it a no-op in UNIV_HOTBACKUP builds.
------------------------------------------------------------------------
r4517 | marko | 2009-03-23 11:07:20 +0200 (Mon, 23 Mar 2009) | 2 lines
branches/zip: Define and use PAGE_ZIP_MATCH.
In UNIV_HOTBACKUP builds, assume fixed allocation.
------------------------------------------------------------------------
r4521 | marko | 2009-03-23 12:05:47 +0200 (Mon, 23 Mar 2009) | 1 line
branches/zip: buf_page_print(): Clean up the code #ifdef UNIV_HOTBACKUP.
------------------------------------------------------------------------
r4522 | marko | 2009-03-23 12:20:50 +0200 (Mon, 23 Mar 2009) | 2 lines
branches/zip: Exclude some operating system interface code
from UNIV_HOTBACKUP builds.
------------------------------------------------------------------------
r4523 | marko | 2009-03-23 13:00:43 +0200 (Mon, 23 Mar 2009) | 2 lines
branches/zip: Remove the remaining references to hash_table_t::adapive
from UNIV_HOTBACKUP builds. This should have been done in r4515.
------------------------------------------------------------------------
r4524 | marko | 2009-03-23 14:05:18 +0200 (Mon, 23 Mar 2009) | 2 lines
branches/zip: Enclose recv_recovery_from_backup_on and
recv_recovery_from_backup_is_on() in #ifdef UNIV_LOG_ARCHIVE.
------------------------------------------------------------------------
r4525 | marko | 2009-03-23 14:57:45 +0200 (Mon, 23 Mar 2009) | 2 lines
branches/zip: recv_parse_or_apply_log_rec_body(): Add debug assertions
ensuring that FIL_PAGE_TYPE makes sense when applying log records.
------------------------------------------------------------------------
r4526 | marko | 2009-03-23 16:21:34 +0200 (Mon, 23 Mar 2009) | 2 lines
branches/zip: Remove unneeded definitions and dependencies
from UNIV_HOTBACKUP builds.
------------------------------------------------------------------------
r4527 | calvin | 2009-03-23 23:15:33 +0200 (Mon, 23 Mar 2009) | 5 lines
branches/zip: adjust build files on Windows
Adjust the patch positions based on the latest MySQL source.
Also add the patches to the .bat files for vs9.
------------------------------------------------------------------------
2009-03-24 08:32:21 +00:00
|
|
|
/*****************************************************************************
|
|
|
|
|
2017-02-14 14:34:44 +05:30
|
|
|
Copyright (c) 1996, 2017, Oracle and/or its affiliates. All Rights Reserved.
|
MDEV-29694 Remove the InnoDB change buffer
The purpose of the change buffer was to reduce random disk access,
which could be useful on rotational storage, but maybe less so on
solid-state storage.
When we wished to
(1) insert a record into a non-unique secondary index,
(2) delete-mark a secondary index record,
(3) delete a secondary index record as part of purge (but not ROLLBACK),
and the B-tree leaf page where the record belongs to is not in the buffer
pool, we inserted a record into the change buffer B-tree, indexed by
the page identifier. When the page was eventually read into the buffer
pool, we looked up the change buffer B-tree for any modifications to the
page, applied these upon the completion of the read operation. This
was called the insert buffer merge.
We remove the change buffer, because it has been the source of
various hard-to-reproduce corruption bugs, including those fixed in
commit 5b9ee8d8193a8c7a8ebdd35eedcadc3ae78e7fc1 and
commit 165564d3c33ae3d677d70644a83afcb744bdbf65 but not limited to them.
A downgrade will fail with a clear message starting with
commit db14eb16f9977453467ec4765f481bb2f71814ba (MDEV-30106).
buf_page_t::state: Merge IBUF_EXIST to UNFIXED and
WRITE_FIX_IBUF to WRITE_FIX.
buf_pool_t::watch[]: Remove.
trx_t: Move isolation_level, check_foreigns, check_unique_secondary,
bulk_insert into the same bit-field. The only purpose of
trx_t::check_unique_secondary is to enable bulk insert into an
empty table. It no longer enables insert buffering for UNIQUE INDEX.
btr_cur_t::thr: Remove. This field was originally needed for change
buffering. Later, its use was extended to cover SPATIAL INDEX.
Much of the time, rtr_info::thr holds this field. When it does not,
we will add parameters to SPATIAL INDEX specific functions.
ibuf_upgrade_needed(): Check if the change buffer needs to be updated.
ibuf_upgrade(): Merge and upgrade the change buffer after all redo log
has been applied. Free any pages consumed by the change buffer, and
zero out the change buffer root page to mark the upgrade completed,
and to prevent a downgrade to an earlier version.
dict_load_tablespaces(): Renamed from
dict_check_tablespaces_and_store_max_id(). This needs to be invoked
before ibuf_upgrade().
btr_cur_open_at_rnd_pos(): Specialize for use in persistent statistics.
The change buffer merge does not need this function anymore.
btr_page_alloc(): Renamed from btr_page_alloc_low(). We no longer
allocate any change buffer pages.
btr_cur_open_at_rnd_pos(): Specialize for use in persistent statistics.
The change buffer merge does not need this function anymore.
row_search_index_entry(), btr_lift_page_up(): Add a parameter thr
for the SPATIAL INDEX case.
rtr_page_split_and_insert(): Specialized from btr_page_split_and_insert().
rtr_root_raise_and_insert(): Specialized from btr_root_raise_and_insert().
Note: The support for upgrading from the MySQL 3.23 or MySQL 4.0
change buffer format that predates the MySQL 4.1 introduction of
the option innodb_file_per_table was removed in MySQL 5.6.5
as part of mysql/mysql-server@69b6241a79876ae98bb0c9dce7c8d8799d6ad273
and MariaDB 10.0.11 as part of 1d0f70c2f894b27e98773a282871d32802f67964.
In the tests innodb.log_upgrade and innodb.log_corruption, we create
valid (upgraded) change buffer pages.
Tested by: Matthias Leich
2023-01-11 17:59:36 +02:00
|
|
|
Copyright (c) 2016, 2023, MariaDB Corporation.
|
branches/innodb+: Merge revisions 4150:4528 from branches/zip:
------------------------------------------------------------------------
r4152 | marko | 2009-02-10 12:52:27 +0200 (Tue, 10 Feb 2009) | 12 lines
branches/zip: When innodb_use_sys_malloc is set, ignore
innodb_additional_mem_pool_size, because nothing will
be allocated from mem_comm_pool.
mem_pool_create(): Remove the assertion about size. The function will
work with any size. However, an assertion would fail in ut_malloc_low()
when size==0.
mem_init(): When srv_use_sys_malloc is set, pass size=1 to mem_pool_create().
mem0mem.c: Add #include "srv0srv.h" that is needed by mem0dbg.c.
------------------------------------------------------------------------
r4153 | vasil | 2009-02-10 22:58:17 +0200 (Tue, 10 Feb 2009) | 14 lines
branches/zip:
(followup to r4145) Non-functional change:
Change the os_atomic_increment() and os_compare_and_swap() functions
to macros to avoid artificial limitations on the types of those
functions' arguments. As a consequence typecasts from the source
code can be removed.
Also remove Google's copyright from os0sync.ic because that file no longer
contains code from Google.
Approved by: Marko (rb://88), also ok from Inaam via IM
------------------------------------------------------------------------
r4163 | marko | 2009-02-12 00:14:19 +0200 (Thu, 12 Feb 2009) | 4 lines
branches/zip: Make innodb_thread_concurrency=0 the default.
The old default was 8.
------------------------------------------------------------------------
r4169 | calvin | 2009-02-12 10:37:10 +0200 (Thu, 12 Feb 2009) | 3 lines
branches/zip: Adjust the result file of innodb_thread_concurrency_basic
test. The default value of innodb_thread_concurrency is changed to 0
(from 8) via r4163.
------------------------------------------------------------------------
r4174 | vasil | 2009-02-12 17:38:27 +0200 (Thu, 12 Feb 2009) | 4 lines
branches/zip:
Fix pathname of the file to patch.
------------------------------------------------------------------------
r4176 | vasil | 2009-02-13 10:06:31 +0200 (Fri, 13 Feb 2009) | 7 lines
branches/zip:
Fix the failing mysql-test partition_innodb, which failed only if run after
innodb_trx_weight (or other test that would leave LATEST DEADLOCK ERROR into
the output of SHOW ENGINE INNODB STATUS). Find further explanation for the
failure at the top of the added patch partition_innodb.diff.
------------------------------------------------------------------------
r4198 | vasil | 2009-02-17 09:06:07 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
Add the full text of the GPLv2 license into the root directory of the
plugin. In previous releases this file was copied from an external source
(https://svn.innodb.com/svn/plugin/trunk/support/COPYING) "manually" when
creating the source and binary archives. It is less confusing to have this
present in the root directory of the SVN branch.
------------------------------------------------------------------------
r4199 | vasil | 2009-02-17 09:11:58 +0200 (Tue, 17 Feb 2009) | 4 lines
branches/zip:
Add Google's license into COPYING.Google.
------------------------------------------------------------------------
r4200 | vasil | 2009-02-17 09:56:33 +0200 (Tue, 17 Feb 2009) | 11 lines
branches/zip:
To the files touched by the Google patch from c4144 (excluding
include/os0sync.ic because later we removed Google code from that file):
* Remove the Google license
* Remove old Innobase copyright lines
* Add a reference to the Google license and to the GPLv2 license at the top,
as recommended by the lawyers at Oracle Legal.
------------------------------------------------------------------------
r4201 | vasil | 2009-02-17 10:12:02 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 1/28]
------------------------------------------------------------------------
r4202 | vasil | 2009-02-17 10:15:06 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 2/28]
------------------------------------------------------------------------
r4203 | vasil | 2009-02-17 10:25:45 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 3/28]
------------------------------------------------------------------------
r4204 | vasil | 2009-02-17 10:55:41 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 4/28]
------------------------------------------------------------------------
r4205 | vasil | 2009-02-17 10:59:22 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 5/28]
------------------------------------------------------------------------
r4206 | vasil | 2009-02-17 11:02:27 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 6/28]
------------------------------------------------------------------------
r4207 | vasil | 2009-02-17 11:04:28 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 7/28]
------------------------------------------------------------------------
r4208 | vasil | 2009-02-17 11:06:49 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 8/28]
------------------------------------------------------------------------
r4209 | vasil | 2009-02-17 11:10:18 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 9/28]
------------------------------------------------------------------------
r4210 | vasil | 2009-02-17 11:12:41 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 10/28]
------------------------------------------------------------------------
r4211 | vasil | 2009-02-17 11:14:40 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 11/28]
------------------------------------------------------------------------
r4212 | vasil | 2009-02-17 11:18:35 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 12/28]
------------------------------------------------------------------------
r4213 | vasil | 2009-02-17 11:24:40 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 13/28]
------------------------------------------------------------------------
r4214 | vasil | 2009-02-17 11:27:31 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 13/28]
------------------------------------------------------------------------
r4215 | vasil | 2009-02-17 11:29:55 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 15/28]
------------------------------------------------------------------------
r4216 | vasil | 2009-02-17 11:33:38 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 16/28]
------------------------------------------------------------------------
r4217 | vasil | 2009-02-17 11:36:44 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 17/28]
------------------------------------------------------------------------
r4218 | vasil | 2009-02-17 11:39:11 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 18/28]
------------------------------------------------------------------------
r4219 | vasil | 2009-02-17 11:41:24 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 19/28]
------------------------------------------------------------------------
r4220 | vasil | 2009-02-17 11:43:50 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 20/28]
------------------------------------------------------------------------
r4221 | vasil | 2009-02-17 11:46:52 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 21/28]
------------------------------------------------------------------------
r4222 | vasil | 2009-02-17 11:50:12 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 22/28]
------------------------------------------------------------------------
r4223 | vasil | 2009-02-17 11:53:58 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 23/28]
------------------------------------------------------------------------
r4224 | vasil | 2009-02-17 12:01:41 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 24/28]
------------------------------------------------------------------------
r4225 | vasil | 2009-02-17 12:05:45 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 25/28]
------------------------------------------------------------------------
r4226 | vasil | 2009-02-17 12:09:16 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 26/28]
------------------------------------------------------------------------
r4227 | vasil | 2009-02-17 12:12:56 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 27/28]
------------------------------------------------------------------------
r4228 | vasil | 2009-02-17 12:14:04 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 28/28]
------------------------------------------------------------------------
r4229 | vasil | 2009-02-17 12:30:55 +0200 (Tue, 17 Feb 2009) | 4 lines
branches/zip:
Add the copyright notice to the non C files.
------------------------------------------------------------------------
r4231 | marko | 2009-02-17 14:26:53 +0200 (Tue, 17 Feb 2009) | 12 lines
Minor cleanup of the Google SMP patch.
sync_array_object_signalled(): Add a (void) cast to eliminate a gcc warning
about the return value of os_atomic_increment() being ignored.
rw_lock_create_func(): Properly indent the preprocessor directives.
rw_lock_x_lock_low(), rw_lock_x_lock_func_nowait(): Split lines correctly.
rw_lock_set_writer_id_and_recursion_flag(): Silence a Valgrind warning.
Do not mix statements and variable declarations.
------------------------------------------------------------------------
r4232 | marko | 2009-02-17 14:59:54 +0200 (Tue, 17 Feb 2009) | 3 lines
branches/zip: When assigning lock->recursive = FALSE, also flag
lock->writer_thread invalid, so that Valgrind will catch more errors.
This is related to Issue #175.
------------------------------------------------------------------------
r4242 | marko | 2009-02-18 17:01:09 +0200 (Wed, 18 Feb 2009) | 2 lines
branches/zip: UT_DBG_STOP: Use do{} while(0) to silence a g++-4.3.2 warning
about a while(0); statement. This should fix (part of) Issue #176.
------------------------------------------------------------------------
r4243 | marko | 2009-02-18 17:04:03 +0200 (Wed, 18 Feb 2009) | 3 lines
branches/zip: buf_buddy_get_slot(): Fix a gcc 4.3.2 warning
about an empty body of a "for" statement.
This fixes part of Issue #176.
------------------------------------------------------------------------
r4244 | marko | 2009-02-18 17:25:45 +0200 (Wed, 18 Feb 2009) | 11 lines
branches/zip: Protect ut_total_allocated_memory with ut_list_mutex.
Unprotected updates to ut_total_allocated_memory in
os_mem_alloc_large() and os_mem_free_large(), called during
fast index creation, may corrupt the variable and cause assertion failures.
Also, add UNIV_MEM_ALLOC() and UNIV_MEM_FREE() instrumentation around
os_mem_alloc_large() and os_mem_free_large(), so that Valgrind can
detect more errors.
rb://90 approved by Heikki Tuuri. This addresses Issue #177.
------------------------------------------------------------------------
r4248 | marko | 2009-02-19 11:52:39 +0200 (Thu, 19 Feb 2009) | 2 lines
branches/zip: page_zip_set_size(): Fix a g++ 4.3.2 warning
about an empty body in a "for" statement. This closes Issue #176.
------------------------------------------------------------------------
r4251 | inaam | 2009-02-19 15:46:27 +0200 (Thu, 19 Feb 2009) | 8 lines
branches/zip: Issue #178 rb://91
Change plug.in to have same CXXFLAGS as CFLAGS. This is to ensure that
both .c and .cc files get compiled with same flags. To fix the issue
where UNIV_LINUX was defined only in .c files.
Approved by: Marko
------------------------------------------------------------------------
r4258 | vasil | 2009-02-20 11:52:19 +0200 (Fri, 20 Feb 2009) | 7 lines
branches/zip:
Cleanup in ChangeLog:
* Wrap lines at 78 characters
* Changed files are listed alphabetically
* White-space cleanup
------------------------------------------------------------------------
r4259 | vasil | 2009-02-20 11:59:42 +0200 (Fri, 20 Feb 2009) | 6 lines
branches/zip:
ChangeLog: Remove include/os0sync.ic from the entry about the google patch,
this file was modified later to not include Google's code.
------------------------------------------------------------------------
r4262 | vasil | 2009-02-20 14:56:59 +0200 (Fri, 20 Feb 2009) | 373 lines
branches/zip:
Merge revisions 4035:4261 from branches/5.1:
------------------------------------------------------------------------
r4065 | sunny | 2009-01-29 16:01:36 +0200 (Thu, 29 Jan 2009) | 8 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: In the last round of AUTOINC cleanup we assumed that AUTOINC
is only defined for integer columns. This caused an assertion failure when
we checked for the maximum value of a column type. We now calculate the
max value for floating-point autoinc columns too.
Fix Bug#42400 - InnoDB autoinc code can't handle floating-point columns
rb://84 and Mantis issue://162
------------------------------------------------------------------------
r4111 | sunny | 2009-02-03 22:06:52 +0200 (Tue, 03 Feb 2009) | 2 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: Add the ULL suffix otherwise there is an overflow.
------------------------------------------------------------------------
r4128 | vasil | 2009-02-08 21:36:45 +0200 (Sun, 08 Feb 2009) | 18 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1:
Merge a change from MySQL:
------------------------------------------------------------
revno: 2709.20.31
committer: Timothy Smith <timothy.smith@sun.com>
branch nick: 51
timestamp: Fri 2008-12-19 01:28:51 +0100
message:
Disable part of innodb-autoinc.test, because the MySQL server asserts when
compiled --with-debug, due to bug 39828, "autoinc wraps around when offset and
increment > 1". This change should be reverted when that bug is fixed (and a
a few other minor changes to the test as described in comments).
modified:
mysql-test/r/innodb-autoinc.result
mysql-test/t/innodb-autoinc.test
------------------------------------------------------------------------
r4129 | vasil | 2009-02-08 21:54:25 +0200 (Sun, 08 Feb 2009) | 310 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1:
Merge a change from MySQL:
[looks like the changes to innodb-autoinc.test were made as part of
the following huge merge, but we are merging only changes to that file]
------------------------------------------------------------
revno: 2546.47.1
committer: Luis Soares <luis.soares@sun.com>
branch nick: 5.1-rpl
timestamp: Fri 2009-01-23 13:22:05 +0100
message:
merge: 5.1 -> 5.1-rpl
conflicts:
Text conflict in client/mysqltest.cc
Text conflict in mysql-test/include/wait_until_connected_again.inc
Text conflict in mysql-test/lib/mtr_report.pm
Text conflict in mysql-test/mysql-test-run.pl
Text conflict in mysql-test/r/events_bugs.result
Text conflict in mysql-test/r/log_state.result
Text conflict in mysql-test/r/myisam_data_pointer_size_func.result
Text conflict in mysql-test/r/mysqlcheck.result
Text conflict in mysql-test/r/query_cache.result
Text conflict in mysql-test/r/status.result
Text conflict in mysql-test/suite/binlog/r/binlog_index.result
Text conflict in mysql-test/suite/binlog/r/binlog_innodb.result
Text conflict in mysql-test/suite/rpl/r/rpl_packet.result
Text conflict in mysql-test/suite/rpl/t/rpl_packet.test
Text conflict in mysql-test/t/disabled.def
Text conflict in mysql-test/t/events_bugs.test
Text conflict in mysql-test/t/log_state.test
Text conflict in mysql-test/t/myisam_data_pointer_size_func.test
Text conflict in mysql-test/t/mysqlcheck.test
Text conflict in mysql-test/t/query_cache.test
Text conflict in mysql-test/t/rpl_init_slave_func.test
Text conflict in mysql-test/t/status.test
removed:
mysql-test/suite/parts/r/partition_bit_ndb.result
mysql-test/suite/parts/t/partition_bit_ndb.test
mysql-test/suite/parts/t/partition_sessions.test
mysql-test/suite/sys_vars/inc/tmp_table_size_basic.inc
mysql-test/suite/sys_vars/r/tmp_table_size_basic_32.result
mysql-test/suite/sys_vars/r/tmp_table_size_basic_64.result
mysql-test/suite/sys_vars/t/tmp_table_size_basic_32.test
mysql-test/suite/sys_vars/t/tmp_table_size_basic_64.test
mysql-test/t/log_bin_trust_function_creators_func-master.opt
mysql-test/t/rpl_init_slave_func-slave.opt
added:
mysql-test/include/check_events_off.inc
mysql-test/include/cleanup_fake_relay_log.inc
mysql-test/include/have_simple_parser.inc
mysql-test/include/no_running_event_scheduler.inc
mysql-test/include/no_running_events.inc
mysql-test/include/running_event_scheduler.inc
mysql-test/include/setup_fake_relay_log.inc
mysql-test/include/wait_condition_sp.inc
mysql-test/r/fulltext_plugin.result
mysql-test/r/have_simple_parser.require
mysql-test/r/innodb_bug38231.result
mysql-test/r/innodb_bug39438.result
mysql-test/r/innodb_mysql_rbk.result
mysql-test/r/partition_innodb_semi_consistent.result
mysql-test/r/query_cache_28249.result
mysql-test/r/status2.result
mysql-test/std_data/bug40482-bin.000001
mysql-test/suite/binlog/r/binlog_innodb_row.result
mysql-test/suite/binlog/t/binlog_innodb_row.test
mysql-test/suite/rpl/r/rpl_binlog_corruption.result
mysql-test/suite/rpl/t/rpl_binlog_corruption-master.opt
mysql-test/suite/rpl/t/rpl_binlog_corruption.test
mysql-test/suite/sys_vars/r/tmp_table_size_basic.result
mysql-test/suite/sys_vars/t/tmp_table_size_basic.test
mysql-test/t/fulltext_plugin-master.opt
mysql-test/t/fulltext_plugin.test
mysql-test/t/innodb_bug38231.test
mysql-test/t/innodb_bug39438-master.opt
mysql-test/t/innodb_bug39438.test
mysql-test/t/innodb_mysql_rbk-master.opt
mysql-test/t/innodb_mysql_rbk.test
mysql-test/t/partition_innodb_semi_consistent-master.opt
mysql-test/t/partition_innodb_semi_consistent.test
mysql-test/t/query_cache_28249.test
mysql-test/t/status2.test
renamed:
mysql-test/suite/funcs_1/r/is_collation_character_set_applicability.result => mysql-test/suite/funcs_1/r/is_coll_char_set_appl.result
mysql-test/suite/funcs_1/t/is_collation_character_set_applicability.test => mysql-test/suite/funcs_1/t/is_coll_char_set_appl.test
modified:
.bzr-mysql/default.conf
CMakeLists.txt
client/mysql.cc
client/mysql_upgrade.c
client/mysqlcheck.c
client/mysqltest.cc
configure.in
extra/resolve_stack_dump.c
extra/yassl/include/openssl/ssl.h
include/config-win.h
include/m_ctype.h
include/my_global.h
mysql-test/extra/binlog_tests/database.test
mysql-test/extra/rpl_tests/rpl_auto_increment.test
mysql-test/include/commit.inc
mysql-test/include/have_32bit.inc
mysql-test/include/have_64bit.inc
mysql-test/include/index_merge1.inc
mysql-test/include/linux_sys_vars.inc
mysql-test/include/windows_sys_vars.inc
mysql-test/lib/mtr_report.pm
mysql-test/mysql-test-run.pl
mysql-test/r/alter_table.result
mysql-test/r/commit_1innodb.result
mysql-test/r/create.result
mysql-test/r/csv.result
mysql-test/r/ctype_ucs.result
mysql-test/r/date_formats.result
mysql-test/r/events_bugs.result
mysql-test/r/events_scheduling.result
mysql-test/r/fulltext.result
mysql-test/r/func_if.result
mysql-test/r/func_in.result
mysql-test/r/func_str.result
mysql-test/r/func_time.result
mysql-test/r/grant.result
mysql-test/r/index_merge_myisam.result
mysql-test/r/information_schema.result
mysql-test/r/innodb-autoinc.result
mysql-test/r/innodb.result
mysql-test/r/innodb_mysql.result
mysql-test/r/log_bin_trust_function_creators_func.result
mysql-test/r/log_state.result
mysql-test/r/myisampack.result
mysql-test/r/mysql.result
mysql-test/r/mysqlcheck.result
mysql-test/r/partition_datatype.result
mysql-test/r/partition_mgm.result
mysql-test/r/partition_pruning.result
mysql-test/r/query_cache.result
mysql-test/r/read_buffer_size_basic.result
mysql-test/r/read_rnd_buffer_size_basic.result
mysql-test/r/rpl_init_slave_func.result
mysql-test/r/select.result
mysql-test/r/status.result
mysql-test/r/strict.result
mysql-test/r/temp_table.result
mysql-test/r/type_bit.result
mysql-test/r/type_date.result
mysql-test/r/type_float.result
mysql-test/r/warnings_engine_disabled.result
mysql-test/r/xml.result
mysql-test/suite/binlog/r/binlog_database.result
mysql-test/suite/binlog/r/binlog_index.result
mysql-test/suite/binlog/r/binlog_innodb.result
mysql-test/suite/binlog/r/binlog_row_mix_innodb_myisam.result
mysql-test/suite/binlog/t/binlog_innodb.test
mysql-test/suite/funcs_1/r/is_columns_is.result
mysql-test/suite/funcs_1/r/is_engines.result
mysql-test/suite/funcs_1/r/storedproc.result
mysql-test/suite/funcs_1/storedproc/param_check.inc
mysql-test/suite/funcs_2/t/disabled.def
mysql-test/suite/ndb/t/disabled.def
mysql-test/suite/parts/r/partition_bit_innodb.result
mysql-test/suite/parts/r/partition_bit_myisam.result
mysql-test/suite/parts/r/partition_special_innodb.result
mysql-test/suite/parts/t/disabled.def
mysql-test/suite/parts/t/partition_special_innodb.test
mysql-test/suite/parts/t/partition_value_innodb.test
mysql-test/suite/parts/t/partition_value_myisam.test
mysql-test/suite/parts/t/partition_value_ndb.test
mysql-test/suite/rpl/r/rpl_auto_increment.result
mysql-test/suite/rpl/r/rpl_packet.result
mysql-test/suite/rpl/r/rpl_row_create_table.result
mysql-test/suite/rpl/r/rpl_slave_skip.result
mysql-test/suite/rpl/r/rpl_trigger.result
mysql-test/suite/rpl/t/disabled.def
mysql-test/suite/rpl/t/rpl_packet.test
mysql-test/suite/rpl/t/rpl_row_create_table.test
mysql-test/suite/rpl/t/rpl_slave_skip.test
mysql-test/suite/rpl/t/rpl_trigger.test
mysql-test/suite/rpl_ndb/t/disabled.def
mysql-test/suite/sys_vars/inc/key_buffer_size_basic.inc
mysql-test/suite/sys_vars/inc/sort_buffer_size_basic.inc
mysql-test/suite/sys_vars/r/key_buffer_size_basic_32.result
mysql-test/suite/sys_vars/r/key_buffer_size_basic_64.result
mysql-test/suite/sys_vars/r/sort_buffer_size_basic_32.result
mysql-test/suite/sys_vars/r/sort_buffer_size_basic_64.result
mysql-test/t/alter_table.test
mysql-test/t/create.test
mysql-test/t/csv.test
mysql-test/t/ctype_ucs.test
mysql-test/t/date_formats.test
mysql-test/t/disabled.def
mysql-test/t/events_bugs.test
mysql-test/t/events_scheduling.test
mysql-test/t/fulltext.test
mysql-test/t/func_if.test
mysql-test/t/func_in.test
mysql-test/t/func_str.test
mysql-test/t/func_time.test
mysql-test/t/grant.test
mysql-test/t/information_schema.test
mysql-test/t/innodb-autoinc.test
mysql-test/t/innodb.test
mysql-test/t/innodb_mysql.test
mysql-test/t/log_bin_trust_function_creators_func.test
mysql-test/t/log_state.test
mysql-test/t/myisam_data_pointer_size_func.test
mysql-test/t/myisampack.test
mysql-test/t/mysql.test
mysql-test/t/mysqlcheck.test
mysql-test/t/partition_innodb_stmt.test
mysql-test/t/partition_mgm.test
mysql-test/t/partition_pruning.test
mysql-test/t/query_cache.test
mysql-test/t/rpl_init_slave_func.test
mysql-test/t/select.test
mysql-test/t/status.test
mysql-test/t/strict.test
mysql-test/t/temp_table.test
mysql-test/t/type_bit.test
mysql-test/t/type_date.test
mysql-test/t/type_float.test
mysql-test/t/warnings_engine_disabled.test
mysql-test/t/xml.test
mysys/my_getopt.c
mysys/my_init.c
scripts/mysql_install_db.sh
sql-common/my_time.c
sql/field.cc
sql/field.h
sql/filesort.cc
sql/ha_partition.cc
sql/ha_partition.h
sql/item.cc
sql/item_cmpfunc.cc
sql/item_func.h
sql/item_strfunc.cc
sql/item_sum.cc
sql/item_timefunc.cc
sql/item_timefunc.h
sql/log.cc
sql/log.h
sql/log_event.cc
sql/log_event.h
sql/mysql_priv.h
sql/mysqld.cc
sql/opt_range.cc
sql/partition_info.cc
sql/repl_failsafe.cc
sql/rpl_constants.h
sql/set_var.cc
sql/slave.cc
sql/spatial.h
sql/sql_acl.cc
sql/sql_base.cc
sql/sql_binlog.cc
sql/sql_class.h
sql/sql_cursor.cc
sql/sql_delete.cc
sql/sql_lex.cc
sql/sql_lex.h
sql/sql_locale.cc
sql/sql_parse.cc
sql/sql_partition.cc
sql/sql_plugin.cc
sql/sql_plugin.h
sql/sql_profile.cc
sql/sql_repl.cc
sql/sql_select.cc
sql/sql_select.h
sql/sql_show.cc
sql/sql_table.cc
sql/sql_trigger.cc
sql/sql_trigger.h
sql/table.cc
sql/table.h
sql/unireg.cc
storage/csv/ha_tina.cc
storage/federated/ha_federated.cc
storage/heap/ha_heap.cc
storage/innobase/Makefile.am
storage/innobase/btr/btr0sea.c
storage/innobase/buf/buf0lru.c
storage/innobase/dict/dict0dict.c
storage/innobase/dict/dict0mem.c
storage/innobase/handler/ha_innodb.cc
storage/innobase/handler/ha_innodb.h
storage/innobase/include/btr0sea.h
storage/innobase/include/dict0dict.h
storage/innobase/include/dict0mem.h
storage/innobase/include/ha_prototypes.h
storage/innobase/include/lock0lock.h
storage/innobase/include/row0mysql.h
storage/innobase/include/sync0sync.ic
storage/innobase/include/ut0ut.h
storage/innobase/lock/lock0lock.c
storage/innobase/os/os0file.c
storage/innobase/plug.in
storage/innobase/row/row0mysql.c
storage/innobase/row/row0sel.c
storage/innobase/srv/srv0srv.c
storage/innobase/srv/srv0start.c
storage/innobase/ut/ut0ut.c
storage/myisam/ft_boolean_search.c
strings/ctype.c
strings/xml.c
tests/mysql_client_test.c
win/configure.js
mysql-test/suite/funcs_1/t/is_coll_char_set_appl.test
------------------------------------------------------------------------
r4165 | calvin | 2009-02-12 01:34:27 +0200 (Thu, 12 Feb 2009) | 1 line
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: minor non-functional changes.
------------------------------------------------------------------------
------------------------------------------------------------------------
r4263 | vasil | 2009-02-20 15:00:46 +0200 (Fri, 20 Feb 2009) | 4 lines
branches/zip:
Add a ChangeLog entry for a change in r4262.
------------------------------------------------------------------------
r4265 | marko | 2009-02-20 22:31:03 +0200 (Fri, 20 Feb 2009) | 5 lines
branches/zip: Make innodb_use_sys_malloc=ON the default.
Replace srv_use_sys_malloc with UNIV_LIKELY(srv_use_sys_malloc)
to improve branch prediction in the default case.
Approved by Ken over the IM.
------------------------------------------------------------------------
r4266 | vasil | 2009-02-20 23:29:32 +0200 (Fri, 20 Feb 2009) | 7 lines
branches/zip:
Add a sentence at the top of COPYING.Google to clarify that this license
does not apply to the whole InnoDB.
Suggested by: Ken
------------------------------------------------------------------------
r4268 | marko | 2009-02-23 12:43:51 +0200 (Mon, 23 Feb 2009) | 9 lines
branches/zip: Initialize ut_list_mutex at startup. Without this fix,
ut_list_mutex would be used uninitialized when innodb_use_sys_malloc=1.
This fix addresses Issue #181.
ut_mem_block_list_init(): Rename to ut_mem_init() and make public.
ut_malloc_low(), ut_free_all_mem(): Add ut_a(ut_mem_block_list_inited).
mem_init(): Call ut_mem_init().
------------------------------------------------------------------------
r4269 | marko | 2009-02-23 15:09:49 +0200 (Mon, 23 Feb 2009) | 7 lines
branches/zip: When freeing an uncompressed BLOB page, tolerate garbage in
FIL_PAGE_TYPE. (Bug #43043, Issue #182)
btr_check_blob_fil_page_type(): New function.
btr_free_externally_stored_field(), btr_copy_blob_prefix():
Call btr_check_blob_fil_page_type() to check FIL_PAGE_TYPE.
------------------------------------------------------------------------
r4272 | marko | 2009-02-23 23:10:18 +0200 (Mon, 23 Feb 2009) | 8 lines
branches/zip: Adjust the fix of Issue #182 in r4269 per Inaam's suggestion.
btr_check_blob_fil_page_type(): Replace the parameter
const char* op
with
ibool read. Do not print anything about page type mismatch
when reading a BLOB page in Antelope format.
Print space id before page number.
------------------------------------------------------------------------
r4273 | marko | 2009-02-24 00:11:11 +0200 (Tue, 24 Feb 2009) | 1 line
branches/zip: ut_mem_init(): Add the assertion !ut_mem_block_list_inited.
------------------------------------------------------------------------
r4274 | marko | 2009-02-24 00:14:38 +0200 (Tue, 24 Feb 2009) | 12 lines
branches/zip: Fix bugs in the fix of Issue #181. Tested inside and
outside Valgrind, with innodb_use_sys_malloc set to 0 and 1.
mem_init(): Invoke ut_mem_init() before mem_pool_create(), because
the latter one will invoke ut_malloc().
srv_general_init(): Do not initialize the memory subsystem (mem_init()).
innobase_init(): Initialize the memory subsystem (mem_init()) before
calling srv_parse_data_file_paths_and_sizes(), which needs ut_malloc().
Call ut_free_all_mem() in error handling to clean up after the mem_init().
------------------------------------------------------------------------
r4280 | marko | 2009-02-24 15:14:59 +0200 (Tue, 24 Feb 2009) | 1 line
branches/zip: Remove unused function os_mem_alloc_nocache().
------------------------------------------------------------------------
r4281 | marko | 2009-02-24 16:02:48 +0200 (Tue, 24 Feb 2009) | 1 line
branches/zip: Remove the unused function dict_index_get_type().
------------------------------------------------------------------------
r4283 | marko | 2009-02-24 23:06:56 +0200 (Tue, 24 Feb 2009) | 1 line
branches/zip: srv0start.c: Remove unnecessary #include "mem0pool.h".
------------------------------------------------------------------------
r4284 | marko | 2009-02-24 23:26:38 +0200 (Tue, 24 Feb 2009) | 1 line
branches/zip: mem0mem.c: Remove unnecessary #include "mach0data.h".
------------------------------------------------------------------------
r4288 | vasil | 2009-02-25 10:48:07 +0200 (Wed, 25 Feb 2009) | 21 lines
branches/zip: Merge revisions 4261:4287 from branches/5.1:
------------------------------------------------------------------------
r4287 | sunny | 2009-02-25 05:32:01 +0200 (Wed, 25 Feb 2009) | 10 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: Fix Bug#42714 AUTO_INCREMENT errors in 5.1.31. There are two
changes to the autoinc handling.
1. To fix the immediate problem from the bug report, we must ensure that the
value written to the table is always less than the max value stored in
dict_table_t.
2. The second related change is that according to MySQL documentation when
the offset is greater than the increment, we should ignore the offset.
------------------------------------------------------------------------
------------------------------------------------------------------------
r4289 | vasil | 2009-02-25 10:53:51 +0200 (Wed, 25 Feb 2009) | 4 lines
branches/zip:
Add ChangeLog entry for the fix in r4288.
------------------------------------------------------------------------
r4290 | vasil | 2009-02-25 11:05:44 +0200 (Wed, 25 Feb 2009) | 11 lines
branches/zip:
Make ChangeLog entries for bugs in bugs.mysql.com in the form:
Fix Bug#12345 bug title
(for bugs after 1.0.2 was released and the ChangeLog published)
There is no need to bloat the ChangeLog with information that is available
via bugs.mysql.com.
Discussed with: Marko
------------------------------------------------------------------------
r4291 | vasil | 2009-02-25 11:08:32 +0200 (Wed, 25 Feb 2009) | 4 lines
branches/zip:
Fix Bug synopsis and remove explanation
------------------------------------------------------------------------
r4292 | marko | 2009-02-25 12:09:15 +0200 (Wed, 25 Feb 2009) | 25 lines
branches/zip: Correct the initialization of the memory subsystem once
again, to finally put Issue #181 to rest.
Revert some parts of r4274. It is best not to call ut_malloc() before
srv_general_init().
mem_init(): Do not call ut_mem_init().
srv_general_init(): Initialize the memory subsystem in two phases:
first ut_mem_init(), then mem_init(). This is because os_sync_init()
and sync_init() depend on ut_mem_init() and mem_init() depends on
os_sync_init() or sync_init().
srv_parse_data_file_paths_and_sizes(),
srv_parse_log_group_home_dirs(): Remove the output parameters. Assign
to the global variables directly. Allocate memory with malloc()
instead of ut_malloc(), because these functions will be called before
srv_general_init().
srv_free_paths_and_sizes(): New function, for cleaning up after
srv_parse_data_file_paths_and_sizes() and
srv_parse_log_group_home_dirs().
rb://92 approved by Sunny Bains
------------------------------------------------------------------------
r4297 | vasil | 2009-02-25 17:19:19 +0200 (Wed, 25 Feb 2009) | 4 lines
branches/zip:
White-space cleanup in the ChangeLog
------------------------------------------------------------------------
r4301 | vasil | 2009-02-25 21:33:32 +0200 (Wed, 25 Feb 2009) | 5 lines
branches/zip:
Do not output the commands that restore the environment because they depend
on the state of the environment before the test starts executing.
------------------------------------------------------------------------
r4315 | vasil | 2009-02-26 09:21:20 +0200 (Thu, 26 Feb 2009) | 5 lines
branches/zip:
Apply any necessary patches to the mysql tree at the end of setup.sh
This step was previously done manually (and sometimes forgotten).
------------------------------------------------------------------------
r4319 | marko | 2009-02-26 23:27:51 +0200 (Thu, 26 Feb 2009) | 6 lines
branches/zip: btr_check_blob_fil_page_type(): Do not report
FIL_PAGE_TYPE mismatch even when purging a BLOB.
Heavy users may have large data files created with MySQL 5.0 or earlier,
and they don not want to have the error log flooded with such messages.
This fixes Issue #182.
------------------------------------------------------------------------
r4320 | inaam | 2009-02-27 02:13:19 +0200 (Fri, 27 Feb 2009) | 8 lines
branches/zip
This is to revert the changes made to the plug.in (r4251) as a fix for
issue# 178. Changes to plug.in will not propogate to a plugin
installation unless autotools are rerun which is unacceptable.
A fix for issue# 178 will be committed in a separate commit.
------------------------------------------------------------------------
r4321 | inaam | 2009-02-27 02:16:46 +0200 (Fri, 27 Feb 2009) | 6 lines
branches/zip
This is a fix for issue#178. Instead of using UNIV_LINUX which is
defined through CFLAGS we use compiler generated define __linux__
that is effective for both .c and .cc files.
------------------------------------------------------------------------
r4324 | vasil | 2009-02-27 13:27:18 +0200 (Fri, 27 Feb 2009) | 39 lines
branches/zip:
Add FreeBSD to the list of the operating systems that have
sizeof(pthread_t) == sizeof(void*) (i.e. word size).
On FreeBSD pthread_t is defined like:
/usr/include/sys/_pthreadtypes.h:
typedef struct pthread *pthread_t;
I did the following tests (per Inaam's recommendation):
a) appropriate version of GCC is available on that platform (4.1.2 or
higher for atomics to be available)
On FreeBSD 6.x the default compiler is 3.4.6, on FreeBSD 7.x the default
one is 4.2.1. One can always install the version of choice from the ports
collection. If gcc 3.x is used then HAVE_GCC_ATOMIC_BUILTINS will not be
defined and thus the change I am committing will make no difference.
b) find out if sizeof(pthread_t) == sizeof(long)
On 32 bit both are 4 bytes, on 64 bit both are 8 bytes.
c) find out the compiler generated platform define (e.g.: __aix, __sunos__
etc.)
The macro is __FreeBSD__.
d) patch univ.i with the appropriate platform define
e) build the mysql
f) ensure it is using atomic builtins (look at the err.log message at
system startup. It should say we are using atomics for both mutexes and
rw-locks)
g) do sanity testing (keeping in view the smp changes)
I ran the mysql-test suite. All tests pass.
------------------------------------------------------------------------
r4353 | vasil | 2009-03-05 09:27:29 +0200 (Thu, 05 Mar 2009) | 6 lines
branches/zip:
As suggested by Ken, print a message that says that the Google SMP patch
(GCC atomics) is disabled if it is. Also extend the message when the patch
is partially enabled to make it clear that it is partially enabled.
------------------------------------------------------------------------
r4356 | vasil | 2009-03-05 13:49:51 +0200 (Thu, 05 Mar 2009) | 4 lines
branches/zip:
Fix typo made in r4353.
------------------------------------------------------------------------
r4357 | vasil | 2009-03-05 16:38:59 +0200 (Thu, 05 Mar 2009) | 23 lines
branches/zip:
Implement a check whether pthread_t objects can be used by GCC atomic
builtin functions. This check is implemented in plug.in and defines the
macro HAVE_ATOMIC_PTHREAD_T. This macro is checked in univ.i and the
relevant part of the code enabled (the one that uses GCC atomics against
pthread_t objects).
In addition to this, the same program that is compiled as part of the
plug.in check is added in ut/ut0auxconf.c. In the InnoDB Plugin source
archives that are shipped to the users, a generated Makefile.in is added.
That Makefile.in will be modified to compile ut/ut0auxconf.c and define
the macro HAVE_ATOMIC_PTHREAD_T if the compilation succeeds. I.e.
Makefile.in will emulate the work that is done by plug.in. This is done in
order to make the check happen and HAVE_ATOMIC_PTHREAD_T eventually
defined without regenerating MySQL's ./configure from
./storage/innobase/plug.in. The point is not to ask users to install the
autotools and regenerate ./configure.
rb://95
Approved by: Marko
------------------------------------------------------------------------
r4360 | vasil | 2009-03-05 22:23:17 +0200 (Thu, 05 Mar 2009) | 21 lines
branches/zip: Merge revisions 4287:4357 from branches/5.1:
------------------------------------------------------------------------
r4325 | sunny | 2009-03-02 02:28:52 +0200 (Mon, 02 Mar 2009) | 10 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: Bug#43203: Overflow from auto incrementing causes server segv
It was not a SIGSEGV but an assertion failure. The assertion was checking
the invariant that *first_value passed in by MySQL doesn't contain a value
that is greater than the max value for that type. The assertion has been
changed to a check and if the value is greater than the max we report a
generic AUTOINC failure.
rb://93
Approved by Heikki
------------------------------------------------------------------------
------------------------------------------------------------------------
r4361 | vasil | 2009-03-05 22:27:54 +0200 (Thu, 05 Mar 2009) | 30 lines
branches/zip: Merge revision 4358 from branches/5.1 (resolving a conflict):
------------------------------------------------------------------------
r4358 | vasil | 2009-03-05 21:21:10 +0200 (Thu, 05 Mar 2009) | 21 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1:
Merge a change from MySQL:
------------------------------------------------------------
revno: 2728.19.1
committer: Alfranio Correia <alfranio.correia@sun.com>
branch nick: mysql-5.1-bugteam
timestamp: Tue 2009-02-03 11:36:46 +0000
message:
BUG#42445 Warning messages in innobase/handler/ha_innodb.cc
There was a type casting problem in the storage/innobase/handler/ha_innodb.cc,
(int ha_innobase::write_row(...)). Innobase uses has an internal error variable
of type 'ulint' while mysql uses an 'int'.
To fix the problem the function manipulates an error variable of
type 'ulint' and only casts it into 'int' when needs to return the value.
modified:
storage/innobase/handler/ha_innodb.cc
------------------------------------------------------------------------
------------------------------------------------------------------------
r4362 | vasil | 2009-03-05 22:29:07 +0200 (Thu, 05 Mar 2009) | 23 lines
branches/zip: Merge revision 4359 from branches/5.1:
------------------------------------------------------------------------
r4359 | vasil | 2009-03-05 21:42:01 +0200 (Thu, 05 Mar 2009) | 14 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1:
Merge a change from MySQL:
------------------------------------------------------------
revno: 2747
committer: Timothy Smith <timothy.smith@sun.com>
branch nick: 51
timestamp: Fri 2009-01-16 17:49:07 +0100
message:
Add another cast to ignore int/ulong difference in error types, silence warning on Win64
modified:
storage/innobase/handler/ha_innodb.cc
------------------------------------------------------------------------
------------------------------------------------------------------------
r4363 | vasil | 2009-03-05 22:31:37 +0200 (Thu, 05 Mar 2009) | 4 lines
branches/zip:
Add ChangeLog entry for the bugfix in c4360.
------------------------------------------------------------------------
r4378 | calvin | 2009-03-09 10:10:17 +0200 (Mon, 09 Mar 2009) | 7 lines
branches/zip: remove compile flag MYSQL_SERVER for dynamic plugin
The dynamic plugin on Windows used to be built with MYSQL_SERVER
compile flag, while it is not the case for other platforms.
r3797 assumed MYSQL_SERVER was not defined for dynamic plugin,
which introduced the engine crash during dropping a database.
------------------------------------------------------------------------
r4396 | marko | 2009-03-12 09:22:27 +0200 (Thu, 12 Mar 2009) | 3 lines
branches/zip: btr_store_big_rec_extern_fields(): Initialize FIL_PAGE_TYPE
in a separate redo log entry. This will make ibbackup --apply-log
debugging easier.
------------------------------------------------------------------------
r4397 | marko | 2009-03-12 09:26:11 +0200 (Thu, 12 Mar 2009) | 3 lines
branches/zip: trx_sys_create_doublewrite_buf(): As the dummy change,
initialize FIL_PAGE_TYPE. This will make it easier to write the debug
assertions for ibbackup --apply-log.
------------------------------------------------------------------------
r4401 | marko | 2009-03-12 10:26:40 +0200 (Thu, 12 Mar 2009) | 19 lines
branches/zip: Merge revisions 4359:4400 from branches/5.1:
------------------------------------------------------------------------
r4399 | marko | 2009-03-12 09:38:05 +0200 (Thu, 12 Mar 2009) | 2 lines
branches/5.1: row_sel_get_clust_rec_for_mysql(): Store the cursor position
also for unlock_row(). (Bug #39320)
------------------------------------------------------------------------
r4400 | marko | 2009-03-12 10:06:44 +0200 (Thu, 12 Mar 2009) | 5 lines
branches/5.1: Fix a bug in multi-table semi-consistent reads.
Remember the acquired record locks per table handle (row_prebuilt_t)
rather than per transaction (trx_t), so that unlock_row should successfully
unlock all non-matching rows in multi-table operations.
This deficiency was found while investigating Bug #39320.
------------------------------------------------------------------------
These were submitted as rb://94 and rb://96 and approved by Heikki Tuuri.
------------------------------------------------------------------------
r4455 | marko | 2009-03-16 11:43:34 +0200 (Mon, 16 Mar 2009) | 2 lines
branches/zip: UT_LIST_VALIDATE(): Add the parameter ASSERTION and
adjust all callers.
------------------------------------------------------------------------
r4456 | marko | 2009-03-16 12:59:25 +0200 (Mon, 16 Mar 2009) | 6 lines
branches/zip: UT_LIST_VALIDATE(): Assert that the link is non-NULL
before dereferencing it. In this way, ut_list_node_313 will be
pointing to the last non-NULL list item at the time of the assertion
failure. (gcc-4.3.2 -O3 seems to optimize the common subexpressions
and make the variable NULL, though.)
------------------------------------------------------------------------
r4457 | marko | 2009-03-16 14:12:02 +0200 (Mon, 16 Mar 2009) | 2 lines
branches/zip: sync_thread_add_level(): Make the assertions about
level == SYNC_BUF_BLOCK more readable.
------------------------------------------------------------------------
r4461 | vasil | 2009-03-17 09:38:19 +0200 (Tue, 17 Mar 2009) | 6 lines
branches/zip:
Remove mysql-test/patches/bug32625.diff because that bug was fixed in
the mysql repository (1 year and 4 months after sending them the simple
patch!). See http://bugs.mysql.com/32625
------------------------------------------------------------------------
r4465 | marko | 2009-03-17 12:34:19 +0200 (Tue, 17 Mar 2009) | 1 line
branches/zip: buf0buddy.c: Add and adjust some debug assertions.
------------------------------------------------------------------------
r4473 | vasil | 2009-03-17 15:50:30 +0200 (Tue, 17 Mar 2009) | 5 lines
branches/zip:
Increment the InnoDB Plugin version from 1.0.3 to 1.0.4 now that
1.0.3 has been released.
------------------------------------------------------------------------
r4478 | vasil | 2009-03-18 11:53:53 +0200 (Wed, 18 Mar 2009) | 5 lines
branches/zip:
Remove mysql-test/patches/bug41893.diff because that bug has been fixed
in the MySQL repository, see http://bugs.mysql.com/41893.
------------------------------------------------------------------------
r4479 | marko | 2009-03-18 12:43:54 +0200 (Wed, 18 Mar 2009) | 2 lines
branches/zip: buf_LRU_block_remove_hashed_page(): Add some debug assertions.
------------------------------------------------------------------------
r4480 | marko | 2009-03-18 14:32:13 +0200 (Wed, 18 Mar 2009) | 1 line
branches/zip: buf_buddy_free_low(): Correct the function comment.
------------------------------------------------------------------------
r4482 | marko | 2009-03-19 15:23:32 +0200 (Thu, 19 Mar 2009) | 12 lines
branches/zip: Merge revisions 4400:4481 from branches/5.1:
------------------------------------------------------------------------
r4481 | marko | 2009-03-19 15:01:48 +0200 (Thu, 19 Mar 2009) | 6 lines
branches/5.1: row_unlock_for_mysql(): Do not unlock records that were
modified by the current transaction. This bug was introduced or unmasked
in r4400.
rb://97 approved by Heikki Tuuri
------------------------------------------------------------------------
------------------------------------------------------------------------
r4490 | marko | 2009-03-20 12:33:33 +0200 (Fri, 20 Mar 2009) | 4 lines
branches/zip: Non-functional change for reducing dependencies in InnoDB Hot Backup:
Replace srv_sys->dummy_ind1 and srv_sys->dummy_ind2 with
dict_ind_redundant and dict_ind_compact, initialized in dict_init().
------------------------------------------------------------------------
r4491 | marko | 2009-03-20 12:45:18 +0200 (Fri, 20 Mar 2009) | 2 lines
branches/zip: Add const qualifiers or in/out comments to some function
parameters in log0log.
------------------------------------------------------------------------
r4492 | marko | 2009-03-20 12:52:14 +0200 (Fri, 20 Mar 2009) | 5 lines
branches/zip: page_validate(): Always report the space id and the
name of the index.
In Hot Backup, do not invoke comparison functions, as MySQL collations
will be unavailable.
------------------------------------------------------------------------
r4493 | marko | 2009-03-20 13:24:06 +0200 (Fri, 20 Mar 2009) | 1 line
branches/zip: Replace fil_get_space_for_id_low() with fil_space_get_by_id().
------------------------------------------------------------------------
r4494 | marko | 2009-03-20 13:51:35 +0200 (Fri, 20 Mar 2009) | 3 lines
branches/zip: fil0fil.c: Refer to fil_system directly, not via local vars.
This eliminates some "unused variable" warnings when building
InnoDB Hot Backup in such a way that all mutex operations are no-ops.
------------------------------------------------------------------------
r4495 | marko | 2009-03-20 14:15:52 +0200 (Fri, 20 Mar 2009) | 1 line
branches/zip: innobase_get_at_most_n_mbchars(): Declare in ha_prototypes.h.
------------------------------------------------------------------------
r4496 | marko | 2009-03-20 14:48:26 +0200 (Fri, 20 Mar 2009) | 1 line
branches/zip: recv_recover_page(): Remove compile-time constant parameters.
------------------------------------------------------------------------
r4497 | marko | 2009-03-20 14:56:19 +0200 (Fri, 20 Mar 2009) | 1 line
branches/zip: recv_sys_init(): Remove a compile-time constant parameter.
------------------------------------------------------------------------
r4498 | marko | 2009-03-20 15:08:05 +0200 (Fri, 20 Mar 2009) | 4 lines
branches/zip: Non-functional change: Add const qualifiers.
log_block_checksum_is_ok_or_old_format(), recv_sys_add_to_parsing_buf():
The log block is read-only. Make it const.
------------------------------------------------------------------------
r4499 | marko | 2009-03-20 15:10:25 +0200 (Fri, 20 Mar 2009) | 1 line
branches/zip: recv_scan_log_recs(): Remove a compile-time constant parameter.
------------------------------------------------------------------------
r4500 | marko | 2009-03-20 15:47:17 +0200 (Fri, 20 Mar 2009) | 1 line
branches/zip: fil_init(): Add the parameter hash_size.
------------------------------------------------------------------------
r4501 | vasil | 2009-03-20 16:50:41 +0200 (Fri, 20 Mar 2009) | 4 lines
branches/zip:
Add any entry about the release of 1.0.3 in the ChangeLog.
------------------------------------------------------------------------
r4515 | marko | 2009-03-23 10:49:53 +0200 (Mon, 23 Mar 2009) | 1 line
branches/zip: hash_table_t: adaptive: Remove from UNIV_HOTBACKUP builds.
------------------------------------------------------------------------
r4516 | marko | 2009-03-23 10:57:16 +0200 (Mon, 23 Mar 2009) | 2 lines
branches/zip: Define and use ASSERT_HASH_MUTEX_OWN.
Make it a no-op in UNIV_HOTBACKUP builds.
------------------------------------------------------------------------
r4517 | marko | 2009-03-23 11:07:20 +0200 (Mon, 23 Mar 2009) | 2 lines
branches/zip: Define and use PAGE_ZIP_MATCH.
In UNIV_HOTBACKUP builds, assume fixed allocation.
------------------------------------------------------------------------
r4521 | marko | 2009-03-23 12:05:47 +0200 (Mon, 23 Mar 2009) | 1 line
branches/zip: buf_page_print(): Clean up the code #ifdef UNIV_HOTBACKUP.
------------------------------------------------------------------------
r4522 | marko | 2009-03-23 12:20:50 +0200 (Mon, 23 Mar 2009) | 2 lines
branches/zip: Exclude some operating system interface code
from UNIV_HOTBACKUP builds.
------------------------------------------------------------------------
r4523 | marko | 2009-03-23 13:00:43 +0200 (Mon, 23 Mar 2009) | 2 lines
branches/zip: Remove the remaining references to hash_table_t::adapive
from UNIV_HOTBACKUP builds. This should have been done in r4515.
------------------------------------------------------------------------
r4524 | marko | 2009-03-23 14:05:18 +0200 (Mon, 23 Mar 2009) | 2 lines
branches/zip: Enclose recv_recovery_from_backup_on and
recv_recovery_from_backup_is_on() in #ifdef UNIV_LOG_ARCHIVE.
------------------------------------------------------------------------
r4525 | marko | 2009-03-23 14:57:45 +0200 (Mon, 23 Mar 2009) | 2 lines
branches/zip: recv_parse_or_apply_log_rec_body(): Add debug assertions
ensuring that FIL_PAGE_TYPE makes sense when applying log records.
------------------------------------------------------------------------
r4526 | marko | 2009-03-23 16:21:34 +0200 (Mon, 23 Mar 2009) | 2 lines
branches/zip: Remove unneeded definitions and dependencies
from UNIV_HOTBACKUP builds.
------------------------------------------------------------------------
r4527 | calvin | 2009-03-23 23:15:33 +0200 (Mon, 23 Mar 2009) | 5 lines
branches/zip: adjust build files on Windows
Adjust the patch positions based on the latest MySQL source.
Also add the patches to the .bat files for vs9.
------------------------------------------------------------------------
2009-03-24 08:32:21 +00:00
|
|
|
|
|
|
|
This program is free software; you can redistribute it and/or modify it under
|
|
|
|
the terms of the GNU General Public License as published by the Free Software
|
|
|
|
Foundation; version 2 of the License.
|
|
|
|
|
|
|
|
This program is distributed in the hope that it will be useful, but WITHOUT
|
|
|
|
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
|
|
|
|
FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
|
|
|
|
|
|
|
|
You should have received a copy of the GNU General Public License along with
|
2012-08-01 17:27:34 +03:00
|
|
|
this program; if not, write to the Free Software Foundation, Inc.,
|
2019-05-11 19:25:02 +03:00
|
|
|
51 Franklin Street, Fifth Floor, Boston, MA 02110-1335 USA
|
branches/innodb+: Merge revisions 4150:4528 from branches/zip:
------------------------------------------------------------------------
r4152 | marko | 2009-02-10 12:52:27 +0200 (Tue, 10 Feb 2009) | 12 lines
branches/zip: When innodb_use_sys_malloc is set, ignore
innodb_additional_mem_pool_size, because nothing will
be allocated from mem_comm_pool.
mem_pool_create(): Remove the assertion about size. The function will
work with any size. However, an assertion would fail in ut_malloc_low()
when size==0.
mem_init(): When srv_use_sys_malloc is set, pass size=1 to mem_pool_create().
mem0mem.c: Add #include "srv0srv.h" that is needed by mem0dbg.c.
------------------------------------------------------------------------
r4153 | vasil | 2009-02-10 22:58:17 +0200 (Tue, 10 Feb 2009) | 14 lines
branches/zip:
(followup to r4145) Non-functional change:
Change the os_atomic_increment() and os_compare_and_swap() functions
to macros to avoid artificial limitations on the types of those
functions' arguments. As a consequence typecasts from the source
code can be removed.
Also remove Google's copyright from os0sync.ic because that file no longer
contains code from Google.
Approved by: Marko (rb://88), also ok from Inaam via IM
------------------------------------------------------------------------
r4163 | marko | 2009-02-12 00:14:19 +0200 (Thu, 12 Feb 2009) | 4 lines
branches/zip: Make innodb_thread_concurrency=0 the default.
The old default was 8.
------------------------------------------------------------------------
r4169 | calvin | 2009-02-12 10:37:10 +0200 (Thu, 12 Feb 2009) | 3 lines
branches/zip: Adjust the result file of innodb_thread_concurrency_basic
test. The default value of innodb_thread_concurrency is changed to 0
(from 8) via r4163.
------------------------------------------------------------------------
r4174 | vasil | 2009-02-12 17:38:27 +0200 (Thu, 12 Feb 2009) | 4 lines
branches/zip:
Fix pathname of the file to patch.
------------------------------------------------------------------------
r4176 | vasil | 2009-02-13 10:06:31 +0200 (Fri, 13 Feb 2009) | 7 lines
branches/zip:
Fix the failing mysql-test partition_innodb, which failed only if run after
innodb_trx_weight (or other test that would leave LATEST DEADLOCK ERROR into
the output of SHOW ENGINE INNODB STATUS). Find further explanation for the
failure at the top of the added patch partition_innodb.diff.
------------------------------------------------------------------------
r4198 | vasil | 2009-02-17 09:06:07 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
Add the full text of the GPLv2 license into the root directory of the
plugin. In previous releases this file was copied from an external source
(https://svn.innodb.com/svn/plugin/trunk/support/COPYING) "manually" when
creating the source and binary archives. It is less confusing to have this
present in the root directory of the SVN branch.
------------------------------------------------------------------------
r4199 | vasil | 2009-02-17 09:11:58 +0200 (Tue, 17 Feb 2009) | 4 lines
branches/zip:
Add Google's license into COPYING.Google.
------------------------------------------------------------------------
r4200 | vasil | 2009-02-17 09:56:33 +0200 (Tue, 17 Feb 2009) | 11 lines
branches/zip:
To the files touched by the Google patch from c4144 (excluding
include/os0sync.ic because later we removed Google code from that file):
* Remove the Google license
* Remove old Innobase copyright lines
* Add a reference to the Google license and to the GPLv2 license at the top,
as recommended by the lawyers at Oracle Legal.
------------------------------------------------------------------------
r4201 | vasil | 2009-02-17 10:12:02 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 1/28]
------------------------------------------------------------------------
r4202 | vasil | 2009-02-17 10:15:06 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 2/28]
------------------------------------------------------------------------
r4203 | vasil | 2009-02-17 10:25:45 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 3/28]
------------------------------------------------------------------------
r4204 | vasil | 2009-02-17 10:55:41 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 4/28]
------------------------------------------------------------------------
r4205 | vasil | 2009-02-17 10:59:22 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 5/28]
------------------------------------------------------------------------
r4206 | vasil | 2009-02-17 11:02:27 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 6/28]
------------------------------------------------------------------------
r4207 | vasil | 2009-02-17 11:04:28 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 7/28]
------------------------------------------------------------------------
r4208 | vasil | 2009-02-17 11:06:49 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 8/28]
------------------------------------------------------------------------
r4209 | vasil | 2009-02-17 11:10:18 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 9/28]
------------------------------------------------------------------------
r4210 | vasil | 2009-02-17 11:12:41 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 10/28]
------------------------------------------------------------------------
r4211 | vasil | 2009-02-17 11:14:40 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 11/28]
------------------------------------------------------------------------
r4212 | vasil | 2009-02-17 11:18:35 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 12/28]
------------------------------------------------------------------------
r4213 | vasil | 2009-02-17 11:24:40 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 13/28]
------------------------------------------------------------------------
r4214 | vasil | 2009-02-17 11:27:31 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 13/28]
------------------------------------------------------------------------
r4215 | vasil | 2009-02-17 11:29:55 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 15/28]
------------------------------------------------------------------------
r4216 | vasil | 2009-02-17 11:33:38 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 16/28]
------------------------------------------------------------------------
r4217 | vasil | 2009-02-17 11:36:44 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 17/28]
------------------------------------------------------------------------
r4218 | vasil | 2009-02-17 11:39:11 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 18/28]
------------------------------------------------------------------------
r4219 | vasil | 2009-02-17 11:41:24 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 19/28]
------------------------------------------------------------------------
r4220 | vasil | 2009-02-17 11:43:50 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 20/28]
------------------------------------------------------------------------
r4221 | vasil | 2009-02-17 11:46:52 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 21/28]
------------------------------------------------------------------------
r4222 | vasil | 2009-02-17 11:50:12 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 22/28]
------------------------------------------------------------------------
r4223 | vasil | 2009-02-17 11:53:58 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 23/28]
------------------------------------------------------------------------
r4224 | vasil | 2009-02-17 12:01:41 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 24/28]
------------------------------------------------------------------------
r4225 | vasil | 2009-02-17 12:05:45 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 25/28]
------------------------------------------------------------------------
r4226 | vasil | 2009-02-17 12:09:16 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 26/28]
------------------------------------------------------------------------
r4227 | vasil | 2009-02-17 12:12:56 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 27/28]
------------------------------------------------------------------------
r4228 | vasil | 2009-02-17 12:14:04 +0200 (Tue, 17 Feb 2009) | 8 lines
branches/zip:
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 28/28]
------------------------------------------------------------------------
r4229 | vasil | 2009-02-17 12:30:55 +0200 (Tue, 17 Feb 2009) | 4 lines
branches/zip:
Add the copyright notice to the non C files.
------------------------------------------------------------------------
r4231 | marko | 2009-02-17 14:26:53 +0200 (Tue, 17 Feb 2009) | 12 lines
Minor cleanup of the Google SMP patch.
sync_array_object_signalled(): Add a (void) cast to eliminate a gcc warning
about the return value of os_atomic_increment() being ignored.
rw_lock_create_func(): Properly indent the preprocessor directives.
rw_lock_x_lock_low(), rw_lock_x_lock_func_nowait(): Split lines correctly.
rw_lock_set_writer_id_and_recursion_flag(): Silence a Valgrind warning.
Do not mix statements and variable declarations.
------------------------------------------------------------------------
r4232 | marko | 2009-02-17 14:59:54 +0200 (Tue, 17 Feb 2009) | 3 lines
branches/zip: When assigning lock->recursive = FALSE, also flag
lock->writer_thread invalid, so that Valgrind will catch more errors.
This is related to Issue #175.
------------------------------------------------------------------------
r4242 | marko | 2009-02-18 17:01:09 +0200 (Wed, 18 Feb 2009) | 2 lines
branches/zip: UT_DBG_STOP: Use do{} while(0) to silence a g++-4.3.2 warning
about a while(0); statement. This should fix (part of) Issue #176.
------------------------------------------------------------------------
r4243 | marko | 2009-02-18 17:04:03 +0200 (Wed, 18 Feb 2009) | 3 lines
branches/zip: buf_buddy_get_slot(): Fix a gcc 4.3.2 warning
about an empty body of a "for" statement.
This fixes part of Issue #176.
------------------------------------------------------------------------
r4244 | marko | 2009-02-18 17:25:45 +0200 (Wed, 18 Feb 2009) | 11 lines
branches/zip: Protect ut_total_allocated_memory with ut_list_mutex.
Unprotected updates to ut_total_allocated_memory in
os_mem_alloc_large() and os_mem_free_large(), called during
fast index creation, may corrupt the variable and cause assertion failures.
Also, add UNIV_MEM_ALLOC() and UNIV_MEM_FREE() instrumentation around
os_mem_alloc_large() and os_mem_free_large(), so that Valgrind can
detect more errors.
rb://90 approved by Heikki Tuuri. This addresses Issue #177.
------------------------------------------------------------------------
r4248 | marko | 2009-02-19 11:52:39 +0200 (Thu, 19 Feb 2009) | 2 lines
branches/zip: page_zip_set_size(): Fix a g++ 4.3.2 warning
about an empty body in a "for" statement. This closes Issue #176.
------------------------------------------------------------------------
r4251 | inaam | 2009-02-19 15:46:27 +0200 (Thu, 19 Feb 2009) | 8 lines
branches/zip: Issue #178 rb://91
Change plug.in to have same CXXFLAGS as CFLAGS. This is to ensure that
both .c and .cc files get compiled with same flags. To fix the issue
where UNIV_LINUX was defined only in .c files.
Approved by: Marko
------------------------------------------------------------------------
r4258 | vasil | 2009-02-20 11:52:19 +0200 (Fri, 20 Feb 2009) | 7 lines
branches/zip:
Cleanup in ChangeLog:
* Wrap lines at 78 characters
* Changed files are listed alphabetically
* White-space cleanup
------------------------------------------------------------------------
r4259 | vasil | 2009-02-20 11:59:42 +0200 (Fri, 20 Feb 2009) | 6 lines
branches/zip:
ChangeLog: Remove include/os0sync.ic from the entry about the google patch,
this file was modified later to not include Google's code.
------------------------------------------------------------------------
r4262 | vasil | 2009-02-20 14:56:59 +0200 (Fri, 20 Feb 2009) | 373 lines
branches/zip:
Merge revisions 4035:4261 from branches/5.1:
------------------------------------------------------------------------
r4065 | sunny | 2009-01-29 16:01:36 +0200 (Thu, 29 Jan 2009) | 8 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: In the last round of AUTOINC cleanup we assumed that AUTOINC
is only defined for integer columns. This caused an assertion failure when
we checked for the maximum value of a column type. We now calculate the
max value for floating-point autoinc columns too.
Fix Bug#42400 - InnoDB autoinc code can't handle floating-point columns
rb://84 and Mantis issue://162
------------------------------------------------------------------------
r4111 | sunny | 2009-02-03 22:06:52 +0200 (Tue, 03 Feb 2009) | 2 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: Add the ULL suffix otherwise there is an overflow.
------------------------------------------------------------------------
r4128 | vasil | 2009-02-08 21:36:45 +0200 (Sun, 08 Feb 2009) | 18 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1:
Merge a change from MySQL:
------------------------------------------------------------
revno: 2709.20.31
committer: Timothy Smith <timothy.smith@sun.com>
branch nick: 51
timestamp: Fri 2008-12-19 01:28:51 +0100
message:
Disable part of innodb-autoinc.test, because the MySQL server asserts when
compiled --with-debug, due to bug 39828, "autoinc wraps around when offset and
increment > 1". This change should be reverted when that bug is fixed (and a
a few other minor changes to the test as described in comments).
modified:
mysql-test/r/innodb-autoinc.result
mysql-test/t/innodb-autoinc.test
------------------------------------------------------------------------
r4129 | vasil | 2009-02-08 21:54:25 +0200 (Sun, 08 Feb 2009) | 310 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1:
Merge a change from MySQL:
[looks like the changes to innodb-autoinc.test were made as part of
the following huge merge, but we are merging only changes to that file]
------------------------------------------------------------
revno: 2546.47.1
committer: Luis Soares <luis.soares@sun.com>
branch nick: 5.1-rpl
timestamp: Fri 2009-01-23 13:22:05 +0100
message:
merge: 5.1 -> 5.1-rpl
conflicts:
Text conflict in client/mysqltest.cc
Text conflict in mysql-test/include/wait_until_connected_again.inc
Text conflict in mysql-test/lib/mtr_report.pm
Text conflict in mysql-test/mysql-test-run.pl
Text conflict in mysql-test/r/events_bugs.result
Text conflict in mysql-test/r/log_state.result
Text conflict in mysql-test/r/myisam_data_pointer_size_func.result
Text conflict in mysql-test/r/mysqlcheck.result
Text conflict in mysql-test/r/query_cache.result
Text conflict in mysql-test/r/status.result
Text conflict in mysql-test/suite/binlog/r/binlog_index.result
Text conflict in mysql-test/suite/binlog/r/binlog_innodb.result
Text conflict in mysql-test/suite/rpl/r/rpl_packet.result
Text conflict in mysql-test/suite/rpl/t/rpl_packet.test
Text conflict in mysql-test/t/disabled.def
Text conflict in mysql-test/t/events_bugs.test
Text conflict in mysql-test/t/log_state.test
Text conflict in mysql-test/t/myisam_data_pointer_size_func.test
Text conflict in mysql-test/t/mysqlcheck.test
Text conflict in mysql-test/t/query_cache.test
Text conflict in mysql-test/t/rpl_init_slave_func.test
Text conflict in mysql-test/t/status.test
removed:
mysql-test/suite/parts/r/partition_bit_ndb.result
mysql-test/suite/parts/t/partition_bit_ndb.test
mysql-test/suite/parts/t/partition_sessions.test
mysql-test/suite/sys_vars/inc/tmp_table_size_basic.inc
mysql-test/suite/sys_vars/r/tmp_table_size_basic_32.result
mysql-test/suite/sys_vars/r/tmp_table_size_basic_64.result
mysql-test/suite/sys_vars/t/tmp_table_size_basic_32.test
mysql-test/suite/sys_vars/t/tmp_table_size_basic_64.test
mysql-test/t/log_bin_trust_function_creators_func-master.opt
mysql-test/t/rpl_init_slave_func-slave.opt
added:
mysql-test/include/check_events_off.inc
mysql-test/include/cleanup_fake_relay_log.inc
mysql-test/include/have_simple_parser.inc
mysql-test/include/no_running_event_scheduler.inc
mysql-test/include/no_running_events.inc
mysql-test/include/running_event_scheduler.inc
mysql-test/include/setup_fake_relay_log.inc
mysql-test/include/wait_condition_sp.inc
mysql-test/r/fulltext_plugin.result
mysql-test/r/have_simple_parser.require
mysql-test/r/innodb_bug38231.result
mysql-test/r/innodb_bug39438.result
mysql-test/r/innodb_mysql_rbk.result
mysql-test/r/partition_innodb_semi_consistent.result
mysql-test/r/query_cache_28249.result
mysql-test/r/status2.result
mysql-test/std_data/bug40482-bin.000001
mysql-test/suite/binlog/r/binlog_innodb_row.result
mysql-test/suite/binlog/t/binlog_innodb_row.test
mysql-test/suite/rpl/r/rpl_binlog_corruption.result
mysql-test/suite/rpl/t/rpl_binlog_corruption-master.opt
mysql-test/suite/rpl/t/rpl_binlog_corruption.test
mysql-test/suite/sys_vars/r/tmp_table_size_basic.result
mysql-test/suite/sys_vars/t/tmp_table_size_basic.test
mysql-test/t/fulltext_plugin-master.opt
mysql-test/t/fulltext_plugin.test
mysql-test/t/innodb_bug38231.test
mysql-test/t/innodb_bug39438-master.opt
mysql-test/t/innodb_bug39438.test
mysql-test/t/innodb_mysql_rbk-master.opt
mysql-test/t/innodb_mysql_rbk.test
mysql-test/t/partition_innodb_semi_consistent-master.opt
mysql-test/t/partition_innodb_semi_consistent.test
mysql-test/t/query_cache_28249.test
mysql-test/t/status2.test
renamed:
mysql-test/suite/funcs_1/r/is_collation_character_set_applicability.result => mysql-test/suite/funcs_1/r/is_coll_char_set_appl.result
mysql-test/suite/funcs_1/t/is_collation_character_set_applicability.test => mysql-test/suite/funcs_1/t/is_coll_char_set_appl.test
modified:
.bzr-mysql/default.conf
CMakeLists.txt
client/mysql.cc
client/mysql_upgrade.c
client/mysqlcheck.c
client/mysqltest.cc
configure.in
extra/resolve_stack_dump.c
extra/yassl/include/openssl/ssl.h
include/config-win.h
include/m_ctype.h
include/my_global.h
mysql-test/extra/binlog_tests/database.test
mysql-test/extra/rpl_tests/rpl_auto_increment.test
mysql-test/include/commit.inc
mysql-test/include/have_32bit.inc
mysql-test/include/have_64bit.inc
mysql-test/include/index_merge1.inc
mysql-test/include/linux_sys_vars.inc
mysql-test/include/windows_sys_vars.inc
mysql-test/lib/mtr_report.pm
mysql-test/mysql-test-run.pl
mysql-test/r/alter_table.result
mysql-test/r/commit_1innodb.result
mysql-test/r/create.result
mysql-test/r/csv.result
mysql-test/r/ctype_ucs.result
mysql-test/r/date_formats.result
mysql-test/r/events_bugs.result
mysql-test/r/events_scheduling.result
mysql-test/r/fulltext.result
mysql-test/r/func_if.result
mysql-test/r/func_in.result
mysql-test/r/func_str.result
mysql-test/r/func_time.result
mysql-test/r/grant.result
mysql-test/r/index_merge_myisam.result
mysql-test/r/information_schema.result
mysql-test/r/innodb-autoinc.result
mysql-test/r/innodb.result
mysql-test/r/innodb_mysql.result
mysql-test/r/log_bin_trust_function_creators_func.result
mysql-test/r/log_state.result
mysql-test/r/myisampack.result
mysql-test/r/mysql.result
mysql-test/r/mysqlcheck.result
mysql-test/r/partition_datatype.result
mysql-test/r/partition_mgm.result
mysql-test/r/partition_pruning.result
mysql-test/r/query_cache.result
mysql-test/r/read_buffer_size_basic.result
mysql-test/r/read_rnd_buffer_size_basic.result
mysql-test/r/rpl_init_slave_func.result
mysql-test/r/select.result
mysql-test/r/status.result
mysql-test/r/strict.result
mysql-test/r/temp_table.result
mysql-test/r/type_bit.result
mysql-test/r/type_date.result
mysql-test/r/type_float.result
mysql-test/r/warnings_engine_disabled.result
mysql-test/r/xml.result
mysql-test/suite/binlog/r/binlog_database.result
mysql-test/suite/binlog/r/binlog_index.result
mysql-test/suite/binlog/r/binlog_innodb.result
mysql-test/suite/binlog/r/binlog_row_mix_innodb_myisam.result
mysql-test/suite/binlog/t/binlog_innodb.test
mysql-test/suite/funcs_1/r/is_columns_is.result
mysql-test/suite/funcs_1/r/is_engines.result
mysql-test/suite/funcs_1/r/storedproc.result
mysql-test/suite/funcs_1/storedproc/param_check.inc
mysql-test/suite/funcs_2/t/disabled.def
mysql-test/suite/ndb/t/disabled.def
mysql-test/suite/parts/r/partition_bit_innodb.result
mysql-test/suite/parts/r/partition_bit_myisam.result
mysql-test/suite/parts/r/partition_special_innodb.result
mysql-test/suite/parts/t/disabled.def
mysql-test/suite/parts/t/partition_special_innodb.test
mysql-test/suite/parts/t/partition_value_innodb.test
mysql-test/suite/parts/t/partition_value_myisam.test
mysql-test/suite/parts/t/partition_value_ndb.test
mysql-test/suite/rpl/r/rpl_auto_increment.result
mysql-test/suite/rpl/r/rpl_packet.result
mysql-test/suite/rpl/r/rpl_row_create_table.result
mysql-test/suite/rpl/r/rpl_slave_skip.result
mysql-test/suite/rpl/r/rpl_trigger.result
mysql-test/suite/rpl/t/disabled.def
mysql-test/suite/rpl/t/rpl_packet.test
mysql-test/suite/rpl/t/rpl_row_create_table.test
mysql-test/suite/rpl/t/rpl_slave_skip.test
mysql-test/suite/rpl/t/rpl_trigger.test
mysql-test/suite/rpl_ndb/t/disabled.def
mysql-test/suite/sys_vars/inc/key_buffer_size_basic.inc
mysql-test/suite/sys_vars/inc/sort_buffer_size_basic.inc
mysql-test/suite/sys_vars/r/key_buffer_size_basic_32.result
mysql-test/suite/sys_vars/r/key_buffer_size_basic_64.result
mysql-test/suite/sys_vars/r/sort_buffer_size_basic_32.result
mysql-test/suite/sys_vars/r/sort_buffer_size_basic_64.result
mysql-test/t/alter_table.test
mysql-test/t/create.test
mysql-test/t/csv.test
mysql-test/t/ctype_ucs.test
mysql-test/t/date_formats.test
mysql-test/t/disabled.def
mysql-test/t/events_bugs.test
mysql-test/t/events_scheduling.test
mysql-test/t/fulltext.test
mysql-test/t/func_if.test
mysql-test/t/func_in.test
mysql-test/t/func_str.test
mysql-test/t/func_time.test
mysql-test/t/grant.test
mysql-test/t/information_schema.test
mysql-test/t/innodb-autoinc.test
mysql-test/t/innodb.test
mysql-test/t/innodb_mysql.test
mysql-test/t/log_bin_trust_function_creators_func.test
mysql-test/t/log_state.test
mysql-test/t/myisam_data_pointer_size_func.test
mysql-test/t/myisampack.test
mysql-test/t/mysql.test
mysql-test/t/mysqlcheck.test
mysql-test/t/partition_innodb_stmt.test
mysql-test/t/partition_mgm.test
mysql-test/t/partition_pruning.test
mysql-test/t/query_cache.test
mysql-test/t/rpl_init_slave_func.test
mysql-test/t/select.test
mysql-test/t/status.test
mysql-test/t/strict.test
mysql-test/t/temp_table.test
mysql-test/t/type_bit.test
mysql-test/t/type_date.test
mysql-test/t/type_float.test
mysql-test/t/warnings_engine_disabled.test
mysql-test/t/xml.test
mysys/my_getopt.c
mysys/my_init.c
scripts/mysql_install_db.sh
sql-common/my_time.c
sql/field.cc
sql/field.h
sql/filesort.cc
sql/ha_partition.cc
sql/ha_partition.h
sql/item.cc
sql/item_cmpfunc.cc
sql/item_func.h
sql/item_strfunc.cc
sql/item_sum.cc
sql/item_timefunc.cc
sql/item_timefunc.h
sql/log.cc
sql/log.h
sql/log_event.cc
sql/log_event.h
sql/mysql_priv.h
sql/mysqld.cc
sql/opt_range.cc
sql/partition_info.cc
sql/repl_failsafe.cc
sql/rpl_constants.h
sql/set_var.cc
sql/slave.cc
sql/spatial.h
sql/sql_acl.cc
sql/sql_base.cc
sql/sql_binlog.cc
sql/sql_class.h
sql/sql_cursor.cc
sql/sql_delete.cc
sql/sql_lex.cc
sql/sql_lex.h
sql/sql_locale.cc
sql/sql_parse.cc
sql/sql_partition.cc
sql/sql_plugin.cc
sql/sql_plugin.h
sql/sql_profile.cc
sql/sql_repl.cc
sql/sql_select.cc
sql/sql_select.h
sql/sql_show.cc
sql/sql_table.cc
sql/sql_trigger.cc
sql/sql_trigger.h
sql/table.cc
sql/table.h
sql/unireg.cc
storage/csv/ha_tina.cc
storage/federated/ha_federated.cc
storage/heap/ha_heap.cc
storage/innobase/Makefile.am
storage/innobase/btr/btr0sea.c
storage/innobase/buf/buf0lru.c
storage/innobase/dict/dict0dict.c
storage/innobase/dict/dict0mem.c
storage/innobase/handler/ha_innodb.cc
storage/innobase/handler/ha_innodb.h
storage/innobase/include/btr0sea.h
storage/innobase/include/dict0dict.h
storage/innobase/include/dict0mem.h
storage/innobase/include/ha_prototypes.h
storage/innobase/include/lock0lock.h
storage/innobase/include/row0mysql.h
storage/innobase/include/sync0sync.ic
storage/innobase/include/ut0ut.h
storage/innobase/lock/lock0lock.c
storage/innobase/os/os0file.c
storage/innobase/plug.in
storage/innobase/row/row0mysql.c
storage/innobase/row/row0sel.c
storage/innobase/srv/srv0srv.c
storage/innobase/srv/srv0start.c
storage/innobase/ut/ut0ut.c
storage/myisam/ft_boolean_search.c
strings/ctype.c
strings/xml.c
tests/mysql_client_test.c
win/configure.js
mysql-test/suite/funcs_1/t/is_coll_char_set_appl.test
------------------------------------------------------------------------
r4165 | calvin | 2009-02-12 01:34:27 +0200 (Thu, 12 Feb 2009) | 1 line
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: minor non-functional changes.
------------------------------------------------------------------------
------------------------------------------------------------------------
r4263 | vasil | 2009-02-20 15:00:46 +0200 (Fri, 20 Feb 2009) | 4 lines
branches/zip:
Add a ChangeLog entry for a change in r4262.
------------------------------------------------------------------------
r4265 | marko | 2009-02-20 22:31:03 +0200 (Fri, 20 Feb 2009) | 5 lines
branches/zip: Make innodb_use_sys_malloc=ON the default.
Replace srv_use_sys_malloc with UNIV_LIKELY(srv_use_sys_malloc)
to improve branch prediction in the default case.
Approved by Ken over the IM.
------------------------------------------------------------------------
r4266 | vasil | 2009-02-20 23:29:32 +0200 (Fri, 20 Feb 2009) | 7 lines
branches/zip:
Add a sentence at the top of COPYING.Google to clarify that this license
does not apply to the whole InnoDB.
Suggested by: Ken
------------------------------------------------------------------------
r4268 | marko | 2009-02-23 12:43:51 +0200 (Mon, 23 Feb 2009) | 9 lines
branches/zip: Initialize ut_list_mutex at startup. Without this fix,
ut_list_mutex would be used uninitialized when innodb_use_sys_malloc=1.
This fix addresses Issue #181.
ut_mem_block_list_init(): Rename to ut_mem_init() and make public.
ut_malloc_low(), ut_free_all_mem(): Add ut_a(ut_mem_block_list_inited).
mem_init(): Call ut_mem_init().
------------------------------------------------------------------------
r4269 | marko | 2009-02-23 15:09:49 +0200 (Mon, 23 Feb 2009) | 7 lines
branches/zip: When freeing an uncompressed BLOB page, tolerate garbage in
FIL_PAGE_TYPE. (Bug #43043, Issue #182)
btr_check_blob_fil_page_type(): New function.
btr_free_externally_stored_field(), btr_copy_blob_prefix():
Call btr_check_blob_fil_page_type() to check FIL_PAGE_TYPE.
------------------------------------------------------------------------
r4272 | marko | 2009-02-23 23:10:18 +0200 (Mon, 23 Feb 2009) | 8 lines
branches/zip: Adjust the fix of Issue #182 in r4269 per Inaam's suggestion.
btr_check_blob_fil_page_type(): Replace the parameter
const char* op
with
ibool read. Do not print anything about page type mismatch
when reading a BLOB page in Antelope format.
Print space id before page number.
------------------------------------------------------------------------
r4273 | marko | 2009-02-24 00:11:11 +0200 (Tue, 24 Feb 2009) | 1 line
branches/zip: ut_mem_init(): Add the assertion !ut_mem_block_list_inited.
------------------------------------------------------------------------
r4274 | marko | 2009-02-24 00:14:38 +0200 (Tue, 24 Feb 2009) | 12 lines
branches/zip: Fix bugs in the fix of Issue #181. Tested inside and
outside Valgrind, with innodb_use_sys_malloc set to 0 and 1.
mem_init(): Invoke ut_mem_init() before mem_pool_create(), because
the latter one will invoke ut_malloc().
srv_general_init(): Do not initialize the memory subsystem (mem_init()).
innobase_init(): Initialize the memory subsystem (mem_init()) before
calling srv_parse_data_file_paths_and_sizes(), which needs ut_malloc().
Call ut_free_all_mem() in error handling to clean up after the mem_init().
------------------------------------------------------------------------
r4280 | marko | 2009-02-24 15:14:59 +0200 (Tue, 24 Feb 2009) | 1 line
branches/zip: Remove unused function os_mem_alloc_nocache().
------------------------------------------------------------------------
r4281 | marko | 2009-02-24 16:02:48 +0200 (Tue, 24 Feb 2009) | 1 line
branches/zip: Remove the unused function dict_index_get_type().
------------------------------------------------------------------------
r4283 | marko | 2009-02-24 23:06:56 +0200 (Tue, 24 Feb 2009) | 1 line
branches/zip: srv0start.c: Remove unnecessary #include "mem0pool.h".
------------------------------------------------------------------------
r4284 | marko | 2009-02-24 23:26:38 +0200 (Tue, 24 Feb 2009) | 1 line
branches/zip: mem0mem.c: Remove unnecessary #include "mach0data.h".
------------------------------------------------------------------------
r4288 | vasil | 2009-02-25 10:48:07 +0200 (Wed, 25 Feb 2009) | 21 lines
branches/zip: Merge revisions 4261:4287 from branches/5.1:
------------------------------------------------------------------------
r4287 | sunny | 2009-02-25 05:32:01 +0200 (Wed, 25 Feb 2009) | 10 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: Fix Bug#42714 AUTO_INCREMENT errors in 5.1.31. There are two
changes to the autoinc handling.
1. To fix the immediate problem from the bug report, we must ensure that the
value written to the table is always less than the max value stored in
dict_table_t.
2. The second related change is that according to MySQL documentation when
the offset is greater than the increment, we should ignore the offset.
------------------------------------------------------------------------
------------------------------------------------------------------------
r4289 | vasil | 2009-02-25 10:53:51 +0200 (Wed, 25 Feb 2009) | 4 lines
branches/zip:
Add ChangeLog entry for the fix in r4288.
------------------------------------------------------------------------
r4290 | vasil | 2009-02-25 11:05:44 +0200 (Wed, 25 Feb 2009) | 11 lines
branches/zip:
Make ChangeLog entries for bugs in bugs.mysql.com in the form:
Fix Bug#12345 bug title
(for bugs after 1.0.2 was released and the ChangeLog published)
There is no need to bloat the ChangeLog with information that is available
via bugs.mysql.com.
Discussed with: Marko
------------------------------------------------------------------------
r4291 | vasil | 2009-02-25 11:08:32 +0200 (Wed, 25 Feb 2009) | 4 lines
branches/zip:
Fix Bug synopsis and remove explanation
------------------------------------------------------------------------
r4292 | marko | 2009-02-25 12:09:15 +0200 (Wed, 25 Feb 2009) | 25 lines
branches/zip: Correct the initialization of the memory subsystem once
again, to finally put Issue #181 to rest.
Revert some parts of r4274. It is best not to call ut_malloc() before
srv_general_init().
mem_init(): Do not call ut_mem_init().
srv_general_init(): Initialize the memory subsystem in two phases:
first ut_mem_init(), then mem_init(). This is because os_sync_init()
and sync_init() depend on ut_mem_init() and mem_init() depends on
os_sync_init() or sync_init().
srv_parse_data_file_paths_and_sizes(),
srv_parse_log_group_home_dirs(): Remove the output parameters. Assign
to the global variables directly. Allocate memory with malloc()
instead of ut_malloc(), because these functions will be called before
srv_general_init().
srv_free_paths_and_sizes(): New function, for cleaning up after
srv_parse_data_file_paths_and_sizes() and
srv_parse_log_group_home_dirs().
rb://92 approved by Sunny Bains
------------------------------------------------------------------------
r4297 | vasil | 2009-02-25 17:19:19 +0200 (Wed, 25 Feb 2009) | 4 lines
branches/zip:
White-space cleanup in the ChangeLog
------------------------------------------------------------------------
r4301 | vasil | 2009-02-25 21:33:32 +0200 (Wed, 25 Feb 2009) | 5 lines
branches/zip:
Do not output the commands that restore the environment because they depend
on the state of the environment before the test starts executing.
------------------------------------------------------------------------
r4315 | vasil | 2009-02-26 09:21:20 +0200 (Thu, 26 Feb 2009) | 5 lines
branches/zip:
Apply any necessary patches to the mysql tree at the end of setup.sh
This step was previously done manually (and sometimes forgotten).
------------------------------------------------------------------------
r4319 | marko | 2009-02-26 23:27:51 +0200 (Thu, 26 Feb 2009) | 6 lines
branches/zip: btr_check_blob_fil_page_type(): Do not report
FIL_PAGE_TYPE mismatch even when purging a BLOB.
Heavy users may have large data files created with MySQL 5.0 or earlier,
and they don not want to have the error log flooded with such messages.
This fixes Issue #182.
------------------------------------------------------------------------
r4320 | inaam | 2009-02-27 02:13:19 +0200 (Fri, 27 Feb 2009) | 8 lines
branches/zip
This is to revert the changes made to the plug.in (r4251) as a fix for
issue# 178. Changes to plug.in will not propogate to a plugin
installation unless autotools are rerun which is unacceptable.
A fix for issue# 178 will be committed in a separate commit.
------------------------------------------------------------------------
r4321 | inaam | 2009-02-27 02:16:46 +0200 (Fri, 27 Feb 2009) | 6 lines
branches/zip
This is a fix for issue#178. Instead of using UNIV_LINUX which is
defined through CFLAGS we use compiler generated define __linux__
that is effective for both .c and .cc files.
------------------------------------------------------------------------
r4324 | vasil | 2009-02-27 13:27:18 +0200 (Fri, 27 Feb 2009) | 39 lines
branches/zip:
Add FreeBSD to the list of the operating systems that have
sizeof(pthread_t) == sizeof(void*) (i.e. word size).
On FreeBSD pthread_t is defined like:
/usr/include/sys/_pthreadtypes.h:
typedef struct pthread *pthread_t;
I did the following tests (per Inaam's recommendation):
a) appropriate version of GCC is available on that platform (4.1.2 or
higher for atomics to be available)
On FreeBSD 6.x the default compiler is 3.4.6, on FreeBSD 7.x the default
one is 4.2.1. One can always install the version of choice from the ports
collection. If gcc 3.x is used then HAVE_GCC_ATOMIC_BUILTINS will not be
defined and thus the change I am committing will make no difference.
b) find out if sizeof(pthread_t) == sizeof(long)
On 32 bit both are 4 bytes, on 64 bit both are 8 bytes.
c) find out the compiler generated platform define (e.g.: __aix, __sunos__
etc.)
The macro is __FreeBSD__.
d) patch univ.i with the appropriate platform define
e) build the mysql
f) ensure it is using atomic builtins (look at the err.log message at
system startup. It should say we are using atomics for both mutexes and
rw-locks)
g) do sanity testing (keeping in view the smp changes)
I ran the mysql-test suite. All tests pass.
------------------------------------------------------------------------
r4353 | vasil | 2009-03-05 09:27:29 +0200 (Thu, 05 Mar 2009) | 6 lines
branches/zip:
As suggested by Ken, print a message that says that the Google SMP patch
(GCC atomics) is disabled if it is. Also extend the message when the patch
is partially enabled to make it clear that it is partially enabled.
------------------------------------------------------------------------
r4356 | vasil | 2009-03-05 13:49:51 +0200 (Thu, 05 Mar 2009) | 4 lines
branches/zip:
Fix typo made in r4353.
------------------------------------------------------------------------
r4357 | vasil | 2009-03-05 16:38:59 +0200 (Thu, 05 Mar 2009) | 23 lines
branches/zip:
Implement a check whether pthread_t objects can be used by GCC atomic
builtin functions. This check is implemented in plug.in and defines the
macro HAVE_ATOMIC_PTHREAD_T. This macro is checked in univ.i and the
relevant part of the code enabled (the one that uses GCC atomics against
pthread_t objects).
In addition to this, the same program that is compiled as part of the
plug.in check is added in ut/ut0auxconf.c. In the InnoDB Plugin source
archives that are shipped to the users, a generated Makefile.in is added.
That Makefile.in will be modified to compile ut/ut0auxconf.c and define
the macro HAVE_ATOMIC_PTHREAD_T if the compilation succeeds. I.e.
Makefile.in will emulate the work that is done by plug.in. This is done in
order to make the check happen and HAVE_ATOMIC_PTHREAD_T eventually
defined without regenerating MySQL's ./configure from
./storage/innobase/plug.in. The point is not to ask users to install the
autotools and regenerate ./configure.
rb://95
Approved by: Marko
------------------------------------------------------------------------
r4360 | vasil | 2009-03-05 22:23:17 +0200 (Thu, 05 Mar 2009) | 21 lines
branches/zip: Merge revisions 4287:4357 from branches/5.1:
------------------------------------------------------------------------
r4325 | sunny | 2009-03-02 02:28:52 +0200 (Mon, 02 Mar 2009) | 10 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: Bug#43203: Overflow from auto incrementing causes server segv
It was not a SIGSEGV but an assertion failure. The assertion was checking
the invariant that *first_value passed in by MySQL doesn't contain a value
that is greater than the max value for that type. The assertion has been
changed to a check and if the value is greater than the max we report a
generic AUTOINC failure.
rb://93
Approved by Heikki
------------------------------------------------------------------------
------------------------------------------------------------------------
r4361 | vasil | 2009-03-05 22:27:54 +0200 (Thu, 05 Mar 2009) | 30 lines
branches/zip: Merge revision 4358 from branches/5.1 (resolving a conflict):
------------------------------------------------------------------------
r4358 | vasil | 2009-03-05 21:21:10 +0200 (Thu, 05 Mar 2009) | 21 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1:
Merge a change from MySQL:
------------------------------------------------------------
revno: 2728.19.1
committer: Alfranio Correia <alfranio.correia@sun.com>
branch nick: mysql-5.1-bugteam
timestamp: Tue 2009-02-03 11:36:46 +0000
message:
BUG#42445 Warning messages in innobase/handler/ha_innodb.cc
There was a type casting problem in the storage/innobase/handler/ha_innodb.cc,
(int ha_innobase::write_row(...)). Innobase uses has an internal error variable
of type 'ulint' while mysql uses an 'int'.
To fix the problem the function manipulates an error variable of
type 'ulint' and only casts it into 'int' when needs to return the value.
modified:
storage/innobase/handler/ha_innodb.cc
------------------------------------------------------------------------
------------------------------------------------------------------------
r4362 | vasil | 2009-03-05 22:29:07 +0200 (Thu, 05 Mar 2009) | 23 lines
branches/zip: Merge revision 4359 from branches/5.1:
------------------------------------------------------------------------
r4359 | vasil | 2009-03-05 21:42:01 +0200 (Thu, 05 Mar 2009) | 14 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1:
Merge a change from MySQL:
------------------------------------------------------------
revno: 2747
committer: Timothy Smith <timothy.smith@sun.com>
branch nick: 51
timestamp: Fri 2009-01-16 17:49:07 +0100
message:
Add another cast to ignore int/ulong difference in error types, silence warning on Win64
modified:
storage/innobase/handler/ha_innodb.cc
------------------------------------------------------------------------
------------------------------------------------------------------------
r4363 | vasil | 2009-03-05 22:31:37 +0200 (Thu, 05 Mar 2009) | 4 lines
branches/zip:
Add ChangeLog entry for the bugfix in c4360.
------------------------------------------------------------------------
r4378 | calvin | 2009-03-09 10:10:17 +0200 (Mon, 09 Mar 2009) | 7 lines
branches/zip: remove compile flag MYSQL_SERVER for dynamic plugin
The dynamic plugin on Windows used to be built with MYSQL_SERVER
compile flag, while it is not the case for other platforms.
r3797 assumed MYSQL_SERVER was not defined for dynamic plugin,
which introduced the engine crash during dropping a database.
------------------------------------------------------------------------
r4396 | marko | 2009-03-12 09:22:27 +0200 (Thu, 12 Mar 2009) | 3 lines
branches/zip: btr_store_big_rec_extern_fields(): Initialize FIL_PAGE_TYPE
in a separate redo log entry. This will make ibbackup --apply-log
debugging easier.
------------------------------------------------------------------------
r4397 | marko | 2009-03-12 09:26:11 +0200 (Thu, 12 Mar 2009) | 3 lines
branches/zip: trx_sys_create_doublewrite_buf(): As the dummy change,
initialize FIL_PAGE_TYPE. This will make it easier to write the debug
assertions for ibbackup --apply-log.
------------------------------------------------------------------------
r4401 | marko | 2009-03-12 10:26:40 +0200 (Thu, 12 Mar 2009) | 19 lines
branches/zip: Merge revisions 4359:4400 from branches/5.1:
------------------------------------------------------------------------
r4399 | marko | 2009-03-12 09:38:05 +0200 (Thu, 12 Mar 2009) | 2 lines
branches/5.1: row_sel_get_clust_rec_for_mysql(): Store the cursor position
also for unlock_row(). (Bug #39320)
------------------------------------------------------------------------
r4400 | marko | 2009-03-12 10:06:44 +0200 (Thu, 12 Mar 2009) | 5 lines
branches/5.1: Fix a bug in multi-table semi-consistent reads.
Remember the acquired record locks per table handle (row_prebuilt_t)
rather than per transaction (trx_t), so that unlock_row should successfully
unlock all non-matching rows in multi-table operations.
This deficiency was found while investigating Bug #39320.
------------------------------------------------------------------------
These were submitted as rb://94 and rb://96 and approved by Heikki Tuuri.
------------------------------------------------------------------------
r4455 | marko | 2009-03-16 11:43:34 +0200 (Mon, 16 Mar 2009) | 2 lines
branches/zip: UT_LIST_VALIDATE(): Add the parameter ASSERTION and
adjust all callers.
------------------------------------------------------------------------
r4456 | marko | 2009-03-16 12:59:25 +0200 (Mon, 16 Mar 2009) | 6 lines
branches/zip: UT_LIST_VALIDATE(): Assert that the link is non-NULL
before dereferencing it. In this way, ut_list_node_313 will be
pointing to the last non-NULL list item at the time of the assertion
failure. (gcc-4.3.2 -O3 seems to optimize the common subexpressions
and make the variable NULL, though.)
------------------------------------------------------------------------
r4457 | marko | 2009-03-16 14:12:02 +0200 (Mon, 16 Mar 2009) | 2 lines
branches/zip: sync_thread_add_level(): Make the assertions about
level == SYNC_BUF_BLOCK more readable.
------------------------------------------------------------------------
r4461 | vasil | 2009-03-17 09:38:19 +0200 (Tue, 17 Mar 2009) | 6 lines
branches/zip:
Remove mysql-test/patches/bug32625.diff because that bug was fixed in
the mysql repository (1 year and 4 months after sending them the simple
patch!). See http://bugs.mysql.com/32625
------------------------------------------------------------------------
r4465 | marko | 2009-03-17 12:34:19 +0200 (Tue, 17 Mar 2009) | 1 line
branches/zip: buf0buddy.c: Add and adjust some debug assertions.
------------------------------------------------------------------------
r4473 | vasil | 2009-03-17 15:50:30 +0200 (Tue, 17 Mar 2009) | 5 lines
branches/zip:
Increment the InnoDB Plugin version from 1.0.3 to 1.0.4 now that
1.0.3 has been released.
------------------------------------------------------------------------
r4478 | vasil | 2009-03-18 11:53:53 +0200 (Wed, 18 Mar 2009) | 5 lines
branches/zip:
Remove mysql-test/patches/bug41893.diff because that bug has been fixed
in the MySQL repository, see http://bugs.mysql.com/41893.
------------------------------------------------------------------------
r4479 | marko | 2009-03-18 12:43:54 +0200 (Wed, 18 Mar 2009) | 2 lines
branches/zip: buf_LRU_block_remove_hashed_page(): Add some debug assertions.
------------------------------------------------------------------------
r4480 | marko | 2009-03-18 14:32:13 +0200 (Wed, 18 Mar 2009) | 1 line
branches/zip: buf_buddy_free_low(): Correct the function comment.
------------------------------------------------------------------------
r4482 | marko | 2009-03-19 15:23:32 +0200 (Thu, 19 Mar 2009) | 12 lines
branches/zip: Merge revisions 4400:4481 from branches/5.1:
------------------------------------------------------------------------
r4481 | marko | 2009-03-19 15:01:48 +0200 (Thu, 19 Mar 2009) | 6 lines
branches/5.1: row_unlock_for_mysql(): Do not unlock records that were
modified by the current transaction. This bug was introduced or unmasked
in r4400.
rb://97 approved by Heikki Tuuri
------------------------------------------------------------------------
------------------------------------------------------------------------
r4490 | marko | 2009-03-20 12:33:33 +0200 (Fri, 20 Mar 2009) | 4 lines
branches/zip: Non-functional change for reducing dependencies in InnoDB Hot Backup:
Replace srv_sys->dummy_ind1 and srv_sys->dummy_ind2 with
dict_ind_redundant and dict_ind_compact, initialized in dict_init().
------------------------------------------------------------------------
r4491 | marko | 2009-03-20 12:45:18 +0200 (Fri, 20 Mar 2009) | 2 lines
branches/zip: Add const qualifiers or in/out comments to some function
parameters in log0log.
------------------------------------------------------------------------
r4492 | marko | 2009-03-20 12:52:14 +0200 (Fri, 20 Mar 2009) | 5 lines
branches/zip: page_validate(): Always report the space id and the
name of the index.
In Hot Backup, do not invoke comparison functions, as MySQL collations
will be unavailable.
------------------------------------------------------------------------
r4493 | marko | 2009-03-20 13:24:06 +0200 (Fri, 20 Mar 2009) | 1 line
branches/zip: Replace fil_get_space_for_id_low() with fil_space_get_by_id().
------------------------------------------------------------------------
r4494 | marko | 2009-03-20 13:51:35 +0200 (Fri, 20 Mar 2009) | 3 lines
branches/zip: fil0fil.c: Refer to fil_system directly, not via local vars.
This eliminates some "unused variable" warnings when building
InnoDB Hot Backup in such a way that all mutex operations are no-ops.
------------------------------------------------------------------------
r4495 | marko | 2009-03-20 14:15:52 +0200 (Fri, 20 Mar 2009) | 1 line
branches/zip: innobase_get_at_most_n_mbchars(): Declare in ha_prototypes.h.
------------------------------------------------------------------------
r4496 | marko | 2009-03-20 14:48:26 +0200 (Fri, 20 Mar 2009) | 1 line
branches/zip: recv_recover_page(): Remove compile-time constant parameters.
------------------------------------------------------------------------
r4497 | marko | 2009-03-20 14:56:19 +0200 (Fri, 20 Mar 2009) | 1 line
branches/zip: recv_sys_init(): Remove a compile-time constant parameter.
------------------------------------------------------------------------
r4498 | marko | 2009-03-20 15:08:05 +0200 (Fri, 20 Mar 2009) | 4 lines
branches/zip: Non-functional change: Add const qualifiers.
log_block_checksum_is_ok_or_old_format(), recv_sys_add_to_parsing_buf():
The log block is read-only. Make it const.
------------------------------------------------------------------------
r4499 | marko | 2009-03-20 15:10:25 +0200 (Fri, 20 Mar 2009) | 1 line
branches/zip: recv_scan_log_recs(): Remove a compile-time constant parameter.
------------------------------------------------------------------------
r4500 | marko | 2009-03-20 15:47:17 +0200 (Fri, 20 Mar 2009) | 1 line
branches/zip: fil_init(): Add the parameter hash_size.
------------------------------------------------------------------------
r4501 | vasil | 2009-03-20 16:50:41 +0200 (Fri, 20 Mar 2009) | 4 lines
branches/zip:
Add any entry about the release of 1.0.3 in the ChangeLog.
------------------------------------------------------------------------
r4515 | marko | 2009-03-23 10:49:53 +0200 (Mon, 23 Mar 2009) | 1 line
branches/zip: hash_table_t: adaptive: Remove from UNIV_HOTBACKUP builds.
------------------------------------------------------------------------
r4516 | marko | 2009-03-23 10:57:16 +0200 (Mon, 23 Mar 2009) | 2 lines
branches/zip: Define and use ASSERT_HASH_MUTEX_OWN.
Make it a no-op in UNIV_HOTBACKUP builds.
------------------------------------------------------------------------
r4517 | marko | 2009-03-23 11:07:20 +0200 (Mon, 23 Mar 2009) | 2 lines
branches/zip: Define and use PAGE_ZIP_MATCH.
In UNIV_HOTBACKUP builds, assume fixed allocation.
------------------------------------------------------------------------
r4521 | marko | 2009-03-23 12:05:47 +0200 (Mon, 23 Mar 2009) | 1 line
branches/zip: buf_page_print(): Clean up the code #ifdef UNIV_HOTBACKUP.
------------------------------------------------------------------------
r4522 | marko | 2009-03-23 12:20:50 +0200 (Mon, 23 Mar 2009) | 2 lines
branches/zip: Exclude some operating system interface code
from UNIV_HOTBACKUP builds.
------------------------------------------------------------------------
r4523 | marko | 2009-03-23 13:00:43 +0200 (Mon, 23 Mar 2009) | 2 lines
branches/zip: Remove the remaining references to hash_table_t::adapive
from UNIV_HOTBACKUP builds. This should have been done in r4515.
------------------------------------------------------------------------
r4524 | marko | 2009-03-23 14:05:18 +0200 (Mon, 23 Mar 2009) | 2 lines
branches/zip: Enclose recv_recovery_from_backup_on and
recv_recovery_from_backup_is_on() in #ifdef UNIV_LOG_ARCHIVE.
------------------------------------------------------------------------
r4525 | marko | 2009-03-23 14:57:45 +0200 (Mon, 23 Mar 2009) | 2 lines
branches/zip: recv_parse_or_apply_log_rec_body(): Add debug assertions
ensuring that FIL_PAGE_TYPE makes sense when applying log records.
------------------------------------------------------------------------
r4526 | marko | 2009-03-23 16:21:34 +0200 (Mon, 23 Mar 2009) | 2 lines
branches/zip: Remove unneeded definitions and dependencies
from UNIV_HOTBACKUP builds.
------------------------------------------------------------------------
r4527 | calvin | 2009-03-23 23:15:33 +0200 (Mon, 23 Mar 2009) | 5 lines
branches/zip: adjust build files on Windows
Adjust the patch positions based on the latest MySQL source.
Also add the patches to the .bat files for vs9.
------------------------------------------------------------------------
2009-03-24 08:32:21 +00:00
|
|
|
|
|
|
|
*****************************************************************************/
|
|
|
|
|
branches/innodb+: Merge revisions 5091:5143 from branches/zip:
------------------------------------------------------------------------
r5092 | marko | 2009-05-25 09:54:17 +0300 (Mon, 25 May 2009) | 1 line
branches/zip: Adjust some function comments after r5091.
------------------------------------------------------------------------
r5100 | marko | 2009-05-25 12:09:45 +0300 (Mon, 25 May 2009) | 1 line
branches/zip: Split some long lines that were introduced in r5091.
------------------------------------------------------------------------
r5101 | marko | 2009-05-25 12:42:47 +0300 (Mon, 25 May 2009) | 2 lines
branches/zip: Introduce the macro TEMP_INDEX_PREFIX_STR.
This is to avoid triggering an error in Doxygen.
------------------------------------------------------------------------
r5102 | marko | 2009-05-25 13:47:14 +0300 (Mon, 25 May 2009) | 1 line
branches/zip: Add missing file comments.
------------------------------------------------------------------------
r5103 | marko | 2009-05-25 13:52:29 +0300 (Mon, 25 May 2009) | 10 lines
branches/zip: Add @file comments, and convert decorative
/*********************************
comments to Doxygen /** style like this:
/*****************************//**
This conversion was performed by the following command:
perl -i -e 'while(<ARGV>){if (m|^/\*{30}\**$|) {
s|\*{4}$|//**| if ++$com>1; $_ .= "\@file $ARGV\n" if $com==2}
print; if(eof){$.=0;undef $com}}' */*[ch] include/univ.i
------------------------------------------------------------------------
r5104 | marko | 2009-05-25 14:39:07 +0300 (Mon, 25 May 2009) | 2 lines
branches/zip: Revert ut0auxconf_* to r5102,
that is, make Doxygen ignore these test programs.
------------------------------------------------------------------------
r5105 | marko | 2009-05-25 14:52:20 +0300 (Mon, 25 May 2009) | 2 lines
branches/zip: Enclose some #error checks inside #ifndef DOXYGEN
to prevent bogus Doxygen errors.
------------------------------------------------------------------------
r5106 | marko | 2009-05-25 16:09:24 +0300 (Mon, 25 May 2009) | 2 lines
branches/zip: Add some Doxygen comments, mainly to structs, typedefs,
macros and global variables. Many more to go.
------------------------------------------------------------------------
r5108 | marko | 2009-05-26 00:32:35 +0300 (Tue, 26 May 2009) | 2 lines
branches/zip: lexyy.c: Remove the inadvertently added @file directive.
There is nothing for Doxygen to see in this file, move along.
------------------------------------------------------------------------
r5125 | marko | 2009-05-26 16:28:49 +0300 (Tue, 26 May 2009) | 3 lines
branches/zip: Add some Doxygen comments for many structs, typedefs,
#defines and global variables. Many are still missing.
------------------------------------------------------------------------
r5134 | marko | 2009-05-27 09:08:43 +0300 (Wed, 27 May 2009) | 1 line
branches/zip: Add some Doxygen @return comments.
------------------------------------------------------------------------
r5139 | marko | 2009-05-27 10:01:40 +0300 (Wed, 27 May 2009) | 1 line
branches/zip: Add Doxyfile.
------------------------------------------------------------------------
r5143 | marko | 2009-05-27 10:57:25 +0300 (Wed, 27 May 2009) | 3 lines
branches/zip: buf0buf.h, Doxyfile: Fix the Doxygen translation.
@defgroup is for source code modules, not for field groups.
Tell Doxygen to expand the UT_LIST declarations.
------------------------------------------------------------------------
2009-05-27 09:52:16 +00:00
|
|
|
/**************************************************//**
|
2012-08-01 17:27:34 +03:00
|
|
|
@file dict/dict0boot.cc
|
2005-10-27 07:29:40 +00:00
|
|
|
Data dictionary creation and booting
|
|
|
|
|
|
|
|
Created 4/18/1996 Heikki Tuuri
|
|
|
|
*******************************************************/
|
|
|
|
|
|
|
|
#include "dict0boot.h"
|
|
|
|
#include "dict0crea.h"
|
|
|
|
#include "btr0btr.h"
|
|
|
|
#include "dict0load.h"
|
|
|
|
#include "trx0trx.h"
|
|
|
|
#include "srv0srv.h"
|
|
|
|
#include "buf0flu.h"
|
|
|
|
#include "log0recv.h"
|
|
|
|
#include "os0file.h"
|
|
|
|
|
MDEV-13542: Crashing on corrupted page is unhelpful
The approach to handling corruption that was chosen by Oracle in
commit 177d8b0c125b841c0650d27d735e3b87509dc286
is not really useful. Not only did it actually fail to prevent InnoDB
from crashing, but it is making things worse by blocking attempts to
rescue data from or rebuild a partially readable table.
We will try to prevent crashes in a different way: by propagating
errors up the call stack. We will never mark the clustered index
persistently corrupted, so that data recovery may be attempted by
reading from the table, or by rebuilding the table.
This should also fix MDEV-13680 (crash on btr_page_alloc() failure);
it was extensively tested with innodb_file_per_table=0 and a
non-autoextend system tablespace.
We should now avoid crashes in many cases, such as when a page
cannot be read or allocated, or an inconsistency is detected when
attempting to update multiple pages. We will not crash on double-free,
such as on the recovery of DDL in system tablespace in case something
was corrupted.
Crashes on corrupted data are still possible. The fault injection mechanism
that is introduced in the subsequent commit may help catch more of them.
buf_page_import_corrupt_failure: Remove the fault injection, and instead
corrupt some pages using Perl code in the tests.
btr_cur_pessimistic_insert(): Always reserve extents (except for the
change buffer), in order to prevent a subsequent allocation failure.
btr_pcur_open_at_rnd_pos(): Merged to the only caller ibuf_merge_pages().
btr_assert_not_corrupted(), btr_corruption_report(): Remove.
Similar checks are already part of btr_block_get().
FSEG_MAGIC_N_BYTES: Replaces FSEG_MAGIC_N_VALUE.
dict_hdr_get(), trx_rsegf_get_new(), trx_undo_page_get(),
trx_undo_page_get_s_latched(): Replaced with error-checking calls.
trx_rseg_t::get(mtr_t*): Replaces trx_rsegf_get().
trx_rseg_header_create(): Let the caller update the TRX_SYS page if needed.
trx_sys_create_sys_pages(): Merged with trx_sysf_create().
dict_check_tablespaces_and_store_max_id(): Do not access
DICT_HDR_MAX_SPACE_ID, because it was already recovered in dict_boot().
Merge dict_check_sys_tables() with this function.
dir_pathname(): Replaces os_file_make_new_pathname().
row_undo_ins_remove_sec(): Do not modify the undo page by adding
a terminating NUL byte to the record.
btr_decryption_failed(): Report decryption failures
dict_set_corrupted_by_space(), dict_set_encrypted_by_space(),
dict_set_corrupted_index_cache_only(): Remove.
dict_set_corrupted(): Remove the constant parameter dict_locked=false.
Never flag the clustered index corrupted in SYS_INDEXES, because
that would deny further access to the table. It might be possible to
repair the table by executing ALTER TABLE or OPTIMIZE TABLE, in case
no B-tree leaf page is corrupted.
dict_table_skip_corrupt_index(), dict_table_next_uncorrupted_index(),
row_purge_skip_uncommitted_virtual_index(): Remove, and refactor
the callers to read dict_index_t::type only once.
dict_table_is_corrupted(): Remove.
dict_index_t::is_btree(): Determine if the index is a valid B-tree.
BUF_GET_NO_LATCH, BUF_EVICT_IF_IN_POOL: Remove.
UNIV_BTR_DEBUG: Remove. Any inconsistency will no longer trigger
assertion failures, but error codes being returned.
buf_corrupt_page_release(): Replaced with a direct call to
buf_pool.corrupted_evict().
fil_invalid_page_access_msg(): Never crash on an invalid read;
let the caller of buf_page_get_gen() decide.
btr_pcur_t::restore_position(): Propagate failure status to the caller
by returning CORRUPTED.
opt_search_plan_for_table(): Simplify the code.
row_purge_del_mark(), row_purge_upd_exist_or_extern_func(),
row_undo_ins_remove_sec_rec(), row_undo_mod_upd_del_sec(),
row_undo_mod_del_mark_sec(): Avoid mem_heap_create()/mem_heap_free()
when no secondary indexes exist.
row_undo_mod_upd_exist_sec(): Simplify the code.
row_upd_clust_step(), dict_load_table_one(): Return DB_TABLE_CORRUPT
if the clustered index (and therefore the table) is corrupted, similar
to what we do in row_insert_for_mysql().
fut_get_ptr(): Replace with buf_page_get_gen() calls.
buf_page_get_gen(): Return nullptr and *err=DB_CORRUPTION
if the page is marked as freed. For other modes than
BUF_GET_POSSIBLY_FREED or BUF_PEEK_IF_IN_POOL this will
trigger a debug assertion failure. For BUF_GET_POSSIBLY_FREED,
we will return nullptr for freed pages, so that the callers
can be simplified. The purge of transaction history will be
a new user of BUF_GET_POSSIBLY_FREED, to avoid crashes on
corrupted data.
buf_page_get_low(): Never crash on a corrupted page, but simply
return nullptr.
fseg_page_is_allocated(): Replaces fseg_page_is_free().
fts_drop_common_tables(): Return an error if the transaction
was rolled back.
fil_space_t::set_corrupted(): Report a tablespace as corrupted if
it was not reported already.
fil_space_t::io(): Invoke fil_space_t::set_corrupted() to report
out-of-bounds page access or other errors.
Clean up mtr_t::page_lock()
buf_page_get_low(): Validate the page identifier (to check for
recently read corrupted pages) after acquiring the page latch.
buf_page_t::read_complete(): Flag uninitialized (all-zero) pages
with DB_FAIL. Return DB_PAGE_CORRUPTED on page number mismatch.
mtr_t::defer_drop_ahi(): Renamed from mtr_defer_drop_ahi().
recv_sys_t::free_corrupted_page(): Only set_corrupt_fs()
if any log records exist for the page. We do not mind if read-ahead
produces corrupted (or all-zero) pages that were not actually needed
during recovery.
recv_recover_page(): Return whether the operation succeeded.
recv_sys_t::recover_low(): Simplify the logic. Check for recovery error.
Thanks to Matthias Leich for testing this extensively and to the
authors of https://rr-project.org for making it easy to diagnose
and fix any failures that were found during the testing.
2022-06-06 14:03:22 +03:00
|
|
|
/** The DICT_HDR page identifier */
|
|
|
|
static constexpr page_id_t hdr_page_id{DICT_HDR_SPACE, DICT_HDR_PAGE_NO};
|
|
|
|
|
|
|
|
/** @return the DICT_HDR block, x-latched */
|
|
|
|
static buf_block_t *dict_hdr_get(mtr_t *mtr)
|
|
|
|
{
|
|
|
|
/* We assume that the DICT_HDR page is always readable and available. */
|
MDEV-32068 Some calls to buf_read_ahead_linear() seem to be useless
The linear read-ahead (enabled by nonzero innodb_read_ahead_threshold)
works best if index leaf pages or undo log pages have been allocated
on adjacent page numbers. The read-ahead is assumed not to be helpful
in other types of page accesses, such as non-leaf index pages.
buf_page_get_low(): Do not invoke buf_page_t::set_accessed(),
buf_page_make_young_if_needed(), or buf_read_ahead_linear().
We will invoke them in those callers of buf_page_get_gen() or
buf_page_get() where it makes sense: the access is not
one-time-on-startup and the page and not going to be freed soon.
btr_copy_blob_prefix(), btr_pcur_move_to_next_page(),
trx_undo_get_prev_rec_from_prev_page(),
trx_undo_get_first_rec(), btr_cur_t::search_leaf(),
btr_cur_t::open_leaf(): Invoke buf_read_ahead_linear().
We will not invoke linear read-ahead in functions that would
essentially allocate or free pages, because pages that are
freshly allocated are expected to be initialized by buf_page_create()
and not read from the data file. Likewise, freeing pages should
not involve accessing any sibling pages, except for freeing
singly-linked lists of BLOB pages.
We will not invoke read-ahead in btr_cur_t::pessimistic_search_leaf()
or in a pessimistic operation of btr_cur_t::open_leaf(), because
it is assumed that pessimistic operations should be preceded by
optimistic operations, which should already have invoked read-ahead.
buf_page_make_young_if_needed(): Invoke also buf_page_t::set_accessed()
and return the result.
btr_cur_nonleaf_make_young(): Like buf_page_make_young_if_needed(),
but do not invoke buf_page_t::set_accessed().
Reviewed by: Vladislav Lesin
Tested by: Matthias Leich
2023-12-05 12:31:29 +02:00
|
|
|
buf_block_t *b=
|
|
|
|
buf_page_get_gen(hdr_page_id, 0, RW_X_LATCH, nullptr, BUF_GET, mtr);
|
|
|
|
buf_page_make_young_if_needed(&b->page);
|
|
|
|
return b;
|
MDEV-13542: Crashing on corrupted page is unhelpful
The approach to handling corruption that was chosen by Oracle in
commit 177d8b0c125b841c0650d27d735e3b87509dc286
is not really useful. Not only did it actually fail to prevent InnoDB
from crashing, but it is making things worse by blocking attempts to
rescue data from or rebuild a partially readable table.
We will try to prevent crashes in a different way: by propagating
errors up the call stack. We will never mark the clustered index
persistently corrupted, so that data recovery may be attempted by
reading from the table, or by rebuilding the table.
This should also fix MDEV-13680 (crash on btr_page_alloc() failure);
it was extensively tested with innodb_file_per_table=0 and a
non-autoextend system tablespace.
We should now avoid crashes in many cases, such as when a page
cannot be read or allocated, or an inconsistency is detected when
attempting to update multiple pages. We will not crash on double-free,
such as on the recovery of DDL in system tablespace in case something
was corrupted.
Crashes on corrupted data are still possible. The fault injection mechanism
that is introduced in the subsequent commit may help catch more of them.
buf_page_import_corrupt_failure: Remove the fault injection, and instead
corrupt some pages using Perl code in the tests.
btr_cur_pessimistic_insert(): Always reserve extents (except for the
change buffer), in order to prevent a subsequent allocation failure.
btr_pcur_open_at_rnd_pos(): Merged to the only caller ibuf_merge_pages().
btr_assert_not_corrupted(), btr_corruption_report(): Remove.
Similar checks are already part of btr_block_get().
FSEG_MAGIC_N_BYTES: Replaces FSEG_MAGIC_N_VALUE.
dict_hdr_get(), trx_rsegf_get_new(), trx_undo_page_get(),
trx_undo_page_get_s_latched(): Replaced with error-checking calls.
trx_rseg_t::get(mtr_t*): Replaces trx_rsegf_get().
trx_rseg_header_create(): Let the caller update the TRX_SYS page if needed.
trx_sys_create_sys_pages(): Merged with trx_sysf_create().
dict_check_tablespaces_and_store_max_id(): Do not access
DICT_HDR_MAX_SPACE_ID, because it was already recovered in dict_boot().
Merge dict_check_sys_tables() with this function.
dir_pathname(): Replaces os_file_make_new_pathname().
row_undo_ins_remove_sec(): Do not modify the undo page by adding
a terminating NUL byte to the record.
btr_decryption_failed(): Report decryption failures
dict_set_corrupted_by_space(), dict_set_encrypted_by_space(),
dict_set_corrupted_index_cache_only(): Remove.
dict_set_corrupted(): Remove the constant parameter dict_locked=false.
Never flag the clustered index corrupted in SYS_INDEXES, because
that would deny further access to the table. It might be possible to
repair the table by executing ALTER TABLE or OPTIMIZE TABLE, in case
no B-tree leaf page is corrupted.
dict_table_skip_corrupt_index(), dict_table_next_uncorrupted_index(),
row_purge_skip_uncommitted_virtual_index(): Remove, and refactor
the callers to read dict_index_t::type only once.
dict_table_is_corrupted(): Remove.
dict_index_t::is_btree(): Determine if the index is a valid B-tree.
BUF_GET_NO_LATCH, BUF_EVICT_IF_IN_POOL: Remove.
UNIV_BTR_DEBUG: Remove. Any inconsistency will no longer trigger
assertion failures, but error codes being returned.
buf_corrupt_page_release(): Replaced with a direct call to
buf_pool.corrupted_evict().
fil_invalid_page_access_msg(): Never crash on an invalid read;
let the caller of buf_page_get_gen() decide.
btr_pcur_t::restore_position(): Propagate failure status to the caller
by returning CORRUPTED.
opt_search_plan_for_table(): Simplify the code.
row_purge_del_mark(), row_purge_upd_exist_or_extern_func(),
row_undo_ins_remove_sec_rec(), row_undo_mod_upd_del_sec(),
row_undo_mod_del_mark_sec(): Avoid mem_heap_create()/mem_heap_free()
when no secondary indexes exist.
row_undo_mod_upd_exist_sec(): Simplify the code.
row_upd_clust_step(), dict_load_table_one(): Return DB_TABLE_CORRUPT
if the clustered index (and therefore the table) is corrupted, similar
to what we do in row_insert_for_mysql().
fut_get_ptr(): Replace with buf_page_get_gen() calls.
buf_page_get_gen(): Return nullptr and *err=DB_CORRUPTION
if the page is marked as freed. For other modes than
BUF_GET_POSSIBLY_FREED or BUF_PEEK_IF_IN_POOL this will
trigger a debug assertion failure. For BUF_GET_POSSIBLY_FREED,
we will return nullptr for freed pages, so that the callers
can be simplified. The purge of transaction history will be
a new user of BUF_GET_POSSIBLY_FREED, to avoid crashes on
corrupted data.
buf_page_get_low(): Never crash on a corrupted page, but simply
return nullptr.
fseg_page_is_allocated(): Replaces fseg_page_is_free().
fts_drop_common_tables(): Return an error if the transaction
was rolled back.
fil_space_t::set_corrupted(): Report a tablespace as corrupted if
it was not reported already.
fil_space_t::io(): Invoke fil_space_t::set_corrupted() to report
out-of-bounds page access or other errors.
Clean up mtr_t::page_lock()
buf_page_get_low(): Validate the page identifier (to check for
recently read corrupted pages) after acquiring the page latch.
buf_page_t::read_complete(): Flag uninitialized (all-zero) pages
with DB_FAIL. Return DB_PAGE_CORRUPTED on page number mismatch.
mtr_t::defer_drop_ahi(): Renamed from mtr_defer_drop_ahi().
recv_sys_t::free_corrupted_page(): Only set_corrupt_fs()
if any log records exist for the page. We do not mind if read-ahead
produces corrupted (or all-zero) pages that were not actually needed
during recovery.
recv_recover_page(): Return whether the operation succeeded.
recv_sys_t::recover_low(): Simplify the logic. Check for recovery error.
Thanks to Matthias Leich for testing this extensively and to the
authors of https://rr-project.org for making it easy to diagnose
and fix any failures that were found during the testing.
2022-06-06 14:03:22 +03:00
|
|
|
}
|
|
|
|
|
branches/innodb+: Merge revisions 5091:5143 from branches/zip:
------------------------------------------------------------------------
r5092 | marko | 2009-05-25 09:54:17 +0300 (Mon, 25 May 2009) | 1 line
branches/zip: Adjust some function comments after r5091.
------------------------------------------------------------------------
r5100 | marko | 2009-05-25 12:09:45 +0300 (Mon, 25 May 2009) | 1 line
branches/zip: Split some long lines that were introduced in r5091.
------------------------------------------------------------------------
r5101 | marko | 2009-05-25 12:42:47 +0300 (Mon, 25 May 2009) | 2 lines
branches/zip: Introduce the macro TEMP_INDEX_PREFIX_STR.
This is to avoid triggering an error in Doxygen.
------------------------------------------------------------------------
r5102 | marko | 2009-05-25 13:47:14 +0300 (Mon, 25 May 2009) | 1 line
branches/zip: Add missing file comments.
------------------------------------------------------------------------
r5103 | marko | 2009-05-25 13:52:29 +0300 (Mon, 25 May 2009) | 10 lines
branches/zip: Add @file comments, and convert decorative
/*********************************
comments to Doxygen /** style like this:
/*****************************//**
This conversion was performed by the following command:
perl -i -e 'while(<ARGV>){if (m|^/\*{30}\**$|) {
s|\*{4}$|//**| if ++$com>1; $_ .= "\@file $ARGV\n" if $com==2}
print; if(eof){$.=0;undef $com}}' */*[ch] include/univ.i
------------------------------------------------------------------------
r5104 | marko | 2009-05-25 14:39:07 +0300 (Mon, 25 May 2009) | 2 lines
branches/zip: Revert ut0auxconf_* to r5102,
that is, make Doxygen ignore these test programs.
------------------------------------------------------------------------
r5105 | marko | 2009-05-25 14:52:20 +0300 (Mon, 25 May 2009) | 2 lines
branches/zip: Enclose some #error checks inside #ifndef DOXYGEN
to prevent bogus Doxygen errors.
------------------------------------------------------------------------
r5106 | marko | 2009-05-25 16:09:24 +0300 (Mon, 25 May 2009) | 2 lines
branches/zip: Add some Doxygen comments, mainly to structs, typedefs,
macros and global variables. Many more to go.
------------------------------------------------------------------------
r5108 | marko | 2009-05-26 00:32:35 +0300 (Tue, 26 May 2009) | 2 lines
branches/zip: lexyy.c: Remove the inadvertently added @file directive.
There is nothing for Doxygen to see in this file, move along.
------------------------------------------------------------------------
r5125 | marko | 2009-05-26 16:28:49 +0300 (Tue, 26 May 2009) | 3 lines
branches/zip: Add some Doxygen comments for many structs, typedefs,
#defines and global variables. Many are still missing.
------------------------------------------------------------------------
r5134 | marko | 2009-05-27 09:08:43 +0300 (Wed, 27 May 2009) | 1 line
branches/zip: Add some Doxygen @return comments.
------------------------------------------------------------------------
r5139 | marko | 2009-05-27 10:01:40 +0300 (Wed, 27 May 2009) | 1 line
branches/zip: Add Doxyfile.
------------------------------------------------------------------------
r5143 | marko | 2009-05-27 10:57:25 +0300 (Wed, 27 May 2009) | 3 lines
branches/zip: buf0buf.h, Doxyfile: Fix the Doxygen translation.
@defgroup is for source code modules, not for field groups.
Tell Doxygen to expand the UT_LIST declarations.
------------------------------------------------------------------------
2009-05-27 09:52:16 +00:00
|
|
|
/**********************************************************************//**
|
2010-05-24 14:45:24 +03:00
|
|
|
Returns a new table, index, or space id. */
|
|
|
|
void
|
2005-10-27 07:29:40 +00:00
|
|
|
dict_hdr_get_new_id(
|
|
|
|
/*================*/
|
2016-08-12 11:17:45 +03:00
|
|
|
table_id_t* table_id, /*!< out: table id
|
|
|
|
(not assigned if NULL) */
|
|
|
|
index_id_t* index_id, /*!< out: index id
|
|
|
|
(not assigned if NULL) */
|
2021-07-22 11:22:47 +03:00
|
|
|
uint32_t* space_id) /*!< out: space id
|
2016-08-12 11:17:45 +03:00
|
|
|
(not assigned if NULL) */
|
2005-10-27 07:29:40 +00:00
|
|
|
{
|
2010-06-23 14:06:59 +03:00
|
|
|
ib_id_t id;
|
2005-10-27 07:29:40 +00:00
|
|
|
mtr_t mtr;
|
|
|
|
|
2019-12-03 10:19:45 +02:00
|
|
|
mtr.start();
|
|
|
|
buf_block_t* dict_hdr = dict_hdr_get(&mtr);
|
2005-10-27 07:29:40 +00:00
|
|
|
|
2010-05-24 14:45:24 +03:00
|
|
|
if (table_id) {
|
2019-12-03 10:19:45 +02:00
|
|
|
id = mach_read_from_8(DICT_HDR + DICT_HDR_TABLE_ID
|
MDEV-27058: Reduce the size of buf_block_t and buf_page_t
buf_page_t::frame: Moved from buf_block_t::frame.
All 'thin' buf_page_t describing compressed-only ROW_FORMAT=COMPRESSED
pages will have frame=nullptr, while all 'fat' buf_block_t
will have a non-null frame pointing to aligned innodb_page_size bytes.
This eliminates the need for separate states for
BUF_BLOCK_FILE_PAGE and BUF_BLOCK_ZIP_PAGE.
buf_page_t::lock: Moved from buf_block_t::lock. That is, all block
descriptors will have a page latch. The IO_PIN state that was used
for discarding or creating the uncompressed page frame of a
ROW_FORMAT=COMPRESSED block is replaced by a combination of read-fix
and page X-latch.
page_zip_des_t::fix: Replaces state_, buf_fix_count_, io_fix_, status
of buf_page_t with a single std::atomic<uint32_t>. All modifications
will use store(), fetch_add(), fetch_sub(). This space was previously
wasted to alignment on 64-bit systems. We will use the following encoding
that combines a state (partly read-fix or write-fix) and a buffer-fix
count:
buf_page_t::NOT_USED=0 (previously BUF_BLOCK_NOT_USED)
buf_page_t::MEMORY=1 (previously BUF_BLOCK_MEMORY)
buf_page_t::REMOVE_HASH=2 (previously BUF_BLOCK_REMOVE_HASH)
buf_page_t::FREED=3 + fix: pages marked as freed in the file
buf_page_t::UNFIXED=1U<<29 + fix: normal pages
buf_page_t::IBUF_EXIST=2U<<29 + fix: normal pages; may need ibuf merge
buf_page_t::REINIT=3U<<29 + fix: reinitialized pages (skip doublewrite)
buf_page_t::READ_FIX=4U<<29 + fix: read-fixed pages (also X-latched)
buf_page_t::WRITE_FIX=5U<<29 + fix: write-fixed pages (also U-latched)
buf_page_t::WRITE_FIX_IBUF=6U<<29 + fix: write-fixed; may have ibuf
buf_page_t::WRITE_FIX_REINIT=7U<<29 + fix: write-fixed (no doublewrite)
buf_page_t::write_complete(): Change WRITE_FIX or WRITE_FIX_REINIT to
UNFIXED, and WRITE_FIX_IBUF to IBUF_EXIST, before releasing the U-latch.
buf_page_t::read_complete(): Renamed from buf_page_read_complete().
Change READ_FIX to UNFIXED or IBUF_EXIST, before releasing the X-latch.
buf_page_t::can_relocate(): If the page latch is being held or waited for,
or the block is buffer-fixed or io-fixed, return false. (The condition
on the page latch is new.)
Outside buf_page_get_gen(), buf_page_get_low() and buf_page_free(), we
will acquire the page latch before fix(), and unfix() before unlocking.
buf_page_t::flush(): Replaces buf_flush_page(). Optimize the
handling of FREED pages.
buf_pool_t::release_freed_page(): Assume that buf_pool.mutex is held
by the caller.
buf_page_t::is_read_fixed(), buf_page_t::is_write_fixed(): New predicates.
buf_page_get_low(): Ignore guesses that are read-fixed because they
may not yet be registered in buf_pool.page_hash and buf_pool.LRU.
buf_page_optimistic_get(): Acquire latch before buffer-fixing.
buf_page_make_young(): Leave read-fixed blocks alone, because they
might not be registered in buf_pool.LRU yet.
recv_sys_t::recover_deferred(), recv_sys_t::recover_low():
Possibly fix MDEV-26326, by holding a page X-latch instead of
only buffer-fixing the page.
2021-11-16 19:55:06 +02:00
|
|
|
+ dict_hdr->page.frame);
|
2010-06-23 14:06:59 +03:00
|
|
|
id++;
|
2019-12-03 10:19:45 +02:00
|
|
|
mtr.write<8>(*dict_hdr, DICT_HDR + DICT_HDR_TABLE_ID
|
MDEV-27058: Reduce the size of buf_block_t and buf_page_t
buf_page_t::frame: Moved from buf_block_t::frame.
All 'thin' buf_page_t describing compressed-only ROW_FORMAT=COMPRESSED
pages will have frame=nullptr, while all 'fat' buf_block_t
will have a non-null frame pointing to aligned innodb_page_size bytes.
This eliminates the need for separate states for
BUF_BLOCK_FILE_PAGE and BUF_BLOCK_ZIP_PAGE.
buf_page_t::lock: Moved from buf_block_t::lock. That is, all block
descriptors will have a page latch. The IO_PIN state that was used
for discarding or creating the uncompressed page frame of a
ROW_FORMAT=COMPRESSED block is replaced by a combination of read-fix
and page X-latch.
page_zip_des_t::fix: Replaces state_, buf_fix_count_, io_fix_, status
of buf_page_t with a single std::atomic<uint32_t>. All modifications
will use store(), fetch_add(), fetch_sub(). This space was previously
wasted to alignment on 64-bit systems. We will use the following encoding
that combines a state (partly read-fix or write-fix) and a buffer-fix
count:
buf_page_t::NOT_USED=0 (previously BUF_BLOCK_NOT_USED)
buf_page_t::MEMORY=1 (previously BUF_BLOCK_MEMORY)
buf_page_t::REMOVE_HASH=2 (previously BUF_BLOCK_REMOVE_HASH)
buf_page_t::FREED=3 + fix: pages marked as freed in the file
buf_page_t::UNFIXED=1U<<29 + fix: normal pages
buf_page_t::IBUF_EXIST=2U<<29 + fix: normal pages; may need ibuf merge
buf_page_t::REINIT=3U<<29 + fix: reinitialized pages (skip doublewrite)
buf_page_t::READ_FIX=4U<<29 + fix: read-fixed pages (also X-latched)
buf_page_t::WRITE_FIX=5U<<29 + fix: write-fixed pages (also U-latched)
buf_page_t::WRITE_FIX_IBUF=6U<<29 + fix: write-fixed; may have ibuf
buf_page_t::WRITE_FIX_REINIT=7U<<29 + fix: write-fixed (no doublewrite)
buf_page_t::write_complete(): Change WRITE_FIX or WRITE_FIX_REINIT to
UNFIXED, and WRITE_FIX_IBUF to IBUF_EXIST, before releasing the U-latch.
buf_page_t::read_complete(): Renamed from buf_page_read_complete().
Change READ_FIX to UNFIXED or IBUF_EXIST, before releasing the X-latch.
buf_page_t::can_relocate(): If the page latch is being held or waited for,
or the block is buffer-fixed or io-fixed, return false. (The condition
on the page latch is new.)
Outside buf_page_get_gen(), buf_page_get_low() and buf_page_free(), we
will acquire the page latch before fix(), and unfix() before unlocking.
buf_page_t::flush(): Replaces buf_flush_page(). Optimize the
handling of FREED pages.
buf_pool_t::release_freed_page(): Assume that buf_pool.mutex is held
by the caller.
buf_page_t::is_read_fixed(), buf_page_t::is_write_fixed(): New predicates.
buf_page_get_low(): Ignore guesses that are read-fixed because they
may not yet be registered in buf_pool.page_hash and buf_pool.LRU.
buf_page_optimistic_get(): Acquire latch before buffer-fixing.
buf_page_make_young(): Leave read-fixed blocks alone, because they
might not be registered in buf_pool.LRU yet.
recv_sys_t::recover_deferred(), recv_sys_t::recover_low():
Possibly fix MDEV-26326, by holding a page X-latch instead of
only buffer-fixing the page.
2021-11-16 19:55:06 +02:00
|
|
|
+ dict_hdr->page.frame, id);
|
2010-05-24 14:45:24 +03:00
|
|
|
*table_id = id;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (index_id) {
|
2019-12-03 10:19:45 +02:00
|
|
|
id = mach_read_from_8(DICT_HDR + DICT_HDR_INDEX_ID
|
MDEV-27058: Reduce the size of buf_block_t and buf_page_t
buf_page_t::frame: Moved from buf_block_t::frame.
All 'thin' buf_page_t describing compressed-only ROW_FORMAT=COMPRESSED
pages will have frame=nullptr, while all 'fat' buf_block_t
will have a non-null frame pointing to aligned innodb_page_size bytes.
This eliminates the need for separate states for
BUF_BLOCK_FILE_PAGE and BUF_BLOCK_ZIP_PAGE.
buf_page_t::lock: Moved from buf_block_t::lock. That is, all block
descriptors will have a page latch. The IO_PIN state that was used
for discarding or creating the uncompressed page frame of a
ROW_FORMAT=COMPRESSED block is replaced by a combination of read-fix
and page X-latch.
page_zip_des_t::fix: Replaces state_, buf_fix_count_, io_fix_, status
of buf_page_t with a single std::atomic<uint32_t>. All modifications
will use store(), fetch_add(), fetch_sub(). This space was previously
wasted to alignment on 64-bit systems. We will use the following encoding
that combines a state (partly read-fix or write-fix) and a buffer-fix
count:
buf_page_t::NOT_USED=0 (previously BUF_BLOCK_NOT_USED)
buf_page_t::MEMORY=1 (previously BUF_BLOCK_MEMORY)
buf_page_t::REMOVE_HASH=2 (previously BUF_BLOCK_REMOVE_HASH)
buf_page_t::FREED=3 + fix: pages marked as freed in the file
buf_page_t::UNFIXED=1U<<29 + fix: normal pages
buf_page_t::IBUF_EXIST=2U<<29 + fix: normal pages; may need ibuf merge
buf_page_t::REINIT=3U<<29 + fix: reinitialized pages (skip doublewrite)
buf_page_t::READ_FIX=4U<<29 + fix: read-fixed pages (also X-latched)
buf_page_t::WRITE_FIX=5U<<29 + fix: write-fixed pages (also U-latched)
buf_page_t::WRITE_FIX_IBUF=6U<<29 + fix: write-fixed; may have ibuf
buf_page_t::WRITE_FIX_REINIT=7U<<29 + fix: write-fixed (no doublewrite)
buf_page_t::write_complete(): Change WRITE_FIX or WRITE_FIX_REINIT to
UNFIXED, and WRITE_FIX_IBUF to IBUF_EXIST, before releasing the U-latch.
buf_page_t::read_complete(): Renamed from buf_page_read_complete().
Change READ_FIX to UNFIXED or IBUF_EXIST, before releasing the X-latch.
buf_page_t::can_relocate(): If the page latch is being held or waited for,
or the block is buffer-fixed or io-fixed, return false. (The condition
on the page latch is new.)
Outside buf_page_get_gen(), buf_page_get_low() and buf_page_free(), we
will acquire the page latch before fix(), and unfix() before unlocking.
buf_page_t::flush(): Replaces buf_flush_page(). Optimize the
handling of FREED pages.
buf_pool_t::release_freed_page(): Assume that buf_pool.mutex is held
by the caller.
buf_page_t::is_read_fixed(), buf_page_t::is_write_fixed(): New predicates.
buf_page_get_low(): Ignore guesses that are read-fixed because they
may not yet be registered in buf_pool.page_hash and buf_pool.LRU.
buf_page_optimistic_get(): Acquire latch before buffer-fixing.
buf_page_make_young(): Leave read-fixed blocks alone, because they
might not be registered in buf_pool.LRU yet.
recv_sys_t::recover_deferred(), recv_sys_t::recover_low():
Possibly fix MDEV-26326, by holding a page X-latch instead of
only buffer-fixing the page.
2021-11-16 19:55:06 +02:00
|
|
|
+ dict_hdr->page.frame);
|
2010-06-23 14:06:59 +03:00
|
|
|
id++;
|
2019-12-03 10:19:45 +02:00
|
|
|
mtr.write<8>(*dict_hdr, DICT_HDR + DICT_HDR_INDEX_ID
|
MDEV-27058: Reduce the size of buf_block_t and buf_page_t
buf_page_t::frame: Moved from buf_block_t::frame.
All 'thin' buf_page_t describing compressed-only ROW_FORMAT=COMPRESSED
pages will have frame=nullptr, while all 'fat' buf_block_t
will have a non-null frame pointing to aligned innodb_page_size bytes.
This eliminates the need for separate states for
BUF_BLOCK_FILE_PAGE and BUF_BLOCK_ZIP_PAGE.
buf_page_t::lock: Moved from buf_block_t::lock. That is, all block
descriptors will have a page latch. The IO_PIN state that was used
for discarding or creating the uncompressed page frame of a
ROW_FORMAT=COMPRESSED block is replaced by a combination of read-fix
and page X-latch.
page_zip_des_t::fix: Replaces state_, buf_fix_count_, io_fix_, status
of buf_page_t with a single std::atomic<uint32_t>. All modifications
will use store(), fetch_add(), fetch_sub(). This space was previously
wasted to alignment on 64-bit systems. We will use the following encoding
that combines a state (partly read-fix or write-fix) and a buffer-fix
count:
buf_page_t::NOT_USED=0 (previously BUF_BLOCK_NOT_USED)
buf_page_t::MEMORY=1 (previously BUF_BLOCK_MEMORY)
buf_page_t::REMOVE_HASH=2 (previously BUF_BLOCK_REMOVE_HASH)
buf_page_t::FREED=3 + fix: pages marked as freed in the file
buf_page_t::UNFIXED=1U<<29 + fix: normal pages
buf_page_t::IBUF_EXIST=2U<<29 + fix: normal pages; may need ibuf merge
buf_page_t::REINIT=3U<<29 + fix: reinitialized pages (skip doublewrite)
buf_page_t::READ_FIX=4U<<29 + fix: read-fixed pages (also X-latched)
buf_page_t::WRITE_FIX=5U<<29 + fix: write-fixed pages (also U-latched)
buf_page_t::WRITE_FIX_IBUF=6U<<29 + fix: write-fixed; may have ibuf
buf_page_t::WRITE_FIX_REINIT=7U<<29 + fix: write-fixed (no doublewrite)
buf_page_t::write_complete(): Change WRITE_FIX or WRITE_FIX_REINIT to
UNFIXED, and WRITE_FIX_IBUF to IBUF_EXIST, before releasing the U-latch.
buf_page_t::read_complete(): Renamed from buf_page_read_complete().
Change READ_FIX to UNFIXED or IBUF_EXIST, before releasing the X-latch.
buf_page_t::can_relocate(): If the page latch is being held or waited for,
or the block is buffer-fixed or io-fixed, return false. (The condition
on the page latch is new.)
Outside buf_page_get_gen(), buf_page_get_low() and buf_page_free(), we
will acquire the page latch before fix(), and unfix() before unlocking.
buf_page_t::flush(): Replaces buf_flush_page(). Optimize the
handling of FREED pages.
buf_pool_t::release_freed_page(): Assume that buf_pool.mutex is held
by the caller.
buf_page_t::is_read_fixed(), buf_page_t::is_write_fixed(): New predicates.
buf_page_get_low(): Ignore guesses that are read-fixed because they
may not yet be registered in buf_pool.page_hash and buf_pool.LRU.
buf_page_optimistic_get(): Acquire latch before buffer-fixing.
buf_page_make_young(): Leave read-fixed blocks alone, because they
might not be registered in buf_pool.LRU yet.
recv_sys_t::recover_deferred(), recv_sys_t::recover_low():
Possibly fix MDEV-26326, by holding a page X-latch instead of
only buffer-fixing the page.
2021-11-16 19:55:06 +02:00
|
|
|
+ dict_hdr->page.frame, id);
|
2010-05-24 14:45:24 +03:00
|
|
|
*index_id = id;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (space_id) {
|
2019-12-03 10:19:45 +02:00
|
|
|
*space_id = mach_read_from_4(DICT_HDR + DICT_HDR_MAX_SPACE_ID
|
MDEV-27058: Reduce the size of buf_block_t and buf_page_t
buf_page_t::frame: Moved from buf_block_t::frame.
All 'thin' buf_page_t describing compressed-only ROW_FORMAT=COMPRESSED
pages will have frame=nullptr, while all 'fat' buf_block_t
will have a non-null frame pointing to aligned innodb_page_size bytes.
This eliminates the need for separate states for
BUF_BLOCK_FILE_PAGE and BUF_BLOCK_ZIP_PAGE.
buf_page_t::lock: Moved from buf_block_t::lock. That is, all block
descriptors will have a page latch. The IO_PIN state that was used
for discarding or creating the uncompressed page frame of a
ROW_FORMAT=COMPRESSED block is replaced by a combination of read-fix
and page X-latch.
page_zip_des_t::fix: Replaces state_, buf_fix_count_, io_fix_, status
of buf_page_t with a single std::atomic<uint32_t>. All modifications
will use store(), fetch_add(), fetch_sub(). This space was previously
wasted to alignment on 64-bit systems. We will use the following encoding
that combines a state (partly read-fix or write-fix) and a buffer-fix
count:
buf_page_t::NOT_USED=0 (previously BUF_BLOCK_NOT_USED)
buf_page_t::MEMORY=1 (previously BUF_BLOCK_MEMORY)
buf_page_t::REMOVE_HASH=2 (previously BUF_BLOCK_REMOVE_HASH)
buf_page_t::FREED=3 + fix: pages marked as freed in the file
buf_page_t::UNFIXED=1U<<29 + fix: normal pages
buf_page_t::IBUF_EXIST=2U<<29 + fix: normal pages; may need ibuf merge
buf_page_t::REINIT=3U<<29 + fix: reinitialized pages (skip doublewrite)
buf_page_t::READ_FIX=4U<<29 + fix: read-fixed pages (also X-latched)
buf_page_t::WRITE_FIX=5U<<29 + fix: write-fixed pages (also U-latched)
buf_page_t::WRITE_FIX_IBUF=6U<<29 + fix: write-fixed; may have ibuf
buf_page_t::WRITE_FIX_REINIT=7U<<29 + fix: write-fixed (no doublewrite)
buf_page_t::write_complete(): Change WRITE_FIX or WRITE_FIX_REINIT to
UNFIXED, and WRITE_FIX_IBUF to IBUF_EXIST, before releasing the U-latch.
buf_page_t::read_complete(): Renamed from buf_page_read_complete().
Change READ_FIX to UNFIXED or IBUF_EXIST, before releasing the X-latch.
buf_page_t::can_relocate(): If the page latch is being held or waited for,
or the block is buffer-fixed or io-fixed, return false. (The condition
on the page latch is new.)
Outside buf_page_get_gen(), buf_page_get_low() and buf_page_free(), we
will acquire the page latch before fix(), and unfix() before unlocking.
buf_page_t::flush(): Replaces buf_flush_page(). Optimize the
handling of FREED pages.
buf_pool_t::release_freed_page(): Assume that buf_pool.mutex is held
by the caller.
buf_page_t::is_read_fixed(), buf_page_t::is_write_fixed(): New predicates.
buf_page_get_low(): Ignore guesses that are read-fixed because they
may not yet be registered in buf_pool.page_hash and buf_pool.LRU.
buf_page_optimistic_get(): Acquire latch before buffer-fixing.
buf_page_make_young(): Leave read-fixed blocks alone, because they
might not be registered in buf_pool.LRU yet.
recv_sys_t::recover_deferred(), recv_sys_t::recover_low():
Possibly fix MDEV-26326, by holding a page X-latch instead of
only buffer-fixing the page.
2021-11-16 19:55:06 +02:00
|
|
|
+ dict_hdr->page.frame);
|
2010-05-24 14:45:24 +03:00
|
|
|
if (fil_assign_new_space_id(space_id)) {
|
2019-12-03 10:19:45 +02:00
|
|
|
mtr.write<4>(*dict_hdr,
|
|
|
|
DICT_HDR + DICT_HDR_MAX_SPACE_ID
|
MDEV-27058: Reduce the size of buf_block_t and buf_page_t
buf_page_t::frame: Moved from buf_block_t::frame.
All 'thin' buf_page_t describing compressed-only ROW_FORMAT=COMPRESSED
pages will have frame=nullptr, while all 'fat' buf_block_t
will have a non-null frame pointing to aligned innodb_page_size bytes.
This eliminates the need for separate states for
BUF_BLOCK_FILE_PAGE and BUF_BLOCK_ZIP_PAGE.
buf_page_t::lock: Moved from buf_block_t::lock. That is, all block
descriptors will have a page latch. The IO_PIN state that was used
for discarding or creating the uncompressed page frame of a
ROW_FORMAT=COMPRESSED block is replaced by a combination of read-fix
and page X-latch.
page_zip_des_t::fix: Replaces state_, buf_fix_count_, io_fix_, status
of buf_page_t with a single std::atomic<uint32_t>. All modifications
will use store(), fetch_add(), fetch_sub(). This space was previously
wasted to alignment on 64-bit systems. We will use the following encoding
that combines a state (partly read-fix or write-fix) and a buffer-fix
count:
buf_page_t::NOT_USED=0 (previously BUF_BLOCK_NOT_USED)
buf_page_t::MEMORY=1 (previously BUF_BLOCK_MEMORY)
buf_page_t::REMOVE_HASH=2 (previously BUF_BLOCK_REMOVE_HASH)
buf_page_t::FREED=3 + fix: pages marked as freed in the file
buf_page_t::UNFIXED=1U<<29 + fix: normal pages
buf_page_t::IBUF_EXIST=2U<<29 + fix: normal pages; may need ibuf merge
buf_page_t::REINIT=3U<<29 + fix: reinitialized pages (skip doublewrite)
buf_page_t::READ_FIX=4U<<29 + fix: read-fixed pages (also X-latched)
buf_page_t::WRITE_FIX=5U<<29 + fix: write-fixed pages (also U-latched)
buf_page_t::WRITE_FIX_IBUF=6U<<29 + fix: write-fixed; may have ibuf
buf_page_t::WRITE_FIX_REINIT=7U<<29 + fix: write-fixed (no doublewrite)
buf_page_t::write_complete(): Change WRITE_FIX or WRITE_FIX_REINIT to
UNFIXED, and WRITE_FIX_IBUF to IBUF_EXIST, before releasing the U-latch.
buf_page_t::read_complete(): Renamed from buf_page_read_complete().
Change READ_FIX to UNFIXED or IBUF_EXIST, before releasing the X-latch.
buf_page_t::can_relocate(): If the page latch is being held or waited for,
or the block is buffer-fixed or io-fixed, return false. (The condition
on the page latch is new.)
Outside buf_page_get_gen(), buf_page_get_low() and buf_page_free(), we
will acquire the page latch before fix(), and unfix() before unlocking.
buf_page_t::flush(): Replaces buf_flush_page(). Optimize the
handling of FREED pages.
buf_pool_t::release_freed_page(): Assume that buf_pool.mutex is held
by the caller.
buf_page_t::is_read_fixed(), buf_page_t::is_write_fixed(): New predicates.
buf_page_get_low(): Ignore guesses that are read-fixed because they
may not yet be registered in buf_pool.page_hash and buf_pool.LRU.
buf_page_optimistic_get(): Acquire latch before buffer-fixing.
buf_page_make_young(): Leave read-fixed blocks alone, because they
might not be registered in buf_pool.LRU yet.
recv_sys_t::recover_deferred(), recv_sys_t::recover_low():
Possibly fix MDEV-26326, by holding a page X-latch instead of
only buffer-fixing the page.
2021-11-16 19:55:06 +02:00
|
|
|
+ dict_hdr->page.frame, *space_id);
|
2010-05-24 14:45:24 +03:00
|
|
|
}
|
|
|
|
}
|
2005-10-27 07:29:40 +00:00
|
|
|
|
2019-12-03 10:19:45 +02:00
|
|
|
mtr.commit();
|
2006-02-23 19:25:29 +00:00
|
|
|
}
|
2005-10-27 07:29:40 +00:00
|
|
|
|
2021-05-20 14:58:25 +03:00
|
|
|
/** Create the DICT_HDR page on database initialization.
|
MDEV-13542: Crashing on corrupted page is unhelpful
The approach to handling corruption that was chosen by Oracle in
commit 177d8b0c125b841c0650d27d735e3b87509dc286
is not really useful. Not only did it actually fail to prevent InnoDB
from crashing, but it is making things worse by blocking attempts to
rescue data from or rebuild a partially readable table.
We will try to prevent crashes in a different way: by propagating
errors up the call stack. We will never mark the clustered index
persistently corrupted, so that data recovery may be attempted by
reading from the table, or by rebuilding the table.
This should also fix MDEV-13680 (crash on btr_page_alloc() failure);
it was extensively tested with innodb_file_per_table=0 and a
non-autoextend system tablespace.
We should now avoid crashes in many cases, such as when a page
cannot be read or allocated, or an inconsistency is detected when
attempting to update multiple pages. We will not crash on double-free,
such as on the recovery of DDL in system tablespace in case something
was corrupted.
Crashes on corrupted data are still possible. The fault injection mechanism
that is introduced in the subsequent commit may help catch more of them.
buf_page_import_corrupt_failure: Remove the fault injection, and instead
corrupt some pages using Perl code in the tests.
btr_cur_pessimistic_insert(): Always reserve extents (except for the
change buffer), in order to prevent a subsequent allocation failure.
btr_pcur_open_at_rnd_pos(): Merged to the only caller ibuf_merge_pages().
btr_assert_not_corrupted(), btr_corruption_report(): Remove.
Similar checks are already part of btr_block_get().
FSEG_MAGIC_N_BYTES: Replaces FSEG_MAGIC_N_VALUE.
dict_hdr_get(), trx_rsegf_get_new(), trx_undo_page_get(),
trx_undo_page_get_s_latched(): Replaced with error-checking calls.
trx_rseg_t::get(mtr_t*): Replaces trx_rsegf_get().
trx_rseg_header_create(): Let the caller update the TRX_SYS page if needed.
trx_sys_create_sys_pages(): Merged with trx_sysf_create().
dict_check_tablespaces_and_store_max_id(): Do not access
DICT_HDR_MAX_SPACE_ID, because it was already recovered in dict_boot().
Merge dict_check_sys_tables() with this function.
dir_pathname(): Replaces os_file_make_new_pathname().
row_undo_ins_remove_sec(): Do not modify the undo page by adding
a terminating NUL byte to the record.
btr_decryption_failed(): Report decryption failures
dict_set_corrupted_by_space(), dict_set_encrypted_by_space(),
dict_set_corrupted_index_cache_only(): Remove.
dict_set_corrupted(): Remove the constant parameter dict_locked=false.
Never flag the clustered index corrupted in SYS_INDEXES, because
that would deny further access to the table. It might be possible to
repair the table by executing ALTER TABLE or OPTIMIZE TABLE, in case
no B-tree leaf page is corrupted.
dict_table_skip_corrupt_index(), dict_table_next_uncorrupted_index(),
row_purge_skip_uncommitted_virtual_index(): Remove, and refactor
the callers to read dict_index_t::type only once.
dict_table_is_corrupted(): Remove.
dict_index_t::is_btree(): Determine if the index is a valid B-tree.
BUF_GET_NO_LATCH, BUF_EVICT_IF_IN_POOL: Remove.
UNIV_BTR_DEBUG: Remove. Any inconsistency will no longer trigger
assertion failures, but error codes being returned.
buf_corrupt_page_release(): Replaced with a direct call to
buf_pool.corrupted_evict().
fil_invalid_page_access_msg(): Never crash on an invalid read;
let the caller of buf_page_get_gen() decide.
btr_pcur_t::restore_position(): Propagate failure status to the caller
by returning CORRUPTED.
opt_search_plan_for_table(): Simplify the code.
row_purge_del_mark(), row_purge_upd_exist_or_extern_func(),
row_undo_ins_remove_sec_rec(), row_undo_mod_upd_del_sec(),
row_undo_mod_del_mark_sec(): Avoid mem_heap_create()/mem_heap_free()
when no secondary indexes exist.
row_undo_mod_upd_exist_sec(): Simplify the code.
row_upd_clust_step(), dict_load_table_one(): Return DB_TABLE_CORRUPT
if the clustered index (and therefore the table) is corrupted, similar
to what we do in row_insert_for_mysql().
fut_get_ptr(): Replace with buf_page_get_gen() calls.
buf_page_get_gen(): Return nullptr and *err=DB_CORRUPTION
if the page is marked as freed. For other modes than
BUF_GET_POSSIBLY_FREED or BUF_PEEK_IF_IN_POOL this will
trigger a debug assertion failure. For BUF_GET_POSSIBLY_FREED,
we will return nullptr for freed pages, so that the callers
can be simplified. The purge of transaction history will be
a new user of BUF_GET_POSSIBLY_FREED, to avoid crashes on
corrupted data.
buf_page_get_low(): Never crash on a corrupted page, but simply
return nullptr.
fseg_page_is_allocated(): Replaces fseg_page_is_free().
fts_drop_common_tables(): Return an error if the transaction
was rolled back.
fil_space_t::set_corrupted(): Report a tablespace as corrupted if
it was not reported already.
fil_space_t::io(): Invoke fil_space_t::set_corrupted() to report
out-of-bounds page access or other errors.
Clean up mtr_t::page_lock()
buf_page_get_low(): Validate the page identifier (to check for
recently read corrupted pages) after acquiring the page latch.
buf_page_t::read_complete(): Flag uninitialized (all-zero) pages
with DB_FAIL. Return DB_PAGE_CORRUPTED on page number mismatch.
mtr_t::defer_drop_ahi(): Renamed from mtr_defer_drop_ahi().
recv_sys_t::free_corrupted_page(): Only set_corrupt_fs()
if any log records exist for the page. We do not mind if read-ahead
produces corrupted (or all-zero) pages that were not actually needed
during recovery.
recv_recover_page(): Return whether the operation succeeded.
recv_sys_t::recover_low(): Simplify the logic. Check for recovery error.
Thanks to Matthias Leich for testing this extensively and to the
authors of https://rr-project.org for making it easy to diagnose
and fix any failures that were found during the testing.
2022-06-06 14:03:22 +03:00
|
|
|
@return error code */
|
|
|
|
dberr_t dict_create()
|
2005-10-27 07:29:40 +00:00
|
|
|
{
|
|
|
|
ulint root_page_no;
|
2006-02-23 19:25:29 +00:00
|
|
|
|
MDEV-13542: Crashing on corrupted page is unhelpful
The approach to handling corruption that was chosen by Oracle in
commit 177d8b0c125b841c0650d27d735e3b87509dc286
is not really useful. Not only did it actually fail to prevent InnoDB
from crashing, but it is making things worse by blocking attempts to
rescue data from or rebuild a partially readable table.
We will try to prevent crashes in a different way: by propagating
errors up the call stack. We will never mark the clustered index
persistently corrupted, so that data recovery may be attempted by
reading from the table, or by rebuilding the table.
This should also fix MDEV-13680 (crash on btr_page_alloc() failure);
it was extensively tested with innodb_file_per_table=0 and a
non-autoextend system tablespace.
We should now avoid crashes in many cases, such as when a page
cannot be read or allocated, or an inconsistency is detected when
attempting to update multiple pages. We will not crash on double-free,
such as on the recovery of DDL in system tablespace in case something
was corrupted.
Crashes on corrupted data are still possible. The fault injection mechanism
that is introduced in the subsequent commit may help catch more of them.
buf_page_import_corrupt_failure: Remove the fault injection, and instead
corrupt some pages using Perl code in the tests.
btr_cur_pessimistic_insert(): Always reserve extents (except for the
change buffer), in order to prevent a subsequent allocation failure.
btr_pcur_open_at_rnd_pos(): Merged to the only caller ibuf_merge_pages().
btr_assert_not_corrupted(), btr_corruption_report(): Remove.
Similar checks are already part of btr_block_get().
FSEG_MAGIC_N_BYTES: Replaces FSEG_MAGIC_N_VALUE.
dict_hdr_get(), trx_rsegf_get_new(), trx_undo_page_get(),
trx_undo_page_get_s_latched(): Replaced with error-checking calls.
trx_rseg_t::get(mtr_t*): Replaces trx_rsegf_get().
trx_rseg_header_create(): Let the caller update the TRX_SYS page if needed.
trx_sys_create_sys_pages(): Merged with trx_sysf_create().
dict_check_tablespaces_and_store_max_id(): Do not access
DICT_HDR_MAX_SPACE_ID, because it was already recovered in dict_boot().
Merge dict_check_sys_tables() with this function.
dir_pathname(): Replaces os_file_make_new_pathname().
row_undo_ins_remove_sec(): Do not modify the undo page by adding
a terminating NUL byte to the record.
btr_decryption_failed(): Report decryption failures
dict_set_corrupted_by_space(), dict_set_encrypted_by_space(),
dict_set_corrupted_index_cache_only(): Remove.
dict_set_corrupted(): Remove the constant parameter dict_locked=false.
Never flag the clustered index corrupted in SYS_INDEXES, because
that would deny further access to the table. It might be possible to
repair the table by executing ALTER TABLE or OPTIMIZE TABLE, in case
no B-tree leaf page is corrupted.
dict_table_skip_corrupt_index(), dict_table_next_uncorrupted_index(),
row_purge_skip_uncommitted_virtual_index(): Remove, and refactor
the callers to read dict_index_t::type only once.
dict_table_is_corrupted(): Remove.
dict_index_t::is_btree(): Determine if the index is a valid B-tree.
BUF_GET_NO_LATCH, BUF_EVICT_IF_IN_POOL: Remove.
UNIV_BTR_DEBUG: Remove. Any inconsistency will no longer trigger
assertion failures, but error codes being returned.
buf_corrupt_page_release(): Replaced with a direct call to
buf_pool.corrupted_evict().
fil_invalid_page_access_msg(): Never crash on an invalid read;
let the caller of buf_page_get_gen() decide.
btr_pcur_t::restore_position(): Propagate failure status to the caller
by returning CORRUPTED.
opt_search_plan_for_table(): Simplify the code.
row_purge_del_mark(), row_purge_upd_exist_or_extern_func(),
row_undo_ins_remove_sec_rec(), row_undo_mod_upd_del_sec(),
row_undo_mod_del_mark_sec(): Avoid mem_heap_create()/mem_heap_free()
when no secondary indexes exist.
row_undo_mod_upd_exist_sec(): Simplify the code.
row_upd_clust_step(), dict_load_table_one(): Return DB_TABLE_CORRUPT
if the clustered index (and therefore the table) is corrupted, similar
to what we do in row_insert_for_mysql().
fut_get_ptr(): Replace with buf_page_get_gen() calls.
buf_page_get_gen(): Return nullptr and *err=DB_CORRUPTION
if the page is marked as freed. For other modes than
BUF_GET_POSSIBLY_FREED or BUF_PEEK_IF_IN_POOL this will
trigger a debug assertion failure. For BUF_GET_POSSIBLY_FREED,
we will return nullptr for freed pages, so that the callers
can be simplified. The purge of transaction history will be
a new user of BUF_GET_POSSIBLY_FREED, to avoid crashes on
corrupted data.
buf_page_get_low(): Never crash on a corrupted page, but simply
return nullptr.
fseg_page_is_allocated(): Replaces fseg_page_is_free().
fts_drop_common_tables(): Return an error if the transaction
was rolled back.
fil_space_t::set_corrupted(): Report a tablespace as corrupted if
it was not reported already.
fil_space_t::io(): Invoke fil_space_t::set_corrupted() to report
out-of-bounds page access or other errors.
Clean up mtr_t::page_lock()
buf_page_get_low(): Validate the page identifier (to check for
recently read corrupted pages) after acquiring the page latch.
buf_page_t::read_complete(): Flag uninitialized (all-zero) pages
with DB_FAIL. Return DB_PAGE_CORRUPTED on page number mismatch.
mtr_t::defer_drop_ahi(): Renamed from mtr_defer_drop_ahi().
recv_sys_t::free_corrupted_page(): Only set_corrupt_fs()
if any log records exist for the page. We do not mind if read-ahead
produces corrupted (or all-zero) pages that were not actually needed
during recovery.
recv_recover_page(): Return whether the operation succeeded.
recv_sys_t::recover_low(): Simplify the logic. Check for recovery error.
Thanks to Matthias Leich for testing this extensively and to the
authors of https://rr-project.org for making it easy to diagnose
and fix any failures that were found during the testing.
2022-06-06 14:03:22 +03:00
|
|
|
dberr_t err;
|
2021-05-20 14:58:25 +03:00
|
|
|
mtr_t mtr;
|
|
|
|
mtr.start();
|
MDEV-12266: Change dict_table_t::space to fil_space_t*
InnoDB always keeps all tablespaces in the fil_system cache.
The fil_system.LRU is only for closing file handles; the
fil_space_t and fil_node_t for all data files will remain
in main memory. Between startup to shutdown, they can only be
created and removed by DDL statements. Therefore, we can
let dict_table_t::space point directly to the fil_space_t.
dict_table_t::space_id: A numeric tablespace ID for the corner cases
where we do not have a tablespace. The most prominent examples are
ALTER TABLE...DISCARD TABLESPACE or a missing or corrupted file.
There are a few functional differences; most notably:
(1) DROP TABLE will delete matching .ibd and .cfg files,
even if they were not attached to the data dictionary.
(2) Some error messages will report file names instead of numeric IDs.
There still are many functions that use numeric tablespace IDs instead
of fil_space_t*, and many functions could be converted to fil_space_t
member functions. Also, Tablespace and Datafile should be merged with
fil_space_t and fil_node_t. page_id_t and buf_page_get_gen() could use
fil_space_t& instead of a numeric ID, and after moving to a single
buffer pool (MDEV-15058), buf_pool_t::page_hash could be moved to
fil_space_t::page_hash.
FilSpace: Remove. Only few calls to fil_space_acquire() will remain,
and gradually they should be removed.
mtr_t::set_named_space_id(ulint): Renamed from set_named_space(),
to prevent accidental calls to this slower function. Very few
callers remain.
fseg_create(), fsp_reserve_free_extents(): Take fil_space_t*
as a parameter instead of a space_id.
fil_space_t::rename(): Wrapper for fil_rename_tablespace_check(),
fil_name_write_rename(), fil_rename_tablespace(). Mariabackup
passes the parameter log=false; InnoDB passes log=true.
dict_mem_table_create(): Take fil_space_t* instead of space_id
as parameter.
dict_process_sys_tables_rec_and_mtr_commit(): Replace the parameter
'status' with 'bool cached'.
dict_get_and_save_data_dir_path(): Avoid copying the fil_node_t::name.
fil_ibd_open(): Return the tablespace.
fil_space_t::set_imported(): Replaces fil_space_set_imported().
truncate_t: Change many member function parameters to fil_space_t*,
and remove page_size parameters.
row_truncate_prepare(): Merge to its only caller.
row_drop_table_from_cache(): Assert that the table is persistent.
dict_create_sys_indexes_tuple(): Write SYS_INDEXES.SPACE=FIL_NULL
if the tablespace has been discarded.
row_import_update_discarded_flag(): Remove a constant parameter.
2018-03-27 16:31:10 +03:00
|
|
|
compile_time_assert(DICT_HDR_SPACE == 0);
|
2005-10-27 07:29:40 +00:00
|
|
|
|
|
|
|
/* Create the dictionary header file block in a new, allocated file
|
|
|
|
segment in the system tablespace */
|
MDEV-13542: Crashing on corrupted page is unhelpful
The approach to handling corruption that was chosen by Oracle in
commit 177d8b0c125b841c0650d27d735e3b87509dc286
is not really useful. Not only did it actually fail to prevent InnoDB
from crashing, but it is making things worse by blocking attempts to
rescue data from or rebuild a partially readable table.
We will try to prevent crashes in a different way: by propagating
errors up the call stack. We will never mark the clustered index
persistently corrupted, so that data recovery may be attempted by
reading from the table, or by rebuilding the table.
This should also fix MDEV-13680 (crash on btr_page_alloc() failure);
it was extensively tested with innodb_file_per_table=0 and a
non-autoextend system tablespace.
We should now avoid crashes in many cases, such as when a page
cannot be read or allocated, or an inconsistency is detected when
attempting to update multiple pages. We will not crash on double-free,
such as on the recovery of DDL in system tablespace in case something
was corrupted.
Crashes on corrupted data are still possible. The fault injection mechanism
that is introduced in the subsequent commit may help catch more of them.
buf_page_import_corrupt_failure: Remove the fault injection, and instead
corrupt some pages using Perl code in the tests.
btr_cur_pessimistic_insert(): Always reserve extents (except for the
change buffer), in order to prevent a subsequent allocation failure.
btr_pcur_open_at_rnd_pos(): Merged to the only caller ibuf_merge_pages().
btr_assert_not_corrupted(), btr_corruption_report(): Remove.
Similar checks are already part of btr_block_get().
FSEG_MAGIC_N_BYTES: Replaces FSEG_MAGIC_N_VALUE.
dict_hdr_get(), trx_rsegf_get_new(), trx_undo_page_get(),
trx_undo_page_get_s_latched(): Replaced with error-checking calls.
trx_rseg_t::get(mtr_t*): Replaces trx_rsegf_get().
trx_rseg_header_create(): Let the caller update the TRX_SYS page if needed.
trx_sys_create_sys_pages(): Merged with trx_sysf_create().
dict_check_tablespaces_and_store_max_id(): Do not access
DICT_HDR_MAX_SPACE_ID, because it was already recovered in dict_boot().
Merge dict_check_sys_tables() with this function.
dir_pathname(): Replaces os_file_make_new_pathname().
row_undo_ins_remove_sec(): Do not modify the undo page by adding
a terminating NUL byte to the record.
btr_decryption_failed(): Report decryption failures
dict_set_corrupted_by_space(), dict_set_encrypted_by_space(),
dict_set_corrupted_index_cache_only(): Remove.
dict_set_corrupted(): Remove the constant parameter dict_locked=false.
Never flag the clustered index corrupted in SYS_INDEXES, because
that would deny further access to the table. It might be possible to
repair the table by executing ALTER TABLE or OPTIMIZE TABLE, in case
no B-tree leaf page is corrupted.
dict_table_skip_corrupt_index(), dict_table_next_uncorrupted_index(),
row_purge_skip_uncommitted_virtual_index(): Remove, and refactor
the callers to read dict_index_t::type only once.
dict_table_is_corrupted(): Remove.
dict_index_t::is_btree(): Determine if the index is a valid B-tree.
BUF_GET_NO_LATCH, BUF_EVICT_IF_IN_POOL: Remove.
UNIV_BTR_DEBUG: Remove. Any inconsistency will no longer trigger
assertion failures, but error codes being returned.
buf_corrupt_page_release(): Replaced with a direct call to
buf_pool.corrupted_evict().
fil_invalid_page_access_msg(): Never crash on an invalid read;
let the caller of buf_page_get_gen() decide.
btr_pcur_t::restore_position(): Propagate failure status to the caller
by returning CORRUPTED.
opt_search_plan_for_table(): Simplify the code.
row_purge_del_mark(), row_purge_upd_exist_or_extern_func(),
row_undo_ins_remove_sec_rec(), row_undo_mod_upd_del_sec(),
row_undo_mod_del_mark_sec(): Avoid mem_heap_create()/mem_heap_free()
when no secondary indexes exist.
row_undo_mod_upd_exist_sec(): Simplify the code.
row_upd_clust_step(), dict_load_table_one(): Return DB_TABLE_CORRUPT
if the clustered index (and therefore the table) is corrupted, similar
to what we do in row_insert_for_mysql().
fut_get_ptr(): Replace with buf_page_get_gen() calls.
buf_page_get_gen(): Return nullptr and *err=DB_CORRUPTION
if the page is marked as freed. For other modes than
BUF_GET_POSSIBLY_FREED or BUF_PEEK_IF_IN_POOL this will
trigger a debug assertion failure. For BUF_GET_POSSIBLY_FREED,
we will return nullptr for freed pages, so that the callers
can be simplified. The purge of transaction history will be
a new user of BUF_GET_POSSIBLY_FREED, to avoid crashes on
corrupted data.
buf_page_get_low(): Never crash on a corrupted page, but simply
return nullptr.
fseg_page_is_allocated(): Replaces fseg_page_is_free().
fts_drop_common_tables(): Return an error if the transaction
was rolled back.
fil_space_t::set_corrupted(): Report a tablespace as corrupted if
it was not reported already.
fil_space_t::io(): Invoke fil_space_t::set_corrupted() to report
out-of-bounds page access or other errors.
Clean up mtr_t::page_lock()
buf_page_get_low(): Validate the page identifier (to check for
recently read corrupted pages) after acquiring the page latch.
buf_page_t::read_complete(): Flag uninitialized (all-zero) pages
with DB_FAIL. Return DB_PAGE_CORRUPTED on page number mismatch.
mtr_t::defer_drop_ahi(): Renamed from mtr_defer_drop_ahi().
recv_sys_t::free_corrupted_page(): Only set_corrupt_fs()
if any log records exist for the page. We do not mind if read-ahead
produces corrupted (or all-zero) pages that were not actually needed
during recovery.
recv_recover_page(): Return whether the operation succeeded.
recv_sys_t::recover_low(): Simplify the logic. Check for recovery error.
Thanks to Matthias Leich for testing this extensively and to the
authors of https://rr-project.org for making it easy to diagnose
and fix any failures that were found during the testing.
2022-06-06 14:03:22 +03:00
|
|
|
buf_block_t* d = fseg_create(fil_system.sys_space,
|
|
|
|
DICT_HDR + DICT_HDR_FSEG_HEADER, &mtr,
|
|
|
|
&err);
|
|
|
|
if (!d) {
|
|
|
|
goto func_exit;
|
|
|
|
}
|
|
|
|
ut_a(d->page.id() == hdr_page_id);
|
2005-10-27 07:29:40 +00:00
|
|
|
|
2022-12-14 14:44:28 +02:00
|
|
|
/* Start counting table, index, and tree ids from
|
2005-10-27 07:29:40 +00:00
|
|
|
DICT_HDR_FIRST_ID */
|
MDEV-27058: Reduce the size of buf_block_t and buf_page_t
buf_page_t::frame: Moved from buf_block_t::frame.
All 'thin' buf_page_t describing compressed-only ROW_FORMAT=COMPRESSED
pages will have frame=nullptr, while all 'fat' buf_block_t
will have a non-null frame pointing to aligned innodb_page_size bytes.
This eliminates the need for separate states for
BUF_BLOCK_FILE_PAGE and BUF_BLOCK_ZIP_PAGE.
buf_page_t::lock: Moved from buf_block_t::lock. That is, all block
descriptors will have a page latch. The IO_PIN state that was used
for discarding or creating the uncompressed page frame of a
ROW_FORMAT=COMPRESSED block is replaced by a combination of read-fix
and page X-latch.
page_zip_des_t::fix: Replaces state_, buf_fix_count_, io_fix_, status
of buf_page_t with a single std::atomic<uint32_t>. All modifications
will use store(), fetch_add(), fetch_sub(). This space was previously
wasted to alignment on 64-bit systems. We will use the following encoding
that combines a state (partly read-fix or write-fix) and a buffer-fix
count:
buf_page_t::NOT_USED=0 (previously BUF_BLOCK_NOT_USED)
buf_page_t::MEMORY=1 (previously BUF_BLOCK_MEMORY)
buf_page_t::REMOVE_HASH=2 (previously BUF_BLOCK_REMOVE_HASH)
buf_page_t::FREED=3 + fix: pages marked as freed in the file
buf_page_t::UNFIXED=1U<<29 + fix: normal pages
buf_page_t::IBUF_EXIST=2U<<29 + fix: normal pages; may need ibuf merge
buf_page_t::REINIT=3U<<29 + fix: reinitialized pages (skip doublewrite)
buf_page_t::READ_FIX=4U<<29 + fix: read-fixed pages (also X-latched)
buf_page_t::WRITE_FIX=5U<<29 + fix: write-fixed pages (also U-latched)
buf_page_t::WRITE_FIX_IBUF=6U<<29 + fix: write-fixed; may have ibuf
buf_page_t::WRITE_FIX_REINIT=7U<<29 + fix: write-fixed (no doublewrite)
buf_page_t::write_complete(): Change WRITE_FIX or WRITE_FIX_REINIT to
UNFIXED, and WRITE_FIX_IBUF to IBUF_EXIST, before releasing the U-latch.
buf_page_t::read_complete(): Renamed from buf_page_read_complete().
Change READ_FIX to UNFIXED or IBUF_EXIST, before releasing the X-latch.
buf_page_t::can_relocate(): If the page latch is being held or waited for,
or the block is buffer-fixed or io-fixed, return false. (The condition
on the page latch is new.)
Outside buf_page_get_gen(), buf_page_get_low() and buf_page_free(), we
will acquire the page latch before fix(), and unfix() before unlocking.
buf_page_t::flush(): Replaces buf_flush_page(). Optimize the
handling of FREED pages.
buf_pool_t::release_freed_page(): Assume that buf_pool.mutex is held
by the caller.
buf_page_t::is_read_fixed(), buf_page_t::is_write_fixed(): New predicates.
buf_page_get_low(): Ignore guesses that are read-fixed because they
may not yet be registered in buf_pool.page_hash and buf_pool.LRU.
buf_page_optimistic_get(): Acquire latch before buffer-fixing.
buf_page_make_young(): Leave read-fixed blocks alone, because they
might not be registered in buf_pool.LRU yet.
recv_sys_t::recover_deferred(), recv_sys_t::recover_low():
Possibly fix MDEV-26326, by holding a page X-latch instead of
only buffer-fixing the page.
2021-11-16 19:55:06 +02:00
|
|
|
mtr.write<8>(*d, DICT_HDR + DICT_HDR_TABLE_ID + d->page.frame,
|
2021-05-20 14:58:25 +03:00
|
|
|
DICT_HDR_FIRST_ID);
|
MDEV-27058: Reduce the size of buf_block_t and buf_page_t
buf_page_t::frame: Moved from buf_block_t::frame.
All 'thin' buf_page_t describing compressed-only ROW_FORMAT=COMPRESSED
pages will have frame=nullptr, while all 'fat' buf_block_t
will have a non-null frame pointing to aligned innodb_page_size bytes.
This eliminates the need for separate states for
BUF_BLOCK_FILE_PAGE and BUF_BLOCK_ZIP_PAGE.
buf_page_t::lock: Moved from buf_block_t::lock. That is, all block
descriptors will have a page latch. The IO_PIN state that was used
for discarding or creating the uncompressed page frame of a
ROW_FORMAT=COMPRESSED block is replaced by a combination of read-fix
and page X-latch.
page_zip_des_t::fix: Replaces state_, buf_fix_count_, io_fix_, status
of buf_page_t with a single std::atomic<uint32_t>. All modifications
will use store(), fetch_add(), fetch_sub(). This space was previously
wasted to alignment on 64-bit systems. We will use the following encoding
that combines a state (partly read-fix or write-fix) and a buffer-fix
count:
buf_page_t::NOT_USED=0 (previously BUF_BLOCK_NOT_USED)
buf_page_t::MEMORY=1 (previously BUF_BLOCK_MEMORY)
buf_page_t::REMOVE_HASH=2 (previously BUF_BLOCK_REMOVE_HASH)
buf_page_t::FREED=3 + fix: pages marked as freed in the file
buf_page_t::UNFIXED=1U<<29 + fix: normal pages
buf_page_t::IBUF_EXIST=2U<<29 + fix: normal pages; may need ibuf merge
buf_page_t::REINIT=3U<<29 + fix: reinitialized pages (skip doublewrite)
buf_page_t::READ_FIX=4U<<29 + fix: read-fixed pages (also X-latched)
buf_page_t::WRITE_FIX=5U<<29 + fix: write-fixed pages (also U-latched)
buf_page_t::WRITE_FIX_IBUF=6U<<29 + fix: write-fixed; may have ibuf
buf_page_t::WRITE_FIX_REINIT=7U<<29 + fix: write-fixed (no doublewrite)
buf_page_t::write_complete(): Change WRITE_FIX or WRITE_FIX_REINIT to
UNFIXED, and WRITE_FIX_IBUF to IBUF_EXIST, before releasing the U-latch.
buf_page_t::read_complete(): Renamed from buf_page_read_complete().
Change READ_FIX to UNFIXED or IBUF_EXIST, before releasing the X-latch.
buf_page_t::can_relocate(): If the page latch is being held or waited for,
or the block is buffer-fixed or io-fixed, return false. (The condition
on the page latch is new.)
Outside buf_page_get_gen(), buf_page_get_low() and buf_page_free(), we
will acquire the page latch before fix(), and unfix() before unlocking.
buf_page_t::flush(): Replaces buf_flush_page(). Optimize the
handling of FREED pages.
buf_pool_t::release_freed_page(): Assume that buf_pool.mutex is held
by the caller.
buf_page_t::is_read_fixed(), buf_page_t::is_write_fixed(): New predicates.
buf_page_get_low(): Ignore guesses that are read-fixed because they
may not yet be registered in buf_pool.page_hash and buf_pool.LRU.
buf_page_optimistic_get(): Acquire latch before buffer-fixing.
buf_page_make_young(): Leave read-fixed blocks alone, because they
might not be registered in buf_pool.LRU yet.
recv_sys_t::recover_deferred(), recv_sys_t::recover_low():
Possibly fix MDEV-26326, by holding a page X-latch instead of
only buffer-fixing the page.
2021-11-16 19:55:06 +02:00
|
|
|
mtr.write<8>(*d, DICT_HDR + DICT_HDR_INDEX_ID + d->page.frame,
|
2021-05-20 14:58:25 +03:00
|
|
|
DICT_HDR_FIRST_ID);
|
2005-10-27 07:29:40 +00:00
|
|
|
|
MDEV-27058: Reduce the size of buf_block_t and buf_page_t
buf_page_t::frame: Moved from buf_block_t::frame.
All 'thin' buf_page_t describing compressed-only ROW_FORMAT=COMPRESSED
pages will have frame=nullptr, while all 'fat' buf_block_t
will have a non-null frame pointing to aligned innodb_page_size bytes.
This eliminates the need for separate states for
BUF_BLOCK_FILE_PAGE and BUF_BLOCK_ZIP_PAGE.
buf_page_t::lock: Moved from buf_block_t::lock. That is, all block
descriptors will have a page latch. The IO_PIN state that was used
for discarding or creating the uncompressed page frame of a
ROW_FORMAT=COMPRESSED block is replaced by a combination of read-fix
and page X-latch.
page_zip_des_t::fix: Replaces state_, buf_fix_count_, io_fix_, status
of buf_page_t with a single std::atomic<uint32_t>. All modifications
will use store(), fetch_add(), fetch_sub(). This space was previously
wasted to alignment on 64-bit systems. We will use the following encoding
that combines a state (partly read-fix or write-fix) and a buffer-fix
count:
buf_page_t::NOT_USED=0 (previously BUF_BLOCK_NOT_USED)
buf_page_t::MEMORY=1 (previously BUF_BLOCK_MEMORY)
buf_page_t::REMOVE_HASH=2 (previously BUF_BLOCK_REMOVE_HASH)
buf_page_t::FREED=3 + fix: pages marked as freed in the file
buf_page_t::UNFIXED=1U<<29 + fix: normal pages
buf_page_t::IBUF_EXIST=2U<<29 + fix: normal pages; may need ibuf merge
buf_page_t::REINIT=3U<<29 + fix: reinitialized pages (skip doublewrite)
buf_page_t::READ_FIX=4U<<29 + fix: read-fixed pages (also X-latched)
buf_page_t::WRITE_FIX=5U<<29 + fix: write-fixed pages (also U-latched)
buf_page_t::WRITE_FIX_IBUF=6U<<29 + fix: write-fixed; may have ibuf
buf_page_t::WRITE_FIX_REINIT=7U<<29 + fix: write-fixed (no doublewrite)
buf_page_t::write_complete(): Change WRITE_FIX or WRITE_FIX_REINIT to
UNFIXED, and WRITE_FIX_IBUF to IBUF_EXIST, before releasing the U-latch.
buf_page_t::read_complete(): Renamed from buf_page_read_complete().
Change READ_FIX to UNFIXED or IBUF_EXIST, before releasing the X-latch.
buf_page_t::can_relocate(): If the page latch is being held or waited for,
or the block is buffer-fixed or io-fixed, return false. (The condition
on the page latch is new.)
Outside buf_page_get_gen(), buf_page_get_low() and buf_page_free(), we
will acquire the page latch before fix(), and unfix() before unlocking.
buf_page_t::flush(): Replaces buf_flush_page(). Optimize the
handling of FREED pages.
buf_pool_t::release_freed_page(): Assume that buf_pool.mutex is held
by the caller.
buf_page_t::is_read_fixed(), buf_page_t::is_write_fixed(): New predicates.
buf_page_get_low(): Ignore guesses that are read-fixed because they
may not yet be registered in buf_pool.page_hash and buf_pool.LRU.
buf_page_optimistic_get(): Acquire latch before buffer-fixing.
buf_page_make_young(): Leave read-fixed blocks alone, because they
might not be registered in buf_pool.LRU yet.
recv_sys_t::recover_deferred(), recv_sys_t::recover_low():
Possibly fix MDEV-26326, by holding a page X-latch instead of
only buffer-fixing the page.
2021-11-16 19:55:06 +02:00
|
|
|
ut_ad(!mach_read_from_4(DICT_HDR + DICT_HDR_MAX_SPACE_ID
|
|
|
|
+ d->page.frame));
|
2010-05-24 14:45:24 +03:00
|
|
|
|
|
|
|
/* Obsolete, but we must initialize it anyway. */
|
MDEV-27058: Reduce the size of buf_block_t and buf_page_t
buf_page_t::frame: Moved from buf_block_t::frame.
All 'thin' buf_page_t describing compressed-only ROW_FORMAT=COMPRESSED
pages will have frame=nullptr, while all 'fat' buf_block_t
will have a non-null frame pointing to aligned innodb_page_size bytes.
This eliminates the need for separate states for
BUF_BLOCK_FILE_PAGE and BUF_BLOCK_ZIP_PAGE.
buf_page_t::lock: Moved from buf_block_t::lock. That is, all block
descriptors will have a page latch. The IO_PIN state that was used
for discarding or creating the uncompressed page frame of a
ROW_FORMAT=COMPRESSED block is replaced by a combination of read-fix
and page X-latch.
page_zip_des_t::fix: Replaces state_, buf_fix_count_, io_fix_, status
of buf_page_t with a single std::atomic<uint32_t>. All modifications
will use store(), fetch_add(), fetch_sub(). This space was previously
wasted to alignment on 64-bit systems. We will use the following encoding
that combines a state (partly read-fix or write-fix) and a buffer-fix
count:
buf_page_t::NOT_USED=0 (previously BUF_BLOCK_NOT_USED)
buf_page_t::MEMORY=1 (previously BUF_BLOCK_MEMORY)
buf_page_t::REMOVE_HASH=2 (previously BUF_BLOCK_REMOVE_HASH)
buf_page_t::FREED=3 + fix: pages marked as freed in the file
buf_page_t::UNFIXED=1U<<29 + fix: normal pages
buf_page_t::IBUF_EXIST=2U<<29 + fix: normal pages; may need ibuf merge
buf_page_t::REINIT=3U<<29 + fix: reinitialized pages (skip doublewrite)
buf_page_t::READ_FIX=4U<<29 + fix: read-fixed pages (also X-latched)
buf_page_t::WRITE_FIX=5U<<29 + fix: write-fixed pages (also U-latched)
buf_page_t::WRITE_FIX_IBUF=6U<<29 + fix: write-fixed; may have ibuf
buf_page_t::WRITE_FIX_REINIT=7U<<29 + fix: write-fixed (no doublewrite)
buf_page_t::write_complete(): Change WRITE_FIX or WRITE_FIX_REINIT to
UNFIXED, and WRITE_FIX_IBUF to IBUF_EXIST, before releasing the U-latch.
buf_page_t::read_complete(): Renamed from buf_page_read_complete().
Change READ_FIX to UNFIXED or IBUF_EXIST, before releasing the X-latch.
buf_page_t::can_relocate(): If the page latch is being held or waited for,
or the block is buffer-fixed or io-fixed, return false. (The condition
on the page latch is new.)
Outside buf_page_get_gen(), buf_page_get_low() and buf_page_free(), we
will acquire the page latch before fix(), and unfix() before unlocking.
buf_page_t::flush(): Replaces buf_flush_page(). Optimize the
handling of FREED pages.
buf_pool_t::release_freed_page(): Assume that buf_pool.mutex is held
by the caller.
buf_page_t::is_read_fixed(), buf_page_t::is_write_fixed(): New predicates.
buf_page_get_low(): Ignore guesses that are read-fixed because they
may not yet be registered in buf_pool.page_hash and buf_pool.LRU.
buf_page_optimistic_get(): Acquire latch before buffer-fixing.
buf_page_make_young(): Leave read-fixed blocks alone, because they
might not be registered in buf_pool.LRU yet.
recv_sys_t::recover_deferred(), recv_sys_t::recover_low():
Possibly fix MDEV-26326, by holding a page X-latch instead of
only buffer-fixing the page.
2021-11-16 19:55:06 +02:00
|
|
|
mtr.write<4>(*d, DICT_HDR + DICT_HDR_MIX_ID_LOW + d->page.frame,
|
2021-05-20 14:58:25 +03:00
|
|
|
DICT_HDR_FIRST_ID);
|
2005-10-27 07:29:40 +00:00
|
|
|
|
|
|
|
/* Create the B-tree roots for the clustered indexes of the basic
|
|
|
|
system tables */
|
|
|
|
|
2006-02-23 19:25:29 +00:00
|
|
|
/*--------------------------*/
|
MDEV-12266: Change dict_table_t::space to fil_space_t*
InnoDB always keeps all tablespaces in the fil_system cache.
The fil_system.LRU is only for closing file handles; the
fil_space_t and fil_node_t for all data files will remain
in main memory. Between startup to shutdown, they can only be
created and removed by DDL statements. Therefore, we can
let dict_table_t::space point directly to the fil_space_t.
dict_table_t::space_id: A numeric tablespace ID for the corner cases
where we do not have a tablespace. The most prominent examples are
ALTER TABLE...DISCARD TABLESPACE or a missing or corrupted file.
There are a few functional differences; most notably:
(1) DROP TABLE will delete matching .ibd and .cfg files,
even if they were not attached to the data dictionary.
(2) Some error messages will report file names instead of numeric IDs.
There still are many functions that use numeric tablespace IDs instead
of fil_space_t*, and many functions could be converted to fil_space_t
member functions. Also, Tablespace and Datafile should be merged with
fil_space_t and fil_node_t. page_id_t and buf_page_get_gen() could use
fil_space_t& instead of a numeric ID, and after moving to a single
buffer pool (MDEV-15058), buf_pool_t::page_hash could be moved to
fil_space_t::page_hash.
FilSpace: Remove. Only few calls to fil_space_acquire() will remain,
and gradually they should be removed.
mtr_t::set_named_space_id(ulint): Renamed from set_named_space(),
to prevent accidental calls to this slower function. Very few
callers remain.
fseg_create(), fsp_reserve_free_extents(): Take fil_space_t*
as a parameter instead of a space_id.
fil_space_t::rename(): Wrapper for fil_rename_tablespace_check(),
fil_name_write_rename(), fil_rename_tablespace(). Mariabackup
passes the parameter log=false; InnoDB passes log=true.
dict_mem_table_create(): Take fil_space_t* instead of space_id
as parameter.
dict_process_sys_tables_rec_and_mtr_commit(): Replace the parameter
'status' with 'bool cached'.
dict_get_and_save_data_dir_path(): Avoid copying the fil_node_t::name.
fil_ibd_open(): Return the tablespace.
fil_space_t::set_imported(): Replaces fil_space_set_imported().
truncate_t: Change many member function parameters to fil_space_t*,
and remove page_size parameters.
row_truncate_prepare(): Merge to its only caller.
row_drop_table_from_cache(): Assert that the table is persistent.
dict_create_sys_indexes_tuple(): Write SYS_INDEXES.SPACE=FIL_NULL
if the tablespace has been discarded.
row_import_update_discarded_flag(): Remove a constant parameter.
2018-03-27 16:31:10 +03:00
|
|
|
root_page_no = btr_create(DICT_CLUSTERED | DICT_UNIQUE,
|
|
|
|
fil_system.sys_space, DICT_TABLES_ID,
|
MDEV-13542: Crashing on corrupted page is unhelpful
The approach to handling corruption that was chosen by Oracle in
commit 177d8b0c125b841c0650d27d735e3b87509dc286
is not really useful. Not only did it actually fail to prevent InnoDB
from crashing, but it is making things worse by blocking attempts to
rescue data from or rebuild a partially readable table.
We will try to prevent crashes in a different way: by propagating
errors up the call stack. We will never mark the clustered index
persistently corrupted, so that data recovery may be attempted by
reading from the table, or by rebuilding the table.
This should also fix MDEV-13680 (crash on btr_page_alloc() failure);
it was extensively tested with innodb_file_per_table=0 and a
non-autoextend system tablespace.
We should now avoid crashes in many cases, such as when a page
cannot be read or allocated, or an inconsistency is detected when
attempting to update multiple pages. We will not crash on double-free,
such as on the recovery of DDL in system tablespace in case something
was corrupted.
Crashes on corrupted data are still possible. The fault injection mechanism
that is introduced in the subsequent commit may help catch more of them.
buf_page_import_corrupt_failure: Remove the fault injection, and instead
corrupt some pages using Perl code in the tests.
btr_cur_pessimistic_insert(): Always reserve extents (except for the
change buffer), in order to prevent a subsequent allocation failure.
btr_pcur_open_at_rnd_pos(): Merged to the only caller ibuf_merge_pages().
btr_assert_not_corrupted(), btr_corruption_report(): Remove.
Similar checks are already part of btr_block_get().
FSEG_MAGIC_N_BYTES: Replaces FSEG_MAGIC_N_VALUE.
dict_hdr_get(), trx_rsegf_get_new(), trx_undo_page_get(),
trx_undo_page_get_s_latched(): Replaced with error-checking calls.
trx_rseg_t::get(mtr_t*): Replaces trx_rsegf_get().
trx_rseg_header_create(): Let the caller update the TRX_SYS page if needed.
trx_sys_create_sys_pages(): Merged with trx_sysf_create().
dict_check_tablespaces_and_store_max_id(): Do not access
DICT_HDR_MAX_SPACE_ID, because it was already recovered in dict_boot().
Merge dict_check_sys_tables() with this function.
dir_pathname(): Replaces os_file_make_new_pathname().
row_undo_ins_remove_sec(): Do not modify the undo page by adding
a terminating NUL byte to the record.
btr_decryption_failed(): Report decryption failures
dict_set_corrupted_by_space(), dict_set_encrypted_by_space(),
dict_set_corrupted_index_cache_only(): Remove.
dict_set_corrupted(): Remove the constant parameter dict_locked=false.
Never flag the clustered index corrupted in SYS_INDEXES, because
that would deny further access to the table. It might be possible to
repair the table by executing ALTER TABLE or OPTIMIZE TABLE, in case
no B-tree leaf page is corrupted.
dict_table_skip_corrupt_index(), dict_table_next_uncorrupted_index(),
row_purge_skip_uncommitted_virtual_index(): Remove, and refactor
the callers to read dict_index_t::type only once.
dict_table_is_corrupted(): Remove.
dict_index_t::is_btree(): Determine if the index is a valid B-tree.
BUF_GET_NO_LATCH, BUF_EVICT_IF_IN_POOL: Remove.
UNIV_BTR_DEBUG: Remove. Any inconsistency will no longer trigger
assertion failures, but error codes being returned.
buf_corrupt_page_release(): Replaced with a direct call to
buf_pool.corrupted_evict().
fil_invalid_page_access_msg(): Never crash on an invalid read;
let the caller of buf_page_get_gen() decide.
btr_pcur_t::restore_position(): Propagate failure status to the caller
by returning CORRUPTED.
opt_search_plan_for_table(): Simplify the code.
row_purge_del_mark(), row_purge_upd_exist_or_extern_func(),
row_undo_ins_remove_sec_rec(), row_undo_mod_upd_del_sec(),
row_undo_mod_del_mark_sec(): Avoid mem_heap_create()/mem_heap_free()
when no secondary indexes exist.
row_undo_mod_upd_exist_sec(): Simplify the code.
row_upd_clust_step(), dict_load_table_one(): Return DB_TABLE_CORRUPT
if the clustered index (and therefore the table) is corrupted, similar
to what we do in row_insert_for_mysql().
fut_get_ptr(): Replace with buf_page_get_gen() calls.
buf_page_get_gen(): Return nullptr and *err=DB_CORRUPTION
if the page is marked as freed. For other modes than
BUF_GET_POSSIBLY_FREED or BUF_PEEK_IF_IN_POOL this will
trigger a debug assertion failure. For BUF_GET_POSSIBLY_FREED,
we will return nullptr for freed pages, so that the callers
can be simplified. The purge of transaction history will be
a new user of BUF_GET_POSSIBLY_FREED, to avoid crashes on
corrupted data.
buf_page_get_low(): Never crash on a corrupted page, but simply
return nullptr.
fseg_page_is_allocated(): Replaces fseg_page_is_free().
fts_drop_common_tables(): Return an error if the transaction
was rolled back.
fil_space_t::set_corrupted(): Report a tablespace as corrupted if
it was not reported already.
fil_space_t::io(): Invoke fil_space_t::set_corrupted() to report
out-of-bounds page access or other errors.
Clean up mtr_t::page_lock()
buf_page_get_low(): Validate the page identifier (to check for
recently read corrupted pages) after acquiring the page latch.
buf_page_t::read_complete(): Flag uninitialized (all-zero) pages
with DB_FAIL. Return DB_PAGE_CORRUPTED on page number mismatch.
mtr_t::defer_drop_ahi(): Renamed from mtr_defer_drop_ahi().
recv_sys_t::free_corrupted_page(): Only set_corrupt_fs()
if any log records exist for the page. We do not mind if read-ahead
produces corrupted (or all-zero) pages that were not actually needed
during recovery.
recv_recover_page(): Return whether the operation succeeded.
recv_sys_t::recover_low(): Simplify the logic. Check for recovery error.
Thanks to Matthias Leich for testing this extensively and to the
authors of https://rr-project.org for making it easy to diagnose
and fix any failures that were found during the testing.
2022-06-06 14:03:22 +03:00
|
|
|
nullptr, &mtr, &err);
|
2005-10-27 07:29:40 +00:00
|
|
|
if (root_page_no == FIL_NULL) {
|
2021-05-20 14:58:25 +03:00
|
|
|
goto func_exit;
|
2005-10-27 07:29:40 +00:00
|
|
|
}
|
|
|
|
|
MDEV-27058: Reduce the size of buf_block_t and buf_page_t
buf_page_t::frame: Moved from buf_block_t::frame.
All 'thin' buf_page_t describing compressed-only ROW_FORMAT=COMPRESSED
pages will have frame=nullptr, while all 'fat' buf_block_t
will have a non-null frame pointing to aligned innodb_page_size bytes.
This eliminates the need for separate states for
BUF_BLOCK_FILE_PAGE and BUF_BLOCK_ZIP_PAGE.
buf_page_t::lock: Moved from buf_block_t::lock. That is, all block
descriptors will have a page latch. The IO_PIN state that was used
for discarding or creating the uncompressed page frame of a
ROW_FORMAT=COMPRESSED block is replaced by a combination of read-fix
and page X-latch.
page_zip_des_t::fix: Replaces state_, buf_fix_count_, io_fix_, status
of buf_page_t with a single std::atomic<uint32_t>. All modifications
will use store(), fetch_add(), fetch_sub(). This space was previously
wasted to alignment on 64-bit systems. We will use the following encoding
that combines a state (partly read-fix or write-fix) and a buffer-fix
count:
buf_page_t::NOT_USED=0 (previously BUF_BLOCK_NOT_USED)
buf_page_t::MEMORY=1 (previously BUF_BLOCK_MEMORY)
buf_page_t::REMOVE_HASH=2 (previously BUF_BLOCK_REMOVE_HASH)
buf_page_t::FREED=3 + fix: pages marked as freed in the file
buf_page_t::UNFIXED=1U<<29 + fix: normal pages
buf_page_t::IBUF_EXIST=2U<<29 + fix: normal pages; may need ibuf merge
buf_page_t::REINIT=3U<<29 + fix: reinitialized pages (skip doublewrite)
buf_page_t::READ_FIX=4U<<29 + fix: read-fixed pages (also X-latched)
buf_page_t::WRITE_FIX=5U<<29 + fix: write-fixed pages (also U-latched)
buf_page_t::WRITE_FIX_IBUF=6U<<29 + fix: write-fixed; may have ibuf
buf_page_t::WRITE_FIX_REINIT=7U<<29 + fix: write-fixed (no doublewrite)
buf_page_t::write_complete(): Change WRITE_FIX or WRITE_FIX_REINIT to
UNFIXED, and WRITE_FIX_IBUF to IBUF_EXIST, before releasing the U-latch.
buf_page_t::read_complete(): Renamed from buf_page_read_complete().
Change READ_FIX to UNFIXED or IBUF_EXIST, before releasing the X-latch.
buf_page_t::can_relocate(): If the page latch is being held or waited for,
or the block is buffer-fixed or io-fixed, return false. (The condition
on the page latch is new.)
Outside buf_page_get_gen(), buf_page_get_low() and buf_page_free(), we
will acquire the page latch before fix(), and unfix() before unlocking.
buf_page_t::flush(): Replaces buf_flush_page(). Optimize the
handling of FREED pages.
buf_pool_t::release_freed_page(): Assume that buf_pool.mutex is held
by the caller.
buf_page_t::is_read_fixed(), buf_page_t::is_write_fixed(): New predicates.
buf_page_get_low(): Ignore guesses that are read-fixed because they
may not yet be registered in buf_pool.page_hash and buf_pool.LRU.
buf_page_optimistic_get(): Acquire latch before buffer-fixing.
buf_page_make_young(): Leave read-fixed blocks alone, because they
might not be registered in buf_pool.LRU yet.
recv_sys_t::recover_deferred(), recv_sys_t::recover_low():
Possibly fix MDEV-26326, by holding a page X-latch instead of
only buffer-fixing the page.
2021-11-16 19:55:06 +02:00
|
|
|
mtr.write<4>(*d, DICT_HDR + DICT_HDR_TABLES + d->page.frame,
|
|
|
|
root_page_no);
|
2006-02-23 19:25:29 +00:00
|
|
|
/*--------------------------*/
|
MDEV-12266: Change dict_table_t::space to fil_space_t*
InnoDB always keeps all tablespaces in the fil_system cache.
The fil_system.LRU is only for closing file handles; the
fil_space_t and fil_node_t for all data files will remain
in main memory. Between startup to shutdown, they can only be
created and removed by DDL statements. Therefore, we can
let dict_table_t::space point directly to the fil_space_t.
dict_table_t::space_id: A numeric tablespace ID for the corner cases
where we do not have a tablespace. The most prominent examples are
ALTER TABLE...DISCARD TABLESPACE or a missing or corrupted file.
There are a few functional differences; most notably:
(1) DROP TABLE will delete matching .ibd and .cfg files,
even if they were not attached to the data dictionary.
(2) Some error messages will report file names instead of numeric IDs.
There still are many functions that use numeric tablespace IDs instead
of fil_space_t*, and many functions could be converted to fil_space_t
member functions. Also, Tablespace and Datafile should be merged with
fil_space_t and fil_node_t. page_id_t and buf_page_get_gen() could use
fil_space_t& instead of a numeric ID, and after moving to a single
buffer pool (MDEV-15058), buf_pool_t::page_hash could be moved to
fil_space_t::page_hash.
FilSpace: Remove. Only few calls to fil_space_acquire() will remain,
and gradually they should be removed.
mtr_t::set_named_space_id(ulint): Renamed from set_named_space(),
to prevent accidental calls to this slower function. Very few
callers remain.
fseg_create(), fsp_reserve_free_extents(): Take fil_space_t*
as a parameter instead of a space_id.
fil_space_t::rename(): Wrapper for fil_rename_tablespace_check(),
fil_name_write_rename(), fil_rename_tablespace(). Mariabackup
passes the parameter log=false; InnoDB passes log=true.
dict_mem_table_create(): Take fil_space_t* instead of space_id
as parameter.
dict_process_sys_tables_rec_and_mtr_commit(): Replace the parameter
'status' with 'bool cached'.
dict_get_and_save_data_dir_path(): Avoid copying the fil_node_t::name.
fil_ibd_open(): Return the tablespace.
fil_space_t::set_imported(): Replaces fil_space_set_imported().
truncate_t: Change many member function parameters to fil_space_t*,
and remove page_size parameters.
row_truncate_prepare(): Merge to its only caller.
row_drop_table_from_cache(): Assert that the table is persistent.
dict_create_sys_indexes_tuple(): Write SYS_INDEXES.SPACE=FIL_NULL
if the tablespace has been discarded.
row_import_update_discarded_flag(): Remove a constant parameter.
2018-03-27 16:31:10 +03:00
|
|
|
root_page_no = btr_create(DICT_UNIQUE,
|
|
|
|
fil_system.sys_space, DICT_TABLE_IDS_ID,
|
MDEV-13542: Crashing on corrupted page is unhelpful
The approach to handling corruption that was chosen by Oracle in
commit 177d8b0c125b841c0650d27d735e3b87509dc286
is not really useful. Not only did it actually fail to prevent InnoDB
from crashing, but it is making things worse by blocking attempts to
rescue data from or rebuild a partially readable table.
We will try to prevent crashes in a different way: by propagating
errors up the call stack. We will never mark the clustered index
persistently corrupted, so that data recovery may be attempted by
reading from the table, or by rebuilding the table.
This should also fix MDEV-13680 (crash on btr_page_alloc() failure);
it was extensively tested with innodb_file_per_table=0 and a
non-autoextend system tablespace.
We should now avoid crashes in many cases, such as when a page
cannot be read or allocated, or an inconsistency is detected when
attempting to update multiple pages. We will not crash on double-free,
such as on the recovery of DDL in system tablespace in case something
was corrupted.
Crashes on corrupted data are still possible. The fault injection mechanism
that is introduced in the subsequent commit may help catch more of them.
buf_page_import_corrupt_failure: Remove the fault injection, and instead
corrupt some pages using Perl code in the tests.
btr_cur_pessimistic_insert(): Always reserve extents (except for the
change buffer), in order to prevent a subsequent allocation failure.
btr_pcur_open_at_rnd_pos(): Merged to the only caller ibuf_merge_pages().
btr_assert_not_corrupted(), btr_corruption_report(): Remove.
Similar checks are already part of btr_block_get().
FSEG_MAGIC_N_BYTES: Replaces FSEG_MAGIC_N_VALUE.
dict_hdr_get(), trx_rsegf_get_new(), trx_undo_page_get(),
trx_undo_page_get_s_latched(): Replaced with error-checking calls.
trx_rseg_t::get(mtr_t*): Replaces trx_rsegf_get().
trx_rseg_header_create(): Let the caller update the TRX_SYS page if needed.
trx_sys_create_sys_pages(): Merged with trx_sysf_create().
dict_check_tablespaces_and_store_max_id(): Do not access
DICT_HDR_MAX_SPACE_ID, because it was already recovered in dict_boot().
Merge dict_check_sys_tables() with this function.
dir_pathname(): Replaces os_file_make_new_pathname().
row_undo_ins_remove_sec(): Do not modify the undo page by adding
a terminating NUL byte to the record.
btr_decryption_failed(): Report decryption failures
dict_set_corrupted_by_space(), dict_set_encrypted_by_space(),
dict_set_corrupted_index_cache_only(): Remove.
dict_set_corrupted(): Remove the constant parameter dict_locked=false.
Never flag the clustered index corrupted in SYS_INDEXES, because
that would deny further access to the table. It might be possible to
repair the table by executing ALTER TABLE or OPTIMIZE TABLE, in case
no B-tree leaf page is corrupted.
dict_table_skip_corrupt_index(), dict_table_next_uncorrupted_index(),
row_purge_skip_uncommitted_virtual_index(): Remove, and refactor
the callers to read dict_index_t::type only once.
dict_table_is_corrupted(): Remove.
dict_index_t::is_btree(): Determine if the index is a valid B-tree.
BUF_GET_NO_LATCH, BUF_EVICT_IF_IN_POOL: Remove.
UNIV_BTR_DEBUG: Remove. Any inconsistency will no longer trigger
assertion failures, but error codes being returned.
buf_corrupt_page_release(): Replaced with a direct call to
buf_pool.corrupted_evict().
fil_invalid_page_access_msg(): Never crash on an invalid read;
let the caller of buf_page_get_gen() decide.
btr_pcur_t::restore_position(): Propagate failure status to the caller
by returning CORRUPTED.
opt_search_plan_for_table(): Simplify the code.
row_purge_del_mark(), row_purge_upd_exist_or_extern_func(),
row_undo_ins_remove_sec_rec(), row_undo_mod_upd_del_sec(),
row_undo_mod_del_mark_sec(): Avoid mem_heap_create()/mem_heap_free()
when no secondary indexes exist.
row_undo_mod_upd_exist_sec(): Simplify the code.
row_upd_clust_step(), dict_load_table_one(): Return DB_TABLE_CORRUPT
if the clustered index (and therefore the table) is corrupted, similar
to what we do in row_insert_for_mysql().
fut_get_ptr(): Replace with buf_page_get_gen() calls.
buf_page_get_gen(): Return nullptr and *err=DB_CORRUPTION
if the page is marked as freed. For other modes than
BUF_GET_POSSIBLY_FREED or BUF_PEEK_IF_IN_POOL this will
trigger a debug assertion failure. For BUF_GET_POSSIBLY_FREED,
we will return nullptr for freed pages, so that the callers
can be simplified. The purge of transaction history will be
a new user of BUF_GET_POSSIBLY_FREED, to avoid crashes on
corrupted data.
buf_page_get_low(): Never crash on a corrupted page, but simply
return nullptr.
fseg_page_is_allocated(): Replaces fseg_page_is_free().
fts_drop_common_tables(): Return an error if the transaction
was rolled back.
fil_space_t::set_corrupted(): Report a tablespace as corrupted if
it was not reported already.
fil_space_t::io(): Invoke fil_space_t::set_corrupted() to report
out-of-bounds page access or other errors.
Clean up mtr_t::page_lock()
buf_page_get_low(): Validate the page identifier (to check for
recently read corrupted pages) after acquiring the page latch.
buf_page_t::read_complete(): Flag uninitialized (all-zero) pages
with DB_FAIL. Return DB_PAGE_CORRUPTED on page number mismatch.
mtr_t::defer_drop_ahi(): Renamed from mtr_defer_drop_ahi().
recv_sys_t::free_corrupted_page(): Only set_corrupt_fs()
if any log records exist for the page. We do not mind if read-ahead
produces corrupted (or all-zero) pages that were not actually needed
during recovery.
recv_recover_page(): Return whether the operation succeeded.
recv_sys_t::recover_low(): Simplify the logic. Check for recovery error.
Thanks to Matthias Leich for testing this extensively and to the
authors of https://rr-project.org for making it easy to diagnose
and fix any failures that were found during the testing.
2022-06-06 14:03:22 +03:00
|
|
|
nullptr, &mtr, &err);
|
2005-10-27 07:29:40 +00:00
|
|
|
if (root_page_no == FIL_NULL) {
|
MDEV-13542: Crashing on corrupted page is unhelpful
The approach to handling corruption that was chosen by Oracle in
commit 177d8b0c125b841c0650d27d735e3b87509dc286
is not really useful. Not only did it actually fail to prevent InnoDB
from crashing, but it is making things worse by blocking attempts to
rescue data from or rebuild a partially readable table.
We will try to prevent crashes in a different way: by propagating
errors up the call stack. We will never mark the clustered index
persistently corrupted, so that data recovery may be attempted by
reading from the table, or by rebuilding the table.
This should also fix MDEV-13680 (crash on btr_page_alloc() failure);
it was extensively tested with innodb_file_per_table=0 and a
non-autoextend system tablespace.
We should now avoid crashes in many cases, such as when a page
cannot be read or allocated, or an inconsistency is detected when
attempting to update multiple pages. We will not crash on double-free,
such as on the recovery of DDL in system tablespace in case something
was corrupted.
Crashes on corrupted data are still possible. The fault injection mechanism
that is introduced in the subsequent commit may help catch more of them.
buf_page_import_corrupt_failure: Remove the fault injection, and instead
corrupt some pages using Perl code in the tests.
btr_cur_pessimistic_insert(): Always reserve extents (except for the
change buffer), in order to prevent a subsequent allocation failure.
btr_pcur_open_at_rnd_pos(): Merged to the only caller ibuf_merge_pages().
btr_assert_not_corrupted(), btr_corruption_report(): Remove.
Similar checks are already part of btr_block_get().
FSEG_MAGIC_N_BYTES: Replaces FSEG_MAGIC_N_VALUE.
dict_hdr_get(), trx_rsegf_get_new(), trx_undo_page_get(),
trx_undo_page_get_s_latched(): Replaced with error-checking calls.
trx_rseg_t::get(mtr_t*): Replaces trx_rsegf_get().
trx_rseg_header_create(): Let the caller update the TRX_SYS page if needed.
trx_sys_create_sys_pages(): Merged with trx_sysf_create().
dict_check_tablespaces_and_store_max_id(): Do not access
DICT_HDR_MAX_SPACE_ID, because it was already recovered in dict_boot().
Merge dict_check_sys_tables() with this function.
dir_pathname(): Replaces os_file_make_new_pathname().
row_undo_ins_remove_sec(): Do not modify the undo page by adding
a terminating NUL byte to the record.
btr_decryption_failed(): Report decryption failures
dict_set_corrupted_by_space(), dict_set_encrypted_by_space(),
dict_set_corrupted_index_cache_only(): Remove.
dict_set_corrupted(): Remove the constant parameter dict_locked=false.
Never flag the clustered index corrupted in SYS_INDEXES, because
that would deny further access to the table. It might be possible to
repair the table by executing ALTER TABLE or OPTIMIZE TABLE, in case
no B-tree leaf page is corrupted.
dict_table_skip_corrupt_index(), dict_table_next_uncorrupted_index(),
row_purge_skip_uncommitted_virtual_index(): Remove, and refactor
the callers to read dict_index_t::type only once.
dict_table_is_corrupted(): Remove.
dict_index_t::is_btree(): Determine if the index is a valid B-tree.
BUF_GET_NO_LATCH, BUF_EVICT_IF_IN_POOL: Remove.
UNIV_BTR_DEBUG: Remove. Any inconsistency will no longer trigger
assertion failures, but error codes being returned.
buf_corrupt_page_release(): Replaced with a direct call to
buf_pool.corrupted_evict().
fil_invalid_page_access_msg(): Never crash on an invalid read;
let the caller of buf_page_get_gen() decide.
btr_pcur_t::restore_position(): Propagate failure status to the caller
by returning CORRUPTED.
opt_search_plan_for_table(): Simplify the code.
row_purge_del_mark(), row_purge_upd_exist_or_extern_func(),
row_undo_ins_remove_sec_rec(), row_undo_mod_upd_del_sec(),
row_undo_mod_del_mark_sec(): Avoid mem_heap_create()/mem_heap_free()
when no secondary indexes exist.
row_undo_mod_upd_exist_sec(): Simplify the code.
row_upd_clust_step(), dict_load_table_one(): Return DB_TABLE_CORRUPT
if the clustered index (and therefore the table) is corrupted, similar
to what we do in row_insert_for_mysql().
fut_get_ptr(): Replace with buf_page_get_gen() calls.
buf_page_get_gen(): Return nullptr and *err=DB_CORRUPTION
if the page is marked as freed. For other modes than
BUF_GET_POSSIBLY_FREED or BUF_PEEK_IF_IN_POOL this will
trigger a debug assertion failure. For BUF_GET_POSSIBLY_FREED,
we will return nullptr for freed pages, so that the callers
can be simplified. The purge of transaction history will be
a new user of BUF_GET_POSSIBLY_FREED, to avoid crashes on
corrupted data.
buf_page_get_low(): Never crash on a corrupted page, but simply
return nullptr.
fseg_page_is_allocated(): Replaces fseg_page_is_free().
fts_drop_common_tables(): Return an error if the transaction
was rolled back.
fil_space_t::set_corrupted(): Report a tablespace as corrupted if
it was not reported already.
fil_space_t::io(): Invoke fil_space_t::set_corrupted() to report
out-of-bounds page access or other errors.
Clean up mtr_t::page_lock()
buf_page_get_low(): Validate the page identifier (to check for
recently read corrupted pages) after acquiring the page latch.
buf_page_t::read_complete(): Flag uninitialized (all-zero) pages
with DB_FAIL. Return DB_PAGE_CORRUPTED on page number mismatch.
mtr_t::defer_drop_ahi(): Renamed from mtr_defer_drop_ahi().
recv_sys_t::free_corrupted_page(): Only set_corrupt_fs()
if any log records exist for the page. We do not mind if read-ahead
produces corrupted (or all-zero) pages that were not actually needed
during recovery.
recv_recover_page(): Return whether the operation succeeded.
recv_sys_t::recover_low(): Simplify the logic. Check for recovery error.
Thanks to Matthias Leich for testing this extensively and to the
authors of https://rr-project.org for making it easy to diagnose
and fix any failures that were found during the testing.
2022-06-06 14:03:22 +03:00
|
|
|
goto func_exit;
|
2005-10-27 07:29:40 +00:00
|
|
|
}
|
|
|
|
|
MDEV-27058: Reduce the size of buf_block_t and buf_page_t
buf_page_t::frame: Moved from buf_block_t::frame.
All 'thin' buf_page_t describing compressed-only ROW_FORMAT=COMPRESSED
pages will have frame=nullptr, while all 'fat' buf_block_t
will have a non-null frame pointing to aligned innodb_page_size bytes.
This eliminates the need for separate states for
BUF_BLOCK_FILE_PAGE and BUF_BLOCK_ZIP_PAGE.
buf_page_t::lock: Moved from buf_block_t::lock. That is, all block
descriptors will have a page latch. The IO_PIN state that was used
for discarding or creating the uncompressed page frame of a
ROW_FORMAT=COMPRESSED block is replaced by a combination of read-fix
and page X-latch.
page_zip_des_t::fix: Replaces state_, buf_fix_count_, io_fix_, status
of buf_page_t with a single std::atomic<uint32_t>. All modifications
will use store(), fetch_add(), fetch_sub(). This space was previously
wasted to alignment on 64-bit systems. We will use the following encoding
that combines a state (partly read-fix or write-fix) and a buffer-fix
count:
buf_page_t::NOT_USED=0 (previously BUF_BLOCK_NOT_USED)
buf_page_t::MEMORY=1 (previously BUF_BLOCK_MEMORY)
buf_page_t::REMOVE_HASH=2 (previously BUF_BLOCK_REMOVE_HASH)
buf_page_t::FREED=3 + fix: pages marked as freed in the file
buf_page_t::UNFIXED=1U<<29 + fix: normal pages
buf_page_t::IBUF_EXIST=2U<<29 + fix: normal pages; may need ibuf merge
buf_page_t::REINIT=3U<<29 + fix: reinitialized pages (skip doublewrite)
buf_page_t::READ_FIX=4U<<29 + fix: read-fixed pages (also X-latched)
buf_page_t::WRITE_FIX=5U<<29 + fix: write-fixed pages (also U-latched)
buf_page_t::WRITE_FIX_IBUF=6U<<29 + fix: write-fixed; may have ibuf
buf_page_t::WRITE_FIX_REINIT=7U<<29 + fix: write-fixed (no doublewrite)
buf_page_t::write_complete(): Change WRITE_FIX or WRITE_FIX_REINIT to
UNFIXED, and WRITE_FIX_IBUF to IBUF_EXIST, before releasing the U-latch.
buf_page_t::read_complete(): Renamed from buf_page_read_complete().
Change READ_FIX to UNFIXED or IBUF_EXIST, before releasing the X-latch.
buf_page_t::can_relocate(): If the page latch is being held or waited for,
or the block is buffer-fixed or io-fixed, return false. (The condition
on the page latch is new.)
Outside buf_page_get_gen(), buf_page_get_low() and buf_page_free(), we
will acquire the page latch before fix(), and unfix() before unlocking.
buf_page_t::flush(): Replaces buf_flush_page(). Optimize the
handling of FREED pages.
buf_pool_t::release_freed_page(): Assume that buf_pool.mutex is held
by the caller.
buf_page_t::is_read_fixed(), buf_page_t::is_write_fixed(): New predicates.
buf_page_get_low(): Ignore guesses that are read-fixed because they
may not yet be registered in buf_pool.page_hash and buf_pool.LRU.
buf_page_optimistic_get(): Acquire latch before buffer-fixing.
buf_page_make_young(): Leave read-fixed blocks alone, because they
might not be registered in buf_pool.LRU yet.
recv_sys_t::recover_deferred(), recv_sys_t::recover_low():
Possibly fix MDEV-26326, by holding a page X-latch instead of
only buffer-fixing the page.
2021-11-16 19:55:06 +02:00
|
|
|
mtr.write<4>(*d, DICT_HDR + DICT_HDR_TABLE_IDS + d->page.frame,
|
2021-05-20 14:58:25 +03:00
|
|
|
root_page_no);
|
2006-02-23 19:25:29 +00:00
|
|
|
/*--------------------------*/
|
MDEV-12266: Change dict_table_t::space to fil_space_t*
InnoDB always keeps all tablespaces in the fil_system cache.
The fil_system.LRU is only for closing file handles; the
fil_space_t and fil_node_t for all data files will remain
in main memory. Between startup to shutdown, they can only be
created and removed by DDL statements. Therefore, we can
let dict_table_t::space point directly to the fil_space_t.
dict_table_t::space_id: A numeric tablespace ID for the corner cases
where we do not have a tablespace. The most prominent examples are
ALTER TABLE...DISCARD TABLESPACE or a missing or corrupted file.
There are a few functional differences; most notably:
(1) DROP TABLE will delete matching .ibd and .cfg files,
even if they were not attached to the data dictionary.
(2) Some error messages will report file names instead of numeric IDs.
There still are many functions that use numeric tablespace IDs instead
of fil_space_t*, and many functions could be converted to fil_space_t
member functions. Also, Tablespace and Datafile should be merged with
fil_space_t and fil_node_t. page_id_t and buf_page_get_gen() could use
fil_space_t& instead of a numeric ID, and after moving to a single
buffer pool (MDEV-15058), buf_pool_t::page_hash could be moved to
fil_space_t::page_hash.
FilSpace: Remove. Only few calls to fil_space_acquire() will remain,
and gradually they should be removed.
mtr_t::set_named_space_id(ulint): Renamed from set_named_space(),
to prevent accidental calls to this slower function. Very few
callers remain.
fseg_create(), fsp_reserve_free_extents(): Take fil_space_t*
as a parameter instead of a space_id.
fil_space_t::rename(): Wrapper for fil_rename_tablespace_check(),
fil_name_write_rename(), fil_rename_tablespace(). Mariabackup
passes the parameter log=false; InnoDB passes log=true.
dict_mem_table_create(): Take fil_space_t* instead of space_id
as parameter.
dict_process_sys_tables_rec_and_mtr_commit(): Replace the parameter
'status' with 'bool cached'.
dict_get_and_save_data_dir_path(): Avoid copying the fil_node_t::name.
fil_ibd_open(): Return the tablespace.
fil_space_t::set_imported(): Replaces fil_space_set_imported().
truncate_t: Change many member function parameters to fil_space_t*,
and remove page_size parameters.
row_truncate_prepare(): Merge to its only caller.
row_drop_table_from_cache(): Assert that the table is persistent.
dict_create_sys_indexes_tuple(): Write SYS_INDEXES.SPACE=FIL_NULL
if the tablespace has been discarded.
row_import_update_discarded_flag(): Remove a constant parameter.
2018-03-27 16:31:10 +03:00
|
|
|
root_page_no = btr_create(DICT_CLUSTERED | DICT_UNIQUE,
|
|
|
|
fil_system.sys_space, DICT_COLUMNS_ID,
|
MDEV-13542: Crashing on corrupted page is unhelpful
The approach to handling corruption that was chosen by Oracle in
commit 177d8b0c125b841c0650d27d735e3b87509dc286
is not really useful. Not only did it actually fail to prevent InnoDB
from crashing, but it is making things worse by blocking attempts to
rescue data from or rebuild a partially readable table.
We will try to prevent crashes in a different way: by propagating
errors up the call stack. We will never mark the clustered index
persistently corrupted, so that data recovery may be attempted by
reading from the table, or by rebuilding the table.
This should also fix MDEV-13680 (crash on btr_page_alloc() failure);
it was extensively tested with innodb_file_per_table=0 and a
non-autoextend system tablespace.
We should now avoid crashes in many cases, such as when a page
cannot be read or allocated, or an inconsistency is detected when
attempting to update multiple pages. We will not crash on double-free,
such as on the recovery of DDL in system tablespace in case something
was corrupted.
Crashes on corrupted data are still possible. The fault injection mechanism
that is introduced in the subsequent commit may help catch more of them.
buf_page_import_corrupt_failure: Remove the fault injection, and instead
corrupt some pages using Perl code in the tests.
btr_cur_pessimistic_insert(): Always reserve extents (except for the
change buffer), in order to prevent a subsequent allocation failure.
btr_pcur_open_at_rnd_pos(): Merged to the only caller ibuf_merge_pages().
btr_assert_not_corrupted(), btr_corruption_report(): Remove.
Similar checks are already part of btr_block_get().
FSEG_MAGIC_N_BYTES: Replaces FSEG_MAGIC_N_VALUE.
dict_hdr_get(), trx_rsegf_get_new(), trx_undo_page_get(),
trx_undo_page_get_s_latched(): Replaced with error-checking calls.
trx_rseg_t::get(mtr_t*): Replaces trx_rsegf_get().
trx_rseg_header_create(): Let the caller update the TRX_SYS page if needed.
trx_sys_create_sys_pages(): Merged with trx_sysf_create().
dict_check_tablespaces_and_store_max_id(): Do not access
DICT_HDR_MAX_SPACE_ID, because it was already recovered in dict_boot().
Merge dict_check_sys_tables() with this function.
dir_pathname(): Replaces os_file_make_new_pathname().
row_undo_ins_remove_sec(): Do not modify the undo page by adding
a terminating NUL byte to the record.
btr_decryption_failed(): Report decryption failures
dict_set_corrupted_by_space(), dict_set_encrypted_by_space(),
dict_set_corrupted_index_cache_only(): Remove.
dict_set_corrupted(): Remove the constant parameter dict_locked=false.
Never flag the clustered index corrupted in SYS_INDEXES, because
that would deny further access to the table. It might be possible to
repair the table by executing ALTER TABLE or OPTIMIZE TABLE, in case
no B-tree leaf page is corrupted.
dict_table_skip_corrupt_index(), dict_table_next_uncorrupted_index(),
row_purge_skip_uncommitted_virtual_index(): Remove, and refactor
the callers to read dict_index_t::type only once.
dict_table_is_corrupted(): Remove.
dict_index_t::is_btree(): Determine if the index is a valid B-tree.
BUF_GET_NO_LATCH, BUF_EVICT_IF_IN_POOL: Remove.
UNIV_BTR_DEBUG: Remove. Any inconsistency will no longer trigger
assertion failures, but error codes being returned.
buf_corrupt_page_release(): Replaced with a direct call to
buf_pool.corrupted_evict().
fil_invalid_page_access_msg(): Never crash on an invalid read;
let the caller of buf_page_get_gen() decide.
btr_pcur_t::restore_position(): Propagate failure status to the caller
by returning CORRUPTED.
opt_search_plan_for_table(): Simplify the code.
row_purge_del_mark(), row_purge_upd_exist_or_extern_func(),
row_undo_ins_remove_sec_rec(), row_undo_mod_upd_del_sec(),
row_undo_mod_del_mark_sec(): Avoid mem_heap_create()/mem_heap_free()
when no secondary indexes exist.
row_undo_mod_upd_exist_sec(): Simplify the code.
row_upd_clust_step(), dict_load_table_one(): Return DB_TABLE_CORRUPT
if the clustered index (and therefore the table) is corrupted, similar
to what we do in row_insert_for_mysql().
fut_get_ptr(): Replace with buf_page_get_gen() calls.
buf_page_get_gen(): Return nullptr and *err=DB_CORRUPTION
if the page is marked as freed. For other modes than
BUF_GET_POSSIBLY_FREED or BUF_PEEK_IF_IN_POOL this will
trigger a debug assertion failure. For BUF_GET_POSSIBLY_FREED,
we will return nullptr for freed pages, so that the callers
can be simplified. The purge of transaction history will be
a new user of BUF_GET_POSSIBLY_FREED, to avoid crashes on
corrupted data.
buf_page_get_low(): Never crash on a corrupted page, but simply
return nullptr.
fseg_page_is_allocated(): Replaces fseg_page_is_free().
fts_drop_common_tables(): Return an error if the transaction
was rolled back.
fil_space_t::set_corrupted(): Report a tablespace as corrupted if
it was not reported already.
fil_space_t::io(): Invoke fil_space_t::set_corrupted() to report
out-of-bounds page access or other errors.
Clean up mtr_t::page_lock()
buf_page_get_low(): Validate the page identifier (to check for
recently read corrupted pages) after acquiring the page latch.
buf_page_t::read_complete(): Flag uninitialized (all-zero) pages
with DB_FAIL. Return DB_PAGE_CORRUPTED on page number mismatch.
mtr_t::defer_drop_ahi(): Renamed from mtr_defer_drop_ahi().
recv_sys_t::free_corrupted_page(): Only set_corrupt_fs()
if any log records exist for the page. We do not mind if read-ahead
produces corrupted (or all-zero) pages that were not actually needed
during recovery.
recv_recover_page(): Return whether the operation succeeded.
recv_sys_t::recover_low(): Simplify the logic. Check for recovery error.
Thanks to Matthias Leich for testing this extensively and to the
authors of https://rr-project.org for making it easy to diagnose
and fix any failures that were found during the testing.
2022-06-06 14:03:22 +03:00
|
|
|
nullptr, &mtr, &err);
|
2005-10-27 07:29:40 +00:00
|
|
|
if (root_page_no == FIL_NULL) {
|
MDEV-13542: Crashing on corrupted page is unhelpful
The approach to handling corruption that was chosen by Oracle in
commit 177d8b0c125b841c0650d27d735e3b87509dc286
is not really useful. Not only did it actually fail to prevent InnoDB
from crashing, but it is making things worse by blocking attempts to
rescue data from or rebuild a partially readable table.
We will try to prevent crashes in a different way: by propagating
errors up the call stack. We will never mark the clustered index
persistently corrupted, so that data recovery may be attempted by
reading from the table, or by rebuilding the table.
This should also fix MDEV-13680 (crash on btr_page_alloc() failure);
it was extensively tested with innodb_file_per_table=0 and a
non-autoextend system tablespace.
We should now avoid crashes in many cases, such as when a page
cannot be read or allocated, or an inconsistency is detected when
attempting to update multiple pages. We will not crash on double-free,
such as on the recovery of DDL in system tablespace in case something
was corrupted.
Crashes on corrupted data are still possible. The fault injection mechanism
that is introduced in the subsequent commit may help catch more of them.
buf_page_import_corrupt_failure: Remove the fault injection, and instead
corrupt some pages using Perl code in the tests.
btr_cur_pessimistic_insert(): Always reserve extents (except for the
change buffer), in order to prevent a subsequent allocation failure.
btr_pcur_open_at_rnd_pos(): Merged to the only caller ibuf_merge_pages().
btr_assert_not_corrupted(), btr_corruption_report(): Remove.
Similar checks are already part of btr_block_get().
FSEG_MAGIC_N_BYTES: Replaces FSEG_MAGIC_N_VALUE.
dict_hdr_get(), trx_rsegf_get_new(), trx_undo_page_get(),
trx_undo_page_get_s_latched(): Replaced with error-checking calls.
trx_rseg_t::get(mtr_t*): Replaces trx_rsegf_get().
trx_rseg_header_create(): Let the caller update the TRX_SYS page if needed.
trx_sys_create_sys_pages(): Merged with trx_sysf_create().
dict_check_tablespaces_and_store_max_id(): Do not access
DICT_HDR_MAX_SPACE_ID, because it was already recovered in dict_boot().
Merge dict_check_sys_tables() with this function.
dir_pathname(): Replaces os_file_make_new_pathname().
row_undo_ins_remove_sec(): Do not modify the undo page by adding
a terminating NUL byte to the record.
btr_decryption_failed(): Report decryption failures
dict_set_corrupted_by_space(), dict_set_encrypted_by_space(),
dict_set_corrupted_index_cache_only(): Remove.
dict_set_corrupted(): Remove the constant parameter dict_locked=false.
Never flag the clustered index corrupted in SYS_INDEXES, because
that would deny further access to the table. It might be possible to
repair the table by executing ALTER TABLE or OPTIMIZE TABLE, in case
no B-tree leaf page is corrupted.
dict_table_skip_corrupt_index(), dict_table_next_uncorrupted_index(),
row_purge_skip_uncommitted_virtual_index(): Remove, and refactor
the callers to read dict_index_t::type only once.
dict_table_is_corrupted(): Remove.
dict_index_t::is_btree(): Determine if the index is a valid B-tree.
BUF_GET_NO_LATCH, BUF_EVICT_IF_IN_POOL: Remove.
UNIV_BTR_DEBUG: Remove. Any inconsistency will no longer trigger
assertion failures, but error codes being returned.
buf_corrupt_page_release(): Replaced with a direct call to
buf_pool.corrupted_evict().
fil_invalid_page_access_msg(): Never crash on an invalid read;
let the caller of buf_page_get_gen() decide.
btr_pcur_t::restore_position(): Propagate failure status to the caller
by returning CORRUPTED.
opt_search_plan_for_table(): Simplify the code.
row_purge_del_mark(), row_purge_upd_exist_or_extern_func(),
row_undo_ins_remove_sec_rec(), row_undo_mod_upd_del_sec(),
row_undo_mod_del_mark_sec(): Avoid mem_heap_create()/mem_heap_free()
when no secondary indexes exist.
row_undo_mod_upd_exist_sec(): Simplify the code.
row_upd_clust_step(), dict_load_table_one(): Return DB_TABLE_CORRUPT
if the clustered index (and therefore the table) is corrupted, similar
to what we do in row_insert_for_mysql().
fut_get_ptr(): Replace with buf_page_get_gen() calls.
buf_page_get_gen(): Return nullptr and *err=DB_CORRUPTION
if the page is marked as freed. For other modes than
BUF_GET_POSSIBLY_FREED or BUF_PEEK_IF_IN_POOL this will
trigger a debug assertion failure. For BUF_GET_POSSIBLY_FREED,
we will return nullptr for freed pages, so that the callers
can be simplified. The purge of transaction history will be
a new user of BUF_GET_POSSIBLY_FREED, to avoid crashes on
corrupted data.
buf_page_get_low(): Never crash on a corrupted page, but simply
return nullptr.
fseg_page_is_allocated(): Replaces fseg_page_is_free().
fts_drop_common_tables(): Return an error if the transaction
was rolled back.
fil_space_t::set_corrupted(): Report a tablespace as corrupted if
it was not reported already.
fil_space_t::io(): Invoke fil_space_t::set_corrupted() to report
out-of-bounds page access or other errors.
Clean up mtr_t::page_lock()
buf_page_get_low(): Validate the page identifier (to check for
recently read corrupted pages) after acquiring the page latch.
buf_page_t::read_complete(): Flag uninitialized (all-zero) pages
with DB_FAIL. Return DB_PAGE_CORRUPTED on page number mismatch.
mtr_t::defer_drop_ahi(): Renamed from mtr_defer_drop_ahi().
recv_sys_t::free_corrupted_page(): Only set_corrupt_fs()
if any log records exist for the page. We do not mind if read-ahead
produces corrupted (or all-zero) pages that were not actually needed
during recovery.
recv_recover_page(): Return whether the operation succeeded.
recv_sys_t::recover_low(): Simplify the logic. Check for recovery error.
Thanks to Matthias Leich for testing this extensively and to the
authors of https://rr-project.org for making it easy to diagnose
and fix any failures that were found during the testing.
2022-06-06 14:03:22 +03:00
|
|
|
goto func_exit;
|
2005-10-27 07:29:40 +00:00
|
|
|
}
|
|
|
|
|
MDEV-27058: Reduce the size of buf_block_t and buf_page_t
buf_page_t::frame: Moved from buf_block_t::frame.
All 'thin' buf_page_t describing compressed-only ROW_FORMAT=COMPRESSED
pages will have frame=nullptr, while all 'fat' buf_block_t
will have a non-null frame pointing to aligned innodb_page_size bytes.
This eliminates the need for separate states for
BUF_BLOCK_FILE_PAGE and BUF_BLOCK_ZIP_PAGE.
buf_page_t::lock: Moved from buf_block_t::lock. That is, all block
descriptors will have a page latch. The IO_PIN state that was used
for discarding or creating the uncompressed page frame of a
ROW_FORMAT=COMPRESSED block is replaced by a combination of read-fix
and page X-latch.
page_zip_des_t::fix: Replaces state_, buf_fix_count_, io_fix_, status
of buf_page_t with a single std::atomic<uint32_t>. All modifications
will use store(), fetch_add(), fetch_sub(). This space was previously
wasted to alignment on 64-bit systems. We will use the following encoding
that combines a state (partly read-fix or write-fix) and a buffer-fix
count:
buf_page_t::NOT_USED=0 (previously BUF_BLOCK_NOT_USED)
buf_page_t::MEMORY=1 (previously BUF_BLOCK_MEMORY)
buf_page_t::REMOVE_HASH=2 (previously BUF_BLOCK_REMOVE_HASH)
buf_page_t::FREED=3 + fix: pages marked as freed in the file
buf_page_t::UNFIXED=1U<<29 + fix: normal pages
buf_page_t::IBUF_EXIST=2U<<29 + fix: normal pages; may need ibuf merge
buf_page_t::REINIT=3U<<29 + fix: reinitialized pages (skip doublewrite)
buf_page_t::READ_FIX=4U<<29 + fix: read-fixed pages (also X-latched)
buf_page_t::WRITE_FIX=5U<<29 + fix: write-fixed pages (also U-latched)
buf_page_t::WRITE_FIX_IBUF=6U<<29 + fix: write-fixed; may have ibuf
buf_page_t::WRITE_FIX_REINIT=7U<<29 + fix: write-fixed (no doublewrite)
buf_page_t::write_complete(): Change WRITE_FIX or WRITE_FIX_REINIT to
UNFIXED, and WRITE_FIX_IBUF to IBUF_EXIST, before releasing the U-latch.
buf_page_t::read_complete(): Renamed from buf_page_read_complete().
Change READ_FIX to UNFIXED or IBUF_EXIST, before releasing the X-latch.
buf_page_t::can_relocate(): If the page latch is being held or waited for,
or the block is buffer-fixed or io-fixed, return false. (The condition
on the page latch is new.)
Outside buf_page_get_gen(), buf_page_get_low() and buf_page_free(), we
will acquire the page latch before fix(), and unfix() before unlocking.
buf_page_t::flush(): Replaces buf_flush_page(). Optimize the
handling of FREED pages.
buf_pool_t::release_freed_page(): Assume that buf_pool.mutex is held
by the caller.
buf_page_t::is_read_fixed(), buf_page_t::is_write_fixed(): New predicates.
buf_page_get_low(): Ignore guesses that are read-fixed because they
may not yet be registered in buf_pool.page_hash and buf_pool.LRU.
buf_page_optimistic_get(): Acquire latch before buffer-fixing.
buf_page_make_young(): Leave read-fixed blocks alone, because they
might not be registered in buf_pool.LRU yet.
recv_sys_t::recover_deferred(), recv_sys_t::recover_low():
Possibly fix MDEV-26326, by holding a page X-latch instead of
only buffer-fixing the page.
2021-11-16 19:55:06 +02:00
|
|
|
mtr.write<4>(*d, DICT_HDR + DICT_HDR_COLUMNS + d->page.frame,
|
2021-05-20 14:58:25 +03:00
|
|
|
root_page_no);
|
2006-02-23 19:25:29 +00:00
|
|
|
/*--------------------------*/
|
MDEV-12266: Change dict_table_t::space to fil_space_t*
InnoDB always keeps all tablespaces in the fil_system cache.
The fil_system.LRU is only for closing file handles; the
fil_space_t and fil_node_t for all data files will remain
in main memory. Between startup to shutdown, they can only be
created and removed by DDL statements. Therefore, we can
let dict_table_t::space point directly to the fil_space_t.
dict_table_t::space_id: A numeric tablespace ID for the corner cases
where we do not have a tablespace. The most prominent examples are
ALTER TABLE...DISCARD TABLESPACE or a missing or corrupted file.
There are a few functional differences; most notably:
(1) DROP TABLE will delete matching .ibd and .cfg files,
even if they were not attached to the data dictionary.
(2) Some error messages will report file names instead of numeric IDs.
There still are many functions that use numeric tablespace IDs instead
of fil_space_t*, and many functions could be converted to fil_space_t
member functions. Also, Tablespace and Datafile should be merged with
fil_space_t and fil_node_t. page_id_t and buf_page_get_gen() could use
fil_space_t& instead of a numeric ID, and after moving to a single
buffer pool (MDEV-15058), buf_pool_t::page_hash could be moved to
fil_space_t::page_hash.
FilSpace: Remove. Only few calls to fil_space_acquire() will remain,
and gradually they should be removed.
mtr_t::set_named_space_id(ulint): Renamed from set_named_space(),
to prevent accidental calls to this slower function. Very few
callers remain.
fseg_create(), fsp_reserve_free_extents(): Take fil_space_t*
as a parameter instead of a space_id.
fil_space_t::rename(): Wrapper for fil_rename_tablespace_check(),
fil_name_write_rename(), fil_rename_tablespace(). Mariabackup
passes the parameter log=false; InnoDB passes log=true.
dict_mem_table_create(): Take fil_space_t* instead of space_id
as parameter.
dict_process_sys_tables_rec_and_mtr_commit(): Replace the parameter
'status' with 'bool cached'.
dict_get_and_save_data_dir_path(): Avoid copying the fil_node_t::name.
fil_ibd_open(): Return the tablespace.
fil_space_t::set_imported(): Replaces fil_space_set_imported().
truncate_t: Change many member function parameters to fil_space_t*,
and remove page_size parameters.
row_truncate_prepare(): Merge to its only caller.
row_drop_table_from_cache(): Assert that the table is persistent.
dict_create_sys_indexes_tuple(): Write SYS_INDEXES.SPACE=FIL_NULL
if the tablespace has been discarded.
row_import_update_discarded_flag(): Remove a constant parameter.
2018-03-27 16:31:10 +03:00
|
|
|
root_page_no = btr_create(DICT_CLUSTERED | DICT_UNIQUE,
|
|
|
|
fil_system.sys_space, DICT_INDEXES_ID,
|
MDEV-13542: Crashing on corrupted page is unhelpful
The approach to handling corruption that was chosen by Oracle in
commit 177d8b0c125b841c0650d27d735e3b87509dc286
is not really useful. Not only did it actually fail to prevent InnoDB
from crashing, but it is making things worse by blocking attempts to
rescue data from or rebuild a partially readable table.
We will try to prevent crashes in a different way: by propagating
errors up the call stack. We will never mark the clustered index
persistently corrupted, so that data recovery may be attempted by
reading from the table, or by rebuilding the table.
This should also fix MDEV-13680 (crash on btr_page_alloc() failure);
it was extensively tested with innodb_file_per_table=0 and a
non-autoextend system tablespace.
We should now avoid crashes in many cases, such as when a page
cannot be read or allocated, or an inconsistency is detected when
attempting to update multiple pages. We will not crash on double-free,
such as on the recovery of DDL in system tablespace in case something
was corrupted.
Crashes on corrupted data are still possible. The fault injection mechanism
that is introduced in the subsequent commit may help catch more of them.
buf_page_import_corrupt_failure: Remove the fault injection, and instead
corrupt some pages using Perl code in the tests.
btr_cur_pessimistic_insert(): Always reserve extents (except for the
change buffer), in order to prevent a subsequent allocation failure.
btr_pcur_open_at_rnd_pos(): Merged to the only caller ibuf_merge_pages().
btr_assert_not_corrupted(), btr_corruption_report(): Remove.
Similar checks are already part of btr_block_get().
FSEG_MAGIC_N_BYTES: Replaces FSEG_MAGIC_N_VALUE.
dict_hdr_get(), trx_rsegf_get_new(), trx_undo_page_get(),
trx_undo_page_get_s_latched(): Replaced with error-checking calls.
trx_rseg_t::get(mtr_t*): Replaces trx_rsegf_get().
trx_rseg_header_create(): Let the caller update the TRX_SYS page if needed.
trx_sys_create_sys_pages(): Merged with trx_sysf_create().
dict_check_tablespaces_and_store_max_id(): Do not access
DICT_HDR_MAX_SPACE_ID, because it was already recovered in dict_boot().
Merge dict_check_sys_tables() with this function.
dir_pathname(): Replaces os_file_make_new_pathname().
row_undo_ins_remove_sec(): Do not modify the undo page by adding
a terminating NUL byte to the record.
btr_decryption_failed(): Report decryption failures
dict_set_corrupted_by_space(), dict_set_encrypted_by_space(),
dict_set_corrupted_index_cache_only(): Remove.
dict_set_corrupted(): Remove the constant parameter dict_locked=false.
Never flag the clustered index corrupted in SYS_INDEXES, because
that would deny further access to the table. It might be possible to
repair the table by executing ALTER TABLE or OPTIMIZE TABLE, in case
no B-tree leaf page is corrupted.
dict_table_skip_corrupt_index(), dict_table_next_uncorrupted_index(),
row_purge_skip_uncommitted_virtual_index(): Remove, and refactor
the callers to read dict_index_t::type only once.
dict_table_is_corrupted(): Remove.
dict_index_t::is_btree(): Determine if the index is a valid B-tree.
BUF_GET_NO_LATCH, BUF_EVICT_IF_IN_POOL: Remove.
UNIV_BTR_DEBUG: Remove. Any inconsistency will no longer trigger
assertion failures, but error codes being returned.
buf_corrupt_page_release(): Replaced with a direct call to
buf_pool.corrupted_evict().
fil_invalid_page_access_msg(): Never crash on an invalid read;
let the caller of buf_page_get_gen() decide.
btr_pcur_t::restore_position(): Propagate failure status to the caller
by returning CORRUPTED.
opt_search_plan_for_table(): Simplify the code.
row_purge_del_mark(), row_purge_upd_exist_or_extern_func(),
row_undo_ins_remove_sec_rec(), row_undo_mod_upd_del_sec(),
row_undo_mod_del_mark_sec(): Avoid mem_heap_create()/mem_heap_free()
when no secondary indexes exist.
row_undo_mod_upd_exist_sec(): Simplify the code.
row_upd_clust_step(), dict_load_table_one(): Return DB_TABLE_CORRUPT
if the clustered index (and therefore the table) is corrupted, similar
to what we do in row_insert_for_mysql().
fut_get_ptr(): Replace with buf_page_get_gen() calls.
buf_page_get_gen(): Return nullptr and *err=DB_CORRUPTION
if the page is marked as freed. For other modes than
BUF_GET_POSSIBLY_FREED or BUF_PEEK_IF_IN_POOL this will
trigger a debug assertion failure. For BUF_GET_POSSIBLY_FREED,
we will return nullptr for freed pages, so that the callers
can be simplified. The purge of transaction history will be
a new user of BUF_GET_POSSIBLY_FREED, to avoid crashes on
corrupted data.
buf_page_get_low(): Never crash on a corrupted page, but simply
return nullptr.
fseg_page_is_allocated(): Replaces fseg_page_is_free().
fts_drop_common_tables(): Return an error if the transaction
was rolled back.
fil_space_t::set_corrupted(): Report a tablespace as corrupted if
it was not reported already.
fil_space_t::io(): Invoke fil_space_t::set_corrupted() to report
out-of-bounds page access or other errors.
Clean up mtr_t::page_lock()
buf_page_get_low(): Validate the page identifier (to check for
recently read corrupted pages) after acquiring the page latch.
buf_page_t::read_complete(): Flag uninitialized (all-zero) pages
with DB_FAIL. Return DB_PAGE_CORRUPTED on page number mismatch.
mtr_t::defer_drop_ahi(): Renamed from mtr_defer_drop_ahi().
recv_sys_t::free_corrupted_page(): Only set_corrupt_fs()
if any log records exist for the page. We do not mind if read-ahead
produces corrupted (or all-zero) pages that were not actually needed
during recovery.
recv_recover_page(): Return whether the operation succeeded.
recv_sys_t::recover_low(): Simplify the logic. Check for recovery error.
Thanks to Matthias Leich for testing this extensively and to the
authors of https://rr-project.org for making it easy to diagnose
and fix any failures that were found during the testing.
2022-06-06 14:03:22 +03:00
|
|
|
nullptr, &mtr, &err);
|
2005-10-27 07:29:40 +00:00
|
|
|
if (root_page_no == FIL_NULL) {
|
MDEV-13542: Crashing on corrupted page is unhelpful
The approach to handling corruption that was chosen by Oracle in
commit 177d8b0c125b841c0650d27d735e3b87509dc286
is not really useful. Not only did it actually fail to prevent InnoDB
from crashing, but it is making things worse by blocking attempts to
rescue data from or rebuild a partially readable table.
We will try to prevent crashes in a different way: by propagating
errors up the call stack. We will never mark the clustered index
persistently corrupted, so that data recovery may be attempted by
reading from the table, or by rebuilding the table.
This should also fix MDEV-13680 (crash on btr_page_alloc() failure);
it was extensively tested with innodb_file_per_table=0 and a
non-autoextend system tablespace.
We should now avoid crashes in many cases, such as when a page
cannot be read or allocated, or an inconsistency is detected when
attempting to update multiple pages. We will not crash on double-free,
such as on the recovery of DDL in system tablespace in case something
was corrupted.
Crashes on corrupted data are still possible. The fault injection mechanism
that is introduced in the subsequent commit may help catch more of them.
buf_page_import_corrupt_failure: Remove the fault injection, and instead
corrupt some pages using Perl code in the tests.
btr_cur_pessimistic_insert(): Always reserve extents (except for the
change buffer), in order to prevent a subsequent allocation failure.
btr_pcur_open_at_rnd_pos(): Merged to the only caller ibuf_merge_pages().
btr_assert_not_corrupted(), btr_corruption_report(): Remove.
Similar checks are already part of btr_block_get().
FSEG_MAGIC_N_BYTES: Replaces FSEG_MAGIC_N_VALUE.
dict_hdr_get(), trx_rsegf_get_new(), trx_undo_page_get(),
trx_undo_page_get_s_latched(): Replaced with error-checking calls.
trx_rseg_t::get(mtr_t*): Replaces trx_rsegf_get().
trx_rseg_header_create(): Let the caller update the TRX_SYS page if needed.
trx_sys_create_sys_pages(): Merged with trx_sysf_create().
dict_check_tablespaces_and_store_max_id(): Do not access
DICT_HDR_MAX_SPACE_ID, because it was already recovered in dict_boot().
Merge dict_check_sys_tables() with this function.
dir_pathname(): Replaces os_file_make_new_pathname().
row_undo_ins_remove_sec(): Do not modify the undo page by adding
a terminating NUL byte to the record.
btr_decryption_failed(): Report decryption failures
dict_set_corrupted_by_space(), dict_set_encrypted_by_space(),
dict_set_corrupted_index_cache_only(): Remove.
dict_set_corrupted(): Remove the constant parameter dict_locked=false.
Never flag the clustered index corrupted in SYS_INDEXES, because
that would deny further access to the table. It might be possible to
repair the table by executing ALTER TABLE or OPTIMIZE TABLE, in case
no B-tree leaf page is corrupted.
dict_table_skip_corrupt_index(), dict_table_next_uncorrupted_index(),
row_purge_skip_uncommitted_virtual_index(): Remove, and refactor
the callers to read dict_index_t::type only once.
dict_table_is_corrupted(): Remove.
dict_index_t::is_btree(): Determine if the index is a valid B-tree.
BUF_GET_NO_LATCH, BUF_EVICT_IF_IN_POOL: Remove.
UNIV_BTR_DEBUG: Remove. Any inconsistency will no longer trigger
assertion failures, but error codes being returned.
buf_corrupt_page_release(): Replaced with a direct call to
buf_pool.corrupted_evict().
fil_invalid_page_access_msg(): Never crash on an invalid read;
let the caller of buf_page_get_gen() decide.
btr_pcur_t::restore_position(): Propagate failure status to the caller
by returning CORRUPTED.
opt_search_plan_for_table(): Simplify the code.
row_purge_del_mark(), row_purge_upd_exist_or_extern_func(),
row_undo_ins_remove_sec_rec(), row_undo_mod_upd_del_sec(),
row_undo_mod_del_mark_sec(): Avoid mem_heap_create()/mem_heap_free()
when no secondary indexes exist.
row_undo_mod_upd_exist_sec(): Simplify the code.
row_upd_clust_step(), dict_load_table_one(): Return DB_TABLE_CORRUPT
if the clustered index (and therefore the table) is corrupted, similar
to what we do in row_insert_for_mysql().
fut_get_ptr(): Replace with buf_page_get_gen() calls.
buf_page_get_gen(): Return nullptr and *err=DB_CORRUPTION
if the page is marked as freed. For other modes than
BUF_GET_POSSIBLY_FREED or BUF_PEEK_IF_IN_POOL this will
trigger a debug assertion failure. For BUF_GET_POSSIBLY_FREED,
we will return nullptr for freed pages, so that the callers
can be simplified. The purge of transaction history will be
a new user of BUF_GET_POSSIBLY_FREED, to avoid crashes on
corrupted data.
buf_page_get_low(): Never crash on a corrupted page, but simply
return nullptr.
fseg_page_is_allocated(): Replaces fseg_page_is_free().
fts_drop_common_tables(): Return an error if the transaction
was rolled back.
fil_space_t::set_corrupted(): Report a tablespace as corrupted if
it was not reported already.
fil_space_t::io(): Invoke fil_space_t::set_corrupted() to report
out-of-bounds page access or other errors.
Clean up mtr_t::page_lock()
buf_page_get_low(): Validate the page identifier (to check for
recently read corrupted pages) after acquiring the page latch.
buf_page_t::read_complete(): Flag uninitialized (all-zero) pages
with DB_FAIL. Return DB_PAGE_CORRUPTED on page number mismatch.
mtr_t::defer_drop_ahi(): Renamed from mtr_defer_drop_ahi().
recv_sys_t::free_corrupted_page(): Only set_corrupt_fs()
if any log records exist for the page. We do not mind if read-ahead
produces corrupted (or all-zero) pages that were not actually needed
during recovery.
recv_recover_page(): Return whether the operation succeeded.
recv_sys_t::recover_low(): Simplify the logic. Check for recovery error.
Thanks to Matthias Leich for testing this extensively and to the
authors of https://rr-project.org for making it easy to diagnose
and fix any failures that were found during the testing.
2022-06-06 14:03:22 +03:00
|
|
|
goto func_exit;
|
2005-10-27 07:29:40 +00:00
|
|
|
}
|
|
|
|
|
MDEV-27058: Reduce the size of buf_block_t and buf_page_t
buf_page_t::frame: Moved from buf_block_t::frame.
All 'thin' buf_page_t describing compressed-only ROW_FORMAT=COMPRESSED
pages will have frame=nullptr, while all 'fat' buf_block_t
will have a non-null frame pointing to aligned innodb_page_size bytes.
This eliminates the need for separate states for
BUF_BLOCK_FILE_PAGE and BUF_BLOCK_ZIP_PAGE.
buf_page_t::lock: Moved from buf_block_t::lock. That is, all block
descriptors will have a page latch. The IO_PIN state that was used
for discarding or creating the uncompressed page frame of a
ROW_FORMAT=COMPRESSED block is replaced by a combination of read-fix
and page X-latch.
page_zip_des_t::fix: Replaces state_, buf_fix_count_, io_fix_, status
of buf_page_t with a single std::atomic<uint32_t>. All modifications
will use store(), fetch_add(), fetch_sub(). This space was previously
wasted to alignment on 64-bit systems. We will use the following encoding
that combines a state (partly read-fix or write-fix) and a buffer-fix
count:
buf_page_t::NOT_USED=0 (previously BUF_BLOCK_NOT_USED)
buf_page_t::MEMORY=1 (previously BUF_BLOCK_MEMORY)
buf_page_t::REMOVE_HASH=2 (previously BUF_BLOCK_REMOVE_HASH)
buf_page_t::FREED=3 + fix: pages marked as freed in the file
buf_page_t::UNFIXED=1U<<29 + fix: normal pages
buf_page_t::IBUF_EXIST=2U<<29 + fix: normal pages; may need ibuf merge
buf_page_t::REINIT=3U<<29 + fix: reinitialized pages (skip doublewrite)
buf_page_t::READ_FIX=4U<<29 + fix: read-fixed pages (also X-latched)
buf_page_t::WRITE_FIX=5U<<29 + fix: write-fixed pages (also U-latched)
buf_page_t::WRITE_FIX_IBUF=6U<<29 + fix: write-fixed; may have ibuf
buf_page_t::WRITE_FIX_REINIT=7U<<29 + fix: write-fixed (no doublewrite)
buf_page_t::write_complete(): Change WRITE_FIX or WRITE_FIX_REINIT to
UNFIXED, and WRITE_FIX_IBUF to IBUF_EXIST, before releasing the U-latch.
buf_page_t::read_complete(): Renamed from buf_page_read_complete().
Change READ_FIX to UNFIXED or IBUF_EXIST, before releasing the X-latch.
buf_page_t::can_relocate(): If the page latch is being held or waited for,
or the block is buffer-fixed or io-fixed, return false. (The condition
on the page latch is new.)
Outside buf_page_get_gen(), buf_page_get_low() and buf_page_free(), we
will acquire the page latch before fix(), and unfix() before unlocking.
buf_page_t::flush(): Replaces buf_flush_page(). Optimize the
handling of FREED pages.
buf_pool_t::release_freed_page(): Assume that buf_pool.mutex is held
by the caller.
buf_page_t::is_read_fixed(), buf_page_t::is_write_fixed(): New predicates.
buf_page_get_low(): Ignore guesses that are read-fixed because they
may not yet be registered in buf_pool.page_hash and buf_pool.LRU.
buf_page_optimistic_get(): Acquire latch before buffer-fixing.
buf_page_make_young(): Leave read-fixed blocks alone, because they
might not be registered in buf_pool.LRU yet.
recv_sys_t::recover_deferred(), recv_sys_t::recover_low():
Possibly fix MDEV-26326, by holding a page X-latch instead of
only buffer-fixing the page.
2021-11-16 19:55:06 +02:00
|
|
|
mtr.write<4>(*d, DICT_HDR + DICT_HDR_INDEXES + d->page.frame,
|
|
|
|
root_page_no);
|
2006-02-23 19:25:29 +00:00
|
|
|
/*--------------------------*/
|
MDEV-12266: Change dict_table_t::space to fil_space_t*
InnoDB always keeps all tablespaces in the fil_system cache.
The fil_system.LRU is only for closing file handles; the
fil_space_t and fil_node_t for all data files will remain
in main memory. Between startup to shutdown, they can only be
created and removed by DDL statements. Therefore, we can
let dict_table_t::space point directly to the fil_space_t.
dict_table_t::space_id: A numeric tablespace ID for the corner cases
where we do not have a tablespace. The most prominent examples are
ALTER TABLE...DISCARD TABLESPACE or a missing or corrupted file.
There are a few functional differences; most notably:
(1) DROP TABLE will delete matching .ibd and .cfg files,
even if they were not attached to the data dictionary.
(2) Some error messages will report file names instead of numeric IDs.
There still are many functions that use numeric tablespace IDs instead
of fil_space_t*, and many functions could be converted to fil_space_t
member functions. Also, Tablespace and Datafile should be merged with
fil_space_t and fil_node_t. page_id_t and buf_page_get_gen() could use
fil_space_t& instead of a numeric ID, and after moving to a single
buffer pool (MDEV-15058), buf_pool_t::page_hash could be moved to
fil_space_t::page_hash.
FilSpace: Remove. Only few calls to fil_space_acquire() will remain,
and gradually they should be removed.
mtr_t::set_named_space_id(ulint): Renamed from set_named_space(),
to prevent accidental calls to this slower function. Very few
callers remain.
fseg_create(), fsp_reserve_free_extents(): Take fil_space_t*
as a parameter instead of a space_id.
fil_space_t::rename(): Wrapper for fil_rename_tablespace_check(),
fil_name_write_rename(), fil_rename_tablespace(). Mariabackup
passes the parameter log=false; InnoDB passes log=true.
dict_mem_table_create(): Take fil_space_t* instead of space_id
as parameter.
dict_process_sys_tables_rec_and_mtr_commit(): Replace the parameter
'status' with 'bool cached'.
dict_get_and_save_data_dir_path(): Avoid copying the fil_node_t::name.
fil_ibd_open(): Return the tablespace.
fil_space_t::set_imported(): Replaces fil_space_set_imported().
truncate_t: Change many member function parameters to fil_space_t*,
and remove page_size parameters.
row_truncate_prepare(): Merge to its only caller.
row_drop_table_from_cache(): Assert that the table is persistent.
dict_create_sys_indexes_tuple(): Write SYS_INDEXES.SPACE=FIL_NULL
if the tablespace has been discarded.
row_import_update_discarded_flag(): Remove a constant parameter.
2018-03-27 16:31:10 +03:00
|
|
|
root_page_no = btr_create(DICT_CLUSTERED | DICT_UNIQUE,
|
|
|
|
fil_system.sys_space, DICT_FIELDS_ID,
|
MDEV-13542: Crashing on corrupted page is unhelpful
The approach to handling corruption that was chosen by Oracle in
commit 177d8b0c125b841c0650d27d735e3b87509dc286
is not really useful. Not only did it actually fail to prevent InnoDB
from crashing, but it is making things worse by blocking attempts to
rescue data from or rebuild a partially readable table.
We will try to prevent crashes in a different way: by propagating
errors up the call stack. We will never mark the clustered index
persistently corrupted, so that data recovery may be attempted by
reading from the table, or by rebuilding the table.
This should also fix MDEV-13680 (crash on btr_page_alloc() failure);
it was extensively tested with innodb_file_per_table=0 and a
non-autoextend system tablespace.
We should now avoid crashes in many cases, such as when a page
cannot be read or allocated, or an inconsistency is detected when
attempting to update multiple pages. We will not crash on double-free,
such as on the recovery of DDL in system tablespace in case something
was corrupted.
Crashes on corrupted data are still possible. The fault injection mechanism
that is introduced in the subsequent commit may help catch more of them.
buf_page_import_corrupt_failure: Remove the fault injection, and instead
corrupt some pages using Perl code in the tests.
btr_cur_pessimistic_insert(): Always reserve extents (except for the
change buffer), in order to prevent a subsequent allocation failure.
btr_pcur_open_at_rnd_pos(): Merged to the only caller ibuf_merge_pages().
btr_assert_not_corrupted(), btr_corruption_report(): Remove.
Similar checks are already part of btr_block_get().
FSEG_MAGIC_N_BYTES: Replaces FSEG_MAGIC_N_VALUE.
dict_hdr_get(), trx_rsegf_get_new(), trx_undo_page_get(),
trx_undo_page_get_s_latched(): Replaced with error-checking calls.
trx_rseg_t::get(mtr_t*): Replaces trx_rsegf_get().
trx_rseg_header_create(): Let the caller update the TRX_SYS page if needed.
trx_sys_create_sys_pages(): Merged with trx_sysf_create().
dict_check_tablespaces_and_store_max_id(): Do not access
DICT_HDR_MAX_SPACE_ID, because it was already recovered in dict_boot().
Merge dict_check_sys_tables() with this function.
dir_pathname(): Replaces os_file_make_new_pathname().
row_undo_ins_remove_sec(): Do not modify the undo page by adding
a terminating NUL byte to the record.
btr_decryption_failed(): Report decryption failures
dict_set_corrupted_by_space(), dict_set_encrypted_by_space(),
dict_set_corrupted_index_cache_only(): Remove.
dict_set_corrupted(): Remove the constant parameter dict_locked=false.
Never flag the clustered index corrupted in SYS_INDEXES, because
that would deny further access to the table. It might be possible to
repair the table by executing ALTER TABLE or OPTIMIZE TABLE, in case
no B-tree leaf page is corrupted.
dict_table_skip_corrupt_index(), dict_table_next_uncorrupted_index(),
row_purge_skip_uncommitted_virtual_index(): Remove, and refactor
the callers to read dict_index_t::type only once.
dict_table_is_corrupted(): Remove.
dict_index_t::is_btree(): Determine if the index is a valid B-tree.
BUF_GET_NO_LATCH, BUF_EVICT_IF_IN_POOL: Remove.
UNIV_BTR_DEBUG: Remove. Any inconsistency will no longer trigger
assertion failures, but error codes being returned.
buf_corrupt_page_release(): Replaced with a direct call to
buf_pool.corrupted_evict().
fil_invalid_page_access_msg(): Never crash on an invalid read;
let the caller of buf_page_get_gen() decide.
btr_pcur_t::restore_position(): Propagate failure status to the caller
by returning CORRUPTED.
opt_search_plan_for_table(): Simplify the code.
row_purge_del_mark(), row_purge_upd_exist_or_extern_func(),
row_undo_ins_remove_sec_rec(), row_undo_mod_upd_del_sec(),
row_undo_mod_del_mark_sec(): Avoid mem_heap_create()/mem_heap_free()
when no secondary indexes exist.
row_undo_mod_upd_exist_sec(): Simplify the code.
row_upd_clust_step(), dict_load_table_one(): Return DB_TABLE_CORRUPT
if the clustered index (and therefore the table) is corrupted, similar
to what we do in row_insert_for_mysql().
fut_get_ptr(): Replace with buf_page_get_gen() calls.
buf_page_get_gen(): Return nullptr and *err=DB_CORRUPTION
if the page is marked as freed. For other modes than
BUF_GET_POSSIBLY_FREED or BUF_PEEK_IF_IN_POOL this will
trigger a debug assertion failure. For BUF_GET_POSSIBLY_FREED,
we will return nullptr for freed pages, so that the callers
can be simplified. The purge of transaction history will be
a new user of BUF_GET_POSSIBLY_FREED, to avoid crashes on
corrupted data.
buf_page_get_low(): Never crash on a corrupted page, but simply
return nullptr.
fseg_page_is_allocated(): Replaces fseg_page_is_free().
fts_drop_common_tables(): Return an error if the transaction
was rolled back.
fil_space_t::set_corrupted(): Report a tablespace as corrupted if
it was not reported already.
fil_space_t::io(): Invoke fil_space_t::set_corrupted() to report
out-of-bounds page access or other errors.
Clean up mtr_t::page_lock()
buf_page_get_low(): Validate the page identifier (to check for
recently read corrupted pages) after acquiring the page latch.
buf_page_t::read_complete(): Flag uninitialized (all-zero) pages
with DB_FAIL. Return DB_PAGE_CORRUPTED on page number mismatch.
mtr_t::defer_drop_ahi(): Renamed from mtr_defer_drop_ahi().
recv_sys_t::free_corrupted_page(): Only set_corrupt_fs()
if any log records exist for the page. We do not mind if read-ahead
produces corrupted (or all-zero) pages that were not actually needed
during recovery.
recv_recover_page(): Return whether the operation succeeded.
recv_sys_t::recover_low(): Simplify the logic. Check for recovery error.
Thanks to Matthias Leich for testing this extensively and to the
authors of https://rr-project.org for making it easy to diagnose
and fix any failures that were found during the testing.
2022-06-06 14:03:22 +03:00
|
|
|
nullptr, &mtr, &err);
|
2005-10-27 07:29:40 +00:00
|
|
|
if (root_page_no == FIL_NULL) {
|
MDEV-13542: Crashing on corrupted page is unhelpful
The approach to handling corruption that was chosen by Oracle in
commit 177d8b0c125b841c0650d27d735e3b87509dc286
is not really useful. Not only did it actually fail to prevent InnoDB
from crashing, but it is making things worse by blocking attempts to
rescue data from or rebuild a partially readable table.
We will try to prevent crashes in a different way: by propagating
errors up the call stack. We will never mark the clustered index
persistently corrupted, so that data recovery may be attempted by
reading from the table, or by rebuilding the table.
This should also fix MDEV-13680 (crash on btr_page_alloc() failure);
it was extensively tested with innodb_file_per_table=0 and a
non-autoextend system tablespace.
We should now avoid crashes in many cases, such as when a page
cannot be read or allocated, or an inconsistency is detected when
attempting to update multiple pages. We will not crash on double-free,
such as on the recovery of DDL in system tablespace in case something
was corrupted.
Crashes on corrupted data are still possible. The fault injection mechanism
that is introduced in the subsequent commit may help catch more of them.
buf_page_import_corrupt_failure: Remove the fault injection, and instead
corrupt some pages using Perl code in the tests.
btr_cur_pessimistic_insert(): Always reserve extents (except for the
change buffer), in order to prevent a subsequent allocation failure.
btr_pcur_open_at_rnd_pos(): Merged to the only caller ibuf_merge_pages().
btr_assert_not_corrupted(), btr_corruption_report(): Remove.
Similar checks are already part of btr_block_get().
FSEG_MAGIC_N_BYTES: Replaces FSEG_MAGIC_N_VALUE.
dict_hdr_get(), trx_rsegf_get_new(), trx_undo_page_get(),
trx_undo_page_get_s_latched(): Replaced with error-checking calls.
trx_rseg_t::get(mtr_t*): Replaces trx_rsegf_get().
trx_rseg_header_create(): Let the caller update the TRX_SYS page if needed.
trx_sys_create_sys_pages(): Merged with trx_sysf_create().
dict_check_tablespaces_and_store_max_id(): Do not access
DICT_HDR_MAX_SPACE_ID, because it was already recovered in dict_boot().
Merge dict_check_sys_tables() with this function.
dir_pathname(): Replaces os_file_make_new_pathname().
row_undo_ins_remove_sec(): Do not modify the undo page by adding
a terminating NUL byte to the record.
btr_decryption_failed(): Report decryption failures
dict_set_corrupted_by_space(), dict_set_encrypted_by_space(),
dict_set_corrupted_index_cache_only(): Remove.
dict_set_corrupted(): Remove the constant parameter dict_locked=false.
Never flag the clustered index corrupted in SYS_INDEXES, because
that would deny further access to the table. It might be possible to
repair the table by executing ALTER TABLE or OPTIMIZE TABLE, in case
no B-tree leaf page is corrupted.
dict_table_skip_corrupt_index(), dict_table_next_uncorrupted_index(),
row_purge_skip_uncommitted_virtual_index(): Remove, and refactor
the callers to read dict_index_t::type only once.
dict_table_is_corrupted(): Remove.
dict_index_t::is_btree(): Determine if the index is a valid B-tree.
BUF_GET_NO_LATCH, BUF_EVICT_IF_IN_POOL: Remove.
UNIV_BTR_DEBUG: Remove. Any inconsistency will no longer trigger
assertion failures, but error codes being returned.
buf_corrupt_page_release(): Replaced with a direct call to
buf_pool.corrupted_evict().
fil_invalid_page_access_msg(): Never crash on an invalid read;
let the caller of buf_page_get_gen() decide.
btr_pcur_t::restore_position(): Propagate failure status to the caller
by returning CORRUPTED.
opt_search_plan_for_table(): Simplify the code.
row_purge_del_mark(), row_purge_upd_exist_or_extern_func(),
row_undo_ins_remove_sec_rec(), row_undo_mod_upd_del_sec(),
row_undo_mod_del_mark_sec(): Avoid mem_heap_create()/mem_heap_free()
when no secondary indexes exist.
row_undo_mod_upd_exist_sec(): Simplify the code.
row_upd_clust_step(), dict_load_table_one(): Return DB_TABLE_CORRUPT
if the clustered index (and therefore the table) is corrupted, similar
to what we do in row_insert_for_mysql().
fut_get_ptr(): Replace with buf_page_get_gen() calls.
buf_page_get_gen(): Return nullptr and *err=DB_CORRUPTION
if the page is marked as freed. For other modes than
BUF_GET_POSSIBLY_FREED or BUF_PEEK_IF_IN_POOL this will
trigger a debug assertion failure. For BUF_GET_POSSIBLY_FREED,
we will return nullptr for freed pages, so that the callers
can be simplified. The purge of transaction history will be
a new user of BUF_GET_POSSIBLY_FREED, to avoid crashes on
corrupted data.
buf_page_get_low(): Never crash on a corrupted page, but simply
return nullptr.
fseg_page_is_allocated(): Replaces fseg_page_is_free().
fts_drop_common_tables(): Return an error if the transaction
was rolled back.
fil_space_t::set_corrupted(): Report a tablespace as corrupted if
it was not reported already.
fil_space_t::io(): Invoke fil_space_t::set_corrupted() to report
out-of-bounds page access or other errors.
Clean up mtr_t::page_lock()
buf_page_get_low(): Validate the page identifier (to check for
recently read corrupted pages) after acquiring the page latch.
buf_page_t::read_complete(): Flag uninitialized (all-zero) pages
with DB_FAIL. Return DB_PAGE_CORRUPTED on page number mismatch.
mtr_t::defer_drop_ahi(): Renamed from mtr_defer_drop_ahi().
recv_sys_t::free_corrupted_page(): Only set_corrupt_fs()
if any log records exist for the page. We do not mind if read-ahead
produces corrupted (or all-zero) pages that were not actually needed
during recovery.
recv_recover_page(): Return whether the operation succeeded.
recv_sys_t::recover_low(): Simplify the logic. Check for recovery error.
Thanks to Matthias Leich for testing this extensively and to the
authors of https://rr-project.org for making it easy to diagnose
and fix any failures that were found during the testing.
2022-06-06 14:03:22 +03:00
|
|
|
goto func_exit;
|
2005-10-27 07:29:40 +00:00
|
|
|
}
|
|
|
|
|
MDEV-27058: Reduce the size of buf_block_t and buf_page_t
buf_page_t::frame: Moved from buf_block_t::frame.
All 'thin' buf_page_t describing compressed-only ROW_FORMAT=COMPRESSED
pages will have frame=nullptr, while all 'fat' buf_block_t
will have a non-null frame pointing to aligned innodb_page_size bytes.
This eliminates the need for separate states for
BUF_BLOCK_FILE_PAGE and BUF_BLOCK_ZIP_PAGE.
buf_page_t::lock: Moved from buf_block_t::lock. That is, all block
descriptors will have a page latch. The IO_PIN state that was used
for discarding or creating the uncompressed page frame of a
ROW_FORMAT=COMPRESSED block is replaced by a combination of read-fix
and page X-latch.
page_zip_des_t::fix: Replaces state_, buf_fix_count_, io_fix_, status
of buf_page_t with a single std::atomic<uint32_t>. All modifications
will use store(), fetch_add(), fetch_sub(). This space was previously
wasted to alignment on 64-bit systems. We will use the following encoding
that combines a state (partly read-fix or write-fix) and a buffer-fix
count:
buf_page_t::NOT_USED=0 (previously BUF_BLOCK_NOT_USED)
buf_page_t::MEMORY=1 (previously BUF_BLOCK_MEMORY)
buf_page_t::REMOVE_HASH=2 (previously BUF_BLOCK_REMOVE_HASH)
buf_page_t::FREED=3 + fix: pages marked as freed in the file
buf_page_t::UNFIXED=1U<<29 + fix: normal pages
buf_page_t::IBUF_EXIST=2U<<29 + fix: normal pages; may need ibuf merge
buf_page_t::REINIT=3U<<29 + fix: reinitialized pages (skip doublewrite)
buf_page_t::READ_FIX=4U<<29 + fix: read-fixed pages (also X-latched)
buf_page_t::WRITE_FIX=5U<<29 + fix: write-fixed pages (also U-latched)
buf_page_t::WRITE_FIX_IBUF=6U<<29 + fix: write-fixed; may have ibuf
buf_page_t::WRITE_FIX_REINIT=7U<<29 + fix: write-fixed (no doublewrite)
buf_page_t::write_complete(): Change WRITE_FIX or WRITE_FIX_REINIT to
UNFIXED, and WRITE_FIX_IBUF to IBUF_EXIST, before releasing the U-latch.
buf_page_t::read_complete(): Renamed from buf_page_read_complete().
Change READ_FIX to UNFIXED or IBUF_EXIST, before releasing the X-latch.
buf_page_t::can_relocate(): If the page latch is being held or waited for,
or the block is buffer-fixed or io-fixed, return false. (The condition
on the page latch is new.)
Outside buf_page_get_gen(), buf_page_get_low() and buf_page_free(), we
will acquire the page latch before fix(), and unfix() before unlocking.
buf_page_t::flush(): Replaces buf_flush_page(). Optimize the
handling of FREED pages.
buf_pool_t::release_freed_page(): Assume that buf_pool.mutex is held
by the caller.
buf_page_t::is_read_fixed(), buf_page_t::is_write_fixed(): New predicates.
buf_page_get_low(): Ignore guesses that are read-fixed because they
may not yet be registered in buf_pool.page_hash and buf_pool.LRU.
buf_page_optimistic_get(): Acquire latch before buffer-fixing.
buf_page_make_young(): Leave read-fixed blocks alone, because they
might not be registered in buf_pool.LRU yet.
recv_sys_t::recover_deferred(), recv_sys_t::recover_low():
Possibly fix MDEV-26326, by holding a page X-latch instead of
only buffer-fixing the page.
2021-11-16 19:55:06 +02:00
|
|
|
mtr.write<4>(*d, DICT_HDR + DICT_HDR_FIELDS + d->page.frame,
|
|
|
|
root_page_no);
|
2021-05-20 14:58:25 +03:00
|
|
|
func_exit:
|
|
|
|
mtr.commit();
|
MDEV-13542: Crashing on corrupted page is unhelpful
The approach to handling corruption that was chosen by Oracle in
commit 177d8b0c125b841c0650d27d735e3b87509dc286
is not really useful. Not only did it actually fail to prevent InnoDB
from crashing, but it is making things worse by blocking attempts to
rescue data from or rebuild a partially readable table.
We will try to prevent crashes in a different way: by propagating
errors up the call stack. We will never mark the clustered index
persistently corrupted, so that data recovery may be attempted by
reading from the table, or by rebuilding the table.
This should also fix MDEV-13680 (crash on btr_page_alloc() failure);
it was extensively tested with innodb_file_per_table=0 and a
non-autoextend system tablespace.
We should now avoid crashes in many cases, such as when a page
cannot be read or allocated, or an inconsistency is detected when
attempting to update multiple pages. We will not crash on double-free,
such as on the recovery of DDL in system tablespace in case something
was corrupted.
Crashes on corrupted data are still possible. The fault injection mechanism
that is introduced in the subsequent commit may help catch more of them.
buf_page_import_corrupt_failure: Remove the fault injection, and instead
corrupt some pages using Perl code in the tests.
btr_cur_pessimistic_insert(): Always reserve extents (except for the
change buffer), in order to prevent a subsequent allocation failure.
btr_pcur_open_at_rnd_pos(): Merged to the only caller ibuf_merge_pages().
btr_assert_not_corrupted(), btr_corruption_report(): Remove.
Similar checks are already part of btr_block_get().
FSEG_MAGIC_N_BYTES: Replaces FSEG_MAGIC_N_VALUE.
dict_hdr_get(), trx_rsegf_get_new(), trx_undo_page_get(),
trx_undo_page_get_s_latched(): Replaced with error-checking calls.
trx_rseg_t::get(mtr_t*): Replaces trx_rsegf_get().
trx_rseg_header_create(): Let the caller update the TRX_SYS page if needed.
trx_sys_create_sys_pages(): Merged with trx_sysf_create().
dict_check_tablespaces_and_store_max_id(): Do not access
DICT_HDR_MAX_SPACE_ID, because it was already recovered in dict_boot().
Merge dict_check_sys_tables() with this function.
dir_pathname(): Replaces os_file_make_new_pathname().
row_undo_ins_remove_sec(): Do not modify the undo page by adding
a terminating NUL byte to the record.
btr_decryption_failed(): Report decryption failures
dict_set_corrupted_by_space(), dict_set_encrypted_by_space(),
dict_set_corrupted_index_cache_only(): Remove.
dict_set_corrupted(): Remove the constant parameter dict_locked=false.
Never flag the clustered index corrupted in SYS_INDEXES, because
that would deny further access to the table. It might be possible to
repair the table by executing ALTER TABLE or OPTIMIZE TABLE, in case
no B-tree leaf page is corrupted.
dict_table_skip_corrupt_index(), dict_table_next_uncorrupted_index(),
row_purge_skip_uncommitted_virtual_index(): Remove, and refactor
the callers to read dict_index_t::type only once.
dict_table_is_corrupted(): Remove.
dict_index_t::is_btree(): Determine if the index is a valid B-tree.
BUF_GET_NO_LATCH, BUF_EVICT_IF_IN_POOL: Remove.
UNIV_BTR_DEBUG: Remove. Any inconsistency will no longer trigger
assertion failures, but error codes being returned.
buf_corrupt_page_release(): Replaced with a direct call to
buf_pool.corrupted_evict().
fil_invalid_page_access_msg(): Never crash on an invalid read;
let the caller of buf_page_get_gen() decide.
btr_pcur_t::restore_position(): Propagate failure status to the caller
by returning CORRUPTED.
opt_search_plan_for_table(): Simplify the code.
row_purge_del_mark(), row_purge_upd_exist_or_extern_func(),
row_undo_ins_remove_sec_rec(), row_undo_mod_upd_del_sec(),
row_undo_mod_del_mark_sec(): Avoid mem_heap_create()/mem_heap_free()
when no secondary indexes exist.
row_undo_mod_upd_exist_sec(): Simplify the code.
row_upd_clust_step(), dict_load_table_one(): Return DB_TABLE_CORRUPT
if the clustered index (and therefore the table) is corrupted, similar
to what we do in row_insert_for_mysql().
fut_get_ptr(): Replace with buf_page_get_gen() calls.
buf_page_get_gen(): Return nullptr and *err=DB_CORRUPTION
if the page is marked as freed. For other modes than
BUF_GET_POSSIBLY_FREED or BUF_PEEK_IF_IN_POOL this will
trigger a debug assertion failure. For BUF_GET_POSSIBLY_FREED,
we will return nullptr for freed pages, so that the callers
can be simplified. The purge of transaction history will be
a new user of BUF_GET_POSSIBLY_FREED, to avoid crashes on
corrupted data.
buf_page_get_low(): Never crash on a corrupted page, but simply
return nullptr.
fseg_page_is_allocated(): Replaces fseg_page_is_free().
fts_drop_common_tables(): Return an error if the transaction
was rolled back.
fil_space_t::set_corrupted(): Report a tablespace as corrupted if
it was not reported already.
fil_space_t::io(): Invoke fil_space_t::set_corrupted() to report
out-of-bounds page access or other errors.
Clean up mtr_t::page_lock()
buf_page_get_low(): Validate the page identifier (to check for
recently read corrupted pages) after acquiring the page latch.
buf_page_t::read_complete(): Flag uninitialized (all-zero) pages
with DB_FAIL. Return DB_PAGE_CORRUPTED on page number mismatch.
mtr_t::defer_drop_ahi(): Renamed from mtr_defer_drop_ahi().
recv_sys_t::free_corrupted_page(): Only set_corrupt_fs()
if any log records exist for the page. We do not mind if read-ahead
produces corrupted (or all-zero) pages that were not actually needed
during recovery.
recv_recover_page(): Return whether the operation succeeded.
recv_sys_t::recover_low(): Simplify the logic. Check for recovery error.
Thanks to Matthias Leich for testing this extensively and to the
authors of https://rr-project.org for making it easy to diagnose
and fix any failures that were found during the testing.
2022-06-06 14:03:22 +03:00
|
|
|
return err ? err : dict_boot();
|
2005-10-27 07:29:40 +00:00
|
|
|
}
|
|
|
|
|
branches/innodb+: Merge revisions 5091:5143 from branches/zip:
------------------------------------------------------------------------
r5092 | marko | 2009-05-25 09:54:17 +0300 (Mon, 25 May 2009) | 1 line
branches/zip: Adjust some function comments after r5091.
------------------------------------------------------------------------
r5100 | marko | 2009-05-25 12:09:45 +0300 (Mon, 25 May 2009) | 1 line
branches/zip: Split some long lines that were introduced in r5091.
------------------------------------------------------------------------
r5101 | marko | 2009-05-25 12:42:47 +0300 (Mon, 25 May 2009) | 2 lines
branches/zip: Introduce the macro TEMP_INDEX_PREFIX_STR.
This is to avoid triggering an error in Doxygen.
------------------------------------------------------------------------
r5102 | marko | 2009-05-25 13:47:14 +0300 (Mon, 25 May 2009) | 1 line
branches/zip: Add missing file comments.
------------------------------------------------------------------------
r5103 | marko | 2009-05-25 13:52:29 +0300 (Mon, 25 May 2009) | 10 lines
branches/zip: Add @file comments, and convert decorative
/*********************************
comments to Doxygen /** style like this:
/*****************************//**
This conversion was performed by the following command:
perl -i -e 'while(<ARGV>){if (m|^/\*{30}\**$|) {
s|\*{4}$|//**| if ++$com>1; $_ .= "\@file $ARGV\n" if $com==2}
print; if(eof){$.=0;undef $com}}' */*[ch] include/univ.i
------------------------------------------------------------------------
r5104 | marko | 2009-05-25 14:39:07 +0300 (Mon, 25 May 2009) | 2 lines
branches/zip: Revert ut0auxconf_* to r5102,
that is, make Doxygen ignore these test programs.
------------------------------------------------------------------------
r5105 | marko | 2009-05-25 14:52:20 +0300 (Mon, 25 May 2009) | 2 lines
branches/zip: Enclose some #error checks inside #ifndef DOXYGEN
to prevent bogus Doxygen errors.
------------------------------------------------------------------------
r5106 | marko | 2009-05-25 16:09:24 +0300 (Mon, 25 May 2009) | 2 lines
branches/zip: Add some Doxygen comments, mainly to structs, typedefs,
macros and global variables. Many more to go.
------------------------------------------------------------------------
r5108 | marko | 2009-05-26 00:32:35 +0300 (Tue, 26 May 2009) | 2 lines
branches/zip: lexyy.c: Remove the inadvertently added @file directive.
There is nothing for Doxygen to see in this file, move along.
------------------------------------------------------------------------
r5125 | marko | 2009-05-26 16:28:49 +0300 (Tue, 26 May 2009) | 3 lines
branches/zip: Add some Doxygen comments for many structs, typedefs,
#defines and global variables. Many are still missing.
------------------------------------------------------------------------
r5134 | marko | 2009-05-27 09:08:43 +0300 (Wed, 27 May 2009) | 1 line
branches/zip: Add some Doxygen @return comments.
------------------------------------------------------------------------
r5139 | marko | 2009-05-27 10:01:40 +0300 (Wed, 27 May 2009) | 1 line
branches/zip: Add Doxyfile.
------------------------------------------------------------------------
r5143 | marko | 2009-05-27 10:57:25 +0300 (Wed, 27 May 2009) | 3 lines
branches/zip: buf0buf.h, Doxyfile: Fix the Doxygen translation.
@defgroup is for source code modules, not for field groups.
Tell Doxygen to expand the UT_LIST declarations.
------------------------------------------------------------------------
2009-05-27 09:52:16 +00:00
|
|
|
/*****************************************************************//**
|
2005-10-27 07:29:40 +00:00
|
|
|
Initializes the data dictionary memory structures when the database is
|
2013-03-26 00:03:13 +02:00
|
|
|
started. This function is also called when the data dictionary is created.
|
|
|
|
@return DB_SUCCESS or error code. */
|
MDEV-13542: Crashing on corrupted page is unhelpful
The approach to handling corruption that was chosen by Oracle in
commit 177d8b0c125b841c0650d27d735e3b87509dc286
is not really useful. Not only did it actually fail to prevent InnoDB
from crashing, but it is making things worse by blocking attempts to
rescue data from or rebuild a partially readable table.
We will try to prevent crashes in a different way: by propagating
errors up the call stack. We will never mark the clustered index
persistently corrupted, so that data recovery may be attempted by
reading from the table, or by rebuilding the table.
This should also fix MDEV-13680 (crash on btr_page_alloc() failure);
it was extensively tested with innodb_file_per_table=0 and a
non-autoextend system tablespace.
We should now avoid crashes in many cases, such as when a page
cannot be read or allocated, or an inconsistency is detected when
attempting to update multiple pages. We will not crash on double-free,
such as on the recovery of DDL in system tablespace in case something
was corrupted.
Crashes on corrupted data are still possible. The fault injection mechanism
that is introduced in the subsequent commit may help catch more of them.
buf_page_import_corrupt_failure: Remove the fault injection, and instead
corrupt some pages using Perl code in the tests.
btr_cur_pessimistic_insert(): Always reserve extents (except for the
change buffer), in order to prevent a subsequent allocation failure.
btr_pcur_open_at_rnd_pos(): Merged to the only caller ibuf_merge_pages().
btr_assert_not_corrupted(), btr_corruption_report(): Remove.
Similar checks are already part of btr_block_get().
FSEG_MAGIC_N_BYTES: Replaces FSEG_MAGIC_N_VALUE.
dict_hdr_get(), trx_rsegf_get_new(), trx_undo_page_get(),
trx_undo_page_get_s_latched(): Replaced with error-checking calls.
trx_rseg_t::get(mtr_t*): Replaces trx_rsegf_get().
trx_rseg_header_create(): Let the caller update the TRX_SYS page if needed.
trx_sys_create_sys_pages(): Merged with trx_sysf_create().
dict_check_tablespaces_and_store_max_id(): Do not access
DICT_HDR_MAX_SPACE_ID, because it was already recovered in dict_boot().
Merge dict_check_sys_tables() with this function.
dir_pathname(): Replaces os_file_make_new_pathname().
row_undo_ins_remove_sec(): Do not modify the undo page by adding
a terminating NUL byte to the record.
btr_decryption_failed(): Report decryption failures
dict_set_corrupted_by_space(), dict_set_encrypted_by_space(),
dict_set_corrupted_index_cache_only(): Remove.
dict_set_corrupted(): Remove the constant parameter dict_locked=false.
Never flag the clustered index corrupted in SYS_INDEXES, because
that would deny further access to the table. It might be possible to
repair the table by executing ALTER TABLE or OPTIMIZE TABLE, in case
no B-tree leaf page is corrupted.
dict_table_skip_corrupt_index(), dict_table_next_uncorrupted_index(),
row_purge_skip_uncommitted_virtual_index(): Remove, and refactor
the callers to read dict_index_t::type only once.
dict_table_is_corrupted(): Remove.
dict_index_t::is_btree(): Determine if the index is a valid B-tree.
BUF_GET_NO_LATCH, BUF_EVICT_IF_IN_POOL: Remove.
UNIV_BTR_DEBUG: Remove. Any inconsistency will no longer trigger
assertion failures, but error codes being returned.
buf_corrupt_page_release(): Replaced with a direct call to
buf_pool.corrupted_evict().
fil_invalid_page_access_msg(): Never crash on an invalid read;
let the caller of buf_page_get_gen() decide.
btr_pcur_t::restore_position(): Propagate failure status to the caller
by returning CORRUPTED.
opt_search_plan_for_table(): Simplify the code.
row_purge_del_mark(), row_purge_upd_exist_or_extern_func(),
row_undo_ins_remove_sec_rec(), row_undo_mod_upd_del_sec(),
row_undo_mod_del_mark_sec(): Avoid mem_heap_create()/mem_heap_free()
when no secondary indexes exist.
row_undo_mod_upd_exist_sec(): Simplify the code.
row_upd_clust_step(), dict_load_table_one(): Return DB_TABLE_CORRUPT
if the clustered index (and therefore the table) is corrupted, similar
to what we do in row_insert_for_mysql().
fut_get_ptr(): Replace with buf_page_get_gen() calls.
buf_page_get_gen(): Return nullptr and *err=DB_CORRUPTION
if the page is marked as freed. For other modes than
BUF_GET_POSSIBLY_FREED or BUF_PEEK_IF_IN_POOL this will
trigger a debug assertion failure. For BUF_GET_POSSIBLY_FREED,
we will return nullptr for freed pages, so that the callers
can be simplified. The purge of transaction history will be
a new user of BUF_GET_POSSIBLY_FREED, to avoid crashes on
corrupted data.
buf_page_get_low(): Never crash on a corrupted page, but simply
return nullptr.
fseg_page_is_allocated(): Replaces fseg_page_is_free().
fts_drop_common_tables(): Return an error if the transaction
was rolled back.
fil_space_t::set_corrupted(): Report a tablespace as corrupted if
it was not reported already.
fil_space_t::io(): Invoke fil_space_t::set_corrupted() to report
out-of-bounds page access or other errors.
Clean up mtr_t::page_lock()
buf_page_get_low(): Validate the page identifier (to check for
recently read corrupted pages) after acquiring the page latch.
buf_page_t::read_complete(): Flag uninitialized (all-zero) pages
with DB_FAIL. Return DB_PAGE_CORRUPTED on page number mismatch.
mtr_t::defer_drop_ahi(): Renamed from mtr_defer_drop_ahi().
recv_sys_t::free_corrupted_page(): Only set_corrupt_fs()
if any log records exist for the page. We do not mind if read-ahead
produces corrupted (or all-zero) pages that were not actually needed
during recovery.
recv_recover_page(): Return whether the operation succeeded.
recv_sys_t::recover_low(): Simplify the logic. Check for recovery error.
Thanks to Matthias Leich for testing this extensively and to the
authors of https://rr-project.org for making it easy to diagnose
and fix any failures that were found during the testing.
2022-06-06 14:03:22 +03:00
|
|
|
dberr_t dict_boot()
|
2005-10-27 07:29:40 +00:00
|
|
|
{
|
|
|
|
dict_table_t* table;
|
|
|
|
dict_index_t* index;
|
2007-01-30 09:24:18 +00:00
|
|
|
mem_heap_t* heap;
|
2005-10-27 07:29:40 +00:00
|
|
|
mtr_t mtr;
|
|
|
|
|
2021-05-20 14:58:25 +03:00
|
|
|
static_assert(DICT_NUM_COLS__SYS_TABLES == 8, "compatibility");
|
|
|
|
static_assert(DICT_NUM_FIELDS__SYS_TABLES == 10, "compatibility");
|
|
|
|
static_assert(DICT_NUM_FIELDS__SYS_TABLE_IDS == 2, "compatibility");
|
|
|
|
static_assert(DICT_NUM_COLS__SYS_COLUMNS == 7, "compatibility");
|
|
|
|
static_assert(DICT_NUM_FIELDS__SYS_COLUMNS == 9, "compatibility");
|
|
|
|
static_assert(DICT_NUM_COLS__SYS_INDEXES == 8, "compatibility");
|
|
|
|
static_assert(DICT_NUM_FIELDS__SYS_INDEXES == 10, "compatibility");
|
|
|
|
static_assert(DICT_NUM_COLS__SYS_FIELDS == 3, "compatibility");
|
|
|
|
static_assert(DICT_NUM_FIELDS__SYS_FIELDS == 5, "compatibility");
|
|
|
|
static_assert(DICT_NUM_COLS__SYS_FOREIGN == 4, "compatibility");
|
|
|
|
static_assert(DICT_NUM_FIELDS__SYS_FOREIGN == 6, "compatibility");
|
|
|
|
static_assert(DICT_NUM_FIELDS__SYS_FOREIGN_FOR_NAME == 2,
|
|
|
|
"compatibility");
|
|
|
|
static_assert(DICT_NUM_COLS__SYS_FOREIGN_COLS == 4, "compatibility");
|
|
|
|
static_assert(DICT_NUM_FIELDS__SYS_FOREIGN_COLS == 6, "compatibility");
|
2012-08-01 17:27:34 +03:00
|
|
|
|
MDEV-13542: Crashing on corrupted page is unhelpful
The approach to handling corruption that was chosen by Oracle in
commit 177d8b0c125b841c0650d27d735e3b87509dc286
is not really useful. Not only did it actually fail to prevent InnoDB
from crashing, but it is making things worse by blocking attempts to
rescue data from or rebuild a partially readable table.
We will try to prevent crashes in a different way: by propagating
errors up the call stack. We will never mark the clustered index
persistently corrupted, so that data recovery may be attempted by
reading from the table, or by rebuilding the table.
This should also fix MDEV-13680 (crash on btr_page_alloc() failure);
it was extensively tested with innodb_file_per_table=0 and a
non-autoextend system tablespace.
We should now avoid crashes in many cases, such as when a page
cannot be read or allocated, or an inconsistency is detected when
attempting to update multiple pages. We will not crash on double-free,
such as on the recovery of DDL in system tablespace in case something
was corrupted.
Crashes on corrupted data are still possible. The fault injection mechanism
that is introduced in the subsequent commit may help catch more of them.
buf_page_import_corrupt_failure: Remove the fault injection, and instead
corrupt some pages using Perl code in the tests.
btr_cur_pessimistic_insert(): Always reserve extents (except for the
change buffer), in order to prevent a subsequent allocation failure.
btr_pcur_open_at_rnd_pos(): Merged to the only caller ibuf_merge_pages().
btr_assert_not_corrupted(), btr_corruption_report(): Remove.
Similar checks are already part of btr_block_get().
FSEG_MAGIC_N_BYTES: Replaces FSEG_MAGIC_N_VALUE.
dict_hdr_get(), trx_rsegf_get_new(), trx_undo_page_get(),
trx_undo_page_get_s_latched(): Replaced with error-checking calls.
trx_rseg_t::get(mtr_t*): Replaces trx_rsegf_get().
trx_rseg_header_create(): Let the caller update the TRX_SYS page if needed.
trx_sys_create_sys_pages(): Merged with trx_sysf_create().
dict_check_tablespaces_and_store_max_id(): Do not access
DICT_HDR_MAX_SPACE_ID, because it was already recovered in dict_boot().
Merge dict_check_sys_tables() with this function.
dir_pathname(): Replaces os_file_make_new_pathname().
row_undo_ins_remove_sec(): Do not modify the undo page by adding
a terminating NUL byte to the record.
btr_decryption_failed(): Report decryption failures
dict_set_corrupted_by_space(), dict_set_encrypted_by_space(),
dict_set_corrupted_index_cache_only(): Remove.
dict_set_corrupted(): Remove the constant parameter dict_locked=false.
Never flag the clustered index corrupted in SYS_INDEXES, because
that would deny further access to the table. It might be possible to
repair the table by executing ALTER TABLE or OPTIMIZE TABLE, in case
no B-tree leaf page is corrupted.
dict_table_skip_corrupt_index(), dict_table_next_uncorrupted_index(),
row_purge_skip_uncommitted_virtual_index(): Remove, and refactor
the callers to read dict_index_t::type only once.
dict_table_is_corrupted(): Remove.
dict_index_t::is_btree(): Determine if the index is a valid B-tree.
BUF_GET_NO_LATCH, BUF_EVICT_IF_IN_POOL: Remove.
UNIV_BTR_DEBUG: Remove. Any inconsistency will no longer trigger
assertion failures, but error codes being returned.
buf_corrupt_page_release(): Replaced with a direct call to
buf_pool.corrupted_evict().
fil_invalid_page_access_msg(): Never crash on an invalid read;
let the caller of buf_page_get_gen() decide.
btr_pcur_t::restore_position(): Propagate failure status to the caller
by returning CORRUPTED.
opt_search_plan_for_table(): Simplify the code.
row_purge_del_mark(), row_purge_upd_exist_or_extern_func(),
row_undo_ins_remove_sec_rec(), row_undo_mod_upd_del_sec(),
row_undo_mod_del_mark_sec(): Avoid mem_heap_create()/mem_heap_free()
when no secondary indexes exist.
row_undo_mod_upd_exist_sec(): Simplify the code.
row_upd_clust_step(), dict_load_table_one(): Return DB_TABLE_CORRUPT
if the clustered index (and therefore the table) is corrupted, similar
to what we do in row_insert_for_mysql().
fut_get_ptr(): Replace with buf_page_get_gen() calls.
buf_page_get_gen(): Return nullptr and *err=DB_CORRUPTION
if the page is marked as freed. For other modes than
BUF_GET_POSSIBLY_FREED or BUF_PEEK_IF_IN_POOL this will
trigger a debug assertion failure. For BUF_GET_POSSIBLY_FREED,
we will return nullptr for freed pages, so that the callers
can be simplified. The purge of transaction history will be
a new user of BUF_GET_POSSIBLY_FREED, to avoid crashes on
corrupted data.
buf_page_get_low(): Never crash on a corrupted page, but simply
return nullptr.
fseg_page_is_allocated(): Replaces fseg_page_is_free().
fts_drop_common_tables(): Return an error if the transaction
was rolled back.
fil_space_t::set_corrupted(): Report a tablespace as corrupted if
it was not reported already.
fil_space_t::io(): Invoke fil_space_t::set_corrupted() to report
out-of-bounds page access or other errors.
Clean up mtr_t::page_lock()
buf_page_get_low(): Validate the page identifier (to check for
recently read corrupted pages) after acquiring the page latch.
buf_page_t::read_complete(): Flag uninitialized (all-zero) pages
with DB_FAIL. Return DB_PAGE_CORRUPTED on page number mismatch.
mtr_t::defer_drop_ahi(): Renamed from mtr_defer_drop_ahi().
recv_sys_t::free_corrupted_page(): Only set_corrupt_fs()
if any log records exist for the page. We do not mind if read-ahead
produces corrupted (or all-zero) pages that were not actually needed
during recovery.
recv_recover_page(): Return whether the operation succeeded.
recv_sys_t::recover_low(): Simplify the logic. Check for recovery error.
Thanks to Matthias Leich for testing this extensively and to the
authors of https://rr-project.org for making it easy to diagnose
and fix any failures that were found during the testing.
2022-06-06 14:03:22 +03:00
|
|
|
mtr.start();
|
2005-10-27 07:29:40 +00:00
|
|
|
/* Create the hash tables etc. */
|
2019-05-17 14:32:53 +03:00
|
|
|
dict_sys.create();
|
2005-10-27 07:29:40 +00:00
|
|
|
|
MDEV-13542: Crashing on corrupted page is unhelpful
The approach to handling corruption that was chosen by Oracle in
commit 177d8b0c125b841c0650d27d735e3b87509dc286
is not really useful. Not only did it actually fail to prevent InnoDB
from crashing, but it is making things worse by blocking attempts to
rescue data from or rebuild a partially readable table.
We will try to prevent crashes in a different way: by propagating
errors up the call stack. We will never mark the clustered index
persistently corrupted, so that data recovery may be attempted by
reading from the table, or by rebuilding the table.
This should also fix MDEV-13680 (crash on btr_page_alloc() failure);
it was extensively tested with innodb_file_per_table=0 and a
non-autoextend system tablespace.
We should now avoid crashes in many cases, such as when a page
cannot be read or allocated, or an inconsistency is detected when
attempting to update multiple pages. We will not crash on double-free,
such as on the recovery of DDL in system tablespace in case something
was corrupted.
Crashes on corrupted data are still possible. The fault injection mechanism
that is introduced in the subsequent commit may help catch more of them.
buf_page_import_corrupt_failure: Remove the fault injection, and instead
corrupt some pages using Perl code in the tests.
btr_cur_pessimistic_insert(): Always reserve extents (except for the
change buffer), in order to prevent a subsequent allocation failure.
btr_pcur_open_at_rnd_pos(): Merged to the only caller ibuf_merge_pages().
btr_assert_not_corrupted(), btr_corruption_report(): Remove.
Similar checks are already part of btr_block_get().
FSEG_MAGIC_N_BYTES: Replaces FSEG_MAGIC_N_VALUE.
dict_hdr_get(), trx_rsegf_get_new(), trx_undo_page_get(),
trx_undo_page_get_s_latched(): Replaced with error-checking calls.
trx_rseg_t::get(mtr_t*): Replaces trx_rsegf_get().
trx_rseg_header_create(): Let the caller update the TRX_SYS page if needed.
trx_sys_create_sys_pages(): Merged with trx_sysf_create().
dict_check_tablespaces_and_store_max_id(): Do not access
DICT_HDR_MAX_SPACE_ID, because it was already recovered in dict_boot().
Merge dict_check_sys_tables() with this function.
dir_pathname(): Replaces os_file_make_new_pathname().
row_undo_ins_remove_sec(): Do not modify the undo page by adding
a terminating NUL byte to the record.
btr_decryption_failed(): Report decryption failures
dict_set_corrupted_by_space(), dict_set_encrypted_by_space(),
dict_set_corrupted_index_cache_only(): Remove.
dict_set_corrupted(): Remove the constant parameter dict_locked=false.
Never flag the clustered index corrupted in SYS_INDEXES, because
that would deny further access to the table. It might be possible to
repair the table by executing ALTER TABLE or OPTIMIZE TABLE, in case
no B-tree leaf page is corrupted.
dict_table_skip_corrupt_index(), dict_table_next_uncorrupted_index(),
row_purge_skip_uncommitted_virtual_index(): Remove, and refactor
the callers to read dict_index_t::type only once.
dict_table_is_corrupted(): Remove.
dict_index_t::is_btree(): Determine if the index is a valid B-tree.
BUF_GET_NO_LATCH, BUF_EVICT_IF_IN_POOL: Remove.
UNIV_BTR_DEBUG: Remove. Any inconsistency will no longer trigger
assertion failures, but error codes being returned.
buf_corrupt_page_release(): Replaced with a direct call to
buf_pool.corrupted_evict().
fil_invalid_page_access_msg(): Never crash on an invalid read;
let the caller of buf_page_get_gen() decide.
btr_pcur_t::restore_position(): Propagate failure status to the caller
by returning CORRUPTED.
opt_search_plan_for_table(): Simplify the code.
row_purge_del_mark(), row_purge_upd_exist_or_extern_func(),
row_undo_ins_remove_sec_rec(), row_undo_mod_upd_del_sec(),
row_undo_mod_del_mark_sec(): Avoid mem_heap_create()/mem_heap_free()
when no secondary indexes exist.
row_undo_mod_upd_exist_sec(): Simplify the code.
row_upd_clust_step(), dict_load_table_one(): Return DB_TABLE_CORRUPT
if the clustered index (and therefore the table) is corrupted, similar
to what we do in row_insert_for_mysql().
fut_get_ptr(): Replace with buf_page_get_gen() calls.
buf_page_get_gen(): Return nullptr and *err=DB_CORRUPTION
if the page is marked as freed. For other modes than
BUF_GET_POSSIBLY_FREED or BUF_PEEK_IF_IN_POOL this will
trigger a debug assertion failure. For BUF_GET_POSSIBLY_FREED,
we will return nullptr for freed pages, so that the callers
can be simplified. The purge of transaction history will be
a new user of BUF_GET_POSSIBLY_FREED, to avoid crashes on
corrupted data.
buf_page_get_low(): Never crash on a corrupted page, but simply
return nullptr.
fseg_page_is_allocated(): Replaces fseg_page_is_free().
fts_drop_common_tables(): Return an error if the transaction
was rolled back.
fil_space_t::set_corrupted(): Report a tablespace as corrupted if
it was not reported already.
fil_space_t::io(): Invoke fil_space_t::set_corrupted() to report
out-of-bounds page access or other errors.
Clean up mtr_t::page_lock()
buf_page_get_low(): Validate the page identifier (to check for
recently read corrupted pages) after acquiring the page latch.
buf_page_t::read_complete(): Flag uninitialized (all-zero) pages
with DB_FAIL. Return DB_PAGE_CORRUPTED on page number mismatch.
mtr_t::defer_drop_ahi(): Renamed from mtr_defer_drop_ahi().
recv_sys_t::free_corrupted_page(): Only set_corrupt_fs()
if any log records exist for the page. We do not mind if read-ahead
produces corrupted (or all-zero) pages that were not actually needed
during recovery.
recv_recover_page(): Return whether the operation succeeded.
recv_sys_t::recover_low(): Simplify the logic. Check for recovery error.
Thanks to Matthias Leich for testing this extensively and to the
authors of https://rr-project.org for making it easy to diagnose
and fix any failures that were found during the testing.
2022-06-06 14:03:22 +03:00
|
|
|
dberr_t err;
|
2023-12-04 09:45:53 +02:00
|
|
|
const buf_block_t *d = recv_sys.recover(hdr_page_id, &mtr ,&err);
|
2024-07-05 12:45:07 +02:00
|
|
|
DBUG_EXECUTE_IF("hdr_page_corrupt",
|
|
|
|
err= DB_CORRUPTION;
|
|
|
|
d= nullptr;);
|
MDEV-29694 Remove the InnoDB change buffer
The purpose of the change buffer was to reduce random disk access,
which could be useful on rotational storage, but maybe less so on
solid-state storage.
When we wished to
(1) insert a record into a non-unique secondary index,
(2) delete-mark a secondary index record,
(3) delete a secondary index record as part of purge (but not ROLLBACK),
and the B-tree leaf page where the record belongs to is not in the buffer
pool, we inserted a record into the change buffer B-tree, indexed by
the page identifier. When the page was eventually read into the buffer
pool, we looked up the change buffer B-tree for any modifications to the
page, applied these upon the completion of the read operation. This
was called the insert buffer merge.
We remove the change buffer, because it has been the source of
various hard-to-reproduce corruption bugs, including those fixed in
commit 5b9ee8d8193a8c7a8ebdd35eedcadc3ae78e7fc1 and
commit 165564d3c33ae3d677d70644a83afcb744bdbf65 but not limited to them.
A downgrade will fail with a clear message starting with
commit db14eb16f9977453467ec4765f481bb2f71814ba (MDEV-30106).
buf_page_t::state: Merge IBUF_EXIST to UNFIXED and
WRITE_FIX_IBUF to WRITE_FIX.
buf_pool_t::watch[]: Remove.
trx_t: Move isolation_level, check_foreigns, check_unique_secondary,
bulk_insert into the same bit-field. The only purpose of
trx_t::check_unique_secondary is to enable bulk insert into an
empty table. It no longer enables insert buffering for UNIQUE INDEX.
btr_cur_t::thr: Remove. This field was originally needed for change
buffering. Later, its use was extended to cover SPATIAL INDEX.
Much of the time, rtr_info::thr holds this field. When it does not,
we will add parameters to SPATIAL INDEX specific functions.
ibuf_upgrade_needed(): Check if the change buffer needs to be updated.
ibuf_upgrade(): Merge and upgrade the change buffer after all redo log
has been applied. Free any pages consumed by the change buffer, and
zero out the change buffer root page to mark the upgrade completed,
and to prevent a downgrade to an earlier version.
dict_load_tablespaces(): Renamed from
dict_check_tablespaces_and_store_max_id(). This needs to be invoked
before ibuf_upgrade().
btr_cur_open_at_rnd_pos(): Specialize for use in persistent statistics.
The change buffer merge does not need this function anymore.
btr_page_alloc(): Renamed from btr_page_alloc_low(). We no longer
allocate any change buffer pages.
btr_cur_open_at_rnd_pos(): Specialize for use in persistent statistics.
The change buffer merge does not need this function anymore.
row_search_index_entry(), btr_lift_page_up(): Add a parameter thr
for the SPATIAL INDEX case.
rtr_page_split_and_insert(): Specialized from btr_page_split_and_insert().
rtr_root_raise_and_insert(): Specialized from btr_root_raise_and_insert().
Note: The support for upgrading from the MySQL 3.23 or MySQL 4.0
change buffer format that predates the MySQL 4.1 introduction of
the option innodb_file_per_table was removed in MySQL 5.6.5
as part of mysql/mysql-server@69b6241a79876ae98bb0c9dce7c8d8799d6ad273
and MariaDB 10.0.11 as part of 1d0f70c2f894b27e98773a282871d32802f67964.
In the tests innodb.log_upgrade and innodb.log_corruption, we create
valid (upgraded) change buffer pages.
Tested by: Matthias Leich
2023-01-11 17:59:36 +02:00
|
|
|
if (!d) {
|
MDEV-13542: Crashing on corrupted page is unhelpful
The approach to handling corruption that was chosen by Oracle in
commit 177d8b0c125b841c0650d27d735e3b87509dc286
is not really useful. Not only did it actually fail to prevent InnoDB
from crashing, but it is making things worse by blocking attempts to
rescue data from or rebuild a partially readable table.
We will try to prevent crashes in a different way: by propagating
errors up the call stack. We will never mark the clustered index
persistently corrupted, so that data recovery may be attempted by
reading from the table, or by rebuilding the table.
This should also fix MDEV-13680 (crash on btr_page_alloc() failure);
it was extensively tested with innodb_file_per_table=0 and a
non-autoextend system tablespace.
We should now avoid crashes in many cases, such as when a page
cannot be read or allocated, or an inconsistency is detected when
attempting to update multiple pages. We will not crash on double-free,
such as on the recovery of DDL in system tablespace in case something
was corrupted.
Crashes on corrupted data are still possible. The fault injection mechanism
that is introduced in the subsequent commit may help catch more of them.
buf_page_import_corrupt_failure: Remove the fault injection, and instead
corrupt some pages using Perl code in the tests.
btr_cur_pessimistic_insert(): Always reserve extents (except for the
change buffer), in order to prevent a subsequent allocation failure.
btr_pcur_open_at_rnd_pos(): Merged to the only caller ibuf_merge_pages().
btr_assert_not_corrupted(), btr_corruption_report(): Remove.
Similar checks are already part of btr_block_get().
FSEG_MAGIC_N_BYTES: Replaces FSEG_MAGIC_N_VALUE.
dict_hdr_get(), trx_rsegf_get_new(), trx_undo_page_get(),
trx_undo_page_get_s_latched(): Replaced with error-checking calls.
trx_rseg_t::get(mtr_t*): Replaces trx_rsegf_get().
trx_rseg_header_create(): Let the caller update the TRX_SYS page if needed.
trx_sys_create_sys_pages(): Merged with trx_sysf_create().
dict_check_tablespaces_and_store_max_id(): Do not access
DICT_HDR_MAX_SPACE_ID, because it was already recovered in dict_boot().
Merge dict_check_sys_tables() with this function.
dir_pathname(): Replaces os_file_make_new_pathname().
row_undo_ins_remove_sec(): Do not modify the undo page by adding
a terminating NUL byte to the record.
btr_decryption_failed(): Report decryption failures
dict_set_corrupted_by_space(), dict_set_encrypted_by_space(),
dict_set_corrupted_index_cache_only(): Remove.
dict_set_corrupted(): Remove the constant parameter dict_locked=false.
Never flag the clustered index corrupted in SYS_INDEXES, because
that would deny further access to the table. It might be possible to
repair the table by executing ALTER TABLE or OPTIMIZE TABLE, in case
no B-tree leaf page is corrupted.
dict_table_skip_corrupt_index(), dict_table_next_uncorrupted_index(),
row_purge_skip_uncommitted_virtual_index(): Remove, and refactor
the callers to read dict_index_t::type only once.
dict_table_is_corrupted(): Remove.
dict_index_t::is_btree(): Determine if the index is a valid B-tree.
BUF_GET_NO_LATCH, BUF_EVICT_IF_IN_POOL: Remove.
UNIV_BTR_DEBUG: Remove. Any inconsistency will no longer trigger
assertion failures, but error codes being returned.
buf_corrupt_page_release(): Replaced with a direct call to
buf_pool.corrupted_evict().
fil_invalid_page_access_msg(): Never crash on an invalid read;
let the caller of buf_page_get_gen() decide.
btr_pcur_t::restore_position(): Propagate failure status to the caller
by returning CORRUPTED.
opt_search_plan_for_table(): Simplify the code.
row_purge_del_mark(), row_purge_upd_exist_or_extern_func(),
row_undo_ins_remove_sec_rec(), row_undo_mod_upd_del_sec(),
row_undo_mod_del_mark_sec(): Avoid mem_heap_create()/mem_heap_free()
when no secondary indexes exist.
row_undo_mod_upd_exist_sec(): Simplify the code.
row_upd_clust_step(), dict_load_table_one(): Return DB_TABLE_CORRUPT
if the clustered index (and therefore the table) is corrupted, similar
to what we do in row_insert_for_mysql().
fut_get_ptr(): Replace with buf_page_get_gen() calls.
buf_page_get_gen(): Return nullptr and *err=DB_CORRUPTION
if the page is marked as freed. For other modes than
BUF_GET_POSSIBLY_FREED or BUF_PEEK_IF_IN_POOL this will
trigger a debug assertion failure. For BUF_GET_POSSIBLY_FREED,
we will return nullptr for freed pages, so that the callers
can be simplified. The purge of transaction history will be
a new user of BUF_GET_POSSIBLY_FREED, to avoid crashes on
corrupted data.
buf_page_get_low(): Never crash on a corrupted page, but simply
return nullptr.
fseg_page_is_allocated(): Replaces fseg_page_is_free().
fts_drop_common_tables(): Return an error if the transaction
was rolled back.
fil_space_t::set_corrupted(): Report a tablespace as corrupted if
it was not reported already.
fil_space_t::io(): Invoke fil_space_t::set_corrupted() to report
out-of-bounds page access or other errors.
Clean up mtr_t::page_lock()
buf_page_get_low(): Validate the page identifier (to check for
recently read corrupted pages) after acquiring the page latch.
buf_page_t::read_complete(): Flag uninitialized (all-zero) pages
with DB_FAIL. Return DB_PAGE_CORRUPTED on page number mismatch.
mtr_t::defer_drop_ahi(): Renamed from mtr_defer_drop_ahi().
recv_sys_t::free_corrupted_page(): Only set_corrupt_fs()
if any log records exist for the page. We do not mind if read-ahead
produces corrupted (or all-zero) pages that were not actually needed
during recovery.
recv_recover_page(): Return whether the operation succeeded.
recv_sys_t::recover_low(): Simplify the logic. Check for recovery error.
Thanks to Matthias Leich for testing this extensively and to the
authors of https://rr-project.org for making it easy to diagnose
and fix any failures that were found during the testing.
2022-06-06 14:03:22 +03:00
|
|
|
mtr.commit();
|
|
|
|
return err;
|
MDEV-29694 Remove the InnoDB change buffer
The purpose of the change buffer was to reduce random disk access,
which could be useful on rotational storage, but maybe less so on
solid-state storage.
When we wished to
(1) insert a record into a non-unique secondary index,
(2) delete-mark a secondary index record,
(3) delete a secondary index record as part of purge (but not ROLLBACK),
and the B-tree leaf page where the record belongs to is not in the buffer
pool, we inserted a record into the change buffer B-tree, indexed by
the page identifier. When the page was eventually read into the buffer
pool, we looked up the change buffer B-tree for any modifications to the
page, applied these upon the completion of the read operation. This
was called the insert buffer merge.
We remove the change buffer, because it has been the source of
various hard-to-reproduce corruption bugs, including those fixed in
commit 5b9ee8d8193a8c7a8ebdd35eedcadc3ae78e7fc1 and
commit 165564d3c33ae3d677d70644a83afcb744bdbf65 but not limited to them.
A downgrade will fail with a clear message starting with
commit db14eb16f9977453467ec4765f481bb2f71814ba (MDEV-30106).
buf_page_t::state: Merge IBUF_EXIST to UNFIXED and
WRITE_FIX_IBUF to WRITE_FIX.
buf_pool_t::watch[]: Remove.
trx_t: Move isolation_level, check_foreigns, check_unique_secondary,
bulk_insert into the same bit-field. The only purpose of
trx_t::check_unique_secondary is to enable bulk insert into an
empty table. It no longer enables insert buffering for UNIQUE INDEX.
btr_cur_t::thr: Remove. This field was originally needed for change
buffering. Later, its use was extended to cover SPATIAL INDEX.
Much of the time, rtr_info::thr holds this field. When it does not,
we will add parameters to SPATIAL INDEX specific functions.
ibuf_upgrade_needed(): Check if the change buffer needs to be updated.
ibuf_upgrade(): Merge and upgrade the change buffer after all redo log
has been applied. Free any pages consumed by the change buffer, and
zero out the change buffer root page to mark the upgrade completed,
and to prevent a downgrade to an earlier version.
dict_load_tablespaces(): Renamed from
dict_check_tablespaces_and_store_max_id(). This needs to be invoked
before ibuf_upgrade().
btr_cur_open_at_rnd_pos(): Specialize for use in persistent statistics.
The change buffer merge does not need this function anymore.
btr_page_alloc(): Renamed from btr_page_alloc_low(). We no longer
allocate any change buffer pages.
btr_cur_open_at_rnd_pos(): Specialize for use in persistent statistics.
The change buffer merge does not need this function anymore.
row_search_index_entry(), btr_lift_page_up(): Add a parameter thr
for the SPATIAL INDEX case.
rtr_page_split_and_insert(): Specialized from btr_page_split_and_insert().
rtr_root_raise_and_insert(): Specialized from btr_root_raise_and_insert().
Note: The support for upgrading from the MySQL 3.23 or MySQL 4.0
change buffer format that predates the MySQL 4.1 introduction of
the option innodb_file_per_table was removed in MySQL 5.6.5
as part of mysql/mysql-server@69b6241a79876ae98bb0c9dce7c8d8799d6ad273
and MariaDB 10.0.11 as part of 1d0f70c2f894b27e98773a282871d32802f67964.
In the tests innodb.log_upgrade and innodb.log_corruption, we create
valid (upgraded) change buffer pages.
Tested by: Matthias Leich
2023-01-11 17:59:36 +02:00
|
|
|
}
|
MDEV-13542: Crashing on corrupted page is unhelpful
The approach to handling corruption that was chosen by Oracle in
commit 177d8b0c125b841c0650d27d735e3b87509dc286
is not really useful. Not only did it actually fail to prevent InnoDB
from crashing, but it is making things worse by blocking attempts to
rescue data from or rebuild a partially readable table.
We will try to prevent crashes in a different way: by propagating
errors up the call stack. We will never mark the clustered index
persistently corrupted, so that data recovery may be attempted by
reading from the table, or by rebuilding the table.
This should also fix MDEV-13680 (crash on btr_page_alloc() failure);
it was extensively tested with innodb_file_per_table=0 and a
non-autoextend system tablespace.
We should now avoid crashes in many cases, such as when a page
cannot be read or allocated, or an inconsistency is detected when
attempting to update multiple pages. We will not crash on double-free,
such as on the recovery of DDL in system tablespace in case something
was corrupted.
Crashes on corrupted data are still possible. The fault injection mechanism
that is introduced in the subsequent commit may help catch more of them.
buf_page_import_corrupt_failure: Remove the fault injection, and instead
corrupt some pages using Perl code in the tests.
btr_cur_pessimistic_insert(): Always reserve extents (except for the
change buffer), in order to prevent a subsequent allocation failure.
btr_pcur_open_at_rnd_pos(): Merged to the only caller ibuf_merge_pages().
btr_assert_not_corrupted(), btr_corruption_report(): Remove.
Similar checks are already part of btr_block_get().
FSEG_MAGIC_N_BYTES: Replaces FSEG_MAGIC_N_VALUE.
dict_hdr_get(), trx_rsegf_get_new(), trx_undo_page_get(),
trx_undo_page_get_s_latched(): Replaced with error-checking calls.
trx_rseg_t::get(mtr_t*): Replaces trx_rsegf_get().
trx_rseg_header_create(): Let the caller update the TRX_SYS page if needed.
trx_sys_create_sys_pages(): Merged with trx_sysf_create().
dict_check_tablespaces_and_store_max_id(): Do not access
DICT_HDR_MAX_SPACE_ID, because it was already recovered in dict_boot().
Merge dict_check_sys_tables() with this function.
dir_pathname(): Replaces os_file_make_new_pathname().
row_undo_ins_remove_sec(): Do not modify the undo page by adding
a terminating NUL byte to the record.
btr_decryption_failed(): Report decryption failures
dict_set_corrupted_by_space(), dict_set_encrypted_by_space(),
dict_set_corrupted_index_cache_only(): Remove.
dict_set_corrupted(): Remove the constant parameter dict_locked=false.
Never flag the clustered index corrupted in SYS_INDEXES, because
that would deny further access to the table. It might be possible to
repair the table by executing ALTER TABLE or OPTIMIZE TABLE, in case
no B-tree leaf page is corrupted.
dict_table_skip_corrupt_index(), dict_table_next_uncorrupted_index(),
row_purge_skip_uncommitted_virtual_index(): Remove, and refactor
the callers to read dict_index_t::type only once.
dict_table_is_corrupted(): Remove.
dict_index_t::is_btree(): Determine if the index is a valid B-tree.
BUF_GET_NO_LATCH, BUF_EVICT_IF_IN_POOL: Remove.
UNIV_BTR_DEBUG: Remove. Any inconsistency will no longer trigger
assertion failures, but error codes being returned.
buf_corrupt_page_release(): Replaced with a direct call to
buf_pool.corrupted_evict().
fil_invalid_page_access_msg(): Never crash on an invalid read;
let the caller of buf_page_get_gen() decide.
btr_pcur_t::restore_position(): Propagate failure status to the caller
by returning CORRUPTED.
opt_search_plan_for_table(): Simplify the code.
row_purge_del_mark(), row_purge_upd_exist_or_extern_func(),
row_undo_ins_remove_sec_rec(), row_undo_mod_upd_del_sec(),
row_undo_mod_del_mark_sec(): Avoid mem_heap_create()/mem_heap_free()
when no secondary indexes exist.
row_undo_mod_upd_exist_sec(): Simplify the code.
row_upd_clust_step(), dict_load_table_one(): Return DB_TABLE_CORRUPT
if the clustered index (and therefore the table) is corrupted, similar
to what we do in row_insert_for_mysql().
fut_get_ptr(): Replace with buf_page_get_gen() calls.
buf_page_get_gen(): Return nullptr and *err=DB_CORRUPTION
if the page is marked as freed. For other modes than
BUF_GET_POSSIBLY_FREED or BUF_PEEK_IF_IN_POOL this will
trigger a debug assertion failure. For BUF_GET_POSSIBLY_FREED,
we will return nullptr for freed pages, so that the callers
can be simplified. The purge of transaction history will be
a new user of BUF_GET_POSSIBLY_FREED, to avoid crashes on
corrupted data.
buf_page_get_low(): Never crash on a corrupted page, but simply
return nullptr.
fseg_page_is_allocated(): Replaces fseg_page_is_free().
fts_drop_common_tables(): Return an error if the transaction
was rolled back.
fil_space_t::set_corrupted(): Report a tablespace as corrupted if
it was not reported already.
fil_space_t::io(): Invoke fil_space_t::set_corrupted() to report
out-of-bounds page access or other errors.
Clean up mtr_t::page_lock()
buf_page_get_low(): Validate the page identifier (to check for
recently read corrupted pages) after acquiring the page latch.
buf_page_t::read_complete(): Flag uninitialized (all-zero) pages
with DB_FAIL. Return DB_PAGE_CORRUPTED on page number mismatch.
mtr_t::defer_drop_ahi(): Renamed from mtr_defer_drop_ahi().
recv_sys_t::free_corrupted_page(): Only set_corrupt_fs()
if any log records exist for the page. We do not mind if read-ahead
produces corrupted (or all-zero) pages that were not actually needed
during recovery.
recv_recover_page(): Return whether the operation succeeded.
recv_sys_t::recover_low(): Simplify the logic. Check for recovery error.
Thanks to Matthias Leich for testing this extensively and to the
authors of https://rr-project.org for making it easy to diagnose
and fix any failures that were found during the testing.
2022-06-06 14:03:22 +03:00
|
|
|
|
2007-01-30 09:24:18 +00:00
|
|
|
heap = mem_heap_create(450);
|
|
|
|
|
MDEV-24258 Merge dict_sys.mutex into dict_sys.latch
In the parent commit, dict_sys.latch could theoretically have been
replaced with a mutex. But, we can do better and merge dict_sys.mutex
into dict_sys.latch. Generally, every occurrence of dict_sys.mutex_lock()
will be replaced with dict_sys.lock().
The PERFORMANCE_SCHEMA instrumentation for dict_sys_mutex
will be removed along with dict_sys.mutex. The dict_sys.latch
will remain instrumented as dict_operation_lock.
Some use of dict_sys.lock() will be replaced with dict_sys.freeze(),
which we will reintroduce for the new shared mode. Most notably,
concurrent table lookups are possible as long as the tables are present
in the dict_sys cache. In particular, this will allow more concurrency
among InnoDB purge workers.
Because dict_sys.mutex will no longer 'throttle' the threads that purge
InnoDB transaction history, a performance degradation may be observed
unless innodb_purge_threads=1.
The table cache eviction policy will become FIFO-like,
similar to what happened to fil_system.LRU
in commit 45ed9dd957eebc7fc84feb2509f4aa6baa908a95.
The name of the list dict_sys.table_LRU will become somewhat misleading;
that list contains tables that may be evicted, even though the
eviction policy no longer is least-recently-used but first-in-first-out.
(Note: Tables can never be evicted as long as locks exist on them or
the tables are in use by some thread.)
As demonstrated by the test perfschema.sxlock_func, there
will be less contention on dict_sys.latch, because some previous
use of exclusive latches will be replaced with shared latches.
fts_parse_sql_no_dict_lock(): Replaced with pars_sql().
fts_get_table_name_prefix(): Merged to fts_optimize_create().
dict_stats_update_transient_for_index(): Deduplicated some code.
ha_innobase::info_low(), dict_stats_stop_bg(): Use a combination
of dict_sys.latch and table->stats_mutex_lock() to cover the
changes of BG_STAT_SHOULD_QUIT, because the flag is being read
in dict_stats_update_persistent() while not holding dict_sys.latch.
row_discard_tablespace_for_mysql(): Protect stats_bg_flag by
exclusive dict_sys.latch, like most other code does.
row_quiesce_table_has_fts_index(): Remove unnecessary mutex
acquisition. FLUSH TABLES...FOR EXPORT is protected by MDL.
row_import::set_root_by_heuristic(): Remove unnecessary mutex
acquisition. ALTER TABLE...IMPORT TABLESPACE is protected by MDL.
row_ins_sec_index_entry_low(): Replace a call
to dict_set_corrupted_index_cache_only(). Reads of index->type
were not really protected by dict_sys.mutex, and writes
(flagging an index corrupted) should be extremely rare.
dict_stats_process_entry_from_defrag_pool(): Only freeze the dictionary,
do not lock it exclusively.
dict_stats_wait_bg_to_stop_using_table(), DICT_BG_YIELD: Remove trx.
We can simply invoke dict_sys.unlock() and dict_sys.lock() directly.
dict_acquire_mdl_shared()<trylock=false>: Assert that dict_sys.latch is
only held in shared more, not exclusive mode. Only acquire it in
exclusive mode if the table needs to be loaded to the cache.
dict_sys_t::acquire(): Remove. Relocating elements in dict_sys.table_LRU
would require holding an exclusive latch, which we want to avoid
for performance reasons.
dict_sys_t::allow_eviction(): Add the table first to dict_sys.table_LRU,
to compensate for the removal of dict_sys_t::acquire(). This function
is only invoked by INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS.
dict_table_open_on_id(), dict_table_open_on_name(): If dict_locked=false,
try to acquire dict_sys.latch in shared mode. Only acquire the latch in
exclusive mode if the table is not found in the cache.
Reviewed by: Thirunarayanan Balathandayuthapani
2021-08-31 13:51:35 +03:00
|
|
|
dict_sys.lock(SRW_LOCK_CALL);
|
2006-02-23 19:25:29 +00:00
|
|
|
|
MDEV-13542: Crashing on corrupted page is unhelpful
The approach to handling corruption that was chosen by Oracle in
commit 177d8b0c125b841c0650d27d735e3b87509dc286
is not really useful. Not only did it actually fail to prevent InnoDB
from crashing, but it is making things worse by blocking attempts to
rescue data from or rebuild a partially readable table.
We will try to prevent crashes in a different way: by propagating
errors up the call stack. We will never mark the clustered index
persistently corrupted, so that data recovery may be attempted by
reading from the table, or by rebuilding the table.
This should also fix MDEV-13680 (crash on btr_page_alloc() failure);
it was extensively tested with innodb_file_per_table=0 and a
non-autoextend system tablespace.
We should now avoid crashes in many cases, such as when a page
cannot be read or allocated, or an inconsistency is detected when
attempting to update multiple pages. We will not crash on double-free,
such as on the recovery of DDL in system tablespace in case something
was corrupted.
Crashes on corrupted data are still possible. The fault injection mechanism
that is introduced in the subsequent commit may help catch more of them.
buf_page_import_corrupt_failure: Remove the fault injection, and instead
corrupt some pages using Perl code in the tests.
btr_cur_pessimistic_insert(): Always reserve extents (except for the
change buffer), in order to prevent a subsequent allocation failure.
btr_pcur_open_at_rnd_pos(): Merged to the only caller ibuf_merge_pages().
btr_assert_not_corrupted(), btr_corruption_report(): Remove.
Similar checks are already part of btr_block_get().
FSEG_MAGIC_N_BYTES: Replaces FSEG_MAGIC_N_VALUE.
dict_hdr_get(), trx_rsegf_get_new(), trx_undo_page_get(),
trx_undo_page_get_s_latched(): Replaced with error-checking calls.
trx_rseg_t::get(mtr_t*): Replaces trx_rsegf_get().
trx_rseg_header_create(): Let the caller update the TRX_SYS page if needed.
trx_sys_create_sys_pages(): Merged with trx_sysf_create().
dict_check_tablespaces_and_store_max_id(): Do not access
DICT_HDR_MAX_SPACE_ID, because it was already recovered in dict_boot().
Merge dict_check_sys_tables() with this function.
dir_pathname(): Replaces os_file_make_new_pathname().
row_undo_ins_remove_sec(): Do not modify the undo page by adding
a terminating NUL byte to the record.
btr_decryption_failed(): Report decryption failures
dict_set_corrupted_by_space(), dict_set_encrypted_by_space(),
dict_set_corrupted_index_cache_only(): Remove.
dict_set_corrupted(): Remove the constant parameter dict_locked=false.
Never flag the clustered index corrupted in SYS_INDEXES, because
that would deny further access to the table. It might be possible to
repair the table by executing ALTER TABLE or OPTIMIZE TABLE, in case
no B-tree leaf page is corrupted.
dict_table_skip_corrupt_index(), dict_table_next_uncorrupted_index(),
row_purge_skip_uncommitted_virtual_index(): Remove, and refactor
the callers to read dict_index_t::type only once.
dict_table_is_corrupted(): Remove.
dict_index_t::is_btree(): Determine if the index is a valid B-tree.
BUF_GET_NO_LATCH, BUF_EVICT_IF_IN_POOL: Remove.
UNIV_BTR_DEBUG: Remove. Any inconsistency will no longer trigger
assertion failures, but error codes being returned.
buf_corrupt_page_release(): Replaced with a direct call to
buf_pool.corrupted_evict().
fil_invalid_page_access_msg(): Never crash on an invalid read;
let the caller of buf_page_get_gen() decide.
btr_pcur_t::restore_position(): Propagate failure status to the caller
by returning CORRUPTED.
opt_search_plan_for_table(): Simplify the code.
row_purge_del_mark(), row_purge_upd_exist_or_extern_func(),
row_undo_ins_remove_sec_rec(), row_undo_mod_upd_del_sec(),
row_undo_mod_del_mark_sec(): Avoid mem_heap_create()/mem_heap_free()
when no secondary indexes exist.
row_undo_mod_upd_exist_sec(): Simplify the code.
row_upd_clust_step(), dict_load_table_one(): Return DB_TABLE_CORRUPT
if the clustered index (and therefore the table) is corrupted, similar
to what we do in row_insert_for_mysql().
fut_get_ptr(): Replace with buf_page_get_gen() calls.
buf_page_get_gen(): Return nullptr and *err=DB_CORRUPTION
if the page is marked as freed. For other modes than
BUF_GET_POSSIBLY_FREED or BUF_PEEK_IF_IN_POOL this will
trigger a debug assertion failure. For BUF_GET_POSSIBLY_FREED,
we will return nullptr for freed pages, so that the callers
can be simplified. The purge of transaction history will be
a new user of BUF_GET_POSSIBLY_FREED, to avoid crashes on
corrupted data.
buf_page_get_low(): Never crash on a corrupted page, but simply
return nullptr.
fseg_page_is_allocated(): Replaces fseg_page_is_free().
fts_drop_common_tables(): Return an error if the transaction
was rolled back.
fil_space_t::set_corrupted(): Report a tablespace as corrupted if
it was not reported already.
fil_space_t::io(): Invoke fil_space_t::set_corrupted() to report
out-of-bounds page access or other errors.
Clean up mtr_t::page_lock()
buf_page_get_low(): Validate the page identifier (to check for
recently read corrupted pages) after acquiring the page latch.
buf_page_t::read_complete(): Flag uninitialized (all-zero) pages
with DB_FAIL. Return DB_PAGE_CORRUPTED on page number mismatch.
mtr_t::defer_drop_ahi(): Renamed from mtr_defer_drop_ahi().
recv_sys_t::free_corrupted_page(): Only set_corrupt_fs()
if any log records exist for the page. We do not mind if read-ahead
produces corrupted (or all-zero) pages that were not actually needed
during recovery.
recv_recover_page(): Return whether the operation succeeded.
recv_sys_t::recover_low(): Simplify the logic. Check for recovery error.
Thanks to Matthias Leich for testing this extensively and to the
authors of https://rr-project.org for making it easy to diagnose
and fix any failures that were found during the testing.
2022-06-06 14:03:22 +03:00
|
|
|
const byte* dict_hdr = &d->page.frame[DICT_HDR];
|
2005-10-27 07:29:40 +00:00
|
|
|
|
2021-07-22 11:22:47 +03:00
|
|
|
if (uint32_t max_space_id
|
|
|
|
= mach_read_from_4(dict_hdr + DICT_HDR_MAX_SPACE_ID)) {
|
MDEV-23855: Improve InnoDB log checkpoint performance
After MDEV-15053, MDEV-22871, MDEV-23399 shifted the scalability
bottleneck, log checkpoints became a new bottleneck.
If innodb_io_capacity is set low or innodb_max_dirty_pct_lwm is
set high and the workload fits in the buffer pool, the page cleaner
thread will perform very little flushing. When we reach the capacity
of the circular redo log file ib_logfile0 and must initiate a checkpoint,
some 'furious flushing' will be necessary. (If innodb_flush_sync=OFF,
then flushing would continue at the innodb_io_capacity rate, and
writers would be throttled.)
We have the best chance of advancing the checkpoint LSN immediately
after a page flush batch has been completed. Hence, it is best to
perform checkpoints after every batch in the page cleaner thread,
attempting to run once per second.
By initiating high-priority flushing in the page cleaner as early
as possible, we aim to make the throughput more stable.
The function buf_flush_wait_flushed() used to sleep for 10ms, hoping
that the page cleaner thread would do something during that time.
The observed end result was that a large number of threads that call
log_free_check() would end up sleeping while nothing useful is happening.
We will revise the design so that in the default innodb_flush_sync=ON
mode, buf_flush_wait_flushed() will wake up the page cleaner thread
to perform the necessary flushing, and it will wait for a signal from
the page cleaner thread.
If innodb_io_capacity is set to a low value (causing the page cleaner to
throttle its work), a write workload would initially perform well, until
the capacity of the circular ib_logfile0 is reached and log_free_check()
will trigger checkpoints. At that point, the extra waiting in
buf_flush_wait_flushed() will start reducing throughput.
The page cleaner thread will also initiate log checkpoints after each
buf_flush_lists() call, because that is the best point of time for
the checkpoint LSN to advance by the maximum amount.
Even in 'furious flushing' mode we invoke buf_flush_lists() with
innodb_io_capacity_max pages at a time, and at the start of each
batch (in the log_flush() callback function that runs in a separate
task) we will invoke os_aio_wait_until_no_pending_writes(). This
tweak allows the checkpoint to advance in smaller steps and
significantly reduces the maximum latency. On an Intel Optane 960
NVMe SSD on Linux, it reduced from 4.6 seconds to 74 milliseconds.
On Microsoft Windows with a slower SSD, it reduced from more than
180 seconds to 0.6 seconds.
We will make innodb_adaptive_flushing=OFF simply flush innodb_io_capacity
per second whenever the dirty proportion of buffer pool pages exceeds
innodb_max_dirty_pages_pct_lwm. For innodb_adaptive_flushing=ON we try
to make page_cleaner_flush_pages_recommendation() more consistent and
predictable: if we are below innodb_adaptive_flushing_lwm, let us flush
pages according to the return value of af_get_pct_for_dirty().
innodb_max_dirty_pages_pct_lwm: Revert the change of the default value
that was made in MDEV-23399. The value innodb_max_dirty_pages_pct_lwm=0
guarantees that a shutdown of an idle server will be fast. Users might
be surprised if normal shutdown suddenly became slower when upgrading
within a GA release series.
innodb_checkpoint_usec: Remove. The master task will no longer perform
periodic log checkpoints. It is the duty of the page cleaner thread.
log_sys.max_modified_age: Remove. The current span of the
buf_pool.flush_list expressed in LSN only matters for adaptive
flushing (outside the 'furious flushing' condition).
For the correctness of checkpoints, the only thing that matters is
the checkpoint age (log_sys.lsn - log_sys.last_checkpoint_lsn).
This run-time constant was also reported as log_max_modified_age_sync.
log_sys.max_checkpoint_age_async: Remove. This does not serve any
purpose, because the checkpoints will now be triggered by the page
cleaner thread. We will retain the log_sys.max_checkpoint_age limit
for engaging 'furious flushing'.
page_cleaner.slot: Remove. It turns out that
page_cleaner_slot.flush_list_time was duplicating
page_cleaner.slot.flush_time and page_cleaner.slot.flush_list_pass
was duplicating page_cleaner.flush_pass.
Likewise, there were some redundant monitor counters, because the
page cleaner thread no longer performs any buf_pool.LRU flushing, and
because there only is one buf_flush_page_cleaner thread.
buf_flush_sync_lsn: Protect writes by buf_pool.flush_list_mutex.
buf_pool_t::get_oldest_modification(): Add a parameter to specify the
return value when no persistent data pages are dirty. Require the
caller to hold buf_pool.flush_list_mutex.
log_buf_pool_get_oldest_modification(): Take the fall-back LSN
as a parameter. All callers will also invoke log_sys.get_lsn().
log_preflush_pool_modified_pages(): Replaced with buf_flush_wait_flushed().
buf_flush_wait_flushed(): Implement two limits. If not enough buffer pool
has been flushed, signal the page cleaner (unless innodb_flush_sync=OFF)
and wait for the page cleaner to complete. If the page cleaner
thread is not running (which can be the case durign shutdown),
initiate the flush and wait for it directly.
buf_flush_ahead(): If innodb_flush_sync=ON (the default),
submit a new buf_flush_sync_lsn target for the page cleaner
but do not wait for the flushing to finish.
log_get_capacity(), log_get_max_modified_age_async(): Remove, to make
it easier to see that af_get_pct_for_lsn() is not acquiring any mutexes.
page_cleaner_flush_pages_recommendation(): Protect all access to
buf_pool.flush_list with buf_pool.flush_list_mutex. Previously there
were some race conditions in the calculation.
buf_flush_sync_for_checkpoint(): New function to process
buf_flush_sync_lsn in the page cleaner thread. At the end of
each batch, we try to wake up any blocked buf_flush_wait_flushed().
If everything up to buf_flush_sync_lsn has been flushed, we will
reset buf_flush_sync_lsn=0. The page cleaner thread will keep
'furious flushing' until the limit is reached. Any threads that
are waiting in buf_flush_wait_flushed() will be able to resume
as soon as their own limit has been satisfied.
buf_flush_page_cleaner: Prioritize buf_flush_sync_lsn and do not
sleep as long as it is set. Do not update any page_cleaner statistics
for this special mode of operation. In the normal mode
(buf_flush_sync_lsn is not set for innodb_flush_sync=ON),
try to wake up once per second. No longer check whether
srv_inc_activity_count() has been called. After each batch,
try to perform a log checkpoint, because the best chances for
the checkpoint LSN to advance by the maximum amount are upon
completing a flushing batch.
log_t: Move buf_free, max_buf_free possibly to the same cache line
with log_sys.mutex.
log_margin_checkpoint_age(): Simplify the logic, and replace
a 0.1-second sleep with a call to buf_flush_wait_flushed() to
initiate flushing. Moved to the same compilation unit
with the only caller.
log_close(): Clean up the calculations. (Should be no functional
change.) Return whether flush-ahead is needed. Moved to the same
compilation unit with the only caller.
mtr_t::finish_write(): Return whether flush-ahead is needed.
mtr_t::commit(): Invoke buf_flush_ahead() when needed. Let us avoid
external calls in mtr_t::commit() and make the logic easier to follow
by having related code in a single compilation unit. Also, we will
invoke srv_stats.log_write_requests.inc() only once per
mini-transaction commit, while not holding mutexes.
log_checkpoint_margin(): Only care about log_sys.max_checkpoint_age.
Upon reaching log_sys.max_checkpoint_age where we must wait to prevent
the log from getting corrupted, let us wait for at most 1MiB of LSN
at a time, before rechecking the condition. This should allow writers
to proceed even if the redo log capacity has been reached and
'furious flushing' is in progress. We no longer care about
log_sys.max_modified_age_sync or log_sys.max_modified_age_async.
The log_sys.max_modified_age_sync could be a relic from the time when
there was a srv_master_thread that wrote dirty pages to data files.
Also, we no longer have any log_sys.max_checkpoint_age_async limit,
because log checkpoints will now be triggered by the page cleaner
thread upon completing buf_flush_lists().
log_set_capacity(): Simplify the calculations of the limit
(no functional change).
log_checkpoint_low(): Split from log_checkpoint(). Moved to the
same compilation unit with the caller.
log_make_checkpoint(): Only wait for everything to be flushed until
the current LSN.
create_log_file(): After checkpoint, invoke log_write_up_to()
to ensure that the FILE_CHECKPOINT record has been written.
This avoids ut_ad(!srv_log_file_created) in create_log_file_rename().
srv_start(): Do not call recv_recovery_from_checkpoint_start()
if the log has just been created. Set fil_system.space_id_reuse_warned
before dict_boot() has been executed, and clear it after recovery
has finished.
dict_boot(): Initialize fil_system.max_assigned_id.
srv_check_activity(): Remove. The activity count is counting transaction
commits and therefore mostly interesting for the purge of history.
BtrBulk::insert(): Do not explicitly wake up the page cleaner,
but do invoke srv_inc_activity_count(), because that counter is
still being used in buf_load_throttle_if_needed() for some
heuristics. (It might be cleaner to execute buf_load() in the
page cleaner thread!)
Reviewed by: Vladislav Vaintroub
2020-10-26 16:35:47 +02:00
|
|
|
max_space_id--;
|
|
|
|
fil_assign_new_space_id(&max_space_id);
|
|
|
|
}
|
2005-10-27 07:29:40 +00:00
|
|
|
|
|
|
|
/* Insert into the dictionary cache the descriptions of the basic
|
|
|
|
system tables */
|
|
|
|
/*-------------------------*/
|
2021-05-20 14:58:25 +03:00
|
|
|
table = dict_table_t::create(dict_sys.SYS_TABLE[dict_sys.SYS_TABLES],
|
|
|
|
fil_system.sys_space,
|
|
|
|
DICT_NUM_COLS__SYS_TABLES, 0, 0, 0);
|
2016-08-12 11:17:45 +03:00
|
|
|
dict_mem_table_add_col(table, heap, "NAME", DATA_BINARY, 0,
|
|
|
|
MAX_FULL_NAME_LEN);
|
|
|
|
dict_mem_table_add_col(table, heap, "ID", DATA_BINARY, 0, 8);
|
2008-05-14 15:43:19 +00:00
|
|
|
/* ROW_FORMAT = (N_COLS >> 31) ? COMPACT : REDUNDANT */
|
2007-01-30 09:24:18 +00:00
|
|
|
dict_mem_table_add_col(table, heap, "N_COLS", DATA_INT, 0, 4);
|
2017-06-01 13:03:55 +03:00
|
|
|
/* The low order bit of TYPE is always set to 1. If ROW_FORMAT
|
|
|
|
is not REDUNDANT or COMPACT, this field matches table->flags. */
|
2007-01-30 09:24:18 +00:00
|
|
|
dict_mem_table_add_col(table, heap, "TYPE", DATA_INT, 0, 4);
|
|
|
|
dict_mem_table_add_col(table, heap, "MIX_ID", DATA_BINARY, 0, 0);
|
branches/innodb+: Merge revisions 6130:6364 from branches/zip:
------------------------------------------------------------------------
r6130 | marko | 2009-11-02 11:42:56 +0200 (Mon, 02 Nov 2009) | 9 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/btr/btr0sea.c
M /branches/zip/buf/buf0buf.c
M /branches/zip/dict/dict0dict.c
M /branches/zip/fil/fil0fil.c
M /branches/zip/ibuf/ibuf0ibuf.c
M /branches/zip/include/btr0sea.h
M /branches/zip/include/dict0dict.h
M /branches/zip/include/fil0fil.h
M /branches/zip/include/ibuf0ibuf.h
M /branches/zip/include/lock0lock.h
M /branches/zip/include/log0log.h
M /branches/zip/include/log0recv.h
M /branches/zip/include/mem0mem.h
M /branches/zip/include/mem0pool.h
M /branches/zip/include/os0file.h
M /branches/zip/include/pars0pars.h
M /branches/zip/include/srv0srv.h
M /branches/zip/include/thr0loc.h
M /branches/zip/include/trx0i_s.h
M /branches/zip/include/trx0purge.h
M /branches/zip/include/trx0rseg.h
M /branches/zip/include/trx0sys.h
M /branches/zip/include/trx0undo.h
M /branches/zip/include/usr0sess.h
M /branches/zip/lock/lock0lock.c
M /branches/zip/log/log0log.c
M /branches/zip/log/log0recv.c
M /branches/zip/mem/mem0dbg.c
M /branches/zip/mem/mem0pool.c
M /branches/zip/os/os0file.c
M /branches/zip/os/os0sync.c
M /branches/zip/os/os0thread.c
M /branches/zip/pars/lexyy.c
M /branches/zip/pars/pars0lex.l
M /branches/zip/que/que0que.c
M /branches/zip/srv/srv0srv.c
M /branches/zip/srv/srv0start.c
M /branches/zip/sync/sync0arr.c
M /branches/zip/sync/sync0sync.c
M /branches/zip/thr/thr0loc.c
M /branches/zip/trx/trx0i_s.c
M /branches/zip/trx/trx0purge.c
M /branches/zip/trx/trx0rseg.c
M /branches/zip/trx/trx0sys.c
M /branches/zip/trx/trx0undo.c
M /branches/zip/usr/usr0sess.c
M /branches/zip/ut/ut0mem.c
branches/zip: Free all resources at shutdown. Set pointers to NULL, so
that Valgrind will not complain about freed data structures that are
reachable via pointers. This addresses Bug #45992 and Bug #46656.
This patch is mostly based on changes copied from branches/embedded-1.0,
mainly c5432, c3439, c3134, c2994, c2978, but also some other code was
copied. Some added cleanup code is specific to MySQL/InnoDB.
rb://199 approved by Sunny Bains
------------------------------------------------------------------------
r6134 | marko | 2009-11-04 09:57:29 +0200 (Wed, 04 Nov 2009) | 5 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/handler/ha_innodb.cc
branches/zip: innobase_convert_identifier(): Convert table names with
explain_filename() to address Bug #32430: 'show innodb status'
causes errors Invalid (old?) table or database name in logs.
rb://134 approved by Sunny Bains
------------------------------------------------------------------------
r6137 | marko | 2009-11-04 15:24:28 +0200 (Wed, 04 Nov 2009) | 1 line
Changed paths:
M /branches/zip/dict/dict0dict.c
branches/zip: dict_index_too_big_for_undo(): Correct a typo.
------------------------------------------------------------------------
r6153 | vasil | 2009-11-10 15:33:22 +0200 (Tue, 10 Nov 2009) | 145 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
branches/zip: Merge r6125:6152 from branches/5.1:
(everything except the last white-space change was skipped as it is already
in branches/zip)
------------------------------------------------------------------------
r6127 | vasil | 2009-10-30 11:18:25 +0200 (Fri, 30 Oct 2009) | 18 lines
Changed paths:
M /branches/5.1/Makefile.am
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1:
Backport c6121 from branches/zip:
------------------------------------------------------------------------
r6121 | sunny | 2009-10-30 01:42:11 +0200 (Fri, 30 Oct 2009) | 7 lines
Changed paths:
M /branches/zip/mysql-test/innodb-autoinc.result
branches/zip: This test has been problematic for sometime now. The underlying
bug is that the data dictionaries get out of sync. In the AUTOINC code we
try and apply salve to the symptoms. In the past MySQL made some unrelated
change and the dictionaries stopped getting out of sync and this test started
to fail. Now, it seems they have reverted that changed and the test is
passing again. I suspect this is not he last time that this test will change.
------------------------------------------------------------------------
------------------------------------------------------------------------
r6129 | vasil | 2009-10-30 17:14:22 +0200 (Fri, 30 Oct 2009) | 4 lines
Changed paths:
M /branches/5.1/Makefile.am
branches/5.1:
Revert a change to Makefile.am that sneaked unnoticed in c6127.
------------------------------------------------------------------------
r6136 | marko | 2009-11-04 12:28:10 +0200 (Wed, 04 Nov 2009) | 15 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/include/ha_prototypes.h
M /branches/5.1/ut/ut0ut.c
branches/5.1: Port r6134 from branches/zip:
------------------------------------------------------------------------
r6134 | marko | 2009-11-04 07:57:29 +0000 (Wed, 04 Nov 2009) | 5 lines
branches/zip: innobase_convert_identifier(): Convert table names with
explain_filename() to address Bug #32430: 'show innodb status'
causes errors Invalid (old?) table or database name in logs.
rb://134 approved by Sunny Bains
------------------------------------------------------------------------
innobase_print_identifier(): Replace with innobase_convert_name().
innobase_convert_identifier(): New function, called by innobase_convert_name().
------------------------------------------------------------------------
r6149 | vasil | 2009-11-09 11:15:01 +0200 (Mon, 09 Nov 2009) | 5 lines
Changed paths:
M /branches/5.1/CMakeLists.txt
branches/5.1:
Followup to r5700: Adjust the changes so they are the same as in the BZR
repository.
------------------------------------------------------------------------
r6150 | vasil | 2009-11-09 11:43:31 +0200 (Mon, 09 Nov 2009) | 58 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1:
Merge a part of r2911.5.5 from MySQL:
(the other part of this was merged in c5700)
------------------------------------------------------------
revno: 2911.5.5
committer: Vladislav Vaintroub <vvaintroub@mysql.com>
branch nick: 5.1-innodb_plugin
timestamp: Wed 2009-06-10 10:59:49 +0200
message:
Backport WL#3653 to 5.1 to enable bundled innodb plugin.
Remove custom DLL loader code from innodb plugin code, use
symbols exported from mysqld.
removed:
storage/innodb_plugin/handler/handler0vars.h
storage/innodb_plugin/handler/win_delay_loader.cc
added:
storage/mysql_storage_engine.cmake
win/create_def_file.js
modified:
CMakeLists.txt
include/m_ctype.h
include/my_global.h
include/my_sys.h
include/mysql/plugin.h
libmysqld/CMakeLists.txt
mysql-test/mysql-test-run.pl
mysql-test/t/plugin.test
mysql-test/t/plugin_load-master.opt
mysys/charset.c
sql/CMakeLists.txt
sql/handler.h
sql/mysql_priv.h
sql/mysqld.cc
sql/sql_class.cc
sql/sql_class.h
sql/sql_list.h
sql/sql_profile.h
storage/Makefile.am
storage/archive/CMakeLists.txt
storage/blackhole/CMakeLists.txt
storage/csv/CMakeLists.txt
storage/example/CMakeLists.txt
storage/federated/CMakeLists.txt
storage/heap/CMakeLists.txt
storage/innobase/CMakeLists.txt
storage/innobase/handler/ha_innodb.cc
storage/innodb_plugin/CMakeLists.txt
storage/innodb_plugin/handler/ha_innodb.cc
storage/innodb_plugin/handler/handler0alter.cc
storage/innodb_plugin/handler/i_s.cc
storage/innodb_plugin/plug.in
storage/myisam/CMakeLists.txt
storage/myisammrg/CMakeLists.txt
win/Makefile.am
win/configure.js
------------------------------------------------------------------------
r6152 | vasil | 2009-11-10 15:30:20 +0200 (Tue, 10 Nov 2009) | 4 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1:
White space fixup.
------------------------------------------------------------------------
------------------------------------------------------------------------
r6157 | jyang | 2009-11-11 14:27:09 +0200 (Wed, 11 Nov 2009) | 10 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
A /branches/zip/mysql-test/innodb_bug47167.result
A /branches/zip/mysql-test/innodb_bug47167.test
M /branches/zip/mysql-test/innodb_file_format.result
branches/zip: Fix an issue that a local variable defined
in innodb_file_format_check_validate() is being referenced
across function in innodb_file_format_check_update().
In addition, fix "set global innodb_file_format_check =
DEFAULT" call.
Bug #47167: "set global innodb_file_format_check" cannot
set value by User-Defined Variable."
rb://169 approved by Sunny Bains and Marko.
------------------------------------------------------------------------
r6159 | vasil | 2009-11-11 15:13:01 +0200 (Wed, 11 Nov 2009) | 37 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/handler/ha_innodb.h
branches/zip:
Merge a change from MySQL:
(this has been reviewed by Calvin and Marko, and Calvin says Luis has
incorporated Marko's suggestions)
------------------------------------------------------------
revno: 3092.5.1
committer: Luis Soares <luis.soares@sun.com>
branch nick: mysql-5.1-bugteam
timestamp: Thu 2009-09-24 15:52:52 +0100
message:
BUG#42829: binlogging enabled for all schemas regardless of
binlog-db-db / binlog-ignore-db
InnoDB will return an error if statement based replication is used
along with transaction isolation level READ-COMMITTED (or weaker),
even if the statement in question is filtered out according to the
binlog-do-db rules set. In this case, an error should not be printed.
This patch addresses this issue by extending the existing check in
external_lock to take into account the filter rules before deciding to
print an error. Furthermore, it also changes decide_logging_format to
take into consideration whether the statement is filtered out from
binlog before decision is made.
added:
mysql-test/suite/binlog/r/binlog_stm_do_db.result
mysql-test/suite/binlog/t/binlog_stm_do_db-master.opt
mysql-test/suite/binlog/t/binlog_stm_do_db.test
modified:
sql/sql_base.cc
sql/sql_class.cc
storage/innobase/handler/ha_innodb.cc
storage/innobase/handler/ha_innodb.h
storage/innodb_plugin/handler/ha_innodb.cc
storage/innodb_plugin/handler/ha_innodb.h
------------------------------------------------------------------------
r6160 | vasil | 2009-11-11 15:33:49 +0200 (Wed, 11 Nov 2009) | 72 lines
Changed paths:
M /branches/zip/include/os0file.h
M /branches/zip/os/os0file.c
branches/zip: Merge r6152:6159 from branches/5.1:
(r6158 was skipped as an equivallent change has already been merged from MySQL)
------------------------------------------------------------------------
r6154 | calvin | 2009-11-11 02:51:17 +0200 (Wed, 11 Nov 2009) | 17 lines
Changed paths:
M /branches/5.1/include/os0file.h
M /branches/5.1/os/os0file.c
branches/5.1: fix bug#3139: Mysql crashes: 'windows error 995'
after several selects on a large DB
During stress environment, Windows AIO may fail with error code
ERROR_OPERATION_ABORTED. InnoDB does not handle the error, rather
crashes. The cause of the error is unknown, but likely due to
faulty hardware or driver.
This patch introduces a new error code OS_FILE_OPERATION_ABORTED,
which maps to Windows ERROR_OPERATION_ABORTED (995). When the error
is detected during AIO, the InnoDB will issue a synchronous retry
(read/write).
This patch has been extensively tested by MySQL support.
Approved by: Marko
rb://196
------------------------------------------------------------------------
r6158 | vasil | 2009-11-11 14:52:14 +0200 (Wed, 11 Nov 2009) | 37 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/handler/ha_innodb.h
branches/5.1:
Merge a change from MySQL:
(this has been reviewed by Calvin and Marko, and Calvin says Luis has
incorporated Marko's suggestions)
------------------------------------------------------------
revno: 3092.5.1
committer: Luis Soares <luis.soares@sun.com>
branch nick: mysql-5.1-bugteam
timestamp: Thu 2009-09-24 15:52:52 +0100
message:
BUG#42829: binlogging enabled for all schemas regardless of
binlog-db-db / binlog-ignore-db
InnoDB will return an error if statement based replication is used
along with transaction isolation level READ-COMMITTED (or weaker),
even if the statement in question is filtered out according to the
binlog-do-db rules set. In this case, an error should not be printed.
This patch addresses this issue by extending the existing check in
external_lock to take into account the filter rules before deciding to
print an error. Furthermore, it also changes decide_logging_format to
take into consideration whether the statement is filtered out from
binlog before decision is made.
added:
mysql-test/suite/binlog/r/binlog_stm_do_db.result
mysql-test/suite/binlog/t/binlog_stm_do_db-master.opt
mysql-test/suite/binlog/t/binlog_stm_do_db.test
modified:
sql/sql_base.cc
sql/sql_class.cc
storage/innobase/handler/ha_innodb.cc
storage/innobase/handler/ha_innodb.h
storage/innodb_plugin/handler/ha_innodb.cc
storage/innodb_plugin/handler/ha_innodb.h
------------------------------------------------------------------------
------------------------------------------------------------------------
r6161 | vasil | 2009-11-11 15:36:16 +0200 (Wed, 11 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Add changelog entry for r6160.
------------------------------------------------------------------------
r6162 | vasil | 2009-11-11 16:00:12 +0200 (Wed, 11 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Add ChangeLog for r6157.
------------------------------------------------------------------------
r6163 | calvin | 2009-11-11 17:53:20 +0200 (Wed, 11 Nov 2009) | 8 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/handler/ha_innodb.h
branches/zip: Exclude thd_binlog_filter_ok() when building
with older version of MySQL.
thd_binlog_filter_ok() is introduced in MySQL 5.1.41. But the
plugin can be built with MySQL prior to 5.1.41.
Approved by Heikki (on IM).
------------------------------------------------------------------------
r6169 | calvin | 2009-11-12 14:40:43 +0200 (Thu, 12 Nov 2009) | 6 lines
Changed paths:
A /branches/zip/mysql-test/innodb_bug46676.result
A /branches/zip/mysql-test/innodb_bug46676.test
branches/zip: add test case for bug#46676
This crash is reproducible with InnoDB plugin 1.0.4 + MySQL 5.1.37.
But no longer reproducible after MySQL 5.1.38 (with plugin 1.0.5).
Add test case to catch future regression.
------------------------------------------------------------------------
r6170 | marko | 2009-11-12 15:49:08 +0200 (Thu, 12 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/include/db0err.h
M /branches/zip/row/row0merge.c
M /branches/zip/row/row0mysql.c
branches/zip: Allow CREATE INDEX to be interrupted. (Issue #354)
rb://183 approved by Heikki Tuuri
------------------------------------------------------------------------
r6175 | vasil | 2009-11-16 20:07:39 +0200 (Mon, 16 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Wrap line at 78th char in the ChangeLog
------------------------------------------------------------------------
r6177 | calvin | 2009-11-16 20:20:38 +0200 (Mon, 16 Nov 2009) | 2 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip: add an entry to ChangeLog for r6065
------------------------------------------------------------------------
r6179 | marko | 2009-11-17 10:19:34 +0200 (Tue, 17 Nov 2009) | 2 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
branches/zip: ha_innobase::change_active_index(): When the history is
missing, report it to the client, not to the error log.
------------------------------------------------------------------------
r6181 | vasil | 2009-11-17 12:21:41 +0200 (Tue, 17 Nov 2009) | 33 lines
Changed paths:
M /branches/zip/mysql-test/innodb-index.test
branches/zip:
At the end of innodb-index.test: restore the environment as it was before
the test was started to silence this warning:
MTR's internal check of the test case 'main.innodb-index' failed.
This means that the test case does not preserve the state that existed
before the test case was executed. Most likely the test case did not
do a proper clean-up.
This is the diff of the states of the servers before and after the
test case was executed:
mysqltest: Logging to '/tmp/autotest.sh-20091117_033000-zip.btyZwu/mysql-5.1/mysql-test/var/tmp/check-mysqld_1.log'.
mysqltest: Results saved in '/tmp/autotest.sh-20091117_033000-zip.btyZwu/mysql-5.1/mysql-test/var/tmp/check-mysqld_1.result'.
mysqltest: Connecting to server localhost:13000 (socket /tmp/autotest.sh-20091117_033000-zip.btyZwu/mysql-5.1/mysql-test/var/tmp/mysqld.1.sock) as 'root', connection 'default', attempt 0 ...
mysqltest: ... Connected.
mysqltest: Start processing test commands from './include/check-testcase.test' ...
mysqltest: ... Done processing test commands.
--- /tmp/autotest.sh-20091117_033000-zip.btyZwu/mysql-5.1/mysql-test/var/tmp/check-mysqld_1.result 2009-11-17 13:10:40.000000000 +0300
+++ /tmp/autotest.sh-20091117_033000-zip.btyZwu/mysql-5.1/mysql-test/var/tmp/check-mysqld_1.reject 2009-11-17 13:10:54.000000000 +0300
@@ -84,7 +84,7 @@
INNODB_DOUBLEWRITE ON
INNODB_FAST_SHUTDOWN 1
INNODB_FILE_FORMAT Antelope
-INNODB_FILE_FORMAT_CHECK Antelope
+INNODB_FILE_FORMAT_CHECK Barracuda
INNODB_FILE_PER_TABLE OFF
INNODB_FLUSH_LOG_AT_TRX_COMMIT 1
INNODB_FLUSH_METHOD
mysqltest: Result content mismatch
not ok
------------------------------------------------------------------------
r6182 | marko | 2009-11-17 13:49:15 +0200 (Tue, 17 Nov 2009) | 1 line
Changed paths:
M /branches/zip/mysql-test/innodb-consistent-master.opt
M /branches/zip/mysql-test/innodb-consistent.result
M /branches/zip/mysql-test/innodb-consistent.test
M /branches/zip/mysql-test/innodb-use-sys-malloc-master.opt
M /branches/zip/mysql-test/innodb-use-sys-malloc.result
M /branches/zip/mysql-test/innodb-use-sys-malloc.test
M /branches/zip/mysql-test/innodb_bug21704.result
M /branches/zip/mysql-test/innodb_bug21704.test
M /branches/zip/mysql-test/innodb_bug40360.test
M /branches/zip/mysql-test/innodb_bug40565.result
M /branches/zip/mysql-test/innodb_bug40565.test
M /branches/zip/mysql-test/innodb_bug41904.result
M /branches/zip/mysql-test/innodb_bug41904.test
M /branches/zip/mysql-test/innodb_bug42101-nonzero-master.opt
M /branches/zip/mysql-test/innodb_bug42101-nonzero.result
M /branches/zip/mysql-test/innodb_bug42101-nonzero.test
M /branches/zip/mysql-test/innodb_bug42101.result
M /branches/zip/mysql-test/innodb_bug42101.test
M /branches/zip/mysql-test/innodb_bug44032.result
M /branches/zip/mysql-test/innodb_bug44032.test
M /branches/zip/mysql-test/innodb_bug44369.result
M /branches/zip/mysql-test/innodb_bug44369.test
M /branches/zip/mysql-test/innodb_bug44571.result
M /branches/zip/mysql-test/innodb_bug44571.test
M /branches/zip/mysql-test/innodb_bug45357.test
M /branches/zip/mysql-test/innodb_bug46000.result
M /branches/zip/mysql-test/innodb_bug46000.test
M /branches/zip/mysql-test/innodb_bug46676.result
M /branches/zip/mysql-test/innodb_bug46676.test
M /branches/zip/mysql-test/innodb_bug47167.result
M /branches/zip/mysql-test/innodb_bug47167.test
M /branches/zip/mysql-test/innodb_bug47777.result
M /branches/zip/mysql-test/innodb_bug47777.test
M /branches/zip/mysql-test/innodb_file_format.result
M /branches/zip/mysql-test/innodb_file_format.test
branches/zip: Set svn:eol-style on mysql-test files.
------------------------------------------------------------------------
r6183 | marko | 2009-11-17 13:51:16 +0200 (Tue, 17 Nov 2009) | 1 line
Changed paths:
M /branches/zip/mysql-test/innodb-consistent-master.opt
M /branches/zip/mysql-test/innodb-master.opt
M /branches/zip/mysql-test/innodb-semi-consistent-master.opt
M /branches/zip/mysql-test/innodb-use-sys-malloc-master.opt
M /branches/zip/mysql-test/innodb_bug42101-nonzero-master.opt
branches/zip: Prepend loose_ to plugin-only mysql-test options.
------------------------------------------------------------------------
r6184 | marko | 2009-11-17 13:52:01 +0200 (Tue, 17 Nov 2009) | 1 line
Changed paths:
M /branches/zip/mysql-test/innodb-index.result
M /branches/zip/mysql-test/innodb-index.test
branches/zip: innodb-index.test: Restore innodb_file_format_check.
------------------------------------------------------------------------
r6185 | marko | 2009-11-17 16:44:20 +0200 (Tue, 17 Nov 2009) | 16 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/mysql-test/innodb.result
M /branches/zip/mysql-test/innodb.test
M /branches/zip/mysql-test/innodb_bug44369.result
M /branches/zip/mysql-test/innodb_bug44369.test
D /branches/zip/mysql-test/patches/innodb-index.diff
M /branches/zip/row/row0mysql.c
branches/zip: Report duplicate table names
to the client connection, not to the error log. This change will allow
innodb-index.test to be re-enabled. It was previously disabled, because
mysql-test-run does not like output in the error log.
row_create_table_for_mysql(): Do not output anything to the error log
when reporting DB_DUPLICATE_KEY. Let the caller report the error.
Add a TODO comment that the dict_table_t object is apparently not freed
when an error occurs.
create_table_def(): Convert InnoDB table names to the character set
of the client connection for reporting. Use my_error(ER_WRONG_COLUMN_NAME)
for reporting reserved column names. Report my_error(ER_TABLE_EXISTS_ERROR)
when row_create_table_for_mysql() returns DB_DUPLICATE_KEY.
rb://206
------------------------------------------------------------------------
r6186 | vasil | 2009-11-17 16:48:14 +0200 (Tue, 17 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Add ChangeLog entry for r6185.
------------------------------------------------------------------------
r6189 | marko | 2009-11-18 11:36:18 +0200 (Wed, 18 Nov 2009) | 5 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/handler/handler0alter.cc
branches/zip: ha_innobase::add_index(): When creating the primary key
and the table is being locked by another transaction,
do not attempt to drop the table. (Bug #48782)
Approved by Sunny Bains over IM
------------------------------------------------------------------------
r6194 | vasil | 2009-11-19 09:24:45 +0200 (Thu, 19 Nov 2009) | 5 lines
Changed paths:
M /branches/zip/include/univ.i
branches/zip:
Increment version number from 1.0.5 to 1.0.6 since 1.0.5 was just released
by MySQL and we will soon release 1.0.6.
------------------------------------------------------------------------
r6197 | calvin | 2009-11-19 09:32:55 +0200 (Thu, 19 Nov 2009) | 6 lines
Changed paths:
M /branches/zip/CMakeLists.txt
branches/zip: merge the fix of bug#48317 (CMake file)
Due to MySQL changes to the CMake, it is no longer able
to build InnoDB plugin as a static library on Windows.
The fix is proposed by Vlad of MySQL.
------------------------------------------------------------------------
r6198 | vasil | 2009-11-19 09:44:31 +0200 (Thu, 19 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Add ChangeLog entry for r6197.
------------------------------------------------------------------------
r6199 | vasil | 2009-11-19 12:10:12 +0200 (Thu, 19 Nov 2009) | 31 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/btr/btr0btr.c
M /branches/zip/data/data0type.c
branches/zip: Merge r6159:6198 from branches/5.1:
------------------------------------------------------------------------
r6187 | jyang | 2009-11-18 05:27:30 +0200 (Wed, 18 Nov 2009) | 9 lines
Changed paths:
M /branches/5.1/btr/btr0btr.c
branches/5.1: Fix bug #48469 "when innodb tablespace is
configured too small, crash and corruption!". Function
btr_create() did not check the return status of fseg_create(),
and continue the index creation even there is no sufficient
space.
rb://205 Approved by Marko
------------------------------------------------------------------------
r6188 | jyang | 2009-11-18 07:14:23 +0200 (Wed, 18 Nov 2009) | 8 lines
Changed paths:
M /branches/5.1/data/data0type.c
branches/5.1: Fix bug #48526 "Data type for float and
double is incorrectly reported in InnoDB table monitor".
Certain datatypes are not printed correctly in
dtype_print().
rb://204 Approved by Marko.
------------------------------------------------------------------------
------------------------------------------------------------------------
r6201 | marko | 2009-11-19 14:09:11 +0200 (Thu, 19 Nov 2009) | 2 lines
Changed paths:
M /branches/zip/handler/handler0alter.cc
branches/zip: ha_innobase::add_index(): Clarify the comment
on orphaned tables when creating a primary key.
------------------------------------------------------------------------
r6202 | jyang | 2009-11-19 15:01:00 +0200 (Thu, 19 Nov 2009) | 8 lines
Changed paths:
M /branches/zip/btr/btr0btr.c
branches/zip: Function fseg_free() is no longer defined
in branches/zip. To port fix for bug #48469 to zip,
we can use btr_free_root() which frees the page,
and also does not require mini-transaction.
Approved by Marko.
------------------------------------------------------------------------
r6207 | vasil | 2009-11-20 10:19:14 +0200 (Fri, 20 Nov 2009) | 54 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
branches/zip: Merge r6198:6206 from branches/5.1:
(r6203 was skipped as it is already in branches/zip)
------------------------------------------------------------------------
r6200 | vasil | 2009-11-19 12:14:23 +0200 (Thu, 19 Nov 2009) | 4 lines
Changed paths:
M /branches/5.1/btr/btr0btr.c
branches/5.1:
White space fixup - indent under the opening (
------------------------------------------------------------------------
r6203 | jyang | 2009-11-19 15:12:22 +0200 (Thu, 19 Nov 2009) | 8 lines
Changed paths:
M /branches/5.1/btr/btr0btr.c
branches/5.1: Use btr_free_root() instead of fseg_free() for
the fix of bug #48469, because fseg_free() is not defined
in the zip branch. And we could save one mini-trasaction started
by fseg_free().
Approved by Marko.
------------------------------------------------------------------------
r6205 | jyang | 2009-11-20 07:55:48 +0200 (Fri, 20 Nov 2009) | 11 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: Add a special case to handle the Duplicated Key error
and return DB_ERROR instead. This is to avoid a possible SIGSEGV
by mysql error handling re-entering the storage layer for dup key
info without proper table handle.
This is to prevent a server crash when error situation in bug
#45961 "DDL on partitioned innodb tables leaves data dictionary
in an inconsistent state" happens.
rb://157 approved by Sunny Bains.
------------------------------------------------------------------------
r6206 | jyang | 2009-11-20 09:38:43 +0200 (Fri, 20 Nov 2009) | 5 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: Fix a minor code formating issue for
the parenthesis iplacement of the if condition in
rename_table().
------------------------------------------------------------------------
------------------------------------------------------------------------
r6208 | vasil | 2009-11-20 10:49:24 +0200 (Fri, 20 Nov 2009) | 4 lines
Changed paths:
M /branches/zip/ChangeLog
branches/zip:
Add ChangeLog entry for c6207.
------------------------------------------------------------------------
r6210 | vasil | 2009-11-20 23:39:48 +0200 (Fri, 20 Nov 2009) | 3 lines
Changed paths:
M /branches/zip/trx/trx0i_s.c
branches/zip:
Whitespace fixup.
------------------------------------------------------------------------
r6248 | marko | 2009-11-30 12:19:50 +0200 (Mon, 30 Nov 2009) | 1 line
Changed paths:
M /branches/zip/ChangeLog
branches/zip: ChangeLog: Document r4922 that was forgotten.
------------------------------------------------------------------------
r6252 | marko | 2009-11-30 12:50:11 +0200 (Mon, 30 Nov 2009) | 23 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/dict/dict0boot.c
M /branches/zip/dict/dict0crea.c
M /branches/zip/dict/dict0load.c
M /branches/zip/dict/dict0mem.c
M /branches/zip/fil/fil0fil.c
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/include/dict0mem.h
M /branches/zip/row/row0mysql.c
branches/zip: Suppress errors about non-found temporary tables.
Write the is_temp flag to SYS_TABLES.MIX_LEN.
dict_table_t::flags: Add a flag for is_temporary, DICT_TF2_TEMPORARY.
Unlike other flags, this will not be written to the tablespace flags
or SYS_TABLES.TYPE, but only to SYS_TABLES.MIX_LEN.
dict_build_table_def_step(): Only pass DICT_TF_BITS to tablespaces.
dict_check_tablespaces_and_store_max_id(), dict_load_table():
Suppress errors about temporary tables not being found.
dict_create_sys_tables_tuple(): Write the DICT_TF2_TEMPORARY flag
to SYS_TABLES.MIX_LEN.
fil_space_create(), fil_create_new_single_table_tablespace(): Add assertions
about space->flags.
row_drop_table_for_mysql(): Do not complain about non-found temporary tables.
rb://160 approved by Heikki Tuuri. This addresses the second part of
Bug #41609 Crash recovery does not work for InnoDB temporary tables.
------------------------------------------------------------------------
r6263 | vasil | 2009-12-01 14:49:05 +0200 (Tue, 01 Dec 2009) | 4 lines
Changed paths:
M /branches/zip/include/univ.i
branches/zip: Increment version number from 1.0.6 to 1.0.7
1.0.6 has been released
------------------------------------------------------------------------
r6264 | vasil | 2009-12-01 16:19:44 +0200 (Tue, 01 Dec 2009) | 1 line
Changed paths:
M /branches/zip/ChangeLog
branches/zip: Add ChangeLog entry for the release of 1.0.6.
------------------------------------------------------------------------
r6269 | marko | 2009-12-02 11:35:22 +0200 (Wed, 02 Dec 2009) | 2 lines
Changed paths:
M /branches/zip/srv/srv0start.c
branches/zip: innobase_start_or_create_for_mysql(): UNIV_IBUF_DEBUG
should not break crash recovery, but UNIV_IBUF_COUNT_DEBUG will.
------------------------------------------------------------------------
r6270 | marko | 2009-12-02 11:36:47 +0200 (Wed, 02 Dec 2009) | 1 line
Changed paths:
M /branches/zip/srv/srv0start.c
branches/zip: innobase_start_or_create_for_mysql(): Log the zlib version.
------------------------------------------------------------------------
r6271 | marko | 2009-12-02 11:43:49 +0200 (Wed, 02 Dec 2009) | 2 lines
Changed paths:
M /branches/zip/ChangeLog
M /branches/zip/Makefile.am
M /branches/zip/include/univ.i
M /branches/zip/plug.in
branches/zip: ChangeLog: Document that since r6270, the zlib version number
will be displayed at start-up.
------------------------------------------------------------------------
r6272 | marko | 2009-12-02 11:46:05 +0200 (Wed, 02 Dec 2009) | 1 line
Changed paths:
M /branches/zip/Makefile.am
M /branches/zip/include/univ.i
M /branches/zip/plug.in
branches/zip: Revert changes that were accidentally committed in r6271.
------------------------------------------------------------------------
r6274 | marko | 2009-12-03 14:47:12 +0200 (Thu, 03 Dec 2009) | 6 lines
Changed paths:
M /branches/zip/dict/dict0dict.c
branches/zip: dict_table_check_for_dup_indexes(): Assert that the
data dictionary mutex is being held while table->indexes is accessed.
This is already the case.
Currently, only dict_table_get_next_index() and dict_table_get_first_index()
are being invoked without holding dict_sys->mutex.
------------------------------------------------------------------------
r6275 | pekka | 2009-12-03 18:32:47 +0200 (Thu, 03 Dec 2009) | 10 lines
Changed paths:
M /branches/zip/include/log0recv.h
M /branches/zip/include/trx0sys.h
M /branches/zip/log/log0recv.c
M /branches/zip/trx/trx0sys.c
branches/zip: Minor changes which allow build with UNIV_HOTBACKUP
defined to succeed:
include/trx0sys.h: Allow Hot Backup build to see some
TRX_SYS_DOUBLEWRITE_... macros.
trx/trx0sys.c: Exclude trx_sys_close() function from Hot Backup build.
log/log0recv.[ch]: Exclude recv_sys_var_init() function from Hot Backup build.
This change should not affect !UNIV_HOTBACKUP build.
------------------------------------------------------------------------
r6277 | marko | 2009-12-08 11:13:36 +0200 (Tue, 08 Dec 2009) | 1 line
Changed paths:
M /branches/zip/fsp/fsp0fsp.c
branches/zip: fsp0fsp.c: Add some missing in/out and const qualifiers.
------------------------------------------------------------------------
r6285 | marko | 2009-12-09 09:24:50 +0200 (Wed, 09 Dec 2009) | 13 lines
Changed paths:
M /branches/zip/row/row0sel.c
branches/zip: row_sel_fetch_columns(): Remove redundant code that was
accidentally added in r1591, which introduced dfield_t::ext in order
to make the merge sort of fast index creation support externally
stored columns,
Initially, I tried to allocate the bit for dfield_t::ext from
dfield_t::len by making the length 31 bits and mapping UNIV_SQL_NULL
to something that would fit in it. Then I decided that it would be
too risky. The redundant check was part of the mapping. The
condition may have been dfield_is_null() initially.
This redundant code was noticed by Sergey Petrunya on the MySQL
internals list.
------------------------------------------------------------------------
r6288 | marko | 2009-12-09 09:51:00 +0200 (Wed, 09 Dec 2009) | 15 lines
Changed paths:
M /branches/zip/row/row0upd.c
branches/zip: row_upd_copy_columns(): Remove redundant code that was
accidentally added in r1591, which introduced dfield_t::ext in order
to make the merge sort of fast index creation support externally
stored columns.
Initially, I tried to allocate the bit for dfield_t::ext from
dfield_t::len by making the length 31 bits and mapping UNIV_SQL_NULL
to something that would fit in it. Then I decided that it would be
too risky. The redundant check was part of the mapping. The
condition may have been dfield_is_null() initially.
This is similar to the redundant code in row_sel_fetch_columns() that
was noticed by Sergey Petrunya on the MySQL internals list and removed
in r6285. As far as I can tell, there are no redundant UNIV_SQL_NULL
assignments remaining after this change.
------------------------------------------------------------------------
r6305 | marko | 2009-12-14 13:03:57 +0200 (Mon, 14 Dec 2009) | 2 lines
Changed paths:
M /branches/zip/row/row0umod.c
branches/zip: row_undo_mod_del_unmark_sec_and_undo_update(): Add a missing
const qualifier.
------------------------------------------------------------------------
r6309 | marko | 2009-12-15 14:05:50 +0200 (Tue, 15 Dec 2009) | 3 lines
Changed paths:
M /branches/zip/lock/lock0lock.c
branches/zip: lock_rec_insert_check_and_lock(): Avoid casting away constness.
Use page_rec_get_next_const() instead. This silences a gcc 4.2.4 warning.
Reported by Sunny Bains.
------------------------------------------------------------------------
r6312 | marko | 2009-12-16 10:10:36 +0200 (Wed, 16 Dec 2009) | 6 lines
Changed paths:
M /branches/zip/fil/fil0fil.c
branches/zip: fil_close(): Add #ifndef UNIV_HOTBACKUP around a debug
assertion on mutex.magic_n. InnoDB Hot Backup is a single-threaded
program and does not contain mutexes. This change allows InnoDB Hot
Backup to be compiled with UNIV_DEBUG.
Suggested by Michael Izioumtchenko.
------------------------------------------------------------------------
r6321 | marko | 2009-12-16 16:16:33 +0200 (Wed, 16 Dec 2009) | 4 lines
Changed paths:
M /branches/zip/row/row0merge.c
branches/zip: row_merge_drop_temp_indexes(): Revert a hack to
transaction isolation level that was made unnecessary by r5826 (Issue #337).
When this function is called, any active data dictionary transaction
should have been rolled back.
------------------------------------------------------------------------
r6345 | marko | 2009-12-21 10:46:14 +0200 (Mon, 21 Dec 2009) | 7 lines
Changed paths:
M /branches/zip/log/log0recv.c
branches/zip: recv_scan_log_recs(): Non-functional change: Replace a
debug assertion ut_ad(len > 0) with ut_ad(len >= OS_FILE_LOG_BLOCK_SIZE).
This change is only for readability, for Issue #428. Another
assertion on len being an integer multiple of OS_FILE_LOG_BLOCK_SIZE
already ensured together with the old ut_ad(len > 0) that actually len
must be at least OS_FILE_LOG_BLOCK_SIZE.
------------------------------------------------------------------------
r6346 | marko | 2009-12-21 12:03:25 +0200 (Mon, 21 Dec 2009) | 2 lines
Changed paths:
M /branches/zip/log/log0recv.c
branches/zip: recv_recovery_from_checkpoint_finish():
Revert a change that was accidentally committed in r6345.
------------------------------------------------------------------------
r6348 | marko | 2009-12-22 11:04:34 +0200 (Tue, 22 Dec 2009) | 37 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/include/ha_prototypes.h
M /branches/zip/include/trx0trx.h
M /branches/zip/lock/lock0lock.c
M /branches/zip/trx/trx0i_s.c
M /branches/zip/trx/trx0trx.c
branches/zip: Merge a change from MySQL:
------------------------------------------------------------
revno: 3236
committer: Satya B <satya.bn@sun.com>
branch nick: mysql-5.1-bugteam
timestamp: Tue 2009-12-01 17:48:57 +0530
message:
merge to mysql-5.1-bugteam
------------------------------------------------------------
revno: 3234.1.1
committer: Gleb Shchepa <gshchepa@mysql.com>
branch nick: mysql-5.1-bugteam
timestamp: Tue 2009-12-01 14:38:40 +0400
message:
Bug #38883 (reopened): thd_security_context is not thread safe, crashes?
manual merge 5.0-->5.1, updating InnoDB plugin.
------------------------------------------------------------
revno: 1810.3968.13
committer: Gleb Shchepa <gshchepa@mysql.com>
branch nick: mysql-5.0-bugteam
timestamp: Tue 2009-12-01 14:24:44 +0400
message:
Bug #38883 (reopened): thd_security_context is not thread safe, crashes?
The bug 38816 changed the lock that protects THD::query from
LOCK_thread_count to LOCK_thd_data, but didn't update the associated
InnoDB functions.
1. The innobase_mysql_prepare_print_arbitrary_thd and the
innobase_mysql_end_print_arbitrary_thd InnoDB functions have been
removed, since now we have a per-thread mutex: now we don't need to wrap
several inter-thread access tries to THD::query with a single global
LOCK_thread_count lock, so we can simplify the code.
2. The innobase_mysql_print_thd function has been modified to lock
LOCK_thd_data in direct way.
------------------------------------------------------------------------
r6351 | marko | 2009-12-22 11:11:18 +0200 (Tue, 22 Dec 2009) | 1 line
Changed paths:
M /branches/zip/handler/ha_innodb.cc
branches/zip: Remove an obsolete declaration of LOCK_thread_count.
------------------------------------------------------------------------
r6352 | marko | 2009-12-22 12:33:01 +0200 (Tue, 22 Dec 2009) | 104 lines
Changed paths:
M /branches/zip/handler/ha_innodb.cc
M /branches/zip/include/lock0lock.h
M /branches/zip/include/srv0srv.h
M /branches/zip/lock/lock0lock.c
M /branches/zip/log/log0log.c
M /branches/zip/mysql-test/innodb-autoinc.result
M /branches/zip/mysql-test/innodb-autoinc.test
M /branches/zip/row/row0sel.c
M /branches/zip/srv/srv0srv.c
M /branches/zip/srv/srv0start.c
branches/zip: Merge revisions 6206:6350 from branches/5.1,
except r6347, r6349, r6350 which were committed separately
to both branches, and r6310, which was backported from zip to 5.1.
------------------------------------------------------------------------
r6206 | jyang | 2009-11-20 09:38:43 +0200 (Fri, 20 Nov 2009) | 3 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: Non-functional change, fix formatting.
------------------------------------------------------------------------
r6230 | sunny | 2009-11-24 23:52:43 +0200 (Tue, 24 Nov 2009) | 3 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
branches/5.1: Fix autoinc failing test results.
(this should be skipped when merging 5.1 into zip)
------------------------------------------------------------------------
r6231 | sunny | 2009-11-25 10:26:27 +0200 (Wed, 25 Nov 2009) | 7 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
M /branches/5.1/row/row0sel.c
branches/5.1: Fix BUG#49032 - auto_increment field does not initialize to last value in InnoDB Storage Engine.
We use the appropriate function to read the column value for non-integer
autoinc column types, namely float and double.
rb://208. Approved by Marko.
------------------------------------------------------------------------
r6232 | sunny | 2009-11-25 10:27:39 +0200 (Wed, 25 Nov 2009) | 2 lines
Changed paths:
M /branches/5.1/row/row0sel.c
branches/5.1: This is an interim fix, fix white space errors.
------------------------------------------------------------------------
r6233 | sunny | 2009-11-25 10:28:35 +0200 (Wed, 25 Nov 2009) | 2 lines
Changed paths:
M /branches/5.1/include/mach0data.h
M /branches/5.1/include/mach0data.ic
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
M /branches/5.1/row/row0sel.c
branches/5.1: This is an interim fix, fix tests and make read float/double arg const.
------------------------------------------------------------------------
r6234 | sunny | 2009-11-25 10:29:03 +0200 (Wed, 25 Nov 2009) | 2 lines
Changed paths:
M /branches/5.1/row/row0sel.c
branches/5.1: This is an interim fix, fix whitepsace issues.
------------------------------------------------------------------------
r6235 | sunny | 2009-11-26 01:14:42 +0200 (Thu, 26 Nov 2009) | 9 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: Fix Bug#47720 - REPLACE INTO Autoincrement column with negative values.
This bug is similiar to the negative autoinc filter patch from earlier,
with the additional handling of filtering out the negative column values
set explicitly by the user.
rb://184
Approved by Heikki.
------------------------------------------------------------------------
r6242 | vasil | 2009-11-27 22:07:12 +0200 (Fri, 27 Nov 2009) | 4 lines
Changed paths:
M /branches/5.1/export.sh
branches/5.1:
Minor changes to support plugin snapshots.
------------------------------------------------------------------------
r6306 | calvin | 2009-12-14 15:12:46 +0200 (Mon, 14 Dec 2009) | 5 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: fix bug#49267: innodb-autoinc.test fails on windows
because of different case mode
There is no change to the InnoDB code, only to fix test case by
changing "T1" to "t1".
------------------------------------------------------------------------
r6324 | jyang | 2009-12-17 06:54:24 +0200 (Thu, 17 Dec 2009) | 8 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/include/lock0lock.h
M /branches/5.1/include/srv0srv.h
M /branches/5.1/lock/lock0lock.c
M /branches/5.1/log/log0log.c
M /branches/5.1/srv/srv0srv.c
M /branches/5.1/srv/srv0start.c
branches/5.1: Fix bug #47814 - Diagnostics are frequently not
printed after a long lock wait in InnoDB. Separate out the
lock wait timeout check thread from monitor information
printing thread.
rb://200 Approved by Marko.
------------------------------------------------------------------------
------------------------------------------------------------------------
r6364 | marko | 2009-12-26 21:06:31 +0200 (Sat, 26 Dec 2009) | 4 lines
Changed paths:
M /branches/zip/ibuf/ibuf0ibuf.c
branches/zip: ibuf_bitmap_get_map_page():
Define a wrapper macro that passes __FILE__, __LINE__ of the caller
to buf_page_get_gen().
This will ease the diagnosis of the likes of Issue #135.
------------------------------------------------------------------------
2009-12-26 19:17:43 +00:00
|
|
|
/* MIX_LEN may contain additional table flags when
|
2017-07-06 02:14:33 +03:00
|
|
|
ROW_FORMAT!=REDUNDANT. */
|
2007-01-30 09:24:18 +00:00
|
|
|
dict_mem_table_add_col(table, heap, "MIX_LEN", DATA_INT, 0, 4);
|
|
|
|
dict_mem_table_add_col(table, heap, "CLUSTER_NAME", DATA_BINARY, 0, 0);
|
|
|
|
dict_mem_table_add_col(table, heap, "SPACE", DATA_INT, 0, 4);
|
2005-10-27 07:29:40 +00:00
|
|
|
|
|
|
|
table->id = DICT_TABLES_ID;
|
2006-02-23 19:25:29 +00:00
|
|
|
|
MDEV-11369 Instant ADD COLUMN for InnoDB
For InnoDB tables, adding, dropping and reordering columns has
required a rebuild of the table and all its indexes. Since MySQL 5.6
(and MariaDB 10.0) this has been supported online (LOCK=NONE), allowing
concurrent modification of the tables.
This work revises the InnoDB ROW_FORMAT=REDUNDANT, ROW_FORMAT=COMPACT
and ROW_FORMAT=DYNAMIC so that columns can be appended instantaneously,
with only minor changes performed to the table structure. The counter
innodb_instant_alter_column in INFORMATION_SCHEMA.GLOBAL_STATUS
is incremented whenever a table rebuild operation is converted into
an instant ADD COLUMN operation.
ROW_FORMAT=COMPRESSED tables will not support instant ADD COLUMN.
Some usability limitations will be addressed in subsequent work:
MDEV-13134 Introduce ALTER TABLE attributes ALGORITHM=NOCOPY
and ALGORITHM=INSTANT
MDEV-14016 Allow instant ADD COLUMN, ADD INDEX, LOCK=NONE
The format of the clustered index (PRIMARY KEY) is changed as follows:
(1) The FIL_PAGE_TYPE of the root page will be FIL_PAGE_TYPE_INSTANT,
and a new field PAGE_INSTANT will contain the original number of fields
in the clustered index ('core' fields).
If instant ADD COLUMN has not been used or the table becomes empty,
or the very first instant ADD COLUMN operation is rolled back,
the fields PAGE_INSTANT and FIL_PAGE_TYPE will be reset
to 0 and FIL_PAGE_INDEX.
(2) A special 'default row' record is inserted into the leftmost leaf,
between the page infimum and the first user record. This record is
distinguished by the REC_INFO_MIN_REC_FLAG, and it is otherwise in the
same format as records that contain values for the instantly added
columns. This 'default row' always has the same number of fields as
the clustered index according to the table definition. The values of
'core' fields are to be ignored. For other fields, the 'default row'
will contain the default values as they were during the ALTER TABLE
statement. (If the column default values are changed later, those
values will only be stored in the .frm file. The 'default row' will
contain the original evaluated values, which must be the same for
every row.) The 'default row' must be completely hidden from
higher-level access routines. Assertions have been added to ensure
that no 'default row' is ever present in the adaptive hash index
or in locked records. The 'default row' is never delete-marked.
(3) In clustered index leaf page records, the number of fields must
reside between the number of 'core' fields (dict_index_t::n_core_fields
introduced in this work) and dict_index_t::n_fields. If the number
of fields is less than dict_index_t::n_fields, the missing fields
are replaced with the column value of the 'default row'.
Note: The number of fields in the record may shrink if some of the
last instantly added columns are updated to the value that is
in the 'default row'. The function btr_cur_trim() implements this
'compression' on update and rollback; dtuple::trim() implements it
on insert.
(4) In ROW_FORMAT=COMPACT and ROW_FORMAT=DYNAMIC records, the new
status value REC_STATUS_COLUMNS_ADDED will indicate the presence of
a new record header that will encode n_fields-n_core_fields-1 in
1 or 2 bytes. (In ROW_FORMAT=REDUNDANT records, the record header
always explicitly encodes the number of fields.)
We introduce the undo log record type TRX_UNDO_INSERT_DEFAULT for
covering the insert of the 'default row' record when instant ADD COLUMN
is used for the first time. Subsequent instant ADD COLUMN can use
TRX_UNDO_UPD_EXIST_REC.
This is joint work with Vin Chen (陈福荣) from Tencent. The design
that was discussed in April 2017 would not have allowed import or
export of data files, because instead of the 'default row' it would
have introduced a data dictionary table. The test
rpl.rpl_alter_instant is exactly as contributed in pull request #408.
The test innodb.instant_alter is based on a contributed test.
The redo log record format changes for ROW_FORMAT=DYNAMIC and
ROW_FORMAT=COMPACT are as contributed. (With this change present,
crash recovery from MariaDB 10.3.1 will fail in spectacular ways!)
Also the semantics of higher-level redo log records that modify the
PAGE_INSTANT field is changed. The redo log format version identifier
was already changed to LOG_HEADER_FORMAT_CURRENT=103 in MariaDB 10.3.1.
Everything else has been rewritten by me. Thanks to Elena Stepanova,
the code has been tested extensively.
When rolling back an instant ADD COLUMN operation, we must empty the
PAGE_FREE list after deleting or shortening the 'default row' record,
by calling either btr_page_empty() or btr_page_reorganize(). We must
know the size of each entry in the PAGE_FREE list. If rollback left a
freed copy of the 'default row' in the PAGE_FREE list, we would be
unable to determine its size (if it is in ROW_FORMAT=COMPACT or
ROW_FORMAT=DYNAMIC) because it would contain more fields than the
rolled-back definition of the clustered index.
UNIV_SQL_DEFAULT: A new special constant that designates an instantly
added column that is not present in the clustered index record.
len_is_stored(): Check if a length is an actual length. There are
two magic length values: UNIV_SQL_DEFAULT, UNIV_SQL_NULL.
dict_col_t::def_val: The 'default row' value of the column. If the
column is not added instantly, def_val.len will be UNIV_SQL_DEFAULT.
dict_col_t: Add the accessors is_virtual(), is_nullable(), is_instant(),
instant_value().
dict_col_t::remove_instant(): Remove the 'instant ADD' status of
a column.
dict_col_t::name(const dict_table_t& table): Replaces
dict_table_get_col_name().
dict_index_t::n_core_fields: The original number of fields.
For secondary indexes and if instant ADD COLUMN has not been used,
this will be equal to dict_index_t::n_fields.
dict_index_t::n_core_null_bytes: Number of bytes needed to
represent the null flags; usually equal to UT_BITS_IN_BYTES(n_nullable).
dict_index_t::NO_CORE_NULL_BYTES: Magic value signalling that
n_core_null_bytes was not initialized yet from the clustered index
root page.
dict_index_t: Add the accessors is_instant(), is_clust(),
get_n_nullable(), instant_field_value().
dict_index_t::instant_add_field(): Adjust clustered index metadata
for instant ADD COLUMN.
dict_index_t::remove_instant(): Remove the 'instant ADD' status
of a clustered index when the table becomes empty, or the very first
instant ADD COLUMN operation is rolled back.
dict_table_t: Add the accessors is_instant(), is_temporary(),
supports_instant().
dict_table_t::instant_add_column(): Adjust metadata for
instant ADD COLUMN.
dict_table_t::rollback_instant(): Adjust metadata on the rollback
of instant ADD COLUMN.
prepare_inplace_alter_table_dict(): First create the ctx->new_table,
and only then decide if the table really needs to be rebuilt.
We must split the creation of table or index metadata from the
creation of the dictionary table records and the creation of
the data. In this way, we can transform a table-rebuilding operation
into an instant ADD COLUMN operation. Dictionary objects will only
be added to cache when table rebuilding or index creation is needed.
The ctx->instant_table will never be added to cache.
dict_table_t::add_to_cache(): Modified and renamed from
dict_table_add_to_cache(). Do not modify the table metadata.
Let the callers invoke dict_table_add_system_columns() and if needed,
set can_be_evicted.
dict_create_sys_tables_tuple(), dict_create_table_step(): Omit the
system columns (which will now exist in the dict_table_t object
already at this point).
dict_create_table_step(): Expect the callers to invoke
dict_table_add_system_columns().
pars_create_table(): Before creating the table creation execution
graph, invoke dict_table_add_system_columns().
row_create_table_for_mysql(): Expect all callers to invoke
dict_table_add_system_columns().
create_index_dict(): Replaces row_merge_create_index_graph().
innodb_update_n_cols(): Renamed from innobase_update_n_virtual().
Call my_error() if an error occurs.
btr_cur_instant_init(), btr_cur_instant_init_low(),
btr_cur_instant_root_init():
Load additional metadata from the clustered index and set
dict_index_t::n_core_null_bytes. This is invoked
when table metadata is first loaded into the data dictionary.
dict_boot(): Initialize n_core_null_bytes for the four hard-coded
dictionary tables.
dict_create_index_step(): Initialize n_core_null_bytes. This is
executed as part of CREATE TABLE.
dict_index_build_internal_clust(): Initialize n_core_null_bytes to
NO_CORE_NULL_BYTES if table->supports_instant().
row_create_index_for_mysql(): Initialize n_core_null_bytes for
CREATE TEMPORARY TABLE.
commit_cache_norebuild(): Call the code to rename or enlarge columns
in the cache only if instant ADD COLUMN is not being used.
(Instant ADD COLUMN would copy all column metadata from
instant_table to old_table, including the names and lengths.)
PAGE_INSTANT: A new 13-bit field for storing dict_index_t::n_core_fields.
This is repurposing the 16-bit field PAGE_DIRECTION, of which only the
least significant 3 bits were used. The original byte containing
PAGE_DIRECTION will be accessible via the new constant PAGE_DIRECTION_B.
page_get_instant(), page_set_instant(): Accessors for the PAGE_INSTANT.
page_ptr_get_direction(), page_get_direction(),
page_ptr_set_direction(): Accessors for PAGE_DIRECTION.
page_direction_reset(): Reset PAGE_DIRECTION, PAGE_N_DIRECTION.
page_direction_increment(): Increment PAGE_N_DIRECTION
and set PAGE_DIRECTION.
rec_get_offsets(): Use the 'leaf' parameter for non-debug purposes,
and assume that heap_no is always set.
Initialize all dict_index_t::n_fields for ROW_FORMAT=REDUNDANT records,
even if the record contains fewer fields.
rec_offs_make_valid(): Add the parameter 'leaf'.
rec_copy_prefix_to_dtuple(): Assert that the tuple is only built
on the core fields. Instant ADD COLUMN only applies to the
clustered index, and we should never build a search key that has
more than the PRIMARY KEY and possibly DB_TRX_ID,DB_ROLL_PTR.
All these columns are always present.
dict_index_build_data_tuple(): Remove assertions that would be
duplicated in rec_copy_prefix_to_dtuple().
rec_init_offsets(): Support ROW_FORMAT=REDUNDANT records whose
number of fields is between n_core_fields and n_fields.
cmp_rec_rec_with_match(): Implement the comparison between two
MIN_REC_FLAG records.
trx_t::in_rollback: Make the field available in non-debug builds.
trx_start_for_ddl_low(): Remove dangerous error-tolerance.
A dictionary transaction must be flagged as such before it has generated
any undo log records. This is because trx_undo_assign_undo() will mark
the transaction as a dictionary transaction in the undo log header
right before the very first undo log record is being written.
btr_index_rec_validate(): Account for instant ADD COLUMN
row_undo_ins_remove_clust_rec(): On the rollback of an insert into
SYS_COLUMNS, revert instant ADD COLUMN in the cache by removing the
last column from the table and the clustered index.
row_search_on_row_ref(), row_undo_mod_parse_undo_rec(), row_undo_mod(),
trx_undo_update_rec_get_update(): Handle the 'default row'
as a special case.
dtuple_t::trim(index): Omit a redundant suffix of an index tuple right
before insert or update. After instant ADD COLUMN, if the last fields
of a clustered index tuple match the 'default row', there is no
need to store them. While trimming the entry, we must hold a page latch,
so that the table cannot be emptied and the 'default row' be deleted.
btr_cur_optimistic_update(), btr_cur_pessimistic_update(),
row_upd_clust_rec_by_insert(), row_ins_clust_index_entry_low():
Invoke dtuple_t::trim() if needed.
row_ins_clust_index_entry(): Restore dtuple_t::n_fields after calling
row_ins_clust_index_entry_low().
rec_get_converted_size(), rec_get_converted_size_comp(): Allow the number
of fields to be between n_core_fields and n_fields. Do not support
infimum,supremum. They are never supposed to be stored in dtuple_t,
because page creation nowadays uses a lower-level method for initializing
them.
rec_convert_dtuple_to_rec_comp(): Assign the status bits based on the
number of fields.
btr_cur_trim(): In an update, trim the index entry as needed. For the
'default row', handle rollback specially. For user records, omit
fields that match the 'default row'.
btr_cur_optimistic_delete_func(), btr_cur_pessimistic_delete():
Skip locking and adaptive hash index for the 'default row'.
row_log_table_apply_convert_mrec(): Replace 'default row' values if needed.
In the temporary file that is applied by row_log_table_apply(),
we must identify whether the records contain the extra header for
instantly added columns. For now, we will allocate an additional byte
for this for ROW_T_INSERT and ROW_T_UPDATE records when the source table
has been subject to instant ADD COLUMN. The ROW_T_DELETE records are
fine, as they will be converted and will only contain 'core' columns
(PRIMARY KEY and some system columns) that are converted from dtuple_t.
rec_get_converted_size_temp(), rec_init_offsets_temp(),
rec_convert_dtuple_to_temp(): Add the parameter 'status'.
REC_INFO_DEFAULT_ROW = REC_INFO_MIN_REC_FLAG | REC_STATUS_COLUMNS_ADDED:
An info_bits constant for distinguishing the 'default row' record.
rec_comp_status_t: An enum of the status bit values.
rec_leaf_format: An enum that replaces the bool parameter of
rec_init_offsets_comp_ordinary().
2017-10-06 07:00:05 +03:00
|
|
|
dict_table_add_system_columns(table, heap);
|
|
|
|
table->add_to_cache();
|
2019-05-17 14:32:53 +03:00
|
|
|
dict_sys.sys_tables = table;
|
2007-01-30 09:24:18 +00:00
|
|
|
mem_heap_empty(heap);
|
2006-02-23 19:25:29 +00:00
|
|
|
|
2018-03-23 17:25:56 +02:00
|
|
|
index = dict_mem_index_create(table, "CLUST_IND",
|
2006-08-29 09:30:31 +00:00
|
|
|
DICT_UNIQUE | DICT_CLUSTERED, 1);
|
2005-10-27 07:29:40 +00:00
|
|
|
|
2006-02-23 19:25:29 +00:00
|
|
|
dict_mem_index_add_field(index, "NAME", 0);
|
2005-10-27 07:29:40 +00:00
|
|
|
|
|
|
|
index->id = DICT_TABLES_ID;
|
2022-11-28 12:20:17 +02:00
|
|
|
err = dict_index_add_to_cache(
|
2018-03-23 17:25:56 +02:00
|
|
|
index, mach_read_from_4(dict_hdr + DICT_HDR_TABLES));
|
2022-11-28 11:34:22 +02:00
|
|
|
ut_a(err == DB_SUCCESS);
|
MDEV-11369 Instant ADD COLUMN for InnoDB
For InnoDB tables, adding, dropping and reordering columns has
required a rebuild of the table and all its indexes. Since MySQL 5.6
(and MariaDB 10.0) this has been supported online (LOCK=NONE), allowing
concurrent modification of the tables.
This work revises the InnoDB ROW_FORMAT=REDUNDANT, ROW_FORMAT=COMPACT
and ROW_FORMAT=DYNAMIC so that columns can be appended instantaneously,
with only minor changes performed to the table structure. The counter
innodb_instant_alter_column in INFORMATION_SCHEMA.GLOBAL_STATUS
is incremented whenever a table rebuild operation is converted into
an instant ADD COLUMN operation.
ROW_FORMAT=COMPRESSED tables will not support instant ADD COLUMN.
Some usability limitations will be addressed in subsequent work:
MDEV-13134 Introduce ALTER TABLE attributes ALGORITHM=NOCOPY
and ALGORITHM=INSTANT
MDEV-14016 Allow instant ADD COLUMN, ADD INDEX, LOCK=NONE
The format of the clustered index (PRIMARY KEY) is changed as follows:
(1) The FIL_PAGE_TYPE of the root page will be FIL_PAGE_TYPE_INSTANT,
and a new field PAGE_INSTANT will contain the original number of fields
in the clustered index ('core' fields).
If instant ADD COLUMN has not been used or the table becomes empty,
or the very first instant ADD COLUMN operation is rolled back,
the fields PAGE_INSTANT and FIL_PAGE_TYPE will be reset
to 0 and FIL_PAGE_INDEX.
(2) A special 'default row' record is inserted into the leftmost leaf,
between the page infimum and the first user record. This record is
distinguished by the REC_INFO_MIN_REC_FLAG, and it is otherwise in the
same format as records that contain values for the instantly added
columns. This 'default row' always has the same number of fields as
the clustered index according to the table definition. The values of
'core' fields are to be ignored. For other fields, the 'default row'
will contain the default values as they were during the ALTER TABLE
statement. (If the column default values are changed later, those
values will only be stored in the .frm file. The 'default row' will
contain the original evaluated values, which must be the same for
every row.) The 'default row' must be completely hidden from
higher-level access routines. Assertions have been added to ensure
that no 'default row' is ever present in the adaptive hash index
or in locked records. The 'default row' is never delete-marked.
(3) In clustered index leaf page records, the number of fields must
reside between the number of 'core' fields (dict_index_t::n_core_fields
introduced in this work) and dict_index_t::n_fields. If the number
of fields is less than dict_index_t::n_fields, the missing fields
are replaced with the column value of the 'default row'.
Note: The number of fields in the record may shrink if some of the
last instantly added columns are updated to the value that is
in the 'default row'. The function btr_cur_trim() implements this
'compression' on update and rollback; dtuple::trim() implements it
on insert.
(4) In ROW_FORMAT=COMPACT and ROW_FORMAT=DYNAMIC records, the new
status value REC_STATUS_COLUMNS_ADDED will indicate the presence of
a new record header that will encode n_fields-n_core_fields-1 in
1 or 2 bytes. (In ROW_FORMAT=REDUNDANT records, the record header
always explicitly encodes the number of fields.)
We introduce the undo log record type TRX_UNDO_INSERT_DEFAULT for
covering the insert of the 'default row' record when instant ADD COLUMN
is used for the first time. Subsequent instant ADD COLUMN can use
TRX_UNDO_UPD_EXIST_REC.
This is joint work with Vin Chen (陈福荣) from Tencent. The design
that was discussed in April 2017 would not have allowed import or
export of data files, because instead of the 'default row' it would
have introduced a data dictionary table. The test
rpl.rpl_alter_instant is exactly as contributed in pull request #408.
The test innodb.instant_alter is based on a contributed test.
The redo log record format changes for ROW_FORMAT=DYNAMIC and
ROW_FORMAT=COMPACT are as contributed. (With this change present,
crash recovery from MariaDB 10.3.1 will fail in spectacular ways!)
Also the semantics of higher-level redo log records that modify the
PAGE_INSTANT field is changed. The redo log format version identifier
was already changed to LOG_HEADER_FORMAT_CURRENT=103 in MariaDB 10.3.1.
Everything else has been rewritten by me. Thanks to Elena Stepanova,
the code has been tested extensively.
When rolling back an instant ADD COLUMN operation, we must empty the
PAGE_FREE list after deleting or shortening the 'default row' record,
by calling either btr_page_empty() or btr_page_reorganize(). We must
know the size of each entry in the PAGE_FREE list. If rollback left a
freed copy of the 'default row' in the PAGE_FREE list, we would be
unable to determine its size (if it is in ROW_FORMAT=COMPACT or
ROW_FORMAT=DYNAMIC) because it would contain more fields than the
rolled-back definition of the clustered index.
UNIV_SQL_DEFAULT: A new special constant that designates an instantly
added column that is not present in the clustered index record.
len_is_stored(): Check if a length is an actual length. There are
two magic length values: UNIV_SQL_DEFAULT, UNIV_SQL_NULL.
dict_col_t::def_val: The 'default row' value of the column. If the
column is not added instantly, def_val.len will be UNIV_SQL_DEFAULT.
dict_col_t: Add the accessors is_virtual(), is_nullable(), is_instant(),
instant_value().
dict_col_t::remove_instant(): Remove the 'instant ADD' status of
a column.
dict_col_t::name(const dict_table_t& table): Replaces
dict_table_get_col_name().
dict_index_t::n_core_fields: The original number of fields.
For secondary indexes and if instant ADD COLUMN has not been used,
this will be equal to dict_index_t::n_fields.
dict_index_t::n_core_null_bytes: Number of bytes needed to
represent the null flags; usually equal to UT_BITS_IN_BYTES(n_nullable).
dict_index_t::NO_CORE_NULL_BYTES: Magic value signalling that
n_core_null_bytes was not initialized yet from the clustered index
root page.
dict_index_t: Add the accessors is_instant(), is_clust(),
get_n_nullable(), instant_field_value().
dict_index_t::instant_add_field(): Adjust clustered index metadata
for instant ADD COLUMN.
dict_index_t::remove_instant(): Remove the 'instant ADD' status
of a clustered index when the table becomes empty, or the very first
instant ADD COLUMN operation is rolled back.
dict_table_t: Add the accessors is_instant(), is_temporary(),
supports_instant().
dict_table_t::instant_add_column(): Adjust metadata for
instant ADD COLUMN.
dict_table_t::rollback_instant(): Adjust metadata on the rollback
of instant ADD COLUMN.
prepare_inplace_alter_table_dict(): First create the ctx->new_table,
and only then decide if the table really needs to be rebuilt.
We must split the creation of table or index metadata from the
creation of the dictionary table records and the creation of
the data. In this way, we can transform a table-rebuilding operation
into an instant ADD COLUMN operation. Dictionary objects will only
be added to cache when table rebuilding or index creation is needed.
The ctx->instant_table will never be added to cache.
dict_table_t::add_to_cache(): Modified and renamed from
dict_table_add_to_cache(). Do not modify the table metadata.
Let the callers invoke dict_table_add_system_columns() and if needed,
set can_be_evicted.
dict_create_sys_tables_tuple(), dict_create_table_step(): Omit the
system columns (which will now exist in the dict_table_t object
already at this point).
dict_create_table_step(): Expect the callers to invoke
dict_table_add_system_columns().
pars_create_table(): Before creating the table creation execution
graph, invoke dict_table_add_system_columns().
row_create_table_for_mysql(): Expect all callers to invoke
dict_table_add_system_columns().
create_index_dict(): Replaces row_merge_create_index_graph().
innodb_update_n_cols(): Renamed from innobase_update_n_virtual().
Call my_error() if an error occurs.
btr_cur_instant_init(), btr_cur_instant_init_low(),
btr_cur_instant_root_init():
Load additional metadata from the clustered index and set
dict_index_t::n_core_null_bytes. This is invoked
when table metadata is first loaded into the data dictionary.
dict_boot(): Initialize n_core_null_bytes for the four hard-coded
dictionary tables.
dict_create_index_step(): Initialize n_core_null_bytes. This is
executed as part of CREATE TABLE.
dict_index_build_internal_clust(): Initialize n_core_null_bytes to
NO_CORE_NULL_BYTES if table->supports_instant().
row_create_index_for_mysql(): Initialize n_core_null_bytes for
CREATE TEMPORARY TABLE.
commit_cache_norebuild(): Call the code to rename or enlarge columns
in the cache only if instant ADD COLUMN is not being used.
(Instant ADD COLUMN would copy all column metadata from
instant_table to old_table, including the names and lengths.)
PAGE_INSTANT: A new 13-bit field for storing dict_index_t::n_core_fields.
This is repurposing the 16-bit field PAGE_DIRECTION, of which only the
least significant 3 bits were used. The original byte containing
PAGE_DIRECTION will be accessible via the new constant PAGE_DIRECTION_B.
page_get_instant(), page_set_instant(): Accessors for the PAGE_INSTANT.
page_ptr_get_direction(), page_get_direction(),
page_ptr_set_direction(): Accessors for PAGE_DIRECTION.
page_direction_reset(): Reset PAGE_DIRECTION, PAGE_N_DIRECTION.
page_direction_increment(): Increment PAGE_N_DIRECTION
and set PAGE_DIRECTION.
rec_get_offsets(): Use the 'leaf' parameter for non-debug purposes,
and assume that heap_no is always set.
Initialize all dict_index_t::n_fields for ROW_FORMAT=REDUNDANT records,
even if the record contains fewer fields.
rec_offs_make_valid(): Add the parameter 'leaf'.
rec_copy_prefix_to_dtuple(): Assert that the tuple is only built
on the core fields. Instant ADD COLUMN only applies to the
clustered index, and we should never build a search key that has
more than the PRIMARY KEY and possibly DB_TRX_ID,DB_ROLL_PTR.
All these columns are always present.
dict_index_build_data_tuple(): Remove assertions that would be
duplicated in rec_copy_prefix_to_dtuple().
rec_init_offsets(): Support ROW_FORMAT=REDUNDANT records whose
number of fields is between n_core_fields and n_fields.
cmp_rec_rec_with_match(): Implement the comparison between two
MIN_REC_FLAG records.
trx_t::in_rollback: Make the field available in non-debug builds.
trx_start_for_ddl_low(): Remove dangerous error-tolerance.
A dictionary transaction must be flagged as such before it has generated
any undo log records. This is because trx_undo_assign_undo() will mark
the transaction as a dictionary transaction in the undo log header
right before the very first undo log record is being written.
btr_index_rec_validate(): Account for instant ADD COLUMN
row_undo_ins_remove_clust_rec(): On the rollback of an insert into
SYS_COLUMNS, revert instant ADD COLUMN in the cache by removing the
last column from the table and the clustered index.
row_search_on_row_ref(), row_undo_mod_parse_undo_rec(), row_undo_mod(),
trx_undo_update_rec_get_update(): Handle the 'default row'
as a special case.
dtuple_t::trim(index): Omit a redundant suffix of an index tuple right
before insert or update. After instant ADD COLUMN, if the last fields
of a clustered index tuple match the 'default row', there is no
need to store them. While trimming the entry, we must hold a page latch,
so that the table cannot be emptied and the 'default row' be deleted.
btr_cur_optimistic_update(), btr_cur_pessimistic_update(),
row_upd_clust_rec_by_insert(), row_ins_clust_index_entry_low():
Invoke dtuple_t::trim() if needed.
row_ins_clust_index_entry(): Restore dtuple_t::n_fields after calling
row_ins_clust_index_entry_low().
rec_get_converted_size(), rec_get_converted_size_comp(): Allow the number
of fields to be between n_core_fields and n_fields. Do not support
infimum,supremum. They are never supposed to be stored in dtuple_t,
because page creation nowadays uses a lower-level method for initializing
them.
rec_convert_dtuple_to_rec_comp(): Assign the status bits based on the
number of fields.
btr_cur_trim(): In an update, trim the index entry as needed. For the
'default row', handle rollback specially. For user records, omit
fields that match the 'default row'.
btr_cur_optimistic_delete_func(), btr_cur_pessimistic_delete():
Skip locking and adaptive hash index for the 'default row'.
row_log_table_apply_convert_mrec(): Replace 'default row' values if needed.
In the temporary file that is applied by row_log_table_apply(),
we must identify whether the records contain the extra header for
instantly added columns. For now, we will allocate an additional byte
for this for ROW_T_INSERT and ROW_T_UPDATE records when the source table
has been subject to instant ADD COLUMN. The ROW_T_DELETE records are
fine, as they will be converted and will only contain 'core' columns
(PRIMARY KEY and some system columns) that are converted from dtuple_t.
rec_get_converted_size_temp(), rec_init_offsets_temp(),
rec_convert_dtuple_to_temp(): Add the parameter 'status'.
REC_INFO_DEFAULT_ROW = REC_INFO_MIN_REC_FLAG | REC_STATUS_COLUMNS_ADDED:
An info_bits constant for distinguishing the 'default row' record.
rec_comp_status_t: An enum of the status bit values.
rec_leaf_format: An enum that replaces the bool parameter of
rec_init_offsets_comp_ordinary().
2017-10-06 07:00:05 +03:00
|
|
|
ut_ad(!table->is_instant());
|
MDEV-21907: InnoDB: Enable -Wconversion on clang and GCC
The -Wconversion in GCC seems to be stricter than in clang.
GCC at least since version 4.4.7 issues truncation warnings for
assignments to bitfields, while clang 10 appears to only issue
warnings when the sizes in bytes rounded to the nearest integer
powers of 2 are different.
Before GCC 10.0.0, -Wconversion required more casts and would not
allow some operations, such as x<<=1 or x+=1 on a data type that
is narrower than int.
GCC 5 (but not GCC 4, GCC 6, or any later version) is complaining
about x|=y even when x and y are compatible types that are narrower
than int. Hence, we must rewrite some x|=y as
x=static_cast<byte>(x|y) or similar, or we must disable -Wconversion.
In GCC 6 and later, the warning for assigning wider to bitfields
that are narrower than 8, 16, or 32 bits can be suppressed by
applying a bitwise & with the exact bitmask of the bitfield.
For older GCC, we must disable -Wconversion for GCC 4 or 5 in such
cases.
The bitwise negation operator appears to promote short integers
to a wider type, and hence we must add explicit truncation casts
around them. Microsoft Visual C does not allow a static_cast to
truncate a constant, such as static_cast<byte>(1) truncating int.
Hence, we will use the constructor-style cast byte(~1) for such cases.
This has been tested at least with GCC 4.8.5, 5.4.0, 7.4.0, 9.2.1, 10.0.0,
clang 9.0.1, 10.0.0, and MSVC 14.22.27905 (Microsoft Visual Studio 2019)
on 64-bit and 32-bit targets (IA-32, AMD64, POWER 8, POWER 9, ARMv8).
2020-03-12 19:46:41 +02:00
|
|
|
table->indexes.start->n_core_null_bytes = static_cast<uint8_t>(
|
|
|
|
UT_BITS_IN_BYTES(unsigned(table->indexes.start->n_nullable)));
|
2006-09-19 10:14:07 +00:00
|
|
|
|
2005-10-27 07:29:40 +00:00
|
|
|
/*-------------------------*/
|
2018-03-23 17:25:56 +02:00
|
|
|
index = dict_mem_index_create(table, "ID_IND", DICT_UNIQUE, 1);
|
2006-02-23 19:25:29 +00:00
|
|
|
dict_mem_index_add_field(index, "ID", 0);
|
2005-10-27 07:29:40 +00:00
|
|
|
|
|
|
|
index->id = DICT_TABLE_IDS_ID;
|
2022-11-28 11:34:22 +02:00
|
|
|
err = dict_index_add_to_cache(
|
2018-03-23 17:25:56 +02:00
|
|
|
index, mach_read_from_4(dict_hdr + DICT_HDR_TABLE_IDS));
|
2022-11-28 11:34:22 +02:00
|
|
|
ut_a(err == DB_SUCCESS);
|
2006-09-19 10:14:07 +00:00
|
|
|
|
2005-10-27 07:29:40 +00:00
|
|
|
/*-------------------------*/
|
2021-05-20 14:58:25 +03:00
|
|
|
table = dict_table_t::create(dict_sys.SYS_TABLE[dict_sys.SYS_COLUMNS],
|
|
|
|
fil_system.sys_space,
|
|
|
|
DICT_NUM_COLS__SYS_COLUMNS, 0, 0, 0);
|
2016-08-12 11:17:45 +03:00
|
|
|
dict_mem_table_add_col(table, heap, "TABLE_ID", DATA_BINARY, 0, 8);
|
2007-01-30 09:24:18 +00:00
|
|
|
dict_mem_table_add_col(table, heap, "POS", DATA_INT, 0, 4);
|
|
|
|
dict_mem_table_add_col(table, heap, "NAME", DATA_BINARY, 0, 0);
|
|
|
|
dict_mem_table_add_col(table, heap, "MTYPE", DATA_INT, 0, 4);
|
|
|
|
dict_mem_table_add_col(table, heap, "PRTYPE", DATA_INT, 0, 4);
|
|
|
|
dict_mem_table_add_col(table, heap, "LEN", DATA_INT, 0, 4);
|
|
|
|
dict_mem_table_add_col(table, heap, "PREC", DATA_INT, 0, 4);
|
2006-02-23 19:25:29 +00:00
|
|
|
|
2005-10-27 07:29:40 +00:00
|
|
|
table->id = DICT_COLUMNS_ID;
|
|
|
|
|
MDEV-11369 Instant ADD COLUMN for InnoDB
For InnoDB tables, adding, dropping and reordering columns has
required a rebuild of the table and all its indexes. Since MySQL 5.6
(and MariaDB 10.0) this has been supported online (LOCK=NONE), allowing
concurrent modification of the tables.
This work revises the InnoDB ROW_FORMAT=REDUNDANT, ROW_FORMAT=COMPACT
and ROW_FORMAT=DYNAMIC so that columns can be appended instantaneously,
with only minor changes performed to the table structure. The counter
innodb_instant_alter_column in INFORMATION_SCHEMA.GLOBAL_STATUS
is incremented whenever a table rebuild operation is converted into
an instant ADD COLUMN operation.
ROW_FORMAT=COMPRESSED tables will not support instant ADD COLUMN.
Some usability limitations will be addressed in subsequent work:
MDEV-13134 Introduce ALTER TABLE attributes ALGORITHM=NOCOPY
and ALGORITHM=INSTANT
MDEV-14016 Allow instant ADD COLUMN, ADD INDEX, LOCK=NONE
The format of the clustered index (PRIMARY KEY) is changed as follows:
(1) The FIL_PAGE_TYPE of the root page will be FIL_PAGE_TYPE_INSTANT,
and a new field PAGE_INSTANT will contain the original number of fields
in the clustered index ('core' fields).
If instant ADD COLUMN has not been used or the table becomes empty,
or the very first instant ADD COLUMN operation is rolled back,
the fields PAGE_INSTANT and FIL_PAGE_TYPE will be reset
to 0 and FIL_PAGE_INDEX.
(2) A special 'default row' record is inserted into the leftmost leaf,
between the page infimum and the first user record. This record is
distinguished by the REC_INFO_MIN_REC_FLAG, and it is otherwise in the
same format as records that contain values for the instantly added
columns. This 'default row' always has the same number of fields as
the clustered index according to the table definition. The values of
'core' fields are to be ignored. For other fields, the 'default row'
will contain the default values as they were during the ALTER TABLE
statement. (If the column default values are changed later, those
values will only be stored in the .frm file. The 'default row' will
contain the original evaluated values, which must be the same for
every row.) The 'default row' must be completely hidden from
higher-level access routines. Assertions have been added to ensure
that no 'default row' is ever present in the adaptive hash index
or in locked records. The 'default row' is never delete-marked.
(3) In clustered index leaf page records, the number of fields must
reside between the number of 'core' fields (dict_index_t::n_core_fields
introduced in this work) and dict_index_t::n_fields. If the number
of fields is less than dict_index_t::n_fields, the missing fields
are replaced with the column value of the 'default row'.
Note: The number of fields in the record may shrink if some of the
last instantly added columns are updated to the value that is
in the 'default row'. The function btr_cur_trim() implements this
'compression' on update and rollback; dtuple::trim() implements it
on insert.
(4) In ROW_FORMAT=COMPACT and ROW_FORMAT=DYNAMIC records, the new
status value REC_STATUS_COLUMNS_ADDED will indicate the presence of
a new record header that will encode n_fields-n_core_fields-1 in
1 or 2 bytes. (In ROW_FORMAT=REDUNDANT records, the record header
always explicitly encodes the number of fields.)
We introduce the undo log record type TRX_UNDO_INSERT_DEFAULT for
covering the insert of the 'default row' record when instant ADD COLUMN
is used for the first time. Subsequent instant ADD COLUMN can use
TRX_UNDO_UPD_EXIST_REC.
This is joint work with Vin Chen (陈福荣) from Tencent. The design
that was discussed in April 2017 would not have allowed import or
export of data files, because instead of the 'default row' it would
have introduced a data dictionary table. The test
rpl.rpl_alter_instant is exactly as contributed in pull request #408.
The test innodb.instant_alter is based on a contributed test.
The redo log record format changes for ROW_FORMAT=DYNAMIC and
ROW_FORMAT=COMPACT are as contributed. (With this change present,
crash recovery from MariaDB 10.3.1 will fail in spectacular ways!)
Also the semantics of higher-level redo log records that modify the
PAGE_INSTANT field is changed. The redo log format version identifier
was already changed to LOG_HEADER_FORMAT_CURRENT=103 in MariaDB 10.3.1.
Everything else has been rewritten by me. Thanks to Elena Stepanova,
the code has been tested extensively.
When rolling back an instant ADD COLUMN operation, we must empty the
PAGE_FREE list after deleting or shortening the 'default row' record,
by calling either btr_page_empty() or btr_page_reorganize(). We must
know the size of each entry in the PAGE_FREE list. If rollback left a
freed copy of the 'default row' in the PAGE_FREE list, we would be
unable to determine its size (if it is in ROW_FORMAT=COMPACT or
ROW_FORMAT=DYNAMIC) because it would contain more fields than the
rolled-back definition of the clustered index.
UNIV_SQL_DEFAULT: A new special constant that designates an instantly
added column that is not present in the clustered index record.
len_is_stored(): Check if a length is an actual length. There are
two magic length values: UNIV_SQL_DEFAULT, UNIV_SQL_NULL.
dict_col_t::def_val: The 'default row' value of the column. If the
column is not added instantly, def_val.len will be UNIV_SQL_DEFAULT.
dict_col_t: Add the accessors is_virtual(), is_nullable(), is_instant(),
instant_value().
dict_col_t::remove_instant(): Remove the 'instant ADD' status of
a column.
dict_col_t::name(const dict_table_t& table): Replaces
dict_table_get_col_name().
dict_index_t::n_core_fields: The original number of fields.
For secondary indexes and if instant ADD COLUMN has not been used,
this will be equal to dict_index_t::n_fields.
dict_index_t::n_core_null_bytes: Number of bytes needed to
represent the null flags; usually equal to UT_BITS_IN_BYTES(n_nullable).
dict_index_t::NO_CORE_NULL_BYTES: Magic value signalling that
n_core_null_bytes was not initialized yet from the clustered index
root page.
dict_index_t: Add the accessors is_instant(), is_clust(),
get_n_nullable(), instant_field_value().
dict_index_t::instant_add_field(): Adjust clustered index metadata
for instant ADD COLUMN.
dict_index_t::remove_instant(): Remove the 'instant ADD' status
of a clustered index when the table becomes empty, or the very first
instant ADD COLUMN operation is rolled back.
dict_table_t: Add the accessors is_instant(), is_temporary(),
supports_instant().
dict_table_t::instant_add_column(): Adjust metadata for
instant ADD COLUMN.
dict_table_t::rollback_instant(): Adjust metadata on the rollback
of instant ADD COLUMN.
prepare_inplace_alter_table_dict(): First create the ctx->new_table,
and only then decide if the table really needs to be rebuilt.
We must split the creation of table or index metadata from the
creation of the dictionary table records and the creation of
the data. In this way, we can transform a table-rebuilding operation
into an instant ADD COLUMN operation. Dictionary objects will only
be added to cache when table rebuilding or index creation is needed.
The ctx->instant_table will never be added to cache.
dict_table_t::add_to_cache(): Modified and renamed from
dict_table_add_to_cache(). Do not modify the table metadata.
Let the callers invoke dict_table_add_system_columns() and if needed,
set can_be_evicted.
dict_create_sys_tables_tuple(), dict_create_table_step(): Omit the
system columns (which will now exist in the dict_table_t object
already at this point).
dict_create_table_step(): Expect the callers to invoke
dict_table_add_system_columns().
pars_create_table(): Before creating the table creation execution
graph, invoke dict_table_add_system_columns().
row_create_table_for_mysql(): Expect all callers to invoke
dict_table_add_system_columns().
create_index_dict(): Replaces row_merge_create_index_graph().
innodb_update_n_cols(): Renamed from innobase_update_n_virtual().
Call my_error() if an error occurs.
btr_cur_instant_init(), btr_cur_instant_init_low(),
btr_cur_instant_root_init():
Load additional metadata from the clustered index and set
dict_index_t::n_core_null_bytes. This is invoked
when table metadata is first loaded into the data dictionary.
dict_boot(): Initialize n_core_null_bytes for the four hard-coded
dictionary tables.
dict_create_index_step(): Initialize n_core_null_bytes. This is
executed as part of CREATE TABLE.
dict_index_build_internal_clust(): Initialize n_core_null_bytes to
NO_CORE_NULL_BYTES if table->supports_instant().
row_create_index_for_mysql(): Initialize n_core_null_bytes for
CREATE TEMPORARY TABLE.
commit_cache_norebuild(): Call the code to rename or enlarge columns
in the cache only if instant ADD COLUMN is not being used.
(Instant ADD COLUMN would copy all column metadata from
instant_table to old_table, including the names and lengths.)
PAGE_INSTANT: A new 13-bit field for storing dict_index_t::n_core_fields.
This is repurposing the 16-bit field PAGE_DIRECTION, of which only the
least significant 3 bits were used. The original byte containing
PAGE_DIRECTION will be accessible via the new constant PAGE_DIRECTION_B.
page_get_instant(), page_set_instant(): Accessors for the PAGE_INSTANT.
page_ptr_get_direction(), page_get_direction(),
page_ptr_set_direction(): Accessors for PAGE_DIRECTION.
page_direction_reset(): Reset PAGE_DIRECTION, PAGE_N_DIRECTION.
page_direction_increment(): Increment PAGE_N_DIRECTION
and set PAGE_DIRECTION.
rec_get_offsets(): Use the 'leaf' parameter for non-debug purposes,
and assume that heap_no is always set.
Initialize all dict_index_t::n_fields for ROW_FORMAT=REDUNDANT records,
even if the record contains fewer fields.
rec_offs_make_valid(): Add the parameter 'leaf'.
rec_copy_prefix_to_dtuple(): Assert that the tuple is only built
on the core fields. Instant ADD COLUMN only applies to the
clustered index, and we should never build a search key that has
more than the PRIMARY KEY and possibly DB_TRX_ID,DB_ROLL_PTR.
All these columns are always present.
dict_index_build_data_tuple(): Remove assertions that would be
duplicated in rec_copy_prefix_to_dtuple().
rec_init_offsets(): Support ROW_FORMAT=REDUNDANT records whose
number of fields is between n_core_fields and n_fields.
cmp_rec_rec_with_match(): Implement the comparison between two
MIN_REC_FLAG records.
trx_t::in_rollback: Make the field available in non-debug builds.
trx_start_for_ddl_low(): Remove dangerous error-tolerance.
A dictionary transaction must be flagged as such before it has generated
any undo log records. This is because trx_undo_assign_undo() will mark
the transaction as a dictionary transaction in the undo log header
right before the very first undo log record is being written.
btr_index_rec_validate(): Account for instant ADD COLUMN
row_undo_ins_remove_clust_rec(): On the rollback of an insert into
SYS_COLUMNS, revert instant ADD COLUMN in the cache by removing the
last column from the table and the clustered index.
row_search_on_row_ref(), row_undo_mod_parse_undo_rec(), row_undo_mod(),
trx_undo_update_rec_get_update(): Handle the 'default row'
as a special case.
dtuple_t::trim(index): Omit a redundant suffix of an index tuple right
before insert or update. After instant ADD COLUMN, if the last fields
of a clustered index tuple match the 'default row', there is no
need to store them. While trimming the entry, we must hold a page latch,
so that the table cannot be emptied and the 'default row' be deleted.
btr_cur_optimistic_update(), btr_cur_pessimistic_update(),
row_upd_clust_rec_by_insert(), row_ins_clust_index_entry_low():
Invoke dtuple_t::trim() if needed.
row_ins_clust_index_entry(): Restore dtuple_t::n_fields after calling
row_ins_clust_index_entry_low().
rec_get_converted_size(), rec_get_converted_size_comp(): Allow the number
of fields to be between n_core_fields and n_fields. Do not support
infimum,supremum. They are never supposed to be stored in dtuple_t,
because page creation nowadays uses a lower-level method for initializing
them.
rec_convert_dtuple_to_rec_comp(): Assign the status bits based on the
number of fields.
btr_cur_trim(): In an update, trim the index entry as needed. For the
'default row', handle rollback specially. For user records, omit
fields that match the 'default row'.
btr_cur_optimistic_delete_func(), btr_cur_pessimistic_delete():
Skip locking and adaptive hash index for the 'default row'.
row_log_table_apply_convert_mrec(): Replace 'default row' values if needed.
In the temporary file that is applied by row_log_table_apply(),
we must identify whether the records contain the extra header for
instantly added columns. For now, we will allocate an additional byte
for this for ROW_T_INSERT and ROW_T_UPDATE records when the source table
has been subject to instant ADD COLUMN. The ROW_T_DELETE records are
fine, as they will be converted and will only contain 'core' columns
(PRIMARY KEY and some system columns) that are converted from dtuple_t.
rec_get_converted_size_temp(), rec_init_offsets_temp(),
rec_convert_dtuple_to_temp(): Add the parameter 'status'.
REC_INFO_DEFAULT_ROW = REC_INFO_MIN_REC_FLAG | REC_STATUS_COLUMNS_ADDED:
An info_bits constant for distinguishing the 'default row' record.
rec_comp_status_t: An enum of the status bit values.
rec_leaf_format: An enum that replaces the bool parameter of
rec_init_offsets_comp_ordinary().
2017-10-06 07:00:05 +03:00
|
|
|
dict_table_add_system_columns(table, heap);
|
|
|
|
table->add_to_cache();
|
2019-05-17 14:32:53 +03:00
|
|
|
dict_sys.sys_columns = table;
|
2007-01-30 09:24:18 +00:00
|
|
|
mem_heap_empty(heap);
|
2005-10-27 07:29:40 +00:00
|
|
|
|
2018-03-23 17:25:56 +02:00
|
|
|
index = dict_mem_index_create(table, "CLUST_IND",
|
2006-08-29 09:30:31 +00:00
|
|
|
DICT_UNIQUE | DICT_CLUSTERED, 2);
|
2005-10-27 07:29:40 +00:00
|
|
|
|
2006-02-23 19:25:29 +00:00
|
|
|
dict_mem_index_add_field(index, "TABLE_ID", 0);
|
|
|
|
dict_mem_index_add_field(index, "POS", 0);
|
2005-10-27 07:29:40 +00:00
|
|
|
|
|
|
|
index->id = DICT_COLUMNS_ID;
|
2022-11-28 11:34:22 +02:00
|
|
|
err = dict_index_add_to_cache(
|
2018-03-23 17:25:56 +02:00
|
|
|
index, mach_read_from_4(dict_hdr + DICT_HDR_COLUMNS));
|
2022-11-28 11:34:22 +02:00
|
|
|
ut_a(err == DB_SUCCESS);
|
MDEV-11369 Instant ADD COLUMN for InnoDB
For InnoDB tables, adding, dropping and reordering columns has
required a rebuild of the table and all its indexes. Since MySQL 5.6
(and MariaDB 10.0) this has been supported online (LOCK=NONE), allowing
concurrent modification of the tables.
This work revises the InnoDB ROW_FORMAT=REDUNDANT, ROW_FORMAT=COMPACT
and ROW_FORMAT=DYNAMIC so that columns can be appended instantaneously,
with only minor changes performed to the table structure. The counter
innodb_instant_alter_column in INFORMATION_SCHEMA.GLOBAL_STATUS
is incremented whenever a table rebuild operation is converted into
an instant ADD COLUMN operation.
ROW_FORMAT=COMPRESSED tables will not support instant ADD COLUMN.
Some usability limitations will be addressed in subsequent work:
MDEV-13134 Introduce ALTER TABLE attributes ALGORITHM=NOCOPY
and ALGORITHM=INSTANT
MDEV-14016 Allow instant ADD COLUMN, ADD INDEX, LOCK=NONE
The format of the clustered index (PRIMARY KEY) is changed as follows:
(1) The FIL_PAGE_TYPE of the root page will be FIL_PAGE_TYPE_INSTANT,
and a new field PAGE_INSTANT will contain the original number of fields
in the clustered index ('core' fields).
If instant ADD COLUMN has not been used or the table becomes empty,
or the very first instant ADD COLUMN operation is rolled back,
the fields PAGE_INSTANT and FIL_PAGE_TYPE will be reset
to 0 and FIL_PAGE_INDEX.
(2) A special 'default row' record is inserted into the leftmost leaf,
between the page infimum and the first user record. This record is
distinguished by the REC_INFO_MIN_REC_FLAG, and it is otherwise in the
same format as records that contain values for the instantly added
columns. This 'default row' always has the same number of fields as
the clustered index according to the table definition. The values of
'core' fields are to be ignored. For other fields, the 'default row'
will contain the default values as they were during the ALTER TABLE
statement. (If the column default values are changed later, those
values will only be stored in the .frm file. The 'default row' will
contain the original evaluated values, which must be the same for
every row.) The 'default row' must be completely hidden from
higher-level access routines. Assertions have been added to ensure
that no 'default row' is ever present in the adaptive hash index
or in locked records. The 'default row' is never delete-marked.
(3) In clustered index leaf page records, the number of fields must
reside between the number of 'core' fields (dict_index_t::n_core_fields
introduced in this work) and dict_index_t::n_fields. If the number
of fields is less than dict_index_t::n_fields, the missing fields
are replaced with the column value of the 'default row'.
Note: The number of fields in the record may shrink if some of the
last instantly added columns are updated to the value that is
in the 'default row'. The function btr_cur_trim() implements this
'compression' on update and rollback; dtuple::trim() implements it
on insert.
(4) In ROW_FORMAT=COMPACT and ROW_FORMAT=DYNAMIC records, the new
status value REC_STATUS_COLUMNS_ADDED will indicate the presence of
a new record header that will encode n_fields-n_core_fields-1 in
1 or 2 bytes. (In ROW_FORMAT=REDUNDANT records, the record header
always explicitly encodes the number of fields.)
We introduce the undo log record type TRX_UNDO_INSERT_DEFAULT for
covering the insert of the 'default row' record when instant ADD COLUMN
is used for the first time. Subsequent instant ADD COLUMN can use
TRX_UNDO_UPD_EXIST_REC.
This is joint work with Vin Chen (陈福荣) from Tencent. The design
that was discussed in April 2017 would not have allowed import or
export of data files, because instead of the 'default row' it would
have introduced a data dictionary table. The test
rpl.rpl_alter_instant is exactly as contributed in pull request #408.
The test innodb.instant_alter is based on a contributed test.
The redo log record format changes for ROW_FORMAT=DYNAMIC and
ROW_FORMAT=COMPACT are as contributed. (With this change present,
crash recovery from MariaDB 10.3.1 will fail in spectacular ways!)
Also the semantics of higher-level redo log records that modify the
PAGE_INSTANT field is changed. The redo log format version identifier
was already changed to LOG_HEADER_FORMAT_CURRENT=103 in MariaDB 10.3.1.
Everything else has been rewritten by me. Thanks to Elena Stepanova,
the code has been tested extensively.
When rolling back an instant ADD COLUMN operation, we must empty the
PAGE_FREE list after deleting or shortening the 'default row' record,
by calling either btr_page_empty() or btr_page_reorganize(). We must
know the size of each entry in the PAGE_FREE list. If rollback left a
freed copy of the 'default row' in the PAGE_FREE list, we would be
unable to determine its size (if it is in ROW_FORMAT=COMPACT or
ROW_FORMAT=DYNAMIC) because it would contain more fields than the
rolled-back definition of the clustered index.
UNIV_SQL_DEFAULT: A new special constant that designates an instantly
added column that is not present in the clustered index record.
len_is_stored(): Check if a length is an actual length. There are
two magic length values: UNIV_SQL_DEFAULT, UNIV_SQL_NULL.
dict_col_t::def_val: The 'default row' value of the column. If the
column is not added instantly, def_val.len will be UNIV_SQL_DEFAULT.
dict_col_t: Add the accessors is_virtual(), is_nullable(), is_instant(),
instant_value().
dict_col_t::remove_instant(): Remove the 'instant ADD' status of
a column.
dict_col_t::name(const dict_table_t& table): Replaces
dict_table_get_col_name().
dict_index_t::n_core_fields: The original number of fields.
For secondary indexes and if instant ADD COLUMN has not been used,
this will be equal to dict_index_t::n_fields.
dict_index_t::n_core_null_bytes: Number of bytes needed to
represent the null flags; usually equal to UT_BITS_IN_BYTES(n_nullable).
dict_index_t::NO_CORE_NULL_BYTES: Magic value signalling that
n_core_null_bytes was not initialized yet from the clustered index
root page.
dict_index_t: Add the accessors is_instant(), is_clust(),
get_n_nullable(), instant_field_value().
dict_index_t::instant_add_field(): Adjust clustered index metadata
for instant ADD COLUMN.
dict_index_t::remove_instant(): Remove the 'instant ADD' status
of a clustered index when the table becomes empty, or the very first
instant ADD COLUMN operation is rolled back.
dict_table_t: Add the accessors is_instant(), is_temporary(),
supports_instant().
dict_table_t::instant_add_column(): Adjust metadata for
instant ADD COLUMN.
dict_table_t::rollback_instant(): Adjust metadata on the rollback
of instant ADD COLUMN.
prepare_inplace_alter_table_dict(): First create the ctx->new_table,
and only then decide if the table really needs to be rebuilt.
We must split the creation of table or index metadata from the
creation of the dictionary table records and the creation of
the data. In this way, we can transform a table-rebuilding operation
into an instant ADD COLUMN operation. Dictionary objects will only
be added to cache when table rebuilding or index creation is needed.
The ctx->instant_table will never be added to cache.
dict_table_t::add_to_cache(): Modified and renamed from
dict_table_add_to_cache(). Do not modify the table metadata.
Let the callers invoke dict_table_add_system_columns() and if needed,
set can_be_evicted.
dict_create_sys_tables_tuple(), dict_create_table_step(): Omit the
system columns (which will now exist in the dict_table_t object
already at this point).
dict_create_table_step(): Expect the callers to invoke
dict_table_add_system_columns().
pars_create_table(): Before creating the table creation execution
graph, invoke dict_table_add_system_columns().
row_create_table_for_mysql(): Expect all callers to invoke
dict_table_add_system_columns().
create_index_dict(): Replaces row_merge_create_index_graph().
innodb_update_n_cols(): Renamed from innobase_update_n_virtual().
Call my_error() if an error occurs.
btr_cur_instant_init(), btr_cur_instant_init_low(),
btr_cur_instant_root_init():
Load additional metadata from the clustered index and set
dict_index_t::n_core_null_bytes. This is invoked
when table metadata is first loaded into the data dictionary.
dict_boot(): Initialize n_core_null_bytes for the four hard-coded
dictionary tables.
dict_create_index_step(): Initialize n_core_null_bytes. This is
executed as part of CREATE TABLE.
dict_index_build_internal_clust(): Initialize n_core_null_bytes to
NO_CORE_NULL_BYTES if table->supports_instant().
row_create_index_for_mysql(): Initialize n_core_null_bytes for
CREATE TEMPORARY TABLE.
commit_cache_norebuild(): Call the code to rename or enlarge columns
in the cache only if instant ADD COLUMN is not being used.
(Instant ADD COLUMN would copy all column metadata from
instant_table to old_table, including the names and lengths.)
PAGE_INSTANT: A new 13-bit field for storing dict_index_t::n_core_fields.
This is repurposing the 16-bit field PAGE_DIRECTION, of which only the
least significant 3 bits were used. The original byte containing
PAGE_DIRECTION will be accessible via the new constant PAGE_DIRECTION_B.
page_get_instant(), page_set_instant(): Accessors for the PAGE_INSTANT.
page_ptr_get_direction(), page_get_direction(),
page_ptr_set_direction(): Accessors for PAGE_DIRECTION.
page_direction_reset(): Reset PAGE_DIRECTION, PAGE_N_DIRECTION.
page_direction_increment(): Increment PAGE_N_DIRECTION
and set PAGE_DIRECTION.
rec_get_offsets(): Use the 'leaf' parameter for non-debug purposes,
and assume that heap_no is always set.
Initialize all dict_index_t::n_fields for ROW_FORMAT=REDUNDANT records,
even if the record contains fewer fields.
rec_offs_make_valid(): Add the parameter 'leaf'.
rec_copy_prefix_to_dtuple(): Assert that the tuple is only built
on the core fields. Instant ADD COLUMN only applies to the
clustered index, and we should never build a search key that has
more than the PRIMARY KEY and possibly DB_TRX_ID,DB_ROLL_PTR.
All these columns are always present.
dict_index_build_data_tuple(): Remove assertions that would be
duplicated in rec_copy_prefix_to_dtuple().
rec_init_offsets(): Support ROW_FORMAT=REDUNDANT records whose
number of fields is between n_core_fields and n_fields.
cmp_rec_rec_with_match(): Implement the comparison between two
MIN_REC_FLAG records.
trx_t::in_rollback: Make the field available in non-debug builds.
trx_start_for_ddl_low(): Remove dangerous error-tolerance.
A dictionary transaction must be flagged as such before it has generated
any undo log records. This is because trx_undo_assign_undo() will mark
the transaction as a dictionary transaction in the undo log header
right before the very first undo log record is being written.
btr_index_rec_validate(): Account for instant ADD COLUMN
row_undo_ins_remove_clust_rec(): On the rollback of an insert into
SYS_COLUMNS, revert instant ADD COLUMN in the cache by removing the
last column from the table and the clustered index.
row_search_on_row_ref(), row_undo_mod_parse_undo_rec(), row_undo_mod(),
trx_undo_update_rec_get_update(): Handle the 'default row'
as a special case.
dtuple_t::trim(index): Omit a redundant suffix of an index tuple right
before insert or update. After instant ADD COLUMN, if the last fields
of a clustered index tuple match the 'default row', there is no
need to store them. While trimming the entry, we must hold a page latch,
so that the table cannot be emptied and the 'default row' be deleted.
btr_cur_optimistic_update(), btr_cur_pessimistic_update(),
row_upd_clust_rec_by_insert(), row_ins_clust_index_entry_low():
Invoke dtuple_t::trim() if needed.
row_ins_clust_index_entry(): Restore dtuple_t::n_fields after calling
row_ins_clust_index_entry_low().
rec_get_converted_size(), rec_get_converted_size_comp(): Allow the number
of fields to be between n_core_fields and n_fields. Do not support
infimum,supremum. They are never supposed to be stored in dtuple_t,
because page creation nowadays uses a lower-level method for initializing
them.
rec_convert_dtuple_to_rec_comp(): Assign the status bits based on the
number of fields.
btr_cur_trim(): In an update, trim the index entry as needed. For the
'default row', handle rollback specially. For user records, omit
fields that match the 'default row'.
btr_cur_optimistic_delete_func(), btr_cur_pessimistic_delete():
Skip locking and adaptive hash index for the 'default row'.
row_log_table_apply_convert_mrec(): Replace 'default row' values if needed.
In the temporary file that is applied by row_log_table_apply(),
we must identify whether the records contain the extra header for
instantly added columns. For now, we will allocate an additional byte
for this for ROW_T_INSERT and ROW_T_UPDATE records when the source table
has been subject to instant ADD COLUMN. The ROW_T_DELETE records are
fine, as they will be converted and will only contain 'core' columns
(PRIMARY KEY and some system columns) that are converted from dtuple_t.
rec_get_converted_size_temp(), rec_init_offsets_temp(),
rec_convert_dtuple_to_temp(): Add the parameter 'status'.
REC_INFO_DEFAULT_ROW = REC_INFO_MIN_REC_FLAG | REC_STATUS_COLUMNS_ADDED:
An info_bits constant for distinguishing the 'default row' record.
rec_comp_status_t: An enum of the status bit values.
rec_leaf_format: An enum that replaces the bool parameter of
rec_init_offsets_comp_ordinary().
2017-10-06 07:00:05 +03:00
|
|
|
ut_ad(!table->is_instant());
|
MDEV-21907: InnoDB: Enable -Wconversion on clang and GCC
The -Wconversion in GCC seems to be stricter than in clang.
GCC at least since version 4.4.7 issues truncation warnings for
assignments to bitfields, while clang 10 appears to only issue
warnings when the sizes in bytes rounded to the nearest integer
powers of 2 are different.
Before GCC 10.0.0, -Wconversion required more casts and would not
allow some operations, such as x<<=1 or x+=1 on a data type that
is narrower than int.
GCC 5 (but not GCC 4, GCC 6, or any later version) is complaining
about x|=y even when x and y are compatible types that are narrower
than int. Hence, we must rewrite some x|=y as
x=static_cast<byte>(x|y) or similar, or we must disable -Wconversion.
In GCC 6 and later, the warning for assigning wider to bitfields
that are narrower than 8, 16, or 32 bits can be suppressed by
applying a bitwise & with the exact bitmask of the bitfield.
For older GCC, we must disable -Wconversion for GCC 4 or 5 in such
cases.
The bitwise negation operator appears to promote short integers
to a wider type, and hence we must add explicit truncation casts
around them. Microsoft Visual C does not allow a static_cast to
truncate a constant, such as static_cast<byte>(1) truncating int.
Hence, we will use the constructor-style cast byte(~1) for such cases.
This has been tested at least with GCC 4.8.5, 5.4.0, 7.4.0, 9.2.1, 10.0.0,
clang 9.0.1, 10.0.0, and MSVC 14.22.27905 (Microsoft Visual Studio 2019)
on 64-bit and 32-bit targets (IA-32, AMD64, POWER 8, POWER 9, ARMv8).
2020-03-12 19:46:41 +02:00
|
|
|
table->indexes.start->n_core_null_bytes = static_cast<uint8_t>(
|
|
|
|
UT_BITS_IN_BYTES(unsigned(table->indexes.start->n_nullable)));
|
2006-09-19 10:14:07 +00:00
|
|
|
|
2005-10-27 07:29:40 +00:00
|
|
|
/*-------------------------*/
|
2021-05-20 14:58:25 +03:00
|
|
|
table = dict_table_t::create(dict_sys.SYS_TABLE[dict_sys.SYS_INDEXES],
|
|
|
|
fil_system.sys_space,
|
|
|
|
DICT_NUM_COLS__SYS_INDEXES, 0, 0, 0);
|
2005-10-27 07:29:40 +00:00
|
|
|
|
2016-08-12 11:17:45 +03:00
|
|
|
dict_mem_table_add_col(table, heap, "TABLE_ID", DATA_BINARY, 0, 8);
|
|
|
|
dict_mem_table_add_col(table, heap, "ID", DATA_BINARY, 0, 8);
|
2007-01-30 09:24:18 +00:00
|
|
|
dict_mem_table_add_col(table, heap, "NAME", DATA_BINARY, 0, 0);
|
|
|
|
dict_mem_table_add_col(table, heap, "N_FIELDS", DATA_INT, 0, 4);
|
|
|
|
dict_mem_table_add_col(table, heap, "TYPE", DATA_INT, 0, 4);
|
2021-05-03 18:53:01 +03:00
|
|
|
/* SYS_INDEXES.SPACE is only read by in dict_drop_index_tree() */
|
2007-01-30 09:24:18 +00:00
|
|
|
dict_mem_table_add_col(table, heap, "SPACE", DATA_INT, 0, 4);
|
|
|
|
dict_mem_table_add_col(table, heap, "PAGE_NO", DATA_INT, 0, 4);
|
2016-08-12 11:17:45 +03:00
|
|
|
dict_mem_table_add_col(table, heap, "MERGE_THRESHOLD", DATA_INT, 0, 4);
|
2005-10-27 07:29:40 +00:00
|
|
|
|
|
|
|
table->id = DICT_INDEXES_ID;
|
2012-08-01 17:27:34 +03:00
|
|
|
|
MDEV-11369 Instant ADD COLUMN for InnoDB
For InnoDB tables, adding, dropping and reordering columns has
required a rebuild of the table and all its indexes. Since MySQL 5.6
(and MariaDB 10.0) this has been supported online (LOCK=NONE), allowing
concurrent modification of the tables.
This work revises the InnoDB ROW_FORMAT=REDUNDANT, ROW_FORMAT=COMPACT
and ROW_FORMAT=DYNAMIC so that columns can be appended instantaneously,
with only minor changes performed to the table structure. The counter
innodb_instant_alter_column in INFORMATION_SCHEMA.GLOBAL_STATUS
is incremented whenever a table rebuild operation is converted into
an instant ADD COLUMN operation.
ROW_FORMAT=COMPRESSED tables will not support instant ADD COLUMN.
Some usability limitations will be addressed in subsequent work:
MDEV-13134 Introduce ALTER TABLE attributes ALGORITHM=NOCOPY
and ALGORITHM=INSTANT
MDEV-14016 Allow instant ADD COLUMN, ADD INDEX, LOCK=NONE
The format of the clustered index (PRIMARY KEY) is changed as follows:
(1) The FIL_PAGE_TYPE of the root page will be FIL_PAGE_TYPE_INSTANT,
and a new field PAGE_INSTANT will contain the original number of fields
in the clustered index ('core' fields).
If instant ADD COLUMN has not been used or the table becomes empty,
or the very first instant ADD COLUMN operation is rolled back,
the fields PAGE_INSTANT and FIL_PAGE_TYPE will be reset
to 0 and FIL_PAGE_INDEX.
(2) A special 'default row' record is inserted into the leftmost leaf,
between the page infimum and the first user record. This record is
distinguished by the REC_INFO_MIN_REC_FLAG, and it is otherwise in the
same format as records that contain values for the instantly added
columns. This 'default row' always has the same number of fields as
the clustered index according to the table definition. The values of
'core' fields are to be ignored. For other fields, the 'default row'
will contain the default values as they were during the ALTER TABLE
statement. (If the column default values are changed later, those
values will only be stored in the .frm file. The 'default row' will
contain the original evaluated values, which must be the same for
every row.) The 'default row' must be completely hidden from
higher-level access routines. Assertions have been added to ensure
that no 'default row' is ever present in the adaptive hash index
or in locked records. The 'default row' is never delete-marked.
(3) In clustered index leaf page records, the number of fields must
reside between the number of 'core' fields (dict_index_t::n_core_fields
introduced in this work) and dict_index_t::n_fields. If the number
of fields is less than dict_index_t::n_fields, the missing fields
are replaced with the column value of the 'default row'.
Note: The number of fields in the record may shrink if some of the
last instantly added columns are updated to the value that is
in the 'default row'. The function btr_cur_trim() implements this
'compression' on update and rollback; dtuple::trim() implements it
on insert.
(4) In ROW_FORMAT=COMPACT and ROW_FORMAT=DYNAMIC records, the new
status value REC_STATUS_COLUMNS_ADDED will indicate the presence of
a new record header that will encode n_fields-n_core_fields-1 in
1 or 2 bytes. (In ROW_FORMAT=REDUNDANT records, the record header
always explicitly encodes the number of fields.)
We introduce the undo log record type TRX_UNDO_INSERT_DEFAULT for
covering the insert of the 'default row' record when instant ADD COLUMN
is used for the first time. Subsequent instant ADD COLUMN can use
TRX_UNDO_UPD_EXIST_REC.
This is joint work with Vin Chen (陈福荣) from Tencent. The design
that was discussed in April 2017 would not have allowed import or
export of data files, because instead of the 'default row' it would
have introduced a data dictionary table. The test
rpl.rpl_alter_instant is exactly as contributed in pull request #408.
The test innodb.instant_alter is based on a contributed test.
The redo log record format changes for ROW_FORMAT=DYNAMIC and
ROW_FORMAT=COMPACT are as contributed. (With this change present,
crash recovery from MariaDB 10.3.1 will fail in spectacular ways!)
Also the semantics of higher-level redo log records that modify the
PAGE_INSTANT field is changed. The redo log format version identifier
was already changed to LOG_HEADER_FORMAT_CURRENT=103 in MariaDB 10.3.1.
Everything else has been rewritten by me. Thanks to Elena Stepanova,
the code has been tested extensively.
When rolling back an instant ADD COLUMN operation, we must empty the
PAGE_FREE list after deleting or shortening the 'default row' record,
by calling either btr_page_empty() or btr_page_reorganize(). We must
know the size of each entry in the PAGE_FREE list. If rollback left a
freed copy of the 'default row' in the PAGE_FREE list, we would be
unable to determine its size (if it is in ROW_FORMAT=COMPACT or
ROW_FORMAT=DYNAMIC) because it would contain more fields than the
rolled-back definition of the clustered index.
UNIV_SQL_DEFAULT: A new special constant that designates an instantly
added column that is not present in the clustered index record.
len_is_stored(): Check if a length is an actual length. There are
two magic length values: UNIV_SQL_DEFAULT, UNIV_SQL_NULL.
dict_col_t::def_val: The 'default row' value of the column. If the
column is not added instantly, def_val.len will be UNIV_SQL_DEFAULT.
dict_col_t: Add the accessors is_virtual(), is_nullable(), is_instant(),
instant_value().
dict_col_t::remove_instant(): Remove the 'instant ADD' status of
a column.
dict_col_t::name(const dict_table_t& table): Replaces
dict_table_get_col_name().
dict_index_t::n_core_fields: The original number of fields.
For secondary indexes and if instant ADD COLUMN has not been used,
this will be equal to dict_index_t::n_fields.
dict_index_t::n_core_null_bytes: Number of bytes needed to
represent the null flags; usually equal to UT_BITS_IN_BYTES(n_nullable).
dict_index_t::NO_CORE_NULL_BYTES: Magic value signalling that
n_core_null_bytes was not initialized yet from the clustered index
root page.
dict_index_t: Add the accessors is_instant(), is_clust(),
get_n_nullable(), instant_field_value().
dict_index_t::instant_add_field(): Adjust clustered index metadata
for instant ADD COLUMN.
dict_index_t::remove_instant(): Remove the 'instant ADD' status
of a clustered index when the table becomes empty, or the very first
instant ADD COLUMN operation is rolled back.
dict_table_t: Add the accessors is_instant(), is_temporary(),
supports_instant().
dict_table_t::instant_add_column(): Adjust metadata for
instant ADD COLUMN.
dict_table_t::rollback_instant(): Adjust metadata on the rollback
of instant ADD COLUMN.
prepare_inplace_alter_table_dict(): First create the ctx->new_table,
and only then decide if the table really needs to be rebuilt.
We must split the creation of table or index metadata from the
creation of the dictionary table records and the creation of
the data. In this way, we can transform a table-rebuilding operation
into an instant ADD COLUMN operation. Dictionary objects will only
be added to cache when table rebuilding or index creation is needed.
The ctx->instant_table will never be added to cache.
dict_table_t::add_to_cache(): Modified and renamed from
dict_table_add_to_cache(). Do not modify the table metadata.
Let the callers invoke dict_table_add_system_columns() and if needed,
set can_be_evicted.
dict_create_sys_tables_tuple(), dict_create_table_step(): Omit the
system columns (which will now exist in the dict_table_t object
already at this point).
dict_create_table_step(): Expect the callers to invoke
dict_table_add_system_columns().
pars_create_table(): Before creating the table creation execution
graph, invoke dict_table_add_system_columns().
row_create_table_for_mysql(): Expect all callers to invoke
dict_table_add_system_columns().
create_index_dict(): Replaces row_merge_create_index_graph().
innodb_update_n_cols(): Renamed from innobase_update_n_virtual().
Call my_error() if an error occurs.
btr_cur_instant_init(), btr_cur_instant_init_low(),
btr_cur_instant_root_init():
Load additional metadata from the clustered index and set
dict_index_t::n_core_null_bytes. This is invoked
when table metadata is first loaded into the data dictionary.
dict_boot(): Initialize n_core_null_bytes for the four hard-coded
dictionary tables.
dict_create_index_step(): Initialize n_core_null_bytes. This is
executed as part of CREATE TABLE.
dict_index_build_internal_clust(): Initialize n_core_null_bytes to
NO_CORE_NULL_BYTES if table->supports_instant().
row_create_index_for_mysql(): Initialize n_core_null_bytes for
CREATE TEMPORARY TABLE.
commit_cache_norebuild(): Call the code to rename or enlarge columns
in the cache only if instant ADD COLUMN is not being used.
(Instant ADD COLUMN would copy all column metadata from
instant_table to old_table, including the names and lengths.)
PAGE_INSTANT: A new 13-bit field for storing dict_index_t::n_core_fields.
This is repurposing the 16-bit field PAGE_DIRECTION, of which only the
least significant 3 bits were used. The original byte containing
PAGE_DIRECTION will be accessible via the new constant PAGE_DIRECTION_B.
page_get_instant(), page_set_instant(): Accessors for the PAGE_INSTANT.
page_ptr_get_direction(), page_get_direction(),
page_ptr_set_direction(): Accessors for PAGE_DIRECTION.
page_direction_reset(): Reset PAGE_DIRECTION, PAGE_N_DIRECTION.
page_direction_increment(): Increment PAGE_N_DIRECTION
and set PAGE_DIRECTION.
rec_get_offsets(): Use the 'leaf' parameter for non-debug purposes,
and assume that heap_no is always set.
Initialize all dict_index_t::n_fields for ROW_FORMAT=REDUNDANT records,
even if the record contains fewer fields.
rec_offs_make_valid(): Add the parameter 'leaf'.
rec_copy_prefix_to_dtuple(): Assert that the tuple is only built
on the core fields. Instant ADD COLUMN only applies to the
clustered index, and we should never build a search key that has
more than the PRIMARY KEY and possibly DB_TRX_ID,DB_ROLL_PTR.
All these columns are always present.
dict_index_build_data_tuple(): Remove assertions that would be
duplicated in rec_copy_prefix_to_dtuple().
rec_init_offsets(): Support ROW_FORMAT=REDUNDANT records whose
number of fields is between n_core_fields and n_fields.
cmp_rec_rec_with_match(): Implement the comparison between two
MIN_REC_FLAG records.
trx_t::in_rollback: Make the field available in non-debug builds.
trx_start_for_ddl_low(): Remove dangerous error-tolerance.
A dictionary transaction must be flagged as such before it has generated
any undo log records. This is because trx_undo_assign_undo() will mark
the transaction as a dictionary transaction in the undo log header
right before the very first undo log record is being written.
btr_index_rec_validate(): Account for instant ADD COLUMN
row_undo_ins_remove_clust_rec(): On the rollback of an insert into
SYS_COLUMNS, revert instant ADD COLUMN in the cache by removing the
last column from the table and the clustered index.
row_search_on_row_ref(), row_undo_mod_parse_undo_rec(), row_undo_mod(),
trx_undo_update_rec_get_update(): Handle the 'default row'
as a special case.
dtuple_t::trim(index): Omit a redundant suffix of an index tuple right
before insert or update. After instant ADD COLUMN, if the last fields
of a clustered index tuple match the 'default row', there is no
need to store them. While trimming the entry, we must hold a page latch,
so that the table cannot be emptied and the 'default row' be deleted.
btr_cur_optimistic_update(), btr_cur_pessimistic_update(),
row_upd_clust_rec_by_insert(), row_ins_clust_index_entry_low():
Invoke dtuple_t::trim() if needed.
row_ins_clust_index_entry(): Restore dtuple_t::n_fields after calling
row_ins_clust_index_entry_low().
rec_get_converted_size(), rec_get_converted_size_comp(): Allow the number
of fields to be between n_core_fields and n_fields. Do not support
infimum,supremum. They are never supposed to be stored in dtuple_t,
because page creation nowadays uses a lower-level method for initializing
them.
rec_convert_dtuple_to_rec_comp(): Assign the status bits based on the
number of fields.
btr_cur_trim(): In an update, trim the index entry as needed. For the
'default row', handle rollback specially. For user records, omit
fields that match the 'default row'.
btr_cur_optimistic_delete_func(), btr_cur_pessimistic_delete():
Skip locking and adaptive hash index for the 'default row'.
row_log_table_apply_convert_mrec(): Replace 'default row' values if needed.
In the temporary file that is applied by row_log_table_apply(),
we must identify whether the records contain the extra header for
instantly added columns. For now, we will allocate an additional byte
for this for ROW_T_INSERT and ROW_T_UPDATE records when the source table
has been subject to instant ADD COLUMN. The ROW_T_DELETE records are
fine, as they will be converted and will only contain 'core' columns
(PRIMARY KEY and some system columns) that are converted from dtuple_t.
rec_get_converted_size_temp(), rec_init_offsets_temp(),
rec_convert_dtuple_to_temp(): Add the parameter 'status'.
REC_INFO_DEFAULT_ROW = REC_INFO_MIN_REC_FLAG | REC_STATUS_COLUMNS_ADDED:
An info_bits constant for distinguishing the 'default row' record.
rec_comp_status_t: An enum of the status bit values.
rec_leaf_format: An enum that replaces the bool parameter of
rec_init_offsets_comp_ordinary().
2017-10-06 07:00:05 +03:00
|
|
|
dict_table_add_system_columns(table, heap);
|
2017-10-13 21:59:15 +03:00
|
|
|
/* The column SYS_INDEXES.MERGE_THRESHOLD was "instantly"
|
|
|
|
added in MySQL 5.7 and MariaDB 10.2.2. Assign it DEFAULT NULL.
|
|
|
|
Because of file format compatibility, we must treat SYS_INDEXES
|
|
|
|
as a special case, relaxing some debug assertions
|
|
|
|
for DICT_INDEXES_ID. */
|
|
|
|
dict_table_get_nth_col(table, DICT_COL__SYS_INDEXES__MERGE_THRESHOLD)
|
|
|
|
->def_val.len = UNIV_SQL_NULL;
|
MDEV-11369 Instant ADD COLUMN for InnoDB
For InnoDB tables, adding, dropping and reordering columns has
required a rebuild of the table and all its indexes. Since MySQL 5.6
(and MariaDB 10.0) this has been supported online (LOCK=NONE), allowing
concurrent modification of the tables.
This work revises the InnoDB ROW_FORMAT=REDUNDANT, ROW_FORMAT=COMPACT
and ROW_FORMAT=DYNAMIC so that columns can be appended instantaneously,
with only minor changes performed to the table structure. The counter
innodb_instant_alter_column in INFORMATION_SCHEMA.GLOBAL_STATUS
is incremented whenever a table rebuild operation is converted into
an instant ADD COLUMN operation.
ROW_FORMAT=COMPRESSED tables will not support instant ADD COLUMN.
Some usability limitations will be addressed in subsequent work:
MDEV-13134 Introduce ALTER TABLE attributes ALGORITHM=NOCOPY
and ALGORITHM=INSTANT
MDEV-14016 Allow instant ADD COLUMN, ADD INDEX, LOCK=NONE
The format of the clustered index (PRIMARY KEY) is changed as follows:
(1) The FIL_PAGE_TYPE of the root page will be FIL_PAGE_TYPE_INSTANT,
and a new field PAGE_INSTANT will contain the original number of fields
in the clustered index ('core' fields).
If instant ADD COLUMN has not been used or the table becomes empty,
or the very first instant ADD COLUMN operation is rolled back,
the fields PAGE_INSTANT and FIL_PAGE_TYPE will be reset
to 0 and FIL_PAGE_INDEX.
(2) A special 'default row' record is inserted into the leftmost leaf,
between the page infimum and the first user record. This record is
distinguished by the REC_INFO_MIN_REC_FLAG, and it is otherwise in the
same format as records that contain values for the instantly added
columns. This 'default row' always has the same number of fields as
the clustered index according to the table definition. The values of
'core' fields are to be ignored. For other fields, the 'default row'
will contain the default values as they were during the ALTER TABLE
statement. (If the column default values are changed later, those
values will only be stored in the .frm file. The 'default row' will
contain the original evaluated values, which must be the same for
every row.) The 'default row' must be completely hidden from
higher-level access routines. Assertions have been added to ensure
that no 'default row' is ever present in the adaptive hash index
or in locked records. The 'default row' is never delete-marked.
(3) In clustered index leaf page records, the number of fields must
reside between the number of 'core' fields (dict_index_t::n_core_fields
introduced in this work) and dict_index_t::n_fields. If the number
of fields is less than dict_index_t::n_fields, the missing fields
are replaced with the column value of the 'default row'.
Note: The number of fields in the record may shrink if some of the
last instantly added columns are updated to the value that is
in the 'default row'. The function btr_cur_trim() implements this
'compression' on update and rollback; dtuple::trim() implements it
on insert.
(4) In ROW_FORMAT=COMPACT and ROW_FORMAT=DYNAMIC records, the new
status value REC_STATUS_COLUMNS_ADDED will indicate the presence of
a new record header that will encode n_fields-n_core_fields-1 in
1 or 2 bytes. (In ROW_FORMAT=REDUNDANT records, the record header
always explicitly encodes the number of fields.)
We introduce the undo log record type TRX_UNDO_INSERT_DEFAULT for
covering the insert of the 'default row' record when instant ADD COLUMN
is used for the first time. Subsequent instant ADD COLUMN can use
TRX_UNDO_UPD_EXIST_REC.
This is joint work with Vin Chen (陈福荣) from Tencent. The design
that was discussed in April 2017 would not have allowed import or
export of data files, because instead of the 'default row' it would
have introduced a data dictionary table. The test
rpl.rpl_alter_instant is exactly as contributed in pull request #408.
The test innodb.instant_alter is based on a contributed test.
The redo log record format changes for ROW_FORMAT=DYNAMIC and
ROW_FORMAT=COMPACT are as contributed. (With this change present,
crash recovery from MariaDB 10.3.1 will fail in spectacular ways!)
Also the semantics of higher-level redo log records that modify the
PAGE_INSTANT field is changed. The redo log format version identifier
was already changed to LOG_HEADER_FORMAT_CURRENT=103 in MariaDB 10.3.1.
Everything else has been rewritten by me. Thanks to Elena Stepanova,
the code has been tested extensively.
When rolling back an instant ADD COLUMN operation, we must empty the
PAGE_FREE list after deleting or shortening the 'default row' record,
by calling either btr_page_empty() or btr_page_reorganize(). We must
know the size of each entry in the PAGE_FREE list. If rollback left a
freed copy of the 'default row' in the PAGE_FREE list, we would be
unable to determine its size (if it is in ROW_FORMAT=COMPACT or
ROW_FORMAT=DYNAMIC) because it would contain more fields than the
rolled-back definition of the clustered index.
UNIV_SQL_DEFAULT: A new special constant that designates an instantly
added column that is not present in the clustered index record.
len_is_stored(): Check if a length is an actual length. There are
two magic length values: UNIV_SQL_DEFAULT, UNIV_SQL_NULL.
dict_col_t::def_val: The 'default row' value of the column. If the
column is not added instantly, def_val.len will be UNIV_SQL_DEFAULT.
dict_col_t: Add the accessors is_virtual(), is_nullable(), is_instant(),
instant_value().
dict_col_t::remove_instant(): Remove the 'instant ADD' status of
a column.
dict_col_t::name(const dict_table_t& table): Replaces
dict_table_get_col_name().
dict_index_t::n_core_fields: The original number of fields.
For secondary indexes and if instant ADD COLUMN has not been used,
this will be equal to dict_index_t::n_fields.
dict_index_t::n_core_null_bytes: Number of bytes needed to
represent the null flags; usually equal to UT_BITS_IN_BYTES(n_nullable).
dict_index_t::NO_CORE_NULL_BYTES: Magic value signalling that
n_core_null_bytes was not initialized yet from the clustered index
root page.
dict_index_t: Add the accessors is_instant(), is_clust(),
get_n_nullable(), instant_field_value().
dict_index_t::instant_add_field(): Adjust clustered index metadata
for instant ADD COLUMN.
dict_index_t::remove_instant(): Remove the 'instant ADD' status
of a clustered index when the table becomes empty, or the very first
instant ADD COLUMN operation is rolled back.
dict_table_t: Add the accessors is_instant(), is_temporary(),
supports_instant().
dict_table_t::instant_add_column(): Adjust metadata for
instant ADD COLUMN.
dict_table_t::rollback_instant(): Adjust metadata on the rollback
of instant ADD COLUMN.
prepare_inplace_alter_table_dict(): First create the ctx->new_table,
and only then decide if the table really needs to be rebuilt.
We must split the creation of table or index metadata from the
creation of the dictionary table records and the creation of
the data. In this way, we can transform a table-rebuilding operation
into an instant ADD COLUMN operation. Dictionary objects will only
be added to cache when table rebuilding or index creation is needed.
The ctx->instant_table will never be added to cache.
dict_table_t::add_to_cache(): Modified and renamed from
dict_table_add_to_cache(). Do not modify the table metadata.
Let the callers invoke dict_table_add_system_columns() and if needed,
set can_be_evicted.
dict_create_sys_tables_tuple(), dict_create_table_step(): Omit the
system columns (which will now exist in the dict_table_t object
already at this point).
dict_create_table_step(): Expect the callers to invoke
dict_table_add_system_columns().
pars_create_table(): Before creating the table creation execution
graph, invoke dict_table_add_system_columns().
row_create_table_for_mysql(): Expect all callers to invoke
dict_table_add_system_columns().
create_index_dict(): Replaces row_merge_create_index_graph().
innodb_update_n_cols(): Renamed from innobase_update_n_virtual().
Call my_error() if an error occurs.
btr_cur_instant_init(), btr_cur_instant_init_low(),
btr_cur_instant_root_init():
Load additional metadata from the clustered index and set
dict_index_t::n_core_null_bytes. This is invoked
when table metadata is first loaded into the data dictionary.
dict_boot(): Initialize n_core_null_bytes for the four hard-coded
dictionary tables.
dict_create_index_step(): Initialize n_core_null_bytes. This is
executed as part of CREATE TABLE.
dict_index_build_internal_clust(): Initialize n_core_null_bytes to
NO_CORE_NULL_BYTES if table->supports_instant().
row_create_index_for_mysql(): Initialize n_core_null_bytes for
CREATE TEMPORARY TABLE.
commit_cache_norebuild(): Call the code to rename or enlarge columns
in the cache only if instant ADD COLUMN is not being used.
(Instant ADD COLUMN would copy all column metadata from
instant_table to old_table, including the names and lengths.)
PAGE_INSTANT: A new 13-bit field for storing dict_index_t::n_core_fields.
This is repurposing the 16-bit field PAGE_DIRECTION, of which only the
least significant 3 bits were used. The original byte containing
PAGE_DIRECTION will be accessible via the new constant PAGE_DIRECTION_B.
page_get_instant(), page_set_instant(): Accessors for the PAGE_INSTANT.
page_ptr_get_direction(), page_get_direction(),
page_ptr_set_direction(): Accessors for PAGE_DIRECTION.
page_direction_reset(): Reset PAGE_DIRECTION, PAGE_N_DIRECTION.
page_direction_increment(): Increment PAGE_N_DIRECTION
and set PAGE_DIRECTION.
rec_get_offsets(): Use the 'leaf' parameter for non-debug purposes,
and assume that heap_no is always set.
Initialize all dict_index_t::n_fields for ROW_FORMAT=REDUNDANT records,
even if the record contains fewer fields.
rec_offs_make_valid(): Add the parameter 'leaf'.
rec_copy_prefix_to_dtuple(): Assert that the tuple is only built
on the core fields. Instant ADD COLUMN only applies to the
clustered index, and we should never build a search key that has
more than the PRIMARY KEY and possibly DB_TRX_ID,DB_ROLL_PTR.
All these columns are always present.
dict_index_build_data_tuple(): Remove assertions that would be
duplicated in rec_copy_prefix_to_dtuple().
rec_init_offsets(): Support ROW_FORMAT=REDUNDANT records whose
number of fields is between n_core_fields and n_fields.
cmp_rec_rec_with_match(): Implement the comparison between two
MIN_REC_FLAG records.
trx_t::in_rollback: Make the field available in non-debug builds.
trx_start_for_ddl_low(): Remove dangerous error-tolerance.
A dictionary transaction must be flagged as such before it has generated
any undo log records. This is because trx_undo_assign_undo() will mark
the transaction as a dictionary transaction in the undo log header
right before the very first undo log record is being written.
btr_index_rec_validate(): Account for instant ADD COLUMN
row_undo_ins_remove_clust_rec(): On the rollback of an insert into
SYS_COLUMNS, revert instant ADD COLUMN in the cache by removing the
last column from the table and the clustered index.
row_search_on_row_ref(), row_undo_mod_parse_undo_rec(), row_undo_mod(),
trx_undo_update_rec_get_update(): Handle the 'default row'
as a special case.
dtuple_t::trim(index): Omit a redundant suffix of an index tuple right
before insert or update. After instant ADD COLUMN, if the last fields
of a clustered index tuple match the 'default row', there is no
need to store them. While trimming the entry, we must hold a page latch,
so that the table cannot be emptied and the 'default row' be deleted.
btr_cur_optimistic_update(), btr_cur_pessimistic_update(),
row_upd_clust_rec_by_insert(), row_ins_clust_index_entry_low():
Invoke dtuple_t::trim() if needed.
row_ins_clust_index_entry(): Restore dtuple_t::n_fields after calling
row_ins_clust_index_entry_low().
rec_get_converted_size(), rec_get_converted_size_comp(): Allow the number
of fields to be between n_core_fields and n_fields. Do not support
infimum,supremum. They are never supposed to be stored in dtuple_t,
because page creation nowadays uses a lower-level method for initializing
them.
rec_convert_dtuple_to_rec_comp(): Assign the status bits based on the
number of fields.
btr_cur_trim(): In an update, trim the index entry as needed. For the
'default row', handle rollback specially. For user records, omit
fields that match the 'default row'.
btr_cur_optimistic_delete_func(), btr_cur_pessimistic_delete():
Skip locking and adaptive hash index for the 'default row'.
row_log_table_apply_convert_mrec(): Replace 'default row' values if needed.
In the temporary file that is applied by row_log_table_apply(),
we must identify whether the records contain the extra header for
instantly added columns. For now, we will allocate an additional byte
for this for ROW_T_INSERT and ROW_T_UPDATE records when the source table
has been subject to instant ADD COLUMN. The ROW_T_DELETE records are
fine, as they will be converted and will only contain 'core' columns
(PRIMARY KEY and some system columns) that are converted from dtuple_t.
rec_get_converted_size_temp(), rec_init_offsets_temp(),
rec_convert_dtuple_to_temp(): Add the parameter 'status'.
REC_INFO_DEFAULT_ROW = REC_INFO_MIN_REC_FLAG | REC_STATUS_COLUMNS_ADDED:
An info_bits constant for distinguishing the 'default row' record.
rec_comp_status_t: An enum of the status bit values.
rec_leaf_format: An enum that replaces the bool parameter of
rec_init_offsets_comp_ordinary().
2017-10-06 07:00:05 +03:00
|
|
|
table->add_to_cache();
|
2019-05-17 14:32:53 +03:00
|
|
|
dict_sys.sys_indexes = table;
|
2007-01-30 09:24:18 +00:00
|
|
|
mem_heap_empty(heap);
|
2005-10-27 07:29:40 +00:00
|
|
|
|
2018-03-23 17:25:56 +02:00
|
|
|
index = dict_mem_index_create(table, "CLUST_IND",
|
2006-08-29 09:30:31 +00:00
|
|
|
DICT_UNIQUE | DICT_CLUSTERED, 2);
|
2005-10-27 07:29:40 +00:00
|
|
|
|
2006-02-23 19:25:29 +00:00
|
|
|
dict_mem_index_add_field(index, "TABLE_ID", 0);
|
|
|
|
dict_mem_index_add_field(index, "ID", 0);
|
2005-10-27 07:29:40 +00:00
|
|
|
|
|
|
|
index->id = DICT_INDEXES_ID;
|
2022-11-28 11:34:22 +02:00
|
|
|
err = dict_index_add_to_cache(
|
2018-03-23 17:25:56 +02:00
|
|
|
index, mach_read_from_4(dict_hdr + DICT_HDR_INDEXES));
|
2022-11-28 11:34:22 +02:00
|
|
|
ut_a(err == DB_SUCCESS);
|
MDEV-11369 Instant ADD COLUMN for InnoDB
For InnoDB tables, adding, dropping and reordering columns has
required a rebuild of the table and all its indexes. Since MySQL 5.6
(and MariaDB 10.0) this has been supported online (LOCK=NONE), allowing
concurrent modification of the tables.
This work revises the InnoDB ROW_FORMAT=REDUNDANT, ROW_FORMAT=COMPACT
and ROW_FORMAT=DYNAMIC so that columns can be appended instantaneously,
with only minor changes performed to the table structure. The counter
innodb_instant_alter_column in INFORMATION_SCHEMA.GLOBAL_STATUS
is incremented whenever a table rebuild operation is converted into
an instant ADD COLUMN operation.
ROW_FORMAT=COMPRESSED tables will not support instant ADD COLUMN.
Some usability limitations will be addressed in subsequent work:
MDEV-13134 Introduce ALTER TABLE attributes ALGORITHM=NOCOPY
and ALGORITHM=INSTANT
MDEV-14016 Allow instant ADD COLUMN, ADD INDEX, LOCK=NONE
The format of the clustered index (PRIMARY KEY) is changed as follows:
(1) The FIL_PAGE_TYPE of the root page will be FIL_PAGE_TYPE_INSTANT,
and a new field PAGE_INSTANT will contain the original number of fields
in the clustered index ('core' fields).
If instant ADD COLUMN has not been used or the table becomes empty,
or the very first instant ADD COLUMN operation is rolled back,
the fields PAGE_INSTANT and FIL_PAGE_TYPE will be reset
to 0 and FIL_PAGE_INDEX.
(2) A special 'default row' record is inserted into the leftmost leaf,
between the page infimum and the first user record. This record is
distinguished by the REC_INFO_MIN_REC_FLAG, and it is otherwise in the
same format as records that contain values for the instantly added
columns. This 'default row' always has the same number of fields as
the clustered index according to the table definition. The values of
'core' fields are to be ignored. For other fields, the 'default row'
will contain the default values as they were during the ALTER TABLE
statement. (If the column default values are changed later, those
values will only be stored in the .frm file. The 'default row' will
contain the original evaluated values, which must be the same for
every row.) The 'default row' must be completely hidden from
higher-level access routines. Assertions have been added to ensure
that no 'default row' is ever present in the adaptive hash index
or in locked records. The 'default row' is never delete-marked.
(3) In clustered index leaf page records, the number of fields must
reside between the number of 'core' fields (dict_index_t::n_core_fields
introduced in this work) and dict_index_t::n_fields. If the number
of fields is less than dict_index_t::n_fields, the missing fields
are replaced with the column value of the 'default row'.
Note: The number of fields in the record may shrink if some of the
last instantly added columns are updated to the value that is
in the 'default row'. The function btr_cur_trim() implements this
'compression' on update and rollback; dtuple::trim() implements it
on insert.
(4) In ROW_FORMAT=COMPACT and ROW_FORMAT=DYNAMIC records, the new
status value REC_STATUS_COLUMNS_ADDED will indicate the presence of
a new record header that will encode n_fields-n_core_fields-1 in
1 or 2 bytes. (In ROW_FORMAT=REDUNDANT records, the record header
always explicitly encodes the number of fields.)
We introduce the undo log record type TRX_UNDO_INSERT_DEFAULT for
covering the insert of the 'default row' record when instant ADD COLUMN
is used for the first time. Subsequent instant ADD COLUMN can use
TRX_UNDO_UPD_EXIST_REC.
This is joint work with Vin Chen (陈福荣) from Tencent. The design
that was discussed in April 2017 would not have allowed import or
export of data files, because instead of the 'default row' it would
have introduced a data dictionary table. The test
rpl.rpl_alter_instant is exactly as contributed in pull request #408.
The test innodb.instant_alter is based on a contributed test.
The redo log record format changes for ROW_FORMAT=DYNAMIC and
ROW_FORMAT=COMPACT are as contributed. (With this change present,
crash recovery from MariaDB 10.3.1 will fail in spectacular ways!)
Also the semantics of higher-level redo log records that modify the
PAGE_INSTANT field is changed. The redo log format version identifier
was already changed to LOG_HEADER_FORMAT_CURRENT=103 in MariaDB 10.3.1.
Everything else has been rewritten by me. Thanks to Elena Stepanova,
the code has been tested extensively.
When rolling back an instant ADD COLUMN operation, we must empty the
PAGE_FREE list after deleting or shortening the 'default row' record,
by calling either btr_page_empty() or btr_page_reorganize(). We must
know the size of each entry in the PAGE_FREE list. If rollback left a
freed copy of the 'default row' in the PAGE_FREE list, we would be
unable to determine its size (if it is in ROW_FORMAT=COMPACT or
ROW_FORMAT=DYNAMIC) because it would contain more fields than the
rolled-back definition of the clustered index.
UNIV_SQL_DEFAULT: A new special constant that designates an instantly
added column that is not present in the clustered index record.
len_is_stored(): Check if a length is an actual length. There are
two magic length values: UNIV_SQL_DEFAULT, UNIV_SQL_NULL.
dict_col_t::def_val: The 'default row' value of the column. If the
column is not added instantly, def_val.len will be UNIV_SQL_DEFAULT.
dict_col_t: Add the accessors is_virtual(), is_nullable(), is_instant(),
instant_value().
dict_col_t::remove_instant(): Remove the 'instant ADD' status of
a column.
dict_col_t::name(const dict_table_t& table): Replaces
dict_table_get_col_name().
dict_index_t::n_core_fields: The original number of fields.
For secondary indexes and if instant ADD COLUMN has not been used,
this will be equal to dict_index_t::n_fields.
dict_index_t::n_core_null_bytes: Number of bytes needed to
represent the null flags; usually equal to UT_BITS_IN_BYTES(n_nullable).
dict_index_t::NO_CORE_NULL_BYTES: Magic value signalling that
n_core_null_bytes was not initialized yet from the clustered index
root page.
dict_index_t: Add the accessors is_instant(), is_clust(),
get_n_nullable(), instant_field_value().
dict_index_t::instant_add_field(): Adjust clustered index metadata
for instant ADD COLUMN.
dict_index_t::remove_instant(): Remove the 'instant ADD' status
of a clustered index when the table becomes empty, or the very first
instant ADD COLUMN operation is rolled back.
dict_table_t: Add the accessors is_instant(), is_temporary(),
supports_instant().
dict_table_t::instant_add_column(): Adjust metadata for
instant ADD COLUMN.
dict_table_t::rollback_instant(): Adjust metadata on the rollback
of instant ADD COLUMN.
prepare_inplace_alter_table_dict(): First create the ctx->new_table,
and only then decide if the table really needs to be rebuilt.
We must split the creation of table or index metadata from the
creation of the dictionary table records and the creation of
the data. In this way, we can transform a table-rebuilding operation
into an instant ADD COLUMN operation. Dictionary objects will only
be added to cache when table rebuilding or index creation is needed.
The ctx->instant_table will never be added to cache.
dict_table_t::add_to_cache(): Modified and renamed from
dict_table_add_to_cache(). Do not modify the table metadata.
Let the callers invoke dict_table_add_system_columns() and if needed,
set can_be_evicted.
dict_create_sys_tables_tuple(), dict_create_table_step(): Omit the
system columns (which will now exist in the dict_table_t object
already at this point).
dict_create_table_step(): Expect the callers to invoke
dict_table_add_system_columns().
pars_create_table(): Before creating the table creation execution
graph, invoke dict_table_add_system_columns().
row_create_table_for_mysql(): Expect all callers to invoke
dict_table_add_system_columns().
create_index_dict(): Replaces row_merge_create_index_graph().
innodb_update_n_cols(): Renamed from innobase_update_n_virtual().
Call my_error() if an error occurs.
btr_cur_instant_init(), btr_cur_instant_init_low(),
btr_cur_instant_root_init():
Load additional metadata from the clustered index and set
dict_index_t::n_core_null_bytes. This is invoked
when table metadata is first loaded into the data dictionary.
dict_boot(): Initialize n_core_null_bytes for the four hard-coded
dictionary tables.
dict_create_index_step(): Initialize n_core_null_bytes. This is
executed as part of CREATE TABLE.
dict_index_build_internal_clust(): Initialize n_core_null_bytes to
NO_CORE_NULL_BYTES if table->supports_instant().
row_create_index_for_mysql(): Initialize n_core_null_bytes for
CREATE TEMPORARY TABLE.
commit_cache_norebuild(): Call the code to rename or enlarge columns
in the cache only if instant ADD COLUMN is not being used.
(Instant ADD COLUMN would copy all column metadata from
instant_table to old_table, including the names and lengths.)
PAGE_INSTANT: A new 13-bit field for storing dict_index_t::n_core_fields.
This is repurposing the 16-bit field PAGE_DIRECTION, of which only the
least significant 3 bits were used. The original byte containing
PAGE_DIRECTION will be accessible via the new constant PAGE_DIRECTION_B.
page_get_instant(), page_set_instant(): Accessors for the PAGE_INSTANT.
page_ptr_get_direction(), page_get_direction(),
page_ptr_set_direction(): Accessors for PAGE_DIRECTION.
page_direction_reset(): Reset PAGE_DIRECTION, PAGE_N_DIRECTION.
page_direction_increment(): Increment PAGE_N_DIRECTION
and set PAGE_DIRECTION.
rec_get_offsets(): Use the 'leaf' parameter for non-debug purposes,
and assume that heap_no is always set.
Initialize all dict_index_t::n_fields for ROW_FORMAT=REDUNDANT records,
even if the record contains fewer fields.
rec_offs_make_valid(): Add the parameter 'leaf'.
rec_copy_prefix_to_dtuple(): Assert that the tuple is only built
on the core fields. Instant ADD COLUMN only applies to the
clustered index, and we should never build a search key that has
more than the PRIMARY KEY and possibly DB_TRX_ID,DB_ROLL_PTR.
All these columns are always present.
dict_index_build_data_tuple(): Remove assertions that would be
duplicated in rec_copy_prefix_to_dtuple().
rec_init_offsets(): Support ROW_FORMAT=REDUNDANT records whose
number of fields is between n_core_fields and n_fields.
cmp_rec_rec_with_match(): Implement the comparison between two
MIN_REC_FLAG records.
trx_t::in_rollback: Make the field available in non-debug builds.
trx_start_for_ddl_low(): Remove dangerous error-tolerance.
A dictionary transaction must be flagged as such before it has generated
any undo log records. This is because trx_undo_assign_undo() will mark
the transaction as a dictionary transaction in the undo log header
right before the very first undo log record is being written.
btr_index_rec_validate(): Account for instant ADD COLUMN
row_undo_ins_remove_clust_rec(): On the rollback of an insert into
SYS_COLUMNS, revert instant ADD COLUMN in the cache by removing the
last column from the table and the clustered index.
row_search_on_row_ref(), row_undo_mod_parse_undo_rec(), row_undo_mod(),
trx_undo_update_rec_get_update(): Handle the 'default row'
as a special case.
dtuple_t::trim(index): Omit a redundant suffix of an index tuple right
before insert or update. After instant ADD COLUMN, if the last fields
of a clustered index tuple match the 'default row', there is no
need to store them. While trimming the entry, we must hold a page latch,
so that the table cannot be emptied and the 'default row' be deleted.
btr_cur_optimistic_update(), btr_cur_pessimistic_update(),
row_upd_clust_rec_by_insert(), row_ins_clust_index_entry_low():
Invoke dtuple_t::trim() if needed.
row_ins_clust_index_entry(): Restore dtuple_t::n_fields after calling
row_ins_clust_index_entry_low().
rec_get_converted_size(), rec_get_converted_size_comp(): Allow the number
of fields to be between n_core_fields and n_fields. Do not support
infimum,supremum. They are never supposed to be stored in dtuple_t,
because page creation nowadays uses a lower-level method for initializing
them.
rec_convert_dtuple_to_rec_comp(): Assign the status bits based on the
number of fields.
btr_cur_trim(): In an update, trim the index entry as needed. For the
'default row', handle rollback specially. For user records, omit
fields that match the 'default row'.
btr_cur_optimistic_delete_func(), btr_cur_pessimistic_delete():
Skip locking and adaptive hash index for the 'default row'.
row_log_table_apply_convert_mrec(): Replace 'default row' values if needed.
In the temporary file that is applied by row_log_table_apply(),
we must identify whether the records contain the extra header for
instantly added columns. For now, we will allocate an additional byte
for this for ROW_T_INSERT and ROW_T_UPDATE records when the source table
has been subject to instant ADD COLUMN. The ROW_T_DELETE records are
fine, as they will be converted and will only contain 'core' columns
(PRIMARY KEY and some system columns) that are converted from dtuple_t.
rec_get_converted_size_temp(), rec_init_offsets_temp(),
rec_convert_dtuple_to_temp(): Add the parameter 'status'.
REC_INFO_DEFAULT_ROW = REC_INFO_MIN_REC_FLAG | REC_STATUS_COLUMNS_ADDED:
An info_bits constant for distinguishing the 'default row' record.
rec_comp_status_t: An enum of the status bit values.
rec_leaf_format: An enum that replaces the bool parameter of
rec_init_offsets_comp_ordinary().
2017-10-06 07:00:05 +03:00
|
|
|
ut_ad(!table->is_instant());
|
MDEV-21907: InnoDB: Enable -Wconversion on clang and GCC
The -Wconversion in GCC seems to be stricter than in clang.
GCC at least since version 4.4.7 issues truncation warnings for
assignments to bitfields, while clang 10 appears to only issue
warnings when the sizes in bytes rounded to the nearest integer
powers of 2 are different.
Before GCC 10.0.0, -Wconversion required more casts and would not
allow some operations, such as x<<=1 or x+=1 on a data type that
is narrower than int.
GCC 5 (but not GCC 4, GCC 6, or any later version) is complaining
about x|=y even when x and y are compatible types that are narrower
than int. Hence, we must rewrite some x|=y as
x=static_cast<byte>(x|y) or similar, or we must disable -Wconversion.
In GCC 6 and later, the warning for assigning wider to bitfields
that are narrower than 8, 16, or 32 bits can be suppressed by
applying a bitwise & with the exact bitmask of the bitfield.
For older GCC, we must disable -Wconversion for GCC 4 or 5 in such
cases.
The bitwise negation operator appears to promote short integers
to a wider type, and hence we must add explicit truncation casts
around them. Microsoft Visual C does not allow a static_cast to
truncate a constant, such as static_cast<byte>(1) truncating int.
Hence, we will use the constructor-style cast byte(~1) for such cases.
This has been tested at least with GCC 4.8.5, 5.4.0, 7.4.0, 9.2.1, 10.0.0,
clang 9.0.1, 10.0.0, and MSVC 14.22.27905 (Microsoft Visual Studio 2019)
on 64-bit and 32-bit targets (IA-32, AMD64, POWER 8, POWER 9, ARMv8).
2020-03-12 19:46:41 +02:00
|
|
|
table->indexes.start->n_core_null_bytes = static_cast<uint8_t>(
|
|
|
|
UT_BITS_IN_BYTES(unsigned(table->indexes.start->n_nullable)));
|
2006-09-19 10:14:07 +00:00
|
|
|
|
2005-10-27 07:29:40 +00:00
|
|
|
/*-------------------------*/
|
2021-05-20 14:58:25 +03:00
|
|
|
table = dict_table_t::create(dict_sys.SYS_TABLE[dict_sys.SYS_FIELDS],
|
|
|
|
fil_system.sys_space,
|
|
|
|
DICT_NUM_COLS__SYS_FIELDS, 0, 0, 0);
|
2016-08-12 11:17:45 +03:00
|
|
|
dict_mem_table_add_col(table, heap, "INDEX_ID", DATA_BINARY, 0, 8);
|
2007-01-30 09:24:18 +00:00
|
|
|
dict_mem_table_add_col(table, heap, "POS", DATA_INT, 0, 4);
|
|
|
|
dict_mem_table_add_col(table, heap, "COL_NAME", DATA_BINARY, 0, 0);
|
2005-10-27 07:29:40 +00:00
|
|
|
|
|
|
|
table->id = DICT_FIELDS_ID;
|
2012-08-01 17:27:34 +03:00
|
|
|
|
MDEV-11369 Instant ADD COLUMN for InnoDB
For InnoDB tables, adding, dropping and reordering columns has
required a rebuild of the table and all its indexes. Since MySQL 5.6
(and MariaDB 10.0) this has been supported online (LOCK=NONE), allowing
concurrent modification of the tables.
This work revises the InnoDB ROW_FORMAT=REDUNDANT, ROW_FORMAT=COMPACT
and ROW_FORMAT=DYNAMIC so that columns can be appended instantaneously,
with only minor changes performed to the table structure. The counter
innodb_instant_alter_column in INFORMATION_SCHEMA.GLOBAL_STATUS
is incremented whenever a table rebuild operation is converted into
an instant ADD COLUMN operation.
ROW_FORMAT=COMPRESSED tables will not support instant ADD COLUMN.
Some usability limitations will be addressed in subsequent work:
MDEV-13134 Introduce ALTER TABLE attributes ALGORITHM=NOCOPY
and ALGORITHM=INSTANT
MDEV-14016 Allow instant ADD COLUMN, ADD INDEX, LOCK=NONE
The format of the clustered index (PRIMARY KEY) is changed as follows:
(1) The FIL_PAGE_TYPE of the root page will be FIL_PAGE_TYPE_INSTANT,
and a new field PAGE_INSTANT will contain the original number of fields
in the clustered index ('core' fields).
If instant ADD COLUMN has not been used or the table becomes empty,
or the very first instant ADD COLUMN operation is rolled back,
the fields PAGE_INSTANT and FIL_PAGE_TYPE will be reset
to 0 and FIL_PAGE_INDEX.
(2) A special 'default row' record is inserted into the leftmost leaf,
between the page infimum and the first user record. This record is
distinguished by the REC_INFO_MIN_REC_FLAG, and it is otherwise in the
same format as records that contain values for the instantly added
columns. This 'default row' always has the same number of fields as
the clustered index according to the table definition. The values of
'core' fields are to be ignored. For other fields, the 'default row'
will contain the default values as they were during the ALTER TABLE
statement. (If the column default values are changed later, those
values will only be stored in the .frm file. The 'default row' will
contain the original evaluated values, which must be the same for
every row.) The 'default row' must be completely hidden from
higher-level access routines. Assertions have been added to ensure
that no 'default row' is ever present in the adaptive hash index
or in locked records. The 'default row' is never delete-marked.
(3) In clustered index leaf page records, the number of fields must
reside between the number of 'core' fields (dict_index_t::n_core_fields
introduced in this work) and dict_index_t::n_fields. If the number
of fields is less than dict_index_t::n_fields, the missing fields
are replaced with the column value of the 'default row'.
Note: The number of fields in the record may shrink if some of the
last instantly added columns are updated to the value that is
in the 'default row'. The function btr_cur_trim() implements this
'compression' on update and rollback; dtuple::trim() implements it
on insert.
(4) In ROW_FORMAT=COMPACT and ROW_FORMAT=DYNAMIC records, the new
status value REC_STATUS_COLUMNS_ADDED will indicate the presence of
a new record header that will encode n_fields-n_core_fields-1 in
1 or 2 bytes. (In ROW_FORMAT=REDUNDANT records, the record header
always explicitly encodes the number of fields.)
We introduce the undo log record type TRX_UNDO_INSERT_DEFAULT for
covering the insert of the 'default row' record when instant ADD COLUMN
is used for the first time. Subsequent instant ADD COLUMN can use
TRX_UNDO_UPD_EXIST_REC.
This is joint work with Vin Chen (陈福荣) from Tencent. The design
that was discussed in April 2017 would not have allowed import or
export of data files, because instead of the 'default row' it would
have introduced a data dictionary table. The test
rpl.rpl_alter_instant is exactly as contributed in pull request #408.
The test innodb.instant_alter is based on a contributed test.
The redo log record format changes for ROW_FORMAT=DYNAMIC and
ROW_FORMAT=COMPACT are as contributed. (With this change present,
crash recovery from MariaDB 10.3.1 will fail in spectacular ways!)
Also the semantics of higher-level redo log records that modify the
PAGE_INSTANT field is changed. The redo log format version identifier
was already changed to LOG_HEADER_FORMAT_CURRENT=103 in MariaDB 10.3.1.
Everything else has been rewritten by me. Thanks to Elena Stepanova,
the code has been tested extensively.
When rolling back an instant ADD COLUMN operation, we must empty the
PAGE_FREE list after deleting or shortening the 'default row' record,
by calling either btr_page_empty() or btr_page_reorganize(). We must
know the size of each entry in the PAGE_FREE list. If rollback left a
freed copy of the 'default row' in the PAGE_FREE list, we would be
unable to determine its size (if it is in ROW_FORMAT=COMPACT or
ROW_FORMAT=DYNAMIC) because it would contain more fields than the
rolled-back definition of the clustered index.
UNIV_SQL_DEFAULT: A new special constant that designates an instantly
added column that is not present in the clustered index record.
len_is_stored(): Check if a length is an actual length. There are
two magic length values: UNIV_SQL_DEFAULT, UNIV_SQL_NULL.
dict_col_t::def_val: The 'default row' value of the column. If the
column is not added instantly, def_val.len will be UNIV_SQL_DEFAULT.
dict_col_t: Add the accessors is_virtual(), is_nullable(), is_instant(),
instant_value().
dict_col_t::remove_instant(): Remove the 'instant ADD' status of
a column.
dict_col_t::name(const dict_table_t& table): Replaces
dict_table_get_col_name().
dict_index_t::n_core_fields: The original number of fields.
For secondary indexes and if instant ADD COLUMN has not been used,
this will be equal to dict_index_t::n_fields.
dict_index_t::n_core_null_bytes: Number of bytes needed to
represent the null flags; usually equal to UT_BITS_IN_BYTES(n_nullable).
dict_index_t::NO_CORE_NULL_BYTES: Magic value signalling that
n_core_null_bytes was not initialized yet from the clustered index
root page.
dict_index_t: Add the accessors is_instant(), is_clust(),
get_n_nullable(), instant_field_value().
dict_index_t::instant_add_field(): Adjust clustered index metadata
for instant ADD COLUMN.
dict_index_t::remove_instant(): Remove the 'instant ADD' status
of a clustered index when the table becomes empty, or the very first
instant ADD COLUMN operation is rolled back.
dict_table_t: Add the accessors is_instant(), is_temporary(),
supports_instant().
dict_table_t::instant_add_column(): Adjust metadata for
instant ADD COLUMN.
dict_table_t::rollback_instant(): Adjust metadata on the rollback
of instant ADD COLUMN.
prepare_inplace_alter_table_dict(): First create the ctx->new_table,
and only then decide if the table really needs to be rebuilt.
We must split the creation of table or index metadata from the
creation of the dictionary table records and the creation of
the data. In this way, we can transform a table-rebuilding operation
into an instant ADD COLUMN operation. Dictionary objects will only
be added to cache when table rebuilding or index creation is needed.
The ctx->instant_table will never be added to cache.
dict_table_t::add_to_cache(): Modified and renamed from
dict_table_add_to_cache(). Do not modify the table metadata.
Let the callers invoke dict_table_add_system_columns() and if needed,
set can_be_evicted.
dict_create_sys_tables_tuple(), dict_create_table_step(): Omit the
system columns (which will now exist in the dict_table_t object
already at this point).
dict_create_table_step(): Expect the callers to invoke
dict_table_add_system_columns().
pars_create_table(): Before creating the table creation execution
graph, invoke dict_table_add_system_columns().
row_create_table_for_mysql(): Expect all callers to invoke
dict_table_add_system_columns().
create_index_dict(): Replaces row_merge_create_index_graph().
innodb_update_n_cols(): Renamed from innobase_update_n_virtual().
Call my_error() if an error occurs.
btr_cur_instant_init(), btr_cur_instant_init_low(),
btr_cur_instant_root_init():
Load additional metadata from the clustered index and set
dict_index_t::n_core_null_bytes. This is invoked
when table metadata is first loaded into the data dictionary.
dict_boot(): Initialize n_core_null_bytes for the four hard-coded
dictionary tables.
dict_create_index_step(): Initialize n_core_null_bytes. This is
executed as part of CREATE TABLE.
dict_index_build_internal_clust(): Initialize n_core_null_bytes to
NO_CORE_NULL_BYTES if table->supports_instant().
row_create_index_for_mysql(): Initialize n_core_null_bytes for
CREATE TEMPORARY TABLE.
commit_cache_norebuild(): Call the code to rename or enlarge columns
in the cache only if instant ADD COLUMN is not being used.
(Instant ADD COLUMN would copy all column metadata from
instant_table to old_table, including the names and lengths.)
PAGE_INSTANT: A new 13-bit field for storing dict_index_t::n_core_fields.
This is repurposing the 16-bit field PAGE_DIRECTION, of which only the
least significant 3 bits were used. The original byte containing
PAGE_DIRECTION will be accessible via the new constant PAGE_DIRECTION_B.
page_get_instant(), page_set_instant(): Accessors for the PAGE_INSTANT.
page_ptr_get_direction(), page_get_direction(),
page_ptr_set_direction(): Accessors for PAGE_DIRECTION.
page_direction_reset(): Reset PAGE_DIRECTION, PAGE_N_DIRECTION.
page_direction_increment(): Increment PAGE_N_DIRECTION
and set PAGE_DIRECTION.
rec_get_offsets(): Use the 'leaf' parameter for non-debug purposes,
and assume that heap_no is always set.
Initialize all dict_index_t::n_fields for ROW_FORMAT=REDUNDANT records,
even if the record contains fewer fields.
rec_offs_make_valid(): Add the parameter 'leaf'.
rec_copy_prefix_to_dtuple(): Assert that the tuple is only built
on the core fields. Instant ADD COLUMN only applies to the
clustered index, and we should never build a search key that has
more than the PRIMARY KEY and possibly DB_TRX_ID,DB_ROLL_PTR.
All these columns are always present.
dict_index_build_data_tuple(): Remove assertions that would be
duplicated in rec_copy_prefix_to_dtuple().
rec_init_offsets(): Support ROW_FORMAT=REDUNDANT records whose
number of fields is between n_core_fields and n_fields.
cmp_rec_rec_with_match(): Implement the comparison between two
MIN_REC_FLAG records.
trx_t::in_rollback: Make the field available in non-debug builds.
trx_start_for_ddl_low(): Remove dangerous error-tolerance.
A dictionary transaction must be flagged as such before it has generated
any undo log records. This is because trx_undo_assign_undo() will mark
the transaction as a dictionary transaction in the undo log header
right before the very first undo log record is being written.
btr_index_rec_validate(): Account for instant ADD COLUMN
row_undo_ins_remove_clust_rec(): On the rollback of an insert into
SYS_COLUMNS, revert instant ADD COLUMN in the cache by removing the
last column from the table and the clustered index.
row_search_on_row_ref(), row_undo_mod_parse_undo_rec(), row_undo_mod(),
trx_undo_update_rec_get_update(): Handle the 'default row'
as a special case.
dtuple_t::trim(index): Omit a redundant suffix of an index tuple right
before insert or update. After instant ADD COLUMN, if the last fields
of a clustered index tuple match the 'default row', there is no
need to store them. While trimming the entry, we must hold a page latch,
so that the table cannot be emptied and the 'default row' be deleted.
btr_cur_optimistic_update(), btr_cur_pessimistic_update(),
row_upd_clust_rec_by_insert(), row_ins_clust_index_entry_low():
Invoke dtuple_t::trim() if needed.
row_ins_clust_index_entry(): Restore dtuple_t::n_fields after calling
row_ins_clust_index_entry_low().
rec_get_converted_size(), rec_get_converted_size_comp(): Allow the number
of fields to be between n_core_fields and n_fields. Do not support
infimum,supremum. They are never supposed to be stored in dtuple_t,
because page creation nowadays uses a lower-level method for initializing
them.
rec_convert_dtuple_to_rec_comp(): Assign the status bits based on the
number of fields.
btr_cur_trim(): In an update, trim the index entry as needed. For the
'default row', handle rollback specially. For user records, omit
fields that match the 'default row'.
btr_cur_optimistic_delete_func(), btr_cur_pessimistic_delete():
Skip locking and adaptive hash index for the 'default row'.
row_log_table_apply_convert_mrec(): Replace 'default row' values if needed.
In the temporary file that is applied by row_log_table_apply(),
we must identify whether the records contain the extra header for
instantly added columns. For now, we will allocate an additional byte
for this for ROW_T_INSERT and ROW_T_UPDATE records when the source table
has been subject to instant ADD COLUMN. The ROW_T_DELETE records are
fine, as they will be converted and will only contain 'core' columns
(PRIMARY KEY and some system columns) that are converted from dtuple_t.
rec_get_converted_size_temp(), rec_init_offsets_temp(),
rec_convert_dtuple_to_temp(): Add the parameter 'status'.
REC_INFO_DEFAULT_ROW = REC_INFO_MIN_REC_FLAG | REC_STATUS_COLUMNS_ADDED:
An info_bits constant for distinguishing the 'default row' record.
rec_comp_status_t: An enum of the status bit values.
rec_leaf_format: An enum that replaces the bool parameter of
rec_init_offsets_comp_ordinary().
2017-10-06 07:00:05 +03:00
|
|
|
dict_table_add_system_columns(table, heap);
|
|
|
|
table->add_to_cache();
|
2019-05-17 14:32:53 +03:00
|
|
|
dict_sys.sys_fields = table;
|
2007-01-30 09:24:18 +00:00
|
|
|
mem_heap_free(heap);
|
2005-10-27 07:29:40 +00:00
|
|
|
|
2018-03-23 17:25:56 +02:00
|
|
|
index = dict_mem_index_create(table, "CLUST_IND",
|
2006-08-29 09:30:31 +00:00
|
|
|
DICT_UNIQUE | DICT_CLUSTERED, 2);
|
2005-10-27 07:29:40 +00:00
|
|
|
|
2006-02-23 19:25:29 +00:00
|
|
|
dict_mem_index_add_field(index, "INDEX_ID", 0);
|
|
|
|
dict_mem_index_add_field(index, "POS", 0);
|
2005-10-27 07:29:40 +00:00
|
|
|
|
|
|
|
index->id = DICT_FIELDS_ID;
|
2022-11-28 11:34:22 +02:00
|
|
|
err = dict_index_add_to_cache(
|
2018-03-23 17:25:56 +02:00
|
|
|
index, mach_read_from_4(dict_hdr + DICT_HDR_FIELDS));
|
2022-11-28 11:34:22 +02:00
|
|
|
ut_a(err == DB_SUCCESS);
|
MDEV-11369 Instant ADD COLUMN for InnoDB
For InnoDB tables, adding, dropping and reordering columns has
required a rebuild of the table and all its indexes. Since MySQL 5.6
(and MariaDB 10.0) this has been supported online (LOCK=NONE), allowing
concurrent modification of the tables.
This work revises the InnoDB ROW_FORMAT=REDUNDANT, ROW_FORMAT=COMPACT
and ROW_FORMAT=DYNAMIC so that columns can be appended instantaneously,
with only minor changes performed to the table structure. The counter
innodb_instant_alter_column in INFORMATION_SCHEMA.GLOBAL_STATUS
is incremented whenever a table rebuild operation is converted into
an instant ADD COLUMN operation.
ROW_FORMAT=COMPRESSED tables will not support instant ADD COLUMN.
Some usability limitations will be addressed in subsequent work:
MDEV-13134 Introduce ALTER TABLE attributes ALGORITHM=NOCOPY
and ALGORITHM=INSTANT
MDEV-14016 Allow instant ADD COLUMN, ADD INDEX, LOCK=NONE
The format of the clustered index (PRIMARY KEY) is changed as follows:
(1) The FIL_PAGE_TYPE of the root page will be FIL_PAGE_TYPE_INSTANT,
and a new field PAGE_INSTANT will contain the original number of fields
in the clustered index ('core' fields).
If instant ADD COLUMN has not been used or the table becomes empty,
or the very first instant ADD COLUMN operation is rolled back,
the fields PAGE_INSTANT and FIL_PAGE_TYPE will be reset
to 0 and FIL_PAGE_INDEX.
(2) A special 'default row' record is inserted into the leftmost leaf,
between the page infimum and the first user record. This record is
distinguished by the REC_INFO_MIN_REC_FLAG, and it is otherwise in the
same format as records that contain values for the instantly added
columns. This 'default row' always has the same number of fields as
the clustered index according to the table definition. The values of
'core' fields are to be ignored. For other fields, the 'default row'
will contain the default values as they were during the ALTER TABLE
statement. (If the column default values are changed later, those
values will only be stored in the .frm file. The 'default row' will
contain the original evaluated values, which must be the same for
every row.) The 'default row' must be completely hidden from
higher-level access routines. Assertions have been added to ensure
that no 'default row' is ever present in the adaptive hash index
or in locked records. The 'default row' is never delete-marked.
(3) In clustered index leaf page records, the number of fields must
reside between the number of 'core' fields (dict_index_t::n_core_fields
introduced in this work) and dict_index_t::n_fields. If the number
of fields is less than dict_index_t::n_fields, the missing fields
are replaced with the column value of the 'default row'.
Note: The number of fields in the record may shrink if some of the
last instantly added columns are updated to the value that is
in the 'default row'. The function btr_cur_trim() implements this
'compression' on update and rollback; dtuple::trim() implements it
on insert.
(4) In ROW_FORMAT=COMPACT and ROW_FORMAT=DYNAMIC records, the new
status value REC_STATUS_COLUMNS_ADDED will indicate the presence of
a new record header that will encode n_fields-n_core_fields-1 in
1 or 2 bytes. (In ROW_FORMAT=REDUNDANT records, the record header
always explicitly encodes the number of fields.)
We introduce the undo log record type TRX_UNDO_INSERT_DEFAULT for
covering the insert of the 'default row' record when instant ADD COLUMN
is used for the first time. Subsequent instant ADD COLUMN can use
TRX_UNDO_UPD_EXIST_REC.
This is joint work with Vin Chen (陈福荣) from Tencent. The design
that was discussed in April 2017 would not have allowed import or
export of data files, because instead of the 'default row' it would
have introduced a data dictionary table. The test
rpl.rpl_alter_instant is exactly as contributed in pull request #408.
The test innodb.instant_alter is based on a contributed test.
The redo log record format changes for ROW_FORMAT=DYNAMIC and
ROW_FORMAT=COMPACT are as contributed. (With this change present,
crash recovery from MariaDB 10.3.1 will fail in spectacular ways!)
Also the semantics of higher-level redo log records that modify the
PAGE_INSTANT field is changed. The redo log format version identifier
was already changed to LOG_HEADER_FORMAT_CURRENT=103 in MariaDB 10.3.1.
Everything else has been rewritten by me. Thanks to Elena Stepanova,
the code has been tested extensively.
When rolling back an instant ADD COLUMN operation, we must empty the
PAGE_FREE list after deleting or shortening the 'default row' record,
by calling either btr_page_empty() or btr_page_reorganize(). We must
know the size of each entry in the PAGE_FREE list. If rollback left a
freed copy of the 'default row' in the PAGE_FREE list, we would be
unable to determine its size (if it is in ROW_FORMAT=COMPACT or
ROW_FORMAT=DYNAMIC) because it would contain more fields than the
rolled-back definition of the clustered index.
UNIV_SQL_DEFAULT: A new special constant that designates an instantly
added column that is not present in the clustered index record.
len_is_stored(): Check if a length is an actual length. There are
two magic length values: UNIV_SQL_DEFAULT, UNIV_SQL_NULL.
dict_col_t::def_val: The 'default row' value of the column. If the
column is not added instantly, def_val.len will be UNIV_SQL_DEFAULT.
dict_col_t: Add the accessors is_virtual(), is_nullable(), is_instant(),
instant_value().
dict_col_t::remove_instant(): Remove the 'instant ADD' status of
a column.
dict_col_t::name(const dict_table_t& table): Replaces
dict_table_get_col_name().
dict_index_t::n_core_fields: The original number of fields.
For secondary indexes and if instant ADD COLUMN has not been used,
this will be equal to dict_index_t::n_fields.
dict_index_t::n_core_null_bytes: Number of bytes needed to
represent the null flags; usually equal to UT_BITS_IN_BYTES(n_nullable).
dict_index_t::NO_CORE_NULL_BYTES: Magic value signalling that
n_core_null_bytes was not initialized yet from the clustered index
root page.
dict_index_t: Add the accessors is_instant(), is_clust(),
get_n_nullable(), instant_field_value().
dict_index_t::instant_add_field(): Adjust clustered index metadata
for instant ADD COLUMN.
dict_index_t::remove_instant(): Remove the 'instant ADD' status
of a clustered index when the table becomes empty, or the very first
instant ADD COLUMN operation is rolled back.
dict_table_t: Add the accessors is_instant(), is_temporary(),
supports_instant().
dict_table_t::instant_add_column(): Adjust metadata for
instant ADD COLUMN.
dict_table_t::rollback_instant(): Adjust metadata on the rollback
of instant ADD COLUMN.
prepare_inplace_alter_table_dict(): First create the ctx->new_table,
and only then decide if the table really needs to be rebuilt.
We must split the creation of table or index metadata from the
creation of the dictionary table records and the creation of
the data. In this way, we can transform a table-rebuilding operation
into an instant ADD COLUMN operation. Dictionary objects will only
be added to cache when table rebuilding or index creation is needed.
The ctx->instant_table will never be added to cache.
dict_table_t::add_to_cache(): Modified and renamed from
dict_table_add_to_cache(). Do not modify the table metadata.
Let the callers invoke dict_table_add_system_columns() and if needed,
set can_be_evicted.
dict_create_sys_tables_tuple(), dict_create_table_step(): Omit the
system columns (which will now exist in the dict_table_t object
already at this point).
dict_create_table_step(): Expect the callers to invoke
dict_table_add_system_columns().
pars_create_table(): Before creating the table creation execution
graph, invoke dict_table_add_system_columns().
row_create_table_for_mysql(): Expect all callers to invoke
dict_table_add_system_columns().
create_index_dict(): Replaces row_merge_create_index_graph().
innodb_update_n_cols(): Renamed from innobase_update_n_virtual().
Call my_error() if an error occurs.
btr_cur_instant_init(), btr_cur_instant_init_low(),
btr_cur_instant_root_init():
Load additional metadata from the clustered index and set
dict_index_t::n_core_null_bytes. This is invoked
when table metadata is first loaded into the data dictionary.
dict_boot(): Initialize n_core_null_bytes for the four hard-coded
dictionary tables.
dict_create_index_step(): Initialize n_core_null_bytes. This is
executed as part of CREATE TABLE.
dict_index_build_internal_clust(): Initialize n_core_null_bytes to
NO_CORE_NULL_BYTES if table->supports_instant().
row_create_index_for_mysql(): Initialize n_core_null_bytes for
CREATE TEMPORARY TABLE.
commit_cache_norebuild(): Call the code to rename or enlarge columns
in the cache only if instant ADD COLUMN is not being used.
(Instant ADD COLUMN would copy all column metadata from
instant_table to old_table, including the names and lengths.)
PAGE_INSTANT: A new 13-bit field for storing dict_index_t::n_core_fields.
This is repurposing the 16-bit field PAGE_DIRECTION, of which only the
least significant 3 bits were used. The original byte containing
PAGE_DIRECTION will be accessible via the new constant PAGE_DIRECTION_B.
page_get_instant(), page_set_instant(): Accessors for the PAGE_INSTANT.
page_ptr_get_direction(), page_get_direction(),
page_ptr_set_direction(): Accessors for PAGE_DIRECTION.
page_direction_reset(): Reset PAGE_DIRECTION, PAGE_N_DIRECTION.
page_direction_increment(): Increment PAGE_N_DIRECTION
and set PAGE_DIRECTION.
rec_get_offsets(): Use the 'leaf' parameter for non-debug purposes,
and assume that heap_no is always set.
Initialize all dict_index_t::n_fields for ROW_FORMAT=REDUNDANT records,
even if the record contains fewer fields.
rec_offs_make_valid(): Add the parameter 'leaf'.
rec_copy_prefix_to_dtuple(): Assert that the tuple is only built
on the core fields. Instant ADD COLUMN only applies to the
clustered index, and we should never build a search key that has
more than the PRIMARY KEY and possibly DB_TRX_ID,DB_ROLL_PTR.
All these columns are always present.
dict_index_build_data_tuple(): Remove assertions that would be
duplicated in rec_copy_prefix_to_dtuple().
rec_init_offsets(): Support ROW_FORMAT=REDUNDANT records whose
number of fields is between n_core_fields and n_fields.
cmp_rec_rec_with_match(): Implement the comparison between two
MIN_REC_FLAG records.
trx_t::in_rollback: Make the field available in non-debug builds.
trx_start_for_ddl_low(): Remove dangerous error-tolerance.
A dictionary transaction must be flagged as such before it has generated
any undo log records. This is because trx_undo_assign_undo() will mark
the transaction as a dictionary transaction in the undo log header
right before the very first undo log record is being written.
btr_index_rec_validate(): Account for instant ADD COLUMN
row_undo_ins_remove_clust_rec(): On the rollback of an insert into
SYS_COLUMNS, revert instant ADD COLUMN in the cache by removing the
last column from the table and the clustered index.
row_search_on_row_ref(), row_undo_mod_parse_undo_rec(), row_undo_mod(),
trx_undo_update_rec_get_update(): Handle the 'default row'
as a special case.
dtuple_t::trim(index): Omit a redundant suffix of an index tuple right
before insert or update. After instant ADD COLUMN, if the last fields
of a clustered index tuple match the 'default row', there is no
need to store them. While trimming the entry, we must hold a page latch,
so that the table cannot be emptied and the 'default row' be deleted.
btr_cur_optimistic_update(), btr_cur_pessimistic_update(),
row_upd_clust_rec_by_insert(), row_ins_clust_index_entry_low():
Invoke dtuple_t::trim() if needed.
row_ins_clust_index_entry(): Restore dtuple_t::n_fields after calling
row_ins_clust_index_entry_low().
rec_get_converted_size(), rec_get_converted_size_comp(): Allow the number
of fields to be between n_core_fields and n_fields. Do not support
infimum,supremum. They are never supposed to be stored in dtuple_t,
because page creation nowadays uses a lower-level method for initializing
them.
rec_convert_dtuple_to_rec_comp(): Assign the status bits based on the
number of fields.
btr_cur_trim(): In an update, trim the index entry as needed. For the
'default row', handle rollback specially. For user records, omit
fields that match the 'default row'.
btr_cur_optimistic_delete_func(), btr_cur_pessimistic_delete():
Skip locking and adaptive hash index for the 'default row'.
row_log_table_apply_convert_mrec(): Replace 'default row' values if needed.
In the temporary file that is applied by row_log_table_apply(),
we must identify whether the records contain the extra header for
instantly added columns. For now, we will allocate an additional byte
for this for ROW_T_INSERT and ROW_T_UPDATE records when the source table
has been subject to instant ADD COLUMN. The ROW_T_DELETE records are
fine, as they will be converted and will only contain 'core' columns
(PRIMARY KEY and some system columns) that are converted from dtuple_t.
rec_get_converted_size_temp(), rec_init_offsets_temp(),
rec_convert_dtuple_to_temp(): Add the parameter 'status'.
REC_INFO_DEFAULT_ROW = REC_INFO_MIN_REC_FLAG | REC_STATUS_COLUMNS_ADDED:
An info_bits constant for distinguishing the 'default row' record.
rec_comp_status_t: An enum of the status bit values.
rec_leaf_format: An enum that replaces the bool parameter of
rec_init_offsets_comp_ordinary().
2017-10-06 07:00:05 +03:00
|
|
|
ut_ad(!table->is_instant());
|
MDEV-21907: InnoDB: Enable -Wconversion on clang and GCC
The -Wconversion in GCC seems to be stricter than in clang.
GCC at least since version 4.4.7 issues truncation warnings for
assignments to bitfields, while clang 10 appears to only issue
warnings when the sizes in bytes rounded to the nearest integer
powers of 2 are different.
Before GCC 10.0.0, -Wconversion required more casts and would not
allow some operations, such as x<<=1 or x+=1 on a data type that
is narrower than int.
GCC 5 (but not GCC 4, GCC 6, or any later version) is complaining
about x|=y even when x and y are compatible types that are narrower
than int. Hence, we must rewrite some x|=y as
x=static_cast<byte>(x|y) or similar, or we must disable -Wconversion.
In GCC 6 and later, the warning for assigning wider to bitfields
that are narrower than 8, 16, or 32 bits can be suppressed by
applying a bitwise & with the exact bitmask of the bitfield.
For older GCC, we must disable -Wconversion for GCC 4 or 5 in such
cases.
The bitwise negation operator appears to promote short integers
to a wider type, and hence we must add explicit truncation casts
around them. Microsoft Visual C does not allow a static_cast to
truncate a constant, such as static_cast<byte>(1) truncating int.
Hence, we will use the constructor-style cast byte(~1) for such cases.
This has been tested at least with GCC 4.8.5, 5.4.0, 7.4.0, 9.2.1, 10.0.0,
clang 9.0.1, 10.0.0, and MSVC 14.22.27905 (Microsoft Visual Studio 2019)
on 64-bit and 32-bit targets (IA-32, AMD64, POWER 8, POWER 9, ARMv8).
2020-03-12 19:46:41 +02:00
|
|
|
table->indexes.start->n_core_null_bytes = static_cast<uint8_t>(
|
|
|
|
UT_BITS_IN_BYTES(unsigned(table->indexes.start->n_nullable)));
|
2005-10-27 07:29:40 +00:00
|
|
|
|
MDEV-13542: Crashing on corrupted page is unhelpful
The approach to handling corruption that was chosen by Oracle in
commit 177d8b0c125b841c0650d27d735e3b87509dc286
is not really useful. Not only did it actually fail to prevent InnoDB
from crashing, but it is making things worse by blocking attempts to
rescue data from or rebuild a partially readable table.
We will try to prevent crashes in a different way: by propagating
errors up the call stack. We will never mark the clustered index
persistently corrupted, so that data recovery may be attempted by
reading from the table, or by rebuilding the table.
This should also fix MDEV-13680 (crash on btr_page_alloc() failure);
it was extensively tested with innodb_file_per_table=0 and a
non-autoextend system tablespace.
We should now avoid crashes in many cases, such as when a page
cannot be read or allocated, or an inconsistency is detected when
attempting to update multiple pages. We will not crash on double-free,
such as on the recovery of DDL in system tablespace in case something
was corrupted.
Crashes on corrupted data are still possible. The fault injection mechanism
that is introduced in the subsequent commit may help catch more of them.
buf_page_import_corrupt_failure: Remove the fault injection, and instead
corrupt some pages using Perl code in the tests.
btr_cur_pessimistic_insert(): Always reserve extents (except for the
change buffer), in order to prevent a subsequent allocation failure.
btr_pcur_open_at_rnd_pos(): Merged to the only caller ibuf_merge_pages().
btr_assert_not_corrupted(), btr_corruption_report(): Remove.
Similar checks are already part of btr_block_get().
FSEG_MAGIC_N_BYTES: Replaces FSEG_MAGIC_N_VALUE.
dict_hdr_get(), trx_rsegf_get_new(), trx_undo_page_get(),
trx_undo_page_get_s_latched(): Replaced with error-checking calls.
trx_rseg_t::get(mtr_t*): Replaces trx_rsegf_get().
trx_rseg_header_create(): Let the caller update the TRX_SYS page if needed.
trx_sys_create_sys_pages(): Merged with trx_sysf_create().
dict_check_tablespaces_and_store_max_id(): Do not access
DICT_HDR_MAX_SPACE_ID, because it was already recovered in dict_boot().
Merge dict_check_sys_tables() with this function.
dir_pathname(): Replaces os_file_make_new_pathname().
row_undo_ins_remove_sec(): Do not modify the undo page by adding
a terminating NUL byte to the record.
btr_decryption_failed(): Report decryption failures
dict_set_corrupted_by_space(), dict_set_encrypted_by_space(),
dict_set_corrupted_index_cache_only(): Remove.
dict_set_corrupted(): Remove the constant parameter dict_locked=false.
Never flag the clustered index corrupted in SYS_INDEXES, because
that would deny further access to the table. It might be possible to
repair the table by executing ALTER TABLE or OPTIMIZE TABLE, in case
no B-tree leaf page is corrupted.
dict_table_skip_corrupt_index(), dict_table_next_uncorrupted_index(),
row_purge_skip_uncommitted_virtual_index(): Remove, and refactor
the callers to read dict_index_t::type only once.
dict_table_is_corrupted(): Remove.
dict_index_t::is_btree(): Determine if the index is a valid B-tree.
BUF_GET_NO_LATCH, BUF_EVICT_IF_IN_POOL: Remove.
UNIV_BTR_DEBUG: Remove. Any inconsistency will no longer trigger
assertion failures, but error codes being returned.
buf_corrupt_page_release(): Replaced with a direct call to
buf_pool.corrupted_evict().
fil_invalid_page_access_msg(): Never crash on an invalid read;
let the caller of buf_page_get_gen() decide.
btr_pcur_t::restore_position(): Propagate failure status to the caller
by returning CORRUPTED.
opt_search_plan_for_table(): Simplify the code.
row_purge_del_mark(), row_purge_upd_exist_or_extern_func(),
row_undo_ins_remove_sec_rec(), row_undo_mod_upd_del_sec(),
row_undo_mod_del_mark_sec(): Avoid mem_heap_create()/mem_heap_free()
when no secondary indexes exist.
row_undo_mod_upd_exist_sec(): Simplify the code.
row_upd_clust_step(), dict_load_table_one(): Return DB_TABLE_CORRUPT
if the clustered index (and therefore the table) is corrupted, similar
to what we do in row_insert_for_mysql().
fut_get_ptr(): Replace with buf_page_get_gen() calls.
buf_page_get_gen(): Return nullptr and *err=DB_CORRUPTION
if the page is marked as freed. For other modes than
BUF_GET_POSSIBLY_FREED or BUF_PEEK_IF_IN_POOL this will
trigger a debug assertion failure. For BUF_GET_POSSIBLY_FREED,
we will return nullptr for freed pages, so that the callers
can be simplified. The purge of transaction history will be
a new user of BUF_GET_POSSIBLY_FREED, to avoid crashes on
corrupted data.
buf_page_get_low(): Never crash on a corrupted page, but simply
return nullptr.
fseg_page_is_allocated(): Replaces fseg_page_is_free().
fts_drop_common_tables(): Return an error if the transaction
was rolled back.
fil_space_t::set_corrupted(): Report a tablespace as corrupted if
it was not reported already.
fil_space_t::io(): Invoke fil_space_t::set_corrupted() to report
out-of-bounds page access or other errors.
Clean up mtr_t::page_lock()
buf_page_get_low(): Validate the page identifier (to check for
recently read corrupted pages) after acquiring the page latch.
buf_page_t::read_complete(): Flag uninitialized (all-zero) pages
with DB_FAIL. Return DB_PAGE_CORRUPTED on page number mismatch.
mtr_t::defer_drop_ahi(): Renamed from mtr_defer_drop_ahi().
recv_sys_t::free_corrupted_page(): Only set_corrupt_fs()
if any log records exist for the page. We do not mind if read-ahead
produces corrupted (or all-zero) pages that were not actually needed
during recovery.
recv_recover_page(): Return whether the operation succeeded.
recv_sys_t::recover_low(): Simplify the logic. Check for recovery error.
Thanks to Matthias Leich for testing this extensively and to the
authors of https://rr-project.org for making it easy to diagnose
and fix any failures that were found during the testing.
2022-06-06 14:03:22 +03:00
|
|
|
mtr.commit();
|
2023-12-04 09:45:53 +02:00
|
|
|
dict_sys.unlock();
|
MDEV-13542: Crashing on corrupted page is unhelpful
The approach to handling corruption that was chosen by Oracle in
commit 177d8b0c125b841c0650d27d735e3b87509dc286
is not really useful. Not only did it actually fail to prevent InnoDB
from crashing, but it is making things worse by blocking attempts to
rescue data from or rebuild a partially readable table.
We will try to prevent crashes in a different way: by propagating
errors up the call stack. We will never mark the clustered index
persistently corrupted, so that data recovery may be attempted by
reading from the table, or by rebuilding the table.
This should also fix MDEV-13680 (crash on btr_page_alloc() failure);
it was extensively tested with innodb_file_per_table=0 and a
non-autoextend system tablespace.
We should now avoid crashes in many cases, such as when a page
cannot be read or allocated, or an inconsistency is detected when
attempting to update multiple pages. We will not crash on double-free,
such as on the recovery of DDL in system tablespace in case something
was corrupted.
Crashes on corrupted data are still possible. The fault injection mechanism
that is introduced in the subsequent commit may help catch more of them.
buf_page_import_corrupt_failure: Remove the fault injection, and instead
corrupt some pages using Perl code in the tests.
btr_cur_pessimistic_insert(): Always reserve extents (except for the
change buffer), in order to prevent a subsequent allocation failure.
btr_pcur_open_at_rnd_pos(): Merged to the only caller ibuf_merge_pages().
btr_assert_not_corrupted(), btr_corruption_report(): Remove.
Similar checks are already part of btr_block_get().
FSEG_MAGIC_N_BYTES: Replaces FSEG_MAGIC_N_VALUE.
dict_hdr_get(), trx_rsegf_get_new(), trx_undo_page_get(),
trx_undo_page_get_s_latched(): Replaced with error-checking calls.
trx_rseg_t::get(mtr_t*): Replaces trx_rsegf_get().
trx_rseg_header_create(): Let the caller update the TRX_SYS page if needed.
trx_sys_create_sys_pages(): Merged with trx_sysf_create().
dict_check_tablespaces_and_store_max_id(): Do not access
DICT_HDR_MAX_SPACE_ID, because it was already recovered in dict_boot().
Merge dict_check_sys_tables() with this function.
dir_pathname(): Replaces os_file_make_new_pathname().
row_undo_ins_remove_sec(): Do not modify the undo page by adding
a terminating NUL byte to the record.
btr_decryption_failed(): Report decryption failures
dict_set_corrupted_by_space(), dict_set_encrypted_by_space(),
dict_set_corrupted_index_cache_only(): Remove.
dict_set_corrupted(): Remove the constant parameter dict_locked=false.
Never flag the clustered index corrupted in SYS_INDEXES, because
that would deny further access to the table. It might be possible to
repair the table by executing ALTER TABLE or OPTIMIZE TABLE, in case
no B-tree leaf page is corrupted.
dict_table_skip_corrupt_index(), dict_table_next_uncorrupted_index(),
row_purge_skip_uncommitted_virtual_index(): Remove, and refactor
the callers to read dict_index_t::type only once.
dict_table_is_corrupted(): Remove.
dict_index_t::is_btree(): Determine if the index is a valid B-tree.
BUF_GET_NO_LATCH, BUF_EVICT_IF_IN_POOL: Remove.
UNIV_BTR_DEBUG: Remove. Any inconsistency will no longer trigger
assertion failures, but error codes being returned.
buf_corrupt_page_release(): Replaced with a direct call to
buf_pool.corrupted_evict().
fil_invalid_page_access_msg(): Never crash on an invalid read;
let the caller of buf_page_get_gen() decide.
btr_pcur_t::restore_position(): Propagate failure status to the caller
by returning CORRUPTED.
opt_search_plan_for_table(): Simplify the code.
row_purge_del_mark(), row_purge_upd_exist_or_extern_func(),
row_undo_ins_remove_sec_rec(), row_undo_mod_upd_del_sec(),
row_undo_mod_del_mark_sec(): Avoid mem_heap_create()/mem_heap_free()
when no secondary indexes exist.
row_undo_mod_upd_exist_sec(): Simplify the code.
row_upd_clust_step(), dict_load_table_one(): Return DB_TABLE_CORRUPT
if the clustered index (and therefore the table) is corrupted, similar
to what we do in row_insert_for_mysql().
fut_get_ptr(): Replace with buf_page_get_gen() calls.
buf_page_get_gen(): Return nullptr and *err=DB_CORRUPTION
if the page is marked as freed. For other modes than
BUF_GET_POSSIBLY_FREED or BUF_PEEK_IF_IN_POOL this will
trigger a debug assertion failure. For BUF_GET_POSSIBLY_FREED,
we will return nullptr for freed pages, so that the callers
can be simplified. The purge of transaction history will be
a new user of BUF_GET_POSSIBLY_FREED, to avoid crashes on
corrupted data.
buf_page_get_low(): Never crash on a corrupted page, but simply
return nullptr.
fseg_page_is_allocated(): Replaces fseg_page_is_free().
fts_drop_common_tables(): Return an error if the transaction
was rolled back.
fil_space_t::set_corrupted(): Report a tablespace as corrupted if
it was not reported already.
fil_space_t::io(): Invoke fil_space_t::set_corrupted() to report
out-of-bounds page access or other errors.
Clean up mtr_t::page_lock()
buf_page_get_low(): Validate the page identifier (to check for
recently read corrupted pages) after acquiring the page latch.
buf_page_t::read_complete(): Flag uninitialized (all-zero) pages
with DB_FAIL. Return DB_PAGE_CORRUPTED on page number mismatch.
mtr_t::defer_drop_ahi(): Renamed from mtr_defer_drop_ahi().
recv_sys_t::free_corrupted_page(): Only set_corrupt_fs()
if any log records exist for the page. We do not mind if read-ahead
produces corrupted (or all-zero) pages that were not actually needed
during recovery.
recv_recover_page(): Return whether the operation succeeded.
recv_sys_t::recover_low(): Simplify the logic. Check for recovery error.
Thanks to Matthias Leich for testing this extensively and to the
authors of https://rr-project.org for making it easy to diagnose
and fix any failures that were found during the testing.
2022-06-06 14:03:22 +03:00
|
|
|
return err;
|
2005-10-27 07:29:40 +00:00
|
|
|
}
|