Commit graph

152 commits

Author SHA1 Message Date
Sergei Golubchik
2db62f686e Merge branch '10.0' into 10.1 2015-03-07 13:21:02 +01:00
Sergei Golubchik
6b05688f6d innodb 5.6.23 2015-02-18 17:59:21 +01:00
Sergei Golubchik
853077ad7e Merge branch '10.0' into bb-10.1-merge
Conflicts:
	.bzrignore
	VERSION
	cmake/plugin.cmake
	debian/dist/Debian/control
	debian/dist/Ubuntu/control
	mysql-test/r/join_outer.result
	mysql-test/r/join_outer_jcl6.result
	mysql-test/r/null.result
	mysql-test/r/old-mode.result
	mysql-test/r/union.result
	mysql-test/t/join_outer.test
	mysql-test/t/null.test
	mysql-test/t/old-mode.test
	mysql-test/t/union.test
	packaging/rpm-oel/mysql.spec.in
	scripts/mysql_config.sh
	sql/ha_ndbcluster.cc
	sql/ha_ndbcluster_binlog.cc
	sql/ha_ndbcluster_cond.cc
	sql/item_cmpfunc.h
	sql/lock.cc
	sql/sql_select.cc
	sql/sql_show.cc
	sql/sql_update.cc
	sql/sql_yacc.yy
	storage/innobase/buf/buf0flu.cc
	storage/innobase/fil/fil0fil.cc
	storage/innobase/include/srv0srv.h
	storage/innobase/lock/lock0lock.cc
	storage/tokudb/CMakeLists.txt
	storage/xtradb/buf/buf0flu.cc
	storage/xtradb/fil/fil0fil.cc
	storage/xtradb/include/srv0srv.h
	storage/xtradb/lock/lock0lock.cc
	support-files/mysql.spec.sh
2014-12-02 22:25:16 +01:00
Sergei Golubchik
a9a6bd5256 InnoDB 5.6.21 2014-11-20 16:59:22 +01:00
Jan Lindström
cb37c55768 MDEV-6929: Port Facebook Prefix Index Queries Optimization
Merge Facebook commit 154c579b828a60722a7d9477fc61868c07453d08
and e8f0052f9b112dc786bf9b957ed5b16a5749f7fd authored
by Steaphan Greene from https://github.com/facebook/mysql-5.6

Optimize prefix index queries to skip cluster index lookup when possible.

Currently InnoDB will always fetch the clustered index (primary key
index) for all prefix columns in an index, even when the value of a
particular record is smaller than the prefix length. This change
optimizes that case to use the record from the secondary index and avoid
the extra lookup.

Also adds two status vars that track how effective this is:

innodb_secondary_index_triggered_cluster_reads:
Times secondary index lookup triggered cluster lookup.

innodb_secondary_index_triggered_cluster_reads_avoided:
Times prefix optimization avoided triggering cluster lookup.
2014-11-03 11:18:52 +02:00
Jan Lindström
85c21dbc47 MDEV-6925: Remove bad "" operators.
Merged Facebook commit ec1aac68c74f3c1e558d057c4c9fcfe6edbbea93
authored by Steaphan Greene from https://github.com/facebook/mysql-5.6

In C++11, "" is not parsed as before.  So "A""B" is not the same as "AB".
Instead, whitespace is required, like: "A" "B"
2014-10-26 07:29:37 +02:00
Jan Lindström
3486135bb5 MDEV-6928: Add trx pointer to struct mtr_t
Merge Facebook commit 25295d003cb0c17aa8fb756523923c77250b3294
authored by Steaphan Greene from https://github.com/facebook/mysql-5.6

This adds a pointer to the trx to each mtr.
This allows the trx to be accessed in parts of the code
where it was otherwise not available. This is needed later.
2014-10-24 22:26:31 +03:00
Sergei Golubchik
f62c12b405 Merge 10.0.14 into 10.1 2014-10-15 12:59:13 +02:00
Sergei Golubchik
7aabc2ded2 fixing embedded: WaaS. Wsrep as a Service. 2014-10-01 23:48:34 +02:00
Sergei Golubchik
75796d9ecb InnoDB 5.6.20 2014-09-11 16:42:54 +02:00
Jan Lindström
df4dd593f2 MDEV-6247: Merge 10.0-galera to 10.1.
Merged lp:maria/maria-10.0-galera up to revision 3879.

Added a new functions to handler API to forcefully abort_transaction,
producing fake_trx_id, get_checkpoint and set_checkpoint for XA. These
were added for future possiblity to add more storage engines that
could use galera replication.
2014-08-26 15:43:46 +03:00
Sergei Golubchik
2023fac281 innodb-5.6.19 2014-08-06 20:05:10 +02:00
Kristian Nielsen
45f6262f54 MDEV-5262, MDEV-5914, MDEV-5941, MDEV-6020: Deadlocks during parallel replication causing replication to fail.
After-review changes. Fix InnoDB coding style issues.
2014-07-09 13:02:52 +02:00
Kristian Nielsen
98fc5b3af8 MDEV-5262, MDEV-5914, MDEV-5941, MDEV-6020: Deadlocks during parallel replication causing replication to fail.
After-review changes.

For this patch in 10.0, we do not introduce a new public storage engine API,
we just fix the InnoDB/XtraDB issues. In 10.1, we will make a better public
API that can be used for all storage engines (MDEV-6429).

Eliminate the background thread that did deadlock kills asynchroneously.
Instead, we ensure that the InnoDB/XtraDB code can handle doing the kill from
inside the deadlock detection code (when thd_report_wait_for() needs to kill a
later thread to resolve a deadlock).

(We preserve the part of the original patch that introduces dedicated mutex
and condition for the slave init thread, to remove the abuse of
LOCK_thread_count for start/stop synchronisation of the slave init thread).
2014-07-08 12:54:47 +02:00
unknown
bd4153a8c2 MDEV-5262, MDEV-5914, MDEV-5941, MDEV-6020: Deadlocks during parallel
replication causing replication to fail.

Remove the temporary fix for MDEV-5914, which used READ COMMITTED for parallel
replication worker threads. Replace it with a better, more selective solution.

The issue is with certain edge cases of InnoDB gap locks, for example between
INSERT and ranged DELETE. It is possible for the gap lock set by the DELETE to
block the INSERT, if the DELETE runs first, while the record lock set by
INSERT does not block the DELETE, if the INSERT runs first. This can cause a
conflict between the two in parallel replication on the slave even though they
ran without conflicts on the master.

With this patch, InnoDB will ask the server layer about the two involved
transactions before blocking on a gap lock. If the server layer tells InnoDB
that the transactions are already fixed wrt. commit order, as they are in
parallel replication, InnoDB will ignore the gap lock and allow the two
transactions to proceed in parallel, avoiding the conflict.

Improve the fix for MDEV-6020. When InnoDB itself detects a deadlock, it now
asks the server layer for any preferences about which transaction to roll
back. In case of parallel replication with two transactions T1 and T2 fixed to
commit T1 before T2, the server layer will ask InnoDB to roll back T2 as the
deadlock victim, not T1. This helps in some cases to avoid excessive deadlock
rollback, as T2 will in any case need to wait for T1 to complete before it can
itself commit.

Also some misc. fixes found during development and testing:

 - Remove thd_rpl_is_parallel(), it is not used or needed.

 - Use KILL_CONNECTION instead of KILL_QUERY when a parallel replication
   worker thread is killed to resolve a deadlock with fixed commit
   ordering. There are some cases, eg. in sql/sql_parse.cc, where a KILL_QUERY
   can be ignored if the query otherwise completed successfully, and this
   could cause the deadlock kill to be lost, so that the deadlock was not
   correctly resolved.

 - Fix random test failure due to missing wait_for_binlog_checkpoint.inc.

 - Make sure that deadlock or other temporary errors during parallel
   replication are not printed to the the error log; there were some places
   around the replication code with extra error logging. These conditions can
   occur occasionally and are handled automatically without breaking
   replication, so they should not pollute the error log.

 - Fix handling of rgi->gtid_sub_id. We need to be able to access this also at
   the end of a transaction, to be able to detect and resolve deadlocks due to
   commit ordering. But this value was also used as a flag to mark whether
   record_gtid() had been called, by being set to zero, losing the value. Now,
   introduce a separate flag rgi->gtid_pending, so rgi->gtid_sub_id remains
   valid for the entire duration of the transaction.

 - Fix one place where the code to handle ignored errors called reset_killed()
   unconditionally, even if no error was caught that should be ignored. This
   could cause loss of a deadlock kill signal, breaking deadlock detection and
   resolution.

 - Fix a couple of missing mysql_reset_thd_for_next_command(). This could
   cause a prior error condition to remain for the next event executed,
   causing assertions about errors already being set and possibly giving
   incorrect error handling for following event executions.

 - Fix code that cleared thd->rgi_slave in the parallel replication worker
   threads after each event execution; this caused the deadlock detection and
   handling code to not be able to correctly process the associated
   transactions as belonging to replication worker threads.

 - Remove useless error code in slave_background_kill_request().

 - Fix bug where wfc->wakeup_error was not cleared at
   wait_for_commit::unregister_wait_for_prior_commit(). This could cause the
   error condition to wrongly propagate to a later wait_for_prior_commit(),
   causing spurious ER_PRIOR_COMMIT_FAILED errors.

 - Do not put the binlog background thread into the processlist. It causes
   too many result differences in mtr, but also it probably is not useful
   for users to pollute the process list with a system thread that does not
   really perform any user-visible tasks...
2014-06-10 10:13:15 +02:00
Sergei Golubchik
8ee9d19607 innodb 5.6.17 2014-05-07 17:32:23 +02:00
Sergei Golubchik
e2e5d07b28 MDEV-6184 10.0.11 merge
InnoDB 5.6.16
2014-05-06 09:57:39 +02:00
Sergei Golubchik
15ee97214d InnoDB 5.6.15 merge.
update test results
2014-02-26 19:36:33 +01:00
Sergei Golubchik
27d45e4696 MDEV-5574 Set AUTO_INCREMENT below max value of column.
Update InnoDB to 5.6.14
Apply MySQL-5.6 hack for MySQL Bug#16434374
Move Aria-only HA_RTREE_INDEX from my_base.h to maria_def.h (breaks an assert in InnoDB)
Fix InnoDB memory leak
2014-02-01 09:33:26 +01:00
Jan Lindström
9538bbfce9 MDEV-4808: Assertion: trx->start_file != 0 fails in trx0trx.cc on killing CREATE TABLE query.
Analysis: There is debug assertion ut_ad(trx->start_file != 0); and ut_ad(trx->start_line != 0); on trx_start_low funcition at trx0trx.cc. These fields are initialized on include/trx0trx.h at function trx_start_if_not_started_xa. Thus at trx_prepare_for_mysql function should call trx_start_if_not_started_xa(trx); not trx_start_if_not_started_xa_low(trx) directly;
2013-10-01 13:24:52 +03:00
Michael Widenius
068c61978e Temporary commit of 10.0-merge 2013-03-26 00:03:13 +02:00
Sergei Golubchik
e1f681c99b 10.0-base -> 10.0-monty 2012-10-19 20:38:59 +02:00
unknown
288eeb3a31 MDEV-232: Remove one fsync() from commit phase.
Introduce a new storage engine API method commit_checkpoint_request().
This is used to replace the fsync() at the end of every storage engine
commit with a single fsync() when a binlog is rotated.

Binlog rotation is now done during group commit instead of being
delayed until unlog(), removing some server stall and avoiding an
expensive lock/unlock of LOCK_log inside unlog().
2012-09-13 14:31:29 +02:00
Michael Widenius
1d0f70c2f8 Temporary commit of merge of MariaDB 10.0-base and MySQL 5.6 2012-08-01 17:27:34 +03:00
Sergei Golubchik
d11829654c merge with MySQL 5.5.27
manually checked every change, reverted incorrect or stupid changes.
2012-08-09 17:22:00 +02:00
Narayanan Venkateswaran
164080e4a8 WL#6161 Integrating with InnoDB codebase in MySQL 5.5
Changes in the InnoDB codebase required to compile and
integrate the MEB codebase with MySQL 5.5.

@ storage/innobase/btr/btr0btr.c
  Excluded buffer pool usage from MEB build.
 
  buf_pool_from_bpage calls are in buf0buf.ic, and
  the buffer pool functions from that file are
  disabled in MEB.
@ storage/innobase/buf/buf0buf.c
  Disabling more buffer pool functions unused in MEB.
@ storage/innobase/dict/dict0dict.c
  Disabling dict_ind_free that is unused in MEB.
@ storage/innobase/dict/dict0mem.c
  The include

  #include "ha_prototypes.h"

  Was causing conflicts with definitions in my_global.h

  Linking C executable mysqlbackup
  libinnodb.a(dict0mem.c.o): In function `dict_mem_foreign_table_name_lookup_set':
  dict0mem.c:(.text+0x91c): undefined reference to `innobase_get_lower_case_table_names'
  libinnodb.a(dict0mem.c.o): In function `dict_mem_referenced_table_name_lookup_set':
  dict0mem.c:(.text+0x9fc): undefined reference to `innobase_get_lower_case_table_names'
  libinnodb.a(dict0mem.c.o): In function `dict_mem_foreign_table_name_lookup_set':
  dict0mem.c:(.text+0x96e): undefined reference to `innobase_casedn_str'
  libinnodb.a(dict0mem.c.o): In function `dict_mem_referenced_table_name_lookup_set':
  dict0mem.c:(.text+0xa4e): undefined reference to `innobase_casedn_str'
  collect2: ld returned 1 exit status
  make[2]: *** [mysqlbackup] Error 1

  innobase_get_lower_case_table_names
  innobase_casedn_str
  are functions that are part of ha_innodb.cc that is not part of the build
        
  dict_mem_foreign_table_name_lookup_set
  function is not there in the current codebase, meaning we do not use it in MEB.
@ storage/innobase/fil/fil0fil.c
  The srv_fast_shutdown variable is declared in
  srv0srv.c that is not compiled in the
  mysqlbackup codebase.

  This throws an undeclared error.

  From the Manual
  ---------------

  innodb_fast_shutdown
  --------------------

  The InnoDB shutdown mode. The default value is 1
  as of MySQL 3.23.50, which causes a “fast� shutdown
  (the normal type of shutdown). If the value is 0,
  InnoDB does a full purge and an insert buffer merge
  before a shutdown. These operations can take minutes,
  or even hours in extreme cases. If the value is 1,
  InnoDB skips these operations at shutdown.

  This ideally does not matter from mysqlbackup
  @ storage/innobase/ha/ha0ha.c
  In file included from /home/narayanan/mysql-server/mysql-5.5-meb-rel3.8-innodb-integration-1/storage/innobase/ha/ha0ha.c:34:0:
  /home/narayanan/mysql-server/mysql-5.5-meb-rel3.8-innodb-integration-1/storage/innobase/include/btr0sea.h:286:17: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘*’ token
  make[2]: *** [CMakeFiles/innodb.dir/home/narayanan/mysql-server/mysql-5.5-meb-rel3.8-innodb-integration-1/storage/innobase/ha/ha0ha.c.o] Error 1
  make[1]: *** [CMakeFiles/innodb.dir/all] Error 2
  make: *** [all] Error 2

  # include "sync0rw.h" is excluded from hotbackup compilation in dict0dict.h

  This causes extern rw_lock_t*	btr_search_latch_temp; to throw a failure because
  the definition of rw_lock_t is not found.
@ storage/innobase/include/buf0buf.h
  Excluding buffer pool functions that are unused from the
  MEB codebase.
@ storage/innobase/include/buf0buf.ic
  replicated the exclusion of

  #include "buf0flu.h"
  #include "buf0lru.h"
  #include "buf0rea.h"

  by looking at the current codebase in <meb-trunk>/src/innodb
  @ storage/innobase/include/dict0dict.h
  dict_table_x_lock_indexes, dict_table_x_unlock_indexes, dict_table_is_corrupted,
  dict_index_is_corrupted, buf_block_buf_fix_inc_func are unused in MEB and was
  leading to compilation errors and hence excluded.
@ storage/innobase/include/dict0dict.ic
  dict_table_x_lock_indexes, dict_table_x_unlock_indexes, dict_table_is_corrupted,
  dict_index_is_corrupted, buf_block_buf_fix_inc_func are unused in MEB and was
  leading to compilation errors and hence excluded.
@ storage/innobase/include/log0log.h
  /home/narayanan/mysql-server/mysql-5.5-meb-rel3.8-innodb-integration-1/storage/innobase/include/log0log.h: At top level:
  /home/narayanan/mysql-server/mysql-5.5-meb-rel3.8-innodb-integration-1/storage/innobase/include/log0log.h:767:2: error: expected specifier-qualifier-list before Ã¢â  ‚¬Ëœmutex_t’

  mutex_t definitions were excluded as seen from ambient code
  hence excluding definition for log_flush_order_mutex also.
@ storage/innobase/include/os0file.h
  Bug in InnoDB code, create_mode should have been create.
@ storage/innobase/include/srv0srv.h
  In file included from /home/narayanan/mysql-server/mysql-5.5-meb-rel3.8-innodb-integration-1/storage/innobase/buf/buf0buf.c:50:0:
  /home/narayanan/mysql-server/mysql-5.5-meb-rel3.8-innodb-integration-1/storage/innobase/include/srv0srv.h: At top level:
  /home/narayanan/mysql-server/mysql-5.5-meb-rel3.8-innodb-integration-1/storage/innobase/include/srv0srv.h:120:16: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘srv_use_native_aio’

  srv_use_native_aio - we do not use native aio of the OS anyway from MEB. MEB does not compile
  InnoDB with this option. Hence disabling it.
@ storage/innobase/include/trx0sys.h
  [ 56%] Building C object CMakeFiles/innodb.dir/home/narayanan/mysql-server/mysql-5.5-meb-rel3.8-innodb-integration-1/storage/innobase/trx/trx0sys.c.o
  /home/narayanan/mysql-server/mysql-5.5-meb-rel3.8-innodb-integration-1/storage/innobase/trx/trx0sys.c: In function ‘trx_sys_read_file_format_id’:
  /home/narayanan/mysql-server/mysql-5.5-meb-rel3.8-innodb-integration-1/storage/innobase/trx/trx0sys.c:1499:20: error: ‘TRX_SYS_FILE_FORMAT_TAG_MAGIC_N’   undeclared (first use in this function)
  /home/narayanan/mysql-server/mysql-5.5-meb-rel3.8-innodb-integration-1/storage/innobase/trx/trx0sys.c:1499:20: note: each undeclared identifier is reported only once for  each function it appears in
  /home/narayanan/mysql-server/mysql-5.5-meb-rel3.8-innodb-integration-1/storage/innobase/trx/trx0sys.c: At top level:
  /home/narayanan/mysql-server/mysql-5.5-meb-rel3.8-innodb-integration-1/storage/innobase/include/buf0buf.h:607:1: warning: ‘buf_block_buf_fix_inc_func’ declared ‘static’ but never defined
  make[2]: *** [CMakeFiles/innodb.dir/home/narayanan/mysql-server/mysql-5.5-meb-rel3.8-innodb-integration-1/storage/innobase/trx/trx0sys.c.o] Error 1

  unused calls excluded to enable compilation
@ storage/innobase/mem/mem0dbg.c
    excluding #include "ha_prototypes.h" that lead to definitions in ha_innodb.cc
@ storage/innobase/os/os0file.c
    InnoDB not compiled with aio support from MEB anyway. Hence excluding this from
    the compilation.
@ storage/innobase/page/page0zip.c
  page0zip.c:(.text+0x4e9e): undefined reference to `buf_pool_from_block'
  collect2: ld returned 1 exit status

  buf_pool_from_block defined in buf0buf.ic, most of the file is excluded for compilation of MEB
@ storage/innobase/ut/ut0dbg.c
  excluding #include "ha_prototypes.h" since it leads to definitions in ha_innodb.cc
  innobase_basename(file) is defined in ha_innodb.cc. Hence excluding that also.
@ storage/innobase/ut/ut0ut.c
  cal_tm unused from MEB, was leading to earnings, hence disabling for MEB.
2012-06-07 19:14:26 +05:30
Sergei Golubchik
16c5c53fc2 mysql 5.5.23 merge 2012-04-10 08:28:13 +02:00
Sunny Bains
1e10b60629 Merge from mysql-5.1. 2012-03-28 13:59:06 +11:00
Sunny Bains
83a5a20d81 Merge from mysql-5.0 2012-03-28 13:35:44 +11:00
Sergei Golubchik
20e706689d mysql-5.5.22 merge
mysql-test/suite/innodb/t/group_commit_crash.test:
  remove autoincrement to avoid rbr being used for insert ... select
mysql-test/suite/innodb/t/group_commit_crash_no_optimize_thread.test:
  remove autoincrement to avoid rbr being used for insert ... select
mysys/my_addr_resolve.c:
  a pointer to a buffer is returned to the caller -> the buffer cannot be on the stack
mysys/stacktrace.c:
  my_vsnprintf() is ok here, in 5.5
2012-03-28 01:04:46 +02:00
Sergei Golubchik
4933d21e5d merge with mysql-5.5.21 2012-03-09 08:06:59 +01:00
Marko Mäkelä
4ea57c80b2 Merge mysql-5.1 to mysql-5.5. 2012-02-17 11:52:51 +02:00
Sunny Bains
ec47cb6034 Merge from mysql-5.1-innodb. 2012-02-10 13:04:10 +11:00
Marko Mäkelä
39100cd984 Bug #13651627 Move ut_ad(0) from the beginning to the end of buf_page_print(),
print page dump

buf_page_print(): Remove the ut_ad(0) from the beginning. Add two flags
(enum buf_page_print_flags) that can be bitwise-ORed together:

BUF_PAGE_PRINT_NO_CRASH:
  Do not crash debug builds at the end of buf_page_print().
BUF_PAGE_PRINT_NO_FULL:
  Do not print the full page dump. This can be useful when adding
  diagnostic printout to flushing or to the doublewrite buffer.

trx_sys_doublewrite_init_or_restore_page(): Replace exit(1) with ut_error,
so that we can get a core dump if this extraordinary condition happens.

rb:924 approved by Sunny Bains
2012-02-02 12:31:57 +02:00
Marko Mäkelä
d84c95579b Bug #13413535 61104: INNODB: FAILING ASSERTION: PAGE_GET_N_RECS(PAGE) > 1
This fix does not remove the underlying cause of the assertion
failure. It just works around the problem, allowing a corrupted
secondary index to be fixed by DROP INDEX and CREATE INDEX (or in the
worst case, by re-creating the table).

ibuf_delete(): If the record to be purged is the last one in the page
or it is not delete-marked, refuse to purge it. Instead, write an
error message to the error log and let a debug assertion fail.

ibuf_set_del_mark(): If the record to be delete-marked is not found,
display some more information in the error log and let a debug
assertion fail.

row_undo_mod_del_unmark_sec_and_undo_update(),
row_upd_sec_index_entry(): Let a debug assertion fail when the record
to be delete-marked is not found.

buf_page_print(): Add ut_ad(0) so that corruption will be more
prominent in stress testing with debug binaries. Add ut_ad(0) here and
there where corruption is noticed.

btr_corruption_report(): Display some data on page_is_comp() mismatch.

btr_assert_not_corrupted(): A wrapper around btr_corruption_report().
Assert that page_is_comp() agrees with the table flags.

rb:911 approved by Inaam Rana
2012-01-26 13:24:00 +02:00
Yasufumi Kinoshita
ad6a4986eb Bug#12400341 INNODB CAN LEAVE ORPHAN IBD FILES AROUND
If we meet DB_TOO_MANY_CONCURRENT_TRXS during the execution tab_create_graph from row_create_table_for_mysql(), .ibd file for the table should be created already but was not deleted for the error handling.

rb:875 approved by Jimmy Yang
2012-01-10 14:23:20 +09:00
Yasufumi Kinoshita
115f5e8551 Bug#12400341 INNODB CAN LEAVE ORPHAN IBD FILES AROUND
If we meet DB_TOO_MANY_CONCURRENT_TRXS during the execution tab_create_graph from row_create_table_for_mysql(), .ibd file for the table should be created already but was not deleted for the error handling.

rb:875 approved by Jimmy Yang
2012-01-10 14:18:58 +09:00
Sergei Golubchik
0e007344ea mysql-5.5.18 merge 2011-11-03 19:17:05 +01:00
Marko Mäkelä
91b5e9352a Revert most of revno 3560.9.1 (Bug#12704861)
This was an attempt to address problems with the Bug#12612184 fix.
Even with this follow-up fix, crash recovery can be broken.
Let us fix the bug later.
2011-10-26 11:44:28 +03:00
Sergei Golubchik
76f0b94bb0 merge with 5.3
sql/sql_insert.cc:
  CREATE ... IF NOT EXISTS may do nothing, but
  it is still not a failure. don't forget to my_ok it.
  ******
  CREATE ... IF NOT EXISTS may do nothing, but
  it is still not a failure. don't forget to my_ok it.
sql/sql_table.cc:
  small cleanup
  ******
  small cleanup
2011-10-19 21:45:18 +02:00
Inaam Rana
c95fdd936e Revert original fix for Bug 12612184 and the follow up fix for
Bug 12704861.

Bug 12704861 fix was revno: 3504.1.1 (rb://693)
Bug 12612184 fix was revno: 3445.1.10 (rb://678)
2011-09-30 07:02:19 -04:00
Marko Mäkelä
9aeb3309af Merge mysql-5.1 to mysql-5.5. 2011-09-06 10:14:45 +03:00
Marko Mäkelä
247ada63af Bug#12547647 UPDATE LOGGING COULD EXCEED LOG PAGE SIZE
This fix was accidentally pushed to mysql-5.1 after the 5.1.59 clone-off in
bzr revision id marko.makela@oracle.com-20110829081642-z0w992a0mrc62s6w
with the fix of Bug#12704861 Corruption after a crash during BLOB update
but not merged to mysql-5.5 and upwards.

In the Barracuda formats, the clustered index record no longer
contains a prefix of off-page columns. Because of this, the undo log
must contain these prefixes, so that purge and multi-versioning will
continue to work. However, this also means that an undo log record can
become too big to fit in an undo log page. (It is a limitation of the
undo log that undo records cannot span across multiple pages.)

In case the checks for undo log size fail when CREATE TABLE or CREATE
INDEX is executed, we need a fallback that blocks a modification
operation when the undo log record would exceed the maximum size.

trx_undo_free_last_page_func(): Renamed from trx_undo_free_page_in_rollback().
Define the trx_t parameter only in debug builds.

trx_undo_free_last_page(): Wrapper for trx_undo_free_last_page_func().
Pass the trx_t parameter only in debug builds.

trx_undo_truncate_end_func(): Renamed from trx_undo_truncate_end().
Define the trx_t parameter only in debug builds. Rewrite a for(;;) loop
as a while loop for clarity.

trx_undo_truncate_end(): Wrapper for from trx_undo_truncate_end_func().
Pass the trx_t parameter only in debug builds.

trx_undo_erase_page_end(): Return TRUE if the page was non-empty
to begin with. Refuse to erase empty pages.

trx_undo_report_row_operation(): If the page for which the undo log
was too big was empty, free the undo page and return DB_TOO_BIG_RECORD.

rb:749 approved by Inaam Rana
2011-09-01 21:48:04 +03:00
Marko Mäkelä
7c20cd1b5d Merge mysql-5.1 to mysql-5.5. 2011-08-29 11:22:43 +03:00
Marko Mäkelä
41f229cd9e Bug#12704861 Corruption after a crash during BLOB update
The fix of Bug#12612184 broke crash recovery. When a record that
contains off-page columns (BLOBs) is updated, we must first write redo
log about the BLOB page writes, and only after that write the redo log
about the B-tree changes. The buggy fix would log the B-tree changes
first, meaning that after recovery, we could end up having a record
that contains a null BLOB pointer.

Because we will be redo logging the writes off the off-page columns
before the B-tree changes, we must make sure that the pages chosen for
the off-page columns are free both before and after the B-tree
changes. In this way, the worst thing that can happen in crash
recovery is that the BLOBs are written to free pages, but the B-tree
changes are not applied. The BLOB pages would correctly remain free in
this case. To achieve this, we must allocate the BLOB pages in the
mini-transaction of the B-tree operation. A further quirk is that BLOB
pages are allocated from the same file segment as leaf pages. Because
of this, we must temporarily "hide" any leaf pages that were freed
during the B-tree operation by "fake allocating" them prior to writing
the BLOBs, and freeing them again before the mtr_commit() of the
B-tree operation, in btr_mark_freed_leaves().

btr_cur_mtr_commit_and_start(): Remove this faulty function that was
introduced in the Bug#12612184 fix. The problem that this function was
trying to address was that when we did mtr_commit() the BLOB writes
before the mtr_commit() of the update, the new BLOB pages could have
overwritten clustered index B-tree leaf pages that were freed during
the update. If recovery applied the redo log of the BLOB writes but
did not see the log of the record update, the index tree would be
corrupted. The correct solution is to make the freed clustered index
pages unavailable to the BLOB allocation. This function is also a
likely culprit of InnoDB hangs that were observed when testing the
Bug#12612184 fix.

btr_mark_freed_leaves(): Mark all freed clustered index leaf pages of
a mini-transaction allocated (nonfree=TRUE) before storing the BLOBs,
or freed (nonfree=FALSE) before committing the mini-transaction.

btr_freed_leaves_validate(): A debug function for checking that all
clustered index leaf pages that have been marked free in the
mini-transaction are consistent (have not been zeroed out).

btr_page_alloc_low(): Refactored from btr_page_alloc(). Return the
number of the allocated page, or FIL_NULL if out of space. Add the
parameter "mtr_t* init_mtr" for specifying the mini-transaction where
the page should be initialized, or if this is a "fake allocation"
(init_mtr=NULL) by btr_mark_freed_leaves(nonfree=TRUE).

btr_page_alloc(): Add the parameter init_mtr, allowing the page to be
initialized and X-latched in a different mini-transaction than the one
that is used for the allocation. Invoke btr_page_alloc_low(). If a
clustered index leaf page was previously freed in mtr, remove it from
the memo of previously freed pages.

btr_page_free(): Assert that the page is a B-tree page and it has been
X-latched by the mini-transaction. If the freed page was a leaf page
of a clustered index, link it by a MTR_MEMO_FREE_CLUST_LEAF marker to
the mini-transaction.

btr_store_big_rec_extern_fields_func(): Add the parameter alloc_mtr,
which is NULL (old behaviour in inserts) and the same as local_mtr in
updates. If alloc_mtr!=NULL, the BLOB pages will be allocated from it
instead of the mini-transaction that is used for writing the BLOBs.

fsp_alloc_from_free_frag(): Refactored from
fsp_alloc_free_page(). Allocate the specified page from a partially
free extent.

fseg_alloc_free_page_low(), fseg_alloc_free_page_general(): Add the
parameter "mtr_t* init_mtr" for specifying the mini-transaction where
the page should be initialized, or NULL if this is a "fake allocation"
that prevents the reuse of a previously freed B-tree page for BLOB
storage. If init_mtr==NULL, try harder to reallocate the specified page
and assert that it succeeded.

fsp_alloc_free_page(): Add the parameter "mtr_t* init_mtr" for
specifying the mini-transaction where the page should be initialized.
Do not allow init_mtr == NULL, because this function is never to be
used for "fake allocations".

mtr_t: Add the operation MTR_MEMO_FREE_CLUST_LEAF and the flag
mtr->freed_clust_leaf for quickly determining if any
MTR_MEMO_FREE_CLUST_LEAF operations have been posted.

row_ins_index_entry_low(): When columns are being made off-page in
insert-by-update, invoke btr_mark_freed_leaves(nonfree=TRUE) and pass
the mini-transaction as the alloc_mtr to
btr_store_big_rec_extern_fields(). Finally, invoke
btr_mark_freed_leaves(nonfree=FALSE) to avoid leaking pages.

row_build(): Correct a comment, and add a debug assertion that a
record that contains NULL BLOB pointers must be a fresh insert.

row_upd_clust_rec(): When columns are being moved off-page, invoke
btr_mark_freed_leaves(nonfree=TRUE) and pass the mini-transaction as
the alloc_mtr to btr_store_big_rec_extern_fields(). Finally, invoke
btr_mark_freed_leaves(nonfree=FALSE) to avoid leaking pages.

buf_reset_check_index_page_at_flush(): Remove. The function
fsp_init_file_page_low() already sets
bpage->check_index_page_at_flush=FALSE.

There is a known issue in tablespace extension. If the request to
allocate a BLOB page leads to the tablespace being extended, crash
recovery could see BLOB writes to pages that are off the tablespace
file bounds. This should trigger an assertion failure in fil_io() at
crash recovery. The safe thing would be to write redo log about the
tablespace extension to the mini-transaction of the BLOB write, not to
the mini-transaction of the record update. However, there is no redo
log record for file extension in the current redo log format.

rb:693 approved by Sunny Bains
2011-08-29 11:16:42 +03:00
Marko Mäkelä
d5d2c4fa74 Merge mysql-5.1 to mysql-5.5. 2011-06-15 10:30:19 +03:00
Marko Mäkelä
a862937699 Introduce UNIV_BLOB_NULL_DEBUG for temporarily hiding Bug#12650861.
Some ut_a(!rec_offs_any_null_extern()) assertion failures are indicating
genuine BLOB bugs, others are bogus failures when rolling back incomplete
transactions at crash recovery. This needs more work, and until I get a
chance to work on it, other testing must not be disrupted by this.
2011-06-15 10:16:59 +03:00
Marko Mäkelä
34baff7743 Merge mysql-5.1 to mysql-5.5. 2011-06-09 13:59:02 +03:00
Marko Mäkelä
6348b7375a BLOB instrumentation for Bug#12612184 Race condition in row_upd_clust_rec()
If UNIV_DEBUG or UNIV_BLOB_LIGHT_DEBUG is enabled, add
!rec_offs_any_null_extern() assertions, ensuring that records do not
contain null pointers to externally stored columns in inappropriate
places.

btr_cur_optimistic_update(): Assert !rec_offs_any_null_extern().
Incomplete records must never be updated or deleted. This assertion
will cover also the pessimistic route.

row_build(): Assert !rec_offs_any_null_extern(). Search tuples must
never be built from incomplete index entries.

row_rec_to_index_entry(): Assert !rec_offs_any_null_extern() unless
ROW_COPY_DATA is requested. ROW_COPY_DATA is used for
multi-versioning, and therefore it might be valid to copy the most
recent (uncommitted) version while it contains a null pointer to
off-page columns.

row_vers_build_for_consistent_read(),
row_vers_build_for_semi_consistent_read(): Assert !rec_offs_any_null_extern()
on all versions except the most recent one.

trx_undo_prev_version_build(): Assert !rec_offs_any_null_extern() on
the previous version.

rb:682 approved by Sunny Bains
2011-06-09 13:31:15 +03:00
Marko Mäkelä
417a267927 Re-enable the debug assertions for Bug#12650861.
Replace UNIV_BLOB_NULL_DEBUG with UNIV_DEBUG||UNIV_BLOB_LIGHT_DEBUG. 
Fix known bogus failures.

btr_cur_optimistic_update(): If rec_offs_any_null_extern(), assert that
the current transaction is an incomplete transaction that is being
rolled back in crash recovery.

row_build(): If rec_offs_any_null_extern(), assert that the transaction
that last updated the record was recovered during crash recovery
(and will soon be rolled back).
2011-06-16 11:51:04 +03:00