Commit graph

413 commits

Author SHA1 Message Date
Sergei Golubchik
0e007344ea mysql-5.5.18 merge 2011-11-03 19:17:05 +01:00
Sergei Golubchik
76f0b94bb0 merge with 5.3
sql/sql_insert.cc:
  CREATE ... IF NOT EXISTS may do nothing, but
  it is still not a failure. don't forget to my_ok it.
  ******
  CREATE ... IF NOT EXISTS may do nothing, but
  it is still not a failure. don't forget to my_ok it.
sql/sql_table.cc:
  small cleanup
  ******
  small cleanup
2011-10-19 21:45:18 +02:00
Daniel Fischer
7450044eb7 merge from 5.5.16 2011-09-21 12:40:41 +02:00
Marko Mäkelä
247ada63af Bug#12547647 UPDATE LOGGING COULD EXCEED LOG PAGE SIZE
This fix was accidentally pushed to mysql-5.1 after the 5.1.59 clone-off in
bzr revision id marko.makela@oracle.com-20110829081642-z0w992a0mrc62s6w
with the fix of Bug#12704861 Corruption after a crash during BLOB update
but not merged to mysql-5.5 and upwards.

In the Barracuda formats, the clustered index record no longer
contains a prefix of off-page columns. Because of this, the undo log
must contain these prefixes, so that purge and multi-versioning will
continue to work. However, this also means that an undo log record can
become too big to fit in an undo log page. (It is a limitation of the
undo log that undo records cannot span across multiple pages.)

In case the checks for undo log size fail when CREATE TABLE or CREATE
INDEX is executed, we need a fallback that blocks a modification
operation when the undo log record would exceed the maximum size.

trx_undo_free_last_page_func(): Renamed from trx_undo_free_page_in_rollback().
Define the trx_t parameter only in debug builds.

trx_undo_free_last_page(): Wrapper for trx_undo_free_last_page_func().
Pass the trx_t parameter only in debug builds.

trx_undo_truncate_end_func(): Renamed from trx_undo_truncate_end().
Define the trx_t parameter only in debug builds. Rewrite a for(;;) loop
as a while loop for clarity.

trx_undo_truncate_end(): Wrapper for from trx_undo_truncate_end_func().
Pass the trx_t parameter only in debug builds.

trx_undo_erase_page_end(): Return TRUE if the page was non-empty
to begin with. Refuse to erase empty pages.

trx_undo_report_row_operation(): If the page for which the undo log
was too big was empty, free the undo page and return DB_TOO_BIG_RECORD.

rb:749 approved by Inaam Rana
2011-09-01 21:48:04 +03:00
Jimmy Yang
d9c3d35437 In innobase_format_name() we should call innobase_convert_name() with
"!is_index_name" instead of "is_index_name", so the table name in the
error message would not be formated as index name.
2011-08-17 02:39:55 -07:00
Jimmy Yang
95fa7fab3b Fix bug #11830883, SUPPORT "CORRUPTED" BIT FOR INNODB TABLES AND INDEXES.
Also addressed issues in bug #11745133, where we could mark a table
corrupted instead of crashing the server when found a corrupted buffer/page
if the table created with innodb_file_per_table on.
2011-08-16 18:07:59 -07:00
Mats Kindahl
cf5e5f837a Merging into mysql-5.5.16-release. 2011-08-15 20:12:11 +02:00
Marko Mäkelä
c8163a16fa Merge mysql-5.1-security to mysql-5.5-security. 2011-08-10 15:03:33 +03:00
Marko Mäkelä
f2f4b19678 Bug#12626794 61240: UNUSED FUNCTIONS ... 2011-08-10 14:56:14 +03:00
Marko Mäkelä
5962cadcfa Merge mysql-5.1 to mysql-5.5. 2011-08-08 12:16:15 +03:00
Inaam Rana
fce189fadb Merge from 5.1 the fix for Bug 12356373 2011-07-19 10:54:59 -04:00
Vladislav Vaintroub
f9cb1467b8 merge Windows performance patches into 5.3 2011-07-05 21:46:53 +02:00
Sergei Golubchik
9809f05199 5.5-merge 2011-07-02 22:08:51 +02:00
Vladislav Vaintroub
4171483b53 Backport Fix for Bug#24509 - 2048 file descriptor limit on windows needs increasing.
The patch replaces the use of the POSIX I/O interfaces in mysys on Windows with 
the Win32 API calls (CreateFile, WriteFile, etc). The Windows HANDLE for the open
 file is stored in the my_file_info struct, along with a flag for append mode 
(because the Windows API does not support opening files in append mode in all cases)
The default max open files has been increased to 16384 and can be increased further
by setting --max-open-files=<value> during the server start.

Noteworthy benefit of this patch is that it removes limits from the table_cache size - 
allowing for more simultaneus users
2011-06-12 15:52:07 +02:00
Sergei Golubchik
9b98cae4cc merge with 5.1-micro 2011-06-07 18:13:02 +02:00
Jon Olav Hauglid
9b076952ec Bug#11853126 RE-ENABLE CONCURRENT READS WHILE CREATING
SECONDARY INDEX IN INNODB

The patches for Bug#11751388 and Bug#11784056 enabled concurrent
reads while creating secondary indexes in InnoDB. However, they
introduced a regression. This regression occured if ALTER TABLE
failed after the index had been added, for example during the
lock upgrade needed to update .FRM. If this happened, InnoDB
and the server got out of sync with regards to which indexes
actually existed. Therefore the patch for Bug#11815600 again
disabled concurrent reads.

This patch re-enables concurrent reads. The original regression
is fixed by splitting the ADD INDEX operation into two parts.
First the new index is created but not made active. This is
done while concurrent reads are allowed. The second part of
the operation makes the index active (or reverts the change).
This is done after lock upgrade, which prevents the original
regression.

In order to implement this change, the patch changes the storage
API for in-place index creation. handler::add_index() is split
into two functions, handler_add_index() and
handler::final_add_index(). The former for creating indexes without
making them visible and the latter for commiting (i.e. making
visible) new indexes or reverting the changes.

Large parts of this patch were written by Marko Mäkelä.

Test case added to innodb_mysql_lock.test.
2011-06-01 10:06:55 +02:00
Jimmy Yang
9e2b7fa7d5 Implement worklog #5743 InnoDB: Lift the limit of index key prefixes.
With this change, the index prefix column length lifted from 767 bytes
to 3072 bytes if "innodb_large_prefix" is set to "true".

rb://603 approved by Marko
2011-05-31 02:12:32 -07:00
Michael Widenius
f197991f41 Merge with 5.1-microseconds
A lot of small fixes and new test cases.

client/mysqlbinlog.cc:
  Cast removed
client/mysqltest.cc:
  Added missing DBUG_RETURN
include/my_pthread.h:
  set_timespec_time_nsec() now only takes one argument
mysql-test/t/date_formats.test:
  Remove --disable_ps_protocl as now also ps supports microseconds
mysys/my_uuid.c:
  Changed to use my_interval_timer() instead of my_getsystime()
mysys/waiting_threads.c:
  Changed to use my_hrtime()
sql/field.h:
  Added bool special_const_compare() for fields that may convert values before compare (like year)
sql/field_conv.cc:
  Added test to get optimal copying of identical temporal values.
sql/item.cc:
  Return that item_int is equal if it's positive, even if unsigned flag is different.
  Fixed Item_cache_str::save_in_field() to have identical null check as other similar functions
  Added proper NULL check to Item_cache_int::save_in_field()
sql/item_cmpfunc.cc:
  Don't call convert_constant_item() if there is nothing that is worth converting.
  Simplified test when years should be converted
sql/item_sum.cc:
  Mark cache values in Item_sum_hybrid as not constants to ensure they are not replaced by other cache values in compare_datetime()
sql/item_timefunc.cc:
  Changed sec_to_time() to take a my_decimal argument to ensure we don't loose any sub seconds.
  Added Item_temporal_func::get_time() (This simplifies some things)
sql/mysql_priv.h:
  Added Lazy_string_decimal()
sql/mysqld.cc:
  Added my_decimal constants max_seconds_for_time_type, time_second_part_factor
sql/table.cc:
  Changed expr_arena to be of type CONVENTIONAL_EXECUTION to ensure that we don't loose any items that are created by fix_fields()
sql/tztime.cc:
  TIME_to_gmt_sec() now sets *in_dst_time_gap in case of errors
  This is needed to be able to detect if timestamp is 0
storage/maria/lockman.c:
  Changed from my_getsystime() to set_timespec_time_nsec()
storage/maria/ma_loghandler.c:
  Changed from my_getsystime() to my_hrtime()
storage/maria/ma_recovery.c:
  Changed from my_getsystime() to mmicrosecond_interval_timer()
storage/maria/unittest/trnman-t.c:
  Changed from my_getsystime() to mmicrosecond_interval_timer()
storage/xtradb/handler/ha_innodb.cc:
  Added support for new time,datetime and timestamp
unittest/mysys/thr_template.c:
  my_getsystime() -> my_interval_timer()
unittest/mysys/waiting_threads-t.c:
  my_getsystime() -> my_interval_timer()
2011-05-28 05:11:32 +03:00
Sergei Golubchik
c1a92f9cae innodb compatibility fix 2011-05-26 19:16:10 +02:00
Dmitry Lenev
861291f1ab Fix for bug #11762012 - "54553: INNODB ASSERTS IN
HA_INNOBASE::UPDATE_ROW, TEMPORARY TABLE, TABLE LOCK".

Attempt to update an InnoDB temporary table under LOCK TABLES
led to assertion failure in both debug and production builds
if this temporary table was explicitly locked for READ. The 
same scenario works fine for MyISAM temporary tables.

The assertion failure was caused by discrepancy between lock 
that was requested on the rows of temporary table at LOCK TABLES
time and by update operation. Since SQL-layer requested a 
read-lock at LOCK TABLES time InnoDB engine assumed that upcoming
statements which are going to be executed under LOCK TABLES will 
only read table and therefore should acquire only S-lock.
An update operation broken this assumption by requesting X-lock.

Possible approaches to fixing this problem are:

1) Skip locking of temporary tables as locking doesn't make any
   sense for connection-local objects.
2) Prohibit changing of temporary table locked by LOCK TABLES ... 
   READ.

Unfortunately both of these approaches have drawbacks which make 
them unviable for stable versions of server.

So this patch takes another approach and changes code in such way
that LOCK TABLES for a temporary table will always request write
lock. In 5.1 version of this patch switch from read lock to write
lock is done inside of InnoDBs handler methods as doing it on 
SQL-layer causes compatibility troubles with FLUSH TABLES WITH
READ LOCK.

mysql-test/suite/innodb/r/innodb_mysql.result:
  Added test for bug #11762012 - "54553: INNODB ASSERTS IN 
  HA_INNOBASE::UPDATE_ROW, TEMPORARY TABLE, TABLE LOCK".
mysql-test/suite/innodb/t/innodb_mysql.test:
  Added test for bug #11762012 - "54553: INNODB ASSERTS IN 
  HA_INNOBASE::UPDATE_ROW, TEMPORARY TABLE, TABLE LOCK".
mysql-test/suite/innodb_plugin/r/innodb_mysql.result:
  Added test for bug #11762012 - "54553: INNODB ASSERTS IN 
  HA_INNOBASE::UPDATE_ROW, TEMPORARY TABLE, TABLE LOCK".
mysql-test/suite/innodb_plugin/t/innodb_mysql.test:
  Added test for bug #11762012 - "54553: INNODB ASSERTS IN 
  HA_INNOBASE::UPDATE_ROW, TEMPORARY TABLE, TABLE LOCK".
storage/innobase/handler/ha_innodb.cc:
  Assume that a temporary table locked by LOCK TABLES can be updated
  even if it was only locked for read and therefore an X-lock should 
  be always requested for such tables.
storage/innodb_plugin/handler/ha_innodb.cc:
  Assume that a temporary table locked by LOCK TABLES can be updated
  even if it was only locked for read and therefore an X-lock should 
  be always requested for such tables.
2011-05-26 17:14:47 +04:00
Michael Widenius
3631146442 Original idea from Zardosht Kasheff to add HA_CLUSTERED_INDEX
- Added a lot of code comments
- Updated get_best_ror_intersec() to prefer index scan on not clustered keys before clustered keys.
- Use HA_CLUSTERED_INDEX to define if one should use HA_MRR_INDEX_ONLY
- For test of using index or filesort to resolve ORDER BY, use HA_CLUSTERED_INDEX flag instead of primary_key_is_clustered()
- Use HA_TABLE_SCAN_ON_INDEX instead of primary_key_is_clustered() to decide if ALTER TABLE ... ORDER BY will have any effect.

sql/ha_partition.h:
  Added comment with warning for code unsafe to use with multiple storage engines at the same time
sql/handler.h:
  Added HA_CLUSTERED_INDEX.
  Documented primary_key_is_clustered()
sql/opt_range.cc:
  Added code comments
  Updated get_best_ror_intersec() to ignore clustered keys.
  Optimized away cpk_scan_used and one instance of current_thd (Simpler code)
  Use HA_CLUSTERED_INDEX to define if one should use HA_MRR_INDEX_ONLY
sql/sql_select.cc:
  Changed comment to #ifdef
  For test of using index or filesort to resolve ORDER BY, use HA_CLUSTERED_INDEX flag instead of primary_key_is_clustered()
  (Change is smaller than what it looks beause of indentation change)
sql/sql_table.cc:
  Use HA_TABLE_SCAN_ON_INDEX instead of primary_key_is_clustered() to decide if ALTER TABLE ... ORDER BY will have any effect.
storage/innobase/handler/ha_innodb.h:
  Added support for HA_CLUSTERED_INDEX
storage/innodb_plugin/handler/ha_innodb.cc:
  Added support for HA_CLUSTERED_INDEX
storage/xtradb/handler/ha_innodb.cc:
  Added support for HA_CLUSTERED_INDEX
2011-05-18 19:26:30 +03:00
Marko Mäkelä
8c988dc61b Bug#12699505 Memory leak in row_create_index_for_mysql()
DB_COL_APPEARS_TWICE_IN_INDEX: Remove. This condition is already
checked and reported by MySQL before passing the index definition to
the storage engine.

row_create_index_for_mysql(): Remove the redundant check for
DB_COL_APPEARS_TWICE_IN_INDEX. When enforcing the column prefix index
limit, invoke dict_mem_index_free(index) to plug the memory leak. In
the loop, use index->n_def instead of dict_index_get_n_fields(index),
because the latter would be 0 for indexes that have not been copied to
the data dictionary cache.

innodb-use-sys-malloc.test:

Add test cases for attempting to trigger the error checks in
row_create_index_for_mysql(). Before MySQL 5.5 and WL#5743, the leak
is only reproducible if ha_innobase::max_supported_key_part_length()
returned a higher limit than the one used in
row_create_index_for_mysql().

In MySQL 5.5 and later, the leak is reproducible with
innodb_large_prefix=true.

rb:688 approved by Jimmy Yang
2011-06-28 15:28:21 +03:00
Michael Widenius
f34be18938 Merge with MariaDB 5.2 2011-05-10 18:17:43 +03:00
unknown
6ed10ee5dd Bug#60309 - Bug#12356829: MYSQL 5.5.9 FOR MAC OSX HAS BUG WITH FOREIGN KEY CONSTRAINTS
The innoDB global variable srv_lower_case_table_names is set to the value of lower_case_table_names declared in mysqld.h server in ha_innodb.cc.  Since this variable can change at runtime, it is reset for each handler call to ::create, ::open, ::rename_table & ::delete_table.

But it is possible for tables to be implicitly opened before an explicit handler call is made when an engine is first started or restarted.  I was able to reproduce that with the testcase in this patch on a version of InnoDB from 2 weeks ago.  It seemed like the change buffer entries for the secondary key was getting put into pages after the restart.  (But I am not sure, I did not write down the call stack while it was reproducing.)  In the current code, the implicit open, which is actually a call to dict_load_foreigns(), does not occur with this testcase.

The change is to replace srv_lower_case_table_names by an interface function in innodb.cc that retrieves the server global variable when it is needed.
2011-04-26 12:55:52 -05:00
Sergei Golubchik
0accbd0364 lots of post-merge changes 2011-04-25 17:22:25 +02:00
Vasil Dimov
99ed0123bd Merge mysql-5.5-innodb -> mysql-5.5 2011-04-21 08:34:21 +03:00
Marko Mäkelä
460a7197bb Merge mysql-5.1-innodb to mysql-5.5-innodb. 2011-04-11 17:03:32 +03:00
Marko Mäkelä
12fbe05c6a Bug #11760042 - 52409: Assertion failure: long semaphore wait
In ha_innobase::create(), we check some things while holding an
exclusive lock on the data dictionary. Defer the locking and the
creation of transactions until after the checks have passed. The
THDVAR could hang due to a mutex wait (see Bug #11750569 - 41163:
deadlock in mysqld: LOCK_global_system_variables and LOCK_open), and
we want to avoid waiting while holding InnoDB mutexes.

innobase_index_name_is_reserved(): Replace the parameter trx_t with
THD, so that the test can be performed before starting an InnoDB
transaction. We only needed trx->mysql_thd.

ha_innobase::create(): Create transaction and lock the data dictionary
only after passing the basic tests.

create_table_def(): Move the IS_MAGIC_TABLE_AND_USER_DENIED_ACCESS
check to ha_innobase::create(). Assign to srv_lower_case_table_names
while holding dict_sys->mutex.

ha_innobase::delete_table(), ha_innobase::rename_table(),
innobase_rename_table(): Assign srv_lower_case_table_names as late as
possible. Here, the variable is not necessarily protected by
dict_sys->mutex.

ha_innobase::add_index(): Invoke innobase_index_name_is_reserved() and
innobase_check_index_keys() before allocating anything.

rb:618 approved by Jimmy Yang
2011-04-11 16:40:28 +03:00
Marko Mäkelä
0ff2a182b6 Bug #11766513 - 59641: Prepared XA transaction in system after hard crash
causes future shutdown hang

InnoDB would hang on shutdown if any XA transactions exist in the
system in the PREPARED state. This has been masked by the fact that
MySQL would roll back any PREPARED transaction on shutdown, in the
spirit of Bug #12161 Xa recovery and client disconnection.

[mysql-test-run] do_shutdown_server: Interpret --shutdown_server 0 as
a request to kill the server immediately without initiating a
shutdown procedure.

xid_cache_insert(): Initialize XID_STATE::rm_error in order to avoid a
bogus error message on XA ROLLBACK of a recovered PREPARED transaction.

innobase_commit_by_xid(), innobase_rollback_by_xid(): Free the InnoDB
transaction object after rolling back a PREPARED transaction.

trx_get_trx_by_xid(): Only consider transactions whose
trx->is_prepared flag is set. The MySQL layer seems to prevent
attempts to roll back connected transactions that are in the PREPARED
state from another connection, but it is better to play it safe. The
is_prepared flag was introduced in the InnoDB Plugin.

trx_n_prepared: A new counter, counting the number of InnoDB
transactions in the PREPARED state.

logs_empty_and_mark_files_at_shutdown(): On shutdown, allow
trx_n_prepared transactions to exist in the system.

trx_undo_free_prepared(), trx_free_prepared(): New functions, to free
the memory objects of PREPARED transactions on shutdown. This is not
needed in the built-in InnoDB, because it would collect all allocated
memory on shutdown. The InnoDB Plugin needs this because of
innodb_use_sys_malloc.

trx_sys_close(): Invoke trx_free_prepared() on all remaining
transactions.
2011-04-07 21:12:54 +03:00
Sunanda Menon
50a1629f64 Merge from mysql-5.5.11-release 2011-04-07 08:59:07 +02:00
Vasil Dimov
7dcb8a0106 Merge mysql-5.5-innodb -> mysql-5.5 2011-04-04 09:12:11 +03:00
Sergei Golubchik
8de6199b16 lp:743017 Diverging results with TIME(3) and ranges depending on the execution plan in 5.1-micro
rewrite get_innobase_type_from_mysql_type() to use types as reported
by the Field objects, instead of relying on ad-hoc assumptions.
2011-03-29 14:48:48 +02:00
Marko Mäkelä
85dbe88d2f Bug#11766305 - 59392: Remove thr0loc.c and ibuf_inside() [part 4 of 4]
ibuf_inside(), ibuf_enter(), ibuf_exit(): Add the parameter mtr. The
flag is no longer kept in the thread-local storage but in the
mini-transaction (mtr->inside_ibuf).

mtr_start(): Clean up the comment and remove the unused return value.
mtr_commit(): Assert !ibuf_inside(mtr) in debug builds.

ibuf_mtr_start(): Like mtr_start(), but sets the flag.
ibuf_mtr_commit(), ibuf_btr_pcur_commit_specify_mtr(): Wrappers that
assert ibuf_inside().

buf_page_get_zip(), buf_page_init_for_read(),
buf_read_ibuf_merge_pages(), fil_io(), ibuf_free_excess_pages(),
ibuf_contract_ext(): Remove assertions on ibuf_inside(), because a
mini-transaction is not available.

buf_read_ahead_linear(): Add the parameter inside_ibuf.

ibuf_restore_pos(): When this function returns FALSE, it commits mtr
and must therefore do ibuf_exit(mtr).

ibuf_delete_rec(): This function commits mtr and must therefore do
ibuf_exit(mtr).

ibuf_rec_get_page_no(), ibuf_rec_get_space(), ibuf_rec_get_info(),
ibuf_rec_get_op_type(), ibuf_build_entry_from_ibuf_rec(),
ibuf_rec_get_volume(), ibuf_get_merge_page_nos(),
ibuf_get_volume_buffered_count(), ibuf_get_entry_counter_low(): Add
the parameter mtr in debug builds, for asserting ibuf_inside(mtr).

rb:585 approved by Sunny Bains
2011-03-24 14:00:14 +02:00
MySQL Build Team
97d7805005 Per Jon Olav, change needed for Bug#11784056 2011-03-22 16:52:03 +01:00
Sunanda Menon
24934b254c merge 2011-03-22 14:34:04 +01:00
Jon Olav Hauglid
fc6ef378d4 Bug#11815600 [ERROR] INNODB COULD NOT FIND INDEX PRIMARY
KEY NO 0 FOR TABLE IN ERROR LOG 

With the changes made by the patches for Bug#11751388 and
Bug#11784056, concurrent reads are allowed while secondary
indexes are created in InnoDB. This means that the metadata
lock on the affected table is not upgraded to exclusive
until the .FRM is updated at the end of ALTER TABLE processing.

The problem was that if this lock upgrade failed for some
reason (e.g. timeout), the index information in the server
and inside InnoDB would be out of sync. This would happen
since the add index operation already was committed inside 
InnoDB but the table metadata inside the server had not been
updated yet.

This patch fixes the problem by (for now) reverting the
effects of the patches for Bug#11751388 and Bug#11784056.
Concurrent reads will now again be blocked during creation
of secondary indexes in InnoDB.

Test case added to innodb_mysql_lock.test.
2011-03-09 16:06:13 +01:00
Jon Olav Hauglid
5e32755e4c Bug #11784056 ENABLE CONCURRENT READS WHILE CREATING
NON-PRIMARY UNIQUE INDEX USING INNODB

This patch adds the HA_INPLACE_ADD_UNIQUE_INDEX_NO_WRITE
capability flag to InnoDB, indicating that concurrent reads
can be allowed while non-primary unique indexes are created.

This is an follow-up to Bug #11751388 which enabled concurrent
reads when creating non-primary non-unique indexes.

Test case added to innodb_mysql_sync.test.
2011-03-07 14:30:49 +01:00
Vasil Dimov
e84efcaf3f Merge mysql-5.5 -> mysql-5.5-innodb 2011-03-02 11:00:48 +02:00
Vasil Dimov
8d8784e930 Use plugin_author also for the InnoDB SE plugin
Move the const variable plugin_author to a common header file that is
being included by both ha_innodb.cc and i_s.cc
2011-02-28 11:07:22 +02:00
Vasil Dimov
17621e724f Change InnoDB plugins author to Oracle Corporation 2011-02-28 11:02:24 +02:00
Vasil Dimov
e99019e570 Non-functional change: use plugin_author instead of hardcoded string.
plugin_author is currently defined as "Innobase Oy" so this change is
a no-op.
2011-02-28 10:39:48 +02:00
Jimmy Yang
56845ed3b4 Fix Bug #11765975 __FILE__ macros expanded to full path instead of relative
in CMake builds

rb://600 approved by Sunny Bains
2011-02-25 00:33:13 -08:00
Vasil Dimov
19372f1094 Merge mysql-5.5-innodb -> mysql-5.5 2011-02-23 19:33:41 +02:00
Sunny Bains
f91ce0413b Revert the max value of innodb_purge_batch_size to 5000. 2011-02-23 17:56:37 +11:00
Sunny Bains
b3c9cc6f21 Bug #11766227: InnoDB purge lag much worse for 5.5.8 versus 5.1
Bug #11766501: Multiple RBS break the get rseg with mininum trx_t::no code during purge
      
Bug# 59291 changes:
      
Main problem is that truncating the UNDO log at the completion of every
trx_purge() call is expensive as the number of rollback segments is increased.
We truncate after a configurable amount of pages. The innodb_purge_batch_size
parameter is used to control when InnoDB does the actual truncate. The truncate
is done once after 128 (or TRX_SYS_N_RSEGS iterations). In other words we
truncate after purge 128 * innodb_purge_batch_size. The smaller the batch
size the quicker we truncate.
      
Introduce a new parameter that allows how many rollback segments to use for
storing REDO information. This is really step 1 in allowing complete control
to the user over rollback space management.
      
New parameters:
    i) innodb_rollback_segments = number of rollback_segments to use
       (default is now 128) dynamic parameter, can be changed anytime.
       Currently there is little benefit in changing it from the default.
      
Optimisations in the patch.
      
    i. Change the O(n) behaviour of trx_rseg_get_on_id() to O(log n)
       Backported from 5.6. Refactor some of the binary heap code.
       Create a new include/ut0bh.ic file.
      
    ii. Avoid truncating the rollback segments after every purge.
      
Related changes that were moved to a separate patch:
      
    i. Purge should not do any flushing, only wait for space to be free so that
       it only does purging of records unless it is held up by a long running
       transaction that is preventing it from progressing.
      
   ii. Give the purge thread preference over transactions when acquiring the
       rseg->mutex during commit. This to avoid purge blocking unnecessarily
       when getting the next rollback segment to purge.
      
Bug #11766501 changes:
      
Add the rseg to the min binary heap under the cover of the kernel mutex and
the binary heap mutex. This ensures the ordering of the min binary heap.
      
The two changes have to be committed together because they share the same
that fixes both issues.
      
rb://567 Approved by: Inaam Rana.
2011-02-22 16:04:08 +11:00
Vasil Dimov
0047afa3b2 Merge mysql-5.5-innodb -> mysql-5.5 2011-02-17 13:57:26 +02:00
Jimmy Yang
57325925db Merge from mysql-5.1-innodb to mysql-5.5-innodb 2011-02-14 02:17:51 -08:00
Jonathan Perkin
f13788c9fd Merge from mysql-5.5.9-release 2011-02-08 14:59:03 +01:00
Vasil Dimov
3bf88775f5 Merge mysql-5.5-innodb -> mysql-5.5 2011-01-30 18:51:37 +02:00
Jon Olav Hauglid
e99f2b1c4e Bug #42230 during add index, cannot do queries on storage engines
that implement add_index

The problem was that ALTER TABLE blocked reads on an InnoDB table
while adding a secondary index, even if this was not needed. It is
only needed for the final step where the .frm file is updated.

The reason queries were blocked, was that ALTER TABLE upgraded the
metadata lock from MDL_SHARED_NO_WRITE (which blocks writes) to
MDL_EXCLUSIVE (which blocks all accesses) before index creation.

The way the server handles index creation, is that storage engines
publish their capabilities to the server and the server determines
which of the following three ways this can be handled: 1) build a
new version of the table; 2) change the existing table but with
exclusive metadata lock; 3) change the existing table but without
metadata lock upgrade.

For InnoDB and secondary index creation, option 3) should have been
selected. However this failed for two reasons. First, InnoDB did
not publish this capability properly.

Second, the ALTER TABLE code failed to made proper use of the
information supplied by the storage engine. A variable
need_lock_for_indexes was set accordingly, but was not later used.
This patch fixes this problem by only doing metadata lock upgrade
before index creation/deletion if this variable has been set.

This patch also changes some of the related terminology used 
in the code. Specifically the use of "fast" and "online" with
respect to ALTER TABLE. "Fast" was used to indicate that an
ALTER TABLE operation could be done without involving a
temporary table. "Fast" has been renamed "in-place" to more
accurately describe the behavior.

"Online" meant that the operation could be done without taking
a table lock. However, in the current implementation writes
are always prohibited during ALTER TABLE and an exclusive
metadata lock is held while updating the .frm, so ALTER TABLE
is not completely online. This patch replaces "online" with 
"in-place", with additional comments indicating if concurrent
reads are allowed during index creation/deletion or not.

An important part of this update of terminology is renaming
of the handler flags used by handlers to indicate if index
creation/deletion can be done in-place and if concurrent reads
are allowed. For example, the HA_ONLINE_ADD_INDEX_NO_WRITES
flag has been renamed to HA_INPLACE_ADD_INDEX_NO_READ_WRITE,
while HA_ONLINE_ADD_INDEX is now HA_INPLACE_ADD_INDEX_NO_WRITE.
Note that this is a rename to clarify current behavior, the
flag values have not changed and no flags have been removed or
added.

Test case added to innodb_mysql_sync.test.
2011-01-26 14:23:29 +01:00