buf_page_t: Remove the buf_pool pointer. Add buf_pool_index:6 next to
buf_fix_count:19 (it was buf_fix_count:25). This will limit the number
of concurrent transactions to somewhere around 524,288.
buf_pool_index(buf_pool): A new function to determine the index of a
buffer pool in buf_pool_ptr[].
buf_pool_ptr: Make this a dynamically allocated array instead of an
array of pointers. Allocate the array in buf_pool_init() and free in
buf_pool_free().
buf_pool_free_instance(): No longer free the buf_pool object, as it is
allocated from a big array.
buf_pool_from_bpage(), buf_pool_from_block(): Move the definitions to
the beginning of the files, because some compilers have (had) problems
with forward definitions of inline functions. Calculate the buffer
pool from buf_pool_index.
buf_pool_from_array(): Add debug assertions for input validation.
Remove trx_t::active_trans. Split into two separate fields with distinct
responsibilities. trx_t::is_registered and trx_t::owns_prepare_mutex.
There are wrapper functions for using this fields in ha_innodb.cc. The
wrapper functions check for invariants.
Fix some formatting to conform to InnoDB guidelines.
Remove innobase_register_stmt() and innobase_register_trx_and_stmt().
Add:
trx_is_started()
trx_deregister_from_2pc()
trx_register_for_2pc()
trx_is_registered_for_2pc()
trx_owns_prepare_commit_mutex_set()
trx_has_prepare_commit_mutex()
rb://479, Approved by Jimmy Yang.
These files were needed when InnoDB Plugin was maintained and distributed
separately from the MySQL 5.1 source tree. They have never been needed in
MySQL 5.5.
storage/innobase/mysql-test:
Patches to the test suite.
storage/innobase/handler/mysql_addons.cc:
Wrappers for private MySQL functions.
Rename buf_pool_watch, buf_pool_mutex, buf_pool_zip_mutex
to buf_pool->watch, buf_pool->mutex, buf_pool->zip_mutex
in comments. Refer to buf_pool->flush_list_mutex instead of
flush_list_mutex.
Remove obsolete declarations of buf_pool_mutex and buf_pool_zip_mutex.
For DATA_MBMINLEN(), cast the result of UNIV_EXPECT to ulint because in GCC
it returns a long causing unsigned/signed comparison warnings.
Approved by Jimmy Yang on IM.
Ensure that fdatasync is properly declared as on Mac OS X, the
function is available but there is no prototype. Also, port a
fix for a warning from the InnoDB plugin over to the builtin.
configure.in:
Check that fdatasync is declared.
mysys/my_sync.c:
Use fdatasync only if it is declared.
storage/innobase/include/ut0dbg.h:
Port over from the plugin a fix for a warning.
Additional fixes in 5.5:
ibuf_set_del_mark(): Add diagnostics when setting a buffered delete-mark fails.
ibuf_delete(): Correct a misleading comment about non-found records.
rec_print(): Add a const qualifier to the index parameter.
Bug #56680 wrong InnoDB results from a case-insensitive covering index
row_search_for_mysql(): When a secondary index record might not be
visible in the current transaction's read view and we consult the
clustered index and optionally some undo log records, return the
relevant columns of the clustered index record to MySQL instead of the
secondary index record.
ibuf_insert_to_index_page_low(): New function, refactored from
ibuf_insert_to_index_page().
ibuf_insert_to_index_page(): When we are inserting a record in place
of a delete-marked record and some fields of the record differ, update
that record just like row_ins_sec_index_entry_by_modify() would do.
btr_cur_update_alloc_zip(): Make the function public.
mysql_row_templ_t: Add clust_rec_field_no.
row_sel_store_mysql_rec(), row_sel_push_cache_row_for_mysql(): Add the
flag rec_clust, for returning data at clust_rec_field_no instead of
rec_field_no. Resurrect the debug assertion that the record not be
marked for deletion. (Bug #55626)
[UNIV_DEBUG || UNIV_IBUF_DEBUG] ibuf_debug, buf_page_get_gen(),
buf_flush_page_try():
Implement innodb_change_buffering_debug=1 for evicting pages from the
buffer pool, so that change buffering will be attempted more
frequently.
row_search_for_mysql(): When a secondary index record might not be
visible in the current transaction's read view and we consult the
clustered index and optionally some undo log records, return the
relevant columns of the clustered index record to MySQL instead of the
secondary index record.
REC_INFO_DELETED_FLAG: Move the definition from rem0rec.ic to rem0rec.h.
ibuf_insert_to_index_page_low(): New function, refactored from
ibuf_insert_to_index_page().
ibuf_insert_to_index_page(): When we are inserting a record in place
of a delete-marked record and some fields of the record differ, update
that record just like row_ins_sec_index_entry_by_modify() would do.
mysql_row_templ_t: Add clust_rec_field_no.
row_sel_store_mysql_rec(), row_sel_push_cache_row_for_mysql(): Add the
flag rec_clust, for returning data at clust_rec_field_no instead of
rec_field_no. Resurrect the debug assertion that the record not be
marked for deletion. (Bug #55626)
buf_LRU_free_block(): Refactored from
buf_LRU_search_and_free_block(). This is needed for the
innodb_change_buffering_debug diagnostics.
[UNIV_DEBUG || UNIV_IBUF_DEBUG] ibuf_debug, buf_page_get_gen(),
buf_flush_page_try():
Implement innodb_change_buffering_debug=1 for evicting pages from the
buffer pool, so that change buffering will be attempted more
frequently.
Add new function os_cond_wait_timed(). Change the os_thread_sleep() calls
to timed conditional waits. Signal the background threads during the shutdown
phase so that we avoid waiting for the sleep to timeout thus saving some time.
rb://439 -- Approved by Jimmy Yang
Fix assorted compiler warnings.
sql/mysqld.cc:
Do not declare max_page_size twice. If large pages support
is enabled, the code expects the size in max_desired_page_size.
storage/innobase/include/ibuf0ibuf.h:
Remove trailing comma. Only present in C99.
Approved by: Vasil (via IRC)
storage/innobase/include/row0row.h:
Remove trailing comma. Only present in C99.
Approved by: Vasil (via IRC)
strings/my_vsnprintf.c:
No need to assert the obvious.
This is a manual merge from mysql-5.1-innodb of:
revision-id: vasil.dimov@oracle.com-20100930102618-s9f9ytbytr3eqw9h
committer: Vasil Dimov <vasil.dimov@oracle.com>
timestamp: Thu 2010-09-30 13:26:18 +0300
message:
Fix a potential bug when using __sync_lock_test_and_set()
TYPE __sync_lock_test_and_set (TYPE *ptr, TYPE value, ...)
it is not documented what happens if the two arguments are of different
type like it was before: the first one was lock_word_t (byte) and the
second one was 1 or 0 (int).
Approved by: Marko (via IRC)
bug #49251 (deadlock/crash with concurrent truncate table and index
statistics calculation) by backporting a solution from #54678 fixed
for 5.1 plugin and 5.5.
and above the general requirement free. We call them
BUF_FLUSH_EXTRA_MARGIN. With multiple buffer pools we may end up keeping
this amount of pages for each buffer pool. This patch, diagnosed and fixed
by Michael, throttles flushing in such cases.
rb://435
bug#54346
This patch doesn't get rid of the need to acquire the dict_sys->mutex but
reduces the need to keep the mutex locked for the duration of the query
to fsp_get_available_space_in_free_extents() from ha_innobase::info().
rb://390.
Improve the range estimation algorithm.
Previously:
For a given level the algo knows the number of pages in the requested range and the n
With this change:
Same idea, but peek a few (10) of the intermediate pages to get a better estimate of
In the bug report one of the examples has a btree with a snippet of the leaf level li
page1(899 records), page2(1 record), page3(1 record), page4(1 record)
so when trying to estimate, the previous algo, assumed there are average (899+1)/2=45
Fix Bug#53761 RANGE estimation for matched rows may be 200 times different
Improve the range estimation algorithm.
Previously:
For a given level the algo knows the number of pages in the requested range
and the number of records on the leftmost and the rightmost page. Then it
assumes all pages in between contain the average between the two border pages
and multiplies this average number by the number of intermediate pages.
With this change:
Same idea, but peek a few (10) of the intermediate pages to get a better
estimate of the average number of records per page. If there are less than 10
intermediate pages then all of them will be scanned and the result will be
precise, not an estimation.
In the bug report one of the examples has a btree with a snippet of the leaf
level like this:
page1(899 records), page2(1 record), page3(1 record), page4(1 record)
so when trying to estimate, the previous algo, assumed there are average
(899+1)/2=450 records per page which went terribly wrong. With this change
page2 and page3 will be read and the exact number of records will be returned.
Approved by: Sunny (rb://401)
------------------------------------------------------------
revno: 3476
committer: Sunny Bains <Sunny.Bains@Oracle.Com>
branch nick: 5.1-security
timestamp: Thu 2010-08-05 19:18:17 +1000
message:
Fix bug# 55543 - InnoDB Plugin: Signal 6: Assertion failure in file fil/fil0fil.c line 4306
The bug is due to a double delete of a BLOB, once via:
rollback -> btr_cur_pessimistic_delete()
and the second time via purge.
The bug is in row_upd_clust_rec_by_insert(). There we relinquish ownership
of the non-updated BLOB columns in btr_cur_mark_extern_inherited_fields()
before building the row entry that will be inserted and whose contents will
be logged in the UNDO log. However, we don't set the BLOB column later to
INHERITED so that a possible rollback will not free the original row's
non-updated BLOB entries. This is because the condition that checks for
that is in :
if (node->upd_ext) {}.
node->upd_ext is non-NULL only if a BLOB column was updated and that column
is part of some key ordering (see row_upd_replace()). This results in the
non-update BLOB columns being deleted during a rollback and subsequently by
purge again.
rb://413