This is a followup to vasil.dimov@oracle.com-20100816142329-yimenbuktd416z1a
which improved the sampling algorithm. I have manually checked that the new
values are actually the correct ones, for example:
-rows 16
+rows 32
the number of rows returned by the query is 32.
Improve the range estimation algorithm.
Previously:
For a given level the algo knows the number of pages in the requested range and the n
With this change:
Same idea, but peek a few (10) of the intermediate pages to get a better estimate of
In the bug report one of the examples has a btree with a snippet of the leaf level li
page1(899 records), page2(1 record), page3(1 record), page4(1 record)
so when trying to estimate, the previous algo, assumed there are average (899+1)/2=45
Fix Bug#53761 RANGE estimation for matched rows may be 200 times different
Improve the range estimation algorithm.
Previously:
For a given level the algo knows the number of pages in the requested range
and the number of records on the leftmost and the rightmost page. Then it
assumes all pages in between contain the average between the two border pages
and multiplies this average number by the number of intermediate pages.
With this change:
Same idea, but peek a few (10) of the intermediate pages to get a better
estimate of the average number of records per page. If there are less than 10
intermediate pages then all of them will be scanned and the result will be
precise, not an estimation.
In the bug report one of the examples has a btree with a snippet of the leaf
level like this:
page1(899 records), page2(1 record), page3(1 record), page4(1 record)
so when trying to estimate, the previous algo, assumed there are average
(899+1)/2=450 records per page which went terribly wrong. With this change
page2 and page3 will be read and the exact number of records will be returned.
Approved by: Sunny (rb://401)
Reduce ibuf_mutex and ibuf_pessimistic_insert_mutex contention further.
Protect ibuf->empty by the insert buffer root page latch, not ibuf_mutex.
ibuf_tree_root_get(): Assert that ibuf_mutex is owned by the
caller. Assert that the stamped page number is correct. Assert that
ibuf->empty agrees with the root page.
ibuf_size_update(): Do not update ibuf->empty.
ibuf_init_at_db_start(): Update ibuf->empty while holding the root page latch.
ibuf_add_free_page(): Return TRUE/FALSE instead of DB_SUCCESS/DB_STRONG_FAIL.
ibuf_remove_free_page(): Release ibuf_pessimistic_insert_mutex as
early as possible.
ibuf_contract_ext(): Rely on a dirty read of ibuf->empty, unless the
server is being shut down. Never acquire ibuf_mutex. Eliminate n_stored.
ibuf_contract_after_insert(): Never acquire ibuf_mutex. Perform dirty
reads of ibuf->size and ibuf->max_size.
ibuf_insert_low(): Only acquire ibuf_mutex for mode==BTR_MODIFY_TREE.
Perform dirty reads of ibuf->size and ibuf->max_size. Update
ibuf->empty while holding the root page latch.
ibuf_delete_rec(): Update ibuf->empty while holding the root page latch.
ibuf_is_empty(): Release ibuf_mutex earlier.
regression in Bug #54914, but it does speed up the execution for
innodb_change_buffering=inserts.
ibuf_add_ops(), ibuf_merge_or_delete_for_page(),
ibuf_delete_for_discarded_space(): Use atomic built-ins instead of
ibuf_mutex, when available.
ibuf_add_free_page(), ibuf_remove_free_page(), ibuf_contract_ext():
Release ibuf_mutex earlier.
ibuf_free_excess_pages(): Release ibuf_mutex before a conditional branch.
ibuf_insert_low(): Release ibuf_mutex before a conditional
branch. Create ibuf_entry before re-acquiring ibuf_mutex. Simplify a
loop to reduce code footprint. Release ibuf_mutex before mtr_commit()
[btr_pcur_close()].
ibuf_is_empty(): Release ibuf_mutex before mtr_commit().
pages that it wants to flush then we should honor that value as in
not going beyond that in our eagerness to flush the neighbors of
the selected victim.
DELETE statement
Single-table delete ordered by a field that has a hash-type index
may cause an assertion failure or a crash.
An optimization added by the fix for the bug 36569 forced the
optimizer to use ORDER BY-compatible indices when applicable.
However, the existence of unsorted indices (HASH index algorithm
for some engines such as MEMORY/HEAP, NDB) was ignored.
The test_if_order_by_key function has been modified to skip
unsorted indices.
The problem was that the optimize method of the ARCHIVE storage
engine was not preserving the FRM embedded in the ARZ file when
rewriting the ARZ file for optimization. The ARCHIVE engine stores
the FRM in the ARZ file so it can be transferred from machine to
machine without also copying the FRM -- the engine restores the
embedded FRM during discovery.
The solution is to copy over the FRM when rewriting the ARZ file.
In addition, some initial error checking is performed to ensure
garbage is not copied over.
* Fixed obvious errors (HAVE_BROKEN_PREAD is not true for on any
of systems we use, definitely not on HPUX)
* Remove other junk flags for OSX and HPUX
* Avoid checking type sizes in universal builds on OSX, again
(CMake2.8.0 fails is different architectures return different results)
* Do not compile template instantiation stuff unless
EXPLICIT_TEMPLATE_INSTANTIATION is used.
* Some cleanup (make gen_lex_hash simpler, avoid dependencies)
* Exclude some unused files from compilation (strtol.c etc)
Fix some issues with WiX packaging, particularly
major upgrade and change scenarios.
* remember binary location and data location
(for major upgrade)
* use custom UI, which is WiX Mondo extended
for major upgrade dialog (no feature selection
screen shown on major upgrade, only upgrade
confirmation). This is necessary to prevent
changing installation path during upgrade
(services are not reregistered, so they would
have invalid binary path is it is changed)
* Hide datafiles that are installed into
ProgramFiles, show ones that are installed
in ProgramData
* Make MSI buildable with nmake
* Fix autotools "make dist"
Remove wrappers around inline -- static inline is used without
wrappers throughout the source code. We rely on the compiler or
linker to eliminate unused static functions.
A change in the default values of some config parameters
caused this test to fail, adjust the test and make it more
robust so it does not fail for the same reason in the future.
sporadically
There are two problems:
1. When closing temporary tables, during the THD clean up - and
after the session connection was already closed, there is a
chance we can push an error into the THD diagnostics area, if
the writing of the implicit DROP event to the binary log fails
for some reason. As a consequence an assertion can be
triggered, because at that point the diagnostics area is
already set.
2. Using push_warning with MYSQL_ERROR::WARN_LEVEL_ERROR is a
bug.
Given that close_temporary_tables is mostly called from
THD::cleanup - ie, with the session already closed, we fix
problem #1 by allowing the diagnostics area to be
overwritten. There is one other place in the code that calls
close_temporary_tables - while applying Start_log_event_v3. To
cover that case, we make close_temporary_tables to return the
error, thus, propagating upwards in the stack.
To fix problem #2, we replace push_warning with sql_print_error.
release a dirty page in the middle of a mini-transaction. Replace the code
with an assertion that checks for this condition.
Original svn revision was: r6330.
Silence the UNIV_SYNC_DEBUG assertion failure while upgrading old data files
to multiple rollback segments during server startup. Because the upgrade
takes place while InnoDB is running a single thread, we can safely ignore the
latching order checks without fearing deadlocks.
innobase_start_or_create_for_mysql(): Set srv_is_being_started = FALSE
only after trx_sys_create_rsegs() has completed.
sync_thread_add_level(): If srv_is_being_started, ignore latching order
violations for SYNC_TRX_SYS_HEADER and SYNC_IBUF_BITMAP.
Create all the non-IO threads after creating the extra rollback segments.
Patch originally from Marko with some additions by Sunny.