Before this fix, it was possible to build the server:
- with the performance schema
- with a dummy implementation of my_atomic (MY_ATOMIC_MODE_DUMMY).
In this case, the resulting binary will just crash,
as this configuration is not supported.
This fix enforces that the build will fail with a compilation error in this
configuration, instead of resulting in a broken binary.
In early development of delete buffering, we did allow B-tree pages
to become empty as a result of buffered deletes. That caused fundamental
problems. The fix was to refuse buffering purge operations unless
the page can be guaranteed to be nonempty. Remove an attempt to
cope with empty pages when merging inserts.
ALTER TABLE on a MERGE table could cause a deadlock with two
other connections if we reached a situation where:
1) A connection doing ALTER TABLE can't upgrade to MDL_EXCLUSIVE on the
parent table, but holds TL_READ_NO_INSERT on the child tables.
2) A connection doing DELETE on a child table can't get TL_WRITE on it
since ALTER TABLE holds TL_READ_NO_INSERT.
3) A connection doing SELECT on the parent table can't get TL_READ on
the child tables since TL_WRITE is ahead in the lock queue, but holds
MDL_SHARED_READ on the parent table preventing ALTER TABLE from upgrading.
For regular tables, this deadlock is avoided by having ALTER TABLE
take a MDL_SHARED_NO_WRITE metadata lock on the table. This prevents
DELETE from acquiring MDL_SHARED_WRITE on the table before ALTER TABLE
tries to upgrade to MDL_EXCLUSIVE. In the example above, SELECT would
therefore not be blocked by the pending DELETE as DELETE would not be
able to enter TL_WRITE in the table lock queue.
This patch fixes the problem for merge tables by using the same metadata
lock type for child tables as for the parent table. The child tables will
in this case therefore be locked with MDL_SHARED_NO_WRITE, preventing
DELETE from acquiring a metadata lock and enter into the table lock queue.
Change in behavior: By taking the same metadata lock for child tables
as for the parent table, LOCK TABLE on the parent table will now also
implicitly lock the child tables. Since LOCK TABLE on the parent table
now takes more than one metadata lock, it is possible for LOCK TABLE
... WRITE on the parent table or child tables to give ER_LOCK_DEADLOCK
error.
Test case added to mdl_sync.test.
Merge.test/.result has been updated to reflect the change to LOCK TABLE.
Remove non applicable licensing files
storage/innobase/COPYING is in the MySQL top level directory and
storage/innobase/COPYING.Sun_Microsystems is not applicable anymore
now that Oracle and Sun are one company.
Handle combined instrument states of ENABLED and/or TIMED:
ENABLED TIMED
1 1 Aggregate stats, increment counter
1 0 Increment counter
0 1 Do nothing
0 0 Do nothing
storage/perfschema/pfs.cc:
Aggregate stats only if state is both ENABLED and TIMED. If ENABLED
but not TIMED, only increment the value counter.
storage/perfschema/pfs_stat.h:
Split aggregate and counter increment into separate
methods for performance.
Problem: trailing spaces were stripped using 8-bit code,
so the truncation result length was incorrect, which led
to an assertion failure.
Fix: using multi-byte safe code.
bug #49251 (deadlock/crash with concurrent truncate table and index
statistics calculation) by backporting a solution from #54678 fixed
for 5.1 plugin and 5.5.
Before this fix, some tests failed due to lack of instrumentation slots
in the performance schema, because the default sizing was too low.
Now that more code has been instrumented, the default sizing has to be adjusted
to match the current instrumentation consumption.
This change:
- increases the number of rwlock classes from 20 to 30,
- increases the number of rwlock and mutex instances to 1 million.
Both are to account for the volume of data instrumented
when the innodb storage engine is used (because of the innodb buffer pool).
Adjusted the test output accordingly.
------------------------------------------------------------
revno: 3550
revision-id: marko.makela@oracle.com-20100824081003-v4ecy0tga99cpxw2
parent: marko.makela@oracle.com-20100823102854-t1clrojqis2ley36
committer: Marko Mäkelä <marko.makela@oracle.com>
branch nick: 5.1-innodb
timestamp: Tue 2010-08-24 11:10:03 +0300
message:
Bug#55832: selects crash too easily when innodb_force_recovery>3
dict_update_statistics_low(): Create bogus statistics for those
indexes that cannot be accessed because of the innodb_force_recovery
setting.
ha_innobase::info(): Calculate statistics for each index, even if
innodb_force_recovery is set. Fill in bogus data for those indexes
that are not accessed because of the innodb_force_recovery setting.
dict_update_statistics_low(): Create bogus statistics for those
indexes that cannot be accessed because of the innodb_force_recovery
setting.
ha_innobase::info(): Calculate statistics for each index, even if
innodb_force_recovery is set. Fill in bogus data for those indexes
that are not accessed because of the innodb_force_recovery setting.
dict_update_statistics_low(): Create bogus statistics for those
indexes that cannot be accessed because of the innodb_force_recovery
setting.
ha_innobase::info(): Calculate statistics for each index, even if
innodb_force_recovery is set. Fill in bogus data for those indexes
that are not accessed because of the innodb_force_recovery setting.
and above the general requirement free. We call them
BUF_FLUSH_EXTRA_MARGIN. With multiple buffer pools we may end up keeping
this amount of pages for each buffer pool. This patch, diagnosed and fixed
by Michael, throttles flushing in such cases.
rb://435
bug#54346
This patch doesn't get rid of the need to acquire the dict_sys->mutex but
reduces the need to keep the mutex locked for the duration of the query
to fsp_get_available_space_in_free_extents() from ha_innobase::info().
rb://390.
The callers should indicate that the dictionary is locked or not using
the trx->dict_operation_lock_mode == RW_X_LATCH mode. Checking explicitly
for system tables is unnecessary.
Approved by Marko on IRC.
Issue an error message to the error log when
trx->dict_operation_lock_mode == RW_X_LATCH in
srv_suspend_mysql_thread(). Transactions that modify InnoDB
data dictionary tables must be free of lock waits, because they
must be holding the data dictionary latch in exclusive mode.
The transactions must not be accessing any other tables other than
the data dictionary tables.
The handling of RW_X_LATCH was accidentally added in the InnoDB Plugin,
as a wrong fix of an assertion failure. (Fast index creation was accessing
both data dictionary tables and user tables in the same transaction.)
revno: 3545
revision-id: marko.makela@oracle.com-20100818110110-zfs0i1vfrccfb4yw
parent: vasil.dimov@oracle.com-20100817193934-1yl7zz2odikxauf8
committer: Marko Mäkelä <marko.makela@oracle.com>
branch nick: 5.1-innodb
timestamp: Wed 2010-08-18 14:01:10 +0300
message:
Bug#55626: MIN and MAX reading a delete-marked record from secondary index
Remove a bogus debug assertion that triggered the bug.
Add assertions precisely where records must not be delete-marked.
And a comment to clarify when the record is allowed to be delete-marked.
Remove a bogus debug assertion that triggered the bug.
Add assertions precisely where records must not be delete-marked.
And a comment to clarify when the record is allowed to be delete-marked.
Added InnoDB to the 'default' plugin group, and modified
the autoconf script so the 'default' group is actually
built by default.
(i.e ./configure.am == ./configure.am --with-plugins=default ,
instead of being ./configure.am --with-plugins=none )
Improve the range estimation algorithm.
Previously:
For a given level the algo knows the number of pages in the requested range and the n
With this change:
Same idea, but peek a few (10) of the intermediate pages to get a better estimate of
In the bug report one of the examples has a btree with a snippet of the leaf level li
page1(899 records), page2(1 record), page3(1 record), page4(1 record)
so when trying to estimate, the previous algo, assumed there are average (899+1)/2=45
Fix Bug#53761 RANGE estimation for matched rows may be 200 times different
Improve the range estimation algorithm.
Previously:
For a given level the algo knows the number of pages in the requested range
and the number of records on the leftmost and the rightmost page. Then it
assumes all pages in between contain the average between the two border pages
and multiplies this average number by the number of intermediate pages.
With this change:
Same idea, but peek a few (10) of the intermediate pages to get a better
estimate of the average number of records per page. If there are less than 10
intermediate pages then all of them will be scanned and the result will be
precise, not an estimation.
In the bug report one of the examples has a btree with a snippet of the leaf
level like this:
page1(899 records), page2(1 record), page3(1 record), page4(1 record)
so when trying to estimate, the previous algo, assumed there are average
(899+1)/2=450 records per page which went terribly wrong. With this change
page2 and page3 will be read and the exact number of records will be returned.
Approved by: Sunny (rb://401)