Analysis: When the filespace is extended, the I/O is first prepared. But if
posix_fallocate is used, the I/O was never completed, causing an assertion
at shutdown indicating that not all I/O has finished.
Fix: If posix_fallocate is used to extend the filespace, there
is no need to wait for the I/O to complete, so we treat this
operation as a read. We still need to mark the I/O as
completed, or the assertion at shutdown in
fil_node_close_file() fires because pending I/O appears unfinished.
possibly present since it was introduced in the patch for Bug#16720368 around
2013-04-30. This fix simply adjusts the mtr.add_suppression() lines
in the test case and adds a missing "\n" to the error message.
Approved by Marko in RB 3746
There were two problems:
1) copying/moving objects of the same type as the sizeof() argument (usually involving a cast); solved in different ways depending on the cause;
2) using 'const' on SSL_CTX::getVerifyCallback(), which returns an object (not a reference), so a copy of the object is created and the 'const' makes no sense.
Regression from bug#14621190 due to disabled optimistic restoration
of the cursor, which required a full key lookup instead of verifying
whether the previously positioned btree cursor could be reused.
Fixed by enabling optimistic restore and adjusting the cursor afterwards.
rb#3324 approved by Marko.
- Implemented CHECK TABLE ... QUICK.
Introduce CHECK TABLE ... QUICK, which skips the btr_validate_index()
and btr_search_validate() calls but still counts the number of records in each index.
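For illustration, a usage sketch (the table name is a placeholder):

  CHECK TABLE t1;        -- full check: includes btr_validate_index()
                         -- and btr_search_validate(), plus record counts
  CHECK TABLE t1 QUICK;  -- skips those validations, only counts the
                         -- records in each index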
Approved by Marko and Kevin. (rb#3567).
AND 'KILL SESSION' LEAD TO CRASH
Analysis:
--------
This situation occurs when a connection executes the query
"show engine innodb status" and that connection is killed by
another connection executing "kill <con>".
In the function "innodb_show_status", the function "stat_print"
is called to print the status, but its return value is not
checked. After the connection is killed, the write to the
connection fails, so an error is returned and set
in the Diagnostics area. Since FALSE is now returned from
"innodb_show_status", the assertion in "set_eof_status"
(called from my_eof) that no error is set fails.
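A sketch of the scenario with two client connections (the connection id is illustrative):

  -- connection 1:
  SHOW ENGINE INNODB STATUS;
  -- connection 2, while connection 1 is still writing its output:
  KILL 123;   -- thread id of connection 1, taken from SHOW PROCESSLIST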
Fix:
----
Changed code to check return value of function "stat_print"
in "innodb_show_status".
ha_innobase::records_in_range() should return HA_POS_ERROR for a table whose tablespace has been discarded, without requesting any pages.
The other handler methods called later should treat the error correctly.
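A sketch of the scenario (names are placeholders):

  ALTER TABLE t1 DISCARD TABLESPACE;
  -- A range estimate on the discarded table must return HA_POS_ERROR
  -- without requesting any pages, e.g. when the optimizer runs:
  EXPLAIN SELECT * FROM t1 WHERE k BETWEEN 10 AND 20;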
Approved by Sunny in rb#3433
INDEX_READ_MAP HAD NO MATCH
If index_read_map() is called for an exact search and no matching record
exists, it positions the cursor on the next record but still sets the
relative position to BTR_PCUR_ON.
This makes a subsequent call to index_next() read yet another record,
instead of returning the record the cursor points to.
Fixed by setting pcur->rel_pos = BTR_PCUR_BEFORE if an exact
[prefix] search was done but failed.
Also avoid optimistic restoration if rel_pos != BTR_PCUR_ON,
since btr_cur may differ from old_rec.
rb#3324, approved by Marko and Jimmy
The test case for this bug fails randomly for two reasons:
1. ibuf merges happening in the background
2. dict stats updates bringing the evicted page back into the
buffer pool.
Fixed ibuf_contract_ext() to not do any merges when ibuf_debug is enabled,
and changed dict_stats_update() to return fake statistics without
bringing the secondary index pages into the buffer pool.
Approved by Marko. rb#3419
- Better error message when using huge pages
- Fixed link error
- Test suite should run even on systems with huge pages
storage/tokudb/ft-index/cmake_modules/TokuThirdParty.cmake:
Fixed linking on systems that use lib64
storage/tokudb/ft-index/portability/huge_page_detection.cc:
Better error message
storage/tokudb/mysql-test/rpl/suite.pm:
Test suite should run even on systems with huge pages
storage/tokudb/mysql-test/tokudb/suite.pm:
Test suite should run even on systems with huge pages
function if we are doing comparisons in the fractal tree, so that case
insensitivity is handled. Comparisons done inside the handlerton are unaffected.
BUILD/compile-solaris-amd64:
* call cmake directly, don't go through three layers of wrappers
(but preserve the compile-solaris-amd64 file - buildbot uses it for 5.1 and 5.5)
* disable jemalloc, it doesn't compile on our sol10-64 box
storage/federated/ha_federated.cc:
clang warning
storage/tokudb/CMakeLists.txt:
* require cmake-2.8.9, because 2.8.8 doesn't add -fPIC for POSITION_INDEPENDENT_CODE
property that ft-index CMakeLists.txt files are using
IT IS DONE IN-PLACE
With the change buffer enabled, InnoDB doesn't write a transaction log
record when it merges a record from the insert buffer to a secondary
index page if the insertion is performed as an update-in-place.
Fixed by logging the 'update-in-place' operation on secondary index
pages.
Approved by Marko. rb#2429
* add TokuDB, together with the ft-index library
* cmake support, auto-detecting whether tokudb can be built
* fix packaging - tokudb-engine.rpm, deb
* remove PBXT
* add jemalloc
* the server is built with jemalloc by default even if TokuDB is not built
* documentation files in RPM are installed in the correct location
* support for optional deb packages (tokudb has specific build requirements)
* move plugins from mariadb-server deb to appropriate debs (server/test/libmariadbclient)
* correct mariadb-test.deb to not be architecture-independent
* fix out-of-tree builds to never modify in-tree files
* new handler::prepare_index_scan() method
cmake/plugin.cmake:
* auto-create an rpm for a plugin, if it places itself in a new component
storage/tokudb/CMakeLists.txt:
install tokudb in COMPONENT tokudb-engine.
this automatically creates a separate rpm for it.
* disable jemalloc on windows (cannot run ./configure)
* disable jemalloc on ancient cmake (ExternalProject does not work)
* rewrite TokuDB compiler test to check for features, not versions (to work on cmake before 2.8.11)
* fix ft-index to not add VALGRIND_INCLUDE_DIR to includes, if no valgrind was found
* correct the package name in FindValgrind.cmake (for find_package(... REQUIRED) to work)
* disable ft-index tests by default (faster compilation and they aren't used anyway)
* don't build ft-index with valgrind by default (otherwise it *requires* valgrind, it doesn't auto-detect)
* use --loose-tokudb in the .opt file
cmake/jemalloc.cmake:
for dependencies to work, LIBJEMALLOC should be the target name, not the path
storage/tokudb/CMakeLists.txt:
* check the preconditions
* disable bdb tests (compilation errors)
* set variable, instead of SET_PROPERTY. same effect,
but doesn't fail when a plugin is disabled (that is, a target does not exist)
storage/tokudb/ft-index/CMakeLists.txt:
cmake should not look into the examples/ directory;
there is a hand-crafted examples/Makefile there that
cmake would overwrite
storage/tokudb/ft-index/buildheader/CMakeLists.txt:
the syntax is ADD_EXECUTABLE(target source) and "source" is the file name
storage/tokudb/ft-index/cmake_modules/TokuMergeLibs.cmake:
Libraries must be specified in a specific order;
REMOVE_DUPLICATES cannot be used, because it destroys this order.
(When OSLIBS contains "-lpthread -ljemalloc -lpthread", REMOVE_DUPLICATES
makes it "-lpthread -ljemalloc", but a thread library *must* come *after* jemalloc.)
storage/tokudb/ft-index/cmake_modules/TokuSetupCTest.cmake:
* 'which' might print errors to stderr, they are not important, shut them up
* we don't have TOKUDB_DATA, no need to warn about it
* don't configure_file into itself (with input=output)
storage/tokudb/ft-index/cmake_modules/TokuThirdParty.cmake:
jemalloc is built externally to tokudb/ft-index
storage/tokudb/ft-index/ft/CMakeLists.txt:
the syntax is ADD_EXECUTABLE(target source) and "source" is the file name
storage/tokudb/ft-index/ft/tests/CMakeLists.txt:
the syntax is ADD_EXECUTABLE(target source) and "source" is the file name
storage/tokudb/ft-index/locktree/tests/CMakeLists.txt:
the syntax is ADD_EXECUTABLE(target source) and "source" is the file name
storage/tokudb/ft-index/portability/CMakeLists.txt:
s/jemalloc/libjemalloc/
storage/tokudb/ft-index/portability/os_malloc.cc:
unnecessary include file
storage/tokudb/ft-index/portability/tests/CMakeLists.txt:
the syntax is ADD_EXECUTABLE(target source) and "source" is the file name
storage/tokudb/ft-index/src/tests/CMakeLists.txt:
the syntax is ADD_EXECUTABLE(target source) and "source" is the file name
storage/tokudb/ft-index/util/tests/CMakeLists.txt:
the syntax is ADD_EXECUTABLE(target source) and "source" is the file name
storage/tokudb/ft-index/utils/CMakeLists.txt:
the syntax is ADD_EXECUTABLE(target source) and "source" is the file name
compressed pages
After loading a compressed-only page in buf_page_get_gen() we allocate a new
block for decompression. The problem is that the compressed page is neither
buffer-fixed nor I/O-fixed by the time we call buf_LRU_get_free_block(),
so it may end up being evicted and returned back as a new block.
buf_page_get_gen(): Temporarily buffer-fix the compressed-only block
while allocating memory for an uncompressed page frame.
This should prevent this form of the infinite loop, which is more likely
with a small innodb_buffer_pool_size.
rb#2511 approved by Jimmy Yang, Sunny Bains
DICT_TABLE_GET_FORMAT(CLUST_INDEX->TABLE) >= 1
The function row_sel_sec_rec_is_for_clust_rec() was incorrectly
preparing to compare a NULL column prefix in a secondary index with a
non-NULL column in a clustered index.
This can trigger an assertion failure in the 5.1 plugin and later. In the
built-in InnoDB of MySQL 5.1 and earlier, we would apparently only do
some extra work, by trimming the clustered index field for the
comparison.
The code might actually have worked properly apart from this debug
assertion failure. It is merely doing some extra work in fetching a
BLOB column, and then comparing it to NULL (which would return the
same result, no matter what the BLOB contents are).
While the test case involves CHECK TABLE, this could theoretically
occur during any read that uses a secondary index on a column prefix
of a column that can be NULL.
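A sketch of the kind of schema that exercises this code path (names are placeholders): a nullable BLOB column with a column-prefix secondary index, checked against the clustered index.

  CREATE TABLE t1 (
    id INT PRIMARY KEY,
    b  BLOB,
    KEY b_prefix (b(10))
  ) ENGINE=InnoDB;
  INSERT INTO t1 VALUES (1, NULL), (2, REPEAT('x', 100));
  CHECK TABLE t1;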
rb#3101 approved by Mattias Jonsson
There was a race condition in the rollback of TRX_UNDO_UPD_DEL_REC.
Once row_undo_mod_clust() has rolled back the changes by the rolling-back
transaction, it attempts to purge the delete-marked record, if possible, in a
separate mini-transaction.
However, row_undo_mod_remove_clust_low() fails to check if the DB_TRX_ID of
the record that it found after repositioning the cursor, is still the same.
If it is not, it means that the record was purged and another record was
inserted in its place.
So, the rollback would have performed an incorrect purge, breaking the
locking rules and causing corruption.
The problem was found by creating a table that contains a unique
secondary index and a primary key, and two threads running REPLACE
with only one value for the unique column, so that the uniqueness
constraint would be violated all the time, leading to statement
rollback.
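A sketch of that reproduction (names are placeholders; run the REPLACE in a loop from two sessions concurrently):

  CREATE TABLE t1 (
    pk INT AUTO_INCREMENT PRIMARY KEY,
    u  INT NOT NULL,
    UNIQUE KEY (u)
  ) ENGINE=InnoDB;
  -- in both sessions:
  REPLACE INTO t1 (u) VALUES (1);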
This bug exists in all InnoDB versions (I checked MySQL 3.23.53).
It has become easier to repeat in 5.5 and 5.6 thanks to scalability
improvements and a dedicated purge thread.
rb#3085 approved by Jimmy Yang
FAILED BLOB WRITE
btr_store_big_rec_extern_fields(): Relax a debug assertion so that
some BLOB pointers may remain zero if an error occurs.
btr_free_externally_stored_field(), row_undo_ins(): Allow the BLOB
pointer to be zero on any rollback.
rb#3059 approved by Jimmy Yang, Kevin Lewis
Since the mtr_t struct is marked as invalid in DEBUG_VALGRIND build
during mtr_commit, checking mtr->inside_ibuf will cause this warning.
Also since mtr->inside_ibuf cannot be set in mtr_commit (assert check)
and mtr->state is set to MTR_COMMITTED, the 'ut_ad(!ibuf_inside(&mtr))'
check is not needed if 'ut_ad(mtr.state == MTR_COMMITTED)' is also
checked.
- Reset static variables that are used to signal "init done" for DBUG, in dbug_end()
- Set string server variables to NULL after memory for the value is freed - avoids double free()
- fix DBUG_ASSERTs that happened during reinitialization.
SHUTDOWN IS IN PROGRESS
PROBLEM
-------
In the background thread srv_master_thread() we have a
one-second delay loop which continuously monitors
server activity. If the server is inactive (without any
user activity) or in a shutdown state, we do some background
activity such as flushing changes. The current code does
not check whether the server is in a shutdown state before
sleeping for one second.
FIX
---
If the server is in a shutdown state, then don't go into the
one-second sleep.
- Let _ma_record_pos() set SEARCH_PART_KEY when doing a search on
a prefix of a [unique] key. Otherwise, _ma_search_pos() would
find the first key equal to the search key and assume it is also
the last one, which makes a wrong estimate of the key's position.
A wrong key position may cause min_pos > max_pos and records_in_range()
will return 0, which will make the optimizer think it's an impossible
range while in fact it is not.
Problem:
When the user-specified foreign key name contains "_ibfk_", InnoDB wrongly
tries to rename it.
Solution:
When a table is renamed, all its associated foreign keys are also renamed,
but only if the foreign key names were automatically generated. If a foreign
key name was given by the user, it must not be renamed, even if it contains
"_ibfk_".
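For example, a user-specified constraint name that happens to contain "_ibfk_" must survive a table rename unchanged (a sketch; names are placeholders):

  CREATE TABLE parent (id INT PRIMARY KEY) ENGINE=InnoDB;
  CREATE TABLE child (
    id INT PRIMARY KEY,
    parent_id INT,
    CONSTRAINT my_ibfk_name FOREIGN KEY (parent_id) REFERENCES parent (id)
  ) ENGINE=InnoDB;
  RENAME TABLE child TO child2;  -- my_ibfk_name must keep its name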
rb#2935 approved by Jimmy, Krunal and Satya
Since log_throttle is not available in 5.5, logging of the
error message for a thread's failure to create a new connection
in "create_thread_to_handle_connection" is not backported.
Since the function "my_plugin_log_message" is not available in
5.5, and since an incompatibility between the sql_print_XXX
functions compiled with g++ and the audit log files compiled
with gcc prevents using sql_print_error, the changes related
to the audit log plugin are not backported.
Backport the fix olav.sandstaa@sun.com-20101102184747-qfuntqwj021imy9r:
"Fix for Bug#52660 Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT"
(together with further fixes in that code) into MyISAM and Aria.
SERIALIZABLE
Problem:
The documentation claims that WITH CONSISTENT SNAPSHOT will work for both
REPEATABLE READ and SERIALIZABLE isolation levels. But it will work only
for REPEATABLE READ isolation level. Also, the clause WITH CONSISTENT
SNAPSHOT is silently ignored when it is not applicable to the given isolation
level.
Solution:
Generate a warning when the clause WITH CONSISTENT SNAPSHOT is ignored.
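A sketch of where the new warning appears:

  SET SESSION TRANSACTION ISOLATION LEVEL SERIALIZABLE;
  START TRANSACTION WITH CONSISTENT SNAPSHOT;
  SHOW WARNINGS;  -- now reports that the clause was ignored
  COMMIT;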
rb#2797 approved by Kevin.
Note: Support team wanted to push this to 5.5+.
MULTI-FILE TABLESPACE
ANALYSIS
--------
When a tablespace has multiple data files, InnoDB fails to
open the tablespace. This is because, for each ibd file,
the first page is checked. But the first page of an ibd file
need not be the first page of the tablespace. Only the first
page of the tablespace contains the tablespace header. When
we check the first page of an ibd file that is not the first
page of the tablespace, the "tablespace flags" are not
really available. They were wrongly used to check whether a page
is corrupt.
FIX
---
Use the tablespace flags only if the page number is 0
in a tablespace.
[Approved by Inaam rb#2836 ]
Analysis
--------
The pthread_mutex commit_threads_m was initialized but never
used.
Fix
---
Removing the commit_threads_m mutex from the code base.
[ Approved by Marko rb#2475]
DDL AND I_S QUERIES
Skip partially created indexes (those whose name starts with TEMP_INDEX_PREFIX)
during stats gathering.
Because InnoDB reports HA_INPLACE_ADD_INDEX_NO_WRITE to MySQL, the latter
allows parallel execution of ha_innobase::add_index() and ha_innobase::info().
Reviewed by: Inaam (rb:2613)
Partitioning didn't store the name of the default storage engine for partitions
in the frm file - it only stored the typecode. Typecodes aren't stable and
might vary depending on the order in which storage engines are loaded (which can
be changed even from my.cnf, without recompilation).
As a temporary workaround for 5.5, we hard-code Aria's typecode, to make sure it
never changes.
IF IT HAS A WRONG COUNT
If CHECK TABLE finds that a secondary index contains the wrong
number of entries, it used to report an error but not mark the
index as corrupt. The error means that the index should be rebuilt,
which can be done with ALTER TABLE DROP INDEX and ALTER TABLE ADD
INDEX. But just in case the DBA does not pay any attention to the
output of CHECK TABLE, the secondary index should be marked as
corrupted so that it is not used again.
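The rebuild that the message asks for can be done like this (a sketch; names are placeholders):

  CHECK TABLE t1;                      -- reports the wrong entry count and
                                       -- now also marks the index corrupted
  ALTER TABLE t1 DROP INDEX bad_idx;
  ALTER TABLE t1 ADD INDEX bad_idx (col1);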
Approved by Inaam in RB:2607
Backport to 5.5
(external Bug#69407 Build warnings with mysql)
support-files/build-tags:
Run etags on sql_yacc.yy, ignore other .yy files
unittest/mysys/explain_filename-t.cc:
NO_PLAN seems to fail on some platforms, use the actual number instead.
ON DELETION ORDER
Problem:
When an InnoDB index page is under-filled, we merge it with either
the left sibling node or the right sibling node. But this check is
incorrect: when the left sibling node is available, even if merging
with it is not possible, we do not check for the possibility of
merging with the right sibling node.
Solution:
If the left sibling node is available but merging with it is not
possible, then check whether a merge with the right sibling node is
possible.
rb#2506 approved by jimmy & ima.
Federated uses SHOW TABLE STATUS LIKE for ::info().
For a nonexistent remote table this does not fail, but returns an empty result set.
We need to fake the error in the handler.
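A sketch of the kind of setup involved (connection string and names are placeholders; the remote table is assumed to have been dropped after the federated table was created):

  CREATE TABLE fed_t1 (id INT) ENGINE=FEDERATED
    CONNECTION='mysql://user:pass@remote_host:3306/db1/t1';
  -- after the remote db1.t1 is dropped, the remote
  -- SHOW TABLE STATUS LIKE 't1' returns an empty result set,
  -- so ::info() has to raise the error itself:
  SHOW TABLE STATUS LIKE 'fed_t1';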
Fixed some cases that didn't work with > 4G buffers.
Fixed compiler warnings
include/mysql_com.h:
Avoid compiler warning with strncmp()
sql-common/client.c:
Fixed long comment; Added ()
sql/filesort.cc:
Fix code to get filesort to work with big buffers
sql/sys_vars.cc:
Fixed some cache variables that could be set to a higher value than size_t allows.
Limit the query cache to ULONG_MAX, as the query cache buffer variables are ulong.
storage/federatedx/ha_federatedx.cc:
Remove unused variable
storage/maria/ha_maria.cc:
Fix that bulk_insert() works with big buffers
storage/maria/ma_write.c:
Fix that bulk_insert() works with big buffers
storage/myisam/ha_myisam.cc:
Fix that bulk_insert() works with big buffers
storage/myisam/mi_write.c:
Fix that bulk_insert() works with big buffers
storage/sphinx/snippets_udf.cc:
Fixed compiler warnings
i_s_innodb_buffer_page_get_info(): Do not read the buffer block frame
contents of read-fixed blocks, because it may be invalid or
uninitialized. When we are going to decompress or read a block, we
will put it into buf_pool->page_hash and buf_pool->LRU, read-fix the
block and release the mutexes for the duration of the reading or
decompression.
rb#2500 approved by Jimmy Yang
Replaced snippets_udf.cc with the latest version (2.0.8 from sphinxsource.com) and fixed trivial errors on Windows.
It is now compiled and installed into the plugin directory.
USING THE PLUGIN INTERFACE.
ISSUE: No support for floating-point plugin
system variables.
SOLUTION: Allow plugins to define and expose floating-point
system variables of type double. MYSQL_SYSVAR_DOUBLE
and MYSQL_THDVAR_DOUBLE are added.
ISSUE: The fractional part of the def, min, max values of system
variables is ignored.
SOLUTION: Add functions that store the raw
representation of a double in the raw bits of an unsigned
longlong in such a way that the binary representation
remains the same.
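From SQL such a variable then accepts fractional values (a sketch; "myplugin_ratio" is a hypothetical plugin variable defined with MYSQL_SYSVAR_DOUBLE):

  SET GLOBAL myplugin_ratio = 0.75;
  SELECT @@global.myplugin_ratio;  -- the fractional part must be preserved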
ESCAPED WITH BACKSLASH
Problem:
When a CREATE TABLE statement used COMMENTs with escape sequences like
'foo\'s', InnoDB did not parse them correctly when trying to extract the
foreign key information. Because of this, the foreign keys specified
in the CREATE TABLE statement were not created.
Solution:
Make the InnoDB internal parser aware of escape sequences.
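A sketch of the kind of statement that failed (names are placeholders):

  CREATE TABLE parent (id INT PRIMARY KEY) ENGINE=InnoDB;
  CREATE TABLE child (
    id INT PRIMARY KEY,
    parent_id INT COMMENT 'foo\'s column',
    CONSTRAINT fk_child_parent FOREIGN KEY (parent_id) REFERENCES parent (id)
  ) ENGINE=InnoDB;
  -- before the fix, the escaped quote confused the InnoDB FK parser
  -- and fk_child_parent was silently not created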
rb#2457 approved by Kevin.
INSERT BUFFER MERGE
Problem:
When the record is merged from the change buffer to the actual page,
in a particular condition, it is assumed that the deleted rec will
be re-used by the inserted rec. With this assumption the lock is
restored on the pointer to the deleted rec itself, thinking that
it is pointing to the newly inserted rec.
Solution:
Just before restoring the lock, update the rec pointer to point
to the newly inserted record. An assert has been added to verify
this. This assert will fail without the fix and will pass with
the fix.
rb#2449 in review by Marko and Jimmy
INNODB_FAST_SHUTDOWN IS 2
Problem:
When innodb_fast_shutdown is set to 2 and the master thread enters
flush loop, under some circumstances it will not be able to exit it.
This may lead to a shutdown hanging.
This is happening because of the following:
1. In the flush_loop block of code, if the srv_fast_shutdown is
equal to 2 (very fast shutdown), then we do not flush dirty
pages in buffer pool to disk.
2. In the same flush_loop block of code, if the number of dirty
pages is more than the user-specified limit, we go to step 1.
This results in an infinite loop.
Solution:
When we are in the process of doing a very fast shutdown, don't
do step 2 above.
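For reference, the hang could be triggered by requesting the fastest shutdown mode while many dirty pages remain in the buffer pool (a sketch):

  SET GLOBAL innodb_fast_shutdown = 2;
  -- then stop the server (e.g. with mysqladmin shutdown); before the
  -- fix the master thread could stay in flush_loop forever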
rb#2328 approved by Inaam.
When a record contains no user data bytes (such as when the PRIMARY
KEY is an empty string and all secondary index fields are NULL or the
empty string), page_zip_decompress() could fail to set the record
heap_no correctly.
page_zip_decompress_node_ptrs(), page_zip_decompress_sec(),
page_zip_decompress_clust(): Set heap_no also at the end of the
compressed data stream.
rb#2448 approved by Jimmy Yang and Inaam Rana
AFTER A ROW IS READ
Approved by: Sunny Bains rb://2425
Don't release concurrency tickets when asked to release
btr_search_latch. This is a 5.5 only bug. It is already
fixed in 5.6 upwards.
mysql-test/suite/maria/maria-autozerofill.result:
Updated result
mysql-test/suite/maria/maria-autozerofill.test:
Added test that zerofilled table should not give any warnings when table is used
mysql-test/suite/maria/maria-recovery2.result:
More tests to make it easier to find bugs
mysql-test/suite/maria/maria-recovery2.test:
More tests to make it easier to find bugs
storage/maria/ha_maria.cc:
Set create_trid after repair (needed if table was moved from another system)
Set uuid after repair (needed if table was moved from another system)
storage/maria/maria_chk.c:
Reset share->state.create_trid if we reset share->state.create_rename_lsn.
Make the table moveable
== Analysis ==
Both change buffer pages and on-disk index pages are marked as
FIL_PAGE_INDEX, so all ibuf index pages are classified as INDEX with NULL
table_name and index_name.
== Solution ==
A new page type for ibuf data pages, named I_S_PAGE_TYPE_IBUF, is defined. All
pages whose index_id equals (DICT_IBUF_ID_MIN + IBUF_SPACE_ID) are
classified as IBUF_DATA instead of INDEX in INNODB_BUFFER_PAGE
and INNODB_BUFFER_PAGE_LRU.
This fix only affects I_S reporting; both the on-disk and buffer pool structures
remain unchanged.
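The effect can be observed with a query such as (a sketch):

  SELECT page_type, COUNT(*)
  FROM information_schema.INNODB_BUFFER_PAGE
  GROUP BY page_type;
  -- change buffer pages now get their own classification (IBUF_DATA
  -- per this fix) instead of being lumped in with INDEX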
Approved by both Marko and Jimmy. rb#2334
outside datafile) on INSERT into an Aria table.
The issue was that the check for whether a table was moved between systems didn't take into account that create_trid could be bigger than the current max trid on the new system.
This could only happen if one tried to move a table that one had just done a 'REPAIR TABLE' on.
Tables that one had run 'aria_chk --zerofill' on worked.
Fixed this by assuming that if create_trid is too big then the table has been moved from one system to another and we have to do an automatic zerofill.
Other fixes:
- Added a check to detect a wrong create_trid in 'check table'.
- aria_chk -dvv will now also write out the create_trid (to make future error finding easier)
- aria_chk --zerofill no longer requires an aria_control_file
- Removed some warnings from safemalloc when using aria_chk, ma_test1 and ma_test2.
include/myisamchk.h:
Removed wrong 'QQ' flags (the flags are used by myisamchk and aria_chk)
storage/maria/ha_maria.cc:
maria_chk_status() can now also return an error.
storage/maria/ma_check.c:
In maria_chk_status() check if create_trid value is too big.
storage/maria/ma_open.c:
Changed check if table is moved so that we can detect wrong create_trid values.
Don't set STATE_NOT_MOVABLE flag if we are doing repair/check. This was done so that aria_chk can print out the movable flag.
storage/maria/ma_test1.c:
Added code to suppress memory leaks from safemalloc
storage/maria/ma_test2.c:
Added code to suppress memory leaks from safemalloc
storage/maria/maria_chk.c:
Added code to suppress memory leaks from safemalloc.
Make help text a bit better for --HELP and --zerofill.
Increased the version number.
Don't require a control file if we are only doing --zerofill
Print out 'create_trid' when doing --describe --verbose
storage/maria/unittest/ma_test_recovery.expected:
Updated result file
innobase_convert_to_filename_charset() was by mistake kept within
the conditional compilation of UNIV_COMPILE_TEST_FUNCS. The function
is now placed outside UNIV_COMPILE_TEST_FUNCS. Also removed the
unnecessary log message (as in 5.6+).
MDEV-3989: Server crashes on import from MariaDB mysqldump export with partitioned Aria table.
The problem was that bulk insert in Aria was not properly protected against concurrent selects.
storage/maria/ha_maria.cc:
Move settings of file->state to _ma_block_start_trans() to ensure that lock_key_trees is not changed by a concurrent bulk_insert.
storage/maria/ma_check.c:
Added DBUG_ASSERT()
storage/maria/ma_open.c:
Set start_trans to ma_start_trans for default behaviour.
storage/maria/ma_pagecrc.c:
Removed the test for 'non_transactional' as now_transactional could be reset while a flush was happening.
storage/maria/ma_state.c:
Moved setting of info->state from external_lock to start_trans to protect against concurrently running bulk inserts.
This works as the other threads will wait in thr_lock() until bulk_insert is done and keys are re-generated.
storage/maria/ma_state.h:
Added _ma_start_trans()
This could happen when using Aria for internal temporary files (default case) and using DISTINCT.
_ma_scan_restore_block_record() didn't work correctly if rows were inserted, updated or deleted through the handler
between calls to _ma_scan_remember_block_record() and _ma_scan_restore_block_record().
The effect was that some DISTINCT queries that used remove_dup_with_compare() could fail.
.bzrignore:
Ignore sql_yacc.hh
mysql-test/suite/maria/r/distinct.result:
Test case for MDEV-4280
mysql-test/suite/maria/t/distinct.test:
Test case for MDEV-4280
mysql-test/t/mysql.test:
Fixed test suite (we could get error -1 in some cases)
sql/sql_select.cc:
Break loop if restart_rnd_next() gives an error
storage/maria/ha_maria.cc:
scan_restore_pos() can return disk fault error.
storage/maria/ma_blockrec.c:
_ma_scan_remember_block_record() incorrectly updated scan.dir instead of scan_save.dir.
_ma_scan_restore_block_record() didn't work correctly if rows were inserted, updated or deleted through the handler
between calls to _ma_scan_remember_block_record() and _ma_scan_restore_block_record().
Fixed by adding counters for row changes and reading the current scan page if changes had been made.
storage/maria/ma_blockrec.h:
scan_restore_pos() can return disk fault error.
storage/maria/ma_delete.c:
Increment row_changes
storage/maria/ma_scan.c:
scan_restore_pos() can return disk fault error.
storage/maria/ma_update.c:
Increment row_changes
storage/maria/ma_write.c:
Increment row_changes
storage/maria/maria_def.h:
scan_restore_pos() can return disk fault error.
Added a comment to clarify the code.
storage/maria/ma_blockrec.c:
Added a comment to clarify the code
In case of out of memory or disk error, mark pages with LSN_IMPOSSIBLE to make it easier to know which pages have wrong information.
TRANSACTION ROLLBACK
Problem:
=======
"prepare_commit_mutex" is acquired during "innobase_xa_prepare"
and it is freed only in "innobase_commit". After prepare,
if the commit operation fails the transaction is rolled back
but the mutex is not released.
Analysis:
========
During the transaction commit process the transaction is prepared and
the "prepare_commit_mutex" is acquired to preserve the order
of commits. After prepare, the write to the binlog is initiated.
File: sql/handler.cc
if (error || (is_real_trans && xid &&
-----> (error= !(cookie= tc_log->log_xid(thd, xid)))))
{
ha_rollback_trans(thd, all);
In the above code the "tc_log->log_xid" operation fails.
When the write to the binlog fails, the transaction is rolled back
without freeing the mutex. A subsequent "INSERT" operation
tries to acquire the same mutex during its commit process
and the server aborts.
Fix:
===
"prepare_commit_mutex" is freed during "innobase_rollback".
storage/innobase/handler/ha_innodb.cc:
Added code to free "prepare_commit_mutex"
Bug #16754901 PARS_INFO_FREE NOT CALLED IN DICT_CREATE_ADD_FOREIGN_TO_DICTIONARY
Problem:
There are two situations here. The constraint name is explicitly
given by the user and the constraint name is automatically generated
by InnoDB. In the case of a generated constraint name, it is formed by
adding the table name as a prefix. Table names are stored internally in
my_charset_filename. A constraint name explicitly given
by the user is stored in UTF-8 format itself. So in some
situations the constraint name is in UTF-8 and in others it is
in my_charset_filename format. Hence this problem.
Solution:
Always store the foreign key constraint name in UTF-8 even when
automatically generated.
Bug #16754901 PARS_INFO_FREE NOT CALLED IN DICT_CREATE_ADD_FOREIGN_TO_DICTIONARY
Problem:
There was a memory leak in the function dict_create_add_foreign_to_dictionary().
The allocated pars_info_t object is not freed in the error code path.
Solution:
Allocate the pars_info_t object after the error checking.
rb#2368 in review
Eliminate a race condition over recv_sys->n_addrs which might result in database corruption
during recovery, without reporting a recovery error.
recv_recover_page_func(): move the code segment that decrements recv_sys->n_addrs
to the end of the function, after the call to mtr_commit()
rb://2282 approved by Inaam
After a clean shutdown, InnoDB will not check the *.ibd file headers,
for maximum performance. This is unchanged before and after this
patch.
What this fix addresses is the case when crash recovery is
needed. Previously, InnoDB could load a corrupted tablespace file.
buf_page_is_corrupted(): Add the parameter check_lsn.
fil_check_first_page(): New function, to perform a consistency check
on the first page of a file. This can be overridden by setting
innodb_force_recovery.
fil_read_first_page(), fil_open_single_table_tablespace(),
fil_load_single_table_tablespace(): Invoke fil_check_first_page().
open_or_create_data_files(): Check the status of
fil_open_single_table_tablespace().
rb#2352 approved by Jimmy Yang
OPENING MISSING PARTITION
In the ha_innobase::open() call, for normal tables, there is no retry logic.
But for partitioned tables, there is retry logic, introduced as a fix for:
http://bugs.mysql.com/bug.php?id=33349
https://support.mysql.com/view.php?id=21080
Bug#33349 does not provide sufficient information to analyze the original
problem. The original problem reported by bug#33349 is also minor (just an
annoyance and no loss of functionality). Most importantly, the retry logic
has been introduced without any associated test case.
So we are removing the retry logic for partitioned tables. When the original
problem occurs, a different solution will be explored.
Change the default for innodb_use_fallocate to FALSE, due to bugs in older Linux kernels (posix_fallocate() does not always guarantee that the file size matches the one specified).
Everything else is stored directly in the status rows.
Should be more thread safe if mysql/mariadb removes LOCK_status now.
git-svn-id: file:///svn/mysql/tokudb-engine/tokudb-engine@55091 c7de825b-a66e-492c-adef-691d508d4ae1