The failures are missing entries in the slow query log. The reason for the failures is sleep() calls with a short duration (10ms), which is less than the default system timer resolution for the various WaitForXXXObject functions (15.6ms) and thus cannot work reliably.
The fix is to make the sleeps a tiny bit longer (from 10ms to 20ms) in the test.
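A minimal mtr-style sketch of the idea (the values are illustrative, not the actual test):

  SET SESSION long_query_time = 0.015;
  # A 10ms sleep can fall below the 15.6ms Windows timer resolution and never
  # register as slow; 20ms is reliably above it:
  SELECT sleep(0.02);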
The optimization of aggregate functions detected a constant under MAX() and evaluated it, but the condition in the WHERE clause (which is always FALSE) was not taken into account.
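A hypothetical query shape for the bug (the WHERE clause is always FALSE, so the correct answer is NULL, not the evaluated constant):

  CREATE TABLE t1 (a INT);
  INSERT INTO t1 VALUES (1);
  SELECT MAX(4) FROM t1 WHERE 1 = 2;  -- correct: NULL; before the fix: 4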
The patch backports two patches from MySQL 5.6:
- BUG#12640437: USING SQL_BUFFER_RESULT RESULTS IN A DIFFERENT QUERY OUTPUT
- Bug#12578908: SELECT SQL_BUFFER_RESULT OUTPUTS TOO MANY ROWS WHEN GROUP IS OPTIMIZED AWAY
Original comment:
-----------------
3714 Jorgen Loland 2012-03-01
BUG#12640437 - USING SQL_BUFFER_RESULT RESULTS IN A DIFFERENT
QUERY OUTPUT
For all but simple grouped queries, temporary tables are used to
resolve grouping. In these cases, the list of grouping fields is
stored in the temporary table and grouping is resolved
there (e.g. by adding a unique constraint on the involved
fields). Because of this, grouping is already done when the rows
are read from the temporary table.
In the case where a group clause may be optimized away, grouping
does not have to be resolved using a temporary table. However, if
a temporary table is explicitly requested (e.g. because the
SQL_BUFFER_RESULT hint is used, or the statement is
INSERT...SELECT), a temporary table is used anyway. In this case,
the temporary table is created with an empty group list (because
the group clause was optimized away) and it will therefore not
create groups. Since the temporary table does not take care of
grouping, JOIN::group shall not be set to false in
make_simple_join(). This was fixed in bug 12578908.
However, there is an exception where make_simple_join() should
set JOIN::group to false even if the query uses a temporary table
that was explicitly requested but is not strictly needed. That
exception is if the loose index scan access method (explain
says "Using index for group-by") is used to read into the
temporary table. With loose index scan, grouping is resolved
by the access method. This is exactly what happens in this bug.
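A hypothetical illustration of that exception:

  CREATE TABLE t1 (a INT, b INT, KEY(a, b));
  INSERT INTO t1 VALUES (1, 10), (1, 20), (2, 30);
  -- EXPLAIN shows "Using index for group-by": the loose index scan already
  -- delivers one row per group, so the SQL_BUFFER_RESULT temporary table
  -- must not try to group again.
  SELECT SQL_BUFFER_RESULT a, MIN(b) FROM t1 GROUP BY a;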
This is a backport of the fix for MySQL bug #13723054 in 5.6.
Original comment:
The crash is caused by overwriting an arbitrary memory area when an attempt
is made to copy a BLOB field's key image into the record buffer (the record
buffer is too small to hold the BLOB key part image).
note:
QUICK_GROUP_MIN_MAX_SELECT cannot work with BLOB fields
because it uses the record buffer as a temporary buffer for key values.
However, this case is filtered out by the covering_keys() check
in get_best_group_min_max(), as BLOBs always require a key length
modifier in the key declaration, and a key that includes a BLOB
can never be a covering key.
The fix is to use the 'max_used_key_length' key length instead of 0.
Analysis:
Specifically, the crash in this bug was a result of the call to key_copy()
that copied the whole key, including the BLOB field, which is not used
for index access. Copying the BLOB field overwrote memory as far as the
function parameter 'key_info'. As a result the contents of key_info were
all 0, which resulted in a crash when this key_info was accessed a few
lines below in key_cmp().
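A hypothetical repro shape (not the original test case): a MIN/MAX look-up on a key that also contains a BLOB prefix part, so a key_copy() using the full key length would copy the BLOB image as well:

  CREATE TABLE t1 (a INT, b BLOB, KEY k1 (a, b(200)));
  INSERT INTO t1 VALUES (1, REPEAT('x', 1000)), (2, REPEAT('y', 1000));
  SELECT MIN(a) FROM t1 WHERE a > 0;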
The problem was in the code (update_const_equal_items()) that marked
index parts constant independently of the place where the equality was used.
In the test suite it marked the t2_1.c part constant despite the fact that
it was connected by OR with another expression.
The solution is to mark as constant only top-level equalities connected by AND.
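A hypothetical query shape: the equality on t2_1.c sits under OR, so its key part must not be marked constant:

  CREATE TABLE t1 (a INT);
  CREATE TABLE t2 (c INT, d INT, KEY(c));
  SELECT * FROM t1, t2 AS t2_1
  WHERE t1.a = t2_1.c AND (t2_1.c = 1 OR t2_1.d > 10);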
The main problem was a bug in CSV where it provided wrong statistics (it claimed the table was empty when it wasn't).
I also fixed wrong freeing of blobs in the CSV handler. (Any call to handler::read_first_row() on a CSV table with blobs would fail.)
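A hypothetical repro sketch for both issues:

  CREATE TABLE t1 (a INT NOT NULL, b BLOB NOT NULL) ENGINE=CSV;
  INSERT INTO t1 VALUES (1, 'data');
  -- Wrong statistics made the optimizer treat t1 as empty (a const table),
  -- and reading the first row of a CSV table with blobs could fail:
  SELECT * FROM t1;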
mysql-test/r/csv.result:
Added new test case
mysql-test/r/partition_innodb.result:
Updated test results after fixing bug with impossible partitions and const tables
mysql-test/t/csv.test:
Added new test case
sql/sql_select.cc:
Cleaned up code for handling of partitions.
Also fixed a bug where we didn't treat a table with impossible partitions as a const table.
storage/csv/ha_tina.cc:
Allocate blobroot once.
Fixed that repair removes the 'table is moved' mark.
mysql-test/suite/maria/r/maria-autozerofill.result:
Test case for lp:967914
mysql-test/suite/maria/t/maria-autozerofill.test:
Test case for lp:967914
storage/maria/ha_maria.cc:
Fixed that repair removes the 'table is moved' mark.
The issue was that check/optimize/analyze did not zerofill the table before they started to work on it.
Added one more argument to the rarely used function handler::auto_repair() to allow the handler to decide when to auto-repair.
mysql-test/suite/maria/r/maria-autozerofill.result:
Test case for lp:944422
mysql-test/suite/maria/t/maria-autozerofill.test:
Test case for lp:944422
sql/ha_partition.cc:
Added argument to auto_repair()
sql/ha_partition.h:
Added argument to auto_repair()
sql/handler.h:
Added argument to auto_repair()
sql/table.cc:
Let auto_repair() decide which errors to trigger auto-repair
storage/archive/ha_archive.h:
Added argument to auto_repair()
storage/csv/ha_tina.h:
Added argument to auto_repair()
storage/maria/ha_maria.cc:
Give better error & warning messages for auto-repaired tables.
storage/maria/ha_maria.h:
Added argument to auto_repair()
Always auto-repair in case of a moved table.
storage/maria/ma_open.c:
Remove special handling of HA_ERR_OLD_FILE (this is now handled in auto_repair())
storage/myisam/ha_myisam.h:
Added argument to auto_repair()
This bug was introduced into MariaDB 5.2 in December 2010 with
the patch that added a new engine property: the ability to support
virtual columns.
As a result of this bug, the information from .frm files for tables
that contained virtual columns did not appear in the information schema
tables.
If, in the WHERE clause of a query, some comparison conditions on the
field under a MIN/MAX aggregate function contained constants whose sizes
exceeded the size of the field, then the query could return a wrong result
when the optimizer had chosen to apply the MIN/MAX optimization.
With such conditions the MIN/MAX optimization could still be applied, yet
it would require a more thorough analysis of the keys built to find
the value of the MIN/MAX aggregate functions with index look-ups.
The current patch just prohibits using the MIN/MAX optimization in this
situation.
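A hypothetical example of the shape in question (the constant 'abcdef' is longer than the CHAR(3) field):

  CREATE TABLE t1 (a CHAR(3), KEY(a));
  INSERT INTO t1 VALUES ('abc'), ('abd');
  SELECT MAX(a) FROM t1 WHERE a < 'abcdef';  -- correct answer: 'abc'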
The issue was that Aria allowed keys that were too long to be created (so that the internal buffer was not big enough to hold the whole key).
Key lengths are now limited to HA_MAX_KEY_LENGTH (1000 bytes), as for MyISAM.
Fixed failure in "_ma_apply_redo_index: Assertion `new_page_length == 0", as found by buildbot.
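An illustration of the new limit (shown for latin1, where one character is one byte):

  CREATE TABLE t1 (a VARCHAR(1100), KEY(a)) ENGINE=Aria CHARSET=latin1;
  -- ERROR 1071: Specified key was too long; max key length is 1000 bytes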
mysql-test/suite/maria/r/maria.result:
Updated results
mysql-test/suite/maria/r/maria3.result:
Updated results. Added test for bug fix
mysql-test/suite/maria/t/maria3.test:
Updated results. Added test for bug fix
mysql-test/suite/maria/t/optimize.test:
Updated test for new max key length
storage/maria/ha_maria.cc:
Limit key to HA_MAX_KEY_LENGTH.
storage/maria/ma_key_recover.c:
Limit used page length to max page size (this is in line with the code that writes the entry to the log).
This fixes failure in "_ma_apply_redo_index: Assertion `new_page_length == 0", as found by buildbot.
storage/maria/ma_search.c:
Extra DBUG
storage/maria/ma_write.c:
Added test to detect errors earlier.
The problem was a crash in internal temporary (Maria) files when the row length exceeded 65535 bytes.
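A hypothetical repro shape: each base-table row is legal on its own, but the joined row materialized in the internal temporary table exceeds 65535 bytes:

  CREATE TABLE t1 (a VARCHAR(40000)) CHARSET=latin1;
  CREATE TABLE t2 (b VARCHAR(40000)) CHARSET=latin1;
  INSERT INTO t1 VALUES (REPEAT('x', 40000));
  INSERT INTO t2 VALUES (REPEAT('y', 40000));
  -- DISTINCT forces an internal temporary (Maria) table with an ~80000-byte row:
  SELECT DISTINCT t1.a, t2.b FROM t1, t2;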
mysql-test/suite/maria/r/maria3.result:
Added test case
mysql-test/suite/maria/t/maria3.test:
Added test case
storage/maria/ma_open.c:
Added support for row length > 65535.
This fixes crash when using tables with longer row lengths.
Fixed suppression in mysql-test-run so it also works on Windows.
mysql-test/mysql-test-run.pl:
Fixed suppression so it also works on Windows.
mysql-test/valgrind.supp:
More general handling of memory loss in dlclose (backported from 5.2)
sql/signal_handler.cc:
Added newlines around link to how to do bug reports
Fixed README with link to source
Merged InnoDB change to XtraDB
README:
Added information of where to find MariaDB code
storage/archive/ha_archive.cc:
Removed memset() of rows; MariaDB's checksum doesn't touch unused data.
This happened when there were more than 1024 open Aria tables during a checkpoint.
mysql-test/mysql-test-run.pl:
Fixed that variable names are consistent between external and internal server.
mysql-test/suite/maria/suite.pm:
Test for aria-block-size instead of 'aria', as 'aria' is not set for the embedded server.
This should be OK for the Aria tests, as Aria is never disabled for these.
storage/maria/ma_checkpoint.c:
Fixed bug when there are more than 1024 open Aria tables during checkpoint.
This bug was originally filed and fixed as Bug#12612184. The original
fix was buggy, and it was patched by Bug#12704861. Also that patch was
buggy (potentially breaking crash recovery), and both fixes were
reverted.
This fix was not ported to the built-in InnoDB of MySQL 5.1, because
the function signatures of many core functions are different from
InnoDB Plugin and later versions. The block allocation routines and
their callers would have to be changed so that they handle block
descriptors instead of page frames.
When a record is updated so that its size grows, non-updated columns
can be selected for external (off-page) storage. The bug is that the
initially inserted updated record contains an all-zero BLOB pointer to
the field that was not updated. Only after the BLOB pages have been
allocated and written can the valid pointer be written to the record.
Between the release of the page latch in mtr_commit(mtr) after
btr_cur_pessimistic_update() and the re-latching of the page in
btr_pcur_restore_position(), other threads can see the invalid BLOB
pointer consisting of 20 zero bytes. Moreover, if the system crashes
at this point, the situation could persist after crash recovery, and
the contents of the non-updated column would be permanently lost.
The problem is amplified by ROW_FORMAT=DYNAMIC and
ROW_FORMAT=COMPRESSED, which were introduced with
innodb_file_format=Barracuda in the InnoDB Plugin, but the bug does exist
in all InnoDB versions.
The fix is as follows. After a pessimistic B-tree operation that needs
to write out off-page columns, allocate the pages for these columns in
the mini-transaction that performed the B-tree operation (btr_mtr),
but write the pages in a separate mini-transaction (blob_mtr). Do
mtr_commit(blob_mtr) before mtr_commit(btr_mtr). A quirk: Do not reuse
pages that were previously freed in btr_mtr. Only write the off-page
columns to 'fresh' pages.
In this way, crash recovery will see redo log entries for blob_mtr
before any redo log entry for btr_mtr. It will apply the BLOB page
writes to pages that were marked free at that point. If crash recovery
fails to see all of the btr_mtr redo log, there will be some
unreachable BLOB data in free pages, but the B-tree will be in a
consistent state.
btr_page_alloc_low(): Renamed from btr_page_alloc(). Add the parameter
init_mtr. Return an allocated block, or NULL. If init_mtr!=mtr but
the page was already X-latched in mtr, do not initialize the page.
btr_page_alloc(): Wrapper for btr_page_alloc_for_ibuf() and
btr_page_alloc_low().
btr_page_free(): Add a debug assertion that the page was a B-tree page.
btr_lift_page_up(): Return the father block.
btr_compress(), btr_cur_compress_if_useful(): Add the parameter ibool
adjust, for adjusting the cursor position.
btr_cur_pessimistic_update(): Preserve the cursor position when
big_rec will be written and the new flag BTR_KEEP_POS_FLAG is defined.
Remove a duplicate rec_get_offsets() call. Keep the X-latch on
index->lock when big_rec is needed.
btr_store_big_rec_extern_fields(): Replace update_inplace with
an operation code, and local_mtr with btr_mtr. When not doing a
fresh insert and btr_mtr has freed pages, put aside any pages that
were previously X-latched in btr_mtr, and free the pages after
writing out all data. The data must be written to 'fresh' pages,
because btr_mtr will be committed and written to the redo log after
the BLOB writes have been written to the redo log.
btr_blob_op_is_update(): Check if an operation passed to
btr_store_big_rec_extern_fields() is an update or insert-by-update.
fseg_alloc_free_page_low(), fsp_alloc_free_page(),
fseg_alloc_free_extent(), fseg_alloc_free_page_general(): Add the
parameter init_mtr. Return an allocated block, or NULL. If
init_mtr!=mtr but the page was already X-latched in mtr, do not
initialize the page.
xdes_get_descriptor_with_space_hdr(): Assert that the file space
header is being X-latched.
fsp_alloc_from_free_frag(): Refactored from fsp_alloc_free_page().
fsp_page_create(): New function, for allocating, X-latching and
potentially initializing a page. If init_mtr!=mtr but the page was
already X-latched in mtr, do not initialize the page.
fsp_free_page(): Add ut_ad(0) to the error outcomes.
fsp_free_page(), fseg_free_page_low(): Increment mtr->n_freed_pages.
fsp_alloc_seg_inode_page(), fseg_create_general(): Assert that the
page was not previously X-latched in the mini-transaction. A file
segment or inode page should never be allocated in the middle of a
mini-transaction that frees pages, such as btr_cur_pessimistic_delete().
fseg_alloc_free_page_low(): If the hinted page was allocated, skip the
check if the tablespace should be extended. Return NULL instead of
FIL_NULL on failure. Remove the flag frag_page_allocated. Instead,
return directly, because the page would already have been initialized.
fseg_find_free_frag_page_slot() would return ULINT_UNDEFINED on error,
not FIL_NULL. Correct a bogus assertion.
fseg_alloc_free_page(): Redefine as a wrapper macro around
fseg_alloc_free_page_general().
buf_block_buf_fix_inc(): Move the definition from the buf0buf.ic to
buf0buf.h, so that it can be called from other modules.
mtr_t: Add n_freed_pages (number of pages that have been freed).
page_rec_get_nth_const(), page_rec_get_nth(): The inverse function of
page_rec_get_n_recs_before(), get the nth record of the record
list. This is faster than iterating the linked list. Refactored from
page_get_middle_rec().
trx_undo_rec_copy(): Add a debug assertion for the length.
trx_undo_add_page(): Return a block descriptor or NULL instead of a
page number or FIL_NULL.
trx_undo_report_row_operation(): Add debug assertions.
trx_sys_create_doublewrite_buf(): Assert that each page was not
previously X-latched.
page_cur_insert_rec_zip_reorg(): Make use of page_rec_get_nth().
row_ins_clust_index_entry_by_modify(): Pass BTR_KEEP_POS_FLAG, so that
the repositioning of the cursor can be avoided.
row_ins_index_entry_low(): Add DEBUG_SYNC points before and after
writing off-page columns. If inserting by updating a delete-marked
record, do not reposition the cursor or commit the mini-transaction
before writing the off-page columns.
row_build(): Tighten a debug assertion about null BLOB pointers.
row_upd_clust_rec(): Add DEBUG_SYNC points before and after writing
off-page columns. Do not reposition the cursor or commit the
mini-transaction before writing the off-page columns.
rb:939 approved by Jimmy Yang
IS EXECUTED TWICE FROM P
This bug is a duplicate of bug 12567331, which was pushed to the
optimizer backporting tree on 2011-06-11. This is just a back-port of
the fix. Both test cases are included as they differ somewhat.
GRACEFUL SHUTDOWN
During startup mysql picks up .frm files from the tmpdir directory and
tries to drop those tables in the storage engine.
The problem is that when tmpdir ends in "/", ha_innobase::delete_table()
is passed a string like "/var/tmp//#sql123", which it wrongly normalizes
to "/#sql123" and then calls row_drop_table_for_mysql(), which of course fails
to delete the table entry from the InnoDB dictionary cache.
ha_innobase::delete_table() returns an error but nevertheless mysql wipes
away the .frm file and the entry in the InnoDB dictionary cache remains
orphaned with no easy way to remove it.
The "no easy" way to remove it is to create a similar temporary table again,
copy its .frm file to tmpdir under "#sql123.frm" and restart mysqld with
tmpdir=/var/tmp (no trailing slash) - this way mysql will pick the .frm file
after restart and will try to issue drop table for "/var/tmp/#sql123"
(notice: no double slash), ha_innobase::delete_table() will normalize it to
"tmp/#sql123" and row_drop_table_for_mysql() will successfully remove the
table entry from the dictionary cache.
The solution is to fix normalize_table_name_low() to normalize things like
"/var/tmp//table" correctly to "tmp/table".
This patch also adds a test function which invokes
normalize_table_name_low() with various inputs to make sure it works
correctly and a mtr test that calls this test function.
Reviewed by: Marko (http://bur03.no.oracle.com/rb/r/929/)
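On a debug build, such a unit-test function is typically reached from mtr through a DBUG keyword; a sketch of that pattern (the keyword name here is an assumption, not necessarily the one this patch uses):

  # Hypothetical keyword; fires the C unit test via DBUG_EXECUTE_IF:
  SET SESSION debug = '+d,test_normalize_table_name_low';
  CREATE TABLE t1 (a INT) ENGINE=InnoDB;
  DROP TABLE t1;
  SET SESSION debug = '-d,test_normalize_table_name_low';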