TO 'MYISAM_SORT_BUFFER_SIZE'
Problem: 'myisam_sort_buffer_size' is a parameter used by
mysqld program only whereas 'sort_buffer_size' is used by
mysqld and myisamchk programs. But the error message printed
when myisamchk program is run with insufficient buffer size
is myisam_sort_buffer_size is too small which may mislead to the
server parameter myisam_sort_buffer_size.
SOLUTION: A parameter 'myisam_sort_buffer_size' is added as an
alias for 'sort_buffer_size' and the 'sort_buffer_size' parameter
is marked as deprecated. So myisamchk also has both the parameters
with the same role.
IN ALTER TABLE ... ADD UNIQUE KEY
A bogus debug assertion failure occurred when reporting a duplicate
key on a column prefix of a CHAR column.
This is a regression from Bug#14729221 IN-PLACE ALTER TABLE REPORTS ''
INSTEAD OF REAL DUPLICATE VALUE FOR PREFIX KEYS. The assertion is only
present when UNIV_DEBUG is defined (which it is in debug builds
starting from MySQL 5.5). It is a case of overasserting.
Fix approved by Inaam Rana on IM.
LEN <= SIZEOF(ULONGLONG)
This bug was caught in the WL#6255 ALTER TABLE...ADD COLUMN in MySQL
5.6, but there is a bug in all InnoDB versions that support
auto-increment columns.
row_search_autoinc_read_column(): When reading the maximum value of
the auto-increment column, and the column only contains NULL values,
return 0. This corresponds to the case when the table is empty in
row_search_max_autoinc().
rb:1415 approved by Sunny Bains
REAL DUPLICATE VALUE FOR PREFIX KEYS
innobase_rec_to_mysql(): Invoke dict_index_get_nth_col_or_prefix_pos()
instead of dict_index_get_nth_col_pos() to find the column.
SECONDARY INDEX UPDATES MAKE CONSISTENT READS DO O(N^2) UNDO PAGE
LOOKUPS (honoring kill query while accessing sec_index)
If secondary index is being used for select query evaluation and this
query is operating with consistent read snapshot it might take good time for
secondary index to return back control to mysql as MVCC would kick in.
If user issues "kill query <id>" while query is actively accessing
secondary index it will not be honored as there is no hook to check
for this condition. Added hook for this check.
-----
Parallely secondary index taking too long to evaluate for consistent
read snapshot case is being examined for performance improvement. WL#6540.
The problem is in the error handling in row_create_table_for_mysql().
In the 'disk full' case we may forget to call dict_mem_table_free() on
the table object.
Approved by: Marko (rb:1377 and rb:1386)
We did not allocate enough bits for index->trx_id_offset, causing an
UPDATE or DELETE of a table with a PRIMARY KEY longer than 1024 bytes
to corrupt the PRIMARY KEY.
dict_index_t: Allocate enough bits.
dict_index_build_internal_clust(): Check for overflow of
index->trx_id_offset. Trip a debug assertion when overflow occurs.
rb:1380 approved by Jimmy Yang
TRANSACTION ROLLBACK
Description: During the rollback operation, a blob page
is removed earlier than desired. Consider following scenario:
1. create table t1(a int primary key,b blob) engine=innodb;
2. insert into t1 values (1,repeat('b',9000));
3. begin;
4. update t1 set b=concat(b,'b');
5. update t1 set a=a+1;
6. insert into t1 values (1,repeat('b',9000));
7. rollback;
The update operation in line 5 produces 2 undo log record. The first
undo record (TRX_UNDO_DEL_MARK_REC) goes to trx->update_undo and the
second undo record (TRX_UNDO_INSERT_REC) goes to trx->insert_undo.
During rollback, they are executed out of order.
When the undo record TRX_UNDO_DEL_MARK_REC is applied/executed,
the blob ownership is also reset. Because of this the blob page
is released earlier than desired. This blob page must have been
freed only as part of applying/executing the undo record
TRX_UNDO_INSERT_REC.
This problem can be avoided by executing the undo records in
order. This patch will make innodb to execute the undo records
in order.
rb://1125 approved by Marko.
Delete-mark change buffer records when resorting to a pessimistic
delete from the change buffer B-tree. Skip delete-marked records in
the change buffer merge and when estimating whether an operation can
be buffered. Without this fix, we could try to apply the same buffered
changes multiple times if the server was killed at the right moment.
In MySQL 5.5 and later: ibuf_get_volume_buffered_count_func(): Ignore
delete-marked (already processed) records.
ibuf_delete_rec(): Add a crash point before optimistic delete. If the
optimistic delete fails, flag the record processed before
mtr_commit().
ibuf_merge_or_delete_for_page(): Ignore delete-marked (already
processed) records.
Backport to 5.1: Rename btr_cur_del_unmark_for_ibuf() to
btr_cur_set_deleted_flag_for_ibuf() and add a parameter.
rb:1307 approved by Jimmy Yang
page_zip_validate(), page_zip_validate_low(): Add a parameter for the
B-tree index.
page_zip_validate_low(): If the page contents does not match, check
that the record link chains match. Furthermore, if dict_index_t is
passed, check that the records match. (This reduces coverage a bit: if
index=NULL, we will ignore differences in record contents, that is,
the page payload.)
rb:1264 approved by Inaam Rana
THOUGH IT IS NOT.
The following error message is misleading because it claims
that the BLOB space is not counted.
"ERROR 1118 (42000): Row size too large. The maximum row size for
the used table type, not counting BLOBs, is 8126. You have to
change some columns to TEXT or BLOBs"
When the ROW_FORMAT=compact or ROW_FORMAT=REDUNDANT is used,
the BLOB prefix is stored inline along with the row. So
the above error message is changed as follows depending on
the row format used:
For ROW_FORMAT=COMPRESSED or ROW_FORMAT=DYNAMIC, the error
message is as follows:
"ERROR 42000: Row size too large (> 8126). Changing some
columns to TEXT or BLOB may help. In current row format,
BLOB prefix of 0 bytes is stored inline."
For ROW_FORMAT=COMPACT or ROW_FORMAT=REDUNDANT, the error
message is as follows:
"ERROR 42000: Row size too large (> 8126). Changing some
columns to TEXT or BLOB or using ROW_FORMAT=DYNAMIC or
ROW_FORMAT=COMPRESSED may help. In current row
format, BLOB prefix of 768 bytes is stored inline."
rb://1252 approved by Marko Makela
PAGE SPLIT
page_rec_get_nth_const(): Map nth==0 to the page infimum.
btr_compress(adjust=TRUE): Add a debug assertion for nth>0. The cursor
should never be positioned on the page infimum.
btr_index_page_validate(): Add test instrumentation for checking the
return values of page_rec_get_nth_const() during CHECK TABLE, and for
checking that the page directory slot 0 always contains only one
record, the predefined page infimum record.
page_cur_delete_rec(), page_delete_rec_list_end(): Add debug
assertions guarding against accessing the page slot 0.
page_copy_rec_list_start(): Clarify a comment about ret_pos==0.
rb:1248 approved by Jimmy Yang
ha_innodb::records_in_range(): Remove a debug assertion
that prohibits an open range (full table).
The patch by Jorgen Loland only removed the assertion from the
built-in InnoDB, not from the InnoDB Plugin.
ha_innobase::records_in_range(): Remove a debug assertion
that prohibits an open range (full table).
This assertion catches unnecessary calls to this method,
but such calls are not harming correctness.
HEURISTICS FOR COMPRESSED PAGE SIZE
The fix of Bug#12845774 was supposed to skip known-to-fail
btr_cur_optimistic_insert() calls. There was only one such call, in
btr_cur_pessimistic_update(). All other callers of
btr_cur_pessimistic_insert() would release and reacquire the B-tree
page latch before attempting the pessimistic insert. This would allow
other threads to restructure the B-tree, allowing (and requiring) the
insert to succeed as an optimistic (single-page) operation.
Failure to attempt an optimistic insert before a pessimistic one would
trigger an attempt to split an empty page.
rb:1234 approved by Sunny Bains
Facebook got a case where the page compresses really well so that
btr_cur_optimistic_update() returns DB_UNDERFLOW, but when a record
gets updated, the compression rate radically changes so that
btr_cur_insert_if_possible() can not insert in place despite
reorganizing/recompressing the page, leading to the assertion failing.
rb:1220 approved by Sunny Bains
COMPRESSED PAGE SIZE
This was submitted as MySQL Bug 61456 and a patch provided by
Facebook. This patch follows the same idea, but instead of adding a
parameter to btr_cur_pessimistic_insert(), we simply remove the
btr_cur_optimistic_insert() call there and add it to the only caller
that needs it.
btr_cur_pessimistic_insert(): Do not try btr_cur_optimistic_insert().
btr_insert_on_non_leaf_level_func(): Invoke btr_cur_optimistic_insert()
before invoking btr_cur_pessimistic_insert().
btr_cur_pessimistic_update(): Clarify in a comment why it is not
necessary to invoke btr_cur_optimistic_insert().
btr_root_raise_and_insert(): Assert that the root page is not empty.
This could happen if a pessimistic insert (involving a split or merge)
is performed without first attempting an optimistic (intra-page) insert.
rb:1219 approved by Sunny Bains
btr_cur_optimistic_insert(): Remove a bogus assertion. The insert may
fail after reorganizing the page.
btr_cur_optimistic_update(): Do not attempt to reorganize compressed pages,
because compression may fail after reorganization.
page_copy_rec_list_start(): Use page_rec_get_nth() to restore to the
ret_pos, which may also be the page infimum.
rb:1221
IN QUERIES
This bug was caused by an incorrect fix of
Bug#13807811 BTR_PCUR_RESTORE_POSITION() CAN SKIP A RECORD
There was nothing wrong with btr_pcur_restore_position(), but with the
use of it in the table scan during index creation.
rb:1206 approved by Jimmy Yang
Problem description:
Table 't' created with two colums having compound index on both the
columns under innodb/myisam engine at remote machine. In the local
machine same table is created undet the federated engine.
A select having where clause with along 'AND' operation gives wrong
results on local machine.
Analysis:
The given query at federated engine is wrongly transformed by
federated::create_where_from_key() function and the same was sent to
the remote machine. Hence the local machine is showing wrong results.
Given query "select c1 from t where c1 <= 2 and c2 = 1;"
Query transformed, after ha_federated::create_where_from_key() function is:
SELECT `c1`, `c2` FROM `t` WHERE (`c1` IS NOT NULL ) AND
( (`c1` >= 2) AND (`c2` <= 1) ) and the same sent to real_query().
In the above the '<=' and '=' conditions were transformed to '>=' and
'<=' respectively.
ha_federated::create_where_from_key() function behaving as below:
The key_range is having both the start_key and end_key. The start_key
is used to get "(`c1` IS NOT NULL )" part of the where clause, this
transformation is correct. The end_key is used to get "( (`c1` >= 2)
AND (`c2` <= 1) )", which is wrong, here the given conditions('<=' and '=')
are changed as wrong conditions('>=' and '<=').
The end_key is having {key = 0x39fa6d0 "", length = 10, keypart_map = 3,
flag = HA_READ_AFTER_KEY}
The store_length is having value '5'. Based on store_length and length
values the condition values is applied in HA_READ_AFTER_KEY switch case.
The switch case 'HA_READ_AFTER_KEY' is applicable to only the last part of
the end_key and for previous parts it is going to 'HA_READ_KEY_OR_NEXT' case,
here the '>=' is getting added as a condition instead of '<='.
Fix:
Updated the 'if' condition in 'HA_READ_AFTER_KEY' case to affect for all
parts of the end_key. i.e 'i > 0' will used for end_key, Hence added it in
the if condition.
Backporting the WL#5716, "Information schema table for InnoDB
buffer pool information". Backporting revisions 2876.244.113,
2876.244.102 from mysql-trunk.
rb://1175 approved by Jimmy Yang.
WHEN KILLING
Suppose there is a query waiting for a lock. If the user kills
this query, then "Got error -1 when reading table" error message
must not be logged in the server log file. Since this is a user
requested interruption, no spurious error message must be logged
in the server log. This patch will remove the error message from
the log.
approved by joh and tatjana
rb://1088
approved by: Marko Makela
This bug was introduced in early stages of plugin. We were not
checking for an implicit lock on sec index rec for trx_id that is
stamped on current version of the clust_index in case where the
clust_index has a previous delete marked version.
The following scenario crashes our mysql server:
1. set global innodb_file_per_table=1;
2. create table t1(c1 int) engine=innodb;
3. alter table t1 discard tablespace;
4. alter table t1 add unique index(c1);
Step 4 crashes the server. This patch introduces a check on discarded
tablespace to avoid the crash.
rb://1041 approved by Marko Makela
FULLTEXT INDEX AND CONCURRENT DML.
Problem Statement:
------------------
1) Create a table with FT index.
2) Enable concurrent inserts.
3) In multiple threads do below operations repeatedly
a) truncate table
b) insert into table ....
c) select ... match .. against .. non-boolean/boolean mode
After some time we could observe two different assert core dumps
Analysis:
--------
1)assert core dump at key_read_cache():
Two select threads operating in-parallel on same key
root block.
1st select thread block->status is set to BLOCK_ERROR
because the my_pread() in read_block() is returning '0'.
Truncate table made the index file size as 1024 and pread
was asked to get the block of count bytes(1024 bytes)
from offset of 1024 which it cannot read since its
"end of file" and retuning '0' setting
"my_errno= HA_ERR_FILE_TOO_SHORT" and the key_file_length,
key_root[0] is same i.e. 1024. Since block status has BLOCK_ERROR
the 1st select thread enter into the free_block() and will
be under wait on conditional mutex by making status as
BLOCK_REASSIGNED and goes for wait_on_readers(). Other select
thread will also work on the same block and sees the status as
BLOCK_ERROR and enters into free_block(), checks for BLOCK_REASSIGNED
and asserting the server.
2)assert core dump at key_write_cache():
One select thread and One insert thread.
Select thread gets the unlocks the 'keycache->cache_lock',
which allows other threads to continue and gets the pread()
return value as'0'(please see the explanation above) and
tries to get the lock on 'keycache->cache_lock' and waits
there for the lock.
Insert thread requests for the block, block will be assigned
from the hash list and makes the page_status as
'PAGE_WAIT_TO_BE_READ' and goes for the read_block(), waits
in the queue since there are some other threads performing
reads on the same block.
Select thread which was waiting for the 'keycache->cache_lock'
mutex in the read_block() will continue after getting the my_pread()
value as '0' and sets the block status as BLOCK_ERROR and goes to
the free_block() and go to the wait_for_readers().
Now the insert thread will awake and continues. and checks
block->status as not BLOCK_READ and it asserts.
Fix:
---
In the full text code, multiple readers of index file is not guarded.
Hence added below below code in _ft2_search() and walk_and_match().
to lock the key_root I have used below code in _ft2_search()
if (info->s->concurrent_insert)
mysql_rwlock_rdlock(&share->key_root_lock[0]);
and to unlock
if (info->s->concurrent_insert)
mysql_rwlock_unlock(&share->key_root_lock[0]);
dict_table_replace_index_in_foreign_list(): Replace the dropped index
also in the foreign key constraints of child tables that are
referencing this table.
row_ins_check_foreign_constraint(): If the underlying index is
missing, refuse the operation.
rb:1051 approved by Jimmy Yang
BY A CONCURRENT TRANSACTIO
The member function QUICK_RANGE_SELECT::init_ror_merged_scan() performs
a table handler clone. Innodb does not provide a clone operation.
The ha_innobase::clone() is not there. The handler::clone() does not
take care of the ha_innobase->prebuilt->select_lock_type. Because of
this what happens is that for one index we do a locking read, and
for the other index we were doing a non-locking (consistent) read.
The patch introduces ha_innobase::clone() member function.
It is implemented similar to ha_myisam::clone(). It calls the
base class handler::clone() and then does any additional operation
required. I am setting the ha_innobase->prebuilt->select_lock_type
correctly.
rb://1060 approved by Marko
Bug#13639204 64111: CRASH ON SELECT SUBQUERY WITH NON UNIQUE INDEX
The crash happened due to wrong calculation
of key length during creation of reference for
sort order index. The problem is that
keyuse->used_tables can have OUTER_REF_TABLE_BIT enabled
but used_tables parameter(create_ref_for_key() func) does
not have it. So key parts which have OUTER_REF_TABLE_BIT
are ommited and it could lead to incorrect key length
calculation(zero key length).
FROM BUFFER POOL
rb://975
approved by: Marko Makela
There is a race in lock_validate() where we try to access a page
without ensuring that the tablespace stays valid during the operation
i.e.: it is not deleted. This patch tries to fix that by using an
existing flag (the flag is renamed to make it's name more generic
in line with it's new use).
IN OS_THREAD_EQ
rb://977
approved by: Marko Makela
rw_lock::writer_thread field contains the thread id of current x-holder
or wait-x thread. This field is un-initialized at lock creation and is
written to for the first time when an attempt is made to x-lock.
Current code considers ::writer_thread as valid memory region only when
the lock is held in x-mode (or there is an x-waiter). This is an
overkill and it generates valgrind warnings.
The fix is to consider ::writer_thread as valid memory region once it
has been written to.
Reasoning:
==========
The ::writer_thread can be safely considered valid because:
* We only ever do comparison with current calling threads id.
* We only ever do comparison when ::recursive flag is set
* We always unset ::recursive flag in x-unlock
* Same thread cannot be unlocking and attempting to lock at the same
time
* thread_id recycling is not an issue because before an id is recycled
the thread must leave innodb meaning it must release all locks meaning
it must unset ::recursive flag.
truncating, inserting the same set of rows. When a table is
re-created with the same set of rows, the data file size must
not grow.
rb:968
Approved by Marko.
This bug has been there at least since MySQL 4.0.9. (Before 4.0.9, the
code probably was even more severely broken.)
btr_pcur_restore_position(): When cursor restoration fails, before
invoking btr_pcur_store_position() move to the previous or next record
unless cursor->rel_pos==BTR_PCUR_ON or the record was not a user
record.
This bug can cause skipped records when btr_pcur_store_position() is
called on the last record of a page. A symptom would be record count
mismatch in CHECK TABLE, or failure to find a record to delete-mark or
update or purge. The following operations should be affected by the
bug:
* row_search_for_mysql(): SELECT, UPDATE, REPLACE, CHECK TABLE,
(almost anything else than INSERT)
* foreign key CASCADE operations
* row_merge_read_clustered_index(): index creation (since MySQL 5.1
InnoDB Plugin)
* multi-threaded purge (after MySQL 5.5): not sure, but it might fail
to purge some records
Not all callers of btr_pcur_restore_position() should be affected.
Anything that asserts or checks that restoration succeeds is
unaffected. For example, cursor restoration on the change buffer tree
should always succeed, because access is being protected by additional
latches. Likewise, rollback, or any code accesses data dictionary
tables while holding dict_sys->mutex should be safe.
rb:967 approved by Jimmy Yang
There are two threads. In one thread, dml operation is going on
involving cascaded update operation. In another thread, alter
table add foreign key constraint is happening. Under these
circumstances, it is possible for the dml thread to access a
dict_foreign_t object that has been freed by the ddl thread.
The debug sync test case provides the sequence of operations.
Without fix, the test case will crash the server (because of
newly added assert). With fix, the alter table stmt will return
an error message.
Backporting the fix from MySQL 5.5 to 5.1
rb:961
rb:947