Fix for bug #46947 "Embedded SELECT without FOR UPDATE is
causing a lock", with after-review fixes.
SELECT statements with subqueries referencing InnoDB tables
were acquiring shared locks on rows in these tables when they
were executed in REPEATABLE-READ mode and with statement or
mixed mode binary logging turned on.
This was a regression which were introduced when fixing
bug 39843.
The problem was that for tables belonging to subqueries
parser set TL_READ_DEFAULT as a lock type. In cases when
statement/mixed binary logging at open_tables() time this
type of lock was converted to TL_READ_NO_INSERT lock at
open_tables() time and caused InnoDB engine to acquire
shared locks on reads from these tables. Although in some
cases such behavior was correct (e.g. for subqueries in
DELETE) in case of SELECT it has caused unnecessary locking.
This patch tries to solve this problem by rethinking our
approach to how we handle locking for SELECT and subqueries.
Now we always set TL_READ_DEFAULT lock type for all cases
when we read data. When at open_tables() time this lock
is interpreted as TL_READ_NO_INSERT or TL_READ depending
on whether this statement as a whole or call to function
which uses particular table should be written to the
binary log or not (if yes then statement should be properly
serialized with concurrent statements and stronger lock
should be acquired).
Test coverage is added for both InnoDB and MyISAM.
This patch introduces an "incompatible" change in locking
scheme for subqueries used in SELECT ... FOR UPDATE and
SELECT .. IN SHARE MODE.
In 4.1 the server would use a snapshot InnoDB read for
subqueries in SELECT FOR UPDATE and SELECT .. IN SHARE MODE
statements, regardless of whether the binary log is on or off.
If the user required a different type of read (i.e. locking read),
he/she could request so explicitly by providing FOR UPDATE/IN SHARE MODE
clause for each individual subquery.
On of the patches for 5.0 broke this behaviour (which was not documented
or tested), and started to use locking reads fora all subqueries in SELECT ...
FOR UPDATE/IN SHARE MODE. This patch restored 4.1 behaviour.
on same table
Protect dict_index_t::stat_n_diff_key_vals[] with an array of
mutexes.
Testing: tested all code paths under UNIV_SYNC_DEBUG
for the one in dict_print() one has to enable the InnoDB table monitor:
CREATE TABLE innodb_table_monitor (a int) ENGINE=INNODB;
------------------------------------------------------------------------
r6103 | marko | 2009-10-26 15:46:18 +0200 (Mon, 26 Oct 2009) | 4 lines
Changed paths:
M /branches/zip/row/row0ins.c
branches/zip: row_ins_alloc_sys_fields(): Zero out the system columns
DB_TRX_ID, DB_ROLL_PTR and DB_ROW_ID, in order to avoid harmless
Valgrind warnings about uninitialized data. (The warnings were
harmless, because the fields would be initialized at a later stage.)
------------------------------------------------------------------------
Stored routine DDL statements use statement-based replication
regardless of the current binlog format. The problem here was
that if a DDL statement failed during metadata lock acquisition
or opening of mysql.proc, the binlog format would not be reset
before returning. So the following DDL or DML statements are
binlogged with a wrong binlog format, which causes the slave
to stop.
The problem can be resolved by grabbing an exclusive MDL lock firstly
instead of clearing the current binlog format. So that the binlog
format will not be affected when the lock grab returns directly with
an error. The same way is taken to open a proc table for update.
parallel mode
The failure has nothing to do with parallel, but rather on the
order the tests are executed. In this case, the test
binlog_tmp_table (lets call it test2) was not ensuring that the
binary logs would be reset when it started. Later the test issues
a mysqlbinlog .../master-bin.000002 | mysql ... If the test that
was executed before this one (lets call it test1) had issued a
flush logs, then the file in use in test1 (master-bin.000002)
would not actually match the one that was expected. Eventually,
this would cause the statements logged in test1 to be replayed,
instead of the ones logged in the beginning of test2.
We fix this by:
1. adding RESET MASTER to the beginning of binlog_tmp_table
2. setting dynamically the file to use in binlog_tmp_table
Only #1 was needed, but the two make the tests cases more robust.
The problem is that message resource (message.rc) is compiled as part of static library
sql.lib rather than with executable mysqld.exe. resource files do not work in static
libraries.
The fix is to add message.rc to mysqld.exe source files list.
Extract part of innodb.innodb into innodb.innodb_misc1
This is needed in order to be able to more easily debug this test,
under valgrind, it is too huge.
The problem is that message resource (message.rc) is compiled as part of static library
sql.lib rather than with executable mysqld.exe. resource files do not work in static
libraries.
The fix is to add message.rc to mysqld.exe source files list.
The problem was in an incorrect debug assertion. The expression
used in the failing assertion states that when finding
references matching ORDER BY expressions, there can be only one
reference to a single table. But that does not make any sense,
all test cases for this bug are valid examples with multiple
identical WHERE expressions referencing the same table which
are also present in the ORDER BY list.
Fixed by removing the failing assertion. We also have to take
care of the 'found' counter so that we count multiple
references only once. We rely on this fact later in
eq_ref_table().
Statements with CONNECTION_ID were forced to be kept in the transactional
cache and by consequence non-transactional changes that were supposed to
be flushed ahead of the transaction were kept in the transactional cache.
This happened because after BUG#51894 any statement whose thd's
thread_specific_used was set was kept in the transactional cache. The idea
was to keep changes on temporary tables in the transactional cache. However,
the thread_specific_used was set not only for statements that accessed
temporary tables but also when the CONNECTION_ID was used.
To fix the problem, we created a new variable to keep track of updates
to temporary tables.
Problem: ALTER TABLE ADD INDEX may lead to table copying if there's
numeric field(s) with non-default display width modificator specified.
Fix: compare numeric field's storage lenghts when we decide whether
they can be considered 'equal' for table alteration purposes.
The bug was a side effect of WL#5030 (fix header files) and
WL#5161 (CMake).
The problem was that CMake-generated config.h (and my_config.h
as a copy of it) had a header guard. GNU autotools-generated
[my_]config.h did not. During WL#5030 the order of header files
was changed, so the following started to happen (using GNU autotools,
in embedded server):
- my_config.h included, defining HAVE_OPENSSL
- my_global.h included, un-defining HAVE_OPENSSL
- zlib.h included, including config.h,
defining HAVE_OPENSSL again.
The fix is to check HAVE_OPENSSL in conjuction with EMBEDDED_LIBRARY.
More common fix would be to define a macros as HAVE_OPENSSL && !EMBEDDED_LIBRARY
and use it instead of HAVE_OPENSSL.
Previously installed dynamic plugins are explicitly not loaded
on startup with --skip-grant-tables enabled. However, INSTALL
PLUGIN/UNINSTALL PLUGIN commands are allowed, and result in
inconsistent error messages (reporting duplicate plugin or
plugin does not exist).
This patch adds a check for --skip-grant-tables mode, and
returns error ER_OPTION_PREVENTS_STATEMENT to the user when
the above commands are attempted.
When row_merge_drop_temp_indexes() was reworked to drop the indexes
via the data dictionary cache, the code was broken because it would
read the index name from the wrong field.
partition if muliple columns used
Problem was that range scanning through the sorted array of
the column list values did not use a correct index calculation.
Fixed by also taking the number of columns in the calculation.
The bug was a side effect of WL#5030 (fix header files) and
WL#5161 (CMake).
The problem was that CMake-generated config.h (and my_config.h
as a copy of it) had a header guard. GNU autotools-generated
[my_]config.h did not. During WL#5030 the order of header files
was changed, so the following started to happen (using GNU autotools,
in embedded server):
- my_config.h included, defining HAVE_OPENSSL
- my_global.h included, un-defining HAVE_OPENSSL
- zlib.h included, including config.h,
defining HAVE_OPENSSL again.
The fix is to change the order of header file, moving zlib.h
to the top of the header list. More proper fix would be to wrap
unguarded auto-generated [my_]config.h by guarded non-generated
header file.
btr_page_tuple_smaller(): New function, refactored from
btr_page_split_and_insert().
btr_page_get_split_rec(): Renamed from btr_page_get_sure_split_rec().
Note that a NULL return may mean that the tuple is to be inserted into
either the lower or upper page, to be determined by btr_page_tuple_smaller().
btr_page_split_and_insert(): When btr_page_get_split_rec() returns NULL,
invoke btr_page_tuple_smaller() to determine which half-page the tuple
belongs to.
Reviewed by Sunny Bains
of sync
In RBR, sometimes the table->s->last_null_bit_pos can be zero. This
has impact at the slave when it compares records fetched from the
storage engine against records in the binary log event. If
last_null_bit_pos is zero the slave, while comparing in
log_event.cc:record_compare function, would set all bits in the last
null_byte to 1 (assumed all 8 were unused) . Thence it would loose the
ability to distinguish records that were similar in contents except
for the fact that some field was null in one record, but not in the
other. Ultimately this would cause wrong matches, and in the specific
case depicted in the bug report the same record would be updated
twice, resulting in a lost update.
Additionally, in the record_compare function the slave was setting the
X bit unconditionally. There are cases that the X bit does not exist
in the record header. This could also lead to wrong matches between
records.
We fix both by conditionally resetting the bits: (i) unused null_bits
are set if last_null_bit_pos > 0; (ii) X bit is set if
HA_OPTION_PACK_RECORD is in use.