timestamp: Thu 2011-12-01 15:12:10 +0100
Fix for Bug#13430436 PERFORMANCE DEGRADATION IN SYSBENCH ON INNODB DUE TO ICP
When running sysbench on InnoDB there is a performance degradation due
to index condition pushdown (ICP). Several of the queries in sysbench
have a WHERE condition that the optimizer uses for executing these
queries as range scans. The upper and lower limit of the range scan
will ensure that the WHERE condition is fulfilled. Still, the WHERE
condition is part of the queries' condition and if ICP is enabled the
condition will be pushed down to InnoDB as an index condition.
Due to the range scan's upper and lower limits ensure that the WHERE
condition is fulfilled, the pushed index condition will not filter out
any records. As a result the use of ICP for these queries results in a
performance overhead for sysbench. This overhead comes from using
resources for determining the part of the condition that can be pushed
down to InnoDB and overhead in InnoDB for executing the pushed index
condition.
With the default configuration for sysbench the range scans will use
the primary key. This is a clustered index in InnoDB. Using ICP on a
clustered index provides the lowest performance benefit since the
entire record is part of the clustered index and in InnoDB it has the
highest relative overhead for executing the pushed index condition.
The fix for removing the overhead ICP introduces when running sysbench
is to disable use of ICP when the index used by the query is a
clustered index.
When WL#6061 is implemented this change should be re-evaluated.
The patch differs from the original MySQL patch as follows:
- All test case differences have been reviewed one by one, and
care has been taken to restore the original plan so that each
test case executes the code path it was designed for.
- A bug was found and fixed in MariaDB 5.3 in
Item_allany_subselect::cleanup().
- ORDER BY is not removed because we are unsure of all effects,
and it would prevent enabling ORDER BY ... LIMIT subqueries.
- ref_pointer_array.m_size is not adjusted because we don't do
array bounds checking, and because it looks risky.
Original comment by Jorgen Loland:
-------------------------------------------------------------
WL#5953 - Optimize away useless subquery clauses
For IN/ALL/ANY/SOME/EXISTS subqueries, the following clauses are
meaningless:
* ORDER BY (since we don't support LIMIT in these subqueries)
* DISTINCT
* GROUP BY if there is no HAVING clause and no aggregate
functions
This WL detects and optimizes away these useless parts of the
query during JOIN::prepare()
compilation error in mysys/my_getsystime.c fixed
some redundant code removed
sec_to_time, time_to_sec, from_unixtime, unix_timestamp, @@timestamp now
use decimal, not double for numbers with a fractional part.
purge_master_logs_before_date() fixed
many bugs in corner cases fixed
mysys/my_getsystime.c:
compilation failure fixed
sql/sql_parse.cc:
don't cut corners. it backfires.
A lot of small fixes and new test cases.
client/mysqlbinlog.cc:
Cast removed
client/mysqltest.cc:
Added missing DBUG_RETURN
include/my_pthread.h:
set_timespec_time_nsec() now only takes one argument
mysql-test/t/date_formats.test:
Remove --disable_ps_protocl as now also ps supports microseconds
mysys/my_uuid.c:
Changed to use my_interval_timer() instead of my_getsystime()
mysys/waiting_threads.c:
Changed to use my_hrtime()
sql/field.h:
Added bool special_const_compare() for fields that may convert values before compare (like year)
sql/field_conv.cc:
Added test to get optimal copying of identical temporal values.
sql/item.cc:
Return that item_int is equal if it's positive, even if unsigned flag is different.
Fixed Item_cache_str::save_in_field() to have identical null check as other similar functions
Added proper NULL check to Item_cache_int::save_in_field()
sql/item_cmpfunc.cc:
Don't call convert_constant_item() if there is nothing that is worth converting.
Simplified test when years should be converted
sql/item_sum.cc:
Mark cache values in Item_sum_hybrid as not constants to ensure they are not replaced by other cache values in compare_datetime()
sql/item_timefunc.cc:
Changed sec_to_time() to take a my_decimal argument to ensure we don't loose any sub seconds.
Added Item_temporal_func::get_time() (This simplifies some things)
sql/mysql_priv.h:
Added Lazy_string_decimal()
sql/mysqld.cc:
Added my_decimal constants max_seconds_for_time_type, time_second_part_factor
sql/table.cc:
Changed expr_arena to be of type CONVENTIONAL_EXECUTION to ensure that we don't loose any items that are created by fix_fields()
sql/tztime.cc:
TIME_to_gmt_sec() now sets *in_dst_time_gap in case of errors
This is needed to be able to detect if timestamp is 0
storage/maria/lockman.c:
Changed from my_getsystime() to set_timespec_time_nsec()
storage/maria/ma_loghandler.c:
Changed from my_getsystime() to my_hrtime()
storage/maria/ma_recovery.c:
Changed from my_getsystime() to mmicrosecond_interval_timer()
storage/maria/unittest/trnman-t.c:
Changed from my_getsystime() to mmicrosecond_interval_timer()
storage/xtradb/handler/ha_innodb.cc:
Added support for new time,datetime and timestamp
unittest/mysys/thr_template.c:
my_getsystime() -> my_interval_timer()
unittest/mysys/waiting_threads-t.c:
my_getsystime() -> my_interval_timer()
HA_INNOBASE::UPDATE_ROW, TEMPORARY TABLE, TABLE LOCK".
Attempt to update an InnoDB temporary table under LOCK TABLES
led to assertion failure in both debug and production builds
if this temporary table was explicitly locked for READ. The
same scenario works fine for MyISAM temporary tables.
The assertion failure was caused by discrepancy between lock
that was requested on the rows of temporary table at LOCK TABLES
time and by update operation. Since SQL-layer requested a
read-lock at LOCK TABLES time InnoDB engine assumed that upcoming
statements which are going to be executed under LOCK TABLES will
only read table and therefore should acquire only S-lock.
An update operation broken this assumption by requesting X-lock.
Possible approaches to fixing this problem are:
1) Skip locking of temporary tables as locking doesn't make any
sense for connection-local objects.
2) Prohibit changing of temporary table locked by LOCK TABLES ...
READ.
Unfortunately both of these approaches have drawbacks which make
them unviable for stable versions of server.
So this patch takes another approach and changes code in such way
that LOCK TABLES for a temporary table will always request write
lock. In 5.1 version of this patch switch from read lock to write
lock is done inside of InnoDBs handler methods as doing it on
SQL-layer causes compatibility troubles with FLUSH TABLES WITH
READ LOCK.
mysql-test/suite/innodb/r/innodb_mysql.result:
Added test for bug #11762012 - "54553: INNODB ASSERTS IN
HA_INNOBASE::UPDATE_ROW, TEMPORARY TABLE, TABLE LOCK".
mysql-test/suite/innodb/t/innodb_mysql.test:
Added test for bug #11762012 - "54553: INNODB ASSERTS IN
HA_INNOBASE::UPDATE_ROW, TEMPORARY TABLE, TABLE LOCK".
mysql-test/suite/innodb_plugin/r/innodb_mysql.result:
Added test for bug #11762012 - "54553: INNODB ASSERTS IN
HA_INNOBASE::UPDATE_ROW, TEMPORARY TABLE, TABLE LOCK".
mysql-test/suite/innodb_plugin/t/innodb_mysql.test:
Added test for bug #11762012 - "54553: INNODB ASSERTS IN
HA_INNOBASE::UPDATE_ROW, TEMPORARY TABLE, TABLE LOCK".
storage/innobase/handler/ha_innodb.cc:
Assume that a temporary table locked by LOCK TABLES can be updated
even if it was only locked for read and therefore an X-lock should
be always requested for such tables.
storage/innodb_plugin/handler/ha_innodb.cc:
Assume that a temporary table locked by LOCK TABLES can be updated
even if it was only locked for read and therefore an X-lock should
be always requested for such tables.
Adjusted the result files in the pbxt, innodb and innodb_plugin suites.
Commented out a failing test case in innodb.innodb_multi_update.test.
It should be returned back when the problem with multi-update of derived
tables (bug 784297) is resolved.
sql/event_parse_data.cc:
don't use "not_used" variable
sql/item_timefunc.cc:
Item_temporal_func::fix_length_and_dec()
and other changes
sql/item_timefunc.h:
introducing Item_timefunc::fix_length_and_dec()
sql/share/errmsg.txt:
don't say "column X" in the error message that used not only for columns
Resolved all conflicts, bad merges and fixed a few minor bugs in the code.
Commented out the queries from multi_update, view, subselect_sj, func_str,
derived_view, view_grant that failed either with crashes in ps-protocol or
with wrong results.
The failures are clear indications of some bugs in the code and these bugs
are to be fixed.
The bug was a result of the fix for bug 668644 that turned out to be
not quite correct. A problem appeared with HAVING conditions containing
more than one predicate. If a query with an ORDER BY clause uses
such HAVING condition and the required order can be obtained with
a range/index scan then the HAVING condition has to be pushed into
two different formulas (items). To be able to do it we have to create
a copy of the ANDOR structure of the pushed condition.
even in the cases when there existed range/index-merge scans that
were cheaper than the full table scan.
This was a defect/bug of the implementation of mwl #128.
Now hash join can work not only with full table scan of the joined
table, but also with full index scan, range and index-merge scans.
Accordingly, in the cases when hash join is used the column 'type'
in the EXPLAINs can contain now 'hash_ALL', 'hash_index', 'hash_range'
and 'hash_index_merge'. If hash join is coupled with a range/index_merge
scan then the columns 'key' and 'key_len' contain info not only on
the used hash index, but also on the indexes used for the scan.
- Fixed some issues with partitions and connection_string, which also fixed lp:716890 "Pre- and post-recovery crash in Aria"
- Fixed wrong assert in Aria
Now need to merge with latest xtradb before pushing
sql/ha_partition.cc:
Ensure that m_ordered_rec_buffer is not freed before close.
sql/mysqld.cc:
Changed to use opt_stack_trace instead of opt_pstack.
Removed references to pstack
sql/partition_element.h:
Ensure that connect_string is initialized
storage/maria/ma_key_recover.c:
Fixed wrong assert
Problem is that these tests run with --innodb-lock-wait-timeout=2 in .opt
(and this is necessary as built-in innodb does not allow to change this
dynamically). This cases another part of the test to occasionally time
out an UPDATE, which subsequently caused the test case to timeout due to
waiting for a condition (successful UPDATE) that never occurs.
Fixed by re-trying the update in case of timeout.
Tested by inserting a sleep() in the connection that the UPDATE is waiting
for, and checking that the retry loops a couple of times until the other
connection is done and COMMITs.
- Fixed problem with oqgraph and 'make dist'
Note that after this merge we have a problem show in join_outer where we examine too many rows in one specific case (related to BUG#57024).
This will be fixed when mwl#128 is merged into 5.3.
independent execution plans.
Fixed a bug in Unique::unique_add that caused a crash for a query from
index_intersect_innodb on some platforms.
Fixed two bugs in opt_range.cc that led to the choice of not the
cheapest plans for index intersections.
In case of low memory sort buffer QUICK_INDEX_MERGE_SELECT creates
temporary file where is stores row ids which meet QUICK_SELECT ranges
except of clustered pk range, clustered range is processed separately.
In init_read_record we check if temporary file is used and choose
appropriate record access method. It does not take into account that
temporary file contains partial result in case of QUICK_INDEX_MERGE_SELECT
with clustered pk range.
The fix is always to use rr_quick if QUICK_INDEX_MERGE_SELECT
with clustered pk range is used.
mysql-test/suite/innodb/r/innodb_mysql.result:
test case
mysql-test/suite/innodb/t/innodb_mysql.test:
test case
mysql-test/suite/innodb_plugin/r/innodb_mysql.result:
test case
mysql-test/suite/innodb_plugin/t/innodb_mysql.test:
test case
sql/opt_range.h:
added new method
sql/records.cc:
The fix is always to use rr_quick if QUICK_INDEX_MERGE_SELECT
with clustered pk range is used.
ALTER TABLE RENAME, DISABLE KEYS.
The code of ALTER TABLE RENAME, DISABLE KEYS could
issue a commit while holding LOCK_open mutex.
This is a regression introduced by the fix for
Bug 54453.
This failed an assert guarding us against a potential
deadlock with connections trying to execute
FLUSH TABLES WITH READ LOCK.
The fix is to move acquisition of LOCK_open outside
the section that issues ha_autocommit_or_rollback().
LOCK_open is taken to protect against concurrent
operations with .frms and the table definition
cache, and doesn't need to cover the call to commit.
A test case added to innodb_mysql.test.
The patch is to be null-merged to 5.5, which
already has 54453 null-merged to it.
mysql-test/suite/innodb/r/innodb_mysql.result:
Added test results for test for bug#56619.
mysql-test/suite/innodb/t/innodb_mysql.test:
Added test for bug#56619.
sql/sql_table.cc:
mysql_alter_table() modified: moved acquisition of LOCK_open
after call to ha_autocommit_or_rollback.
The pushdown condition for the sorted table in a query can be complemented
by the conditions from HAVING. This transformation is done in JOIN::exec
pretty late after the original pushdown condition have been saved in the
field pre_idx_push_select_cond for the sorted table. So this field must
be updated after the inclusion of the condition from HAVING.
case than in corr index".
Server was unable to find existing or explicitly created supporting
index for foreign key if corresponding statement clause used field
names in case different than one used in key specification and created
yet another supporting index.
In cases when name of constraint (and thus name of generated index)
was the same as name of existing/explicitly created index this led
to duplicate key name error.
The problem was that unlike all other code Key_part_spec::operator==()
compared field names in case sensitive fashion. As result routines
responsible for getting rid of redundant generated supporting indexes
for foreign key were not working properly for versions of field names
using different cases.
(backported from mysql-trunk)
sql/sql_class.cc:
Make field name comparison case-insensitive like it is
in the rest of server.