I_S table
The mdl_iterate() helper function (which the plugin uses to iterate MDL
locks) acquired mutexes in reverse order.
Fixed by iterating MDL locks in two stages:
1. Iterate the locks hash under the protection of the hash mutex, store all
lock pointers in a thread-local array and increment the reference counter
of each lock.
2. Iterate the local array without the protection of the hash mutex, handling
destroyed locks.
This somewhat echoes the hack in MDL_map_partition::move_from_hash_to_lock_mutex.
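The two-stage iteration can be sketched roughly as follows. This is a minimal illustration, not the server code: Lock, LockMap and iterate_locks are hypothetical names, and std::shared_ptr stands in for the manual reference counter.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an MDL lock object.
struct Lock {
  std::mutex mutex;        // per-lock mutex
  std::string key;
  bool destroyed = false;  // set once the lock has been removed from the hash
};

// Hypothetical stand-in for the locks hash.
struct LockMap {
  std::mutex hash_mutex;   // protects `locks`
  std::unordered_map<std::string, std::shared_ptr<Lock>> locks;
};

// Calls cb(lock) for every live lock and returns how many were visited.
template <typename Callback>
std::size_t iterate_locks(LockMap &map, Callback cb) {
  std::vector<std::shared_ptr<Lock>> snapshot;
  {
    // Stage 1: hold only the hash mutex. Copying the shared_ptr bumps the
    // reference count, so the lock object outlives its removal from the hash.
    std::lock_guard<std::mutex> guard(map.hash_mutex);
    snapshot.reserve(map.locks.size());
    for (auto &entry : map.locks) snapshot.push_back(entry.second);
  }
  // Stage 2: the hash mutex is released; each lock mutex is taken on its own,
  // so the two mutexes are never held together.
  std::size_t visited = 0;
  for (auto &lock : snapshot) {
    std::lock_guard<std::mutex> guard(lock->mutex);
    if (lock->destroyed) continue;  // destroyed after the snapshot was taken
    cb(*lock);
    ++visited;
  }
  return visited;
}
```

Because the hash mutex and a lock mutex are never held at the same time in this sketch, the order in which other code paths acquire them no longer matters for the iterator.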
MDEV-5980: EITS: if condition is used for REF access, its selectivity is still in filtered%
MDEV-5985: EITS: selectivity estimates look illogical for join and non-key equalities
MDEV-6003: EITS: ref access, keypart2=const vs keypart2=expr - inconsistent filtered% value
- Made a number of fixes in table_cond_selectivity() so that it returns
correct selectivity estimates.
- Added comments in related code.
Better comments
MDEV-5985: EITS: selectivity estimates look illogical for join and non-key equalities
MDEV-6003: EITS: ref access, keypart2=const vs keypart2=expr - inconsistent filtered% value
- Made a number of fixes in table_cond_selectivity() so that it returns
correct selectivity estimates.
- Added comments in related code.
The previous patch for this bug was unfortunately completely wrong.
The purpose of cached_charset is to remember which character set we
have installed currently in the THD, so that in the common case where
charset does not change between queries, we do not need to update it
in the THD. Thus, it is important that the cached_charset field is
tightly coupled to the THD for which it handles caching.
Thus the right place to put cached_charset seems to be in the THD.
This patch introduces a field THD::system_thread_info where such info
in general can be placed without further inflating the THD with unused
data for other threads (THD is already far too big as it is). It then
moves the cached_charset into this slot for the SQL driver thread and
for the parallel replication worker threads.
The THD::rpl_filter field is also moved inside system_thread_info, to
keep the size of THD unchanged. Moving further fields in to reduce the
size of THD is a separate task, filed as MDEV-6164.
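As an illustration of why the cache has to live with the thread it describes, here is a minimal sketch. The types are simplified and hypothetical (Thd, SqlThreadInfo, CharsetKey and set_charset_cached are not the server's definitions); only the field names cached_charset and system_thread_info come from the description above.

```cpp
#include <array>

// Hypothetical compact identifier for the character sets of a query.
using CharsetKey = std::array<char, 6>;

// Hypothetical per-thread slot; in this sketch it holds only the charset cache.
struct SqlThreadInfo {
  CharsetKey cached_charset{};
  bool cache_valid = false;
};

// Hypothetical, heavily reduced thread descriptor.
struct Thd {
  SqlThreadInfo system_thread_info;  // owned by this thread only
  int charset_installs = 0;          // counts the expensive updates
};

// Installs `wanted` in `thd` unless this same thread already has it.
// Returns true when an update was actually performed.
bool set_charset_cached(Thd &thd, const CharsetKey &wanted) {
  SqlThreadInfo &info = thd.system_thread_info;
  if (info.cache_valid && info.cached_charset == wanted) return false;
  info.cached_charset = wanted;
  info.cache_valid = true;
  ++thd.charset_installs;
  return true;
}
```

With the cache stored per thread, a cache hit in one worker says nothing about any other worker, which is the coupling the commit message asks for.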
The problem is the asynchronous binlog checkpointing; on rare
occasions it could occur too late, causing SHOW BINLOG EVENTS to show
the wrong events and producing a .result file difference.
Replication caches the character sets used in a query, to be able to quickly
reuse them for the next query in the common case of them not having changed.
In parallel replication, this caching needs to be per-worker-thread. The
code was not modified to handle this correctly, so the caching in one worker
could cause another worker to run a query using the wrong character set,
causing replication corruption.
Back-ported the patch from the MySQL 5.6 code line; it came with
the following comment:
Fix for Bug#11757108 CHANGE IN EXECUTION PLAN FOR COUNT_DISTINCT_GROUP_ON_KEY
CAUSES PEFORMANCE REGRESSION
The cause for the performance regression is that the access strategy for the
GROUP BY query is changed from using "index scan" in mysql-5.1 to using "loose
index scan" in mysql-5.5. The index used for group by is unique and thus each
"loose scan" group will only contain one record. Since loose scan needs to
re-position on each "loose scan" group this query will do a re-position for
each index entry. Compared to just reading the next index entry as a normal
index scan does, the use of loose scan for this query becomes more expensive.
The cause for selecting to use loose scan for this query is that in the current
code when the size of the "loose scan" group is one, the formula for
calculating the cost estimates becomes almost identical to the cost of using
normal index scan. Differences in use of integer versus floating point arithmetic
can cause one or the other access strategy to be selected.
The main issue with the formula for estimating the cost of using loose scan is
that it does not take into account that it is more costly to do a re-position
for each "loose scan" group compared to just reading the next index entry.
Both index scan and loose scan estimate the CPU cost as:
"number of entries needed to read/scan" * ROW_EVALUATE_COST
The results from testing with the query in this bug indicate that the real
cost of doing a re-position is four to eight times higher than just reading the
next index entry. Thus, the CPU cost estimate for loose scan should be increased.
To account for the extra work to re-position in the index we increase the
cost for loose index scan to include the cost of navigating the index.
This is modelled as a function of the height of the b-tree:
navigation cost= ceil(log(records in table)/log(indexes per block))
* ROWID_COMPARE_COST;
This will avoid loose index scan being used for indexes where the "loose scan"
group contains very few index entries.
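A small sketch of the navigation-cost formula above. The constant value, the guard for degenerate inputs, and the way the navigation cost is added once per group in loose_scan_cpu_cost() are assumptions made for illustration; only the formula itself comes from the commit message.

```cpp
#include <cmath>

// Placeholder value, chosen for illustration only.
constexpr double kRowidCompareCost = 0.1;

// navigation cost = ceil(log(records in table) / log(indexes per block))
//                   * ROWID_COMPARE_COST
double navigation_cost(double records_in_table, double keys_per_block) {
  // Assumption: treat degenerate inputs as a tree of height zero.
  if (records_in_table <= 1.0 || keys_per_block <= 1.0) return 0.0;
  const double tree_height =
      std::ceil(std::log(records_in_table) / std::log(keys_per_block));
  return tree_height * kRowidCompareCost;
}

// Assumption: every "loose scan" group pays one index navigation on top of
// the usual per-row evaluation cost.
double loose_scan_cpu_cost(double num_groups, double records_in_table,
                           double keys_per_block, double row_evaluate_cost) {
  return num_groups *
         (navigation_cost(records_in_table, keys_per_block) + row_evaluate_cost);
}
```

When each group holds a single entry, num_groups equals the number of index entries, so under these assumptions the loose scan estimate is strictly higher than the plain index scan estimate.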
Back-ported the patch for bug #13256831 from the mysql-5.6 code line.
Here's the comment this patch was provided with:
Fixed bug#13256831 - ERROR 1032 (HY000): CAN'T FIND RECORD.
This bug only occurs if a user tries to update a base table using
an updatable view and this view was created as a join for which
the clause 'WITH CHECK OPTION' was specified.
The reason for the bug was that when such an update was
executed, row positions were not properly handled for tables
that were not updated but had constraints that had to be
checked due to the 'WITH CHECK OPTION' clause.
In other words, when such an update is executed, positioning to
the row being updated is not performed for tables that are
specified in the view definition and are also covered by the
'WITH CHECK OPTION' clause.
catalog data path had not been set. This was added into ha_connect::info.
modified:
storage/connect/ha_connect.cc
- All the functions querying table options could return information from the wrong
table when several CONNECT tables were used in the same query (for instance, joined
together). This was because they belonged to the catalog class, which is shared between
all tables in the same query. They have been moved from the catalog class to the
TABDEF/RELDEF class that is attached to each table. This was a major potential bug.
modified:
storage/connect/catalog.h
storage/connect/filamvct.cpp
storage/connect/filamzip.cpp
storage/connect/mycat.cc
storage/connect/mycat.h
storage/connect/reldef.cpp
storage/connect/reldef.h
storage/connect/tabdos.cpp
storage/connect/tabfmt.cpp
storage/connect/tabmul.cpp
storage/connect/tabmysql.cpp
storage/connect/taboccur.cpp
storage/connect/tabodbc.cpp
storage/connect/tabpivot.cpp
storage/connect/tabsys.cpp
storage/connect/tabtbl.cpp
storage/connect/tabutil.cpp
storage/connect/tabvct.cpp
storage/connect/tabwmi.cpp
storage/connect/tabxcl.cpp
storage/connect/tabxml.cpp
storage/connect/xindex.cpp
- Prepare indexing of MYSQL/ODBC tables (as does FEDERATED) (Not implemented yet)
modified:
storage/connect/ha_connect.cc
storage/connect/ha_connect.h
storage/connect/mycat.cc
storage/connect/mycat.h
- Typo
modified:
storage/connect/plgdbutl.cpp
Add a testcase and backport this fix:
Bug#14338686: MYSQL IS GENERATING DIFFERENT AND SLOWER
(IN NEWER VERSIONS) EXECUTION PLAN
PROBLEM:
While checking for an index to sort for the order by clause
in this query
"SELECT datestamp FROM contractStatusHistory WHERE
contract_id = contracts.id ORDER BY datestamp asc limit 1;"
we do not calculate the number of rows to be examined correctly.
As a result we choose index 'idx_contractStatusHistory_datestamp'
defined on the 'datestamp' field, rather than choosing index
'contract_id'. And hence the lower performance.
ANALYSIS:
While checking if an index is present to give the records in
sorted order(datestamp), we consider the selectivity of the
'ref_key'(contract_id here) using 'table->quick_condition_rows'.
'ref_key' here can be an index from 'REF_ACCESS' or from 'RANGE'.
As this is a 'REF_ACCESS', 'table->quick_condition_rows' is not
set to the actual value, which is 2. Instead it is set to the number
of tuples present in the table, indicating that every row that
is selected would satisfy the condition present in the query.
Hence, the selectivity becomes 1 even when we choose the index
on the order by column instead of the join_condition.
But in reality, as only 2 rows satisfy the condition, we need to
examine half of the entire data set to get one tuple when we
choose the index on the order by column.
Had we chosen the 'REF_ACCESS' we would have examined only 2 tuples.
Hence the delay in executing the query specified.
FIX:
While calculating the selectivity of the ref_key:
For REF_ACCESS consider quick_rows[ref_key] if range
optimizer has an estimate for this key. Else consider
'rec_per_key' statistic.
For RANGE ACCESS consider 'table->quick_condition_rows'.
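The rule in the FIX section can be sketched as a small selection function. TableStats, AccessType and ref_key_row_estimate are hypothetical names for illustration; only quick_rows, rec_per_key and quick_condition_rows are taken from the description above, and their real types in the server may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

enum class AccessType { Ref, Range };

// Hypothetical, reduced view of the per-table statistics.
struct TableStats {
  std::vector<std::uint64_t> quick_rows;   // range optimizer estimate per key
  std::vector<bool> has_quick_estimate;    // whether quick_rows[key] is set
  std::vector<std::uint64_t> rec_per_key;  // index statistics per key
  std::uint64_t quick_condition_rows = 0;  // estimate for the whole condition
};

// Returns the row estimate used to compute the selectivity of ref_key.
std::uint64_t ref_key_row_estimate(const TableStats &t, AccessType access,
                                   std::size_t ref_key) {
  if (access == AccessType::Ref) {
    // Prefer the range optimizer's estimate when it has one for this key,
    // otherwise fall back to the rec_per_key statistic.
    return t.has_quick_estimate[ref_key] ? t.quick_rows[ref_key]
                                         : t.rec_per_key[ref_key];
  }
  // RANGE access: quick_condition_rows is meaningful here.
  return t.quick_condition_rows;
}
```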
The problem was that a view substitutes its fields (on prepare) and reverts the
change after execution. After prepare, during optimization, the exists2in
conversion substituted the arguments of '=' with the constant '1', but then one
of the arguments of '=' was reverted to the view field reference. This led to
an incorrect WHERE condition on the second execution.
To fix the problem we replace the whole '=' with '1' permanently.
We need to use mysql_cond_broadcast() rather than _signal for
COND_thread_count, as there can be multiple waiters.
Thanks to Pavel Ivanov for reporting both the problem and the
solution.
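The difference between signal and broadcast can be illustrated with standard C++ primitives, where notify_one() and notify_all() play the roles of the _signal and _broadcast calls. This is a sketch with hypothetical names (ThreadCounter, wait_for_no_threads, thread_exited, run_waiters), not the server code.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical counterpart of thread_count and its condition variable.
struct ThreadCounter {
  std::mutex lock;
  std::condition_variable cond;
  int thread_count = 0;
};

// Blocks until no threads are left.
void wait_for_no_threads(ThreadCounter &c) {
  std::unique_lock<std::mutex> guard(c.lock);
  c.cond.wait(guard, [&] { return c.thread_count == 0; });
}

// Called by a thread that is going away.
void thread_exited(ThreadCounter &c) {
  {
    std::lock_guard<std::mutex> guard(c.lock);
    --c.thread_count;
  }
  // notify_one() would wake a single waiter; any others could sleep forever.
  c.cond.notify_all();
}

// Starts `waiters` threads that all wait for the count to reach zero and
// returns how many of them were woken.
int run_waiters(int waiters) {
  ThreadCounter c;
  c.thread_count = 1;
  std::atomic<int> woken{0};
  std::vector<std::thread> pool;
  for (int i = 0; i < waiters; ++i)
    pool.emplace_back([&] { wait_for_no_threads(c); ++woken; });
  thread_exited(c);
  for (auto &t : pool) t.join();
  return woken.load();
}
```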
modified:
storage/connect/ha_connect.cc
storage/connect/ha_connect.h
storage/connect/xindex.cpp
- Optimize retrieving numeric values in scan_record. Was previously
translating numeric values to character representation back and forth.
modified:
storage/connect/ha_connect.cc
storage/connect/mysql-test/connect/r/xml.result
- Modify Pivot table creation to avoid reading the entire source table
when making columns from Discovery. MDEV-6024
modified:
storage/connect/tabpivot.cpp
- Make JOIN::const_key_parts include keyparts for which
the WHERE clause has an equality of the form
"t.key_part=reference_outside_this_select"
- This makes it possible to avoid filesort in some cases (and also
to avoid a difficult choice between using filesort and using an index)
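The idea can be sketched as follows: key parts bound by an equality to a constant or to a reference outside the current select keep one value for the whole execution of that select, so they can be skipped when matching ORDER BY against an index. All names here (Equality, const_key_parts as a free function, index_provides_order) are hypothetical, and the sketch assumes fewer than 64 key parts.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical description of one equality taken from the WHERE clause.
struct Equality {
  std::string column;
  bool rhs_is_constant;         // t.col = 5
  bool rhs_is_outer_reference;  // t.col = reference_outside_this_select
};

// Returns a bitmap with bit i set when key part i keeps a single value
// during one execution of the select.
std::uint64_t const_key_parts(const std::vector<std::string> &key_columns,
                              const std::vector<Equality> &equalities) {
  std::uint64_t mask = 0;
  for (std::size_t i = 0; i < key_columns.size(); ++i)
    for (const Equality &eq : equalities)
      if (eq.column == key_columns[i] &&
          (eq.rhs_is_constant || eq.rhs_is_outer_reference))
        mask |= std::uint64_t{1} << i;
  return mask;
}

// True if reading the index in key order yields rows sorted by `order_by`,
// skipping key parts that are known to be constant.
bool index_provides_order(const std::vector<std::string> &key_columns,
                          std::uint64_t const_parts,
                          const std::vector<std::string> &order_by) {
  std::size_t part = 0;
  for (const std::string &col : order_by) {
    while (part < key_columns.size() && key_columns[part] != col &&
           ((const_parts >> part) & 1u))
      ++part;
    if (part == key_columns.size() || key_columns[part] != col) return false;
    ++part;
  }
  return true;
}
```

For an index on (a, b) and a subquery with "a = outer_ref ... ORDER BY b", the first key part counts as constant, so the index order already satisfies the ORDER BY in this sketch and no filesort is needed.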