ON COL WITH COMPOSITE INDEX
This problem is caused by the patch for the bug#11751794.
While checking for the keypart covering non grouping attribute. we are not
checking whether the root node of the SEL_ARG* tree for the index have any
cvalue or not.
ON COL WITH COMPOSITE INDEX
This problem is caused by the patch for the bug#11751794.
While checking for the keypart covering non grouping attribute. we are not
checking whether the root node of the SEL_ARG* tree for the index have any
cvalue or not.
Consider the following query:
SELECT f_1,..,f_m, AGGREGATE_FN(C)
FROM t1
WHERE ...
GROUP BY ...
Loose index scan ("Using index for group-by") can be used for
this query if there is an index 'i' covering all fields in the
select list, and the GROUP BY clause makes up a prefix f1,...,fn
of 'i'. Furthermore, according to rule NGA2 of
get_best_group_min_max(), the WHERE clause must contain a
conjunction of equality predicates for all fields fn+1,...,fm.
The problem in this bug was that a query with WHERE clause that
broke NGA2(NGA: Non Group Attribuite) was not detected and therefore
used loose index scan.
This lead to wrong result. The query had an index
covering (c1,c2) and had:
"WHERE (c1 = 1 AND c2 = 'a') OR (c1 = 2 AND c2 = 'b')
GROUP BY c1"
or
"WHERE (c1 = 1 ) OR (c1 = 2 AND c2 = 'b')
GROUP BY c1"
This WHERE clause cannot be transformed to a conjunction of
equality predicates.
The solution is to introduce another rule, NGA3, that complements
NGA2. NGA3 says that if a gap field (field between those
listed in GROUP BY and C in the index) has a predicate, then
there can only be one range in the query. This requirement is
more strict than it has to be in theory. BUG 15947433 will deal
with that.
Consider the following query:
SELECT f_1,..,f_m, AGGREGATE_FN(C)
FROM t1
WHERE ...
GROUP BY ...
Loose index scan ("Using index for group-by") can be used for
this query if there is an index 'i' covering all fields in the
select list, and the GROUP BY clause makes up a prefix f1,...,fn
of 'i'. Furthermore, according to rule NGA2 of
get_best_group_min_max(), the WHERE clause must contain a
conjunction of equality predicates for all fields fn+1,...,fm.
The problem in this bug was that a query with WHERE clause that
broke NGA2 was not detected and therefore used loose index scan.
This lead to wrong result. The query had an index
covering (c1,c2) and had:
"WHERE (c1 = 1 AND c2 = 'a') OR (c1 = 2 AND c2 = 'b')
GROUP BY c1"
or
"WHERE (c1 = 1 ) OR (c1 = 2 AND c2 = 'b')
GROUP BY c1"
This WHERE clause cannot be transformed to a conjunction of
equality predicates.
The solution is to introduce another rule, NGA3, that complements
NGA2. NGA3 says that if a gap field (field between those
listed in GROUP BY and C in the index) has a predicate, then
there can only be one range in the query. This requirement is
more strict than it has to be in theory. BUG 15947433 will deal
with that.
Problem:
During the index intersect access method, the SQL layer will access one row,
that satisfies a set of conditions, using an index i1. And then it will try to
access the same row, with other set of conditions using the next index i2. If
the fetch from i2 fails (we are talking about an error situation here and not
simply an unmatched row situation), then it will unlock the row accessed via
i1. This will work in all situations except deadlock error.
When a deadlock happens, InnoDB will rollback the transaction. InnoDB intimates
the SQL layer about this through the THD::transaction_rollback_request member.
But this is not currently used by the SQL layer.
Solution:
When an error happens, the SQL layer must check the
THD::transaction_rollback_request member, before calling handler::unlock_row().
We have also added a debug assert in ha_innobase::unlock_row() checking that
it must be called only when the transaction is in active state.
rb#1773 approved by Marko and Sunny.
The problem is a shift operation that is not 64-bit safe.
The consequence is that used tables information for a join with 32 tables
or more will be incorrect.
Fixed by adding a type cast in Item_sum::update_used_tables().
Also used the opportunity to fix some other potential bugs by adding an
explicit type-cast to an integer in a left-shift operation.
Some of them were quite harmless, but was fixed in order to get the same
signed-ness as the other operand of the operation it was used in.
sql/item_cmpfunc.cc
Adjusted signed-ness for some integers in left-shift.
sql/item_subselect.cc
Added type-cast to nesting_map (which is a 32/64 bit type, so
potential bug for deeply nested queries).
sql/item_sum.cc
Added type-cast to nesting_map (32/64-bit type) and table_map
(64-bit type).
sql/opt_range.cc
Added type-cast to ulonglong (which is a 64-bit type).
sql/sql_base.cc
Added type-cast to nesting_map (which is a 32/64-bit type).
sql/sql_select.cc
Added type-cast to nesting_map (32/64-bit type) and key_part_map
(64-bit type).
sql/strfunc.cc
Changed type-cast from longlong to ulonglong, to preserve signed-ness.
When a DML statement is issued, and if the index merge
access method is chosen, then many rows from the
storage engine will be locked because of the way the
algorithm works. Many rows will be locked, but they
will not be part of the final result set.
To reduce the excessive locking, the locks of unmatched
rows are released by this patch. This patch will
affect only transactions with isolation level
equal to or less stricter than READ COMMITTED. This is
because of the behaviour of ha_innobase::unlock_row().
rb://1296 approved by jorgen and olav.
CONSISTENT SNAPSHOT OPTION
A transaction is started with a consistent snapshot. After
the transaction is started new indexes are added to the
table. Now when we issue an update statement, the optimizer
chooses an index. When the index scan is being initialized
via ha_innobase::change_active_index(), InnoDB reports
the error code HA_ERR_TABLE_DEF_CHANGED, with message
stating that "insufficient history for index".
This error message is propagated up to the SQL layer. But
the my_error() api is never called. The statement level
diagnostics area is not updated with the correct error
status (it remains in Diagnostics_area::DA_EMPTY).
Hence the following check in the Protocol::end_statement()
fails.
516 case Diagnostics_area::DA_EMPTY:
517 default:
518 DBUG_ASSERT(0);
519 error= send_ok(thd->server_status, 0, 0, 0, NULL);
520 break;
The fix is to backport the fix of bugs 14365043, 11761652
and 11746399.
14365043 PROTOCOL::END_STATEMENT(): ASSERTION `0' FAILED
11761652 HA_RND_INIT() RESULT CODE NOT CHECKED
11746399 RETURN VALUES OF HA_INDEX_INIT() AND INDEX_INIT() IGNORED
rb://1227 approved by guilhem and mattiasj.
Bug#14530242 CRASH / MEMORY CORRUPTION IN FILESORT_BUFFER::GET_RECORD_BUFFER WITH MYISAM
This is a backport of
Bug#12694872 - VALGRIND: 18,816 BYTES IN 196 BLOCKS ARE DEFINITELY LOST
Bug#13340270: assertion table->sort.record_pointers == __null
Bug#14536113 CRASH IN CLOSEFRM (TABLE.CC) OR UNPACK (FIELD.H) ON SUBQUERY WITH MYISAM TABLES
Also:
removed and re-added test files with file-ids from trunk.
ROWS THAT ARE EXPECTED
For non range/list partitioned tables (i.e. HASH/KEY):
When prune_partitions finds a multi-range list
(or in this test '<>') for a field of the partition index,
even if it cannot make any use of the multi-range,
it will continue with the next field of the partition index
and use that for pruning (even if it the previous
field could not be used). This results in partitions is
pruned away, leaving partitions that only matches
the last field in the partition index, and will exclude
partitions which might match any previous fields.
Fixed by skipping rest of partitioning key fields/parts
if current key field/part could not be used.
Also notice it is the order of the fields in the CREATE TABLE
statement that triggers this bug, not the order of fields in
primary/unique key or PARTITION BY KEY ().
It must not be the last field in the partitioning expression that
is not equal (or have a non single point range).
I.e. the partitioning index is created with the same field order
as in the CREATE TABLE. And for the bug to appear
the last field must be a single point and some previous field
must be a multi-point range.
COUNT DISTINCT GROUP BY
PROBLEM:
To calculate the final result of the count(distinct(select 1))
we call 'end_send' function instead of 'end_send_group'.
'end_send' cannot be called if we have aggregate functions
that need to be evaluated.
ANALYSIS:
While evaluating for a possible loose_index_scan option for
the query, the variable 'is_agg_distinct' is set to 'false'
as the item in the distinct clause is not a field. But, we
choose loose_index_scan by not taking this into
consideration.
So, while setting the final 'select_function' to evaluate
the result, 'precomputed_group_by' is set to TRUE as in
this case loose_index_scan is chosen and we do not have
agg_distinct in the query (which is clearly wrong as we
have one).
As a result, 'end_send' function is chosen as the final
select_function instead of 'end_send_group'. The difference
between the two being, 'end_send_group' evaluates the
aggregates while 'end_send' does not. Hence the wrong result.
FIX:
The variable 'is_agg_distinct' always represents if
'loose_idnex_scan' can be chosen for aggregate_distinct
functions present in the select.
So, we check for this variable to continue with
loose_index_scan option.
The assertion in innodb is triggered in this way:
1. mysql server does lookup on the primary key with full key,
innodb decides to not store cursor position because
"any index_next/prev call will return EOF anyway"
2. server asks innodb to return any next record in the index and the
assertion is triggered because no cursor position is stored.
It happens when a unique search (match_mode=ROW_SEL_EXACT)
in the clustered index is performed. InnoDB has never stored
the cursor position after a unique key lookup in the
clustered index because storing the position is an expensive
operation. The bug was introduced by
WL3220 'Loose index scan for aggregate functions'.
The fix is to disallow loose index scan optimization
for AGG_FUNC(DISTINCT ...) if GROUP_MIN_MAX quick select
uses clustered key.
When executing row-ordered-retrieval index merge,
the handler was cloned, but it used the wrong
memory root, so instead of allocating memory
on the thread/query's mem_root, it used the table's
mem_root, resulting in non released memory in the
table object, and was not freed until the table was
closed.
Solution was to ensure that memory used during cloning
of a handler was allocated from the correct memory root.
This was implemented by fixing handler::clone() to also
take a name argument, so it can be used with partitioning.
And in ha_partition only allocate the ha_partition's ref, and
call the original ha_partition partitions clone() and set at cloned
partitions.
Fix of .bzrignore on Windows with VS 2010
- Removed files specific to compiling on OS/2
- Removed files specific to SCO Unix packaging
- Removed "libmysqld/copyright", text is included in documentation
- Removed LaTeX headers for NDB Doxygen documentation
- Removed obsolete NDB files
- Removed "mkisofs" binaries
- Removed the "cvs2cl.pl" script
- Changed a few GPL texts to use "program" instead of "library"
when semijoin=on
When setting the aggregate function as having no rows to report
the function no_rows_in_result() was calling Item_sum::reset().
However this function in addition to cleaning up the aggregate
value by calling aggregator_clear() was also adding the current
value to the aggregate value by calling aggregator_add().
Fixed by making no_rows_in_result() to call aggregator_clear()
directly.
Renamed Item_sum::reset to Item_sum::reset_and_add() to
and added a comment to avoid misinterpretation of what the
function does.
Regression introduced by WL#2649.
Problem: queries with date/datetime columns did not use indexes:
set names non_latin1_charset;
select * from date_index_test
where date_column between '2010-09-01' and '2010-10-01';
before WL#2649 indexes worked fine because charset of
date/datetime
columns was BINARY which always won.
Fix: testing that collation of the operation matches collation
of the field is only needed in case of "real" string data types.
For DATE, DATETIME it's not needed.
@ mysql-test/include/ctype_numconv.inc
@ mysql-test/r/ctype_binary.result
@ mysql-test/r/ctype_cp1251.result
@ mysql-test/r/ctype_latin1.result
@ mysql-test/r/ctype_ucs.result
@ mysql-test/r/ctype_utf8.result
Adding tests
@ sql/field.h
Adding new method Field_str::match_collation_to_optimize_range()
for use in opt_range.cc to distinguish between
"real string" types like CHAR, VARCHAR, TEXT
(Field_string, Field_varstring, Field_blob)
and "almost string" types DATE, TIME, DATETIME
(Field_newdate, Field_datetime, Field_time, Field_timestamp)
@ sql/opt_range.cc
Using new method instead of checking result_type() against STRING result.
Note:
Another part of this problem (which is not regression)
is submitted separately (see bug##58329).
Subselect executes twice, at JOIN::optimize stage
and at JOIN::execute stage. At optimize stage
Innodb prebuilt struct which is used for the
retrieval of column values is initialized in.
ha_innobase::index_read(), prebuilt->sql_stat_start is true.
After QUICK_ROR_INTERSECT_SELECT finished his job it
restores read_set/write_set bitmaps with initial values
and deactivates one of the handlers used by
QUICK_ROR_INTERSECT_SELECT in JOIN::cleanup
(it's the case when we reuse original handler as one of
handlers required by QUICK_ROR_INTERSECT_SELECT object).
On second subselect execution inactive handler is activated
in QUICK_RANGE_SELECT::reset, file->ha_index_init().
In ha_index_init Innodb prebuilt struct is reinitialized
with inappropriate read_set/write_set bitmaps. Further
reinitialization in ha_innobase::index_read() does not
happen as prebuilt->sql_stat_start is false.
It leads to partial retrieval of required field values
and we get a mix of field values from different records
in the record buffer.
The fix is to reset
read_set/write_set bitmaps as these values
are required for proper intialization of
internal InnoDB struct which is used for
the retrieval of column values
(see build_template(), ha_innodb.cc)
Queries involving predicates of the form "const NOT BETWEEN
not_indexed_column AND indexed_column" could return wrong data
due to incorrect handling by the range optimizer.
For "c NOT BETWEEN f1 AND f2" predicates, get_mm_tree()
produces a disjunction of the SEL_ARG trees for "f1 > c" and
"f2 < c". If one of the trees is empty (i.e. one of the
arguments is not sargable) the resulting tree should be empty
as well, since the whole expression in this case is not
sargable.
The above logic is implemented in get_mm_tree() as follows. The
initial state of the resulting tree is NULL (aka empty). We
then iterate through arguments and compute the corresponding
SEL_ARG tree (either "f1 > c" or "f2 < c"). If the resulting
tree is NULL, it is simply replaced by the generated
tree. Otherwise it is replaced by a disjunction of itself and
the generated tree. The obvious flaw in this implementation is
that if the first argument is not sargable and thus produces a
NULL tree, the resulting tree will simply be replaced by the
tree for the second argument. As a result, "c NOT BETWEEN f1
AND f2" will end up as just "f2 < c".
Fixed by adding a check so that when the first argument
produces an empty tree for the NOT BETWEEN case, the loop is
aborted with an empty tree as a result. The whole idea of using
a loop for 2 arguments does not make much sense, but it was
probably used to avoid code duplication for several BETWEEN
variants.
Calculating the estimated number of records for a range scan
may take a significant time, and it was impossible for a user
to interrupt that process by killing the connection or the
query.
Fixed by checking the thread's 'killed' status in
check_quick_keys() and interrupting the calculation process if
it is set to a non-zero value.
Essentially, the problem is that safemalloc is excruciatingly
slow as it checks all allocated blocks for overrun at each
memory management primitive, yielding a almost exponential
slowdown for the memory management functions (malloc, realloc,
free). The overrun check basically consists of verifying some
bytes of a block for certain magic keys, which catches some
simple forms of overrun. Another minor problem is violation
of aliasing rules and that its own internal list of blocks
is prone to corruption.
Another issue with safemalloc is rather the maintenance cost
as the tool has a significant impact on the server code.
Given the magnitude of memory debuggers available nowadays,
especially those that are provided with the platform malloc
implementation, maintenance of a in-house and largely obsolete
memory debugger becomes a burden that is not worth the effort
due to its slowness and lack of support for detecting more
common forms of heap corruption.
Since there are third-party tools that can provide the same
functionality at a lower or comparable performance cost, the
solution is to simply remove safemalloc. Third-party tools
can provide the same functionality at a lower or comparable
performance cost.
The removal of safemalloc also allows a simplification of the
malloc wrappers, removing quite a bit of kludge: redefinition
of my_malloc, my_free and the removal of the unused second
argument of my_free. Since free() always check whether the
supplied pointer is null, redudant checks are also removed.
Also, this patch adds unit testing for my_malloc and moves
my_realloc implementation into the same file as the other
memory allocation primitives.
from mysql-next-mr-opt-backporting.
Bug#54515: Crash in opt_range.cc::get_best_group_min_max on
SELECT from VIEW with GROUP BY
When handling the grouping items in get_best_group_min_max, the
items need to be of type Item_field. In this bug, an ASSERT
triggered because the item used for grouping was an
Item_direct_view_ref (i.e., the group column is from a view).
The fix is to get the real_item since Item_ref* pointing to
Item_field is ok.
require O(#scans) memory
When an index merge operation was restarted, it would
re-allocate the Unique object controlling the duplicate row
ID elimination. Fixed by making the Unique object a member
of QUICK_INDEX_MERGE_SELECT and thus reusing it throughout
the lifetime of this object.
use limit efficiently
Bug #36569: UPDATE ... WHERE ... ORDER BY... always does a
filesort even if not required
Also two bugs reported after QA review (before the commit
of bugs above to public trees, no documentation needed):
Bug #53737: Performance regressions after applying patch
for bug 36569
Bug #53742: UPDATEs have no effect after applying patch
for bug 36569
Execution of single-table UPDATE and DELETE statements did not use the
same optimizer as was used in the compilation of SELECT statements.
Instead, it had an optimizer of its own that did not take into account
that you can omit sorting by retrieving rows using an index.
Extra optimization has been added: when applicable, single-table
UPDATE/DELETE statements use an existing index instead of filesort. A
corresponding SELECT query would do the former.
Also handling of the DESC ordering expression has been added when
reverse index scan is applicable.
From now on most single table UPDATE and DELETE statements show the
same disk access patterns as the corresponding SELECT query. We verify
this by comparing the result of SHOW STATUS LIKE 'Sort%
Currently the get_index_for_order function
a) checks quick select index (if any) for compatibility with the
ORDER expression list or
b) chooses the cheapest available compatible index, but only if
the index scan is cheaper than filesort.
Second way is implemented by the new test_if_cheaper_ordering
function (extracted part the test_if_skip_sort_order()).
In process of record search it is not taken into account
that inital quick->file->ref value could be inapplicable
to range interval. After proper row is found this value is
stored into the record buffer and later the record is
filtered out at condition evaluation stage.
The fix is store a refernce of found row to the handler ref field.
Fix various mismatches between function's language linkage. Any
particular function that is declared in C++ but should be callable
from C must have C linkage. Note that function types with different
linkages are also distinct. Thus, if a function type is declared in
C code, it will have C linkage (same if declared in a extern "C"
block).
remember range endpoints
The Loose Index Scan optimization keeps track of a sequence
of intervals. For the current interval it maintains the
current interval's endpoints. But the maximum endpoint was
not stored in the SQL layer; rather, it relied on the
storage engine to retain this value in-between reads. By
coincidence this holds for MyISAM and InnoDB. Not for the
partitioning engine, however.
Fixed by making the key values iterator
(QUICK_RANGE_SELECT) keep track of the current maximum endpoint.
This is also more efficient as we save a call through the
handler API in case of open-ended intervals.
The code to calculate endpoints was extracted into
separate methods in QUICK_RANGE_SELECT, and it was possible to
get rid of some code duplication as part of fix.
This patch:
- Moves all definitions from the mysql_priv.h file into
header files for the component where the variable is
defined
- Creates header files if the component lacks one
- Eliminates all include directives from mysql_priv.h
- Eliminates all circular include cycles
- Rename time.cc to sql_time.cc
- Rename mysql_priv.h to sql_priv.h
It appears that stack overflow checks for recusrive stored procedure
calls, that run in the normal server, did not work in embedded and were
dummified with preprocessor magic( #ifndef EMBEDDED_SERVER ).
The fix is to remove ifdefs, there is no reason not to run overflow checks
and crash in deeply recursive calls.
Note: Start of the stack (thd->thread_stack variable) in embedded is not
necessarily exact but stil provides the best guess. Unless the caller of
mysql_read_connect() is already deep in the stack, thd->thread_stack
variable should approximate stack start address well.
function with distinct.
Loose index scan is used to find MIN/MAX values using appropriate index and
thus allow to avoid grouping. For each found row it updates non-aggregated
fields with values from row with found MIN/MAX value.
Without loose index scan non-aggregated fields are copied by end_send_group
function. With loose index scan there is no need in end_send_group and
end_send is used instead. Non-aggregated fields still need to be copied and
this was wrongly implemented in QUICK_GROUP_MIN_MAX_SELECT::get_next.
WL#3220 added a case when loose index scan can be used with end_send_group to
optimize calculation of aggregate functions with distinct. In this case
the row found by QUICK_GROUP_MIN_MAX_SELECT::get_next might belong to a next
group and copying it will produce wrong result.
Update of non-aggregated fields is moved to the end_send function from
QUICK_GROUP_MIN_MAX_SELECT::get_next.
Queries optimized with GROUP_MIN_MAX didn't cleanup KEYREAD
optimization properly. As a result subsequent queries may
return incomplete rows (fields are initialized to default
values).
Bug#16565 mysqld --help --verbose does not order variablesBug#20413 sql_slave_skip_counter is not shown in show variables
Bug#20415 Output of mysqld --help --verbose is incomplete
Bug#25430 variable not found in SELECT @@global.ft_max_word_len;
Bug#32902 plugin variables don't know their names
Bug#34599 MySQLD Option and Variable Reference need to be consistent in formatting!
Bug#34829 No default value for variable and setting default does not raise error
Bug#34834 ? Is accepted as a valid sql mode
Bug#34878 Few variables have default value according to documentation but error occurs
Bug#34883 ft_boolean_syntax cant be assigned from user variable to global var.
Bug#37187 `INFORMATION_SCHEMA`.`GLOBAL_VARIABLES`: inconsistent status
Bug#40988 log_output_basic.test succeeded though syntactically false.
Bug#41010 enum-style command-line options are not honoured (maria.maria-recover fails)
Bug#42103 Setting key_buffer_size to a negative value may lead to very large allocations
Bug#44691 Some plugins configured as MYSQL_PLUGIN_MANDATORY in can be disabled
Bug#44797 plugins w/o command-line options have no disabling option in --help
Bug#46314 string system variables don't support expressions
Bug#46470 sys_vars.max_binlog_cache_size_basic_32 is broken
Bug#46586 When using the plugin interface the type "set" for options caused a crash.
Bug#47212 Crash in DBUG_PRINT in mysqltest.cc when trying to print octal number
Bug#48758 mysqltest crashes on sys_vars.collation_server_basic in gcov builds
Bug#49417 some complaints about mysqld --help --verbose output
Bug#49540 DEFAULT value of binlog_format isn't the default value
Bug#49640 ambiguous option '--skip-skip-myisam' (double skip prefix)
Bug#49644 init_connect and \0
Bug#49645 init_slave and multi-byte characters
Bug#49646 mysql --show-warnings crashes when server dies