query / no aggregate of subquery
The optimizer counts the aggregate functions that
appear as top level expressions (in all_fields) in
the current subquery. Later it makes a list of these
that it uses to actually execute the aggregates in
end_send_group().
That count is used in several places as a flag whether
there are aggregates functions.
While collecting the above info it must not consider
aggregates that are not aggregated in the current
context. It must treat them as normal expressions
instead. Not doing that leads to incorrect data about
the query, e.g. running a query that actually has no
aggregate functions as if it has some (and hence is
expected to return only one row).
Fixed by ignoring the aggregates that are not aggregated
in the current context.
One other smaller omission discovered and fixed in the
process : the place of aggregation was not calculated for
user defined functions. Fixed by calling
Item_sum::init_sum_func_check() and
Item_sum::check_sum_func() as it's done for the rest of
the aggregate functions.
Conversion errors when constructing the condition for an
IN predicates were treated as if the affected column contains
NULL. If such a IN predicate is inside NOT we get wrong
results.
Corrected the handling of conversion errors in an IN predicate
that is resolved by unique_subquery (through
subselect_uniquesubquery_engine).
a crash when the left operand of the predicate is evaluated to NULL.
It happens when the rows from the inner tables (tables from the subquery)
are accessed by index methods with key values obtained by evaluation of
the left operand of the subquery predicate. When this predicate is
evaluated to NULL an alternative access with full table scan is used
to check whether the result set returned by the subquery is empty or not.
The crash was due to the fact the info about the access methods used for
regular key values was not properly restored after a switch back from the
full scan access method had occurred.
The patch restores this info properly.
The same problem existed for queries with IN subquery predicates if they
were used not at the top level of the queries.
The following type conversions was done:
- Changed byte to uchar
- Changed gptr to uchar*
- Change my_string to char *
- Change my_size_t to size_t
- Change size_s to size_t
Removed declaration of byte, gptr, my_string, my_size_t and size_s.
Following function parameter changes was done:
- All string functions in mysys/strings was changed to use size_t
instead of uint for string lengths.
- All read()/write() functions changed to use size_t (including vio).
- All protocoll functions changed to use size_t instead of uint
- Functions that used a pointer to a string length was changed to use size_t*
- Changed malloc(), free() and related functions from using gptr to use void *
as this requires fewer casts in the code and is more in line with how the
standard functions work.
- Added extra length argument to dirname_part() to return the length of the
created string.
- Changed (at least) following functions to take uchar* as argument:
- db_dump()
- my_net_write()
- net_write_command()
- net_store_data()
- DBUG_DUMP()
- decimal2bin() & bin2decimal()
- Changed my_compress() and my_uncompress() to use size_t. Changed one
argument to my_uncompress() from a pointer to a value as we only return
one value (makes function easier to use).
- Changed type of 'pack_data' argument to packfrm() to avoid casts.
- Changed in readfrm() and writefrom(), ha_discover and handler::discover()
the type for argument 'frmdata' to uchar** to avoid casts.
- Changed most Field functions to use uchar* instead of char* (reduced a lot of
casts).
- Changed field->val_xxx(xxx, new_ptr) to take const pointers.
Other changes:
- Removed a lot of not needed casts
- Added a few new cast required by other changes
- Added some cast to my_multi_malloc() arguments for safety (as string lengths
needs to be uint, not size_t).
- Fixed all calls to hash-get-key functions to use size_t*. (Needed to be done
explicitely as this conflict was often hided by casting the function to
hash_get_key).
- Changed some buffers to memory regions to uchar* to avoid casts.
- Changed some string lengths from uint to size_t.
- Changed field->ptr to be uchar* instead of char*. This allowed us to
get rid of a lot of casts.
- Some changes from true -> TRUE, false -> FALSE, unsigned char -> uchar
- Include zlib.h in some files as we needed declaration of crc32()
- Changed MY_FILE_ERROR to be (size_t) -1.
- Changed many variables to hold the result of my_read() / my_write() to be
size_t. This was needed to properly detect errors (which are
returned as (size_t) -1).
- Removed some very old VMS code
- Changed packfrm()/unpackfrm() to not be depending on uint size
(portability fix)
- Removed windows specific code to restore cursor position as this
causes slowdown on windows and we should not mix read() and pread()
calls anyway as this is not thread safe. Updated function comment to
reflect this. Changed function that depended on original behavior of
my_pwrite() to itself restore the cursor position (one such case).
- Added some missing checking of return value of malloc().
- Changed definition of MOD_PAD_CHAR_TO_FULL_LENGTH to avoid 'long' overflow.
- Changed type of table_def::m_size from my_size_t to ulong to reflect that
m_size is the number of elements in the array, not a string/memory
length.
- Moved THD::max_row_length() to table.cc (as it's not depending on THD).
Inlined max_row_length_blob() into this function.
- More function comments
- Fixed some compiler warnings when compiled without partitions.
- Removed setting of LEX_STRING() arguments in declaration (portability fix).
- Some trivial indentation/variable name changes.
- Some trivial code simplifications:
- Replaced some calls to alloc_root + memcpy to use
strmake_root()/strdup_root().
- Changed some calls from memdup() to strmake() (Safety fix)
- Simpler loops in client-simple.c
Non-correlated scalar subqueries may get executed
in EXPLAIN at the optimization phase if they are
part of a right hand sargable expression.
If the scalar subquery uses a temp table to
materialize its results it will replace the
subquery structure from the parser with a simple
select from the materialization table.
As a result the EXPLAIN will crash as the
temporary materialization table is not to be shown
in EXPLAIN at all.
Fixed by preserving the original query structure
right after calling optimize() for scalar subqueries
with temp tables executed during EXPLAIN.
Before this fix, a IN predicate of the form: "IN (( subselect ))", with two
parenthesis, would be evaluated as a single row subselect: if the subselect
returns more that 1 row, the statement would fail.
The SQL:2003 standard defines a special exception in the specification,
and mandates that this particular form of IN predicate shall be equivalent
to "IN ( subselect )", which involves a table subquery and works with more
than 1 row.
This fix implements "IN (( subselect ))", "IN ((( subselect )))" etc
as per the SQL:2003 requirement.
All the details related to the implementation of this change have been
commented in the code, and the relevant sections of the SQL:2003 spec
are given for reference, so they are not repeated here.
Having access to the spec is a requirement to review in depth this patch.
index_read(), index_read_idx(), index_read_last(), and
records_in_range() - instead of 'uint keylen' argument take
'ulonglong keypart_map', a bitmap showing which keyparts are
present in the key value.
Fallback method is provided for handlers that are lagging behind.
The bug report has demonstrated the following two problems.
1. If an ORDER/GROUP BY list includes a constant expression being
optimized away and, at the same time, containing single-row
subselects that return more that one row, no error is reported.
Strictly speaking the standard allows to ignore error in this case.
Yet, now a corresponding fatal error is reported in this case.
2. If a query requires sorting by expressions containing single-row
subselects that, however, return more than one row, then the execution
of the query may cause a server crash.
To fix this some code has been added that blocks execution of a subselect
item in case of a fatal error in the method Item_subselect::exec.
- Make the code produce correct result: use an array of triggers to turn on/off equalities for each
compared column. Also turn on/off optimizations based on those equalities.
- Make EXPLAIN output show "Full scan on NULL key" for tables for which we switch between
ref/unique_subquery/index_subquery and ALL access.
- index_subquery engine now has HAVING clause when it is needed, and it is
displayed in EXPLAIN EXTENDED
- Fix incorrect presense of "Using index" for index/unique-based subqueries (BUG#22930)
// bk trigger note: this commit refers to BUG#24127
When transforming "oe IN (SELECT ie ...)" wrap the pushed-down predicates
iff "oe can be null", not "ie can be null".
The fix doesn't cover row-based subqueries, those will be fixed in #24127.
- Removed not used variables and functions
- Added #ifdef around code that is not used
- Renamed variables and functions to avoid conflicts
- Removed some not used arguments
Fixed some class/struct warnings in ndb
Added define IS_LONGDATA() to simplify code in libmysql.c
I did run gcov on the changes and added 'purecov' comments on almost all lines that was not just variable name changes