"Re-factor the code for post-join operations".
The patch mainly contains the code ported from mysql-5.6 and
created for two essential architectural changes:
1. WL#5558: Resolve ORDER BY execution method at the optimization stage
2. WL#6071: Inline tmp tables into the nested loops algorithm
The first task was implemented for mysql-5.6 by Ole John Aske.
It allows to make all decisions on ORDER BY operation at the optimization
stage.
The second task implemented for mysql-5.6 by Evgeny Potemkin adds JOIN_TAB
nodes for post-join operations that require temporary tables. It allows
to execute these operations within the nested loops algorithm that used to
be used before this task only for join queries. Besides these task moves
all planning on the execution of these operations from the execution phase
to the optimization phase.
Some other re-factoring changes of mysql-5.6 were pulled in, mainly because
it was easier to pull them in than roll them back. In particular all
changes concerning Ref_ptr_array were incorporated.
The port required some changes in the MariaDB code that concerned the
functionality of EXPLAIN and ANALYZE. This was done mainly by Sergey
Petrunia.
The following left in semi-improved state to keep patch size reasonable:
- Field operator new: left thd_alloc(current_thd)
- Sql_alloc operator new: left thd_alloc(thd_get_current_thd())
- Item_args constructors: left thd_alloc(thd)
- Item_func_interval::fix_length_and_dec(): no THD arg, have to call current_thd
- Item_func_dyncol_exists::val_int(): same
- Item_dyncol_get::val_str(): same
- Item_dyncol_get::val_int(): same
- Item_dyncol_get::val_real(): same
- Item_dyncol_get::val_decimal(): same
- Item_singlerow_subselect::fix_length_and_dec(): same
Problem:
Procedure which uses stack of views first executed without most deep view.
It fails but one view cached (as well as whole procedure).
Then simultaniusely create the second view we lack and execute the procedure.
In the beginning of procedure execution the view is not yet created so
procedure used as it was cached (cache was not invalidated).
But by the time we are trying to use most deep view it is already created.
The problem with the view is that thd->select_number (first view was not parsed) so second view will get the same number.
The fix is in keeping the thd->select_number correct even if we use cached views.
In the proposed solution (to keep it simple) counter can be bigger then should but it should not create problem because numbers are still unique and situation is very rare.
When CHANGE MASTER was executed as a PS, its attributes were wrongly
getting reset toward the end of PREPARE. As a result, the subsequent
executions had no effect. Fixed by making sure that the CHANGE MASTER
attributes are preserved during the lifetime of the PS.
- Part 3: Adding mem_root to push_back() and push_front()
Other things:
- Added THD as an argument to some partition functions.
- Added memory overflow checking for XML tag's in read_xml()
Other things:
- Avoid calling init_and_set_log_file_name() when opening binary log.
- Remove newlines early when reading from index file.
- Ensure that reset_logs() will work even if thd is 0 (Can happen on startup)
- Added thd to sart_slave_threads() for better error handling.
- Changed ER(ER_...) to ER_THD(thd, ER_...) when thd was known or if there was many calls to current_thd in the same function.
- Changed ER(ER_..) to ER_THD_OR_DEFAULT(current_thd, ER...) in some places where current_thd is not necessary defined.
- Removing calls to current_thd when we have access to thd
Part of this is optimization (not calling current_thd when not needed),
but part is bug fixing for error condition when current_thd is not defined
(For example on startup and end of mysqld)
Notable renames done as otherwise a lot of functions would have to be changed:
- In JOIN structure renamed:
examined_rows -> join_examined_rows
record_count -> join_record_count
- In Field, renamed new_field() to make_new_field()
Other things:
- Added DBUG_ASSERT(thd == tmp_thd) in Item_singlerow_subselect() just to be safe.
- Removed old 'tab' prefix in JOIN_TAB::save_explain_data() and use members directly
- Added 'thd' as argument to a few functions to avoid calling current_thd.
This is MDEV-7601, including it's sub tasks MDEV-7594, MDEV-7555, MDEV-7590, MDEV-7581, MDEV-7589
The problem was that select_lex->non_agg_fields was not properly reset for re-execution and this caused an overwrite of a random memory position.
The fix was move non_agg_fields from select_lext to JOIN, which is properly reset.
Handle the case where the optimizer decides to use
handler->delete_all_rows(), but then this call returns
HA_ERR_UNSUPPORTED and execution switches to regular
row-by-row deletion.
To make st_select_lex::master_unit() inlinable:
- moved it's definition to sql_lex.h
- removed base class virtual master_unit() declaration since this method is
specific to st_select_lex
Overhead change:
st_select_lex::master_unit() 0.17% -> out of radar
execute_sqlcom_select() 0.13% -> 0.12%
JOIN::save_explain_data_intern() 0.27% -> 0.23%
JOIN::optimize_inner() 0.76% -> 0.72%
JOIN::exec_inner() 0.30% -> 0.24%
JOIN::prepare() 0.30% -> 0.29%
JOIN::optimize() 0.05% -> 0.05%
sql_alloc() has additional costs compared to direct mem_root allocation:
- function call: it is defined in a separate translation unit and can't be
inlined
- it needs to call pthread_getspecific() to get THD::mem_root
It is called dozens of times implicitely at least by:
- List<>::push_back()
- List<>::push_front()
- new (for Sql_alloc derived classes)
- sql_memdup()
Replaced lots of implicit sql_alloc() calls with direct mem_root allocation,
passing through THD pointer whenever it is needed.
Number of sql_alloc() calls reduced 345 -> 41 per OLTP RO transaction.
pthread_getspecific() overhead dropped 0.76 -> 0.59
sql_alloc() overhed dropped 0.25 -> 0.06
delete_dynamic() was called 9-11x per OLTP RO query + 3x per BEGIN/COMMIT.
3 calls were performed by LEX_MASTER_INFO. Added condition to call those only
for CHANGE MASTER.
1 call was performed by lock_table_names()/Hash_set/my_hash_free(). Hash_set was
supposed to be used for DDL and LOCK TABLES to gather database names, while it
was initialized/freed for DML too. In fact Hash_set didn't do any useful job
here. Hash_set was removed and MDL requests are now added directly to the list.
The rest 5-7 calls are done by optimizer, mostly by Explain_query and friends.
Since dynamic arrays are used in most cases, they can hardly be optimized.
my_hash_free() overhead dropped 0.02 -> out of radar.
delete_dynamic() overhead dropped 0.12 -> 0.04.
including the big commit
commit 305130361bf72726de220f3d2b2787395e10be61
Author: Marc Alff <marc.alff@oracle.com>
Date: Tue Feb 10 11:31:32 2015 +0100
WL#8354 BACKPORT DIGEST IMPROVEMENTS TO MYSQL 5.6
(with the following commits) and related changes in sql/
- Remove ANALYZE's timing code off the the execution path of regular
SELECTs.
- Improve the tracker that tracks counts/execution times of SELECTs or
DML statements:
= regular execution just increments counters
= ANALYZE will also collect timings.
MDEV-6218 Wrong result of CHAR_LENGTH(non-BMP-character) with 3-byte utf8
- Moving get_text() as a method to Lex_input_stream.
- Moving the unescaping part into a separate function,
this piece of code will later go to /strings most likely.
- Removing Lex_input_string::yytoklen, as it's not needed any more.
fixed embedded server tests
MDEV-7009: SET STATEMENT min_examined_row_limit has no effect
MDEV-6948:SET STATEMENT gtid_domain_id = ... FOR has no effect (same for gtid_seq_no and server_id)
old values of SET STATENENT variables now saved in its own Query_arena and restored later
length/dec/charset are still in LEX, because they're also used
for CAST and dynamic columns.
also
1. fix "MDEV-7041 COLLATION(CAST('a' AS CHAR BINARY)) returns a wrong result"
2. allow BINARY modifier in stored function RETURN clause
3. allow "COLLATION without CHARSET" in SP/SF (parameters, RETURN, DECLARE)
4. print correct variable name in error messages for stored routine parameters
* pass LEX_STRING's from the parser, don't ignore the length only to strlen later
* init LEX::server_options only for SERVER commands, not for every statement
* don't put temporary values into a global persistent memroot
but really it's just scratching a surface
When parsing a field declaration, grab type information from LEX before it's overwritten
by further rules. Pass type information through the parser stack to the rule that needs it.