- Part 3: Adding mem_root to push_back() and push_front()
Other things:
- Added THD as an argument to some partition functions.
- Added memory overflow checking for XML tag's in read_xml()
- Added mem_root to all calls to new Item
- Added private method operator new(size_t size) to Item to ensure that
we always use a mem_root when creating an item.
This saves use once call to current_thd per Item creation
Added mandatory thd parameter to Item (and all derivative classes) constructor.
Added thd parameter to all routines that may create items.
Also removed "current_thd" from Item::Item. This reduced number of
pthread_getspecific() calls from 290 to 177 per OLTP RO transaction.
While sql_bin_log=1(0) is meant to control binary logging for the
current session so that the updates to do(not) get logged into the
binary log to be replicated to the async MariaDB slave. The same
should not affect galera replication.
That is, the updates should always get replicated to other galera
nodes regardless of sql_bin_log's value.
Fixed by making sure that the updates are written to binlog cache
irrespective of sql_bin_log.
Added test cases.
The old code used pthread_setspecific() to store temporary data used by the thread.
This is not safe when used with thread pool, as the thread may change for the transaction.
The fix is to save the data in THD, which is guaranteed to be properly freed.
I also fixed the code so that we don't do a malloc() for every transaction.
- Changed ER(ER_...) to ER_THD(thd, ER_...) when thd was known or if there was many calls to current_thd in the same function.
- Changed ER(ER_..) to ER_THD_OR_DEFAULT(current_thd, ER...) in some places where current_thd is not necessary defined.
- Removing calls to current_thd when we have access to thd
Part of this is optimization (not calling current_thd when not needed),
but part is bug fixing for error condition when current_thd is not defined
(For example on startup and end of mysqld)
Notable renames done as otherwise a lot of functions would have to be changed:
- In JOIN structure renamed:
examined_rows -> join_examined_rows
record_count -> join_record_count
- In Field, renamed new_field() to make_new_field()
Other things:
- Added DBUG_ASSERT(thd == tmp_thd) in Item_singlerow_subselect() just to be safe.
- Removed old 'tab' prefix in JOIN_TAB::save_explain_data() and use members directly
- Added 'thd' as argument to a few functions to avoid calling current_thd.
Initialize abs_timeout when it is about to be used. This saves one my_hrtime()
call on hot path (when we acquire MDL lock without waiting).
When filling I_S.PROCESSLIST use THD::start_utime/THD::utime_after_query instead
of THD::start_time. This allows us to save 2 clock_gettime() calls.
Overhead change:
__clock_gettime 0.13% -> 0.11% (122 -> 76 calls per OLTP RO transaction)
my_interval_timer 0.07% -> 0.06%
my_hrtime 0.04% -> 0.01%
When the slave processes the master restart format_description event,
parallel replication needs to complete any prior events before processing
the restart event (which closes temporary tables and such stuff).
This happens in wait_for_workers_idle(), however it was not waiting long
enough. The wait was using wait_for_prior_commit(), but at that points table
can still be open. This lead to assertion in this case.
So change wait_for_workers_idle() to wait until all worker threads have
reached finish_event_group(), at which point all tables should have been
closed.
Avoid calling current_thd from thd_kill_level(). This reduces number of
pthread_getspecific() calls from 776 to 354.
Also thd_kill_level(NULL) is not permitted anymore: this saves one condition.
Added THD argument to select_result and all derivative classes.
This reduces number of pthread_getspecific calls from 796 to 776 per OLTP RO
transaction.
THD::enter_stage() optimizations:
- stage backup code moved to THD::backup_stage(), saves one condition
- moved check for "new_stage" out to callers that actually need it
- remnants of enter_stage() moved to sql_class.h so it can be inlined
THD::enter_stage() overhead dropped 0.48% -> 0.07%.
PROFILING::status_change() optimizations:
- "status_arg" is now checked by QUERY_PROFILE::new_status()
- no need to check "enabled": !enabled && current is impossible
- remnants of status_change() moved to sql_profile.h so it can be inlined
PROFILING::status_change() overhead dropped 0.1% -> out of radar.
XA COMMIT/ROLLBACK of XA transaction owned by different thread may access
freed memory if that thread disconnects at the same time.
Also concurrent XA COMMIT/ROLLBACK of recovered XA transaction were not
serialized properly.
including the big commit
commit 305130361bf72726de220f3d2b2787395e10be61
Author: Marc Alff <marc.alff@oracle.com>
Date: Tue Feb 10 11:31:32 2015 +0100
WL#8354 BACKPORT DIGEST IMPROVEMENTS TO MYSQL 5.6
(with the following commits) and related changes in sql/
on disconnect THD must clean user_var_events array before
dropping temporary tables. Otherwise when binlogging a DROP,
it'll access user_var_events, but they were allocated
in the already freed memroot.
XID cache is now based on lock-free hash.
Also fixed lf_hash_destroy() to call alloc destructor.
Note that previous implementation had race condition when thread was accessing
XA owned by different thread. This new implementation doesn't fix it either.
Parallel replication (in 10.0 / "conservative" mode) relies on binlog group
commits to group transactions that can be safely run in parallel on the
slave. The --binlog-commit-wait-count and --binlog-commit-wait-usec options
exist to increase the number of commits per group. But in case of conflicts
between transactions, this can cause unnecessary delay and reduced througput,
especially on a slave where commit order is fixed.
This patch adds a heuristics to reduce this problem. When transaction T1 goes
to commit, it will first wait for N transactions to queue up for a group
commit. However, if we detect that another transaction T2 is waiting for a row
lock held by T1, then we will skip the wait and let T1 commit immediately,
releasing locks and let T2 continue.
On a slave, this avoids the unfortunate situation where T1 is waiting for T2
to join the group commit, but T2 is waiting for T1 to release locks, causing
no work to be done for the duration of the --binlog-commit-wait-usec timeout.
(The heuristic seems reasonable on the master as well, so it is enabled for
all transactions, not just replication transactions).
fixed embedded server tests
MDEV-7009: SET STATEMENT min_examined_row_limit has no effect
MDEV-6948:SET STATEMENT gtid_domain_id = ... FOR has no effect (same for gtid_seq_no and server_id)
old values of SET STATENENT variables now saved in its own Query_arena and restored later
Create_tmp_table was called incorrectly called in
select_materialized_with_stats::create_result_table, having keep_row_order
passed for the do_not_open parameter and keep_row_order always set to false.
* reset current_thd in THD::~THD, otherwise my_malloc_size_cb_func()
might access THD after it was destroyed.
* remove now redundant set_current_thd(0) calls that follow delete thd.
Implement a new mode for parallel replication. In this mode, all transactions
are optimistically attempted applied in parallel. In case of conflicts, the
offending transaction is rolled back and retried later non-parallel.
This is an early-release patch to facilitate testing, more changes to user
interface / options will be expected. The new mode is not enabled by default.