Originally introduced by e972125f1 to avoid harmless wait for
LOCK_global_system_variables in a newly created thread, which creation was
initiated by system variable update.
At the same time it opens dangerous hole, when system variable update
thread already released LOCK_global_system_variables and ack_receiver
thread haven't yet completed new THD construction. In this case THD
constructor goes completely unprotected.
Since ack_receiver.stop() waits for the thread to go down, we have to
temporarily release LOCK_global_system_variables so that it doesn't
deadlock with ack_receiver.run(). Unfortunately it breaks atomicity
of rpl_semi_sync_master_enabled updates and makes them not serialized.
LOCK_rpl_semi_sync_master_enabled was introduced to workaround the above.
TODO: move ack_receiver start/stop into repl_semisync_master
enable_master/disable_master under LOCK_binlog protection?
Part of MDEV-14984 - regression in connect performance
Rather than parsing session_track_system_variables when thread starts, do
it when first trackable event occurs.
Benchmarked on a 2socket/20core/40threads Broadwell system using sysbench
connect brencmark @40 threads (with select 1 disabled):
101379.77 -> 143016.68 CPS, whereas 10.2 is currently at 137766.31 CPS.
Part of MDEV-14984 - regression in connect performance
One less new/delete per connection.
Removed m_mem_flag since most allocs are thread specific. The only
exception are allocs performed during initialization.
Removed State_tracker and Session_tracker constructors as they don't make
sense anymore.
No reason to access session_sysvars_tracker via get_tracker(), so access
it directly instead.
Part of MDEV-14984 - regression in connect performance
If a derived table has SELECT DISTINCT, provide index statistics for it so that the join optimizer in the
upper select knows that ref access to the table will produce one row.
depends on uninitialised value
Initialized THD::force_read_stats introduced in the patch for MDEV-17605.
Leaving this field uninitialized in the constructor of the THD class may
trigger reading statistical data that is not needed.
depends on uninitialised value
Initialized THD::force_read_stats introduced in the patch for MDEV-17605.
Leaving this field uninitialized in the constructor of the THD class may
trigger reading statistical data that is not needed.
Let xid_cache_insert()/xid_cache_delete() handle xa_state.
Let session tracker use is_explicit_XA() rather than xa_state != XA_NOTR.
Fixed open_tables() to refuse data access in XA_ROLLBACK_ONLY state.
Removed dead code from THD::cleanup(). It was supposed to be a reminder,
but it got messed up over time.
spider_internal_start_trx() is called either with XA_NOTR or XA_ACTIVE,
which is guarded by server callers. Thus is_explicit_XA() is acceptable
replacement for XA_ACTIVE check (which was likely wrong anyway).
Setting xa_state to XA_PREPARED in spider_internal_xa_prepare() isn't
meaningful, as this value is never accessed later. It can't be accessed
by current thread and it can't be recovered either. It can only be
accessed by spider internally, which never happens.
Make spider_xa_lock()/spider_xa_unlock() static.
Part of MDEV-7974 - backport fix for mysql bug#12161 (XA and binlog)
XID_STATE::rm_error is never used by internal 2PC, it is intended to be
used by explicit XA only.
Also removed redundant xid reset from THD::init_for_queries(). Must've
been done already either by THD::transaction constructor or by
THD::cleanup().
Part of MDEV-7974 - backport fix for mysql bug#12161 (XA and binlog)
- Changed replaying to always allocate a separate THD object
for applying log events. This is to avoid tampering original
THD state during replay process.
- Return success from sp_instr_stmt::exec_core() if replaying
succeeds.
- Do not push warnings/errors into diagnostics area if the
transaction must be replayed. This is to avoid reporting
transient errors to the client.
Added two tests galera_sp_bf_abort, galera_sp_insert_parallel.
Wsrep-lib position updated.
This bug is caused by pushdown from HAVING into WHERE.
It appears because condition that is pushed wasn't fixed.
It is also discovered that condition pushdown from HAVING into
WHERE is done wrong. There is no need to build clones for some
conditions that can be pushed. They can be simply moved from HAVING
into WHERE without cloning.
build_pushable_cond_for_having_pushdown(),
remove_pushed_top_conjuncts_for_having() methods are changed.
It is found that there is no transformation made for fields of
pushed condition.
field_transformer_for_having_pushdown transformer is added.
New tests are added. Some comments are changed.
ignore FK-prelocked tables when looking for write-prelocked tables
with auto-increment to complain about "Statement is unsafe because
it invokes a trigger or a stored function that inserts into an
AUTO_INCREMENT column"
The MDEV-17262 commit 26432e49d3
was skipped. In Galera 4, the implementation would seem to require
changes to the streaming replication.
In the tests archive.rnd_pos main.profiling, disable_ps_protocol
for SHOW STATUS and SHOW PROFILE commands until MDEV-18974
has been fixed.
This patch contains a fix for the MDEV-17262/17243 issues and
new mtr test.
These issues (MDEV-17262/17243) have two reasons:
1) After an intermediate commit, a transaction loses its status
of "transaction that registered in the MySQL for 2pc coordinator"
(in the InnoDB) due to the fact that since version 10.2 the
write_row() function (which located in the ha_innodb.cc) does
not call trx_register_for_2pc(m_prebuilt->trx) during the processing
of split transactions. It is necessary to restore this call inside
the write_row() when an intermediate commit was made (for a split
transaction).
Similarly, we need to set the flag of the started transaction
(m_prebuilt->sql_stat_start) after intermediate commit.
The table->file->extra(HA_EXTRA_FAKE_START_STMT) called from the
wsrep_load_data_split() function (which located in sql_load.cc)
will also do this, but it will be too late. As a result, the call
to the wsrep_append_keys() function from the InnoDB engine may be
lost or function may be called with invalid transaction identifier.
2) If a transaction with the LOAD DATA statement is divided into
logical mini-transactions (of the 10K rows) and binlog is rotated,
then in rare cases due to the wsrep handler re-registration at the
boundary of the split, the last portion of data may be lost. Since
splitting of the LOAD DATA into mini-transactions is technical,
I believe that we should not allow these mini-transactions to fall
into separate binlogs. Therefore, it is necessary to prohibit the
rotation of binlog in the middle of processing LOAD DATA statement.
https://jira.mariadb.org/browse/MDEV-17262 and
https://jira.mariadb.org/browse/MDEV-17243
* MDEV-16509 Improve wsrep commit performance with binlog disabled
Release commit order critical section early after trx_commit_low() if
binlog is not transaction coordinator. In order to avoid two phase commit,
binlog_hton is not registered for THD during IO_CACHE population.
Implemented a test which verifies that the transactions release
commit order early.
This optimization will change behavior during recovery as the commit
is not two phase when binlog is off. Fixed and recorded wsrep-recover-v25
and wsrep-recover to match the behavior.
* MDEV-18730 Ordering for wsrep binlog group commit
Previously out of order execution was allowed for wsrep commits.
Established proper ordering by populating wait_for_commit
for every wsrep THD and making group commit leader to wait for
prior commits before proceeding to trx_group_commit_leader().
* MDEV-18730 Added a test case to verify correct commit ordering
* MDEV-16509, MDEV-18730 Review fixes
Use WSREP_EMULATE_BINLOG() macro to decide if the binlog_hton
should be registered. Whitespace/syntax fixes and cleanups.
* MDEV-16509 Require binlog for galera_var_innodb_disallow_writes test
If the commit to InnoDB is done in one phase, the native InnoDB behavior
is that the transaction is committed in memory before it is persisted to
disk. This means that the innodb_disallow_writes=ON may not prevent
transaction to become visible to other readers before commit is completely
over. On the other hand, if the commit is two phase (as it is with binlog),
the transaction will be blocked in prepare phase.
Fixed the test to use binlog, which enforces two phase commit, which
in turn makes commit to block before the changes become visible to
other connections. This guarantees that the test produces expected
result.
The patches features an optional shutdown behavior to hold on until
after all connected slaves have been sent the last binlogged event.
The connected slave is one whose START SLAVE has been acknowledged and
that was not stopped since that though it could be technically
reconnecting in background.
The solution therefore disallows killing the dump thread until is has
found EOF of the latest binlog file. It is up to the shutdown
requester (DBA) to set up a sufficiently large shutdown timeout value
for shudown to wait patiently until lagging behind slaves have been
synchronized. On the other hand if a specific slave needs exclusion
from synchronization the DBA would have to stop it manually which
would terminate its dump thread.
`mysqladmin shutdown' is extended with a `--wait_for_all_slaves' option
which translates to `SHUTDOW WAIT FOR ALL SLAVES' sql query
to enable the feature on the client side.
The patch also performs a small refactoring of the server shutdown
around close_connections() to introduce kill thread phases which
are two as of current.
Make mysqltest to use --ps-protocol more
use prepared statements for everything that server supports
with the exception of CALL (for now).
Fix discovered test failures and bugs.
tests:
* PROCESSLIST shows Execute state, not Query
* SHOW STATUS increments status variables more than in text protocol
* multi-statements should be avoided (see tests with a wrong delimiter)
* performance_schema events have different names in --ps-protocol
* --enable_prepare_warnings
mysqltest.cc:
* make sure run_query_stmt() doesn't crash if there's
no active connection (in wait_until_connected_again.inc)
* prepare all statements that server supports
protocol.h
* Protocol_discard::send_result_set_metadata() should not send
anything to the client.
sql_acl.cc:
* extract the functionality of getting the user for SHOW GRANTS
from check_show_access(), so that mysql_test_show_grants() could
generate the correct column names in the prepare step
sql_class.cc:
* result->prepare() can fail, don't ignore its return value
* use correct number of decimals for EXPLAIN columns
sql_parse.cc:
* discard profiling for SHOW PROFILE. In text protocol it's done in
prepare_schema_table(), but in --ps it is called on prepare only,
so nothing was discarding profiling during execute.
* move the permission checking code for SHOW CREATE VIEW to
mysqld_show_create_get_fields(), so that it would be called during
prepare step too.
* only set sel_result when it was created here and needs to be
destroyed in the same block. Avoid destroying lex->result.
* use the correct number of tables in check_show_access(). Saying
"as many as possible" doesn't work when first_not_own_table isn't
set yet.
sql_prepare.cc:
* use correct user name for SHOW GRANTS columns
* don't ignore verbose flag for SHOW SLAVE STATUS
* support preparing REVOKE ALL and ROLLBACK TO SAVEPOINT
* don't ignore errors from thd->prepare_explain_fields()
* use select_send result for sending ANALYZE and EXPLAIN, but don't
overwrite lex->result, because it might be needed to issue execute-time
errors (select_dumpvar - too many rows)
sql_show.cc:
* check grants for SHOW CREATE VIEW here, not in mysql_execute_command
sql_view.cc:
* use the correct function to check privileges. Old code was doing
check_access() for thd->security_ctx, which is invoker's sctx,
not definer's sctx. Hide various view related errors from the invoker.
sql_yacc.yy:
* initialize lex->select_lex for LOAD, otherwise it'll contain garbage
data that happen to fail tests with views in --ps (but not otherwise).
slave_list was used to provide data for SHOW SLAVE HOSTS and
Slaves_connected status variable.
Introduced binlog_dump_thread_count which is exposed via Slaves_connected
(replaces slave_list.records).
Store Slave_info on THD and access it by iterating server_threads
(replaces slave_list).
Added:
THD::slave_info
binlog_dump_thread_count
show_slave_hosts_callback()
Removed:
slave_list
SLAVE_LIST_CHUNK
SLAVE_ERRMSG_SIZE
slave_list_key()
slave_info_free()
init_slave_list()
end_slave_list()
all_slave_list_mutexes
init_all_slave_list_mutexes()
key_LOCK_slave_list
LOCK_slave_list
Moved:
SLAVE_INFO -> Slave_info
register_slave() -> THD::register_slave()
unregister_slave() -> THD::unregister_slave()
Also removed redundant end_slave() from close_connections(): it is called
again soon afterwards by clean_up().
Pre-requisite for clean MDEV-18450 solution.
and, again, *don't use thd->clear_error()*
this fixed main.sp_notembedded failure on various amd64 platforms
(where ER_STACK_OVERRUN_NEED_MORE happens to fire in open_stat_tables()
under Dummy_error_handler)
This patch adds support for expiring user passwords.
The following statements are extended:
CREATE USER user@localhost PASSWORD EXPIRE [option]
ALTER USER user@localhost PASSWORD EXPIRE [option]
If no option is specified, the password is expired with immediate
effect. If option is DEFAULT, global policy applies according to
the default_password_lifetime system var (if 0, password never
expires, if N, password expires every N days). If option is NEVER,
the password never expires and if option is INTERVAL N DAY, the
password expires every N days.
The feature also supports the disconnect_on_expired_password system
var and the --connect-expired-password client option.
Closes#1166
The check for streaming replication logging format in
THD::decide_logging_format() did the check also for DDLs running
in TOI mode. This caused DROP DATABASE to fail if streaming
replication was enabled.
Added check for THD wsrep execution mode and perform the check
only if the THD is in local processing mode (i.e. not TOI).
Added galera_sr_create_drop test to verify that CREATE/DROP
statements pass even if streaming replication is on.
Condition can be pushed from the HAVING clause into the WHERE clause
if it depends only on the fields that are used in the GROUP BY list
or depends on the fields that are equal to grouping fields.
Aggregate functions can't be pushed down.
How the pushdown is performed on the example:
SELECT t1.a,MAX(t1.b)
FROM t1
GROUP BY t1.a
HAVING (t1.a>2) AND (MAX(c)>12);
=>
SELECT t1.a,MAX(t1.b)
FROM t1
WHERE (t1.a>2)
GROUP BY t1.a
HAVING (MAX(c)>12);
The implementation scheme:
1. Extract the most restrictive condition cond from the HAVING clause of
the select that depends only on the fields that are used in the GROUP BY
list of the select (directly or indirectly through equalities)
2. Save cond as a condition that can be pushed into the WHERE clause
of the select
3. Remove cond from the HAVING clause if it is possible
The optimization is implemented in the function
st_select_lex::pushdown_from_having_into_where().
New test file having_cond_pushdown.test is created.
This task involves the implementation for the optimizer trace.
This feature produces a trace for any SELECT/UPDATE/DELETE/,
which contains information about decisions taken by the optimizer during
the optimization phase (choice of table access method, various costs,
transformations, etc). This feature would help to tell why some decisions were
taken by the optimizer and why some were rejected.
Trace is session-local, controlled by the @@optimizer_trace variable.
To enable optimizer trace we need to write:
set @@optimizer_trace variable= 'enabled=on';
To display the trace one can run:
SELECT trace FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE;
This task also involves:
MDEV-18489: Limit the memory used by the optimizer trace
introduces a switch optimizer_trace_max_mem_size which limits
the memory used by the optimizer trace. This was implemented by
Sergei Petrunia.
ANALYZE and ANALYZE FORMAT=JSON structures are changed in the way that they
show additional information when rowid filter is used:
- r_selectivity_pct - the observed filter selectivity
- r_buffer_size - the size of the rowid filter container buffer
- r_filling_time_ms - how long it took to fill rowid filter container
New class Rowid_filter_tracker was added. This class is needed to collect data
about how rowid filter is executed.