Commit graph

3427 commits

Author SHA1 Message Date
Teemu Ollakka
1ef50a34ec 10.4 wsrep group commit fixes (#1224)
* MDEV-16509 Improve wsrep commit performance with binlog disabled

Release commit order critical section early after trx_commit_low() if
binlog is not transaction coordinator. In order to avoid two phase commit,
binlog_hton is not registered for THD during IO_CACHE population.

Implemented a test which verifies that the transactions release
commit order early.

This optimization will change behavior during recovery as the commit
is not two phase when binlog is off. Fixed and recorded wsrep-recover-v25
and wsrep-recover to match the behavior.

* MDEV-18730 Ordering for wsrep binlog group commit

Previously out of order execution was allowed for wsrep commits.
Established proper ordering by populating wait_for_commit
for every wsrep THD and making group commit leader to wait for
prior commits before proceeding to trx_group_commit_leader().

* MDEV-18730 Added a test case to verify correct commit ordering

* MDEV-16509, MDEV-18730 Review fixes

Use WSREP_EMULATE_BINLOG() macro to decide if the binlog_hton
should be registered. Whitespace/syntax fixes and cleanups.

* MDEV-16509 Require binlog for galera_var_innodb_disallow_writes test

If the commit to InnoDB is done in one phase, the native InnoDB behavior
is that the transaction is committed in memory before it is persisted to
disk. This means that the innodb_disallow_writes=ON may not prevent
transaction to become visible to other readers before commit is completely
over. On the other hand, if the commit is two phase (as it is with binlog),
the transaction will be blocked in prepare phase.

Fixed the test to use binlog, which enforces two phase commit, which
in turn makes commit to block before the changes become visible to
other connections. This guarantees that the test produces expected
result.
2019-03-15 07:09:13 +02:00
Sergey Vojtovich
3568427d11 MDEV-18450 Slaves wait shutdown
The patches features an optional shutdown behavior to hold on until
after all connected slaves have been sent the last binlogged event.
The connected slave is one whose START SLAVE has been acknowledged and
that was not stopped since that though it could be technically
reconnecting in background.

The solution therefore disallows killing the dump thread until is has
found EOF of the latest binlog file.  It is up to the shutdown
requester (DBA) to set up a sufficiently large shutdown timeout value
for shudown to wait patiently until lagging behind slaves have been
synchronized. On the other hand if a specific slave needs exclusion
from synchronization the DBA would have to stop it manually which
would terminate its dump thread.

`mysqladmin shutdown' is extended with a `--wait_for_all_slaves' option
which translates to `SHUTDOW WAIT FOR ALL SLAVES' sql query
to enable the feature on the client side.

The patch also performs a small refactoring of the server shutdown
around close_connections() to introduce kill thread phases which
are two as of current.
2019-03-12 17:34:48 +02:00
Sergey Vojtovich
2b711d231a Adieu slave_list
slave_list was used to provide data for SHOW SLAVE HOSTS and
Slaves_connected status variable.

Introduced binlog_dump_thread_count which is exposed via Slaves_connected
(replaces slave_list.records).

Store Slave_info on THD and access it by iterating server_threads
(replaces slave_list).

Added:
THD::slave_info
binlog_dump_thread_count
show_slave_hosts_callback()

Removed:
slave_list
SLAVE_LIST_CHUNK
SLAVE_ERRMSG_SIZE
slave_list_key()
slave_info_free()
init_slave_list()
end_slave_list()
all_slave_list_mutexes
init_all_slave_list_mutexes()
key_LOCK_slave_list
LOCK_slave_list

Moved:
SLAVE_INFO -> Slave_info
register_slave() -> THD::register_slave()
unregister_slave() -> THD::unregister_slave()

Also removed redundant end_slave() from close_connections(): it is called
again soon afterwards by clean_up().

Pre-requisite for clean MDEV-18450 solution.
2019-03-06 17:06:09 +04:00
Sergei Golubchik
132216faf7 don't invoke error interceptors for fatal errors
and, again, *don't use thd->clear_error()*

this fixed main.sp_notembedded failure on various amd64 platforms
(where ER_STACK_OVERRUN_NEED_MORE happens to fire in open_stat_tables()
under Dummy_error_handler)
2019-02-21 15:04:03 +01:00
Robert Bindar
90ad4dbd17 MDEV-7597 Expiration of user passwords
This patch adds support for expiring user passwords.
The following statements are extended:
  CREATE USER user@localhost PASSWORD EXPIRE [option]
  ALTER USER user@localhost PASSWORD EXPIRE [option]
If no option is specified, the password is expired with immediate
effect. If option is DEFAULT, global policy applies according to
the default_password_lifetime system var (if 0, password never
expires, if N, password expires every N days). If option is NEVER,
the password never expires and if option is INTERVAL N DAY, the
password expires every N days.
The feature also supports the disconnect_on_expired_password system
var and the --connect-expired-password client option.

Closes #1166
2019-02-21 15:04:03 +01:00
Oleksandr Byelkin
93ac7ae70f Merge branch '10.3' into 10.4 2019-02-21 14:40:52 +01:00
Galina Shalygina
7fe1ca7ed6 Merge branch '10.4' into bb-10.4-mdev7486 2019-02-19 11:00:39 +03:00
Teemu Ollakka
4baab8697a MDEV-18587 Don't reject DDLs if streaming replication is on
The check for streaming replication logging format in
THD::decide_logging_format() did the check also for DDLs running
in TOI mode. This caused DROP DATABASE to fail if streaming
replication was enabled.

Added check for THD wsrep execution mode and perform the check
only if the THD is in local processing mode (i.e. not TOI).

Added galera_sr_create_drop test to verify that CREATE/DROP
statements pass even if streaming replication is on.
2019-02-19 07:58:03 +02:00
Varun Gupta
9cb55143ac Minor cleanup in the optimizer trace code.
More test coverage added for the optimizer trace.
2019-02-18 17:11:20 +05:30
Galina Shalygina
7a77b221f1 MDEV-7486: Condition pushdown from HAVING into WHERE
Condition can be pushed from the HAVING clause into the WHERE clause
if it depends only on the fields that are used in the GROUP BY list
or depends on the fields that are equal to grouping fields.
Aggregate functions can't be pushed down.

How the pushdown is performed on the example:

SELECT t1.a,MAX(t1.b)
FROM t1
GROUP BY t1.a
HAVING (t1.a>2) AND (MAX(c)>12);

=>

SELECT t1.a,MAX(t1.b)
FROM t1
WHERE (t1.a>2)
GROUP BY t1.a
HAVING (MAX(c)>12);

The implementation scheme:

1. Extract the most restrictive condition cond from the HAVING clause of
   the select that depends only on the fields that are used in the GROUP BY
   list of the select (directly or indirectly through equalities)
2. Save cond as a condition that can be pushed into the WHERE clause
   of the select
3. Remove cond from the HAVING clause if it is possible

The optimization is implemented in the function
st_select_lex::pushdown_from_having_into_where().

New test file having_cond_pushdown.test is created.
2019-02-17 23:38:44 -08:00
Igor Babaev
98d55b1366 Merge branch '10.4' into bb-10.4-mdev16188 2019-02-14 22:07:33 -08:00
Varun Gupta
be8709eb7b MDEV-6111 Optimizer Trace
This task involves the implementation for the optimizer trace.

This feature produces a trace for any SELECT/UPDATE/DELETE/,
which contains information about decisions taken by the optimizer during
the optimization phase (choice of table access method, various costs,
transformations, etc). This feature would help to tell why some decisions were
taken by the optimizer and why some were rejected.

Trace is session-local, controlled by the @@optimizer_trace variable.
To enable optimizer trace we need to write:
   set @@optimizer_trace variable= 'enabled=on';

To display the trace one can run:
   SELECT trace FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE;

This task also involves:
    MDEV-18489: Limit the memory used by the optimizer trace
    introduces a switch optimizer_trace_max_mem_size which limits
    the memory used by the optimizer trace. This was implemented by
    Sergei Petrunia.
2019-02-13 11:52:36 +05:30
Oleksandr Byelkin
65c5ef9b49 dirty merge 2019-02-07 13:59:31 +01:00
Galina Shalygina
447e0f023f MDEV-18144: ANALYZE for statement support for PK filters
ANALYZE and ANALYZE FORMAT=JSON structures are changed in the way that they
show additional information when rowid filter is used:

- r_selectivity_pct - the observed filter selectivity
- r_buffer_size - the size of the rowid filter container buffer
- r_filling_time_ms - how long it took to fill rowid filter container

New class Rowid_filter_tracker was added. This class is needed to collect data
about how rowid filter is executed.
2019-02-06 23:40:07 +03:00
Marko Mäkelä
e80bcd7f64 Merge 10.3 into 10.4 2019-02-05 12:48:02 +02:00
Sergei Golubchik
5b15cc613e MDEV-11340 Allow multiple alternative authentication methods for the same user
introduce the syntax

... IDENTIFIED { WITH | VIA }
      plugin [ { USING | AS } auth ]
 [ OR plugin [ { USING | AS } auth ]
 [ OR ... ]]

Server will try auth plugins in the specified order until the first
success. No protocol changes, server uses the existing "switch plugin"
packet.

The auth chain is stored in json as

  "auth_or":[{"plugin":"xxx","authentication_string":"yyy"},
             {},
             {"plugin":"foo","authentication_string":"bar"},
            ...],
  "plugin":"aaa", "authentication_string":"bbb"

Note:
* "auth_or" implies that there might be "auth_and" someday;
* one entry in the array is an empty object, meaning to take plugin/auth
  from the main json object. This preserves compatibility with
  the existing mysql.global_priv table and with the mysql.user view.
  This entry is preferrably a mysql_native_password plugin for a
  non-empty mysql.user.password column.

SET PASSWORD is supported and changes the password for the *first*
plugin in the chain that has a notion of a "password"
2019-02-04 16:06:57 +01:00
Igor Babaev
37deed3f37 Merge branch '10.4' into bb-10.4-mdev16188 2019-02-03 18:41:18 -08:00
Igor Babaev
658128af43 MDEV-16188 Use in-memory PK filters built from range index scans
This patch contains a full implementation of the optimization
that allows to use in-memory rowid / primary filters built for range  
conditions over indexes. In many cases usage of such filters reduce  
the number of disk seeks spent for fetching table rows.

In this implementation the choice of what possible filter to be applied  
(if any) is made purely on cost-based considerations.

This implementation re-achitectured the partial implementation of
the feature pushed by Galina Shalygina in the commit
8d5a11122c.

Besides this patch contains a better implementation of the generic  
handler function handler::multi_range_read_info_const() that
takes into account gaps between ranges when calculating the cost of
range index scans. It also contains some corrections of the
implementation of the handler function records_in_range() for MyISAM.

This patch supports the feature for InnoDB and MyISAM.
2019-02-03 14:56:12 -08:00
Vladislav Vaintroub
261ce5286f MDEV-18281 COM_RESET_CONNECTION changes the connection encoding
Store original charset during client authentication, and restore it for
COM_RESET_CONNECTION
2019-02-02 17:32:15 +01:00
Vladislav Vaintroub
e214aa1cd3 MDEV-18281 COM_RESET_CONNECTION changes the connection encoding
Store original charset during client authentication, and restore it for
COM_RESET_CONNECTION
2019-02-02 17:29:33 +01:00
Marko Mäkelä
081fd8bfa2 Merge 10.1 into 10.2 2019-02-02 11:40:02 +02:00
Sergey Vojtovich
4b3656a44d Avoid taking LOCK_thread_count for thread_count protection
Replaced wait on COND_thread_count with busy waiting with 1 millisecond
sleep.

Aim is to reduce usage of LOCK_thread_count and COND_thread_count.
2019-01-29 11:56:35 +04:00
Sergey Vojtovich
9824ec81aa Removed redundant service_thread_count
In contrast to thread_count, which is decremented by THD destructor,
this one was most probably intended to be decremented after all THD
destructors are done.

THD_count class was added to achieve similar effect with thread_count.

Aim is to reduce usage of LOCK_thread_count and COND_thread_count.
Part of MDEV-15135.
2019-01-28 17:39:08 +04:00
Sergey Vojtovich
3503fbbebf Move THD list handling to THD_list
Implemented and integrated THD_list as a replacement for the global
thread list. It uses own mutex instead of LOCK_thread_count for THD
list protection.

Removed unused first_global_thread() and next_global_thread().

delayed_insert_threads is now protected by LOCK_delayed_insert. Although
this patch doesn't fix very wrong synchronization of this variable.

After this patch there are only 2 legitimate uses of LOCK_thread_count
left, both in mysqld.cc: thread_count and ready_to_exit.

Aim is to reduce usage of LOCK_thread_count and COND_thread_count.
Part of MDEV-15135.
2019-01-28 17:39:07 +04:00
Andrei Elkin
f9ac7032cb MDEV-14605 Changes to "ON UPDATE CURRENT_TIMESTAMP" fields are not
always logged properly with binlog_row_image=MINIMAL

There are two issues fixed in this commit.
The first is an observation of a multi-table UPDATE binlogged
in row-format in binlog_row_image=MINIMAL mode. While the UPDATE aims
at a table with an ON-UPDATE attribute its binlog after-image misses
to record also installed default value.
The reason for that turns out missed marking of default-capable fields
in TABLE::write_set.

This is fixed to mark such fields similarly to 10.2's MDEV-10134 patch (db7edfed17)
that introduced it. The marking follows up 93d1e5ce0b841bed's idea
to exploit TABLE:rpl_write_set introduced there though,
and thus does not mess (in 10.1) with the actual MDEV-10134 agenda.
The patch makes formerly arg-less TABLE::mark_default_fields_for_write()
to accept an argument which would be TABLE:rpl_write_set.

The 2nd issue is extra columns in in binlog_row_image=MINIMAL before-image
while merely a packed primary key is enough. The test main.mysqlbinlog_row_minimal
always had a wrong result recorded.
This is fixed to invoke a function that intended for read_set
possible filtering and which is called (supposed to) in all type of MDL, UPDATE
including; the test results have gotten corrected.

At *merging* from 10.1->10.2 the 1st "main" part of the patch is unnecessary
since the bug is not observed in 10.2, so only hunks from

  sql/sql_class.cc

are required.
2019-01-24 20:07:53 +02:00
Brave Galera Crew
36a2a185fe Galera4 2019-01-23 15:30:00 +04:00
Marko Mäkelä
9dc81d2a7a Merge 10.3 into 10.4 2019-01-17 13:21:27 +02:00
Marko Mäkelä
77cbaa96ad Merge 10.2 into 10.3 2019-01-17 12:38:46 +02:00
Marko Mäkelä
8e80fd6bfd Merge 10.1 into 10.2 2019-01-17 11:24:38 +02:00
Vladislav Vaintroub
7d3161def8 MDEV-18225 Avoid use of LOCK_prepared_stmt_count mutex in Statement_map destructo
This mutex can be freed when server shuts down (when thread_count goes down to 0)
, but it is still used inside THD::~THD() when Statement_map is destroyed.

The fix is to call Statement_map::reset() at the point where thread_count
is still positive, and avoid locking LOCK_prepared_stmt_count in THD
destructor.
2019-01-15 10:28:00 +01:00
Michael Widenius
aad0165cea Added support for BACKUP LOCK / BACKUP UNLOCK 2019-01-14 16:18:50 +02:00
Sergey Vojtovich
1e7df0e530 XID_cache_element::m_state transition to std::atomic 2018-12-29 14:09:33 +04:00
Sergey Vojtovich
d2bdd78915 Master_info counters transition to Atomic_counter 2018-12-29 14:09:15 +04:00
Marko Mäkelä
b5763ecd01 Merge 10.3 into 10.4 2018-12-18 11:33:53 +02:00
Marko Mäkelä
45531949ae Merge 10.2 into 10.3 2018-12-18 09:15:41 +02:00
Alexey Botchkov
c4ab352b67 MDEV-14576 Include full name of object in message about incorrect value for column.
The error message modified.
Then the TABLE_SHARE::error_table_name() implementation taken from 10.3,
          to be used as a name of the table in this message.
2018-12-16 02:21:41 +04:00
Marko Mäkelä
fd37344feb Merge 10.3 into 10.4 2018-12-14 16:20:17 +02:00
Marko Mäkelä
5fefcb0a21 Merge 10.2 into 10.3 2018-12-14 16:15:59 +02:00
Marko Mäkelä
a2f2f686cb Work around the crash in MDEV-17814
Internal transactions may not have trx->mysql_thd.
But at the same time, trx->duplicates should only hold if
REPLACE or INSERT...ON DUPLICATE KEY UPDATE was executed from SQL.

The flag feels misplaced. A more appropriate place for it would
be row_prebuilt_t or similar.
2018-12-14 15:50:01 +02:00
Monty
c53aab974b Added syntax and implementation for BACKUP STAGE's
Part of MDEV-5336 Implement LOCK FOR BACKUP

- Changed check of Global_only_lock to also include BACKUP lock.
- We store latest MDL_BACKUP_DDL lock in thd->mdl_backup_ticket to be able
  to downgrade lock during copy_data_between_tables()
2018-12-09 22:12:27 +02:00
Igor Babaev
5f46670bd0 Merge branch '10.4' into 10.4-mdev16188 2018-11-10 14:52:57 -08:00
Marko Mäkelä
074c684099 Merge 10.3 into 10.4 2018-11-06 16:24:16 +02:00
Marko Mäkelä
df563e0c03 Merge 10.2 into 10.3
main.derived_cond_pushdown: Move all 10.3 tests to the end,
trim trailing white space, and add an "End of 10.3 tests" marker.
Add --sorted_result to tests where the ordering is not deterministic.

main.win_percentile: Add --sorted_result to tests where the
ordering is no longer deterministic.
2018-11-06 09:40:39 +02:00
Marko Mäkelä
8a346f31b9 MDEV-17073 INSERT…ON DUPLICATE KEY UPDATE became more deadlock-prone
thd_rpl_stmt_based(): A new predicate to check if statement-based
replication is active. (This can also hold when replication is not
in use, but binlog is.)

que_thr_stop(), row_ins_duplicate_error_in_clust(),
row_ins_sec_index_entry_low(), row_ins(): On a duplicate key error,
only lock all index records when statement-based replication is in use.
2018-11-02 18:52:39 +02:00
Marko Mäkelä
2a955c7a83 Merge 10.3 into 10.4 2018-10-10 10:36:51 +03:00
Marko Mäkelä
43ee6915fa Merge 10.2 into 10.3 2018-10-09 09:11:30 +03:00
Vladislav Vaintroub
7fefd53f94 MDEV-14581 Server does not clear diagnostics between sessions
Amend previous patch, so it works in all cases (also for "change user"
command, and for RESET CONNECTION in 10.3)
2018-10-05 16:48:51 +01:00
Galina Shalygina
8d5a11122c MDEV-16188: Use in-memory PK filters built from range index scans
First phase: make optimizer choose to use filter and show it in EXPLAIN.
2018-09-28 23:50:22 +03:00
Alexander Barkov
ad8e02ac45 MDEV-17317 Add THD* parameter into Item::get_date() and stricter data type control to "fuzzydate" 2018-09-28 14:01:17 +04:00
Oleksandr Byelkin
31081593aa Merge branch '11.0' into 10.1 2018-09-06 22:45:19 +02:00