Commit graph

184 commits

Author SHA1 Message Date
Sergey Vojtovich
5876ed9e5b Relay_log_info::executed_entries to Atomic_counter 2020-04-15 18:36:07 +04:00
Sergey Vojtovich
1c8de231a3 dequeued_count my_atomic to Atomic_counter
Also allocate inuse_relaylog with new rather than my_malloc(MY_ZEROFILL).
2020-03-25 23:49:38 +04:00
seppo
421d52e896 MDEV-6860 Parallel async replication hangs (#1400)
Instrumenting parallel slave worker thread with wsrep replication hooks.
Added mtr test for testing parallel slave support.
The test is based on the test attached in MDEV-6860 jira tracker.
2019-10-16 07:51:36 +03:00
Alexander Barkov
dc588e3d3f Merge remote-tracking branch 'origin/10.3' into 10.4 2019-10-01 10:45:52 +04:00
Alexander Barkov
7e44c455f4 Merge remote-tracking branch 'origin/10.2' into 10.3 2019-10-01 09:37:40 +04:00
Alexander Barkov
f203245e9e Merge remote-tracking branch 'origin/10.1' into 10.2 2019-10-01 07:11:54 +04:00
Sujatha
9b80f9300d MDEV-20645: Replication consistency is broken as workers miss the error notification from an earlier failed group.
Analysis:
========
In general if there are three groups.
1 - Inserts 32 which fails due to local entry '32' on slave.
2 - Inserts 33
3 - Inserts 34

Each group considers itself as a waiter and it waits for prior group 'waitee'.
This is done in 'register_wait_for_prior_event_group_commit'. If there is no
other parallel group being scheduled then no waitee will be there.

Let us assume 3 groups are being scheduled in parallel.

3-> waits for 2-> waits for->1

'1' upon completion it checks is there any registered subsequent waiter. If
so it wakes up the subsequent waiter with its execution status. This execution
status is stored in wakeup_error.

If '1' failed then it sends corresponding wakeup_error to 2. Then '2' aborts
and it propagates error to '3'.  So all further commits are aborted.  This
mechanism works only when all transactions reach a stage where they are
waiting for their prior commit to complete.

In case of optimistic following scenario occurs.

1,2,3 are scheduled in parallel.

3 - Reaches group_commit_code waits for 2 to complete.
1 - errors out sets stop_on_error_sub_id=1.

When a group execution results in error its corresponding sub_id is set to
'stop_on_error_sub_id'. Any new groups queued for execution will check if
their sub_id is > stop_on_error_sub_id.  If it is true their execution will be
skipped as prior group execution failed.  'skip_event_group=1' will be set.
Since the execution of SQL thread is about to stop we just skip execution of
all the following event groups.  We still do all the normal waiting and wakeup
processing between the event groups as a simple way to ensure that everything
is stopped and cleaned up correctly.

Upon error '1' transaction checks for registered waiters. Since no one is
there it simply goes away.

2 - Starts the execution. It checks do I have a waitee.

Since wait_commit_sub_id == entry->last_committed_sub_id no waitee is set.

Secondly: 'entry->stop_on_error_sub_id' is set by '1'st execution.  Now
'handle_parallel_thread' code checks if the current group 'sub_id' is greater
than the 'sub_id' set within 'stop_on_error_sub_id'.

Since the above is true 'skip_event_group=true' is set.  Simply call
'wait_for_prior_commit' to wakeup all waiters.  Group '2' didn't had any
waitee and its execution is skipped.  Hence its wakeup_error=0.It sends a
positive wakeup signal to '3'. Which commits. This results in a missed
transaction. i.e 33 is missed and 34 is committed.

Fix:
===
When a worker learns that an earlier transaction execution has failed, and it
should not proceed for further execution, it should mark its own execution
status as failed so that it alerts its followers to abort as well.
2019-09-30 13:22:37 +05:30
Sergey Vojtovich
3503fbbebf Move THD list handling to THD_list
Implemented and integrated THD_list as a replacement for the global
thread list. It uses own mutex instead of LOCK_thread_count for THD
list protection.

Removed unused first_global_thread() and next_global_thread().

delayed_insert_threads is now protected by LOCK_delayed_insert. Although
this patch doesn't fix very wrong synchronization of this variable.

After this patch there are only 2 legitimate uses of LOCK_thread_count
left, both in mysqld.cc: thread_count and ready_to_exit.

Aim is to reduce usage of LOCK_thread_count and COND_thread_count.
Part of MDEV-15135.
2019-01-28 17:39:07 +04:00
Marko Mäkelä
ae9d82c9f8 Merge 10.2 into 10.3 2018-10-11 08:22:08 +03:00
Marko Mäkelä
07815d9555 Merge 10.1 into 10.2 2018-10-11 08:16:08 +03:00
Andrei Elkin
f517d8c742 MDEV-17346 parallel slave start and stop races to workers disappeared
The bug appears as a slave SQL thread hanging in
rpl_parallel_thread_pool::get_thread() while there are no slave worker
threads to awake it.

The reason of the hang is that at the parallel slave worker pool
activation the being stared SQL thread could read the worker pool size
concurrently with pool deactivation. At reading the SQL thread did not
employ necessary protection from a race.

Fixed with making the SQL thread at the pool activation first
to grab the same lock as potential deactivator also does prior
to access the pool size.
2018-10-08 19:46:34 +03:00
Monty
58721c3e38 MDEV-16286 Killed CREATE SEQUENCE leaves sequence in unusable state
Fixed by deleting the sequence if we where not able to initialize it

I also noticed that we didn't always set the error message when
check_killed(), which could lead to aborted queries without error
beeing properly set. Fixed by default setting error message if
check_error() noticed that killed had been called.
This allowed me to remove a lot of calls to thd->send_kill_message().
2018-05-27 19:47:17 +03:00
Thirunarayanan Balathandayuthapani
85cc6b70bd MDEV-13134 Introduce ALTER TABLE attributes ALGORITHM=NOCOPY and ALGORITHM=INSTANT
Introduced new alter algorithm type called NOCOPY & INSTANT for
inplace alter operation.

NOCOPY - Algorithm refuses any alter operation that would
rebuild the clustered index. It is a subset of INPLACE algorithm.

INSTANT - Algorithm allow any alter operation that would
modify only meta data. It is a subset of NOCOPY algorithm.

Introduce new variable called alter_algorithm. The values are
DEFAULT(0), COPY(1), INPLACE(2), NOCOPY(3), INSTANT(4)

Message to deprecate old_alter_table variable and make it alias
for alter_algorithm variable.

alter_algorithm variable for slave is always set to default.
2018-05-07 14:58:11 +05:30
Monty
30ebc3ee9e Add likely/unlikely to speed up execution
Added to:
- if (error)
- Lex
- sql_yacc.yy and sql_yacc_ora.yy
- In header files to alloc() calls
- Added thd argument to thd_net_is_killed()
2018-05-07 00:07:32 +03:00
Sergei Golubchik
b1818dccf7 Merge branch '10.2' into 10.3 2018-03-28 17:31:57 +02:00
Andrei Elkin
30019a48bf MDEV-12746 rpl.rpl_parallel_optimistic_nobinlog fails committing
out of order at retry

The test failures were of two sorts. One is that the number of retries
what the slave thought as a temporary error exceeded
the default value of the slave retry option.
The 2nd issue was an out of order commit by transactions that
were supposed to error out instead.
Both issues are caused by the same reason that the post-temporary-error
retry did not check possibly already existing error status.

This is mended with refining conditions to retry. Specifically, a retrying
worker checks `rpl_parallel_entry::stop_on_error_sub_id` that
a potential failing predecessor could set to its own sub id.
Now should the member be set the retrying follower errors out with
ER_PRIOR_COMMIT_FAILED.
2018-03-13 12:46:07 +02:00
Monty
a7e352b54d Changed database, tablename and alias to be LEX_CSTRING
This was done in, among other things:
- thd->db and thd->db_length
- TABLE_LIST tablename, db, alias and schema_name
- Audit plugin database name
- lex->db
- All db and table names in Alter_table_ctx
- st_select_lex db

Other things:
- Changed a lot of functions to take const LEX_CSTRING* as argument
  for db, table_name and alias. See init_one_table() as an example.
- Changed some function arguments from LEX_CSTRING to const LEX_CSTRING
- Changed some lists from LEX_STRING to LEX_CSTRING
- threads_mysql.result changed because process list_db wasn't always
  correctly updated
- New append_identifier() function that takes LEX_CSTRING* as arguments
- Added new element tmp_buff to Alter_table_ctx to separate temp name
  handling from temporary space
- Ensure we store the length after my_casedn_str() of table/db names
- Removed not used version of rename_table_in_stat_tables()
- Changed Natural_join_column::table_name and db_name() to never return
  NULL (used for print)
- thd->get_db() now returns db as a printable string (thd->db.str or "")
2018-01-30 21:33:55 +02:00
Alexander Barkov
c7a2f23a7b Merge remote-tracking branch 'origin/bb-10.2-ext' into 10.3 2018-01-29 12:44:20 +04:00
Monty
7fc25cfbca Fix for MDEV-12730
Assertion `count > 0' failed in rpl_parallel_thread_pool::
get_thread, rpl.rpl_parallel failed in buildbot

The reason for this is that one thread can call
rpl_parallel_resize_pool_if_no_slaves() while
another thread calls at the same time
rpl_parallel_activate_pool(). If rpl_parallel_active_pool() is
called before rpl_parallel_resize_pool_if_no_slaves() has
finished, pool->count will be set to 0 even if there exists
active slave threads.

Added a mutex lock in rpl_parallel_activate_pool() to protect against this scenario, which seams to fix this issue.
2018-01-24 23:33:21 +02:00
Monty
13770edbcb Changed from using LOCK_log to LOCK_binlog_end_pos for binary log
Part of MDEV-13073 AliSQL Optimize performance of semisync

The idea it to use a dedicated lock detecting if there is new data in
the master's binary log instead of the overused LOCK_log.

Changes:
- Use dedicated COND variables for the relay and binary log signaling.
  This was needed as we where the old 'update_cond' variable was used
  with different mutex's, which could cause deadlocks.
   - Relay log uses now COND_relay_log_updated and LOCK_log
   - Binary log uses now COND_bin_log_updated and LOCK_binlog_end_pos
- Renamed signal_cnt to relay_signal_cnt (as we now have two signals)
- Added some missing error handling in MYSQL_BIN_LOG::new_file_impl()
- Reformatted some comments with old style
- Renamed m_key_LOCK_binlog_end_pos to key_LOCK_binlog_end_pos
- Changed 'signal_update()' to update_binlog_end_pos() which works for
  both relay and binary log
2017-12-18 13:43:37 +02:00
Monty
ea37c129f9 Removed not used lock argument from read_log_event 2017-12-18 13:43:36 +02:00
Marko Mäkelä
2c1067166d Merge bb-10.2-ext into 10.3 2017-10-04 08:24:06 +03:00
Vladislav Vaintroub
7354dc6773 MDEV-13384 - misc Windows warnings fixed 2017-09-28 17:20:46 +00:00
Sergei Golubchik
bb8e99fdc3 Merge branch 'bb-10.2-ext' into 10.3 2017-08-26 00:34:43 +02:00
Monty
21518ab2e4 New option for slow logging (log_slow_disable_statements)
This fixes MDEV-7742 and MDEV-8305 (Allow user to specify if stored
procedures should be logged in the slow and general log)

New functionality:
- Added new variables log_slow_disable_statements and log_disable_statements
  that can be used to disable logging of certain queries to slow and
  general log. Currently supported options are 'admin', 'call', 'slave'
  and 'sp'.
  Defaults are as before. Only 'sp' (stored procedure statements) is
  disabled for  slow and general_log.
- Slow log to files now includes the following new information:
  - When logging stored procedure statements the name of stored
    procedure is logged.
  - Number of created tmp_tables, tmp_disk_tables and the space used
    by temporary tables.
- When logging 'call', the logged status now contains the sum of all
  included statements.  Before only 'time' was correct.
- Added filsort_priority_queue as an option for log_slow_filter (this
  variable existed before, but was not exposed)
- Added support for BIT types in my_getopt()

Mapped some old variables to bitmaps (old variables can still be used)
- Variable 'log_queries_not_using_indexes' is mapped to
  log_slow_filter='not_using_index'
- Variable 'log_slow_slave_statements' is mapped to
  log_slow_disabled_statements='slave'
- Variable 'log_slow_admin_statements' is mapped to
  log_slow_disabled_statements='admin'
- All the above variables are changed to session variables from global
  variables

Other things:
- Simplified LOGGER::log_command. We don't need to check for super if
  OPTION_LOG_OFF is set as this flag can only be set if one is a super
  user.
- Removed some setting of enable_slow_log as it's guaranteed to be set by
  mysql_parse()
- mysql_admin_table() now sets thd->enable_slow_log
- Added prepare_logs_for_admin_command() to reset thd->enable_slow_log if
  needed.
- Added new functions to store, restore and add slow query status
- Added new functions to store and restore query start time
- Reorganized Sub_statement_state according to types
- Added code in dispatch_command() to ensure that
  thd->reset_for_next_command() is always called for a query.
- Added thd->last_sql_command to simplify checking of what was the type
  of the last command. Needed when logging to slow log as lex->sql_command
  may have changed before slow logging is called.
- Moved QPLAN_TMP_... to where status for tmp tables are updated
- Added new THD variable, affected_rows, to be able to correctly log
  number of affected rows to slow log.
2017-08-24 01:05:51 +02:00
Michael Widenius
f71bed08ca Safety fix: lock binlog_end_pos before calling signal_update
The mutex is needed to ensure that sql thread should not not miss the error
signal.
2017-08-24 01:05:50 +02:00
Michael Widenius
4aaa38d26e Enusure that my_global.h is included first
- Added sql/mariadb.h file that should be included first by files in sql
  directory, if sql_plugin.h is not used (sql_plugin.h adds SHOW variables
  that must be done before my_global.h is included)
- Removed a lot of include my_global.h from include files
- Removed include's of some files that my_global.h automatically includes
- Removed duplicated include's of my_sys.h
- Replaced include my_config.h with my_global.h
2017-08-24 01:05:44 +02:00
Sergei Golubchik
cb1e76e4de Merge branch '10.1' into 10.2 2017-08-17 11:38:34 +02:00
Monty
74543698a7 MDEV-13179 main.errors fails with wrong errno
The problem was that the introduction of max-thread-mem-used can cause
an allocation error very early, even before mysql_parse() is called.
As mysql_parse() calls thd->reset_for_next_command(), which called
clear_error(), the error number was lost.

Fixed by adding an option to have unique messages for each KILL
signal and change max-thread-mem-used to use this new feature.
This removes a lot of problems with the original approach, where
one could get errors signaled silenty almost any time.

ixed by moving clear_error() from reset_for_next_command() to
do_command(), before any memory allocation for the thread.

Related changes:
- reset_for_next_command() now have an optional parameter if we should
  call clear_error() or not. By default it's called, but not anymore from
  dispatch_command() which was the original problem.
- Added optional paramater to clear_error() to force calling of
  reset_diagnostics_area(). Before clear_error() only called
  reset_diagnostics_area() if there was no error, so we normally
  called reset_diagnostics_area() twice.
- This change removed several duplicated calls to clear_error()
  when starting a query.
- Reset max_mem_used on COM_QUIT, to protect against kill during
  quit.
- Use fatal_error() instead of setting is_fatal_error (cleanup)
- Set fatal_error if max_thead_mem_used is signaled.
  (Same logic we use for other places where we are out of resources)
2017-08-07 03:48:58 +03:00
Kristian Nielsen
1d91910b94 MDEV-12179: Per-engine mysql.gtid_slave_pos table
Merge into MariaDB 10.3.
2017-07-03 09:33:41 +02:00
Marko Mäkelä
f740d23ce6 Merge 10.1 into 10.2 2017-04-28 12:22:32 +03:00
Kristian Nielsen
89aad233de MDEV-12179: Per-engine mysql.gtid_slave_pos table
Intermediate commit.

Move the discovery of mysql.gtid_slave_pos* tables into the SQL thread.

This avoids doing things like opening tables and scanning the mysql
schema for tables inside of the START SLAVE statement, which might
interact badly with existing transaction or table locks.

(Even though START SLAVE is documented to implicitly commit any active
transactions, this appears not to be the case in current code).

Table discovery fits naturally in the SQL thread init code, next to
the loading of mysql.gtid_slave_pos state.
2017-04-23 10:49:58 +02:00
Marko Mäkelä
8c38147cdd Merge 10.0 into 10.1 2017-04-21 12:46:12 +03:00
Kristian Nielsen
88613e1df6 MDEV-11201: gtid_ignore_duplicates incorrectly ignores statements when GTID replication is not enabled
When master_use_gtid=no, the IO thread loads the slave GTID state from
the master during connect. This races with the SQL thread when
gtid_ignore_duplicates=1. If an event is in the relay log from before
the new connect and has not been applied yet, moving the slave
position causes the SQL thread to think that event should be skipped
due to gtid_ignore_duplicates=1.

This patch simply disables gtid_ignore_duplicates when not using GTID,
which seems to be what one would expect.
2017-04-10 07:53:27 +02:00
Sergei Golubchik
da4d71d10d Merge branch '10.1' into 10.2 2017-03-30 12:48:42 +02:00
Sergei Golubchik
09a2107b1b Merge branch '10.0' into 10.1 2017-03-21 19:20:44 +01:00
Monty
e7f55fde88 Removed wrong assert
The following is an updated commit message for the following commit
that was pushed before I had a chance to update the commit message:
c5e25c8b40

Fixed dead locks when doing stop slave while slave was starting.

- Added a separate lock for protecting start/stop/reset of a specific slave.
  This solves some possible dead locks when one calls stop slave while
  the slave is starting as the old run_locks was over used for other things.
- Set hash->records to 0 before calling free of all hash elements.
  This was set to stop concurrent threads to loop over hash elements and
  access members that was already freed.
  This was a problem especially in start_all_slaves/stop_all_slaves
  as the mutex protecting the hash was temporarily released while a slave
  was started/stopped.
- Because of change to hash->records during hash_reset(),
  any_slave_sql_running() will return 1 during shutdown as one can't
  loop over master_info_index->master_info_hash while hash_reset() of it
  is in progress.
  This also fixes a potential old bug in any_slave_sql_running() where
  during shutdown and ~Master_info_index(), my_hash_free() we could
  potentially try to access elements that was already freed.
2017-03-16 14:21:32 +02:00
Marko Mäkelä
adc91387e3 Merge 10.0 into 10.1 2017-03-03 13:27:12 +02:00
Monty
84ed5e1d5f Fixed hang doing FLUSH TABLES WITH READ LOCK and parallel replication
The problem was that waiting for pause_for_ftwrl was done before
event_group was completed. This caused rpl_pause_for_ftwrl() to wait
forever during FLUSH TABLES WITH READ LOCK.
Now we only wait for FLUSH TABLES WITH READ LOCK when we are changing
to a new event group.
2017-02-28 16:10:47 +01:00
Monty
c5e25c8b40 Added a separate lock for start/stop/reset slave.
This solves some possible dead locks when one calls stop slave while slave
is starting.
2017-02-28 16:10:46 +01:00
Monty
e65f667bb6 MDEV-9573 'Stop slave' hangs on replication slave
The reason for this is that stop slave takes LOCK_active_mi over the
whole operation while some slave operations will also need LOCK_active_mi
which causes deadlocks.

Fixed by introducing object counting for Master_info and not taking
LOCK_active_mi over stop slave or even stop_all_slaves()

Another benefit of this approach is that it allows:
- Multiple threads can run SHOW SLAVE STATUS at the same time
- START/STOP/RESET/SLAVE STATUS on a slave will not block other slaves
- Simpler interface for handling get_master_info()
- Added some missing unlock of 'log_lock' in error condtions
- Moved rpl_parallel_inactivate_pool(&global_rpl_thread_pool) to end
  of stop_slave() to not have to use LOCK_active_mi inside
  terminate_slave_threads()
- Changed argument for remove_master_info() to Master_info, as we always
  have this available
- Fixed core dump when doing FLUSH TABLES WITH READ LOCK and parallel
  replication. Problem was that waiting for pause_for_ftwrl was not done
  when deleting rpt->current_owner after a force_abort.
2017-02-28 16:10:46 +01:00
Sergei Golubchik
4a5d25c338 Merge branch '10.1' into 10.2 2016-12-29 13:23:18 +01:00
Sergei Golubchik
2f20d297f8 Merge branch '10.0' into 10.1 2016-12-11 09:53:42 +01:00
Kristian Nielsen
390f2a013b Fix incorrect reading of events from relaylog in parallel replication.
The SQL thread keeps track of the position in the current relay log from
which to read the next event. This position is not normally used, but a
certain interaction with the IO thread can cause the SQL thread to re-open
the relay log and seek to the stored position.

In parallel replication, there were a couple of places where the position
was not updated. This created a race where a re-open of the relay log could
seek to the wrong position and start re-reading and processing events
already handled once, causing various kinds of problems.

Fix this by moving the position update into a single place in
apply_event_and_update_pos(), which should ensure that the position is
always updated in the parallel replication case.

This problem was found from the testcase of MDEV-10863, but it is logically
a separate problem.
2016-11-16 11:00:38 +01:00
Kristian Nielsen
c06bc66816 MDEV-11065: Compressed binary log
Minor review comments/changes:

 - A bunch of style-fixes.

 - Change macros to static inline functions.

 - Update check_event_type() with compressed event types.

 - Small .result file update.
2016-10-20 18:00:59 +02:00
Kristian Nielsen
e1ef99c3dc MDEV-7145: Delayed replication
Merge feature into 10.2 from feature branch.

Delayed replication adds an option

  CHANGE MASTER TO master_delay=<seconds>

Replication will then delay applying events with that many
seconds. This creates a replication slave that reflects the state of
the master some time in the past.

Feature is ported from MySQL source tree.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2016-10-16 23:44:44 +02:00
Kristian Nielsen
3011060b2a MDEV-7145: Delayed slave.
Extend to work also for parallel replication.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2016-10-14 23:15:59 +02:00
Kristian Nielsen
50f19ca809 Remove unnecessary global mutex in parallel replication.
The function apply_event_and_update_pos() is called with the
rli->data_lock mutex held. However, there seems to be nothing in the
function actually needing the mutex to be held. Certainly not in the
parallel replication case, where sql_slave_skip_counter is always 0
since the non-zero case is handled by the SQL driver thread.

So this patch makes parallel replication use a variant of
apply_event_and_update_pos() without the need to take the
rli->data_lock mutex. This avoids one contended global mutex for each
event executed, which might improve performance on CPU-bound workloads
somewhat.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2016-10-14 22:44:40 +02:00
Kristian Nielsen
ec47beaba6 Merge parallel replication async deadlock kill into 10.2.
Conflicts:
	sql/mysqld.cc
	sql/slave.cc
2016-09-09 12:15:53 +02:00
Kristian Nielsen
7e0c9de864 Parallel replication async deadlock kill
When a deadlock kill is detected inside the storage engine, the kill
is not done immediately, to avoid calling back into the storage engine
kill_query method with various lock subsystem mutexes held. Instead the
kill is queued and done later by a slave background thread.

This patch in preparation for fixing TokuDB optimistic parallel
replication, as well as for removing locking hacks in InnoDB/XtraDB in
10.2.

Signed-off-by: Kristian Nielsen <knielsen at knielsen-hq.org>
2016-09-08 15:25:40 +02:00