mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-31 19:11:46 +01:00

Author	SHA1	Message	Date
Sergei Golubchik	810cf362ea	Merge branch '5.5' into 10.0	2015-06-11 20:20:35 +02:00
Alexander Barkov	92b365981b	MDEV-7268 Column of table cannot be converted from type 'decimal(0,?)' to type ' 'decimal(10,7)' Changing the error message to: "...from type 'decimal(0,?)/old/' to type ' 'decimal(10,7)'..." So it's now clear that the master data type is OLD decimal.	2015-06-09 12:05:06 +04:00
Sergei Golubchik	f84f577aa1	Merge tag 'mysql-5.5.44' into bb-5.5-serg	2015-06-05 02:06:51 +02:00
Kristian Nielsen	f7385980d3	Merge MDEV-8147 into 10.0	2015-05-26 13:15:57 +02:00
Kristian Nielsen	e5f1e841dc	MDEV-8147: Assertion `m_lock_type == 2' failed in handler::ha_close() during parallel replication When the slave processes the master restart format_description event, parallel replication needs to complete any prior events before processing the restart event (which closes temporary tables and such stuff). This happens in wait_for_workers_idle(), however it was not waiting long enough. The wait was using wait_for_prior_commit(), but at that point tables can still be open. This led to the assertion in this case. So change wait_for_workers_idle() to wait until all worker threads have reached finish_event_group(), at which point all tables should have been closed.	2015-05-26 13:04:15 +02:00
Elena Stepanova	0b4231e9f1	MDEV-8154 rpl.show_status_stop_slave_race-7126 sporadically causes internal check failure The test did not have a proper replication cleanup	2015-05-13 15:17:19 +03:00
Sergei Golubchik	49c853fb94	Merge branch '5.5' into 10.0	2015-05-04 22:00:24 +02:00
Sergei Golubchik	f875c9f2a0	MDEV-5114 seconds_behind_master flips to 0 & spikes back, when running show slaves status 1. After a period of wait (where last_master_timestamp=0) do NOT restore the last_master_timestamp to the timestamp of the last executed event (which would mean we've just executed it, and we're that much behind the master). 2. Update last_master_timestamp before executing the event, not after. Take the approach from this commit (but with a different test case that actually makes sense): commit 0c75ab453fb8c5439576af8fe5add7a1b89f1569 Author: Luis Soares <luis.soares@sun.com> Date: Thu Apr 15 17:39:31 2010 +0100 BUG#52166: Seconds_Behind_Master spikes after long idle period	2015-05-03 11:21:55 +02:00
Kristian Nielsen	ed701c6a23	MDEV-7864: Slave SQL: stopping on non-last RBR event with annotations results in SEGV (signal 11) The slave SQL thread was clearing serial_rgi->thd before deleting serial_rgi, which could cause access to NULL THD. The clearing was introduced in commit `2e100cc5a4` and is just plain wrong. So revert that part (single line) of that commit. Thanks to Daniel Black for bug analysis and test case.	2015-04-28 11:56:54 +02:00
Sergei Golubchik	0f12ada6b6	Merge remote-tracking branch 'mysql/5.5' into 5.5	2015-04-27 21:04:06 +02:00
Sergei Golubchik	f8320210e7	MDEV-7126 replication slave - deadlock in terminate_slave_thread with stop slave and show variables of replication filters and show global status Three-way deadlock: T1: SHOW GLOBAL STATUS -> acquire LOCK_status T2: STOP SLAVE -> acquire LOCK_active_mi -> terminate_slave_thread() -> -> cond_timedwait for handle_slave_sql to stop T3: sql slave thread (same applies to io thread) -> handle_slave_sql(), when exiting -> -> THD::add_status_to_global() -> -> -> wait for LOCK_status... T1: SHOW GLOBAL STATUS -> for "Slave_heartbeat_period" status variable -> -> show_heartbeat_period() -> -> -> wait for LOCK_active_mi cherry-pick from 5.6: commit fc8b395898f40387b3468122bd0dae31e29a6fde Author: Venkatesh Duggirala <venkatesh.duggirala@oracle.com> Date: Wed Jun 12 21:41:05 2013 +0530 BUG#16904035-SHOW STATUS - EXCESSIVE LOCKING ON LOCK_ACTIVE_MI AND ACTIVE_MI->RLI->DATA_LOCK Problem: Excessive locking on lock_active_mi and rli->data_lock while executing any `show status like 'X'` command. Analysis: SHOW_FUNCs for Slave_running, Slave_retried_transactions, Slave_heartbeat_period, Slave_received_heartbeats, Slave_last_heartbeat are acquiring lock_active_mi and rli->data_lock to show their variable value. It is ok to show stale data while showing the status variables i.e., even if they miss one update, it will not cause any great trouble. Fix: Remove the locks from the above mentioned SHOW_FUNC functions. Add a test case	2015-04-26 22:05:33 +02:00
f4rnham	060ec5b6b9	MDEV-7130: MASTER_POS_WAIT(log_name,log_pos,timeout,"connection_name") hangs, does not respect the timeout Changed also arg_count check for connection_name to prevent same bug if fifth argument is introduced in future	2015-04-24 13:08:27 +02:00
Kristian Nielsen	b616991a68	MDEV-8031: Parallel replication stops on "connection killed" error (probably incorrectly handled deadlock kill) There was a rare race, where a deadlock error might not be correctly handled, causing the slave to stop with something like this in the error log: 150423 14:04:10 [ERROR] Slave SQL: Connection was killed, Gtid 0-1-2, Internal MariaDB error code: 1927 150423 14:04:10 [Warning] Slave: Connection was killed Error_code: 1927 150423 14:04:10 [Warning] Slave: Deadlock found when trying to get lock; try restarting transaction Error_code: 1213 150423 14:04:10 [Warning] Slave: Connection was killed Error_code: 1927 150423 14:04:10 [Warning] Slave: Connection was killed Error_code: 1927 150423 14:04:10 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'master-bin.000001 position 1234 The problem was incorrect error handling. When a deadlock is detected, it causes a KILL CONNECTION on the offending thread. This error is then later converted to a deadlock error, and the transaction is retried. However, the deadlock error was not cleared at the start of the retry, nor was the lingering kill signal. So it was possible to get another deadlock kill early during retry. If this happened with particular thread scheduling/timing, it was possible that the new KILL CONNECTION error was masked by the earlier deadlock error, so that the second kill was not properly converted into a deadlock error and retry. This patch adds code that clears the old error and killed flag before starting the retry. It also adds code to handle a deadlock kill caught in a couple of places where it was not handled before.	2015-04-23 14:09:15 +02:00
Kristian Nielsen	4760528754	MDEV-8029: test failure in rpl.rpl_parallel_temptable Fix a silly typo that caused the test to occasionally fail.	2015-04-21 10:16:14 +02:00
Kristian Nielsen	519ad0f7e3	MDEV-8016: Replication aborts on DROP /*!40005 TEMPORARY */ TABLE IF EXISTS This was a regression from the patch for MDEV-7668. A test was incorrect, so the slave would not properly handle re-using temporary tables, which led to replication failure in this case.	2015-04-20 12:59:46 +02:00
Kristian Nielsen	17aff4b17b	Merge MDEV-7936 into 10.0. Conflicts: sql/sql_base.cc	2015-04-13 14:27:25 +02:00
Kristian Nielsen	60d094aeac	MDEV-7936: Assertion `!table || table->in_use == _current_thd()' failed on parallel replication in optimistic mode Make sure that in parallel replication, we execute wait_for_prior_commit() before setting table->in_use for a temporary table. Otherwise we can end up with two parallel replication worker threads competing with each other for use of a temporary table. Re-factor the use of find_temporary_table() to be able to handle errors in the caller (as wait_for_prior_commit() can return error in case of deadlock kill).	2015-04-13 14:24:18 +02:00
Kristian Nielsen	c47fe0e9db	MDEV-7668: Intermediate master groups CREATE TEMPORARY with INSERT, causing parallel replication failure [This commit cherry-picked to be able to merge MDEV-7936, of which it is a pre-requisite, into both 10.0 and 10.1.] Parallel replication depends on locking (table locks, row locks, etc.) to prevent two conflicting transactions from running and committing in parallel. But temporary tables are designed to be visible only to one thread, and have no such locking. In the concrete issue, an intermediate master could commit a CREATE TEMPORARY TABLE in the same group commit as an INSERT into that table. Thus, a lower-level master could attempt to run them in parallel and get an error. More generally, we need protection from parallel replication trying to run transactions in parallel that access a common temporary table. This patch simply causes use of a temporary table from parallel replication to wait for all previous transactions to commit, serialising the replication at that point. (A more fine-grained locking could be added later, possibly. However, using temporary tables in statement-based replication is in any case normally undesirable; for example a restart of the server will lose temporary tables and can break replication). Note that row-based replication is not affected, as it does not use any temporary tables on the slave-side. This patch also cleans up the locking around protecting the list of temporary tables in Relay_log_info. This used to take the rli->data_lock at the end of every statement, which is very bad for concurrency. With this patch, the lock is not taken unless temporary tables (with statement-based binlogging) are in use on the slave.	2015-04-13 14:08:57 +02:00
Kristian Nielsen	50d98e9cbd	Merge MDEV-7940 into 10.0	2015-04-09 10:13:17 +02:00
Kristian Nielsen	15a2b5aab1	MDEV-7940: Sporadic failure in rpl.rpl_gtid_until Fix a race in the test case. When we do start_slave.inc immediately followed by stop_slave.inc, it is possible to kill the IO thread while it is still running inside get_master_version_and_clock(), and this gives warnings in the error log that cause the test to fail.	2015-04-09 10:03:59 +02:00
Kristian Nielsen	accdabd668	Merge MDEV-7888 and MDEV-7929 into 10.0.	2015-04-08 13:19:22 +02:00
Kristian Nielsen	3b961347db	MDEV-7888, MDEV-7929: Parallel replication hangs sometimes on ANALYZE TABLE or DDL The hangs occur when the group_commit_orderer object is freed before the last mark_start_commit() call on it - this loses the wakeup to other waiting worker threads, causing them to hang until killed manually. The object was freed because wakeup_subsequent_commits() was called too early in two places. For MDEV-7888, during ANALYZE TABLE, and for MDEV-7929 during record_gtid() after processing a DDL event. The group_commit_orderer object can be freed when its last transaction has called wait_for_prior_commit(). Fix by implementing a suspend/resume mechanism for wakeup_subsequent_commits() that can be used in places where a transaction is committed without this being the commit of the actual replication event group. Also add a protection mechanism (that asserts in debug builds) which can prevent the too-early free and hang if other similar bugs should remain in other parts of the code.	2015-04-08 11:01:18 +02:00
Kristian Nielsen	c41e4d3b49	Merge MDEV-7847 and MDEV-7882 into 10.0. Conflicts: mysql-test/suite/rpl/r/rpl_parallel.result mysql-test/suite/rpl/t/rpl_parallel.test	2015-03-30 14:51:25 +02:00
Kristian Nielsen	880f2273fd	MDEV-7847: "Slave worker thread retried transaction 10 time(s) in vain, giving up", followed by replication hanging This patch fixes a bug in the error handling in parallel replication, when one worker thread gets a failure and other worker threads processing later transactions have to rollback and abort. The problem was with the lifetime of group_commit_orderer objects (GCOs). A GCO is freed when we register that its last event group has committed. This relies on register_wait_for_prior_commit() and wait_for_prior_commit() to ensure that the fact that T2 has committed implies that any earlier T1 has also committed, and can thus no longer execute mark_start_commit(). However, in the error case, the code was skipping the register_wait_for_prior_commit() and wait_for_prior_commit() calls. Thus commit ordering was not guaranteed, and a GCO could be freed too early. Then a later mark_start_commit() would reference deallocated GCO, which could lead to lost wakeup (causing slave threads to hang) or other corruption. This patch makes also the error case respect commit order. This way, also the error case gets the GCO lifetime correct, and the hang no longer occurs.	2015-03-30 14:33:44 +02:00
Kristian Nielsen	3d4850158f	Fix embarrassing bug in test case that caused sporadic test failures.	2015-03-17 10:36:38 +01:00
Kristian Nielsen	2e82a8233c	MDEV-7785: errorneous -> erroneous spelling mistake	2015-03-16 10:54:47 +01:00
Venkatesh Duggirala	59142d9a27	Bug #20439913 CREATE TABLE DB.TABLE LIKE TMPTABLE IS BINLOGGED INCORRECTLY - BREAKS A SLAVE Submitted an incomplete patch with my previous push; resubmitting the extra changes required to make the patch complete.	2015-03-13 13:13:48 +05:30
Venkatesh Duggirala	151b8ec4d1	Bug #20439913 CREATE TABLE DB.TABLE LIKE TMPTABLE IS BINLOGGED INCORRECTLY - BREAKS A SLAVE Analysis: In row based replication, Master does not send temp table information to Slave. If a DDL that needs to be sent to Slave involves both a regular table and a temp table (which will not be available at Slave), the Master rewrites the query, replacing the temp table with its definition. Eg: create table regular_table like temptable. In the rewrite logic, the server ignores the database of the regular table, which can cause the problems mentioned in this bug. Fix: don't ignore database information (if available) while rewriting the query	2015-03-13 12:32:44 +05:30
Kristian Nielsen	ed04c40b01	MDEV-5289: master server starts slave parallel threads Delay spawning parallel replication worker threads until a slave SQL thread is running, and de-spawn them when the last SQL thread stops. This is especially useful to avoid needless threads on a master in a setup where same my.cnf is used on masters and slaves.	2015-03-11 09:18:16 +01:00
Kristian Nielsen	96784eb106	MDEV-7668: Intermediate master groups CREATE TEMPORARY with INSERT, causing parallel replication failure Parallel replication depends on locking (table locks, row locks, etc.) to prevent two conflicting transactions from running and committing in parallel. But temporary tables are designed to be visible only to one thread, and have no such locking. In the concrete issue, an intermediate master could commit a CREATE TEMPORARY TABLE in the same group commit as an INSERT into that table. Thus, a lower-level master could attempt to run them in parallel and get an error. More generally, we need protection from parallel replication trying to run transactions in parallel that access a common temporary table. This patch simply causes use of a temporary table from parallel replication to wait for all previous transactions to commit, serialising the replication at that point. (A more fine-grained locking could be added later, possibly. However, using temporary tables in statement-based replication is in any case normally undesirable; for example a restart of the server will lose temporary tables and can break replication). Note that row-based replication is not affected, as it does not use any temporary tables on the slave-side. This patch also cleans up the locking around protecting the list of temporary tables in Relay_log_info. This used to take the rli->data_lock at the end of every statement, which is very bad for concurrency. With this patch, the lock is not taken unless temporary tables (with statement-based binlogging) are in use on the slave.	2015-03-09 13:17:37 +01:00
Kristian Nielsen	7dee7a036a	GTID: Add missing test of reconnecting into out-of-order binlog.	2015-03-05 09:41:01 +01:00
Kristian Nielsen	3ef0b9b235	Merge MDEV-6589 and MDEV-6403 into 10.0.	2015-03-04 13:36:54 +01:00
Kristian Nielsen	78c74dbe30	MDEV-6403: Temporary tables lost at STOP SLAVE in GTID mode if master has not rotated binlog since restart The binlog contains specially marked format description events to mark when a master restart happened (which could have caused temporary tables to be silently dropped). Such events also cause the slave to close temporary tables. However, there was a bug that if after this, the slave re-connects to the master in GTID mode, the master can send an old format description event again. If temporary tables are closed when such an event is seen for the second time, it might drop temporary tables created after that event, and cause replication failure. With this patch, the restart flag of the format description event is cleared by the master when it is sent to the slave in a subsequent connection, to avoid the erroneous temp table close.	2015-03-04 13:36:29 +01:00
Kristian Nielsen	ad0d203f2e	MDEV-6589: Incorrect relay log start position when restarting SQL thread after error in parallel replication The problem occurs in parallel replication in GTID mode, when we are using multiple replication domains. In this case, if the SQL thread stops, the slave GTID position may refer to a different point in the relay log for each domain. The bug was that when the SQL thread was stopped and restarted (but the IO thread was kept running), the SQL thread would resume applying the relay log from the point of the most advanced replication domain, silently skipping all earlier events within other domains. This caused replication corruption. This patch solves the problem by storing, when the SQL thread stops with multiple parallel replication domains active, the current GTID position. Additionally, the current position in the relay logs is moved back to a point known to be earlier than the current position of any replication domain. Then when the SQL thread restarts from the earlier position, GTIDs encountered are compared against the stored GTID position. Any GTID that was already applied before the stop is skipped to avoid duplicate apply. This patch should have no effect if multi-domain GTID parallel replication is not used. Similarly, if both SQL and IO thread are stopped and restarted, the patch has no effect, as in this case the existing relay logs are removed and re-fetched from the master at the current global @@gtid_slave_pos.	2015-03-04 13:36:04 +01:00
Kristian Nielsen	aa845d123c	MDEV-6391: GTID binlog state not recovered if mariadb-bin.state is removed When the server starts up, check if the master-bin.state file was lost. If it was, recover its contents by scanning the last binlog file, thus avoiding running with a corrupt binlog state.	2015-02-27 14:34:52 +01:00
Kristian Nielsen	a227cf8046	MDEV-7335: Potential parallel slave deadlock with specific binlog corruption If somehow the COMMIT or XID event in an event group was missing, the code in parallel replication to handle this was not sufficient, leading to server deadlock.	2015-02-24 14:39:15 +01:00
Sergei Golubchik	0ba168020e	MDEV-6769 DROP TRIGGER IF NOT EXIST binlogged on master but not on slave don't return from DROP TRIGGER IF NOT EXISTS on the slave side early when the trigger couldn't be read	2015-02-22 12:54:52 +01:00
Sergei Golubchik	b739103f12	MDEV-7591 master crashed when slave specified a future position with semi-repl plugin cherry-pick the upstream fix commit d4ba10184cd7bde9c31c610e664ecd0c93605c46 Author: Sujatha Sivakumar <sujatha.sivakumar@oracle.com> Date: Wed Jul 2 11:34:11 2014 +0530 Bug#17453826:ASSERTION ERROR WHEN SETTING FUTURE BINLOG FILE/POS WITH SEMISYNC Problem: ======== When DMLs are in progress on the master stopping a slave and setting ahead binlog name/pos will cause an assert on the master. ...	2015-02-22 12:54:52 +01:00
Sergei Golubchik	d7e7862364	Merge branch '5.5' into 10.0	2015-02-18 15:16:27 +01:00
Sergei Golubchik	8e80f91fa3	Merge remote-tracking branch 'mysql/5.5' into bb-5.5-merge @ mysql-5.5.42	2015-02-11 23:50:40 +01:00
Sergei Golubchik	d9c01e4b4a	5.5 merge	2015-01-21 12:03:02 +01:00
s.sujatha	70f5d81a96	Bug#20041860: SLAVE ERROR WHEN DROP DATABASE Fixing a post push test issue.	2015-01-19 18:22:14 +05:30
Kristian Nielsen	df2db86341	MDEV-7430: rpl.rpl_gtid_crash still fails in buildbot The problem was a too low timeout for slave reconnect. It was set to 9 seconds (10 retries with 1 second in-between). This is occasionally too short on some Buildbot hosts, when the test crashes and restarts the master while the slave IO thread is running. Fix by increasing --master-retry-count for this test.	2015-01-15 15:55:09 +01:00
Kristian Nielsen	02099a335e	MDEV-7467: sporadic failure in rpl.rpl_gtid_crash The test case injects a DBUG that will crash the server during replication, then does a START SLAVE. We need to use --error 0,2006,2013 on the START SLAVE, so that we will not fail the test if the server has time to crash before the START SLAVE returns to the client. Fixes a failure seen in Buildbot.	2015-01-14 18:19:05 +01:00
Venkatesh Duggirala	ebb2a3f5e1	Problem: IO thread fails to connect to master if servers are configured with special character sets like utf16, utf32, ucs2. Analysis: MySQL server does not support a few special character sets (utf16, utf32 and ucs2) as the client's character set. It is a known limitation listed in the documentation http://dev.mysql.com/doc/refman/5.5/en/charset-connection.html. The default value for the default-character-set parameter is 'auto', which means that if the server's character set is not supported, the server automatically changes the client's character set to a predefined character set, which is 'latin1' in the current code. Eg: $ ./mysql -uroot -S$SOCKET_FILE --default-character-set=utf16 ERROR 1231 (42000): Variable 'character_set_client' can't be set to the value of 'utf16' $ ./mysql -uroot -S$SOCKET_FILE will be successfully connected to server with 'latin1' as default client side character set. When the IO thread is trying to connect to Master, it sets the server's character set as the client's character set. When the Slave server is started with these special character sets, the IO thread (which is like a connection to Master) fails because of the above said limitation. Fix: Now the IO thread behaves the same as a regular client, i.e., if the server's character set is not supported as the client's character set, then set the default client character set (latin1) as the client's character set.	2015-01-14 14:13:52 +05:30
Kristian Nielsen	f27817c1d0	MDEV-7326: Server deadlock in connection with parallel replication The bug occurs when a transaction does a retry after all transactions have done mark_start_commit() in a batch of group commit from the master. In this case, the retrying transaction can unmark_start_commit() after the following batch has already started running and de-allocated the GCO. Then after retry, the transaction will re-do mark_start_commit() on a de-allocated GCO, and also wakeup of later GCOs can be lost. This was seen "in the wild" by a user, even though it is not known exactly what circumstances can lead to retry of one transaction after all transactions in a group have reached the commit phase. The lifetime around GCO was somewhat clunky anyway. With this patch, a GCO lives until rpl_parallel_entry::last_committed_sub_id has reached the last transaction in the GCO. This guarantees that the GCO will still be alive when a transaction does mark_start_commit(). Also, we now loop over the list of active GCOs for wakeup, to ensure we do not lose a wakeup even in the problematic case.	2015-01-07 14:45:39 +01:00
Kristian Nielsen	6e0a00ed75	MDEV-7353: rpl_mdev6386 fails sporadically in buildbot Use include/sync_with_master_gtid.inc instead of --sync_with_master to avoid a race in the test case. In parallel replication, the old-style slave position (which is used by --sync_with_master) is updated out-of-order between parallel threads. This makes it possible for the position to be updated past DROP TEMPORARY TABLE t2 just before the commit of INSERT INTO t1 SELECT * FROM t2 becomes visible. In this case, there is a small window where a SELECT just after --sync_with_master may not see the changes from the INSERT.	2015-01-06 09:52:09 +01:00
s.sujatha	5da083ef67	Bug#20041860: SLAVE ERROR WHEN DROP DATABASE Fix: === Backport Bug#11756194 to mysql-5.5. slave breaks if 'drop database' fails on master and mismatched tables on slave. 'DROP TABLE <deleted tables>' was binlogged when 'DROP DATABASE' failed and at least one table was deleted from the database. The log event would cause the slave SQL thread to stop if some of the tables did not exist on the slave. After this patch, it is always binlogged with the 'IF EXISTS' option.	2014-12-29 12:17:55 +05:30
Michael Widenius	4a32d9c058	MDEV-6871 Multi-value insert on MyISAM table that makes slaves crash (when using --skip-external-locking=0) Problem was that repair() did lock and unlock tables, which left already locked tables in the wrong state include/my_check_opt.h: Added option T_NO_LOCKS to disable locking during repair() Fixed duplicated bit T_NO_CREATE_RENAME_LSN mysql-test/suite/rpl/r/myisam_external_lock.result: Test case for MDEV-6871 mysql-test/suite/rpl/t/myisam_external_lock-slave.opt: Test case for MDEV-6871 mysql-test/suite/rpl/t/myisam_external_lock.test: Test case for MDEV-6871 storage/maria/ha_maria.cc: Don't lock tables during enable_indexes() Removed some calls to current_thd storage/myisam/ha_myisam.cc: Don't lock tables during enable_indexes() Removed some calls to current_thd	2014-12-15 11:16:33 +02:00
Kristian Nielsen	5fc2814698	MDEV-7251: Test failure in rpl.rpl_parallel There was a race. The test case was expecting the slave to start processing a particular DELETE statement, then the test would stop the slave at this point. But nothing made the test wait until the slave had actually reached this point; thus depending on timing it was possible that the slave would be stopped too early, causing a .result file difference. Fixed by adding an appropriate wait to the test case.	2014-12-02 18:11:05 +01:00
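
The restart logic described in the MDEV-6589 entry (ad0d203f2e) can be illustrated with a small sketch. This is not the server's implementation: the function name, the tuple layout, and the rule "skip when seq_no is at or below the stored value for the same domain" are simplifying assumptions made only to show the idea of rewinding the relay log and skipping GTIDs that were already applied.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical representation of a GTID: (domain_id, server_id, seq_no).
Gtid = Tuple[int, int, int]


def events_to_apply(relay_log: Iterable[Gtid],
                    stored_pos: Dict[int, int]) -> List[Gtid]:
    """Return the GTIDs still to be applied after the SQL thread restarts.

    relay_log  -- GTIDs read from a rewound relay log position that is
                  earlier than the position of every replication domain.
    stored_pos -- per-domain seq_no recorded when the SQL thread stopped
                  (assumed here to grow monotonically within a domain).
    """
    pending = []
    for gtid in relay_log:
        domain_id, _server_id, seq_no = gtid
        # Anything at or before the stored position of its own domain was
        # applied before the stop, so skip it to avoid a duplicate apply.
        if seq_no <= stored_pos.get(domain_id, 0):
            continue
        pending.append(gtid)
    return pending


if __name__ == "__main__":
    # Domain 0 had advanced to seq_no 11, domain 1 only to seq_no 5.
    relay_log = [(0, 1, 10), (1, 1, 5), (0, 1, 11), (1, 1, 6), (0, 1, 12)]
    stored_pos = {0: 11, 1: 5}
    print(events_to_apply(relay_log, stored_pos))
```

Resuming from the most advanced domain's position would lose (1, 1, 6) in this example, which is the silent skip the commit message describes; rewinding and filtering per domain keeps it.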


3057 commits