33801 commits

Author SHA1 Message Date
Kristian Nielsen
16fb2963ff Fix typo that breaks compilation on platforms without atomics. 2014-12-12 14:03:20 +01:00
Kristian Nielsen
d79cce86ab MDEV-4393: show_explain.test times out randomly
The problem was a race between the debug code in the server and the SHOW
EXPLAIN FOR in the test case.

The test case would wait for a query to reach the first point of interest
(inside dbug_serve_apcs()), then send it a SHOW EXPLAIN FOR, then wait for the
query to reach the next point of interest. However, the second wait was
insufficient. It was possible for the second wait to complete immediately,
causing both the first and the second SHOW EXPLAIN FOR to hit the same
invocation of dbug_serve_apcs(). Then a later invocation would miss its
intended SHOW EXPLAIN FOR and hang, and the test case would eventually time
out.

The fix is to make sure that the second wait cannot trigger during the first
invocation of dbug_serve_apcs(). We do this by clearing the thd_proc_info
(that the wait is looking for) before processing the SHOW EXPLAIN FOR; this
way the second wait can not start until the thd_proc_info from the first
invocation has been cleared.
2014-12-03 15:49:31 +01:00
Kristian Nielsen
1eed274848 Fix wording in error log message, to be consistent with other messages ("IO thread" -> "I/O thread"). 2014-12-02 12:11:07 +01:00
Sergei Golubchik
912bbfda12 MDEV-7144 Warnings "bytes lost" during server shutdown after running connect.part_file test in buildbot 2014-11-22 11:56:29 +01:00
Kristian Nielsen
52b25934d7 MDEV-7237: Parallel replication: incorrect relaylog position after stop/start the slave
The replication relay log position was sometimes updated incorrectly at the
end of a transaction in parallel replication. This happened because the relay
log file name was taken from the current Relay_log_info (SQL driver thread),
not the correct value for the transaction in question.

The result was that if a transaction was applied while the SQL driver thread
was at least one relay log file ahead, _and_ the SQL thread was subsequently
stopped before applying any events from the most recent relay log file, then
the relay log position would be incorrect - wrong relay log file name. Thus,
when the slave was started again, usually a relay log read error would result,
or in rare cases, if the position happened to be readable, the slave might
even skip arbitrary amounts of events.

In GTID mode, the relay log position is reset when both slave threads are
restarted, so this bug would only be seen in non-GTID mode, or in GTID mode
when only the SQL thread, not the IO thread, was stopped.
2014-12-01 13:53:57 +01:00
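A minimal sketch of the MDEV-7237 mix-up described above, not the server's code. The types `DriverState`, `Transaction` and `RelayLogPos` and both functions are hypothetical names chosen for illustration. It only shows why pairing a transaction's end offset with the SQL driver thread's current file name gives a position in the wrong file once the driver has moved on to a later relay log.

```cpp
#include <string>

// Hypothetical stand-ins; the real server structures are not shown in this log.
struct DriverState {
    std::string current_file;   // relay log file the SQL driver thread is reading
};

struct Transaction {
    std::string file;           // relay log file this transaction was read from
    unsigned long end_offset;   // offset just past the transaction's last event
};

struct RelayLogPos {
    std::string file;
    unsigned long offset;
};

// Buggy pairing: file name from the driver, offset from the transaction.
RelayLogPos position_from_driver(const DriverState &d, const Transaction &t) {
    return {d.current_file, t.end_offset};
}

// Consistent pairing: both values come from the transaction itself.
RelayLogPos position_from_transaction(const Transaction &t) {
    return {t.file, t.end_offset};
}
```

With the driver one file ahead, the first function returns an offset that belongs to a different file than the one it names, which matches the read error or skipped events described in the message.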
Alexander Barkov
f8e1952be4 MDEV-7149 Constant propagation erroneously applied for LIKE
Simply disallowing equality propagation into LIKE.
A more delicate fix is possible, but it would need too many changes,
which is not desirable in 10.0 at this point.
2014-11-28 18:11:58 +04:00
Alexander Barkov
5ae1639c02 Backporting a cleanup in boolean functions from 10.1:
Moving Item_bool_func2 and Item_func_opt_neg from Item_int_func to
Item_bool_func. Now all functions that return is_bool_func()=true
have a common root class Item_bool_func.
This change is needed to fix MDEV-7149 properly.
2014-11-27 11:47:22 +04:00
Kristian Nielsen
06d0d09077 MDEV-6582: DEBUG_SYNC does not reset mysys_var->current_mutex, causes assertion "Trying to unlock mutex that wasn't locked"
The bug was in DEBUG_SYNC. When waiting, debug_sync_execute() temporarily sets
thd->mysys_var->current_mutex to a new value while waiting. However, if the
old value of current_mutex was NULL, it was not restored, current_mutex
remained set to the temporary value (debug_sync_global.ds_mutex).

This made possible the following race: Thread T1 goes to KILL thread T2. In
THD::awake(), T1 loads T2->mysys_var->current_mutex, it is set to ds_mutex, T1
locks this mutex.

Now T2 runs, it does ENTER_COND, it sets T2->mysys_var->current_mutex to
LOCK_wait_commit (for example).

Then T1 resumes, it reloads mysys_var->current_mutex, now it is set to
LOCK_wait_commit, T1 unlocks this mutex instead of the ds_mutex that it locked
previously.

This causes safe_mutex to assert with the message: "Trying to unlock mutex
LOCK_wait_commit that wasn't locked".

The fix is to ensure that DEBUG_SYNC also will restore
mysys_var->current_mutex in the case where the original value was NULL.
2014-11-26 11:07:32 +01:00
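The save-and-restore slip in the MDEV-6582 entry above can be sketched as follows. This is an illustration under assumptions, not the DEBUG_SYNC source: `ThreadVar`, `wait_buggy` and `wait_fixed` are hypothetical names, the mutex is represented only by its name, and the actual wait is omitted.

```cpp
// Hypothetical stand-in for the per-thread state; only the one field matters here.
struct ThreadVar {
    const char *current_mutex = nullptr;  // name of the mutex being waited on, or nullptr
};

// Buggy pattern: the old value is restored only when it was non-null.
void wait_buggy(ThreadVar &tv) {
    const char *old_mutex = tv.current_mutex;
    tv.current_mutex = "ds_mutex";        // temporary value for the duration of the wait
    // ... the wait itself would happen here ...
    if (old_mutex != nullptr)
        tv.current_mutex = old_mutex;     // nullptr case: stale "ds_mutex" is left behind
}

// Fixed pattern: restore unconditionally, including the nullptr case.
void wait_fixed(ThreadVar &tv) {
    const char *old_mutex = tv.current_mutex;
    tv.current_mutex = "ds_mutex";
    // ... the wait itself would happen here ...
    tv.current_mutex = old_mutex;
}
```

A thread that entered with no current mutex leaves `wait_buggy` still advertising `ds_mutex`, which is the stale value that another thread can then lock in the race described above.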
Kristian Nielsen
b79685902d MDEV-6903: gtid_slave_pos is incorrect after master crash
When a master server restarts, it logs a special restart format description
event in its binlog. When the slave sees this event, it knows it needs to roll
back any active partial transaction, in case the master crashed previously in
the middle of writing such transaction to its binlog.

However, there was a bug where this rollback did not reset rgi->pending_gtid.
This caused the @@gtid_slave_pos to be updated incorrectly with the GTID of
the partial transaction that was rolled back.

Fix this by always clearing rgi->pending_gtid in cleanup_context(), hopefully
preventing similar bugs from turning up in other special cases where a
transaction is rolled back during replication.

Thanks to Pavel Ivanov for tracking down the issue and providing a test case.
2014-11-25 12:19:48 +01:00
Sergei Golubchik
67e2e14627 Merge 2014-11-21 08:50:44 +01:00
Sergei Golubchik
82f56328ea after merge fixes:
* adjust viossl.c to take into account the new code
  (SSL_get_error is used now, cannot simply remap it)
* remove unnecessary version check
* update the test to 10.0
2014-11-21 00:02:24 +01:00
Sergei Golubchik
3495801e2e 5.5 merge 2014-11-19 17:23:39 +01:00
Sergey Petrunya
00475d40d1 MDEV-7118: Anemometer stop working after upgrade to from...
When the optimizer considers an option to use Loose Scan, it should
still consider UNIQUE keys. (Previously, MDEV-4120 disabled loose scan
for all kinds of unique indexes. That was wrong.)

However, we should not use Loose Scan when trying to satisfy 
 "SELECT DISTINCT col1, col2, .. colN"
when using an index defined as UNIQUE(col1, col2, ... colN).
2014-11-19 17:14:49 +03:00
Alexander Barkov
154ec0f420 MDEV-6993 Bad results with join comparing DECIMAL and ENUM/SET columns 2014-11-19 12:08:35 +04:00
Alexander Barkov
55dd89e919 MDEV-6978 Bad results with join comparing case insensitive VARCHAR/ENUM/SET
expression to a _bin ENUM column
2014-11-19 10:33:49 +04:00
Sergei Golubchik
51d7e80355 MDEV-4285 Server crashes in ptr_compare on NOW and CAST in ORDER BY
skip qsort if the sort key has zero length
2014-11-18 22:27:31 +01:00
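A rough sketch of the guard named in the MDEV-4285 entry above, assuming fixed-size records whose sort key is a byte prefix. The function `sort_records`, the comparator and the global `g_key_length` are hypothetical; the server's own comparison routine and sort buffers are not shown in this log, so this only illustrates returning early when there is nothing to compare.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// std::qsort has no context argument, so the key length is passed via a global.
static std::size_t g_key_length = 0;

// Compare two records by the first g_key_length bytes.
static int compare_keys(const void *a, const void *b) {
    return std::memcmp(a, b, g_key_length);
}

// Sort `count` records of `record_size` bytes each by their key prefix.
void sort_records(unsigned char *records, std::size_t count,
                  std::size_t record_size, std::size_t key_length) {
    // A zero-length key gives the comparator nothing to compare,
    // so the sort is skipped entirely and the order is left unchanged.
    if (key_length == 0 || count < 2)
        return;
    g_key_length = key_length;
    std::qsort(records, count, record_size, compare_keys);
}
```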
Sergei Golubchik
c417da24a3 MDEV-6794 XtraDB no longer using UNIQUE as clustered index when PK missing
try the first unique key as a surrogate PK *before* disabling extended
keys because of missing PK
2014-11-18 22:26:14 +01:00
Sergei Golubchik
79c76400a6 two more unused error messages 2014-11-18 22:26:09 +01:00
Sergei Golubchik
ea04a8cfda MDEV-6805 one can set character_set_client to utf32
use the same restriction for character_set_client on the command line
and from SQL.

Also: remove strange hack from thd_init_client_charset() that contradicted
the manual (collation_connection and character_set_results were not always set)
2014-11-18 22:25:47 +01:00
Sergei Golubchik
a8bd285f7c MDEV-6785 Wrong result on 2nd execution of PS with aggregate function, FROM SQ or MERGE view
a different fix for view.test --ps-protocol crash
(revert the old fix that has caused a regression)
2014-11-18 22:25:41 +01:00
Sergei Golubchik
303eec5774 MDEV-6880 Can't define CURRENT_TIMESTAMP as default value for added column
ALTER TABLE: don't fill default values per row, do it once.
And do it in two places - for copy_data_between_tables() and for online ALTER.

Also, run function_defaults test both for MyISAM and for InnoDB.
2014-11-18 22:25:33 +01:00
Sergei Golubchik
5cfc62f9c6 MDEV-7087 main.stat_tables-enospc fails in buildbot on a valgrind build
when reading data into the record buffer, the tail of the VARCHAR
(between real and max varchar length) is not written to. initialize the record
buffer to avoid writing uninitialized memory to disk.
2014-11-18 22:25:27 +01:00
Alexander Barkov
b52d4d0076 MDEV-6991 GROUP_MIN_MAX optimization is erroneously applied in some cases 2014-11-18 23:15:54 +04:00
Sergei Golubchik
84fc27fbef 5.3 merge 2014-11-18 17:36:51 +01:00
Sergei Golubchik
cc2c296309 MDEV-4513 Valgrind warnings (Conditional jump or move depends on uninitialised value) in inflate on UNCOMPRESS 2014-11-18 15:42:48 +01:00
Sergei Golubchik
5d0122bd77 MDEV-7113 difference between check_vcol_func_processor and check_partition_func_processor
MDEV-6789 segfault in Item_func_from_unixtime::get_date on updating table with virtual columns

* prohibit VALUES in partitioning expression
* prohibit user and system variables in virtual column expressions
* fix Item_func_date_format to cache locale (for %M/%W to return the same as MONTHNAME/DAYNAME)
* fix Item_func_from_unixtime to cache time_zone directly, not THD (and not to crash)
* added tests for other incorrectly allowed (in vcols) functions to see that they don't crash
2014-11-18 15:42:40 +01:00
Sergei Golubchik
84f25c25f2 MDEV-3940 Server crash or assertion `item->type() == Item::STRING_ITEM' failure on LOAD DATA through a view with statement binary logging
A "field" could be either an Item_field or
(if loading into a view) an Item_direct_ref that references Item_field.

Also: when iterating fields, use fields of the TABLE_LIST (table or view),
not fields of a TABLE (actual underlying table - might have more columns).
2014-11-18 15:42:32 +01:00
Alexander Barkov
e52b1637e0 MDEV-6950 Bad results with joins comparing DATE/DATETIME and INT/DECIMAL/DOUBLE/ENUM/VARCHAR columns
MDEV-6971 Bad results with joins comparing TIME and DOUBLE/DECIMAL columns
Disallow using indexes on non-temporal columns to optimize
ref access, range access and table elimination when the counterpart's
cmp_type is TIME_RESULT, e.g.:
  SELECT * FROM t1 WHERE indexed_int_column=time_expression;
Only index on a temporal column can be used to optimize temporal comparison
operations.
2014-11-18 16:33:29 +04:00
Kristian Nielsen
e9fc98b583 MDEV-7121: Parallel slave may hang if master crashes in the middle of writing transaction to binlog
When a master server restarts, it writes a restart format_description event as
the first event in the next binlog file. The parallel slave SQL thread queues
a special restart entry for the current worker thread to signal this, so that
the worker thread can roll back any prior partial transaction that might have
been written to the binlog due to master crashing.

This queueing was missing a mysql_cond_signal() to notify the worker
thread. This could cause the worker thread to not process the restart entry,
and this in turn would cause the SQL thread to hang infinitely waiting for the
worker thread to complete processing.

Fix by adding the missing wakeup signalling for this case.
2014-11-17 12:42:02 +01:00
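The missing wakeup in the MDEV-7121 entry above follows a common queue pattern, sketched here with the standard library rather than the server's `mysql_cond_signal()`. `WorkerQueue` and `process_entries` are hypothetical names, and the integer entries stand in for whatever the SQL thread queues. The point is only that every enqueue path, including a special entry such as the restart marker, needs the notify call.

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>

// Hypothetical worker queue protected by a mutex and a condition variable.
struct WorkerQueue {
    std::mutex lock;
    std::condition_variable cond;
    std::deque<int> entries;
    int processed = 0;

    // Producer side: queue an entry, then wake the worker.
    void enqueue(int entry) {
        {
            std::lock_guard<std::mutex> guard(lock);
            entries.push_back(entry);
        }
        // Without this call a worker that is already waiting may never
        // see the new entry, and the producer would wait on it forever.
        cond.notify_one();
    }

    // Worker side: process exactly `count` entries, then return.
    void run(int count) {
        std::unique_lock<std::mutex> guard(lock);
        while (processed < count) {
            cond.wait(guard, [this] { return !entries.empty(); });
            entries.pop_front();
            ++processed;
        }
    }
};

// Start a worker, feed it `count` entries, and wait for it to finish.
int process_entries(int count) {
    WorkerQueue q;
    std::thread worker([&q, count] { q.run(count); });
    for (int i = 0; i < count; ++i)
        q.enqueue(i);
    worker.join();
    return q.processed;
}
```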
Kristian Nielsen
7671fd70c0 MDEV-7080: rpl.rpl_gtid_crash fails sporadically in buildbot
The real problem here was inconsistent handling of entry->commit_errno in
MYSQL_BIN_LOG::write_transaction_or_stmt(). Some return paths were setting it
to the value of errno, some were not. And the setting was redundant anyway,
as it is set consistently by the caller.

Fix by consistently setting it in the caller, and not in each return path in
the function.

The test failure happened because a DBUG_EXECUTE_IF() used in the test case
set an entry->commit_errno that was immediately overwritten in the caller with
whatever happened to be the value of errno. This could lead to a different
error message in the .result file.
2014-11-17 08:53:42 +01:00
Alexey Botchkov
c9742ceac5 MDEV-6883 ST_WITHIN crashes server if (0,0) is matched to POLYGON((0 0)).
Fixed the case when a polygon contains a single-point ring.
2014-11-15 21:30:16 +04:00
Sergei Golubchik
81d7e2f61c MDEV-7003 test-alter-table crashes debug build due to double free of plugin
correct the buffer boundary check
2014-11-13 13:40:19 +01:00
Sergei Golubchik
6a2c170141 MDEV-6849 ON UPDATE CURRENT_TIMESTAMP doesn't always work
reset default fields not for every modified row, but only once,
at the beginning, as the set of modified fields doesn't change.

exception: INSERT ... ON DUPLICATE KEY UPDATE - the set of fields
does change per row and in that case we reset default fields per row.
2014-11-13 13:40:11 +01:00
Sergey Petrunya
50c5339272 MDEV-7068: MRR accessing uninitialised bytes, test case failure main.innodb_mrr
Backport to 5.3:
- Don't call index_reader->interrupt_read() if the
  index reader has returned all rows that matched its keys.
2014-11-13 14:12:41 +03:00
Kristian Nielsen
26b1113032 MDEV-6917: Parallel replication: "Commit failed due to failure of an earlier commit on which this one depends", but no prior failure seen
This bug was seen when parallel replication experienced a deadlock between
transactions T1 and T2, where T2 has reached the commit phase and is waiting
for T1 to commit first. In this case, the deadlock is broken by sending a kill
to T2; that kill error is then later detected and converted to a deadlock
error, which causes T2 to be rolled back and retried.

The problem was that the kill caused ha_commit_trans() to erroneously call
wakeup_subsequent_commits() on T3, signalling it to abort because T2 failed
during commit. This is incorrect, because the error in T2 is only a temporary
error, which will be resolved by normal transaction retry. We should not
signal error to the next transaction until we have executed the code that
handles such temporary errors.

So this patch just removes the calls to wakeup_subsequent_commits() from
ha_commit_trans(). They are incorrect in this case, and they are not needed in
general, as wakeup_subsequent_commits() must in any case be called in
finish_event_group() to wakeup any transactions that may have started to wait
after ha_commit_trans(). And normally, wakeup will in fact have happened
earlier, either from the binlog group commit code, or (in case of no
binlogging) after the fast part of InnoDB/XtraDB group commit.

The symptom of this bug was that replication would break on some transaction
with "Commit failed due to failure of an earlier commit on which this one
depends", but with no such failure of an earlier commit visible anywhere.
2014-11-13 11:01:31 +01:00
Kristian Nielsen
3dcd01e5e6 MDEV-7065: Incorrect relay log position in parallel replication after retry of transaction
The retry of an event group in parallel replication set the wrong value for
the end log position of the event that was retried
(qev->future_event_relay_log_pos). It was too large by the size of the event,
so it pointed into the middle of the following event.

If the retry happened in the very last event of the event group, _and_ the SQL
thread was stopped just after successfully retrying that event, then the SQL
thread's relay log position would be left incorrect. Restarting the SQL
thread could then try to read events from a garbage offset in the relay log,
usually leading to an error about not being able to read the event.
2014-11-13 10:46:09 +01:00
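The arithmetic behind the MDEV-7065 entry above can be illustrated with a small sketch. `RelayEvent` and both functions are hypothetical, and the way the overshoot arises here (adding the event length to a cursor that is already past the event) is an assumption made for illustration; the message only states that the value was too large by the size of the event.

```cpp
#include <cstdint>

// Hypothetical description of one event in a relay log file.
struct RelayEvent {
    std::uint64_t start;   // offset of the event's first byte
    std::uint64_t length;  // size of the event in bytes
};

// Correct end position: the offset of the first byte after the event.
std::uint64_t end_position(const RelayEvent &ev) {
    return ev.start + ev.length;
}

// Overshooting variant (assumed cause): the cursor is already past the
// event, so adding the length again lands inside the following event.
std::uint64_t overshoot_end_position(std::uint64_t cursor_after_read,
                                     const RelayEvent &ev) {
    return cursor_after_read + ev.length;
}
```

For an event at offset 1000 with length 50, the correct end position is 1050. The overshooting variant, called with the cursor at 1050, yields 1100, which is not an event boundary unless the next event also happens to be exactly 50 bytes long.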
Kristian Nielsen
d08b893b39 MDEV-6775: Wrong binlog order in parallel replication: Intermediate commit
The code in binlog group commit around wait_for_commit that controls commit
order, did the wakeup of subsequent commits early, as soon as a following
transaction is put into the group commit queue, but before any such commit has
actually taken place. This causes problems with too early wakeup of
transactions that need to wait for prior transactions to commit, but do not
take part in the binlog group commit for one reason or another.

This patch solves the problem, by moving the wakeup to happen only after the
binlog group commit is completed.

This requires a new solution to ensure that transactions that arrive later
than the leader are still able to participate in group commit. This patch
introduces a flag wait_for_commit::commit_started. When this is set, a waiter
can queue up itself in the group commit queue.

This way, effectively the wait_for_prior_commit() is skipped only for
transactions that participate in group commit, so that skipping the wait is
safe. Other transactions still wait as needed for correctness.
2014-11-13 10:31:20 +01:00
Kristian Nielsen
eec04fb4f6 MDEV-6680: Performance of domain_parallel replication is disappointing
The code that handles free lists of various objects passed to worker threads
in parallel replication handles freeing in batches, to avoid taking and
releasing LOCK_rpl_thread too often. However, it was possible for freeing to
be delayed to the point where one thread could stall the SQL driver thread due
to full queue, while other worker threads might be idle. This could
significantly degrade possible parallelism and thus performance.

Clean up the batch freeing code so that it is more robust and now able to
regularly free batches of objects, so that normally the queue will not run full
unless the SQL driver thread is really far ahead of the worker threads.
2014-11-13 10:20:48 +01:00
Kristian Nielsen
8a3e2f29bb MDEV-6718: Server crashed in Gtid_log_event::Gtid_log_event with parallel replication
The bug occurred in parallel replication when re-trying transactions that
failed due to deadlock. In this case, the relay log file is re-opened and the
events are read out again. This reading requires a format description event of
the appropriate version. But the code was using a description event stored in
rli, which is not thread-safe. This could lead to various rare races if the
format description event was replaced by the SQL driver thread at the exact
moment where a worker thread was trying to use it.

The fix is to instead make the retry code create and maintain its own format
description event. When the relay log file is opened, we first read the format
description event from the start of the file, before seeking to the current
position. This now uses the same code as when the SQL driver thread starts
from a given relay log position. This also makes sure that the correct format
description event version will be used in cases where the version of the
binlog could change during replication.
2014-11-13 10:09:46 +01:00
Kristian Nielsen
a98a034c5e MDEV-7102: Incorrect PSI_stage_info message in SHOW PROCESSLIST during parallel replication
In parallel replication, threads can do two different waits for a prior
transaction. One is for the prior transaction to start commit, the other is
for it to complete commit.

It turns out that the same PSI_stage_info message was erroneously used in
both cases (probably a merge error), causing SHOW PROCESSLIST to be
misleading.

Fix by using correct, distinct message in each case.
2014-11-13 09:56:28 +01:00
Kristian Nielsen
684715a269 MDEV-6775: Wrong binlog order in parallel replication
In parallel replication, the wait_for_commit facility is used to ensure that
events are written into the binlog in the correct order. This is handled in an
optimised way in the binlogging group commit code.

However, some statements, for example GRANT, are written directly into the
binlog, outside of the group commit code. There was a bug that this direct
write does not correctly wait for the prior transactions to have been written
first, which allows f.ex. GRANT to be written ahead of earlier transactions.

This patch adds the missing wait_for_prior_commit() before writing directly to
the binlog.

However, the problem is still there, although the race is much less likely to
occur now. The problem is that the optimised group commit code does wakeup of
following transactions early, before the binlog write is actually done. A
woken-up following transaction is then allowed to run ahead and queue up for
the group commit, which will ensure that binlog write happens in correct order
in the end. However, the code for directly written events currently bypasses
this mechanism, so they get woken up and written too early.

This will be fixed properly in a later patch.
2014-11-13 09:49:07 +01:00
Kristian Nielsen
55791c1a77 Revert incorrect/redundant fix for old BUG#34656
The real bug was that open_tables() returned error in case of
thd->killed() without properly calling thd->send_kill_message()
to set the correct error. This was fixed some time ago.

So remove the now-redundant extra checks for thd->is_error(),
which may also allow debug builds to catch more cases of
incorrect error handling.
2014-11-13 09:20:40 +01:00
Kristian Nielsen
fbc8768ce5 MDEV-7101: SAFE_MUTEX lock order warning when reusing wait_for_commit mutex
In SAFE_MUTEX builds, reset the wait_for_commit mutex (destroy and
re-initialise), so that SAFE_MUTEX lock order check does not become
confused when the mutex is re-used for a different purpose.
2014-11-13 09:19:12 +01:00
Sergei Golubchik
815667086c sql_update.cc: always update default fields *after* compare_record()
(it was *after* in two cases and *before* in one case)
2014-11-11 10:39:35 +01:00
Alexander Barkov
9e8202013a MDEV-6965 non-captured group \2 in regexp_replace 2014-11-10 16:43:27 +04:00
Sergei Golubchik
360c49c1b9 MDEV-6179: dynamic columns functions/cast()/convert() doesn't play nice with CREATE/ALTER TABLE
When parsing a field declaration, grab type information from LEX before it's overwritten
by further rules. Pass type information through the parser stack to the rule that needs it.
2014-11-08 19:54:42 +01:00
Alexander Barkov
e072a647d9 MDEV-6865 Merge Bug#18935421 RPAD DIES WITH CERTAIN PADSTR INTPUTS.. 2014-11-17 17:24:04 +04:00
unknown
e7c356f717 MDEV-6868: MariaDB server crash ( select with union and order by with subquery )
Excluding the ORDER BY condition should be done after preparing it (even to catch syntax errors).
2014-11-15 22:18:33 +01:00
Sergey Petrunya
06c7f493e3 MDEV-7068: MRR accessing uninitialised bytes, test case failure main.innodb_mrr
- Don't call index_reader->interrupt_read() if the
  index reader has returned all rows that matched its keys.
2014-11-13 13:56:35 +03:00
Alexander Barkov
b84a892fb2 MDEV-7019 String::chop() is wrong and may potentially crash (MySQL bug#56492)
Merging a fix from the upstream.
2014-11-10 18:08:17 +04:00