The reason was that a couple of variables holding the number of rows, which were used to calculate buffer sizes, were uint and caused an overflow.
Fixed by changing variables that could hold the number of rows from uint to ulong and by adding a cast for this test.
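A minimal sketch of this kind of overflow, not the actual heap engine code. The function names are hypothetical, and fixed-width types stand in for uint/ulong so that the behaviour does not depend on the platform's width of long.

```c
#include <stdint.h>

/* Hypothetical helper: bytes needed for `rows` records of `reclength` bytes.
   Both operands are 32 bits wide, so the product wraps modulo 2^32. */
uint32_t buffer_bytes_narrow(uint32_t rows, uint32_t reclength)
{
  return rows * reclength;
}

/* Same calculation, but one operand is cast to 64 bits before the
   multiplication, so the full product is kept. */
uint64_t buffer_bytes_wide(uint32_t rows, uint32_t reclength)
{
  return (uint64_t) rows * reclength;
}
```

With 70000 rows of 70000 bytes each, the narrow version yields 605032704 instead of 4900000000.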
include/heap.h:
Reorder to get better alignment. Changed variables that could hold number of rows from uint to ulong
mysql-test/suite/heap/heap.result:
Added test case
mysql-test/suite/heap/heap.test:
Added test case
mysql-test/suite/plugins/t/server_audit.test:
Added sleep as we want to have disconnect logged before we try a new connect
storage/heap/ha_heap.cc:
Changed variables that could hold number of rows from uint to ulong
Limit number of rows to 4G (as most of the variables that hold rows are ulong anyway)
Reset records_changed when key_stat_version is changed, to not cause increments for every row changed
storage/heap/ha_heap.h:
Changed records_changed to ulong as this can get big
storage/heap/hp_create.c:
Changed variables that could hold number of rows from uint to ulong
Added cast (fixed the original bug)
storage/heap/hp_delete.c:
Changed variables that could hold number of rows from uint to ulong
storage/heap/hp_open.c:
Removed unneeded cast
storage/heap/hp_write.c:
Changed variables that could hold number of rows from uint to ulong
support-files/compiler_warnings.supp:
Removed extra : from suppression
Due to how gap locks work, two transactions could group commit together on the
master, but get lock conflicts and then deadlock due to different thread
scheduling order on the slave.
For now, remove these deadlocks by running the parallel slave in READ
COMMITTED mode. And let InnoDB/XtraDB allow statement-based binlogging for the
parallel slave in READ COMMITTED.
We are also investigating a different solution long-term, which is based on
relaxing the gap locks only between the transactions running in parallel for
one slave, but not against possibly external transactions.
When a transaction fails in parallel replication, it should signal the error
to any following transactions doing wait_for_prior_commit() on it. But the
code for this was incorrect, and would not correctly remember a prior error
when sending the signal. This caused corruption when the slave stopped due to an
error.
Fix by remembering the error code when we first get an error, and passing the
saved error code to wakeup_subsequent_commits().
Thanks to nanyi607rao who reported this bug on
maria-developers@lists.launchpad.net and analysed the root cause.
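The idea of the fix can be sketched as follows. This is an illustration only: the struct and function names are hypothetical and much simpler than the server's wait_for_commit machinery, and no locking is shown.

```c
/* Hypothetical, simplified model of a transaction and the one waiting on it. */
struct commit_waiter
{
  int signalled;     /* set once the prior transaction has woken us up */
  int wakeup_error;  /* error passed along by the prior transaction, 0 if none */
};

struct txn
{
  int saved_error;             /* first error seen by this transaction */
  struct commit_waiter *next;  /* following transaction waiting on us, if any */
};

/* Remember only the first non-zero error; later calls must not overwrite it. */
void txn_record_error(struct txn *t, int err)
{
  if (err != 0 && t->saved_error == 0)
    t->saved_error = err;
}

/* Wake the following transaction, handing it the saved error code. */
void txn_wakeup_subsequent(struct txn *t)
{
  if (t->next)
  {
    t->next->wakeup_error = t->saved_error;
    t->next->signalled = 1;
  }
}
```

Keeping the error in the transaction itself means the wakeup can happen at any later point without losing it.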
Let TABLE_SHARE::tdc.free_tables, TABLE_SHARE::tdc.all_tables,
TABLE_SHARE::tdc.flushed and corresponding invariants be protected by
per-share TABLE_SHARE::tdc.LOCK_table_share instead of global LOCK_open.
Now if CREATE OR REPLACE fails but we have deleted a table already, we will generate a DROP TABLE in the binary log.
This fixes this issue.
In addition, for a failing CREATE OR REPLACE TABLE ... SELECT we don't generate a log of all the inserted rows, only the DROP TABLE.
I added code for not logging DROP TEMPORARY TABLE for tables where the CREATE TABLE was not logged. This code will be activated in 10.1
by removing the code protected by DONT_LOG_DROP_OF_TEMPORARY_TABLES.
mysql-test/suite/rpl/r/create_or_replace_mix.result:
More test cases
mysql-test/suite/rpl/r/create_or_replace_row.result:
More test cases
mysql-test/suite/rpl/r/create_or_replace_statement.result:
More test cases
mysql-test/suite/rpl/t/create_or_replace.inc:
More test cases
sql/log.cc:
Added binlog_reset_cache() to clear the binary log.
sql/log.h:
Added prototype
sql/sql_insert.cc:
If CREATE OR REPLACE TABLE ... SELECT fails:
- Don't log anything if nothing changed
- If table was deleted, log a DROP TABLE.
Remember whether table creation of temporary tables was logged.
sql/sql_table.cc:
Added log_drop_table()
Remember whether table creation of temporary tables was logged.
If CREATE OR REPLACE TABLE ... SELECT fails and a table was deleted, log a DROP TABLE.
sql/sql_table.h:
Added prototype
sql/sql_truncate.cc:
Remember whether table creation of temporary tables was logged.
sql/table.h:
Added table_creation_was_logged
mysql-test/suite/rpl/t/rpl_000011-slave.opt:
Renamed test case as it's the slave that needs to be restarted
support-files/compiler_warnings.supp:
Fixed bad characters in suppression
mysql-test/suite/rpl/t/rpl_000011-master.opt:
Added master.opt file to ensure that other tests don't interfere with rpl_000011
plugin/server_audit/server_audit.c:
Fixed compiler error on Solaris
support-files/compiler_warnings.supp:
Ignore warning from xtradb
The reason for the bug was an optimization for higher connect speed that moved the point at which global status is updated,
but forgot to update the status when a slave thread dies.
Fixed by adding thd->add_status_to_global() before deleting slave thread's thd.
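A rough sketch of the pattern, assuming a much simplified model: the types, the two counters and thread_ctx_destroy() are hypothetical stand-ins, and the locking that a real server needs around the global counters is left out.

```c
#include <stdlib.h>

/* Hypothetical, reduced set of status counters. */
struct status_vars
{
  unsigned long questions;
  unsigned long bytes_sent;
};

/* Simplified stand-in for a per-thread context with its own counters. */
struct thread_ctx
{
  struct status_vars status;
};

static struct status_vars global_status;

/* Fold the thread-local counters into the global ones and reset them,
   so that folding twice cannot count anything double. */
void add_status_to_global(struct thread_ctx *ctx)
{
  global_status.questions += ctx->status.questions;
  global_status.bytes_sent += ctx->status.bytes_sent;
  ctx->status.questions = 0;
  ctx->status.bytes_sent = 0;
}

/* Destroying a thread context must fold its counters first;
   otherwise whatever the thread counted is lost. */
void thread_ctx_destroy(struct thread_ctx *ctx)
{
  add_status_to_global(ctx);
  free(ctx);
}
```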
mysys/my_delete.c:
Added missing newline
sql/mysqld.cc:
Use add_status_to_global()
sql/slave.cc:
Added missing add_status_to_global()
sql/sql_class.cc:
Use add_status_to_global()
sql/sql_class.h:
Simplify adding local status to global by adding add_status_to_global()
Before, the arrival of the same GTID twice in multi-source replication
would cause double-apply or, in GTID strict mode, an error.
Keep the behaviour, but add an option --gtid-ignore-duplicates which
allows duplicates to be handled correctly, ignoring all but the first.
This relies on the user ensuring correct configuration so that
sequence numbers are strictly increasing within each replication
domain; then duplicates can be detected simply by comparing the
sequence numbers against what is already applied.
Only one master connection (but possibly multiple parallel worker
threads within that connection) is allowed to apply events within
one replication domain at a time; any other connection that
receives a GTID in the same domain either discards it (if it is
already applied) or waits for the other connection to not have
any events to apply.
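The duplicate check by sequence number can be illustrated roughly as below. This is a sketch under assumptions: the fixed-size table and all names are hypothetical, and the waiting between master connections and any locking are not modelled.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_DOMAINS 16

/* Highest sequence number applied so far in one replication domain. */
struct domain_state
{
  int used;
  uint32_t domain_id;
  uint64_t last_seq_no;
};

struct gtid_state
{
  struct domain_state domains[MAX_DOMAINS];
};

/* Returns 1 if the GTID is new and is now recorded as applied,
   0 if it is a duplicate that should be discarded,
   -1 if the (hypothetical) fixed-size table is full. */
int gtid_check_and_record(struct gtid_state *st, uint32_t domain_id,
                          uint64_t seq_no)
{
  struct domain_state *free_slot = NULL;
  size_t i;

  for (i = 0; i < MAX_DOMAINS; i++)
  {
    struct domain_state *d = &st->domains[i];
    if (d->used && d->domain_id == domain_id)
    {
      /* Sequence numbers are assumed strictly increasing per domain,
         so anything not above the last applied one was seen before. */
      if (seq_no <= d->last_seq_no)
        return 0;
      d->last_seq_no = seq_no;
      return 1;
    }
    if (!d->used && !free_slot)
      free_slot = d;
  }
  if (!free_slot)
    return -1;
  free_slot->used = 1;
  free_slot->domain_id = domain_id;
  free_slot->last_seq_no = seq_no;
  return 1;
}
```

The check is only valid because of the configuration requirement above; with sequence numbers that are not strictly increasing within a domain, new events would be discarded as duplicates.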
Intermediate patch, as proof-of-concept for testing. The main limitation
is that currently it is only implemented for parallel replication,
@@slave_parallel_threads > 0.
Automatic merge, except for server_audit.cc that had to be modified slightly
Changes to xtradb and innobase were ignored as these made no sense for 10.0
When an rpl_group_info object was returned from the free list, the
rgi->deferred_events_collecting and rgi->deferred_events were not correctly
re-initialized. Additionally, the rgi->deferred_events was incorrectly freed in
free_rgi(), which causes unnecessary malloc/free (or crash when re-init is not
done).
Thanks to user nanyi607rao, who reported this bug on maria-developers@.
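The free-list pattern involved can be sketched like this. It is a simplified model with hypothetical names, not the rpl_group_info code: the per-use state is reset whenever an object is handed out, while the buffer it owns stays allocated across reuse.

```c
#include <stdlib.h>

/* Hypothetical pooled object with per-use state and a reusable buffer. */
struct group_info
{
  struct group_info *next_free;
  int deferred_collecting;   /* per-use flag, must start as 0 */
  size_t deferred_count;     /* per-use counter, must start as 0 */
  void *deferred_buf;        /* kept across reuse to avoid malloc/free */
};

struct group_pool
{
  struct group_info *free_list;
};

/* Hand out an object, from the free list if possible. The per-use
   fields are reset on every path, not only for fresh allocations. */
struct group_info *pool_get(struct group_pool *p)
{
  struct group_info *g = p->free_list;
  if (g)
    p->free_list = g->next_free;
  else
  {
    g = calloc(1, sizeof *g);
    if (!g)
      return NULL;
  }
  g->next_free = NULL;
  g->deferred_collecting = 0;
  g->deferred_count = 0;
  return g;
}

/* Return an object to the pool without freeing its buffer. */
void pool_put(struct group_pool *p, struct group_info *g)
{
  g->next_free = p->free_list;
  p->free_list = g;
}
```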
The problem was when a GTID event was part of a group commit, and so contained
a commit id. The code that replaces GTID with a BEGIN event for old slaves did
not correctly handle this case.
Fix the code so that the GTID with commit id can also be properly replaced
with a BEGIN query event. The extra two bytes are in the BEGIN event replaced
with a dummy, empty time zone string.
Older masters have no GTID events, so such events are not available for
deciding on scheduling of event groups and so on.
With this patch, we run such events from old masters single-threaded, in the
SQL driver thread.
This seems better than trying to make the parallel code handle the data from
older masters; while possible, this would require a lot of testing (as well as
possibly some extra overhead in the scheduling of events), which hardly seems
worthwhile.
Fix ha_table_exists() to take discovery into account correctly.
It must be able to discover both table existence (when no frm is
found) and table non-existence (when frm was found).
Clean up and improve the parallel implementation code, mainly related to
scheduling of work to threads and handling of stop and errors.
Fix a lot of bugs in various corner cases that could lead to crashes or
corruption.
Fix that a single replication domain could easily grab all worker threads and
stall all other domains; now a configuration variable
--slave-domain-parallel-threads allows limiting the number of
workers.
Allow next event group to start as soon as previous group begins the commit
phase (as opposed to when it ends it); this allows multiple event groups on
the slave to participate in group commit, even when no other opportunities for
parallelism are available.
Various fixes:
- Fix some races in the rpl.rpl_parallel test case.
- Fix an old incorrect assertion in Log_event iocache read.
- Fix repeated malloc/free of wait_for_commit and rpl_group_info objects.
- Simplify wait_for_commit wakeup logic.
- Fix one case in queue_for_group_commit() where killing one thread would
fail to correctly signal the error to the next, causing loss of the
transaction after slave restart.
- Fix leaking of pthreads (and their allocated stack) due to missing
PTHREAD_CREATE_DETACHED attribute.
- Fix how one batch of group-committed transactions wait for the previous
batch before starting to execute themselves. The old code had a very
complex scheduling where the first transaction was handled differently,
with subtle bugs in corner cases. Now each event group is always scheduled
for a new worker (in a round-robin fashion amongst available workers).
Keep a count of how many transactions have started to commit, and wait for
that counter to reach the appropriate value.
- Fix slave stop to wait for all workers to actually complete processing;
before, the wait was for update of last_committed_sub_id, which happens a
bit earlier, and could leave worker threads potentially accessing bits of
the replication state that is no longer valid after slave stop.
- Fix a couple of places where the test suite would kill a thread waiting
inside enter_cond() in connection with debug_sync; debug_sync + kill can
crash in rare cases due to a race with mysys_var_current_mutex in this
case.
- Fix some corner cases where we had enter_cond() but no exit_cond().
- Fix that we could get failure in wait_for_prior_commit() but forget to flag
the error with my_error().
- Fix slave stop (both for normal stop and stop due to error). Now, at stop
we pick a specific safe point (in terms of event groups executed) and make
sure that all event groups before that point are executed to completion,
and that no event group after it starts executing; this ensures a safe place to
restart replication, even for non-transactional stuff/DDL. In error stop,
make sure that all prior event groups are allowed to execute to completion,
and that any later event groups that have started are rolled back, if
possible. The old code could leave e.g. T1 and T3 committed but T2 not, or
it could even leave half a transaction not rolled back in some random
worker, which would cause big problems when that worker was later reused
after slave restart.
- Fix the accounting of amount of events queued for one worker. Before, the
amount was reduced immediately as soon as the events were dequeued (which
happens all at once); this allowed twice the amount of events to be queued
in memory for each single worker, which is not what users would expect.
- Fix that an error set during execution of one event was sometimes not
cleared before executing the next, causing problems with the error
reporting.
- Fix incorrect handling of thd->killed in worker threads.
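As an illustration of the queue accounting item in the list above, here is a minimal single-threaded sketch. The names and the byte-based limit are assumptions, and the locking a real worker queue needs is omitted. The point is that the counter is reduced when an event has been processed, not when it is dequeued.

```c
#include <stddef.h>

/* Hypothetical per-worker accounting of queued event data. */
struct worker_queue
{
  size_t queued_bytes;  /* events handed to the worker and not yet finished */
  size_t max_bytes;     /* configured limit for this worker */
};

/* May another event of `len` bytes be queued for this worker? */
int queue_can_accept(const struct worker_queue *q, size_t len)
{
  return q->queued_bytes + len <= q->max_bytes;
}

/* Called when an event is queued for the worker. */
void queue_enqueue(struct worker_queue *q, size_t len)
{
  q->queued_bytes += len;
}

/* Called after the worker has finished executing an event. Dequeuing
   alone does not call this, so events the worker has taken off the
   queue but not yet executed still count against the limit. */
void queue_event_done(struct worker_queue *q, size_t len)
{
  q->queued_bytes -= len;
}
```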
- Change the default flag value to ON.
- Update the testcases to be run with extended_keys=ON:
= trivial test result updates
= If extended_keys setting makes a difference for a testcase, run the testcase
with extended_keys=off. There were only a few such cases
- Update to vcol_select_innodb looks like a worse plan but it will be gone in 10.0.
As part of the fix we no longer generate a create table statement when doing a
CREATE TABLE IF NOT EXISTS table_that_exist LIKE temporary_table
if the 'table_that_exist' existed.
This is because it's not self-evident if we should generate a create statement
matching the existing table or the temporary_table.
The old code generated a table like the existing table in row based replication and like the temporary table
in statement based replication.
It's better to ensure that both cases work the same way.
mysql-test/suite/rpl/r/rpl_row_create_table.result:
Updated results
(We no longer generate CREATE TABLE IF NOT EXISTS ... LIKE if the table existed)
sql/sql_base.cc:
More DBUG_PRINT
sql/sql_error.cc:
More DBUG_PRINT
sql/sql_table.cc:
Don't generate a create table statement when doing a
CREATE TABLE IF NOT EXISTS table_that_exist LIKE temporary_table if the table existed.
revision-id: monty@askmonty.org-20140211120313-z158i1sdlxxeotgl
committer: Michael Widenius <monty@askmonty.org>
message:
Enable rpl_row_create_table (no reason to keep this disabled anymore)
Still fails (in --ps), no reason to enable it if it is not fixed.