These self references were previously used to avoid having to check the
IO_CACHE's type. However, a benchmark shows that on a stock x86 5930K,
the type comparison is marginally faster than the double pointer dereference:
for 40 billion my_b_tell calls, the difference is 0.1 seconds in favor of
the type check. (Basically there is no measurable difference.)
To prevent bugs caused by copying the structure with the assignment (=)
operator, and to avoid having to do the bookkeeping manually, remove these
"convenience" variables.
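For illustration, a minimal sketch of the two variants being compared
(simplified struct and field names modeled on IO_CACHE; not the actual
my_sys.h definitions):

  /* Simplified illustration: not the real my_sys.h declarations. */
  typedef unsigned long long my_off_t;
  enum cache_type { TYPE_NOT_SET, READ_CACHE, WRITE_CACHE };

  typedef struct io_cache_sketch
  {
    my_off_t pos_in_file;         /* file offset of start of buffer    */
    unsigned char *request_pos;   /* start of the in-memory buffer     */
    unsigned char *read_pos;      /* next byte to read (READ_CACHE)    */
    unsigned char *write_pos;     /* next byte to write (WRITE_CACHE)  */
    unsigned char **current_pos;  /* the removed "convenience" pointer */
    enum cache_type type;
  } IO_CACHE_SKETCH;

  /* Old approach: double pointer dereference, no type check needed,
     but current_pos must be kept up to date (and copied correctly). */
  static my_off_t tell_via_self_reference(IO_CACHE_SKETCH *info)
  {
    return info->pos_in_file + (*info->current_pos - info->request_pos);
  }

  /* New approach: check the cache type instead; the benchmark above
     showed no measurable difference between the two.                 */
  static my_off_t tell_via_type_check(IO_CACHE_SKETCH *info)
  {
    unsigned char *pos= info->type == WRITE_CACHE ? info->write_pos
                                                  : info->read_pos;
    return info->pos_in_file + (pos - info->request_pos);
  }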
The reason for this is that stop slave takes LOCK_active_mi over the
whole operation, while some slave operations also need LOCK_active_mi,
which causes deadlocks.
Fixed by introducing object counting for Master_info and not taking
LOCK_active_mi over stop slave or even stop_all_slaves()
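A minimal sketch of the object-counting idea, with hypothetical names
(the real Master_info / get_master_info() code differs in many details):

  /* Hypothetical illustration of reference counting a shared object so
     that callers never need to hold a global lock (LOCK_active_mi) for
     the whole operation.                                               */
  #include <pthread.h>
  #include <stdlib.h>

  typedef struct master_info_sketch
  {
    unsigned int users;            /* how many threads currently use it */
    int killed;                    /* marked for removal                */
    pthread_mutex_t lock;
  } MASTER_INFO_SKETCH;

  /* Look up under the global lock, pin the object, drop the global lock. */
  static MASTER_INFO_SKETCH *
  get_master_info_sketch(MASTER_INFO_SKETCH *mi, pthread_mutex_t *global_lock)
  {
    pthread_mutex_lock(global_lock);        /* lookup would happen here */
    pthread_mutex_lock(&mi->lock);
    mi->users++;                            /* object cannot go away now */
    pthread_mutex_unlock(&mi->lock);
    pthread_mutex_unlock(global_lock);
    return mi;
  }

  /* Drop the pin; the last user frees an object marked for removal. */
  static void release_master_info_sketch(MASTER_INFO_SKETCH *mi)
  {
    int free_it;
    pthread_mutex_lock(&mi->lock);
    free_it= (--mi->users == 0 && mi->killed);
    pthread_mutex_unlock(&mi->lock);
    if (free_it)
    {
      pthread_mutex_destroy(&mi->lock);
      free(mi);
    }
  }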
Other benefits of this approach:
- Multiple threads can run SHOW SLAVE STATUS at the same time
- START/STOP/RESET/SLAVE STATUS on a slave will not block other slaves
- Simpler interface for handling get_master_info()
- Added some missing unlocks of 'log_lock' in error conditions
- Moved rpl_parallel_inactivate_pool(&global_rpl_thread_pool) to end
of stop_slave() to not have to use LOCK_active_mi inside
terminate_slave_threads()
- Changed argument for remove_master_info() to Master_info, as we always
have this available
- Fixed core dump when doing FLUSH TABLES WITH READ LOCK and parallel
replication. Problem was that waiting for pause_for_ftwrl was not done
when deleting rpt->current_owner after a force_abort.
The bug was caused by several issues.
Two problems in seek_io_cache. Due to wrong offsets, we would end up
seeking much too far (first change), or past the intended seek point
(second change). Fixing it requires correctly detecting the available data
in the buffer (first change), and not using "IO_SIZE aligned" reads. The
second is needed because _my_b_cache_read adjusts pos_in_file itself
based on read_pos and read_end. Pretending the buffer is empty when we want
to force a read alleviates this problem.
Secondly, the big-table cursors didn't respect the interface definition
of always returning the row number that Table_read_cursor::fetch() would activate.
At the same time, next(), prev() and move_to() should not perform any
row activation.
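A rough sketch of the "data available in the buffer" detection described
above (simplified field names; the real seek_io_cache handles more cases):

  #include <stddef.h>

  typedef unsigned long long my_off_t;

  typedef struct read_cache_sketch
  {
    my_off_t pos_in_file;        /* file offset of buffer[0]        */
    unsigned char *buffer;       /* start of the in-memory buffer   */
    unsigned char *read_pos;     /* next unread byte                */
    unsigned char *read_end;     /* end of valid data in the buffer */
  } READ_CACHE_SKETCH;

  /* Decide whether the seek target lies inside the bytes already read
     into the buffer; if so just move read_pos, otherwise pretend the
     buffer is empty so the next read refills it from the target offset
     instead of doing an IO_SIZE-aligned read here and second-guessing
     how the read function will adjust pos_in_file.                    */
  static void seek_sketch(READ_CACHE_SKETCH *info, my_off_t pos)
  {
    size_t avail= (size_t) (info->read_end - info->buffer);

    if (pos >= info->pos_in_file && pos < info->pos_in_file + avail)
    {
      /* Target is inside the buffered data: just move read_pos. */
      info->read_pos= info->buffer + (pos - info->pos_in_file);
    }
    else
    {
      /* Force a refill from exactly `pos` on the next read. */
      info->pos_in_file= pos;
      info->read_pos= info->read_end= info->buffer;
    }
  }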
Update info->write_end and info->write_pos together, with no
"return on error" in between; otherwise write_end might end up being
smaller than write_pos.
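Schematically (illustrative code, not the actual function):

  typedef struct write_cache_sketch
  {
    unsigned char *write_pos;   /* next byte to write      */
    unsigned char *write_end;   /* end of the write window */
  } WRITE_CACHE_SKETCH;

  /* Risky ordering: an early return between the two assignments can
     leave write_end smaller than write_pos.                         */
  static int risky_update(WRITE_CACHE_SKETCH *info, unsigned char *new_pos,
                          unsigned char *new_end, int some_error)
  {
    info->write_end= new_end;      /* window shrunk...                  */
    if (some_error)
      return 1;                    /* ...but write_pos not adjusted yet */
    info->write_pos= new_pos;
    return 0;
  }

  /* Safe ordering: update both members together, then handle errors. */
  static int safe_update(WRITE_CACHE_SKETCH *info, unsigned char *new_pos,
                         unsigned char *new_end, int some_error)
  {
    info->write_pos= new_pos;
    info->write_end= new_end;
    return some_error ? 1 : 0;
  }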
Add support for multiple IO_CACHEs with type=READ_CACHE that share
the file they are reading from.
Each IO_CACHE keeps its own in-memory buffer. When doing a read or seek
operation on the file, it notifies other IO_CACHEs that the file position
has been changed.
Make Rowid_seq_cursor use cloned IO_CACHE when reading filesort result.
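A sketch of the sharing idea: each cache keeps its own buffer and, after it
moves the underlying file position, flags the other caches on the same file
so they re-seek before their next read (field and function names here are
illustrative):

  typedef unsigned long long my_off_t;

  typedef struct shared_read_cache_sketch
  {
    int file;                                     /* shared fd            */
    my_off_t pos_in_file;                         /* own logical position */
    unsigned char *buffer, *read_pos, *read_end;  /* own buffer           */
    int seek_not_done;                            /* must lseek before IO */
    struct shared_read_cache_sketch *next_file_user;  /* ring of caches   */
  } SHARED_READ_CACHE_SKETCH;

  /* After this cache reads or seeks, the OS file offset no longer
     matches what the other caches last saw, so flag them.          */
  static void notify_other_users(SHARED_READ_CACHE_SKETCH *info)
  {
    SHARED_READ_CACHE_SKETCH *c;
    for (c= info->next_file_user; c != info; c= c->next_file_user)
      c->seek_not_done= 1;
  }

  /* Before reading, a cache with seek_not_done set must first lseek()
     the shared file back to its own pos_in_file.                     */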
Instead of encrypt(src, dst, key, iv), which encrypts all
data in one go, we now have encrypt_init(key,iv),
encrypt_update(src,dst), and encrypt_finish(dst).
This also causes collateral changes in the internal my_crypt.cc
encryption functions and in the encryption service.
There are wrappers to provide the old all-at-once encryption
functionality. But binlog events are often written piecewise;
they'll need the new API.
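Schematically, the calling pattern changes from one-shot to streaming,
roughly like this (hypothetical prototypes following the names above, not
the exact service API):

  #include <stddef.h>

  /* Hypothetical prototypes for the init/update/finish split. */
  typedef struct enc_ctx enc_ctx;
  enc_ctx *encrypt_init(const unsigned char *key, size_t key_len,
                        const unsigned char *iv, size_t iv_len);
  int encrypt_update(enc_ctx *ctx, const unsigned char *src, size_t slen,
                     unsigned char *dst, size_t *dlen);
  int encrypt_finish(enc_ctx *ctx, unsigned char *dst, size_t *dlen);

  /* Encrypting one binlog event written piecewise: header first, body
     later, then a final call to flush the last cipher block/padding.  */
  static int write_event_encrypted(enc_ctx *ctx,
                                   const unsigned char *header, size_t hlen,
                                   const unsigned char *body, size_t blen,
                                   unsigned char *out, size_t *olen)
  {
    size_t n;
    if (encrypt_update(ctx, header, hlen, out, &n))        /* piece 1 */
      return 1;
    *olen= n;
    if (encrypt_update(ctx, body, blen, out + *olen, &n))  /* piece 2 */
      return 1;
    *olen+= n;
    if (encrypt_finish(ctx, out + *olen, &n))          /* final block */
      return 1;
    *olen+= n;
    return 0;
  }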
* remove unused (and not implemented) WRITE_NET type
* remove cast in my_b_write() macro. my_b_* macros are
function-like; casts are the responsibility of the caller
* replace hackish _my_b_write(info,0,0) with the explicit
my_b_flush_io_cache() in my_b_write_byte()
* remove unused my_b_fill_cache()
* replace pbool -> my_bool
* make internal IO_CACHE functions static
* reformat comments, correct typos, remove obsolete comments (ISAM)
* assert valid cache type in init_functions()
* use IO_ROUND_DN() macro where appropriate
* remove unused DBUG_EXECUTE_IF in _my_b_cache_write()
* remove unnecessary __attribute__((unused))
* fix goto error in parse_file.cc
* remove redundant reinit_io_cache() in uniques.cc
* don't do reinit_io_cache() if the cache was not initialized
in ma_check.c
* extract duplicate functionality from various _my_b_*_read
functions into a common wrapper. Same for _my_b_*_write
* create _my_b_cache_write_r instead of having if's in
_my_b_cache_write (similar to existing _my_b_cache_read and
_my_b_cache_read_r)
* don't call mysql_file_write() from my_b_flush_io_cache(),
call info->write_function() instead
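The last point can be sketched like this: the flush path goes through a
per-cache write function pointer chosen at init time, so the
SEQ_READ_APPEND variant just installs a different function (illustrative
types and names, not the real mf_iocache.c code):

  #include <stddef.h>

  typedef struct io_cache_sketch IO_CACHE_SKETCH;

  struct io_cache_sketch
  {
    unsigned char *write_buffer, *write_pos;
    /* Chosen once at init time based on the cache type. */
    int (*write_function)(IO_CACHE_SKETCH *, const unsigned char *, size_t);
  };

  static int cache_write(IO_CACHE_SKETCH *info,
                         const unsigned char *buf, size_t count)
  {
    /* plain WRITE_CACHE: write straight to the file */
    (void) info; (void) buf; (void) count;
    return 0;
  }

  static int cache_write_r(IO_CACHE_SKETCH *info,
                           const unsigned char *buf, size_t count)
  {
    /* SEQ_READ_APPEND style: take the append lock, then write */
    (void) info; (void) buf; (void) count;
    return 0;
  }

  static void init_functions_sketch(IO_CACHE_SKETCH *info, int shared_append)
  {
    info->write_function= shared_append ? cache_write_r : cache_write;
  }

  /* Flush no longer calls the file-write primitive directly; it lets
     the per-type write_function decide how the bytes reach the file. */
  static int flush_sketch(IO_CACHE_SKETCH *info)
  {
    size_t used= (size_t) (info->write_pos - info->write_buffer);
    int res= info->write_function(info, info->write_buffer, used);
    info->write_pos= info->write_buffer;
    return res;
  }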
remove some 14-year-old code that added support for
LOAD DATA replication to IO_CACHE:
* three callbacks, of which only two were actually used; they
were only needed for LOAD DATA replication but were
tested in every IO_CACHE instance
* an additional opaque void * argument in IO_CACHE, also only
used for LOAD DATA replication, but present everywhere
* the code to close IO_CACHE prematurely in LOAD DATA to have
these callbacks called in the correct order and a long
comment explaining what will happen if IO_CACHE is not
closed prematurely
* a variable to track whether IO_CACHE was closed prematurely
(to avoid double-closing it)
Clean up and improve the parallel implementation code, mainly related to
scheduling of work to threads and handling of stop and errors.
Fix a lot of bugs in various corner cases that could lead to crashes or
corruption.
Fix that a single replication domain could easily grab all worker threads and
stall all other domains; now a configuration variable
--slave-domain-parallel-threads allows limiting the number of
workers.
Allow next event group to start as soon as previous group begins the commit
phase (as opposed to when it ends it); this allows multiple event groups on
the slave to participate in group commit, even when no other opportunities for
parallelism are available.
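A minimal sketch of this "wait until the previous group has started to
commit" idea, using a counter and a condition variable (hypothetical names;
the real scheduler in rpl_parallel.cc is considerably more involved):

  #include <pthread.h>
  #include <stdint.h>

  /* Shared per-domain state: how many event groups have entered the
     commit phase so far.                                             */
  typedef struct domain_sketch
  {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    uint64_t count_committing;      /* groups that reached commit */
  } DOMAIN_SKETCH;

  /* A worker waits until the required number of earlier groups have at
     least *started* to commit; it does not wait for them to finish,
     which is what lets consecutive groups share one group commit.     */
  static void wait_for_prior_commit_started(DOMAIN_SKETCH *d,
                                            uint64_t wait_count)
  {
    pthread_mutex_lock(&d->lock);
    while (d->count_committing < wait_count)
      pthread_cond_wait(&d->cond, &d->lock);
    pthread_mutex_unlock(&d->lock);
  }

  /* Called by each worker as soon as its group enters the commit
     phase, waking up the next group(s).                            */
  static void mark_start_commit(DOMAIN_SKETCH *d)
  {
    pthread_mutex_lock(&d->lock);
    d->count_committing++;
    pthread_cond_broadcast(&d->cond);
    pthread_mutex_unlock(&d->lock);
  }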
Various fixes:
- Fix some races in the rpl.rpl_parallel test case.
- Fix an old incorrect assertion in Log_event iocache read.
- Fix repeated malloc/free of wait_for_commit and rpl_group_info objects.
- Simplify wait_for_commit wakeup logic.
- Fix one case in queue_for_group_commit() where killing one thread would
fail to correctly signal the error to the next, causing loss of the
transaction after slave restart.
- Fix leaking of pthreads (and their allocated stack) due to missing
PTHREAD_CREATE_DETACHED attribute.
- Fix how one batch of group-committed transactions wait for the previous
batch before starting to execute themselves. The old code had a very
complex scheduling where the first transaction was handled differently,
with subtle bugs in corner cases. Now each event group is always scheduled
for a new worker (in a round-robin fashion amongst available workers).
Keep a count of how many transactions have started to commit, and wait for
that counter to reach the appropriate value.
- Fix slave stop to wait for all workers to actually complete processing;
before, the wait was for update of last_committed_sub_id, which happens a
bit earlier, and could leave worker threads potentially accessing bits of
the replication state that are no longer valid after slave stop.
- Fix a couple of places where the test suite would kill a thread waiting
inside enter_cond() in connection with debug_sync; debug_sync + kill can
crash in rare cases due to a race with mysys_var_current_mutex in this
case.
- Fix some corner cases where we had enter_cond() but no exit_cond().
- Fix that we could get failure in wait_for_prior_commit() but forget to flag
the error with my_error().
- Fix slave stop (both for normal stop and stop due to error). Now, at stop
we pick a specific safe point (in terms of event groups executed) and make
sure that all event groups before that point are executed to completion,
and that no event group after that point starts executing; this ensures a
safe place to restart replication, even for non-transactional stuff/DDL. In
error stop,
make sure that all prior event groups are allowed to execute to completion,
and that any later event groups that have started are rolled back, if
possible. The old code could leave e.g. T1 and T3 committed but T2 not, or
it could even leave half a transaction not rolled back in some random
worker, which would cause big problems when that worker was later reused
after slave restart.
- Fix the accounting of amount of events queued for one worker. Before, the
amount was reduced immediately as soon as the events were dequeued (which
happens all at once); this allowed twice the amount of events to be queued
in memory for each single worker, which is not what users would expect.
- Fix that an error set during execution of one event was sometimes not
cleared before executing the next, causing problems with the error
reporting.
- Fix incorrect handling of thd->killed in worker threads.
This is a port of the fix for MySQL BUG#17647863.
revno: 5572
revision-id: jon.hauglid@oracle.com-20131030232243-b0pw98oy72uka2sj
committer: Jon Olav Hauglid <jon.hauglid@oracle.com>
timestamp: Thu 2013-10-31 00:22:43 +0100
message:
Bug#17647863: MYSQL DOES NOT COMPILE ON OSX 10.9 GM
Rename test() macro to MY_TEST() to avoid conflict with libc++.
'MAX_BINLOG_CACHE_SIZE' ERROR
Problem:
=======
MySQL returns the following error on win64:
"ERROR 1197 (HY000): Multi-statement transaction required more than
'max_binlog_cache_size' bytes of storage; increase this mysqld variable
and try again" when the user tries to load a >4G file even if
max_binlog_cache_size is set to its maximum value. On Linux everything
works fine.
Analysis:
========
The `max_binlog_cache_size' variable is of type `ulonglong'. This
value is set to `ULONGLONG_MAX' at the time of server start up. The
above value is stored in an intermediate variable named
`saved_max_binlog_cache_size' which is of type `ulong'. In the Visual
C++ compiler the `ulong' type is 4 bytes in size, hence the value
gets truncated to 4GB and the cache is not able to grow beyond
4GB. The same limitation is observed with
"max_binlog_stmt_cache_size" as well; a similar fix has been applied.
Fix:
===
As part of the fix, the type "ulong" is replaced with "my_off_t", which is
a typedef for "ulonglong".
mysys/mf_iocache.c:
Added debug statement to simulate a scenario where the cache
file's current position is set to >4GB
sql/log.cc:
Replaced the type of `saved_max_binlog_cache_size' from "ulong" to
"my_off_t", which is a type def for "ulonglong".
sql/sql_insert.cc:
CREATE ... IF NOT EXISTS may do nothing, but
it is still not a failure. don't forget to my_ok it.
sql/sql_table.cc:
small cleanup
Give more information when finding an error in a MyISAM table.
When killing a system thread, use KILL_SYSTEM_THREAD instead of KILL_CONNECTION to make it easier to ignore the signal in sensitive contexts (like auto-repair)
Added a new kill level, KILL_SERVER, that will in the future be used to signal killed by shutdown.
Add more warnings about killed connections when warning level > 3
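A schematic view of how the graded kill levels are meant to be consumed
(the enum and helper are illustrative, not the exact THD definitions):

  /* Illustrative ordering: the "softer" signals come first so that
     sensitive code can ignore them with a simple comparison.        */
  enum killed_state_sketch
  {
    NOT_KILLED= 0,
    KILL_BAD_DATA,          /* statement-level, must not abort repair */
    KILL_SYSTEM_THREAD,     /* polite request to a system thread      */
    KILL_CONNECTION,        /* classic KILL <id>                      */
    KILL_SERVER             /* future: killed because of shutdown     */
  };

  /* Sensitive operations (e.g. MyISAM auto-repair) keep running on the
     soft signals and only stop for a real connection/server kill.     */
  static int must_abort_sensitive_operation(enum killed_state_sketch k)
  {
    return k >= KILL_CONNECTION;
  }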
include/myisamchk.h:
Added counting of printed info/notes
mysys/mf_iocache.c:
Remove duplicate assignment
sql/handler.cc:
Added test of KILL_SERVER
sql/log.cc:
Ignore new 'kill' error ER_NEW_ABORTING_CONNECTION when requesting query error code.
sql/mysqld.cc:
Add more warnings for killed connections when warning level > 3
sql/scheduler.cc:
Added checks for new kill signals
sql/slave.cc:
Ignore new kill signal ER_NEW_ABORTING_CONNECTION
sql/sp_head.cc:
Fixed assignment to bool
Added testing of new kill signals
sql/sql_base.cc:
Use KILL_SYSTEM_THREAD to auto-kill system threads
sql/sql_class.cc:
Add more warnings for killed connections when warning level > 3
thd_killed() now ignores KILL_BAD_DATA and THD::KILL_SYSTEM_THREAD as these should not abort sensitive operations.
sql/sql_class.h:
Added KILL_SYSTEM_THREAD and KILL_SERVER
sql/sql_connect.cc:
Added handling of KILL_SERVER
sql/sql_insert.cc:
Use KILL_SYSTEM_THREAD to auto-kill system threads
Added handling of KILL_SERVER
sql/sql_parse.cc:
Add more warnings for killed connections when warning level > 3
Added checking that thd->abort_on_warning is reset at end of query.
sql/sql_show.cc:
Update condition for when a query is 'killed'
storage/myisam/ha_myisam.cc:
Added counting of info/notes printed
storage/myisam/mi_check.c:
Always print an error if we find data errors when checking/repairing a MyISAM table.
When a repair was killed, don't retry the repair.
Added an assert if sort_get_next_record() returned an error without an error message.
Removed the nonsense check "if (sort_param->read_cache.error < 0)" in repair.
storage/myisam/myisamchk.c:
Added counting of notes printed
storage/pbxt/src/thread_xt.cc:
Better error message.
Added more printing of errors to myisamchk.
mysys/mf_iocache.c:
Write an error message if a seek failed.
sql/table.cc:
Fixed buffer overflow bug:
- It's not enough to check mysql_version to detect the partition indicator as the version may have been updated by mysql_upgrade.
storage/myisam/ha_myisam.cc:
Don't log same error twice.
Don't reset log_all_errors if it's set
storage/myisam/mi_check.c:
Fixed bug that caused repair() to not report an error if called twice (as when doing a retry)
More printing of errors.
storage/myisam/sort.c:
Set my_errno in case of out of memory errors.
The patch replaces the use of the POSIX I/O interfaces in mysys on Windows with
the Win32 API calls (CreateFile, WriteFile, etc). The Windows HANDLE for the open
file is stored in the my_file_info struct, along with a flag for append mode
(because the Windows API does not support opening files in append mode in all cases)
The default max open files has been increased to 16384 and can be increased further
by setting --max-open-files=<value> at server start.
A noteworthy benefit of this patch is that it removes limits on the table_cache size,
allowing for more simultaneous users.
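A rough sketch of the idea (the Win32 calls are real; the struct and wrapper
are simplified stand-ins for my_file_info and the mysys write wrapper):

  #ifdef _WIN32
  #include <windows.h>

  /* Simplified stand-in for the my_file_info entry: the native HANDLE
     plus a flag remembering that the file was opened for append.      */
  typedef struct win_file_sketch
  {
    HANDLE handle;
    BOOL   append_mode;  /* Win32 has no O_APPEND open flag to rely on */
  } WIN_FILE_SKETCH;

  /* One possible way to emulate append mode: move the file pointer to
     the end before each write when the flag is set.                   */
  static BOOL win_write_sketch(WIN_FILE_SKETCH *f, const void *buf,
                               DWORD count, DWORD *written)
  {
    if (f->append_mode)
    {
      LARGE_INTEGER zero;
      zero.QuadPart= 0;
      if (!SetFilePointerEx(f->handle, zero, NULL, FILE_END))
        return FALSE;
    }
    return WriteFile(f->handle, buf, count, written, NULL);
  }
  #endif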
and 'THREAD_SAFE_CLIENT'.
As of MySQL 5.5, we no longer support non-threaded
builds. This patch removes all references to the
obsolete THREAD and THREAD_SAFE_CLIENT preprocessor
symbols. These were used to distinguish between
threaded and non-threaded builds.
Before this fix, file io for the binary log file was not accounted for
properly, and showed no io at all.
This bug was due to the following issues:
1) file io for the binlog was instrumented:
- sometimes as "wait/io/file/sql/binlog"
- sometimes as "wait/io/file/sql/MYSQL_LOG"
leading to inconsistent event_names.
2) the binlog file itself was using an IO_CACHE,
but the IO_CACHE implementation in mysys/mf_iocache.c was
not instrumented to make performance schema calls to record file io.
3) The "wait/io/file/sql/MYSQL_LOG" instrumentation was used
for several log files, such as:
- the binary log
- the slow log
- the query log
which caused file io in these different log files to be accounted
against the same instrument.
The instrumentation needs to have a finer grain and report io
in different event_names, because each file really serves a different purpose.
With this fix:
- the IO_CACHE implementation is now instrumented
- the "wait/io/file/sql/MYSQL_LOG" instrument has been removed
- binlog io is now always instrumented with "wait/io/file/sql/binlog"
- the slow log is instrumented with a new name, "wait/io/file/sql/slow_log"
- the query log is instrumented with a new name, "wait/io/file/sql/query_log"
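Roughly, giving each log its own event_name means registering separate
performance schema file keys and opening the files through the instrumented
wrappers, along these lines (a sketch of the P_S file API usage pattern,
not the exact server code):

  #include "mysql/psi/mysql_file.h"

  /* One key per log file so io is reported under distinct event_names:
     wait/io/file/sql/binlog, .../slow_log, .../query_log.             */
  static PSI_file_key key_file_binlog, key_file_slow_log, key_file_query_log;

  static PSI_file_info all_log_files[]=
  {
    { &key_file_binlog,    "binlog",    0 },
    { &key_file_slow_log,  "slow_log",  0 },
    { &key_file_query_log, "query_log", 0 }
  };

  static void register_log_file_keys(void)
  {
    /* The "sql" category gives the wait/io/file/sql/... prefix. */
    int count= (int) (sizeof(all_log_files) / sizeof(all_log_files[0]));
    mysql_file_register("sql", all_log_files, count);
  }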
- Changed to still use bcmp() in certain cases because:
  - It is faster than memcmp() for short unaligned strings
  - It is better when using valgrind
- Changed to use my_sprintf() instead of sprintf() to get better portability on old systems
- Changed code to use the MariaDB version of select->skip_record()
- Removed -%::SCCS/s.% from the Makefile.am files to remove automake warnings