recv_reset_logs(): Initialize the redo log buffer, so that no data
from the old redo log can be written to the new redo log.
This bug has very little impact before MariaDB 10.2. The
innodb_log_encrypt option that was introduced in MariaDB 10.1
increases the impact. If the redo log used to be encrypted, and
it is being resized and encryption disabled, then previously
encrypted data could end up being written to the new redo log
in clear text. This resulted in encryption.innodb_encrypt_log
test failures in MariaDB 10.2.
buf_page_print(): Remove the parameter 'flags',
and when a server abort is intended, perform that in the caller.
In this way, page corruption reports due to different reasons
can be distinguished better.
This is non-functional code refactoring that does not fix any
page corruption issues. The change is only made to avoid falsely
grouping together unrelated causes of page corruption.
This is a backport of the following:
MDEV-13009 10.1.24 does not compile on architectures without 64-bit atomics
Add a missing #include "sync0types.h" that was removed in MDEV-12674.
recv_find_max_checkpoint(): Refer to MariaDB 10.2.2 instead of
MySQL 5.7.9. Do not hint that a binary downgrade might be possible,
because there are many changes in InnoDB 5.7 that could make
downgrade impossible: a column appended to SYS_INDEXES, added
SYS_* tables, undo log format changes, and so on.
Assertions failed due to incorrect handling of the --tc-heuristic-recover
option when InnoDB is in read-only mode either due to innodb_read_only=1
or innodb_force_recovery>3. InnoDB failed to refuse an XA COMMIT or
XA ROLLBACK operation, and there were errors in the error handling in
the upper layer.
This was fixed by making InnoDB XA operations respect the
high_level_read_only flag. The InnoDB part of the fix and
parts of the test main.tc_heuristic_recover were provided
by Marko Mäkelä.
LOCK_log mutex lock/unlock had to be added to fix MDEV-13438.
The measure is confirmed by mysql sources as well.
For testing of the conflicting option combination, mysql-test-run is
made to export a new $MYSQLD_LAST_CMD. It holds the very last value
generated by mtr.mysqld_start(). Even though the options have also
always been stored in $mysqld->{'started_opts'}, there was no access
to them beyond the automatic server restart by mtr through the expect
file interface.
Effectively therefore $MYSQLD_LAST_CMD represents a more general
interface to $mysqld->{'started_opts'} which can be used in wider
scopes including server launch with incompatible options.
Notice that another existing method to restart the server with incompatible
options, relying on $MYSQLD_CMD, is not aware of $mysqld->{'started_opts'}
(the actual options with which the server is launched by mtr). In order to
use this method, they would have to be provided manually.
NOTE: When merging to 10.2, the file search_pattern_in_file++.inc
should be replaced with the pre-existing search_pattern_in_file.inc.
The function ibuf_remove_free_page() may be called while the caller
is holding several mutexes or rw-locks. Because of this, this
housekeeping loop may cause performance glitches for operations that
involve tables that are stored in the InnoDB system tablespace.
Also deadlocks might be possible.
The worst impact of all is that due to the mutexes being held, calls to
log_free_check() had to be skipped during this housekeeping.
This means that the cyclic InnoDB redo log may be overwritten.
If the system crashes during this, it would be unable to recover.
The entry point to the problematic code is ibuf_free_excess_pages().
It would make sense to call it before acquiring any mutexes or rw-locks,
in any 'pessimistic' operation that involves the system tablespace.
fseg_create_general(), fseg_alloc_free_page_general(): Do not call
ibuf_free_excess_pages() while potentially holding some latches.
ibuf_remove_free_page(): Do call log_free_check(), like every operation
that is about to generate redo log should do.
ibuf_free_excess_pages(): Remove some assertions that are replaced
by stricter assertions in the log_free_check() that is now called by
ibuf_remove_free_page().
row_ins_sec_index_entry(), row_undo_ins_remove_sec_low(),
row_undo_mod_del_mark_or_remove_sec_low(),
row_undo_mod_del_unmark_sec_and_undo_update(): Call
ibuf_free_excess_pages() if the operation may involve allocating pages
and change buffering in the system tablespace.
When MySQL 5.0.3 introduced InnoDB support for two-phase commit,
it also introduced the questionable logic to roll back XA PREPARE
transactions on startup when innodb_force_recovery is 1 or 2.
Remove this logic in order to avoid unwanted side effects when
innodb_force_recovery is being set for other reasons. That is,
XA PREPARE transactions will always remain in that state until
InnoDB receives an explicit XA ROLLBACK or XA COMMIT request
from the upper layer.
At the time the logic was introduced in MySQL 5.0.3, there already
was a startup parameter that is the preferred way of achieving
the behaviour: --tc-heuristic-recover=ROLLBACK.
Revert the following change, because Memcached is not present
in MariaDB Server. We had better avoid adding dead code.
commit d9bc5e03d788b958ce8c76e157239953db60adb2
Author: Aakanksha Verma <aakanksha.verma@oracle.com>
Date: Thu May 18 14:31:01 2017 +0530
Bug #24605783 MYSQL GOT SIGNAL 6 ASSERTION FAILURE
Following merge from 5.6.36, this merge also rejects changes that
collided with the rejection of 6ca4f693c1ce472e2b1bf7392607c2d1124b4293.
We initially rejected 6ca4f693c1ce472e2b1bf7392607c2d1124b4293 because
it was introducing a new storage engine API method.
The problem was that dict_sys->size tries to track the memory
occupied by the data dictionary table and index objects.
However, at least for table objects, the table->heap size can grow
between the time the table object is inserted into dict_sys and the
time it is removed from dict_sys, making the amount of memory added
to dict_sys->size inconsistent with the amount subtracted from it.
Removed the unnecessary dict_sys->size variable, as it is really
used only for status output.
Introduced the function dict_sys_get_size() to calculate the memory
occupied by the data dictionary table and index objects;
it is then used for the SHOW ENGINE INNODB STATUS output.
dict_table_add_to_cache(),
dict_table_rename_in_cache(),
dict_table_remove_from_cache_low(),
dict_index_remove_from_cache_low(),
Remove size calculation.
srv_printf_innodb_monitor(): Use dict_sys_get_size function to
get dictionary memory allocated.
xtradb_internal_hash_tables_fill_table(): Use dict_sys_get_size
function to get dictionary memory allocated.
end_io_cache() uses uninitialized values from the new_data_cache.
As such, we set the buffer to 0 and check this before calling end_io_cache() on it.
Thanks Sergey Vojtovich for the review and for this solution.
Found by Coverity (ref 972481).
Coverity reports this as:
CID 971840 (#1 of 1): Operands don't affect result (CONSTANT_EXPRESSION_RESULT)
result_independent_of_operands: 4 | (flags & 1) is always true regardless of the values of its operands. This occurs as the logical first operand of "?:".
The C order of precedence gives | higher precedence than ?:. The
indentation implies an | of the 3 terms.
Adjust to the intended meaning.
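A minimal sketch of the precedence problem. The constants and function
names below are made up for illustration and are not taken from the
patched code:

```cpp
// Hypothetical flag values, chosen only for illustration.
constexpr int FLAG_A = 4;
constexpr int FLAG_B = 2;

// Parsed as (FLAG_A | (flags & 1)) ? FLAG_B : 0.
// The condition is always nonzero, so the result is always FLAG_B.
inline int buggy(int flags) {
    return FLAG_A | (flags & 1) ? FLAG_B : 0;
}

// Parentheses make the conditional expression one operand of the |.
inline int fixed(int flags) {
    return FLAG_A | ((flags & 1) ? FLAG_B : 0);
}
```

With these definitions, buggy() never includes FLAG_A in its result,
while fixed() always does.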
log_calc_max_ages(): Use the requested size in the check, instead of
the detected redo log size. The redo log will be resized at startup
if it differs from what has been requested.
in innodb_read_only mode.
The reason for the hang is that there was no notification received about
completed read io. File handles are bound to completion_port, and there
were no background "write" threads that would be waiting on completion_port,
only 2 "read" threads waiting on read_completion_port were active.
The fix is to use a single IO completion port for all IOs, if
innodb_read_only is set.
When the server is started in innodb_read_only mode, there cannot be
any writes to persistent InnoDB/XtraDB files. Just like the creation
of buf_flush_page_cleaner_thread is skipped in this case, also
the creation of the XtraDB-specific buf_flush_lru_manager_thread
should be skipped.
When a slow shutdown is performed soon after spawning some work for
background threads that can create or commit transactions, it is possible
that new transactions are started or committed after the purge has finished.
This is violating the specification of innodb_fast_shutdown=0, namely that
the purge must be completed. (None of the history of the recent transactions
would be purged.)
Also, it is possible that the purge threads would exit in slow shutdown
while there exist active transactions, such as recovered incomplete
transactions that are being rolled back. Thus, the slow shutdown could
fail to purge some undo log that becomes purgeable after the transaction
commit or rollback.
srv_undo_sources: A flag that indicates if undo log can be generated
or persistent data changed, either by background threads or by user SQL.
Even when this flag is clear, active transactions that already exist
in the system may be committed or rolled back.
innodb_shutdown(): Renamed from innobase_shutdown_for_mysql().
Do not return an error code; the operation never fails.
Clear the srv_undo_sources flag, and also ensure that the background
DROP TABLE queue is empty.
srv_purge_should_exit(): Do not allow the purge to exit if
srv_undo_sources are active or the background DROP TABLE queue is not
empty, or in slow shutdown, if any active transactions exist
(and are being rolled back).
srv_purge_coordinator_thread(): Remove some previous workarounds
for this bug.
innobase_start_or_create_for_mysql(): Set buf_page_cleaner_is_active
and srv_dict_stats_thread_active directly. Set srv_undo_sources before
starting the purge subsystem, to prevent immediate shutdown of the purge.
Create dict_stats_thread and fts_optimize_thread immediately
after setting srv_undo_sources, so that shutdown can use this flag to
determine if these subsystems were started.
dict_stats_shutdown(): Shut down dict_stats_thread. Backported from 10.2.
srv_shutdown_table_bg_threads(): Remove (unused).
InnoDB shutdown assumes that once the server has entered
SRV_SHUTDOWN_FLUSH_PHASE, no change to persistent data is allowed.
It was possible for the master thread to wake up while shutdown
is executing in SRV_SHUTDOWN_FLUSH_PHASE or
even in SRV_SHUTDOWN_LAST_PHASE.
We do not yet know if further crashes at shutdown are possible.
Also, we do not know if all the observed crashes could be explained
by the race conditions that we are now fixing.
srv_shutdown_print_master_pending(): Remove a redundant ut_time() call.
srv_shutdown(): Renamed from srv_master_do_shutdown_tasks().
srv_master_thread(): Do not resume after shutdown has been initiated.
This fixes warnings that were emitted when running InnoDB test
suites on a debug server that was compiled with GCC 7.1.0 using
the flags -O3 -fsanitize=undefined.
thd_requested_durability(): XtraDB can call this with trx->mysql_thd=NULL.
Remove the function in InnoDB, because it is not used there.
calc_row_difference(): Do not call memcmp(o_ptr, NULL, 0).
innobase_index_name_is_reserved(): This can be called with
key_info=NULL, num_of_keys=0.
innobase_dropping_foreign(), innobase_check_foreigns_low(),
innobase_check_foreigns(): This can be called with
drop_fk=NULL, n_drop_fk=0.
rec_convert_dtuple_to_rec_comp(): Do not invoke memcpy(end, NULL, 0).
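The memcmp() and memcpy() cases above share one pattern: a null pointer
must not be passed even when the length is 0. A sketch of the guard,
using hypothetical helper names (the actual fixes adjust the callers
in place):

```cpp
#include <cstddef>
#include <cstring>

// Copy len bytes, tolerating src == nullptr when len == 0.
// Passing a null pointer to memcpy() is undefined behaviour even for
// a zero length, which is what -fsanitize=undefined reports.
inline void copy_bytes(void* dst, const void* src, std::size_t len) {
    if (len != 0) {
        std::memcpy(dst, src, len);
    }
}

// Same idea for comparisons: two empty ranges compare equal.
inline int compare_bytes(const void* a, const void* b, std::size_t len) {
    return len == 0 ? 0 : std::memcmp(a, b, len);
}
```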
On 64-bit systems, the constant 1 would be 32-bit (int or unsigned)
by default. Cast the constant to ulint before shifting to avoid a
-fsanitize=undefined warning or any potential overflow.
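A sketch of the cast, assuming a 64-bit build where ulint is 64 bits
wide. The typedef and the function name are stand-ins for this example:

```cpp
#include <cstdint>

// Stand-in for InnoDB's ulint on a 64-bit build (an assumption here).
typedef std::uint64_t ulint;

// A plain 1 << n shifts a 32-bit int, which is undefined once n
// reaches the width of int. Casting first performs the shift in
// the wider type.
inline ulint bit_mask(unsigned n) {
    return static_cast<ulint>(1) << n;
}
```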
Fix a -fsanitize=undefined warning that trx_undo_report_row_operation()
was being passed thr=NULL when the BTR_NO_UNDO_LOG_FLAG flag was set.
trx_undo_report_row_operation(): Remove the first two parameters.
The parameter clust_entry!=NULL distinguishes inserts from updates.
This should be a non-functional change (no observable change in
behaviour; slightly smaller code).
Allocate srv_sys statically so that the desired alignment can be
guaranteed. This silences -fsanitize=undefined warnings.
There probably is no performance impact of this, because the
reason for the alignment is to ensure the absence of false sharing
between counters. Even with the misalignment, each counter would
have been aligned at 64 bits, and the counters would reside
in separate cache lines.
The parameter thr of the function btr_cur_optimistic_insert()
is not declared as nonnull, but GCC 7.1.0 with -O3 is wrongly
optimizing away the first part of the condition
UNIV_UNLIKELY(thr && thr_get_trx(thr)->fake_changes)
when the function is being called by row_merge_insert_index_tuples()
with thr==NULL.
The fake_changes is an XtraDB addition. This GCC bug only appears
to have an impact on XtraDB, not InnoDB.
We work around the problem by not attempting to dereference thr
when both BTR_NO_LOCKING_FLAG and BTR_NO_UNDO_LOG_FLAG are set
in the flags. Probably BTR_NO_LOCKING_FLAG alone should suffice.
btr_cur_optimistic_insert(), btr_cur_pessimistic_insert(),
btr_cur_pessimistic_update(): Correct comments that disagree with
usage and with nonnull attributes. No other parameter than thr can
actually be NULL.
row_ins_duplicate_error_in_clust(): Remove an unused parameter.
innobase_is_fake_change(): Unused function; remove.
ibuf_insert_low(), row_log_table_apply(), row_log_apply(),
row_undo_mod_clust_low():
Because we will be passing BTR_NO_LOCKING_FLAG | BTR_NO_UNDO_LOG_FLAG
in the flags, the trx->fake_changes flag will be treated as false,
which is the right thing to do at these low-level operations
(change buffer merge, ALTER TABLE…LOCK=NONE, or ROLLBACK).
This might be fixing actual XtraDB bugs.
Other callers that pass these two flags are also passing thr=NULL,
implying fake_changes=false. (Some callers in ROLLBACK are passing
BTR_NO_LOCKING_FLAG and a nonnull thr. In these callers, fake_changes
better be false, to avoid corruption.)
This merge reverts commit 6ca4f693c1ce472e2b1bf7392607c2d1124b4293
from current 5.6.36 innodb.
Bug #23481444 OPTIMISER CALL ROW_SEARCH_MVCC() AND READ THE
INDEX APPLIED BY UNCOMMITTED ROW
Problem:
========
row_search_for_mysql() does whole table traversal for range query
even though the end range is passed. Whole table traversal happens
when the record is not within the transaction read view.
Solution:
=========
Convert the innodb last record of page to mysql format and compare
with end range if the traversal of row_search_mvcc() exceeds 100,
no ICP involved. If it is out of range then InnoDB can avoid the
whole table traversal. Need to refactor the code a little bit to
make it compile.
Reviewed-by: Jimmy Yang <jimmy.yang@oracle.com>
Reviewed-by: Knut Hatlen <knut.hatlen@oracle.com>
Reviewed-by: Dmitry Shulga <dmitry.shulga@oracle.com>
RB: 14660
The macro UT_LIST_INIT() zero-initializes the UT_LIST_NODE.
There is no need to call this macro on a buffer that has
already been zero-initialized by mem_zalloc() or mem_heap_zalloc()
or similar.
For some reason, the statement UT_LIST_INIT(srv_sys->tasks) in
srv_init() caused a SIGSEGV on server startup when compiling with
GCC 7.1.0 for AMD64 using -O3. The zero-initialization was attempted
by the instruction movaps %xmm0,0x50(%rax), while the proper offset
of srv_sys->tasks would seem to have been 0x48.
Do not silence uncertain cases, or fix any bugs.
The only functional change should be that ha_federated::extra()
is not calling DBUG_PRINT to report an unhandled case for
HA_EXTRA_PREPARE_FOR_DROP.
bunch of bugs when external_lock() fails on unlock:
* mi_lock_database() used mi_mark_crashed() under share->intern_lock,
but mi_mark_crashed() itself locks this mutex.
* handler::close() required table to be unlocked, but failed
external_lock didn't count as unlock
* mysql_unlock_tables() ignored all unlock errors, but they still set
the error status in stmt_da.
Given the OK macro used in innodb does a DBUG_RETURN(1) on expression failure
the innodb implementation has a number of errors in i_s.cc.
We introduce a new macro BREAK_IF that replaces some use of the OK macro.
Also, do some other cleanup detailed below.
When invoking Field::store() on integers, always pass the parameter
is_unsigned=true to avoid an unnecessary conversion to double.
i_s_fts_deleted_generic_fill(), i_s_fts_config_fill():
Use the BREAK_IF macro instead of OK.
i_s_fts_index_cache_fill_one_index(), i_s_fts_index_table_fill_one_index():
Add a parameter for conv_string, and let the caller allocate that buffer.
i_s_fts_index_cache_fill(): Check the return status of
i_s_fts_index_cache_fill_one_index().
i_s_fts_index_table_fill(): Check the return status of
i_s_fts_index_table_fill_one_index().
i_s_fts_index_table_fill_one_fetch(): Always let the caller invoke
i_s_fts_index_table_free_one_fetch().
i_s_innodb_buffer_page_fill(), i_s_innodb_buf_page_lru_fill():
Do release dict_sys->mutex if filling the buffers fails.
i_s_innodb_buf_page_lru_fill(): Also display the value
INFORMATION_SCHEMA.INNODB_BUFFER_PAGE.PAGE_IO_FIX='IO_PIN'
when a block is in that state. Remove the unnecessary variable 'heap'.
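A sketch of why an early return from the OK macro is a problem while a
mutex is held, and how BREAK_IF avoids it. The macro definitions, the
Mutex type and store_row() are assumptions made for this example; the
real definitions in i_s.cc may differ:

```cpp
// Assumed definitions for this sketch only.
#define OK(expr)       if ((expr) != 0) return 1
#define BREAK_IF(expr) if ((expr)) break

// Toy mutex that only records its state.
struct Mutex {
    bool locked = false;
    void lock() { locked = true; }
    void unlock() { locked = false; }
};

// Hypothetical fill step that fails on the row with index fail_at.
inline int store_row(int row, int fail_at) {
    return row == fail_at ? 1 : 0;
}

// Early return via OK(): the mutex stays locked on failure.
inline int fill_with_ok(Mutex& m, int rows, int fail_at) {
    m.lock();
    for (int i = 0; i < rows; i++) {
        OK(store_row(i, fail_at));
    }
    m.unlock();
    return 0;
}

// BREAK_IF(): leave the loop and fall through to the cleanup.
inline int fill_with_break(Mutex& m, int rows, int fail_at) {
    int status = 0;
    m.lock();
    for (int i = 0; i < rows; i++) {
        BREAK_IF(status = store_row(i, fail_at));
    }
    m.unlock();
    return status;
}
```

Both variants report the failure, but only the second one releases
the mutex before returning.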
simple_counter::add(): Add a type cast to the os_atomic_increment_ulint()
call, because GCC would check the type compatibility even when the code
branch is not being instantiated (atomic=false). On Solaris,
os_atomic_increment_ulint() actually needs a compatible parameter type,
and an error would be emitted due to an incompatible 64-bit type,
for srv_stats.n_lock_wait_time.add(diff_time).
There is a race condition related to the variable
srv_stats.n_lock_wait_current_count, which is only
incremented and decremented by the function lock_wait_suspend_thread().
The incrementing is protected by lock_sys->wait_mutex, but the
decrementing does not appear to be protected by anything.
This mismatch could allow the counter to be corrupted when a
transactional InnoDB table or record lock wait is terminating
roughly at the same time with the start of a wait on a
(possibly different) lock.
ib_counter_t: Remove some unused methods. Prevent instantiation for N=1.
Add an inc() method that takes a slot index as a parameter.
single_indexer_t: Remove.
simple_counter<typename Type, bool atomic=false>: A new counter wrapper.
Optionally use atomic memory operations for modifying the counter.
Aligned to the cache line size.
lsn_ctr_1_t, ulint_ctr_1_t, int64_ctr_1_t: Define as simple_counter<Type>.
These counters are either only incremented (and we do not care about
losing some increment operations), or the increment/decrement operations
are protected by some mutex.
srv_stats_t::os_log_pending_writes: Document that the number is protected
by log_sys->mutex.
srv_stats_t::n_lock_wait_current_count: Use simple_counter<ulint, true>,
that is, atomic inc() and dec() operations.
lock_wait_suspend_thread(): Release the mutexes before incrementing
the counters. Avoid acquiring the lock mutex if the lock wait has
already been resolved. Atomically increment and decrement
srv_stats.n_lock_wait_current_count.
row_insert_for_mysql(), row_update_for_mysql(),
row_update_cascade_for_mysql(): Use the inc() method with the trx->id
as the slot index. This is a non-functional change, just using
inc() instead of add(1).
buf_LRU_get_free_block(): Replace the method add(index, n) with inc().
There is no slot index in the simple_counter.
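A rough sketch of what such a counter wrapper could look like. This is
written with std::atomic for brevity and is not the actual
implementation, which is described above as using
os_atomic_increment_ulint(). The cache line size of 64 is an assumption:

```cpp
#include <atomic>
#include <cstddef>
#include <type_traits>

// Assumed cache line size for this sketch.
constexpr std::size_t CACHE_LINE = 64;

template <typename Type, bool atomic = false>
struct alignas(CACHE_LINE) simple_counter {
    // std::atomic<Type> when atomic=true, a plain Type otherwise.
    typename std::conditional<atomic, std::atomic<Type>, Type>::type value{};

    // Each operation returns the new value of the counter.
    Type add(Type n) { return value += n; }
    Type inc() { return add(1); }
    Type dec() { return value -= 1; }

    // Read the current value.
    operator Type() const { return value; }
};
```

A counter that is both incremented and decremented without a common
mutex, such as n_lock_wait_current_count, would be declared as
simple_counter<Type, true> in this scheme.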
SYMLINK CHECK RACE CONDITIONS
ANALYSIS:
=========
A potential defect exists in the handling of CREATE
TABLE .. DATA DIRECTORY/ INDEX DIRECTORY which gives way to
the user to gain access to another user table or a system
table.
FIX:
====
The lstat and fstat output of the target files are now
stored which help in determining the identity of the target
files thus preventing the unauthorized access to other
files.
The problem was two race conditions in the Aria page cache:
- find_block() didn't inform free_block() that it had released requests
- free_block() didn't handle pinned blocks, which could happen if
free_block() was called as part of flush. This is fixed by not freeing
blocks that are pinned. This is safe as when maria_close() is called
when last thread is using a table, there can be no pinned blocks. For
other flush calls it's safe to ignore pinned blocks.
- Subfolder Option: SELECT Query Never Ends
modified: storage/connect/tabmul.cpp
modified: storage/connect/tabmul.h
Work on MDEV-12667 Crash when using JSON tables
modified: storage/connect/connect.cc
modified: storage/connect/ha_connect.cc
modified: storage/connect/ha_connect.h
modified: storage/connect/plgdbutl.cpp
Change Base offset for DIR tables on Linux
modified: storage/connect/reldef.cpp
This is a reduced version of an originally much larger patch.
We will keep the definition of the ulint, lint data types unchanged,
and we will not be replacing fprintf() calls with ib_logf().
On Windows, use the standard format strings instead of nonstandard
extensions.
This patch fixes some errors in format strings.
Most notably, an IMPORT TABLESPACE error message in InnoDB was
displaying the number of columns instead of the mismatching flags.
Allow 64-bit atomic operations on 32-bit systems,
only relying on HAVE_ATOMIC_BUILTINS_64, disregarding
the width of the register file.
Define UNIV_WORD_SIZE correctly on all systems, including Windows.
In MariaDB 10.0 and 10.1, it was incorrectly defined as 4 on
64-bit Windows.
Define HAVE_ATOMIC_BUILTINS_64 on Windows
(64-bit atomics are available on both 32-bit and 64-bit Windows
platforms; the operations were unnecessarily disabled even on
64-bit Windows).
MONITOR_OS_PENDING_READS, MONITOR_OS_PENDING_WRITES: Enable by default.
os_file_n_pending_preads, os_file_n_pending_pwrites,
os_n_pending_reads, os_n_pending_writes: Remove.
Use the monitor counters instead.
os_file_count_mutex: Remove. On a system that does not support
64-bit atomics, monitor_mutex will be used instead.
table (ODBC, JDBC, MYSQL) with a WHERE clause on an indexed column.
Also fix a bug in TDBEXT::MakeCommand (use of uninitialised Quote)
Add in this function the eventual Schema (database) prefixing.
modified: storage/connect/connect.cc
modified: storage/connect/tabext.cpp
Typo
modified: storage/connect/tabjdbc.h
FT_BOOLEAN_CHECK_SYNTAX_STRING
ISSUE: my_isalnum macro used for checking if character is
alphanumeric dereferences uninitialized pointer
in default character set structure resulting in
server exiting abnormally.
FIX: Used standard isalnum function instead of macro my_isalnum.
In the 10.1 InnoDB Plugin, a call os_event_free(buf_flush_event) was
misplaced. The event could be signalled by rollback of resurrected
transactions while shutdown was in progress. This bug was caught
by cmake -DWITH_ASAN testing. This call was only present in the
10.1 InnoDB Plugin, not in other versions, or in XtraDB.
That said, the bug affects all InnoDB versions. Shutdown assumes the
cessation of any page-dirtying activity, including the activity of
the background rollback thread. InnoDB only waited for the background
rollback to finish as part of a slow shutdown (innodb_fast_shutdown=0).
The default is a clean shutdown (innodb_fast_shutdown=1). In a scenario
where InnoDB is killed, restarted, and shut down soon enough, the data
files could become corrupted.
logs_empty_and_mark_files_at_shutdown(): Wait for the
rollback to finish, except if innodb_fast_shutdown=2
(crash-like shutdown) was requested.
trx_rollback_or_clean_recovered(): Before choosing the next
recovered transaction to roll back, terminate early if non-slow
shutdown was initiated. Roll back everything on slow shutdown
(innodb_fast_shutdown=0).
srv_innodb_monitor_mutex: Declare as static, because the mutex
is only used within one module.
After each call to os_event_free(), ensure that the freed event
is not reachable via global variables, by setting the relevant
variables to NULL.
Also, implement MDEV-11027 a little differently from 5.5:
recv_sys_t::report(ib_time_t): Determine whether progress should
be reported.
recv_apply_hashed_log_recs(): Rename the parameter to last_batch.
Provide more useful progress reporting of crash recovery.
recv_sys_t::progress_time: The time of the last report.
recv_scan_print_counter: Remove.
log_group_read_log_seg(): After each I/O request,
report progress if needed.
recv_apply_hashed_log_recs(): At the start of each batch,
if there are pages to be recovered, issue a message.
dir_per_db_rename_to_nenexisting_schema: mysqltest fails with no output
percona_kill_idle_trx_tokudb: MariaDB doesn't support kill_idle_trx var
for all SE.
my_readline can fail due to missing file. Make my_readline report this
condition separately so that we can catch it and report an appropriate
error message to the user.
The function posix_fallocate() as well as the Linux system call
fallocate() can return EINTR when the operation was interrupted
by a signal. In that case, keep retrying the operation, except
if InnoDB shutdown has been initiated.
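The retry logic can be sketched as follows. The helper name and the
shutdown_requested callback are hypothetical stand-ins for the InnoDB
shutdown check. The loop relies on the posix_fallocate() convention of
returning 0 on success or an error number on failure:

```cpp
#include <cerrno>

// Retry an operation while it reports EINTR, unless shutdown has
// been requested. op() must return 0 on success or an errno value
// on failure, as posix_fallocate() does.
template <typename Op, typename ShutdownCheck>
int retry_on_eintr(Op op, ShutdownCheck shutdown_requested) {
    int err;
    do {
        err = op();
    } while (err == EINTR && !shutdown_requested());
    return err;
}
```

A caller would wrap the call in a lambda, for example
[&] { return posix_fallocate(fd, offset, len); }. The Linux fallocate()
system call returns -1 and sets errno instead, so it would need a
small adapter that returns errno on failure.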
modified: storage/connect/ha_connect.cc
Add conditional SE exception support
modified: storage/connect/json.cpp
modified: storage/connect/plgdbutl.cpp
Change %p to %x in some sprintf calls.
This is to avoid some compiler warnings.
modified: storage/connect/tabwmi.cpp
modified: storage/connect/tabxml.cpp
modified: storage/connect/value.h
Add JavaWrappers.jar to the class path
modified: storage/connect/jdbconn.cpp
Fix wrong declare (char *buf[256]; --> char buf[256];)
modified: storage/connect/xindex.cpp
On FreeBSD liblz4 is installed in /usr/local/lib.
Groonga uses pkg_check_modules to check for liblz4 (that is, pkg-config),
and then it used to set for libgroonga.a
link_directories(${LIBLZ4_LIBRARY_DIRS})
target_link_libraries(... ${LIBLZ4_LIBRARIES})
Now groonga is a static library, linked into ha_mroonga.so. CMake won't
link dynamic liblz4.so into libgroonga.a, instead it'll pass it as a
dependency and will link it into ha_mroonga.so. Fine so far. But it will
not pass link_directories from the static library as a dependency,
so ha_mroonga.so won't find liblz4.so
As suggested on cmake mailing list (e.g.
here: http://public.kitware.com/pipermail/cmake/2011-November/047468.html)
we switch to use the full path to liblz4.so, instead of the -l/-L pair.
- Removed not used variables
- Added __attribute__()
- Added static to some local functions
(gcc 5.4 gives a warning for external functions without an external definition)
it was race condition prone. instead use either a pair of my_delete()
calls with already resolved paths, or a safe high-level function
my_handler_delete_with_symlink(), like MyISAM and Aria already do.
TOCTOU bug. The path is checked to be valid, symlinks are resolved.
Then the resolved path is opened. Between the check and the open,
there's a window when one can replace some path component with a
symlink, bypassing validity checks.
Fix: after we resolved all symlinks in the path, don't allow open()
to resolve symlinks, there should be none.
Compared to the old MyISAM/Aria code:
* fastpath. Opening of not-symlinked files is just one open(),
no fn_format() and lstat() anymore.
* opening of symlinked tables doesn't do fn_format() and lstat() either.
it also doesn't do realpath() (which was lstat-ing every path
component), instead it opens every path component with O_PATH.
* share->data_file_name stores realpath(path) not readlink(path). So,
SHOW CREATE TABLE needs to do lstat/readlink() now (see ::info()),
and certain error messages (cannot open file "XXX") show the real
file path with all symlinks resolved.
fil_extend_space_to_desired_size(): Use a proper type cast when
computing start_offset for the posix_fallocate() call on 32-bit systems
(where sizeof(ulint) < sizeof(os_offset_t)). This could affect 32-bit
systems when extending files that are at least 4 MiB long.
This bug existed in MariaDB 10.0 before MDEV-11520. In MariaDB 10.1
it had been fixed in MDEV-11556.
a large memory buffer on Windows
fil_extend_space_to_desired_size(), os_file_set_size(): Use calloc()
for memory allocation, and handle failures. Properly check the return
status of posix_fallocate(), and pass the correct arguments to
posix_fallocate().
On Windows, instead of extending the file by at most 1 megabyte at a time,
write a zero-filled page at the end of the file.
According to the Microsoft blog post
https://blogs.msdn.microsoft.com/oldnewthing/20110922-00/?p=9573
this will physically extend the file by writing zero bytes.
(InnoDB never uses DeviceIoControl() to set the file sparse.)
I tested that the file extension works properly with a multi-file
system tablespace, both with --innodb-use-fallocate and
--skip-innodb-use-fallocate (the default):
./mtr \
--mysqld=--innodb-use-fallocate \
--mysqld=--innodb-autoextend-increment=1 \
--mysqld=--innodb-data-file-path='ibdata1:5M;ibdata2:5M:autoextend' \
--parallel=auto --force --retry=0 --suite=innodb &
ls -lsh mysql-test/var/*/mysqld.1/data/ibdata2
(several samples while running the test)
Before the MDEV-11520 fixes, fil_extend_space_to_desired_size()
in MariaDB Server 5.5 incorrectly passed the desired file size as the
third argument to posix_fallocate(), even though the length of the
extension should have been passed. This looks like a regression
that was introduced in the 5.5 version of MDEV-5746.
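For reference, posix_fallocate(fd, offset, len) takes the start of the
range and its length, not the desired file size. A sketch of computing
the two arguments for a single file follows. The struct, the function
name and the 32-bit page counts are assumptions for this example, and
the per-file offset handling of a multi-file tablespace is left out:

```cpp
#include <cstdint>

// Stand-in for InnoDB's 64-bit file offset type.
typedef std::uint64_t os_offset_t;

struct extend_range {
    os_offset_t offset;  // second argument of posix_fallocate()
    os_offset_t len;     // third argument: length of the extension
};

// Compute the range to allocate when growing a file from cur_pages
// to new_pages (new_pages >= cur_pages), both counted in pages.
inline extend_range fallocate_range(std::uint32_t cur_pages,
                                    std::uint32_t new_pages,
                                    std::uint32_t page_size) {
    // Widen before multiplying so the product cannot wrap in 32 bits.
    os_offset_t offset = static_cast<os_offset_t>(cur_pages) * page_size;
    os_offset_t len =
        static_cast<os_offset_t>(new_pages - cur_pages) * page_size;
    return extend_range{offset, len};
}
```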
Remove the unused variable desired_size.
Also, correct the expression for the posix_fallocate() start_offset,
and actually test that it works with a multi-file system tablespace.
Before MDEV-11520, the expression was wrong in both innodb_plugin and
xtradb, in different ways.
The start_offset formula was tested with the following:
./mtr --big-test --mysqld=--innodb-use-fallocate \
--mysqld=--innodb-data-file-path='ibdata1:5M;ibdata2:5M:autoextend' \
--parallel=auto --force --retry=0 --suite=innodb &
ls -lsh mysql-test/var/*/mysqld.1/data/ibdata2
a large memory buffer on Windows
fil_extend_space_to_desired_size(), os_file_set_size(): Use calloc()
for memory allocation, and handle failures. Properly check the return
status of posix_fallocate().
On Windows, instead of extending the file by at most 1 megabyte at a time,
write a zero-filled page at the end of the file.
According to the Microsoft blog post
https://blogs.msdn.microsoft.com/oldnewthing/20110922-00/?p=9573
this will physically extend the file by writing zero bytes.
(InnoDB never uses DeviceIoControl() to set the file sparse.)
For innodb_plugin, port the XtraDB fix for MySQL Bug#56433
(introducing fil_system->file_extend_mutex). The bug was
fixed differently in MySQL 5.6 (and MariaDB Server 10.0).
The function trx_purge_stop() was calling os_event_reset(purge_sys->event)
before calling rw_lock_x_lock(&purge_sys->latch). The os_event_set()
call in srv_purge_coordinator_suspend() is protected by that X-latch.
It would seem a good idea to consistently protect both os_event_set()
and os_event_reset() calls with a common mutex or rw-lock in those
cases where os_event_set() and os_event_reset() are used
like condition variables, tied to changes of shared state.
For each os_event_t, we try to document the mutex or rw-lock that is
being used. For some events, frequent calls to os_event_set() seem to
try to avoid hangs. Some events are never waited for infinitely, only
timed waits, and os_event_set() is used for early termination of these
waits.
os_aio_simulated_put_read_threads_to_sleep(): Define as a null macro
on other systems than Windows. TODO: remove this altogether and disable
innodb_use_native_aio on Windows.
os_aio_segment_wait_events[]: Initialize only if innodb_use_native_aio=0.
recv_writer_thread(): Do not assign recv_writer_thread_active=true
in order to avoid a race condition with
recv_recovery_from_checkpoint_finish().
recv_init_crash_recovery(): Assign recv_writer_thread_active=true
before creating recv_writer_thread.
Remove the debug parameter innodb_force_recovery_crash that was
introduced into MySQL 5.6 by me in WL#6494 which allowed InnoDB
to resize the redo log on startup.
Let innodb.log_file_size actually start up the server, but ensure
that the InnoDB storage engine refuses to start up in each of the
scenarios.
If InnoDB is started in innodb_read_only mode such that
recovered incomplete transactions exist at startup
(but the redo logs are clean), an assertion will fail at shutdown,
because there would exist some non-prepared transactions.
logs_empty_and_mark_files_at_shutdown(): Do not wait for incomplete
transactions to finish if innodb_read_only or innodb_force_recovery>=3.
Wait for purge to finish in only one place.
trx_sys_close(): Relax the assertion that would fail first.
trx_free_prepared(): Also free recovered TRX_STATE_ACTIVE transactions
if innodb_read_only or innodb_force_recovery>=3.
srv_release_threads(): Actually wait for the threads to resume
from suspension. On CentOS 5 and possibly other platforms,
os_event_set() may be lost.
srv_resume_thread(): A counterpart of srv_suspend_thread().
Optionally wait for the event to be set, optionally with a timeout,
and then release the thread from suspension.
srv_free_slot(): Unconditionally suspend the thread. It is always
in resumed state when this function is entered.
srv_active_wake_master_thread_low(): Only call os_event_set().
srv_purge_coordinator_suspend(): Use srv_resume_thread() instead
of the complicated logic.
crashes server
This bug is the result of merging the Oracle MySQL follow-up fix
BUG#22963169 MYSQL CRASHES ON CREATE FULLTEXT INDEX
without merging the base bug fix:
Bug#79475 Insert a token of 84 4-bytes chars into fts index causes
server crash.
Unlike the above mentioned fixes in MySQL, our fix will not change
the storage format of fulltext indexes in InnoDB or XtraDB
when a character encoding with mbmaxlen=2 or mbmaxlen=3
and the length of a word is between 128 and 84*mbmaxlen bytes.
The Oracle fix would allocate 2 length bytes for these cases.
Compatibility with other MySQL and MariaDB releases is ensured by
persisting the used maximum length in the SYS_COLUMNS table in the
InnoDB data dictionary.
This fix also removes some unnecessary strcmp() calls when checking
for the legacy default collation my_charset_latin1
(my_charset_latin1.name=="latin1_swedish_ci").
fts_create_one_index_table(): Store the actual length in bytes.
This metadata will be written to the SYS_COLUMNS table.
fts_zip_initialize(): Initialize only the first byte of the buffer.
Actually the code should not even care about this first byte, because
the length is set as 0.
FTX_MAX_WORD_LEN: Define as HA_FT_MAXCHARLEN * 4 aka 336 bytes,
not as 254 bytes.
row_merge_create_fts_sort_index(): Set the actual maximum length of the
column in bytes, similar to fts_create_one_index_table().
row_merge_fts_doc_tokenize(): Remove the redundant parameter word_dtype.
Use the actual maximum length of the column. Calculate the extra_size
in the same way as row_merge_buf_encode() does.
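
The arithmetic behind the length bytes can be sketched as follows. The
value 84 for HA_FT_MAXCHARLEN follows from the 336 bytes stated above.
The rule in length_bytes() is an assumption modelled on the InnoDB
compact record format, and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// HA_FT_MAXCHARLEN: 84 characters (336 bytes / 4, per the text above).
constexpr std::size_t MAX_WORD_CHARS = 84;

// Maximum length of a word in bytes for a character set whose
// longest character occupies 'mbmaxlen' bytes.
constexpr std::size_t max_word_bytes(std::size_t mbmaxlen) {
    return MAX_WORD_CHARS * mbmaxlen;
}

// Assumed rule: a column whose maximum length fits in 255 bytes always
// uses 1 length byte; a longer column uses 2 length bytes once the
// actual value is 128 bytes or longer.
constexpr std::size_t length_bytes(std::size_t col_max_len, std::size_t len) {
    return (col_max_len <= 255 || len < 128) ? 1 : 2;
}
```

Under this assumption, a 200-byte word in a column declared with the
actual maximum for mbmaxlen=3 (252 bytes) keeps 1 length byte, while the
same word in a column declared as 336 bytes would need 2.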
InnoDB would refuse to start up if there is a mismatch on
the size of the system tablespace files. However, before this
check is conducted, the system tablespace may already have been
heavily modified.
InnoDB should perform the size check as early as possible.
recv_recovery_from_checkpoint_finish():
Move the recv_apply_hashed_log_recs() call to
innobase_start_or_create_for_mysql().
innobase_start_or_create_for_mysql(): Test the mutex functionality
before doing anything else. Use a compile_time_assert() for a
sizeof() constraint. Check the size of the system tablespace as
early as possible.
recv_scan_log_recs(): Remember if redo log apply is needed,
even if starting up in innodb_read_only mode.
recv_recovery_from_checkpoint_start_func(): Refuse
innodb_read_only startup if redo log apply is needed.
buf_flush_init_flush_rbt() was called too early in MariaDB server 10.0,
10.1, MySQL 5.5 and MySQL 5.6. The memory leak has been fixed in
the XtraDB storage engine and in MySQL 5.7.
As a result, when the server is started to initialize new data files,
the buf_pool->flush_rbt will be created unnecessarily and then leaked.
This memory leak was noticed in MariaDB server 10.1 when running the
test encryption.innodb_first_page.
The problem in MariaDB is introduced by this merge commit:
c33db2cdc0
The merge comes from MySQL, and the change originates from this
MySQL commit:
------------------------------------------------
commit 160b823d146288d66638e4a740d6d2da72f9a689
Author: Marc Alff <marc.alff@oracle.com>
Date: Tue Aug 30 12:14:07 2016 +0200
Bug#22551677 SIGNAL 11 IN LF_PINBOX_PUT_PINS
Backport to 5.6
------------------------------------------------
The breaking change is in start_socket_wait_v1 where instead of using
m_thread_owner, we make use of my_pthread_getspecific_ptr to fetch a
thread local storage value. Unfortunately this invalidates the
"m_thread_owner" member when a socket is created. The internals of the
socket structure have m_thread_owner set to NULL, but when checking for
ownership we actually look at the current thread's key store.
This seems incorrect; however, it is not immediately apparent why.
MySQL does not describe the actual problem that this commit is trying
to fix. To avoid diverging from MySQL, I have adjusted the unit test to
account for this new behaviour. We destroy the current thread in the
unit test, such that the newly created socket actually has no thread
owner. The m_thread_owner is untouched in all this.
* Update mysqld_safe script to remove duplicated parameter --crash-script
* Make --core-file-size accept underscores as well as dashes correctly.
* Add mysqld_safe_helper to Debian and Ubuntu files.
* Update innodb minor version to 35
MY_THREAD_INIT IN BACKGROUND THREAD
Description:
===========
Add my_thread_init() and my_thread_exit() calls for background threads.
These initialize and free the st_my_thread_var structure.
Reviewed-by: Jimmy Yang<jimmy.yang@oracle.com>
RB: 15003
Memory was leaked when ALTER TABLE was attempted on a table
that contains corrupted indexes.
The memory leak was reported by AddressSanitizer for the test
innodb.innodb_corrupt_bit. The leak was introduced into
MariaDB Server 10.0.26, 10.1.15, 10.2.1 by the following:
commit c081c978a2
Merge: 1d21b22155 a482e76e65
Author: Sergei Golubchik <serg@mariadb.org>
Date: Tue Jun 21 14:11:02 2016 +0200
Merge branch '5.5' into bb-10.0
MariaDB Server 10.0.28 and 10.1.19 merged code from Percona XtraDB
that introduced support for compressed columns. Much but not all
of this code was disabled by placing #ifdef HAVE_PERCONA_COMPRESSED_COLUMNS
around it.
Among the unused but not disabled code is code to access
some new system tables related to compressed columns.
The creation of these system tables SYS_ZIP_DICT and SYS_ZIP_DICT_COLS
would cause a crash in --innodb-read-only mode when upgrading
from an earlier version to 10.0.28 or 10.1.19.
Let us remove all the dead code related to compressed columns.
Users who already upgraded to 10.0.28 and 10.1.19 will have the two
above mentioned empty tables in their InnoDB system tablespace.
Subsequent versions of MariaDB Server will completely ignore those tables.
- in DOMNODELIST::DropItem
if (Listp == NULL || Listp->length <= n)
return true;
is wrong, should be:
if (Listp == NULL || Listp->length < n)
return true;
- Crash in discovery with libxml2 in XMLColumns because:
if (!tdp->Usedom) // nl was destroyed
vp->nl = vp->pn->GetChildElements(g);
is executed with vp->pn uninitialized. Fixed by adding:
vp->pn = node;
line 264.
- In discovery with libxml2, some columns are not found.
This happened because the node list was not recovered properly: nodes
were modified instead of being reallocated.
Fixed lines 214 and 277.
modified: storage/connect/domdoc.cpp
modified: storage/connect/tabxml.cpp
Add support for zipped table files
modified: storage/connect/domdoc.cpp
modified: storage/connect/domdoc.h
modified: storage/connect/filamap.cpp
modified: storage/connect/filamap.h
modified: storage/connect/filamzip.cpp
modified: storage/connect/filamzip.h
modified: storage/connect/ha_connect.cc
modified: storage/connect/libdoc.cpp
modified: storage/connect/plgdbutl.cpp
modified: storage/connect/plgxml.cpp
modified: storage/connect/plgxml.h
modified: storage/connect/tabdos.cpp
modified: storage/connect/tabdos.h
modified: storage/connect/tabfmt.cpp
modified: storage/connect/tabjson.cpp
modified: storage/connect/tabxml.cpp
Essentially revert MDEV-6759, which addressed a double free of memory
by removing the freeing altogether, introducing the memory leaks.
No double free was observed when running the test suite -DWITH_ASAN.
Replace some mem_heap_free(foreign->heap) with dict_foreign_free(foreign)
so that the calls can be located and instrumented more easily when needed.
A first experimental and limited implementation.
modified: storage/connect/CMakeLists.txt
modified: storage/connect/filamap.cpp
new file: storage/connect/filamzip.cpp
new file: storage/connect/filamzip.h
modified: storage/connect/ha_connect.cc
new file: storage/connect/ioapi.c
new file: storage/connect/ioapi.h
modified: storage/connect/mycat.cc
modified: storage/connect/plgdbsem.h
modified: storage/connect/plgdbutl.cpp
modified: storage/connect/tabdos.cpp
modified: storage/connect/tabdos.h
modified: storage/connect/tabfmt.cpp
modified: storage/connect/tabfmt.h
modified: storage/connect/tabjson.cpp
modified: storage/connect/tabjson.h
new file: storage/connect/tabzip.cpp
new file: storage/connect/tabzip.h
new file: storage/connect/unzip.c
new file: storage/connect/unzip.h
new file: storage/connect/zip.c
be consistent and don't include the table name in the error message;
no other CREATE TABLE error does it.
(the crash happened, because thd->lex->query_tables was NULL)
Fix includes launchpad fix plus more to cover writing BIN tables.
modified: storage/connect/tabfix.cpp
modified: storage/connect/value.cpp
modified: storage/connect/value.h
- Typo: Change the name of filamzip to filamgz to prepare for future ZIP tables.
modified: storage/connect/CMakeLists.txt
added: storage/connect/filamgz.cpp
added: storage/connect/filamgz.h
deleted: storage/connect/filamzip.cpp
deleted: storage/connect/filamzip.h
modified: storage/connect/plgdbsem.h
modified: storage/connect/reldef.cpp
modified: storage/connect/tabdos.cpp
modified: storage/connect/tabdos.h
modified: storage/connect/tabfix.cpp
modified: storage/connect/tabfmt.cpp
modified: storage/connect/tabjson.cpp
By setting the context class loader.
modified: storage/connect/JavaWrappers.jar
modified: storage/connect/JdbcInterface.java
modified: storage/connect/mysql-test/connect/std_data/JdbcMariaDB.jar
This is not a fix. It is instrumentation to find out whether the MySQL
frm dictionary and the InnoDB data dictionary are really out of sync when
this assertion fires, or whether there is some other reason (bug).
Try to fix the INSTALL command.
modified: storage/connect/CMakeLists.txt
- Make some JDBC tests available on Windows
modified: storage/connect/mysql-test/connect/t/jdbc.test
modified: storage/connect/mysql-test/connect/t/jdbc_new.test
added: storage/connect/mysql-test/connect/t/windows.inc
Now it is also possible to escape it by a backslash.
modified: storage/connect/tabfmt.cpp
- Prepare making VEC table type support conditional.
VEC tables might be unsupported in future versions
modified: storage/connect/CMakeLists.txt
modified: storage/connect/mycat.cc
modified: storage/connect/reldef.cpp
modified: storage/connect/xindex.cpp
- MDEV-11067 suggested adding configuration support to the Apache wrapper.
It was added but commented out until it is proved to be really useful.
modified: storage/connect/ApacheInterface.java
modified: storage/connect/ha_connect.cc
modified: storage/connect/jdbccat.h
modified: storage/connect/jdbconn.cpp
modified: storage/connect/jdbconn.h
modified: storage/connect/tabjdbc.cpp
modified: storage/connect/tabjdbc.h
- Remove useless members.
modified: storage/connect/jdbconn.cpp
modified: storage/connect/jdbconn.h
- New UDF countin.
modified: storage/connect/jsonudf.cpp
modified: storage/connect/jsonudf.h
The following directives to ignore warnings were in the PerconaFT build in tokudb.
They generate errors when g++ ... -o xxx.so is used to compile a shared object.
As they don't actually suppress any warnings, they have been removed.
* -Wno-ignored-attributes
* -Wno-pointer-bool-conversion
Signed-off-by: Daniel Black <daniel.black@au.ibm.com>
Now the null is tested using the result set getObject method.
modified: storage/connect/JdbcInterface.java
modified: storage/connect/jdbconn.cpp
modified: storage/connect/jdbconn.h
This was because the quoting character was always '"' instead of being
retrieved from the JDBC source.
modified: storage/connect/JdbcInterface.java
modified: storage/connect/jdbconn.cpp
modified: storage/connect/tabjdbc.cpp
Prevent GCC from moving a mach_read_from_4() before we have checked that
we have 4 bytes to read. The pointer may point to only 1, 2 or 3
bytes, in which case the code should not read 4 bytes. This is a
workaround for a GCC bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77673
Patch submitted by: Laurynas Biveinis <laurynas.biveinis@gmail.com>
RB: 14135
Reviewed by: Pawel Olchawa <pawel.olchawa@oracle.com>
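
The ordering that must be preserved can be sketched as follows. This
shows only the intended source-level logic (check the available bytes
first, read afterwards); it is not the workaround itself, and the names
read_be32() and parse_u32() are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Read a big-endian 32-bit value, similar in spirit to mach_read_from_4().
static std::uint32_t read_be32(const unsigned char* p) {
    return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16)
        | (std::uint32_t(p[2]) << 8) | std::uint32_t(p[3]);
}

// Hypothetical parser step: returns false without touching the buffer
// when fewer than 4 bytes are available between ptr and end.
static bool parse_u32(const unsigned char* ptr, const unsigned char* end,
                      std::uint32_t* out) {
    if (end - ptr < 4) {
        return false;  // not enough data; the read must not happen
    }
    *out = read_be32(ptr);
    return true;
}
```

A compiler that hoists the 4-byte read above the length check would
read past the end of a buffer that holds only 1, 2 or 3 bytes.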
(Fixing both InnoDB and XtraDB)
Re-opening a TABLE object (after e.g. FLUSH TABLES or open table cache
eviction) causes ha_innobase to call
dict_stats_update(DICT_STATS_FETCH_ONLY_IF_NOT_IN_MEMORY).
Inside this call, the following is done:
dict_stats_empty_table(table);
dict_stats_copy(table, t);
On the other hand, commands like UPDATE make this call to get the "rows in
table" statistics in table->stats.records:
ha_innobase->info(HA_STATUS_VARIABLE|HA_STATUS_NO_LOCK)
Note the HA_STATUS_NO_LOCK parameter. It means that no locks are taken by
::info(). If the ::info() call happens between the dict_stats_empty_table
and dict_stats_copy calls, the UPDATE's optimizer will get an estimate
of table->stats.records=1, which causes it to pick a full table scan,
which in turn will take a lot of row locks and cause other bad
consequences.
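
The window described above can be sketched as follows. This is a
simplified model with a single counter, not the InnoDB statistics code
and not necessarily the approach taken by this fix; the type table_stats
and both function names are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the in-memory table statistics.
struct table_stats {
    std::atomic<std::uint64_t> n_rows{0};
};

// Pattern described above: an unlocked reader that runs between the
// two stores observes the "empty" value instead of the real estimate.
static void update_in_two_steps(table_stats& t, std::uint64_t fetched) {
    t.n_rows.store(0);        // like dict_stats_empty_table()
    t.n_rows.store(fetched);  // like dict_stats_copy()
}

// One possible alternative: publish the fetched value in a single
// step, so that no intermediate state is ever visible to readers.
static void update_in_one_step(table_stats& t, std::uint64_t fetched) {
    t.n_rows.store(fetched);
}
```

Both functions leave the same final value; they differ only in what a
concurrent reader without locks can observe while the update runs.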
Linking tokudb with jemalloc privately causes problems on library
load/unload. To prevent dangling destructor pointers, link with the same
library as the server is using.