MySQL-Connector-Net casts SEQ_IN_INDEX to uint and will
raise an exception if the type is a System.Int64.
As we don't support a huge number of multi-columns in
an index reducing to a uint is sufficient to represent
all values and maintain compatibility with MySQL-Connector-Net.
This matches the type (uint) returned by MySQL-8.3 and 8.0.
Reviewer: Alexander Barkov <bar@mariadb.com>
buf_flush_buffer_pool(): Wait for any pending asynchronous reads
to complete. This assertion failed in a run where buf_read_ahead_linear()
had been triggered in an SQL statement that was executed right
before shutdown.
Reviewed by: Debarun Banerjee
btr_cur_t::search_leaf(): When the index root page is also a leaf page,
we may need to upgrade our existing shared root page latch into an
exclusive latch. Even if we end up waiting, the root page won't be able
to go away while we hold an index()->lock. The index page may be split;
that is all.
btr_latch_prev(): Acquire the page latch while holding a buffer-fix
and an index tree latch. Merge the change buffer if needed. Use
buf_pool_t::page_fix() for this special case instead of complicating
buf_page_get_low() and buf_page_get_gen().
row_merge_read_clustered_index(): Remove some code that does not seem
to be useful. No difference was observed with regard to removing this
code when a CREATE INDEX or OPTIMIZE TABLE statement was run concurrently
with sysbench oltp_update_index --tables=1 --table_size=1000 --threads=16.
buf_pool_t::unzip(): Decompress a ROW_FORMAT=COMPRESSED page.
buf_pool_t::page_fix(): Handle also ROW_FORMAT=COMPRESSED pages
as well as change buffer merge. Optionally return an error.
Add a flag for suppressing a page latch wait and a special return
value -1 to indicate that the call would block.
This is the preferred way of buffer-fixing blocks.
The functions buf_page_get_gen() and buf_page_get_low() are only being
invoked with rw_latch=RW_NO_LATCH in operations on SPATIAL INDEX.
buf_page_t: Define some static functions for interpreting state().
buf_page_get_zip(), buf_read_page(),
buf_read_ahead_random(), buf_read_ahead_linear():
Remove the redundant parameter zip_size. We must look up the
tablespace and can invoke fil_space_t::zip_size() on it.
buf_page_get_low(): Require mtr!=nullptr.
buf_page_get_gen(): Implement some lock downgrading during recovery.
ibuf_page_low(): Use buf_pool_t::page_fix() in a debug check.
We do wait for a page read here, because otherwise a debug assertion in
buf_page_get_low() in the test innodb.ibuf_delete could occasionally fail.
PageConverter::operator(): Invoke buf_pool_t::page_fix() in order
to possibly evict a block. This allows us to remove some
special case code from buf_page_get_low().
It's possible that MDL conflict handling code is called more
than once for a transaction when:
- it holds more than one conflicting MDL lock
- reschedule_waiters() is executed,
which results in repeated attempts to BF-abort already aborted
transaction.
In such situations, it might be that BF-aborting logic sees
a partially rolled back transaction and erroneously decides
on future actions for such a transaction.
The specific situation tested and fixed is when a SR transaction
applied in the node gets BF-aborted by a started TOI operation.
It's then caught with the server transaction already rolled back,
but with no MDL locks yet released. This caused wrong state
detection for such a transaction during repeated MDL conflict
handling code execution.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
(Variant 2b: call greedy_search() twice, correct handling for limited
search_depth)
Modify the join optimizer to specifically try to produce join orders that
can short-cut their execution for ORDER BY..LIMIT clause.
The optimization is controlled by @@optimizer_join_limit_pref_ratio.
Default value 0 means don't construct short-cutting join orders.
Other value means construct short-cutting join order, and prefer it only
if it promises speedup of more than #value times.
In Optimizer Trace, look for these names:
* join_limit_shortcut_is_applicable
* join_limit_shortcut_plan_search
* join_limit_shortcut_choice
Problem was that wsrep_schema tables were not marked as
category information. Fix allows access to wsrep_schema
tables even when node is detached.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Replication of MyISAM and Aria DML is experimental and best
effort only. Earlier change make INSERT SELECT on both
MyISAM and Aria to replicate using TOI and STATEMENT
replication. Replication should happen only if user
has set needed wsrep_mode setting.
Note: This commit contains additional changes compared
to those already made for the 10.5 branch.
+ small refactoring after main fix.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
It's possible that MDL conflict handling code is called more
than once for a transaction when:
- it holds more than one conflicting MDL lock
- reschedule_waiters() is executed,
which results in repeated attempts to BF-abort already aborted
transaction.
In such situations, it might be that BF-aborting logic sees
a partially rolled back transaction and erroneously decides
on future actions for such a transaction.
The specific situation tested and fixed is when a SR transaction
applied in the node gets BF-aborted by a started TOI operation.
It's then caught with the server transaction already rolled back,
but with no MDL locks yet released. This caused wrong state
detection for such a transaction during repeated MDL conflict
handling code execution.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
* Fixes galera.galera_bf_kill_debug test case.
* Enable galera_ssl_upgrade, galera_ssl_reload, galera_pc_bootstrap
* Add MDEV to disabled tests that miss it
P.S. This commit contains additional changes compared
to the similar commit for 10.5 branch.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
After closing https://github.com/codership/galera-bugs/issues/947,
Galera now correctly certifies table-level keys, which made bulk
insert work again.
The corresponding MTR test is made deterministic and re-enabled.
Requires Galera 26.4.19
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
* Fixes galera.galera_bf_kill_debug test case.
* Enable galera_ssl_upgrade, galera_ssl_reload, galera_pc_bootstrap
* Add MDEV to disabled tests that miss it
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Based on logs SST was started before donor reached
Primaty state. Add wait_conditions to make sure that
nodes reach Primary state before starting next node.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
For TOI events specifically we have a situation where in case of the
same error different nodes may generate different messages. This may
be for two reasons:
- different locale setting between the current client session and
server default (we can reasonably require server locales to be
identical on all nodes, but user can change message locale for the
session)
- non-deterministic course of STATEMENT execution e.g. for ALTER TABLE
On the other hand we may reasonably expect TOI event failures since
they are executed after replication, so we must ensure that voting is
consistent. For that purpose error codes should be sufficiently unique
and deterministic for TOI event failures as DDLs normally deal with
a single object, so we can merely use MySQL error codes to vote on.
Notice that this problem does not happen with regular transactional
writesets, since the originator node will always vote success and
replica nodes are assumed to have the same global locale setting.
As such different error messages indicate different errors even if
the error code is the same (e.g. ER_DUP_KEY can happen on different
rows tables).
Use only MySQL error code (without the error message) for error voting
in case of TOI event failure.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
When handling fatal signal, shut down Galera networking
before printing out stack trace and writing core file.
This is to achieve fail-silent semantics on crashes which may
keep the process running for a long time, but not fully responding
e.g. due to core dumping or symbol resolving.
Also suppress all Galera/wsrep logging to avoid logging from
background threads to garble crash information from signal handler.
Notice that for fully fail-silent crash, Galera 26.4.19 is needed.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
The recent commit 4ca355d863 (MDEV-33894)
caused a serious regression for online InnoDB ib_logfile0 resizing,
breaking crash-safety unless the memory-mapped log file interface is
being used. However, the log resizing was broken also before this.
To prevent such regressions in the future, we extend the test
innodb.log_file_size_online with a kill and restart of the server
and with some writes running concurrently with the log size change.
When run enough many times, this test revealed all the bugs that
are being fixed by the code changes.
log_t::resize_start(): Do not allow the resized log to start before
the current log sequence number. In this way, there is no need to
copy anything to the first block of resize_buf. The previous logic
regarding that was incorrect in two ways. First, we would have to
copy from the last written buffer (buf or flush_buf). Second, we failed
to ensure that the mini-transaction end marker bytes would be 1
in the buffer. If the source ib_logfile0 had wrapped around an odd number
of times, the end marker would be 0. This was occasionally observed
when running the test innodb.log_file_size_online.
log_t::resize_write_buf(): To adjust for the resize_start() change,
do not write anything that would be before the resize_lsn.
Take the buffer (resize_buf or resize_flush_buf) as a parameter.
Starting with commit 4ca355d863
we no longer swap buffers when rewriting the last log block.
log_t::append(): Define as a static function; only some debug
assertions need to refer to the log_sys object.
innodb_log_file_size_update(): Wake up the buf_flush_page_cleaner()
if needed, and wait for it to complete a batch while waiting for
the log resizing to be completed. If the current LSN is behind the
resize target LSN, we will write redundant FILE_CHECKPOINT records to
ensure that the log resizing completes. If the buf_pool.flush_list is
empty or the buf_flush_page_cleaner() is stuck for some reason, our wait
will time out in 5 seconds, so that we can periodically check if the
execution of SET GLOBAL innodb_log_file_size was aborted. Previously,
we could get into a busy loop here while the buf_flush_page_cleaner()
would remain idle.
Problem was that wsrep_schema tables were not marked as
category information. Fix allows access to wsrep_schema
tables even when node is detached.
This is 10.4-10.9 version of fix.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Problem was that we did not found that table was partitioned
and then we should find what is actual underlaying storage
engine.
We should not use RSU for !InnoDB tables.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
recv_recovery_from_checkpoint_start(): Abort startup due to log
corruption if we were unable to parse the entire log between
the latest log checkpoint and the corresponding FILE_CHECKPOINT record.
Also, reduce some code bloat related to log output and log_sys.mutex.
Reviewed by: Debarun Banerjee
The option binlog_optimize_thread_scheduling was initially added
to provide a safe alternative for the newly added binlog group
commit logic, such that when 0, it would disable a leader thread
from performing the binlog write for all transactions that are a
part of the group commit. Any problems related to the binlog group
commit optimization should be sorted out by now, so we can
deprecate-to-eventually-remove the option altogether.
This commit performs the deprecation, and the removal is tracked
by MDEV-33745. Note, as the option is only able to be provided
via configuration at startup time, users will not see a
deprecation message unless looking through the CLI help
message.
Reviewed By
============
Kristian Nielsen <knielsen@knielsen-hq.org>
Sergei Golubchik <serg@mariadb.org>
Changing the alias LOCALTIME->CURRENT_TIMESTAMP to LOCALTIME->CURRENT_TIME.
This changes the return type of LOCALTIME from DATETIME to TIME,
according to the SQL Standard.
In commit fa8a46eb68 (MDEV-33613)
the parameter innodb_lru_flush_size ceased to have any effect.
Let us declare the parameter as deprecated and additionally as
MARIADB_REMOVED_OPTION, so that there will be a warning written
to the error log in case the option is specified in the command line.
Let us also do the same for the parameter
innodb_purge_rseg_truncate_frequency
that was deprecated&ignored earlier in MDEV-32050.
Reviewed by: Debarun Banerjee
Let us avoid EXTENDED in the CHECK TABLE after a defragmentation,
because it would occasionally report an orphan delete-marked record
in the index "third". That error does not seem to be reproducible
when using the regular OPTIMIZE TABLE.
Also, let us make the test --repeat safe by removing the defragmentation
related statistics after DROP TABLE.
The defragmentation feature was removed in later releases in
commit 7ca89af6f8 (MDEV-30545)
along with this test case.