The tests innodb.import_tablespace_race, innodn.restart, and innodb.innodb-wl5522 move
the tablespace file between the data directory and the tmp directory specified by
global environment variables. However this is risky because it's not unusual that the
set tmp directory (often under /tmp) is mounted on another disk partition or device,
and 'move_file' command may fail with "Errcode: 18 'Invalid cross-device link.'"
To stabilize mysqltest in the described scenario, and prevent such
behavior in the future, let make_file() check both from file path and to
file path and make sure they are either both under MYSQLTEST_VARDIR or
MYSQL_TMP_DIR.
All new code of the whole pull request, including one or several files that
are either new files or modified ones, are contributed under the BSD-new license.
I am contributing on behalf of my employer Amazon Web Services, Inc.
Add vital missing step to MariaDB 10.5 upgrade job to actually install
the new binary being built. Without this the test was happily passing all
the time but actually not testing the upgrade.
Also stop using oneliner syntax for the install step to make the debugging
of failing installs/upgrades from build logs easier.
NOTE TO MERGERS: This commit is made on 10.6 branch and can be merged to
all later branches (10.7, 10.8, ..., 11.0). If/when some jobs break, they
will be fixed per branch on follow-up commits.
Tests that try to upgrade MariaDB 10.6 in Debian Sid can no longer work
properly on versions < 10.11 as MariaDB 10.11 in now Debian Sid.
Change RELEASE to use Bullseye and refactor builds and upgrade tests to
use primarily Bullseye, as it has MariaDB 10.5 and thus testing upgrades
to MariaDB 10.6 and higher can work. Add on new 'build sid' job to continue
at least on build on Sid as well, even though it is not tested in other
jobs.
Due to this many Sid specific workarounds are can also be dropped, and
since Bullseye is now used for everything, the old bullseye-backports jobs
are obsolete and removed.
Tests that upgrade MySQL in Sid to MariaDB are also removed, as no test
in Debian Sid with MariaDB < 10.11 can reliably work anymore.
Also disable reprotest as unnecessary on old branches, refactor the naming
of autobake-deb.sh and native Debian build jobs to be more clear.
NOTE TO MERGERS: This commit is made on 10.6 branch and can be merged to
all later branches (10.7, 10.8, ..., 11.0). If/when some jobs break, they
will be fixed per branch on follow-up commits.
Compare to Debian packaging of MariaDB 1:10.6.11-2 release at commit
2934e8a795
and sync upstream everything that is relevant for upstream and safe
to import on a stable 10.6 release.
* Use OpenSSL 1.1 from Debian Snapshots
0040c272bf
Related: https://jira.mariadb.org/browse/MDEV-30322
* Prefer using bullseye-backports in mosts tests over buster-bpo
daa827ecde
* Add new upgrade test for MySQL Community Cluster 8.0
3c71bec9b7
* Enable automatic datadir move also on upgrades from MySQL.com packages
4cbbcb7e56
* Update Breaks/Replaces
2cab13d059
* Normalize apt-get and curl commands
8754ea2578
* Standardize on using capitalized 'ON' in CMake build options
938757a85a
* Apply wrap-and-sort -av and other minor tweaks and inline documentation
NOTE TO MERGERS: This commit is made on 10.6 branch and can be merged to
all later branches (10.7, 10.8, ..., 11.0).
This is allowed:
STRING_WITH_LEN("string literal")
This is not:
char *str = "pointer to string";
... STRING_WITH_LEN(str) ..
In C++ this is also allowed:
const char str[] = "string literal";
... STRING_WITH_LEN(str) ...
Test fails sporadically and very rarely on this:
```
let $org_queries= `SHOW STATUS LIKE 'Queries'`;
SELECT f1();
CALL p1();
let $new_queries= `SHOW STATUS LIKE 'Queries'`;
let $diff= `SELECT SUBSTRING('$new_queries',9)-SUBSTRING('$org_queries',9)`;
```
if COM_QUIT from one of the earlier (in the test) disconnect's
happens between the two SHOW STATUS commands.
Because COM_QUIT increments "Queries".
The directly previous test uses wait_condition to wait for
its disconnects to complete. But there are more disconnects earlier
in the test file and nothing waits for them.
Let's change wait_condition to wait for *all* disconnect to complete.
MariaDB server prints the stack information if a crash happens.
It traverses the stack frames in function `print_with_addr_resolve`.
For *EACH* frame, it tries to parse the file name and line number of the
frame using `addr2line`, or prints `backtrace_symbols_fd` if `addr2line`
fails.
1. Logic in `addr_resolve` function uses addr2line to get the file name
and line numbers. It has a timeout of 500ms to wait for the response
from addr2line. However, that's not enough on small instances
especially if the debug information is in a separate file or
compressed.
Increase the timeout to 5 seconds to support some edge cases, as
experiments showed addr2line may take 2-3 seconds on some frames.
2. While parsing a frame inside of a shared library using `addr2line`,
the file name and line numbers could be `??`, empty or `0` if the
debug info is not loaded.
It's easy to reproduce when glibc-debuginfo is not installed.
Instead of printing a meaningless frame like:
:0(__GI___poll)[0x1505e9197639]
...
??:0(__libc_start_main)[0x7ffff6c8913a]
We want to print the frame information using `backtrace_symbols_fd`,
with the shared library name and a hexadecimal offset.
Stacktrace example on a real instance with this commit:
/lib64/libc.so.6(__poll+0x49)[0x145cbf71a639]
...
/lib64/libc.so.6(__libc_start_main+0xea)[0x7f4d0034d13a]
`addr_resolve` has considered the case of meaningless combination of
file name and line number returned by `addr2line`. e.g. `??:?`
However, conditions like `:0` and `??:0` are not handled. So now the
function will rollback to `backtrace_symbols_fd` in above cases.
All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer Amazon Web
Services, Inc.
In block-nl-join, add:
- r_loops - this shows how many incoming record combinations this
query plan node had.
- r_effective_rows - this shows the average number of matching rows
that this table had for each incoming record combination. This is
comparable with r_rows in non-blocked access methods.
For BNL-joins, it is always equal to
$.table.r_rows * $.table.r_filtered
For BNL-H joins the value cannot be computed from other values
Reviewed by: Monty <monty@mariadb.org>
CREATE [TEMPORARY] SEQUENCE is internally CREATE+INSERT (initial value)
and it is replicated using statement based replication. In Galera
we use either TOI or RSU so we should skip commit time hooks
for it.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
With binlogs enabled, debug assertion ut_ad(xid_seqno > wsrep_seqno)
fired in trx_rseg_update_wsrep_checkpoint() when an applier thread
synced the seqno out of order for write set which had failed
certification. This was caused by releasing commit
order too early when binlogs were on, allowing group
commit to run in parallel and commit following transactions
too early.
Fixed by extending the commit order critical section to cover
call to wsrep_set_SE_checkpoint() also when binlogs are on.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
When using LEFT() function with a string that is without a charset,
the function crashes. This is because the function assumes that
the string has a charset, and tries to use it to calculate the
length of the string.
Two functions, UNHEX and WEIGHT_STRING, returned a string without
the charset being set to a not null value.
The fix is to set charset when calling val_str on these two functions.
Reviewed-by: Alexander Barkov <bar@mariadb.com>
Reviewed-by: Daniel Black <daniel@mariadb.org>
Sequence objects are implemented using special tables.
These tables do not have primary key and only one row.
NEXTVAL is basically update from existing value to new
value. In Galera this could mean that two write-sets
from different nodes do not conflict and this could
lead situation where write-sets are executed
concurrently and possibly in wrong seqno order.
This is fixed by using table-level exclusive key for
SEQUENCE updates. Note that this naturally works
correctly only if InnoDB storage engine is used for
sequence.
This fix does not contain a test case because while
it is possible to syncronize appliers using dbug_sync
it was too hard to syncronize MDL-lock requests to
exact objects. Testing done for this fix is documented
on MDEV.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Let us make innodb_buffer_pool_filename a read-only variable
so that a malicious user cannot cause an important file to be
deleted on InnoDB shutdown. An attempt to delete a directory
will fail because it is not a regular file, but what if the
variable pointed to (say) ibdata1, ib_logfile0 or some *.ibd file?
It does not seem to make much sense for this parameter to be
configurable in the first place, but we will not change that in order
to avoid breaking compatibility.
Problem:
UNIX_TIMESTAMP() called for a expression of the TIME data type
returned NULL.
Inside Type_handler_timestamp_common::Item_val_native_with_conversion
the call for item->get_date() did not convert TIME to DATETIME
automatically (because it does not have to, by design).
As a result, Type_handler_timestamp_common::TIME_to_native() received
a MYSQL_TIME value with zero date 0000-00-00 and therefore returned "true"
(indicating SQL NULL value).
Fix:
Removing the call for item->get_date().
Instantiating Datetime(item) instead.
This forces automatic TIME to DATETIME conversion
(unless @@old_mode is zero_date_time_cast).
handle_slave_io(), handle_slave_sql(), os_thread_exit():
Remove a redundant pthread_exit(nullptr) call, because it
would cause SIGSEGV.
mysql_print_status(): Add MEM_MAKE_DEFINED() to work around
some missing instrumentation around mallinfo2().
que_graph_free_stat_list(): Invoke que_node_get_next(node) before
que_graph_free_recursive(node). That is the logical and
MSAN_OPTIONS=poison_in_dtor=1 compatible way of freeing memory.
ins_node_t::~ins_node_t(): Invoke mem_heap_free(entry_sys_heap).
que_graph_free_recursive(): Rely on ins_node_t::~ins_node_t().
fts_t::~fts_t(): Invoke mem_heap_free(fts_heap).
fts_free(): Replace with direct calls to fts_t::~fts_t().
The failures in free_root() due to MSAN_OPTIONS=poison_in_dtor=1
will be covered in MDEV-30942.
The solution is to suppress error messages for missing tablespaces if
mariabackup is launched with "--prepare --export" options.
"mariabackup --prepare --export" invokes itself with --mysqld parameter.
If the parameter is set, then it starts server to feed "FLUSH TABLES ...
FOR EXPORT;" queries for exported tablespaces. This is "normal" server
start, that's why new srv_operation value is introduced.
Reviewed by Marko Makela.
EXPLAIN EXTENDED for an UPDATE/DELETE/INSERT/REPLACE statement did not
produce the warning containing the text representation of the query
obtained after the optimization phase. Such warning was produced for
SELECT statements, but not for DML statements.
The patch fixes this defect of EXPLAIN EXTENDED for DML statements.
…with: Test assertion failed
Problem:
=======
Assertion text: 'Value returned by SSS and PS table for Last_Error_Number
should be same.'
Assertion condition: '"1146" = "0"'
Assertion condition, interpolated: '"1146" = "0"'
Assertion result: '0'
Analysis:
========
In parallel replication when slave is started the worker pool gets
activated and it gets cleared when slave stops. Each time the worker pool
gets activated a backup worker pool also gets created to store worker
specific perforance schema information in case of errors. On error, all
relevant information is copied from rpl_parallel_thread to rli and it gets
cleared from thread. Then server waits for all workers to complete their
work, during this stage performance schema table specific worker info is
stored into the backup pool and finally the actual pool gets cleared. If
users query the performance schema table to know the status of workers the
information from backup pool will be used. The test simulates
ER_NO_SUCH_TABLE error and verifies the worker information in pfs table.
Test works fine if execution occurs in following order.
Step 1. Error occurred 'worker information is copied to backup pool'.
Step 2. handle_slave_sql invokes 'rpl_parallel_resize_pool_if_no_slaves' to
deactivate worker pool, it marks the pool->count=0
Step 3. PFS table is queried, since actual pool is deactivated backup pool
information is read.
If the Step 3 happens prior to Step2 the pool is yet to be deactivated and
the actual pool is read, which doesn't have any error details as they were
cleared. Hence test ocasionally fails.
Fix:
===
Upon error mark the back pool as being active so that if PFS table is
quried since the backup pool is flagged as valid its information will be
read, in case it is not flagged regular pool will be read.
This work is one of the last pieces created by the late Sujatha Sivakumar.
Problem:
========
- InnoDB replace statement returns can't find record as result during
bulk insert operation. InnoDB returns DB_END_OF_INDEX blindly when
bulk transaction is visible to current transaction even though
the search tuple is inserted as a part of current replace statement.
Solution:
=========
row_search_mvcc(): InnoDB should allow the transaction to read
all the rows when innodb intends to do any locking on the
record even though bulk insert transaction changes are
visible to the current transaction
buf_dblwr_t::init(), buf_dblwr_t::close(): Cover also write_cond,
which was added in commit a55b951e60
without explicit initialization. On GNU/Linux, PTHREAD_COND_INITIALIZER
is a zero-initializer. That is why the default zero initialization
happened to work on that platform.
- Description:
- Before 10.3.8 semisync was a plugin that is built into the server with
MDEV-13073,starting with commit cbc71485e2.
There are still some usage of `rpl_semi_sync_master` in mtr.
Note:
- To recognize the replica in the `dump_thread`, replica is creating
local variable `rpl_semi_sync_slave` (the keyword of plugin) in
function `request_transmit`, that is catched by primary in
`is_semi_sync_slave()`. This is the user variable and as such not
related to the obsolete plugin.
- Found in `sys_vars.all_vars` and `rpl_semi_sync_wait_point` tests,
usage of plugins `rpl_semi_sync_master`, `rpl_semi_sync_slave`.
The former test is disabled by default (`sys_vars/disabled.def`)
and marked as `obsolete`, however this patch will remove the queries.
- Add cosmetic fixes to semisync codebase
Reviewer: <brandon.nesterenko@mariadb.com>
Closes PR #2528, PR #2380
btr_cur_upd_rec_in_place(): Avoid calling page_zip_write_rec() if we
are not modifying any fields that are stored in compressed format.
btr_cur_update_in_place_zip_check(): New function to check if a
ROW_FORMAT=COMPRESSED record can actually be updated in place.
btr_cur_pessimistic_update(): If the BTR_KEEP_POS_FLAG is not set
(we are in a ROLLBACK and cannot write any BLOBs), ignore the potential
overflow and let page_zip_reorganize() or page_zip_compress() handle it.
This avoids a failure when an attempted UPDATE of an NULL column to 0 is
rolled back. During the ROLLBACK, we would try to move a non-updated
long column to off-page storage in order to avoid a compression failure
of the ROW_FORMAT=COMPRESSED page.
page_zip_write_trx_id_and_roll_ptr(): Remove an assertion that would fail
in row_upd_rec_in_place() because the uncompressed page would already
have been modified there.
This is a 10.5 version of commit ff3d4395d8
(different because of commit 08ba388713).
row_upd_rec_in_place(): Avoid calling page_zip_write_rec() if we
are not modifying any fields that are stored in compressed format.
btr_cur_update_in_place_zip_check(): New function to check if a
ROW_FORMAT=COMPRESSED record can actually be updated in place.
btr_cur_pessimistic_update(): If the BTR_KEEP_POS_FLAG is not set
(we are in a ROLLBACK and cannot write any BLOBs), ignore the potential
overflow and let page_zip_reorganize() or page_zip_compress() handle it.
This avoids a failure when an attempted UPDATE of an NULL column to 0 is
rolled back. During the ROLLBACK, we would try to move a non-updated
long column to off-page storage in order to avoid a compression failure
of the ROW_FORMAT=COMPRESSED page.
page_zip_write_trx_id_and_roll_ptr(): Remove an assertion that would fail
in row_upd_rec_in_place() because the uncompressed page would already
have been modified there.
Thanks to Jean-François Gagné for providing a copy of a page that
triggered these bugs on the ROLLBACK of UPDATE and DELETE.
A 10.6 version of this was tested by Matthias Leich using
cmake -DWITH_INNODB_EXTRA_DEBUG=ON a.k.a. UNIV_ZIP_DEBUG.
mtr uses group suffix, but some existing inc and test files use
server_id for expect files. This patch aims to fix that.
For spider:
With this change we will not have to maintain a separate version of
restart_mysqld.inc for spider, that duplicates code, just because
spider tests use different names for expect files, and shutdown_mysqld
requires magical names for them.
With this change spider tests will also be able to use other features
provided by restart_mysqld.inc without code duplication, like the
parameter $restart_parameters (see e.g. the testcase mdev_29904.test
in commit ef1161e5d4f).
Tests run after this change: default, spider, rocksdb, galera, using
the following command
mtr --parallel=auto --force --max-test-fail=0 --skip-core-file
mtr --suite spider,spider/*,spider/*/* \
--skip-test="spider/oracle.*|.*/t\..*" --parallel=auto --big-test \
--force --max-test-fail=0 --skip-core-file
mtr --suite galera --parallel=auto
mtr --suite rocksdb --parallel=auto
Commit a923d6f49c disabled numeric setting
of character_set_* variables with non-default values:
MariaDB [(none)]> set character_set_client=224;
ERROR 1115 (42000): Unknown character set: '224'
However the corresponding binlog functionality still write numeric
values for log event, and this will break binlog replay if the value is
not default. Now make the server use 'String' type for
'character_set_client' when generating binlog events
Before:
/*!\C utf8mb4 *//*!*/;
SET @@session.character_set_client=224,@@session.collation_connection=224,@@session.collation_server=33/*!*/;
After:
/*!\C utf8mb4 *//*!*/;
SET @@session.character_set_client=utf8mb4,@@session.collation_connection=33,@@session.collation_server=8/*!*/;
Note: prior to the previous commit, setting with '224' or '45' or
'utf8mb4' have the same effect, as they all set the parameter to
'utf8mb4'.
All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer Amazon Web
Services, Inc.
buf_LRU_block_remove_hashed(): Ever since
commit 2e814d4702
we could get page_zip_validate() failures after an ALTER TABLE
operation was aborted and BtrBulk::pageCommit() had never been
executed on some blocks.
- InnoDB does rollback the whole transaction and discards the
savepoint when there is a failure happens during bulk
insert operation. When server request to release the savepoint,
InnoDB should return DB_SUCCESS when it deals with bulk
insert operation
The hang could be seen as show slave status displaying an error like
Last_Error: Could not execute Write_rows_v1
along with
Slave_SQL_Running: Yes
accompanied with one of the replication threads in show-processlist
characteristically having status like
2394 | system user | | NULL | Slave_worker | 50852| closing tables
It turns out that closing tables worker got entrapped in endless looping
in mark_start_commit_inner() across already garbage-collected gco items.
The reclaimed gco links are explained with actually possible
out-of-order groups of events termination due to the Last_Error.
This patch reinforces the correct ordering to perform
finish_event_group's cleanup actions, incl unlinking gco:s
from the active list.
For more convenient monitoring of something that could greatly affect
the volume of page writes, we add the status variable
Innodb_buffer_pool_pages_split that was previously only available
via information_schema.innodb_metrics as "innodb_page_splits".
This was suggested by Axel Schwenke.
buf_flush_page_count: Replaced with buf_pool.stat.n_pages_written.
We protect buf_pool.stat (except n_page_gets) with buf_pool.mutex
and remove unnecessary export_vars indirection.
buf_pool.flush_list_bytes: Moved from buf_pool.stat.flush_list_bytes.
Protected by buf_pool.flush_list_mutex.
buf_pool_t::page_cleaner_status: Replaces buf_pool_t::n_flush_LRU_,
buf_pool_t::n_flush_list_, and buf_pool_t::page_cleaner_is_idle.
Protected by buf_pool.flush_list_mutex. We will exclusively broadcast
buf_pool.done_flush_list by the buf_flush_page_cleaner thread,
and only wait for it when communicating with buf_flush_page_cleaner.
There is no need to keep a count of pending writes by the
buf_pool.flush_list processing. A single flag suffices for that.
Waits for page write completion can be performed by
simply waiting on block->page.lock, or by invoking
buf_dblwr.wait_for_page_writes().
buf_LRU_block_free_non_file_page(): Broadcast buf_pool.done_free and
set buf_pool.try_LRU_scan when freeing a page. This would be
executed also as part of buf_page_write_complete().
buf_page_write_complete(): Do not broadcast buf_pool.done_flush_list,
and do not acquire buf_pool.mutex unless buf_pool.LRU eviction is needed.
Let buf_dblwr count all writes to persistent pages and broadcast a
condition variable when no outstanding writes remain.
buf_flush_page_cleaner(): Prioritize LRU flushing and eviction right after
"furious flushing" (lsn_limit). Simplify the conditions and reduce the
hold time of buf_pool.flush_list_mutex. Refuse to shut down
or sleep if buf_pool.ran_out(), that is, LRU eviction is needed.
buf_pool_t::page_cleaner_wakeup(): Add the optional parameter for_LRU.
buf_LRU_get_free_block(): Protect buf_lru_free_blocks_error_printed
with buf_pool.mutex. Invoke buf_pool.page_cleaner_wakeup(true) to
to ensure that buf_flush_page_cleaner() will process the LRU flush
request.
buf_do_LRU_batch(), buf_flush_list(), buf_flush_list_space():
Update buf_pool.stat.n_pages_written when submitting writes
(while holding buf_pool.mutex), not when completing them.
buf_page_t::flush(), buf_flush_discard_page(): Require that
the page U-latch be acquired upfront, and remove
buf_page_t::ready_for_flush().
buf_pool_t::delete_from_flush_list(): Remove the parameter "bool clear".
buf_flush_page(): Count pending page writes via buf_dblwr.
buf_flush_try_neighbors(): Take the block of page_id as a parameter.
If the tablespace is dropped before our page has been written out,
release the page U-latch.
buf_pool_invalidate(): Let the caller ensure that there are no
outstanding writes.
buf_flush_wait_batch_end(false),
buf_flush_wait_batch_end_acquiring_mutex(false):
Replaced with buf_dblwr.wait_for_page_writes().
buf_flush_wait_LRU_batch_end(): Replaces buf_flush_wait_batch_end(true).
buf_flush_list(): Remove some broadcast of buf_pool.done_flush_list.
buf_flush_buffer_pool(): Invoke also buf_dblwr.wait_for_page_writes().
buf_pool_t::io_pending(), buf_pool_t::n_flush_list(): Remove.
Outstanding writes are reflected by buf_dblwr.pending_writes().
buf_dblwr_t::init(): New function, to initialize the mutex and
the condition variables, but not the backing store.
buf_dblwr_t::is_created(): Replaces buf_dblwr_t::is_initialised().
buf_dblwr_t::pending_writes(), buf_dblwr_t::writes_pending:
Keeps track of writes of persistent data pages.
buf_flush_LRU(): Allow calls while LRU flushing may be in progress
in another thread.
Tested by Matthias Leich (correctness) and Axel Schwenke (performance)