This commit fixes a documentation installation
issue (for Debian packaging) and generally brings
the installation control files up to date, in line
with the rest of the components.
This commit fixes a bug in the algorithm for converting hexadecimal
strings to binary key values, which leads to incompatibility with other
plugins and reduces the effective information capacity of the keys.
The new key conversion algorithm is incompatible with tables that
were already encrypted using an old version of the plugin (version
1.05 or earlier).
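For illustration, a minimal sketch of the corrected conversion, assuming the
usual two-hex-digits-per-byte encoding (the function names are illustrative,
not the plugin's actual code):

  #include <cctype>
  #include <cstddef>

  /* Convert one hex digit to its 4-bit value; -1 on invalid input. */
  static int hex_nibble(char c)
  {
    if (c >= '0' && c <= '9') return c - '0';
    c= (char) tolower((unsigned char) c);
    return (c >= 'a' && c <= 'f') ? c - 'a' + 10 : -1;
  }

  /*
    Two hex characters encode one key byte, so 2*n digits yield n bytes;
    mapping one character to one byte would halve the effective
    information capacity of the key.
  */
  static bool hex2bin(const char *hex, size_t hex_len, unsigned char *out)
  {
    if (hex_len % 2) return false;
    for (size_t i= 0; i < hex_len; i+= 2)
    {
      int hi= hex_nibble(hex[i]), lo= hex_nibble(hex[i + 1]);
      if (hi < 0 || lo < 0) return false;
      out[i / 2]= (unsigned char) ((hi << 4) | lo);
    }
    return true;
  }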
This commit contains changes that refactor the Hashicorp plugin code:
all variables previously declared as "static" that are not user-visible
parameters are now hidden in a special class containing all the
plugin's dynamic data. This was done primarily to significantly simplify
the code of the initialization and deinitialization functions, which
previously contained a large number of gotos and complex branching
conditions to control memory deallocation.
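A hedged sketch of the shape of this refactoring (class and member names
are assumptions, not the actual plugin code):

  #include <curl/curl.h>
  #include <cstdlib>

  /* All formerly-static, non-parameter state lives in one object. */
  class PluginData
  {
  public:
    CURL *curl= nullptr;
    char *token_header= nullptr;   /* plus caches, mutexes, ... */

    int init()
    {
      /* acquire resources in order; return nonzero on failure */
      return (curl= curl_easy_init()) ? 0 : 1;
    }
    ~PluginData()                  /* one unwind path, no gotos */
    {
      free(token_header);
      if (curl) curl_easy_cleanup(curl);
    }
  };

  static PluginData *data= nullptr;

  static int plugin_init(void *)
  {
    data= new PluginData();
    if (data->init())
    {
      delete data;                 /* destructor frees partial state */
      data= nullptr;
      return 1;
    }
    return 0;
  }

  static int plugin_deinit(void *)
  {
    delete data;
    data= nullptr;
    return 0;
  }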
This commit fixes an issue where there was no visible update of caching
option values after changing them dynamically while the server
is running. This issue was caused by forgotten copy operations
of the new values into the dynamic variables. At the same time, the
internal variables (responsible for caching) were always updated correctly.
The commit includes a test that checks that the update is now
reflected in the values of the dynamic variables.
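The essence of the fix, as a sketch against the standard plugin
system-variable update hook (the variable names are illustrative):

  #include <mysql/plugin.h>

  static long cache_timeout;  /* internal variable used by the cache */

  /* update hook of a dynamic system variable */
  static void cache_timeout_update(MYSQL_THD, struct st_mysql_sys_var *,
                                   void *var_ptr, const void *save)
  {
    long new_val= *static_cast<const long*>(save);
    cache_timeout= new_val;                 /* was always done */
    *static_cast<long*>(var_ptr)= new_val;  /* the forgotten copy */
  }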
For the plugin to work properly, we need support for key versioning,
and for this, the kv storage in Hashicorp Vault must be created with
version 2 or higher. This commit adds such a check performed during
plugin initialization.
Note: checking for kv storage version during plugin initialization
can be disabled via --hashicorp-key-management-check-kv-version=off
command-line option or via the corresponding option in the server
configuration files.
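For illustration, such a check could look roughly like this over the Vault
HTTP API. The /v1/sys/mounts endpoint and its "options"/"version" fields are
part of the documented API; the function names and the naive substring match
(standing in for real JSON parsing) are assumptions of this sketch:

  #include <curl/curl.h>
  #include <string>

  static size_t collect(char *ptr, size_t size, size_t nmemb, void *userdata)
  {
    static_cast<std::string*>(userdata)->append(ptr, size * nmemb);
    return size * nmemb;
  }

  /* Returns true if the mount list reports kv storage version 2. */
  static bool kv_is_v2(const char *vault_addr, const char *token)
  {
    std::string url= std::string(vault_addr) + "/v1/sys/mounts";
    std::string body, hdr= std::string("X-Vault-Token: ") + token;
    CURL *curl= curl_easy_init();
    if (!curl) return false;
    struct curl_slist *hdrs= curl_slist_append(nullptr, hdr.c_str());
    curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
    curl_easy_setopt(curl, CURLOPT_HTTPHEADER, hdrs);
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, collect);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, &body);
    CURLcode rc= curl_easy_perform(curl);
    curl_slist_free_all(hdrs);
    curl_easy_cleanup(curl);
    if (rc != CURLE_OK) return false;
    /* crude stand-in for JSON parsing of <mount>/ -> options -> version */
    return body.find("\"version\":\"2\"") != std::string::npos;
  }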
According to the Hashicorp Vault API specifications,
the URL to access the keys must include the "/v1/" prefix
at the beginning of the path. This commit adds this parameter
check, as well as a check for the presence of at least one
letter in the hostname inside the URL and in the secret
store name (after "/v1/").
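A simplified sketch of the shape of these checks (the plugin's actual
parsing is more thorough; the helper names are illustrative):

  #include <cctype>
  #include <cstring>

  static bool has_letter(const char *s, size_t len)
  {
    for (size_t i= 0; i < len; i++)
      if (isalpha((unsigned char) s[i]))
        return true;
    return false;
  }

  static bool vault_url_ok(const char *url)
  {
    const char *host= strstr(url, "://");
    host= host ? host + 3 : url;
    const char *path= strchr(host, '/');
    if (!path || strncmp(path, "/v1/", 4))   /* path must start with /v1/ */
      return false;
    if (!has_letter(host, (size_t)(path - host)))  /* letter in hostname */
      return false;
    return has_letter(path + 4, strlen(path + 4)); /* letter after /v1/ */
  }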
This commit adds an indication of the ID of the key that was not found
(and, when appropriate, also an indication of the version number
of the key) in the log file, making it easier to find errors.
- Authentication is done using Hashicorp Vault's token
authentication method;
- If additional client authentication is required, then the
path to the CA authentication bundle file may be passed
as a plugin parameter;
- The creation and management of keys is carried
out using the Hashicorp Vault KMS and its tools;
- Key values are stored as hexadecimal strings;
- Caching of key values is supported.
- Implemented a time-invalidated cache for key values and
for key version numbers received from the Hashicorp Vault
server (see the sketch after this list);
- The plugin uses libcurl (https) as an interface to
the HashiCorp Vault server;
- JSON parsing is performed through the JSON service
(through the include/mysql/service_json.h);
- HashiCorp Vault 1.2.4 was used for development and testing.
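The time-invalidated cache mentioned above can be pictured roughly as
follows (a minimal sketch; the plugin's real cache is keyed on key id and
version and is guarded by a mutex):

  #include <chrono>
  #include <map>

  template <typename K, typename V>
  class TimedCache
  {
    struct Entry { V value; std::chrono::steady_clock::time_point stored; };
    std::map<K, Entry> map_;
    std::chrono::milliseconds ttl_;
  public:
    explicit TimedCache(std::chrono::milliseconds ttl) : ttl_(ttl) {}

    bool get(const K &k, V *out)
    {
      auto it= map_.find(k);
      if (it == map_.end())
        return false;
      if (std::chrono::steady_clock::now() - it->second.stored > ttl_)
      {
        map_.erase(it);            /* entry expired: drop it, report miss */
        return false;
      }
      *out= it->second.value;
      return true;
    }

    void put(const K &k, const V &v)
    {
      map_[k]= { v, std::chrono::steady_clock::now() };
    }
  };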
If UPDATE/DELETE does not change the data, it is excluded from
replication. We now force replication of such events when they trigger
partition auto-creation.
For ROLLBACK it is as simple as setting the OPTION_KEEP_LOG
flag; trans_cannot_safely_rollback() does the rest.
For UPDATE/DELETE .. LIMIT 0 we make additional binlog_query() calls
at the early points of return.
As a safety measure we also convert row format into statement format when
needed. The condition is decided by binlog_need_stmt_format(): if there are
already some row events in the cache we don't need the conversion, as the
table open of a row event will trigger auto-creation anyway.
Multi-update/delete works via mysql_select(). There are no early points
of return, so binlogging is always checked by
send_eof()/abort_resultset(). But we must comply with the above
measure of converting to statement format.
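The binlog_need_stmt_format() decision above reduces to roughly this
predicate (a sketch; the real helper lives in the server and its exact form
may differ):

  /*
    Force statement format only when the statement must be binlogged
    and no row events are cached yet: a cached row event's table open
    would trigger the auto-creation anyway.
  */
  static bool binlog_need_stmt_format(bool log_current_statement,
                                      bool have_pending_row_events)
  {
    return log_current_statement && !have_pending_row_events;
  }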
:: Syntax change ::
Keyword AUTO enables history partition auto-creation.
Examples:
CREATE TABLE t1 (x int) WITH SYSTEM VERSIONING
PARTITION BY SYSTEM_TIME INTERVAL 1 HOUR AUTO;
CREATE TABLE t1 (x int) WITH SYSTEM VERSIONING
PARTITION BY SYSTEM_TIME INTERVAL 1 MONTH
STARTS '2021-01-01 00:00:00' AUTO PARTITIONS 12;
CREATE TABLE t1 (x int) WITH SYSTEM VERSIONING
PARTITION BY SYSTEM_TIME LIMIT 1000 AUTO;
Or with explicit partitions:
CREATE TABLE t1 (x int) WITH SYSTEM VERSIONING
PARTITION BY SYSTEM_TIME INTERVAL 1 HOUR AUTO
(PARTITION p0 HISTORY, PARTITION pn CURRENT);
To disable or enable auto-creation one can use ALTER TABLE by adding
or removing AUTO from partitioning specification:
CREATE TABLE t1 (x int) WITH SYSTEM VERSIONING
PARTITION BY SYSTEM_TIME INTERVAL 1 HOUR AUTO;
# Disables auto-creation:
ALTER TABLE t1 PARTITION BY SYSTEM_TIME INTERVAL 1 HOUR;
# Enables auto-creation:
ALTER TABLE t1 PARTITION BY SYSTEM_TIME INTERVAL 1 HOUR AUTO;
If the rest of the partitioning specification is identical to CREATE TABLE,
no repartitioning will be done (for details see MDEV-27328).
:: Description ::
Before executing a history-generating DML command (see the list of commands
below) add N history partitions, so that N is sufficient for the potentially
generated history. N > 1 may be required when history partitions are switched
by INTERVAL and current_timestamp is N intervals past the
boundary of the last history partition.
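As an illustration of the arithmetic (all names are made up for the sketch):

  #include <cmath>

  /*
    History partition i covers [start + i*interval, start + (i+1)*interval).
    Return how many history partitions must be added so that `now` falls
    inside the last one.
  */
  static unsigned partitions_to_add(double now, double start,
                                    double interval_sec,
                                    unsigned existing_history_parts)
  {
    if (now < start) return 0;
    unsigned required=
      (unsigned) std::floor((now - start) / interval_sec) + 1;
    return required > existing_history_parts
           ? required - existing_history_parts : 0;
  }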
If the last history partition equals or exceeds LIMIT records, a new history
partition is created and selected as the working partition. According to
MDEV-28411, partitions cannot be switched (or created) while a command is
running. Thus LIMIT is not a strict limitation, and the history partition
size must be planned as the LIMIT value plus the average amount of history
one DML command can generate.
Auto-creation is implemented by a synchronous fast_alter_partition_table()
call from the thread of the executed DML command, before the command itself
is run (via a fallback-and-retry mechanism similar to the Discovery feature;
see Open_table_context).
The names for newly added partitions are generated like default partition
names, with the extension of MDEV-22155 (which avoids name clashes by
extending the assignment counter to the next free-enough gap).
These DML commands can trigger auto-creation:
DELETE (including multitable DELETE, excluding DELETE HISTORY)
UPDATE (including multitable UPDATE)
REPLACE (including REPLACE .. SELECT)
INSERT .. ON DUPLICATE KEY UPDATE (including INSERT .. SELECT .. ODKU)
LOAD DATA .. REPLACE
:: Bug fixes ::
MDEV-23642 Locking timeout caused by auto-creation affects original DML
The reasons for this are:
- Do not disrupt the main business process (the history is an auxiliary
service);
- Consequences are non-fatal (history is not lost, but goes into the wrong
partition; fixed by a partitioning rebuild);
- The application has more freedom to decide whether to fail in this case:
it may read the warning info and find the corresponding error number;
- While a non-failing command is easy for an application to handle (and fail
if needed), the opposite is hard to handle: there are no automatic actions
to fix a failed command and retry; DBA intervention is required, and until
then the application is non-functional.
MDEV-23639 Auto-create does not work under LOCK TABLES or inside triggers
Don't do tdc_remove_table() for OT_ADD_HISTORY_PARTITION because it is
not possible in locked tables mode.
LTM_LOCK_TABLES mode (and LTM_PRELOCKED_UNDER_LOCK_TABLES) works out
of the box as fast_alter_partition_table() can reopen tables via
locked_tables_list.
In LTM_PRELOCKED we reopen and relock table manually.
:: More fixes ::
* some_table_marked_for_reopen flag fix
some_table_marked_for_reopen affects only the reopen of
m_locked_tables, i.e. Locked_tables_list::reopen_tables() reopens only
tables from m_locked_tables.
* Unused can_recover_from_failed_open() condition
Can recover_from_failed_open() really be used after
open_and_process_routine()?
:: Reviewed by ::
Sergei Golubchik <serg@mariadb.org>
When we need to add/remove or change LIMIT, INTERVAL, AUTO we have to
recreate partitioning from scratch (via data copy). Such operations
should be done fast. To remove options like LIMIT or INTERVAL one
should write:
alter table t1 partition by system_time;
The command checks whether it is a new or existing SYSTEM_TIME
partitioning. If it is new, it behaves as CREATE would:
it adds the default number of partitions (2). If SYSTEM_TIME partitioning
already existed, it just changes its options: it removes unspecified ones
and adds/changes those specified explicitly. When a partition
list was supplied, it behaves as usual: it does full repartitioning.
Examples:
create or replace table t1 (x int) with system versioning
partition by system_time limit 100 partitions 4;
# Change LIMIT
alter table t1 partition by system_time limit 33;
# Remove LIMIT
alter table t1 partition by system_time;
# This does full repartitioning
alter table t1 partition by system_time limit 33 partitions 4;
# This does data copy as pruning will require records in correct partitions
alter table t1 partition by system_time interval 1 hour
starts '2000-01-01 00:00:00';
# But this works fast, LIMIT will apply to DML commands
alter table t1 partition by system_time limit 33;
To sum up, ALTER for SYSTEM_TIME partitioning does full repartitioning
when:
- INTERVAL was added or changed;
- a partition list or a partition number was specified.
Otherwise it does a fast ALTER TABLE.
Cleaned up dead condition in set_up_default_partitions().
Reviewed by:
Oleksandr Byelkin <sanja@mariadb.com>
Nikita Malyavin <nikitamalyavin@gmail.com>
Rename OPTION_KEEP_LOG -> OPTION_BINLOG_THIS_TRX.
Meaning: the transaction cache will be written to the binlog even on rollback.
Convert log_current_statement to OPTION_BINLOG_THIS_STMT.
Meaning: the statement will be written to the binlog (or the trx binlog cache)
even if it normally wouldn't be.
Setting OPTION_BINLOG_THIS_STMT must always also set OPTION_BINLOG_THIS_TRX,
otherwise the statement won't be logged if the transaction is rolled back.
Use OPTION_BINLOG_THIS to set both.
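In sketch form (the bit positions here are illustrative, not the server's
actual values):

  /* transaction cache goes to the binlog even on rollback */
  #define OPTION_BINLOG_THIS_TRX  (1ULL << 0)
  /* this statement goes to the binlog (or trx cache) unconditionally */
  #define OPTION_BINLOG_THIS_STMT (1ULL << 1)
  /* STMT alone would be lost on rollback, so always set both: */
  #define OPTION_BINLOG_THIS \
    (OPTION_BINLOG_THIS_TRX | OPTION_BINLOG_THIS_STMT)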
Per man dh_missing, not-installed will expand wildcards
since debhelper 11.1. Since Stretch is on debhelper 10.2.5, this won't happen.
As columnstore is still x86_64-only we can use that in the file.
Put man3 pages in libmariadb-dev.install.
Ignore /usr/share/mysql/*.jar because the CI
environment is inconsistent in the availability of
Java to compile those parts.
Make the Debian build fail if it detects that the build (CMake) created
files that are not used in any package nor accounted for in the special
not-installed file.
Stop creating symbolic links in Debian packaging for files that the CMake
build already created.
Document known cases of files that are intentionally not installed.
Leave the rest in the not-installed list for visibility. The list can
later be trimmed down, and having --fail-missing will prevent any new
unaccounted files from being introduced.
Note that despite extensive refactoring in the Debian packaging files,
there were no changes in the packages produced, as verified by comparing
package file lists before and after.
In addition to the binary .deb packages, also remove the version
string from the Debian source package.
Also clean away excess use of the __MARIADB_MAJOR_VER__ constant
and add an inline note that the whole debian-XX.X.flag file scheme
should be removed and replaced by using the new MariaDB server
mysql_upgrade_info file.
Fixes issues like e.g.:
The following packages have unmet dependencies:
mariadb-client : Breaks: mariadb-client-core-10.9
Breaks: mariadb-server-10.9
mariadb-server-core : Breaks: mariadb-client-10.9
Breaks: mariadb-server-10.9
and
[ERROR] Missing Breaks/Replaces found
[ERROR] libmariadb-dev-compat conflicts with libmariadbclient-dev
files: {'/usr/bin/mysql_config'}
Upgrades from Debian 10 "Buster" directly to Debian 12 "Bookworm",
skipping Debian 11 "Bullseye", fail with apt erroring on:
libcrypt.so.1: cannot open shared object file
This is an intentional OpenSSL transition as described in
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=993755
Hence, clean away such tests.
Also other minor cleanups in salsa-ci.yml.
Switch to using bullseye-backports where buster-backports was used, or
remove steps that only worked on buster-backports. For example, Percona
XtraDB Cluster 5.7 was available up until Buster but is no longer
available in Bullseye, so remove it.
Sort and organize the Debian packaging files.
Also revert 4d03269425 that was done in vain.
For the sake of CI we do want to have working upgrades from previous 10.9
releases and it is doable with another kind of fix in a later commit.
(cherry-pick into preview-10.9-MDEV-27021-explain tree)
Expression_cache_tmptable object uses an Expression_cache_tracker object
to report the statistics.
In the common scenario, the Expression_cache_tmptable destructor sets
tracker->cache=NULL. The tracker object survives after the expression
cache is deleted, and one may call cache_tracker->fetch_current_stats()
on it with no harm.
However, a degenerate cache with no parameters does not set
tracker->cache=NULL in the destructor, which
results in an attempt to use freed data in the
cache_tracker->fetch_current_stats() call.
Fixed by setting tracker->cache to NULL and wrapping the assignment into
a function.
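In outline, the fix looks like this (a sketch with an abbreviated class
layout; only the pointer handling is shown):

  struct Expression_cache_tracker
  {
    void *cache;                   /* back-pointer; tracker outlives cache */
  };

  class Expression_cache_tmptable
  {
    Expression_cache_tracker *tracker;

    void disable_tracker()         /* the wrapped assignment */
    {
      if (tracker)
        tracker->cache= nullptr;   /* safe for later fetch_current_stats() */
      tracker= nullptr;
    }
  public:
    Expression_cache_tmptable() : tracker(nullptr) {}
    ~Expression_cache_tmptable()   /* runs on every path, degenerate too */
    {
      disable_tracker();
    }
  };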
- Describe the lifetime of EXPLAIN data structures in
sql_explain.h:ExplainDataStructureLifetime.
- Make Item_field::set_field() call set_refers_to_temp_table()
when it refers to a temp. table.
- Introduce the QT_DONT_ACCESS_TMP_TABLES flag for Item::print.
It directs Item_field::print to not try to access its
temp table.
- Introduce Explain_query::notify_tables_are_closed()
and call it right before the query closes its tables.
- Make Explain data structures' print_explain_json() methods
accept a "no_tmp_tbl" parameter, which means passing
QT_DONT_ACCESS_TMP_TABLES when printing items.
- Make Show_explain_request::call_in_target_thread() not call
set_current_thd(). This wasn't needed as the code inside
lex->print_explain() uses output->thd anyway. output->thd
refers to the SHOW command's THD object.
SHOW EXPLAIN/ANALYZE FORMAT=JSON tries to access items that have already been
freed by a call to free_items() during THD::cleanup_after_query().
The solution is to disallow APC calls including SHOW EXPLAIN/ANALYZE
just before the call to free_items().
1. Add explicit indication that the output is produced by
SHOW EXPLAIN/ANALYZE FORMAT=JSON command.
2. Remove useless "r_total_time_ms" field from SHOW ANALYZE FORMAT=JSON
output when there is no timed statistics gathered.
3. Add "r_query_time_in_progress_ms" to the output of SHOW ANALYZE FORMAT=JSON.
EXPLAIN FOR CONNECTION is a MySQL-compatible syntax for SHOW EXPLAIN.
This commit also adds support for FORMAT=JSON to SHOW EXPLAIN,
so the possible options to get JSON-formatted output are:
- SHOW EXPLAIN FORMAT=JSON FOR $con
- EXPLAIN FORMAT=JSON FOR CONNECTION $con
Problem:
========
The test logic checked for the wrong condition to validate that the
slave had caught up with the master. Specifically, it expected the
thread stages of the IO and SQL threads to be “Waiting for
master to send event” and “Slave has read all relay log; waiting for
more updates”, respectively. The problem exposed by this MDEV
is that these states are also the initial slave states before reading
data from the primary (whereas the intended state was having already
read all available events from the primary and now waiting for new
events). This made the MTR test validate data that it had not yet
received, and thereby fail.
Solution:
========
Instead of using the IO/SQL thread states, use the existing helper
functions save_master_gtid.inc and sync_with_master_gtid.inc. Note
that the test result file also needed to be updated to reflect
this fix.
Special thanks to Angelique Sklavounos for pointing out that
--stop-position was not specified in any buildbot failures, as this
led to an IF block in the MTR test that was the source of the test
failure.
Reviewed By
============
Andrei Elkin <andrei.elkin@mariadb.com>
This follows up the previous fix in
commit c3c53926c4 (MDEV-26554).
ha_innobase::delete_table(): Work around the insufficient
metadata locking (MDL) during DML operations by acquiring exclusive
InnoDB table locks on all child tables. Previously, this was only
done on TRUNCATE and ALTER.
ibuf_delete_rec(), btr_cur_optimistic_delete(): Do not invoke
lock_update_delete() during change buffer operations.
The revised trx_t::commit(std::vector<pfs_os_file_t>&) will
hold exclusive lock_sys.latch while invoking fil_delete_tablespace(),
which in turn may invoke ibuf_delete_rec().
dict_index_t::has_locking(): A new predicate, replacing the dummy
!dict_table_is_locking_disabled(index->table). Used for skipping lock
operations during ibuf_delete_rec().
trx_t::commit(std::vector<pfs_os_file_t>&): Release the locks
and remove the table from the cache while holding exclusive
lock_sys.latch.
trx_t::commit_in_memory(): Skip release_locks() if dict_operation holds.
trx_t::commit(): Reset dict_operation before invoking commit_in_memory()
via commit_persist().
lock_release_on_drop(): Release locks while lock_sys.latch is
exclusively locked.
lock_table(): Add a parameter for a pointer to the table.
We must not dereference the table before a lock_sys.latch has
been acquired. If the pointer to the table does not match the table
at that point, the table is invalid and DB_DEADLOCK will be returned.
row_ins_foreign_check_on_constraint(): Improve the checks.
Remove a bogus DB_LOCK_WAIT_TIMEOUT return that was needed
before commit c5fd9aa562 (MDEV-25919).
row_upd_check_references_constraints(),
wsrep_row_upd_check_foreign_constraints(): Simplify checks.
The macro MY_GNUC_PREREQ() was used for testing for some minor
GCC 4 versions before GCC 4.8.5, which is the oldest version
that supports C++11 and which we have depended on ever since
commit d9613b750c.
- InnoDB should avoid the bulk insert operation when the table has active
DDL, because bulk insert writes only one undo log record (TRX_UNDO_EMPTY),
while the logging of concurrent DML at commit time parses undo log
records to get the values and operations.
- Removed ROW_T_EMPTY, ROW_OP_EMPTY and their associated functions,
and also the test case which tries to log ROW_OP_EMPTY
when the table has active DDL.
Analysis: When trying to find a path and handling the match for the path,
the value at the current index in array_counters is not set to 0. This
causes a wrong current step value, which eventually causes a wrong
cur_step->type value.
Fix: Set the value at the current index in array_counters to 0.
Analysis: The counter does not increment while sending rows for a table value
constructor, and so row_number assumes the default value (0 in this case).
Fix: Increment the counter so that it does not keep its default value.
Analysis: There were two kinds of failing tests on buildbot with UBSAN.
1) runtime error: signed integer overflow and
2) runtime error: load of value is not valid value for type
The signed integer overflow was occurring because the addition of two
integers (the size of the JSON array + the item number in the array) was
causing an overflow in json_path_parts_compare. This overflow happens
because a->n_item_end wasn't set.
The second error was occurring because c_path->p.types_used is not
initialized, but the value is used later on to check for a negative path
index.
Fix: For the signed integer overflow, use a->n_item_end only in the case of
a range so that it is set.