Create test for for case insensitive gives a basic warning on creating
a test file and the next thing a user might see is an abort.
ProtectHome and other systemd setting protect system services from
accessing user data. Unfortunately some of our users do put things
on /home due space or other reasons.
Rather than enumberate the systemd options in a very clunkly fragile
way we put an error associated with the "Can't create test file" and
hope the user can work it out from there.
%M tip thanks Sergei.
Fixed memory leak taken place on executing a prepared statement or
a stored routine that querying a view and this view constructed
on an information schema table. For example,
Lets consider the following definition of the view 'v1'
CREATE VIEW v1 AS SELECT table_name FROM information_schema.views
ORDER BY table_name;
Querying this view in PS mode result in hit of assert.
PREPARE stmt FROM "SELECT * FROM v1";
EXECUTE stmt;
EXECUTE stmt; (*)
Running the statement marked with (*) leads to a crash in case
server build with mode to control allocation of a memory from SP/PS
memory root on the second and following executions of PS/SP.
The reason of leaking the memory is that a memory allocated on
processing of FRM file for the view requested from a PS/PS memory
root meaning that this memory be released only when a stored routine
be evicted from SP-cache or a prepared statement be deallocated
that typically happens on termination of a user session.
To fix the issue switch to a memory root specially created for
allocation of short-lived objects that requested on parsing FRM.
In case a table accessed by a PS/SP is dropped after the first execution of
PS/SP and a view created with the same name as a table just dropped then
the second execution of PS/SP leads to allocation of a memory on SP/PS
memory root already marked as read only on first execution.
For example, the following test case:
CREATE TABLE t1 (a INT);
PREPARE stmt FROM "INSERT INTO t1 VALUES (1)";
EXECUTE stmt;
DROP TABLE t1;
CREATE VIEW t1 S SELECT 1;
--error ER_NON_INSERTABLE_TABLE
EXECUTE stmt; # (*)
DROP VIEW t1;
will hit assert on running the statement 'EXECUTE stmt' marked with (*)
when allocation of a memory be performed on parsing the view.
Memory allocation is requested inside the function mysql_make_view
when a view definition being parsed. In order to avoid an assertion
failure, call of the function mysql_make_view() must be moved after
invocation of the function check_and_update_table_version().
It will result in re-preparing the whole PS statement or current
SP instruction that will free currently allocated items and reset
read_only flag for the memory root.
Moved call of the function check_and_update_table_version() just
before the place where the function extend_table_list() is invoked
in order to avoid allocation of memory on a PS/SP memory root
marked as read only. It happens by the reason that the function
extend_table_list() invokes sp_add_used_routine() to add a trigger
created for the table in time frame between execution the statement
EXECUTE `stmt_id` .
For example, the following test case
create table t1 (a int);
prepare stmt from "insert into t1 (a) value (1)";
execute stmt;
create trigger t1_bi before insert on t1 for each row
set @message= new.a;
execute stmt; # (*)
adds the trigger t1_bi to a list of used routines that involves
allocation of a memory on PS memory root that has been already marked
as read only on first run of the statement 'execute stmt'.
In result, when the statement marked with (*) is executed it results in
assert hit.
To fix the issue call the function check_and_update_table_version()
before invocation of extend_table_list() to force re-compilation of
PS/SP that resets read-only flag of its memory root.
This patch adds support for controlling of memory allocation
done by SP/PS that could happen on second and following executions.
As soon as SP or PS has been executed the first time its memory root
is marked as read only since no further memory allocation should
be performed on it. In case such allocation takes place it leads to
the assert hit for invariant that force no new memory allocations
takes place as soon as the SP/PS has been marked as read only.
The feature for control of memory allocation made on behalf SP/PS
is turned on when both debug build is on and the cmake option
-DWITH_PROTECT_STATEMENT_MEMROOT is set.
The reason for introduction of the new cmake option
-DWITH_PROTECT_STATEMENT_MEMROOT
to control memory allocation of second and following executions of
SP/PS is that for the current server implementation there are too many
places where such memory allocation takes place. As soon as all such
incorrect allocations be fixed the cmake option
-DWITH_PROTECT_STATEMENT_MEMROOT
can be removed and control of memory allocation made on second and
following executions can be turned on only for debug build. Before
every incorrect memory allocation be fixed it makes sense to guard
the checking of memory allocation on read only memory by extra cmake
option else we would get a lot of failing test on buildbot.
Moreover, fixing of all incorrect memory allocations could take pretty
long period of time, so for introducing the feature without necessary
to wait until all places throughout the source code be fixed it makes
sense to add the new cmake option.
Summary:
This patch enables possible index optimization when
the WHERE clause has an IN condition of the form:
signed_or_unsigned_column IN (signed_or_unsigned_constant,
signed_or_unsigned_constant
[,signed_or_unsigned_constant]*)
when the IN list constants are of different signess, e.g.:
WHERE signed_column IN (signed_constant, unsigned_constant ...)
WHERE unsigned_column IN (signed_constant, unsigned_constant ...)
Details:
In a condition like:
WHERE unsigned_predicant IN (1, LONGLONG_MAX + 1)
comparison handlers for individual (predicant,value) pairs are
calculated as follows:
* unsigned_predicant and 1 produce &type_handler_newdecimal
* unsigned_predicant and (LONGLONG_MAX + 1) produce &type_handler_slonglong
The old code decided that it could not use bisection because
the two pairs had different comparison handlers.
As a result, bisection was not allowed, and, in case of
an indexed integer column predicant the index on the column was not used.
The new code catches special cases like:
signed_predicant IN (signed_constant, unsigned_constant)
unsigned_predicant IN (signed_constant, unsigned_constant)
It enables bisection using in_longlong, which supports a mixture
of predicant and values of different signess.
In case when the predicant is an indexed column this change
automatically enables index range optimization.
Thanks to Vicențiu Ciorbaru for proposing the idea and for preparing MTR tests.
For clang compiler the compiler's flag -Wno-unused-but-set-variable
was set based on compiler version. This approach could result in
false positive detection for presence of compiler option since
only first three groups of digits in compiler version taken into account
and it could lead to inaccuracy in determining of supported compiler's
features.
Correct way to detect options supported by a compiler is to use
the macros MY_CHECK_CXX_COMPILER_FLAG and to check the result of
variable with prefix have_CXX__
So, to check whether compiler does support the option
-Wno-unused-but-set-variable
the macros
MY_CHECK_CXX_COMPILER_FLAG(-Wno-unused-but-set-variable)
should be called and the result variable
have_CXX__Wno_unused_but_set_variable
be tested for assigned value.
When the SQL driver thread goes to wait for room in the parallel slave
worker queue, there was a race where a kill at the right moment could
be ignored and the wait proceed uninterrupted by the kill.
Fix by moving the THD::check_killed() to occur _after_ doing ENTER_COND().
This bug was seen as sporadic failure of the testcase rpl.rpl_parallel
(rpl.rpl_parallel_gco_wait_kill since 10.5), with "Slave stopped with
wrong error code".
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
Restore code to make InnoDB choose the second transaction as a deadlock
victim if two transactions deadlock that need to commit in-order for
parallel replication. This code was erroneously removed when VATS was
implemented in InnoDB.
Also add a test case for InnoDB choosing the right deadlock victim.
Also fixes this bug, with testcase that reliably reproduces:
MDEV-28776: rpl.rpl_mark_optimize_tbl_ddl fails with timeout on sync_with_master
Note: This should be null-merged to 10.6, as a different fix is needed
there due to InnoDB locking code changes.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
Remove the exception that InnoDB does not report auto-increment locks waits
to the parallel replication.
There was an assumption that these waits could not cause conflicts with
in-order parallel replication and thus need not be reported. However, this
assumption is wrong and it is possible to get conflicts that lead to hangs
for the duration of --innodb-lock-wait-timeout. This can be seen with three
transactions:
1. T1 is waiting for T3 on an autoinc lock
2. T2 is waiting for T1 to commit
3. T3 is waiting on a normal row lock held by T2
Here, T3 needs to be deadlock killed on the wait by T1.
Note: This should be null-merged to 10.6, as a different fix is needed
there due to InnoDB lock code changes.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
Field_varstring::get_copy_func() did not take into account
that functions do_varstring1[_mb], do_varstring2[_mb] do not support
compressed data.
Changing the return value of Field_varstring::get_copy_func()
to `do_field_string` if there is a compresion and truncation
at the same time. This fixes the problem, so now it works as follows:
- val_str() uncompresses the data
- The prefix is then calculated on the uncompressed data
Additionally, introducing two new copying functions
- do_varstring1_no_truncation()
- do_varstring2_no_truncation()
Using new copying functions in cases when:
- a Field_varstring with length_bytes==1 is changing to a longer
Field_varstring with length_bytes==1
- a Field_varstring with length_bytes==2 is changing to a longer
Field_varstring with length_bytes==2
In these cases we don't care neither of compression nor
of multi-byte prefixes: the entire data gets fully copied
from the source column to the target column as is.
This is a kind of new optimization, but this also was needed
to preserve existing MTR test results.
This is also related to
MDEV-31348 Assertion `last_key_entry >= end_pos' failed in virtual bool
JOIN_CACHE_HASHED::put_record()
Valgrind exposed a problem with the join_cache for hash joins:
=25636== Conditional jump or move depends on uninitialised value(s)
==25636== at 0xA8FF4E: JOIN_CACHE_HASHED::init_hash_table()
(sql_join_cache.cc:2901)
The reason for this was that avg_record_length contained a random value
if one had used SET optimizer_switch='optimize_join_buffer_size=off'.
This causes either 'random size' memory to be allocated (up to
join_buffer_size) which can increase memory usage or, if avg_record_length
is less than the row size, memory overwrites in thd->mem_root, which is
bad.
Fixed by setting avg_record_length in JOIN_CACHE_HASHED::init()
before it's used.
There is no test case for MDEV-31893 as valgrind of join_cache_notasan
checks that.
I added a test case for MDEV-31348.
Revert the old work-around for buggy fdatasync() on Linux ext3. This bug was
fixed in Linux > 10 years ago back to kernel version at least 3.0.
Reviewed-by: Marko Mäkelä <marko.makela@mariadb.com>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
This is also related to
MDEV-31348 Assertion `last_key_entry >= end_pos' failed in virtual bool
JOIN_CACHE_HASHED::put_record()
Valgrind exposed a problem with the join_cache for hash joins:
=25636== Conditional jump or move depends on uninitialised value(s)
==25636== at 0xA8FF4E: JOIN_CACHE_HASHED::init_hash_table()
(sql_join_cache.cc:2901)
The reason for this was that avg_record_length contained a random value
if one had used SET optimizer_switch='optimize_join_buffer_size=off'.
This causes either 'random size' memory to be allocated (up to
join_buffer_size) which can increase memory usage or, if avg_record_length
is less than the row size, memory overwrites in thd->mem_root, which is
bad.
Fixed by setting avg_record_length in JOIN_CACHE_HASHED::init()
before it's used.
There is no test case for MDEV-31893 as valgrind of join_cache_notasan
checks that.
I added a test case for MDEV-31348.
There was two related problems:
(1) Galera node that is defined as a slave to async MariaDB
master at restart might do SST (state stransfer) and
part of that it will copy mysql.gtid_slave_pos table.
Problem is that updates on that table are not replicated
on a cluster. Therefore, table from donor that is not
slave is copied and joiner looses gtid position it was
and start executing events from wrong position of the binlog.
This incorrect position could break replication and
causes node to be dropped and requiring user action.
(2) Slave sql thread might start executing events before
galera is ready (wsrep_ready=ON) and that could also
cause node to be dropped from the cluster.
In this fix we enable replication of mysql.gtid_slave_pos
table on a cluster. In this way all nodes in a cluster
will know gtid slave position and even after SST joiner
knows correct gtid position to start.
Furthermore, we wait galera to be ready before slave
sql thread executes any events to prevent too early
execution.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
when validating vcol's (default, check, etc) in ALTER TABLE
vcol_info->flags are modified in place. This means that if ALTER TABLE
fails for any reason we need to restore them to their original values.
(mroonga was freeing the memory on ::reset() but not on ::close())
After MDEV-21580 the truncation of SORT_FIELD::length
set_if_smaller(sortorder->length, thd->variables.max_sort_length)
became conditional:
if (is_variable_sized())
set_if_smaller(length, thd->variables.max_sort_length)
To provide correct functioning of is_variable_sized() SORT_FIELD::type
must be set properly. This commit adds the necessary initialization
of SORT_FIELD::type to JOIN_TAB::remove_duplicates() as it is done
in filesort's sortlength() function.
DBUG_ASSERT is added to sortlength() just in case to prevent
a possible uint32 overflow
make TRANSACTIONAL table option behave similar to other engine-defined
table options. If the engine doesn't suport it:
* if specified expicitly in CREATE or ALTER - it's ER_UNKNOWN_OPTION
* an error or a warning depending on sql_mode IGNORE_BAD_TABLE_OPTIONS
* in ALTER TABLE from the engine that suppors it to the engine that
doesn't - silently preserved (no warning)
* it is commented out in SHOW CREATE unless IGNORE_BAD_TABLE_OPTIONS
* invoke check_expression() for all vcol_info's in
mysql_prepare_create_table() to check for FK CASCADE
* also check for SET NULL and SET DEFAULT
* to check against existing FKs when a vcol is added in ALTER TABLE,
old FKs must be added to the new_key_list just like other indexes are
* check columns recursively, if vcol1 references vcol2,
flags of vcol2 must be taken into account
* remove check_table_name_processor(), put that logic under
check_vcol_func_processor() to avoid walking the tree twice
mark old keys in the ALTER TABLE with the `old` flag, not with
the `key_create_info.check_for_duplicate_indexes`.
This allows to mark old foreign keys too.
don't construct open ranges from prefix blob keys for < (less than)
just as it's already done for > (greater than)
because prefix KEY_PART doesn't create prefix Field for blobs
(see open_table_from_share() near "Create a new field for the key part"),
so stored_field_cmp_to_item() will compare the original field to the
value not taking the prefix length into account.
A simple "SET SESSION gtid_seq_no= DEFAULT" did not work, it would straight
up crash the server! Also, explicitly setting gtid_seq_no to 0 gave an error
in --gtid-strict-mode=1.
Setting to DEFAULT or 0 should disable any prior setting of
gtid_seq_no, so that the next transaction is allocated the next GTID
in sequence, as normal.
Reviewed-by: Monty <monty@mariadb.org>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
differently react to SQL_MODE => unusable SHOW CREATE
Use abort_on_warning dependent on strict mode over create new table
like it is done for copy data and inplace alter.
MDEV-31749 sporadic assert in MDEV-30619 new test
If the workers of a parallel replica are busy (potentially with long
queues), but the SQL thread has no events left to distribute (so it
goes idle), then the next event that comes from the primary will
update mi->last_master_timestamp with its timestamp, even if the
workers have not yet finished.
This patch changes the parallel replica logic which updates
last_master_timestamp after idling from using solely sql_thread_caught_up
(added in MDEV-29639) to using the latter with rli queued/dequeued
event counters.
That is, if the queued count is equal to the dequeued count, it
means all events have been processed and the replica is considered
idle when the driver thread has also distributed all events.
Low level details of the commit include
- to make a more generalized test for Seconds_Behind_Master on
the parallel replica, rpl_delayed_parallel_slave_sbm.test
is renamed to rpl_parallel_sbm.test for this purpose.
- pause_sql_thread_on_next_event usage was removed
with the MDEV-30619 fixes. Rather than remove it, we adapt it
to the needs of this test case
- added test case to cover SBM spike of relay log read and LMT
update that was fixed by MDEV-29639
- rpl_seconds_behind_master_spike.test is made to use
the negate_clock_diff_with_master debug eval.
Reviewed By:
============
Andrei Elkin <andrei.elkin@mariadb.com>
We introduce simple plugin dependency. A plugin init function may
return HA_ERR_RETRY_INIT. If this happens during server startup when
the server is trying to initialise all plugins, the failed plugins
will be retried, until no more plugins succeed in initialisation or
want to be retried.
This will fix spider init bugs which is caused in part by its
dependency on Aria for initialisation.
The reason we need a new return code, instead of treating every
failure as a request for retry, is that it may be impossible to clean
up after a failed plugin initialisation. Take InnoDB for example, it
has a global variable `buf_page_cleaner_is_active`, which may not
satisfy an assertion during a second initialisation try, probably
because InnoDB does not expect the initialisation to be called
twice.
Since TLS server certificate verification is a client
only option, this flag is removed in both client (C/C)
and MariaDB server capability flags.
This patch reverts commit 89d759b93e
(MySQL Bug #21543) and stores the server certificate validation
option in mysql->options.extensions.
Restrict vcol_cleanup_expr() in close_thread_tables() to only simple
locked tables mode. Prelocked is cleaned up like normal statement: in
close_thread_table().
First UPDATE under START TRANSACTION does nothing (nstate= nstate),
but anyway generates history. Since update vector is empty we get into
(!uvect->n_fields) branch which only adds history row, but does not do
update. After that we get current row with wrong (old) row_start value
and because of that second UPDATE tries to insert history row again
because it sees trx->id != row_start which is the guard to avoid
inserting multiple trx_id-based history rows under same transaction
(because we have same trx_id and we get duplicate error and this bug
demostrates that). But this try anyway fails because PK is based on
row_end which is constant under same transaction, so PK didn't change.
The fix moves vers_make_update() to an earlier stage of
calc_row_difference(). Therefore it prepares update vector before
(!uvect->n_fields) check and never gets into that branch, hence no
need to handle versioning inside that condition anymore.
Now trx->id and row_start are equal after first UPDATE and we don't
try to insert second history row.
== Cleanups and improvements ==
ha_innobase::update_row():
vers_set_fields and vers_ins_row are cleaned up into direct condition
check. SQLCOM_ALTER_TABLE check now is not used as this is dead code,
assertion is done instead.
upd_node->is_delete is set in calc_row_difference() just to keep
versioning code as much in one place as possible. vers_make_delete()
is still located in row_update_for_mysql() as this is required for
ha_innodbase::delete_row() as well.
row_ins_duplicate_error_in_clust():
Restrict DB_FOREIGN_DUPLICATE_KEY to the better conditions.
VERSIONED_DELETE is used specifically to help lower stack to
understand what caused current insert. Related to MDEV-29813.
On create table tmp as select ... we exited Item_func::fix_fields()
with error. fix_fields_if_needed('foo' or 'bar') failed and we
returned true, but already changed const_item_cache. So the item is in
inconsistent state: fixed == false and const_item_cache == false.
Now we cleanup the item before the return if Item_func::fix_fields()
fails to process.
Constraints processing row_ins_check_foreign_constraint() was not
called because row_upd_check_references_constraints() didn't see
update as delete: node->is_delete was false.
Since MDEV-30378 we check for TRG_EVENT_DELETE to detect versioned
delete in ha_innobase::update_row().
Now we can use TRG_EVENT_DELETE to set upd_node->is_delete, so
constraints processing is triggered correctly.