Commit graph

73273 commits

Author SHA1 Message Date
Sachin Kumar
9f4ba624e2 MDEV-24667 LOAD DATA INFILE on temporary table not written to slave binlog
Problem: In regular replication, when the master binlogged in statement format,
the slave might not have written an event to its own binary log when the Query
event targeted a temporary table.
Specifically this was observed with LOAD DATA INFILE.

This effect was possible because, unlike the master, the slave holds temporary
tables in its pool, so the master-side check for the existence of a
temporary table at the binlog format decision did not apply.

Solution: replace THD::has_thd_temporary_tables() with
THD::has_temporary_tables, which allows identifying temporary table
presence on either side.

--
Reviewed by Andrei Elkin.
2022-03-25 10:49:48 +02:00
sjaakola
9b2fa2ae8e MDEV-24845 Oddities around innodb_fatal_semaphore_wait_threshold and global.innodb_disallow_writes
This commit adds an mtr test for reproducing a scenario where, despite
innodb_disallow_writes blocking, writes to the file system can still happen.

The test launches a garbd node, which triggers one of the cluster nodes to switch to
SST donor state. In this state, all disk activity should be halted, and e.g.
innodb_disallow_writes has been set. The test records md5sum aggregate over mariadb
data directory when the node enters the donor state, and records another md5sum
when the node leaves the donor state. If there is no IO activity in data directory, these
hashes should be equal.
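
The before/after comparison described above can be sketched in Python. This is only an illustration of the idea, not the mtr test itself: the function names, the scratch directory and the file name used in the demo are assumptions, and the real test may aggregate the md5sums differently.

```python
import hashlib
import os
import tempfile


def directory_digest(root):
    """Return one MD5 hex digest covering every file under root.

    Directories and files are visited in sorted order so that the
    result is deterministic for unchanged content.
    """
    digest = hashlib.md5()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # fix the traversal order of subdirectories
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # Mix in the relative path so that renames are detected too.
            digest.update(os.path.relpath(path, root).encode())
            with open(path, "rb") as handle:
                for chunk in iter(lambda: handle.read(65536), b""):
                    digest.update(chunk)
    return digest.hexdigest()


def demo():
    """Hash a scratch directory twice without changes, then after a write."""
    with tempfile.TemporaryDirectory() as datadir:
        data_file = os.path.join(datadir, "datafile")  # hypothetical file name
        with open(data_file, "wb") as handle:
            handle.write(b"page0")
        on_enter = directory_digest(datadir)
        unchanged = directory_digest(datadir)
        with open(data_file, "ab") as handle:
            handle.write(b"page1")  # simulated IO while in donor state
        on_leave = directory_digest(datadir)
    return on_enter == unchanged, on_enter == on_leave
```

In this sketch, equal digests mean no file content changed between the two measurements, and any write in between makes the digests differ.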

For this test, the donor state processing has been instrumented so that the SST donor thread can be
stopped when entering the donor state. The test uses this new dbug sync point
to control when to record the md5sums.

A new SST script was added: wsrep_sst_backup. garbd uses the backup method to launch the donor
node, which calls this script and enters the donor state.

The backup script could be later extended as general purpose backup method for the cluster.

This commit also fixes a race condition in wsrep_sst_rsync, which happens like this:
* the wsrep_sst_rsync script requests a table flush,
  and then waits in a loop until mariadbd has created the file tables_flushed
  as confirmation that FLUSH TABLES has completed
* mariadbd's SST donor thread wakes up for the flush tables request, performs FTWRL,
  and then creates the tables_flushed file
* note that the SST script will now continue and start sending with rsync
* mariadbd's SST donor thread now calls sst_disallow_writes()
  so that InnoDB sets up the disk IO blockage; however, rsync may already be running at this point

This commit fixes the race condition by performing all disk IO blocking before
creating the tables_flushed file.
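
The ordering problem can be shown with a small Python simulation. It is a sketch under stated assumptions: a `threading.Event` stands in for the tables_flushed file, the log strings are invented labels, and the "old" branch deliberately forces the unlucky interleaving so that the outcome is reproducible.

```python
import threading


def run_donor(block_io_first):
    """Simulate the donor thread and the SST script; return the event log."""
    log = []
    tables_flushed = threading.Event()  # stands in for the tables_flushed file

    def sst_script():
        # The script polls for the file, then starts the transfer.
        tables_flushed.wait()
        log.append("rsync starts")

    script = threading.Thread(target=sst_script)
    script.start()

    log.append("FTWRL")
    if block_io_first:
        # Fixed order: block disk IO, then signal the script.
        log.append("disk IO blocked")
        tables_flushed.set()
        script.join()
    else:
        # Old order: signal first; the script may run before IO is blocked.
        tables_flushed.set()
        script.join()  # force the unlucky interleaving for the demo
        log.append("disk IO blocked")
    return log
```

With `block_io_first=True` the transfer can only start after the IO blockage is in place, which is the property the fix relies on.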

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-03-25 10:04:15 +02:00
Alexander Barkov
6437b30404 MDEV-28166 sql_mode=ORACLE: fully qualified package function calls do not work: db.pkg.func()
Also fixes MDEV-19328 sql_mode=ORACLE: Package function in VIEW
2022-03-25 10:46:59 +04:00
Daniel Black
88ce8a3d8b Merge 10.7 into 10.8 2022-03-25 15:06:56 +11:00
Daniel Black
8b92e346b1 Merge 10.6 into 10.7 2022-03-25 14:31:59 +11:00
Daniel Black
ec62f46a61 Merge 10.5 to 10.6 2022-03-25 11:31:49 +11:00
Brandon Nesterenko
cd88b0831f DBAAS-7828: Primary/replica: configuration change of autocommit=0 can not be applied
Problem:
========
When the mysql.gtid_slave_pos table uses the InnoDB engine, and
mysqld starts, it reads the table and begins a transaction. After
reading the value, it should end the transaction and release all
associated locks. The bug reported in DBAAS-7828 shows that when
autocommit is off, the locks are not released, resulting in
indefinite hangs on future attempts to change gtid_slave_pos. In
particular, the transaction was not properly finalized because
thd->server_status was not updated to reflect the end of the
transaction.

Solution:
========
This patch updates the code to properly commit the transaction after
reading gtid_slave_pos during mysqld start-up.

Reviewed By:
============
Andrei Elkin <andrei.elkin@mariadb.com>
2022-03-24 12:00:40 -06:00
Brandon Nesterenko
32ab6219be MDEV-25580: rpl.rpl_semi_sync_slave_compressed_protocol crashes because of wrong packet
Problem:
========
When both semi-sync and slave compression are enabled, the numbering
on packet headers can become out of sync between the primary and
replica servers. More specifically, after the master flushes its
write, it should increment the counters that track packets. The
bug is such that the master only updates the normal packet counter
and leaves the compressed packet counter alone.

Solution:
========
After the master flushes, additionally increment the compressed
packet counter.
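
A toy Python model of the two counters, to illustrate why leaving one of them behind desynchronizes the two sides. The class, the field names and the assumption that the replica advances both counters on every flush are illustrative only and do not mirror the server's network code.

```python
from dataclasses import dataclass


@dataclass
class PacketCounters:
    """Hypothetical pair of sequence counters kept by each side."""
    pkt_nr: int = 0           # normal packet counter
    compress_pkt_nr: int = 0  # compressed packet counter

    def after_flush(self, bump_compressed):
        self.pkt_nr += 1
        if bump_compressed:
            self.compress_pkt_nr += 1


def counters_match(flushes, bump_compressed):
    """Return True if master and replica agree after the given flushes."""
    master = PacketCounters()
    replica = PacketCounters()
    for _ in range(flushes):
        master.after_flush(bump_compressed)
        replica.after_flush(True)  # assumption: replica expects both to advance
    return master == replica
```

In this model `counters_match(3, False)` reproduces the mismatch and `counters_match(3, True)` corresponds to the fixed behaviour.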

Reviewed By:
============
Andrei Elkin: <andrei.elkin@mariadb.com>
2022-03-24 07:25:22 -06:00
Daniel Black
e86986a157 Merge 10.6 into 10.7 2022-03-24 18:57:07 +11:00
Igor Babaev
bbf02c85ba MDEV-24281 Reading from freed memory when running main.view with --ps-protocol
This bug could affect prepared statements for the command CREATE VIEW with
specification that contained unnamed basic constant in select list. If
generation of a valid name for the corresponding view column required
resolution of conflicts with names of other columns that were explicitly
defined then execution of such prepared statement and following deallocation
of this statement led to reading from freed memory.

Approved by Oleksandr Byelkin <sanja@mariadb.com>
2022-03-23 12:50:50 -07:00
Marko Mäkelä
44231dc6d5 Cleanup: have_sanitizer='ASAN,UBSAN'
This was suggested by Sergei Golubchik.
Fixes up commit b91a123d8c
2022-03-23 16:41:58 +02:00
Ian Gilfillan
8153c974e6 Update contributors 2022-03-23 10:47:27 +11:00
Andrei
5ccd845d51 MDEV-27760 event may non stop replicate in circular semisync setup
MDEV-21117 had to relax own events acceptance condition for a case
when a former semisync master server recovers after crash as the
semisync slave. That however admitted a possibility for endless event
"orbiting" in the non-strict slave gtid mode of semisync circular
setup.

The same server-id event termination is restored now for
the non-strict gtid mode to follow regular rules (that is it's ignored
unless @@global.replicate_same_server_id allows it in).

To address MDEV-21117 recovery agenda,
in the strict gtid mode and the transaction's gtid ordered strictly
greater than the current slave gtid state, the same server-id
transaction is accepted.
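
One possible reading of the acceptance rule above, written as a Python predicate. The function and parameter names are hypothetical, GTID ordering is reduced to a single integer comparison, and the sketch assumes that @@global.replicate_same_server_id keeps its usual effect in strict mode as well; the actual server condition involves more state (for example the semisync role).

```python
def accept_same_server_id_event(strict_mode, replicate_same_server_id,
                                event_seq_no, slave_seq_no):
    """Decide whether an event carrying the server's own id is applied."""
    if replicate_same_server_id:
        # Regular rule: the user explicitly allows own events.
        return True
    if strict_mode and event_seq_no > slave_seq_no:
        # Recovery case: strictly ahead of the current slave gtid state.
        return True
    # Otherwise the event is ignored, which stops endless "orbiting".
    return False
```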

The gtid strict mode is safe to accept transactions even if
the slave state was not set correctly by the user, e.g.
at the former master.
An added test shows a typical out-of-order error at execution, so
it is guaranteed that no data corruption occurs in such a case.
2022-03-22 19:20:19 +02:00
Alexander Barkov
0c4c064f98 MDEV-27743 Remove Lex::charset
This patch also fixes:

MDEV-27690 Crash on `CHARACTER SET csname COLLATE DEFAULT` in column definition
MDEV-27853 Wrong data type on column `COLLATE DEFAULT` and table `COLLATE some_non_default_collation`
MDEV-28067 Multiple conflicting column COLLATE clauses are not rejected
MDEV-28118 Wrong collation of `CAST(.. AS CHAR COLLATE DEFAULT)`
MDEV-28119 Wrong column collation on MODIFY + CONVERT
2022-03-22 17:12:15 +04:00
Alexander Barkov
d25b10fede MDEV-27712 Reduce the size of Lex_length_and_dec_st from 16 to 8
User visible change:
Removing the length specified by user from error messages:
ER_TOO_BIG_SCALE and ER_TOO_BIG_PRECISION
as discussed with Sergei.
2022-03-22 14:42:54 +04:00
Alexander Barkov
0812d0de8d MDEV-28131 Unexpected warning while selecting from information_schema.processlist
Problem:

DECIMAL columns in I_S must be explicitly set to some value.

I_S columns do not have `DEFAULT 0` (after MDEV-18918), so during
restore_record() their record fragments pointed by Field::ptr are
initialized to zero bytes 0x00.
But an array of 0x00's is not a valid binary DECIMAL value.
So val_decimal() called for such Field_new_decimal generated a warning
when seeing a wrong binary encoded DECIMAL value in the record.

Fix:

Explicitly setting INFORMATION_SCHEMA.PROCESSLIST.PROGRESS
to the decimal value of 0 if no progress information is available.
2022-03-21 16:42:58 +04:00
Oleksandr Byelkin
fbc1cc974e MDEV-26009 Server crash when calling twice procedure using FOR-loop
The problem was that the instructions sp_instr_cursor_copy_struct and
sp_instr_copen use the same LEX, adding and removing the "tail" of
prelocked tables and forgetting that the tail of all tables is kept in
LEX::query_tables_last. If the LEX is used by only one instruction
or the query does not have prelocked tables, this does not matter.
But to work correctly in all cases, LEX::query_tables_last should
be reset so that new tables are added to the correct list (after the last
table in the LEX instead of after the last table of the prelocking "tail"
which was cut).
2022-03-21 07:55:57 +01:00
Jan Lindström
12ce9b4f02 Fix compile error. 2022-03-18 20:50:10 +01:00
Alexey Yurchenko
eceb9e2478 MDEV-26971: JSON file interface to wsrep node state.
Fix status reporting - move it from dead code to actually executing.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-03-18 20:50:10 +01:00
Alexey Yurchenko
73d80c8672 MDEV-26971: Implement progress reporting by mariabackup SST script
Currently covers network transfer stage on donor and joiner.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-03-18 16:38:41 +01:00
Alexey Yurchenko
98355a0789 MDEV-26971: Support for progress reporting from SST scripts.
New feedback events:
 - "total N": signals new SST stage and reports estimated total work
 - "complete N": reports completed work in the current stage
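
A consumer of these two events could look roughly like the Python sketch below. Only the "total N" and "complete N" line formats come from the commit message; the class, its attributes and the handling of malformed lines are assumptions made for illustration.

```python
def parse_feedback(line):
    """Parse 'total N' or 'complete N'; return (kind, value) or None."""
    parts = line.split()
    if len(parts) != 2 or parts[0] not in ("total", "complete"):
        return None
    try:
        value = int(parts[1])
    except ValueError:
        return None
    if value < 0:
        return None
    return parts[0], value


class ProgressTracker:
    """Track the current SST stage and the work completed in it."""

    def __init__(self):
        self.stage = 0
        self.total = 0
        self.complete = 0

    def feed(self, line):
        event = parse_feedback(line)
        if event is None:
            return  # not a progress event; ignore it
        kind, value = event
        if kind == "total":
            # "total N" signals a new stage and its estimated total work.
            self.stage += 1
            self.total = value
            self.complete = 0
        else:
            self.complete = value

    def fraction(self):
        """Completed share of the current stage, 0.0 if the total is unknown."""
        return self.complete / self.total if self.total else 0.0
```

Feeding "total 200" followed by "complete 50" puts the tracker in stage 1 at a fraction of 0.25; a later "total 10" starts stage 2 from zero.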

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-03-18 16:38:41 +01:00
Alexey Yurchenko
9d7e596ba6 MDEV-26971: JSON file interface to wsrep node state.
Integration with status reporter in wsrep-lib.

Status reporter reports changes in wsrep state and logged errors/
warnings to a json file which then can be read and interpreted by
an external monitoring tool.

Rationale: until the server is fully initialized it is inaccessible
to clients and the only source of information is the error log, which
is not machine-friendly. Since a wsrep node can spend a very long time
in the initialization phase (state transfer), automatic tools may be
unable to easily monitor its liveness and progress for a very long time.

New variable: wsrep_status_file specifies the output file name.
If not set, no file is created and no reporting is done.
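
From the monitoring side, polling such a file might look like the following Python sketch. The JSON layout used in the demo is invented; the real document structure is defined by the status reporter in wsrep-lib and is not described in this commit message. Treating a parse error as "try again later" is likewise an assumption about how a tool might cope with reading during a write.

```python
import json
import os
import tempfile


def read_status(path):
    """Return the parsed status document, or None if absent or unreadable."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except FileNotFoundError:
        return None  # wsrep_status_file not set, or nothing reported yet
    except json.JSONDecodeError:
        return None  # possibly read mid-write; retry on the next poll


def demo():
    """Poll a scratch path before and after a status document exists."""
    with tempfile.TemporaryDirectory() as scratch:
        path = os.path.join(scratch, "wsrep_status.json")  # hypothetical name
        before = read_status(path)
        # Hypothetical content, for illustration only.
        with open(path, "w", encoding="utf-8") as handle:
            json.dump({"state": "joining"}, handle)
        after = read_status(path)
    return before, after
```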

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-03-18 16:38:41 +01:00
mkaruza
507030c492 MDEV-27713 Crash after a conflict of applier thread with stored procedure call by event scheduler
When a thread is BF aborted by a high priority service, ULLs (user level
locks) need to be removed and released. Directly releasing a lock of the
MDL_EXPLICIT type doesn't also clear `thd->ull_hash`. The method
`mysql_ull_cleanup` properly clears all information about ULL locks
for the thread.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-03-18 08:30:26 +02:00
mkaruza
304f75c973 MDEV-27568 Parallel async replication hangs on a Galera node
Parallel slave applying can cause a deadlock between DDL and
other events. A GTID with a lower seqno can be blocked in Galera when the node
has entered TOI mode, but a DDL GTID with a higher seqno can be blocked
before the previous GTIDs are applied locally.

Fix is to check prior commits before entering TOI.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-03-18 08:30:26 +02:00
Daniele Sciascia
c63eab2c68 MDEV-28055: Galera ps-protocol fixes
* Fix test galera.MW-44 to make it work with --ps-protocol
* Skip test galera.MW-328C under --ps-protocol. This test
  relies on wsrep_retry_autocommit, which has no effect
  under ps-protocol.
* Return WSREP related errors on COM_STMT_PREPARE commands
  Change wsrep_command_no_result() to allow sending back errors
  when a statement is prepared. For example, to handle deadlock
  error due to BF aborted transaction during prepare.
* Add sync waiting before statement prepare
  When a statement is prepared, tables used in the statement may be
  opened and checked for existence. Because of that, some tests (for
  example galera_create_table_as_select) that CREATE a table in one node
  and then SELECT from the same table in another node may result in errors
  due to non existing table.
  To make tests behave similarly under normal and PS protocol, we add a
  call to sync wait before preparing statements that would sync wait
  during normal execution.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-03-18 08:30:26 +02:00
sjaakola
97582f1c06 MDEV-27649 PS conflict handling causing node crash
Handling BF abort for prepared statement execution so that EXECUTE processing will continue
until parameter setup is complete, before BF abort bails out the statement execution.

THD class has a new boolean member, wsrep_delayed_BF_abort, which is set if a BF abort is observed
in do_command() right after reading the client's packet, and if the client has sent a PS execute command.
In such a case, the deadlock error is not returned immediately to the client, but the PS execution
will be started. However, the PS execution loop will now check if wsrep_delayed_BF_abort is set, and
stop the PS execution after the type information has been assigned for the PS.
With this, the PS protocol type information, which is present in the first PS EXECUTE command, is not lost
even if the first PS EXECUTE command was marked to abort.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-03-18 08:30:25 +02:00
Daniel Black
065f995e6d Merge branch 10.5 into 10.6 2022-03-18 12:17:11 +11:00
Sergei Golubchik
ecb6f9c894 MDEV-28095 crash in multi-update and implicit grouping
disallow implicit grouping in multi-update.
explicit GROUP BY is not allowed by the grammar.
2022-03-17 16:58:48 +01:00
Alexander Barkov
22fd31c588 MDEV-28078 Garbage on multiple equal ENUMs with tricky character sets
TYPELIBs for ENUM/SET columns could erroneously undergo redundant
hex-unescaping at the table open time.

Fix:
- Prevent multiple unescaping of the same TYPELIB
- Prevent sharing TYPELIBs between columns with different mbminlen
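
Why a second round of hex-unescaping produces garbage can be seen in a few lines of Python. The `Typelib` class and its `unescaped` flag are a hypothetical stand-in for the first fix item above, not the server's data structure, and the hex encoding is simplified to plain `bytes.fromhex`.

```python
def hex_unescape(value):
    """Decode an ASCII hex string such as b'3431' into raw bytes."""
    return bytes.fromhex(value.decode("ascii"))


class Typelib:
    """Hypothetical list of ENUM/SET value names stored hex-escaped."""

    def __init__(self, names):
        self.names = list(names)
        self.unescaped = False

    def unescape_once(self):
        # Guard against redundant unescaping of the same object.
        if self.unescaped:
            return
        self.names = [hex_unescape(name) for name in self.names]
        self.unescaped = True
```

A value whose decoded form still looks like hex digits shows the problem: `b"33343331"` decodes to `b"3431"`, and decoding that again yields `b"41"`. With the guard, a second call to `unescape_once()` leaves `b"3431"` untouched.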
2022-03-17 13:05:03 +04:00
Marko Mäkelä
118826d173 Fix gcc-12 -O2 -Warray-bounds 2022-03-17 10:20:07 +02:00
Daniel Black
b73d852779 Merge 10.4 to 10.5 2022-03-17 17:03:24 +11:00
Daniel Black
069139a549 Merge 10.3 to 10.4
extra2_read_len resolved by keeping the implementation
in sql/table.cc and exposing it for use by ha_partition.cc

Remove identical implementation in unireg.h
(ref: bfed2c7d57)
2022-03-16 16:39:10 +11:00
Daniel Black
6a2d88c132 Merge 10.2 to 10.3 2022-03-16 12:51:22 +11:00
Alexander Barkov
0e63023cb8 Merge branch 10.2 into 10.3 2022-03-16 12:49:13 +11:00
Daniel Black
57dbe8785d MDEV-23915 ER_KILL_DENIED_ERROR not passed a thread id (part 2)
Per Marko's comment in JIRA, sql_kill is passing the thread id
as long long. We change the format of the error messages to match,
and cast the thread id to long long in sql_kill_user.
2022-03-16 09:37:45 +11:00
Daniel Black
99837c61a6 MDEV-23915 ER_KILL_DENIED_ERROR not passed a thread id
The 10.5 test error main.grant_kill showed up an incorrect
thread id on a big endian architecture.

The cause of this is the sql_kill_user function assumed the
error was ER_OUT_OF_RESOURCES, when the actual error was
ER_KILL_DENIED_ERROR. ER_KILL_DENIED_ERROR as an error message
requires a thread id to be passed as unsigned long, however a
user/host was passed.

ER_OUT_OF_RESOURCES doesn't even take a user/host, despite
the optimistic comment. We remove this being passed as an
argument to the function so that when MDEV-21978 is implemented
one less compiler format warning is generated (which would
have caught this error sooner).

Thanks Otto for reporting and Marko for analysis.
2022-03-16 09:37:45 +11:00
Marko Mäkelä
9f5a3e5689 Merge 10.7 into 10.8 2022-03-15 18:18:07 +02:00
Marko Mäkelä
dc4b7f382b Merge 10.6 into 10.7 2022-03-15 15:25:31 +02:00
Marko Mäkelä
4ef44cc2f9 Merge 10.5 into 10.6 2022-03-15 14:49:24 +02:00
Marko Mäkelä
e1246775a9 Merge 10.4 into 10.5 2022-03-15 08:32:28 +02:00
Marko Mäkelä
9c6135e81f Merge 10.3 into 10.4 2022-03-15 08:10:35 +02:00
Daniel Black
a950086036 Merge 10.2 (part) into 10.3
commit '6de482a6fefac0c21daf33ed465644151cdf879f'

10.3 no longer errors in truncate_notembedded.test
but per comments, a non-crash is all that we are after.
2022-03-15 16:44:52 +11:00
Hugo Wen
dafc5fb9c1 MDEV-27342: Fix issue of recovery failure using new server id
Commit 6c39eaeb1 made the crash recovery dependent on server_id.
The crash recovery could fail when restoring a new instance from
original crashed data directory USING A NEW SERVER ID.

The issue doesn't exist in previous major versions before 10.6.

The root cause is that when generating the input XID to be searched in the hash,
the server id is populated with the current server id.
So if the server id changed before recovery, the XID couldn't be found
in the hash because the server id doesn't match.

This fix is to use original server id when creating the input XID
object in function `xarecover_do_commit_or_rollback`.
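
The lookup failure can be modelled with a set keyed by (server id, transaction number). This Python sketch is a simplification: the key layout, the function name and the parameters are assumptions for illustration, and the real XID contains more than these two fields.

```python
def recover(prepared_trx_nos, original_server_id, lookup_server_id):
    """Return the prepared transactions that recovery manages to find."""
    # Keys recorded before the crash carry the server id in use at that time.
    xid_hash = {(original_server_id, trx_no) for trx_no in prepared_trx_nos}
    # Recovery builds a search key per transaction and probes the hash.
    return [trx_no for trx_no in prepared_trx_nos
            if (lookup_server_id, trx_no) in xid_hash]
```

Probing with a changed server id, as in `recover([10, 11], 1, 2)`, finds nothing, while probing with the original id, `recover([10, 11], 1, 1)`, finds both transactions.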

All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer Amazon Web
Services, Inc.
2022-03-14 19:57:10 -07:00
Alexander Barkov
03c3dc6365 MDEV-23210 Assertion `(length % 4) == 0' failed in my_lengthsp_utf32 on ALTER TABLE, SELECT and INSERT
Problem:
Parse-time conversion from binary to tricky character sets like utf32
produced ill-formed strings. So, later a crash happened in debug builds,
or a wrong SHOW CREATE TABLE was returned in release builds.

Fix:

1. Backporting a few methods from 10.3:
  - THD::check_string_for_wellformedness()
  - THD::convert_string() overloads
  - THD::make_text_string_connection()

2. Adding a new method THD::reinterpret_string_from_binary(),
   which makes sure to either return a well-formed string
   (optionally prepending it with zero bytes), or return an error.
2022-03-14 14:42:59 +04:00
Marko Mäkelä
18bb95b608 Merge 10.7 into 10.8 2022-03-14 11:52:11 +02:00
Marko Mäkelä
e67d46e4a1 Merge 10.6 into 10.7 2022-03-14 11:30:32 +02:00
Marko Mäkelä
572e34304e Merge 10.5 into 10.6 2022-03-14 10:59:46 +02:00
Sergei Golubchik
bfed2c7d57 MDEV-27753 Incorrect ENGINE type of table after crash for CONNECT table
whenever possible, partitioning should use the full
partition plugin name, not the one byte legacy code.

Normally, ha_partition can get the engine plugin from
table_share->default_part_plugin.

But in some cases, e.g. in DROP TABLE, the table isn't
opened, table_share is NULL, and ha_partition has to parse
the frm, much like dd_frm_type() does.

temporary_tables.cc, sql_table.cc:

When dropping a table, it must be deleted in the engine
first, then frm file. Because frm can be the only true
source of metadata that the engine might need for DROP.

table.cc:

when opening a partitioned table, if the engine for
partitions is not found, do not fallback to MyISAM.
2022-03-14 08:55:59 +01:00
Marko Mäkelä
59359fb44a MDEV-24841 Build error with MSAN use-of-uninitialized-value in comp_err
The MemorySanitizer implementation in clang includes some built-in
instrumentation (interceptors) for GNU libc. In GNU libc 2.33, the
interface to the stat() family of functions was changed. Until the
MemorySanitizer interceptors are adjusted, any MSAN code builds
will act as if the stat() family of functions failed to initialize
the struct stat.

A fix was applied in
https://reviews.llvm.org/rG4e1a6c07052b466a2a1cd0c3ff150e4e89a6d87a
but it fails to cover the 64-bit variants of the calls.

For now, let us work around the MemorySanitizer bug by defining
and using the macro MSAN_STAT_WORKAROUND().
2022-03-14 09:28:55 +02:00
Sergei Golubchik
6789f2cfab MDEV-18304 sql_safe_updates does not work with OR clauses
not every index-using plan sets bits in table->quick_keys.
QUICK_ROR_INTERSECT_SELECT, for example, doesn't.

Use the fact that select->quick is set instead.

Also allow EXPLAIN to work.
2022-03-12 19:13:17 +01:00