The issue is that two MARIA_HA instances share the same MARIA_STATUS_INFO
object during UNION execution, so the second MARIA_HA instance's state pointer
MARIA_HA::state points to the MARIA_HA::state_save of the first MARIA_HA instance.
This happens in
thr_multi_lock(...) {
...
for (first_lock=data, pos= data+1 ; pos < end ; pos++)
{
...
if (pos[0]->lock == pos[-1]->lock && pos[0]->lock->copy_status)
(pos[0]->lock->copy_status)((*pos)->status_param,
(*first_lock)->status_param);
...
}
...
}
Usually the state is restored from ha_maria::external_lock(...):
#0 _ma_update_status (param=0x6290000e6270) at ./storage/maria/ma_state.c:309
#1 0x00005555577ccb15 in _ma_update_status_with_lock (info=0x6290000e6270) at ./storage/maria/ma_state.c:361
#2 0x00005555577c7dcc in maria_lock_database (info=0x6290000e6270, lock_type=2) at ./storage/maria/ma_locking.c:66
#3 0x0000555557802ccd in ha_maria::external_lock (this=0x61d0001b1308, thd=0x62a000048270, lock_type=2) at ./storage/maria/ha_maria.cc:2727
But _ma_update_status() does not take into account the case when
MARIA_HA::state points to the MARIA_HA::state_save of the other MARIA_HA
instance.
The fix is to restore MARIA_HA::state in ha_maria::external_lock() after
the maria_lock_database() call for transactional tables.
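The pointer sharing described above can be pictured with a minimal sketch. `StatusInfo`, `Handle`, and the helper functions are hypothetical stand-ins, not the real MARIA_STATUS_INFO/MARIA_HA code, and the assumption that "restore" means pointing the handle back at its own storage is mine.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-ins for MARIA_STATUS_INFO and MARIA_HA.
struct StatusInfo { unsigned long records = 0; };

struct Handle {
  StatusInfo state_save;            // per-handle private copy
  StatusInfo *state = &state_save;  // status object the handle currently uses
};

// Models the copy_status callback invoked from thr_multi_lock(): the later
// handle is made to share the status object of the first one.
void copy_status(Handle &to, Handle &from) { to.state = from.state; }

// Models the fix (assumed semantics): after locking, point the handle back
// at its own storage.
void restore_own_state(Handle &h) { h.state = &h.state_save; }

// True if the handle's state pointer refers to its own storage.
bool owns_state(const Handle &h) { return h.state == &h.state_save; }

// Two handles on the same table, as in a UNION: sharing, then restoring.
bool union_scenario_restores_state() {
  Handle first, second;
  copy_status(second, first);
  assert(second.state == &first.state_save);  // the problematic aliasing
  restore_own_state(second);
  return owns_state(first) && owns_state(second);
}
```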
A read-only storage engine that stores its data in (AWS) S3
To store data in S3 one could use ALTER TABLE:
ALTER TABLE table_name ENGINE=S3
libmarias3 integration done by Sergei Golubchik
libmarias3 created by Andrew Hutchings
This commit is based on the work of Michal Schorm, rebased on the
earliest MariaDB version.
The command line used to generate this diff was:
find ./ -type f \
-exec sed -i -e 's/Foundation, Inc., 59 Temple Place, Suite 330, Boston, /Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, /g' {} \; \
-exec sed -i -e 's/Foundation, Inc. 59 Temple Place.* Suite 330, Boston, /Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, /g' {} \; \
-exec sed -i -e 's/MA.*.....-1307.*USA/MA 02110-1335 USA/g' {} \; \
-exec sed -i -e 's/Foundation, Inc., 59 Temple/Foundation, Inc., 51 Franklin/g' {} \; \
-exec sed -i -e 's/Place, Suite 330, Boston, MA.*02111-1307.*USA/Street, Fifth Floor, Boston, MA 02110-1335 USA/g' {} \; \
-exec sed -i -e 's/MA.*.....-1307/MA 02110-1335/g' {} \;
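For readers who prefer to see one of these substitutions outside of sed, here is a rough C++ equivalent of the first expression only. The function name `update_fsf_address` is made up for illustration; the pattern is copied from the sed command, so the unescaped `.` still matches any character.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Rewrites the old FSF postal address in one line of text, mirroring the
// first sed expression of the command above (other expressions not covered).
std::string update_fsf_address(const std::string &line) {
  static const std::regex old_addr(
      "Foundation, Inc., 59 Temple Place, Suite 330, Boston, ");
  return std::regex_replace(
      line, old_addr,
      "Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, ");
}
```

Lines that do not contain the old address are returned unchanged.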
Make the live checksum be returned by handler::info(),
and the slow table-scan checksum be calculated in handler::checksum().
Part of
MDEV-16249 CHECKSUM TABLE for a spider table is not parallel and saves all data in memory in the spider head by default
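The difference between a live checksum and a table-scan checksum can be sketched as follows. `ToyTable` and its per-row hash are invented for illustration and do not reflect the real handler API or the engines' actual checksum algorithm; the only point is that one value is maintained incrementally and the other is recomputed by reading every row.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical toy table; the per-row hash (FNV-1a) is an arbitrary stand-in.
class ToyTable {
  std::vector<std::string> rows_;
  uint32_t live_ = 0;  // kept up to date on every change

  static uint32_t row_checksum(const std::string &row) {
    uint32_t h = 2166136261u;
    for (unsigned char c : row) { h ^= c; h *= 16777619u; }
    return h;
  }

public:
  void insert(const std::string &row) {
    rows_.push_back(row);
    live_ += row_checksum(row);  // unsigned wrap-around is intended
  }
  void erase_last() {
    if (rows_.empty()) return;
    live_ -= row_checksum(rows_.back());
    rows_.pop_back();
  }
  // Cheap: what an info()-style call could report.
  uint32_t live_checksum() const { return live_; }
  // Slow: reads every row, as a checksum()-style call would.
  uint32_t scan_checksum() const {
    uint32_t sum = 0;
    for (const std::string &r : rows_) sum += row_checksum(r);
    return sum;
  }
};

bool checksums_agree() {
  ToyTable t;
  t.insert("a");
  t.insert("b");
  t.insert("c");
  t.erase_last();
  return t.live_checksum() == t.scan_checksum();
}
```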
The MDEV-17262 commit 26432e49d3
was skipped. In Galera 4, the implementation would seem to require
changes to the streaming replication.
In the tests archive.rnd_pos and main.profiling, disable_ps_protocol
for SHOW STATUS and SHOW PROFILE commands until MDEV-18974
has been fixed.
There were two newly enabled warnings:
1. casts of function pointers. Affected sql_analyse.h, mi_write.c
and ma_write.cc, mf_iocache-t.cc, mysqlbinlog.cc, encryption.cc, etc
2. memcpy/memset of nontrivial structures. Fixed as:
* the warning disabled for InnoDB
* TABLE, TABLE_SHARE, and TABLE_LIST got a new method reset() which
does the bzero(), which is safe for these classes, but any other
bzero() will still cause a warning
* Table_scope_and_contents_source_st uses `TABLE_LIST *` (trivial)
instead of `SQL_I_List<TABLE_LIST>` (not trivial) so it's safe to
bzero now.
* added casts in debug_sync.cc and sql_select.cc (for JOIN)
* move assignment method for MDL_request instead of memcpy()
* PARTIAL_INDEX_INTERSECT_INFO::init() instead of bzero()
* remove constructor from READ_RECORD() to make it trivial
* replace some memcpy() with c++ copy assignments
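The reset() idea from the list above can be sketched like this. `TableLike` is a hypothetical stand-in, not the real TABLE class; it is deliberately kept trivially copyable so that zero-filling it is well defined, and the cast to `void *` is a common way to tell the compiler that the byte-wise clear is intentional.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for a class such as TABLE: it has a user-provided
// constructor, so compilers may warn about a raw memset() on it, yet it
// contains only plain data.
struct TableLike {
  int refs;
  const char *alias;
  TableLike() : refs(1), alias("t1") {}

  // The single sanctioned place that zero-fills the object.
  void reset() { std::memset(static_cast<void *>(this), 0, sizeof(*this)); }
};

static_assert(!std::is_trivially_default_constructible<TableLike>::value,
              "the constructor is user-provided");
static_assert(std::is_trivially_copyable<TableLike>::value,
              "byte-wise clearing is only used because the data is plain");

bool reset_clears_all_fields() {
  TableLike t;
  assert(t.refs == 1);
  t.reset();
  // An all-zero-bytes pointer is null on mainstream platforms.
  return t.refs == 0 && t.alias == nullptr;
}
```

Keeping the memset() inside one method means any other bzero()/memset() on the class still triggers the warning, which matches the intent described above.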
The symptom of the bug was that one got the following in
the aria recovery log:
"Table 'xxx', id 57, has create_rename_lsn (1,0x12dee) more recent than LOGREC_FILE_ID's LSN (1,0x12dc4), ignoring open request"
After this, all future redo entries were marked with
"For table of short id 57, table skipped, so skipping record"
Analysis:
When ending batch insert, create_rename_lsn for the table
is updated to signal that earlier redo entries for the
table can't be applied. The problem was that future redo
entries were also ignored, as the redo code assumed they were
for the old table.
Fixed by calling translog_dessign_id, which causes
future redo entries to be seen as belonging to the
updated table.
The warning was removed as this is a common case that happens if the table
was dropped and later created during the same checkpoint or if there was
a bulk insert done on an empty table.
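A toy model of the recovery behaviour described in this message may help. The record layout, the names `LogRec` and `count_applied`, and the LSN values are invented. The assumption that dropping the id leads to a fresh file-id record being logged before later redo entries is my reading of the text, not something the message states.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical, simplified log records; not the real Aria log format.
enum class RecType { FileId, Redo };
struct LogRec { RecType type; uint64_t lsn; int short_id; };

// Counts the redo records recovery would apply for a table whose header
// stores create_rename_lsn. A FileId record maps a short id to the table;
// the open request is ignored when create_rename_lsn is more recent than
// that record, and redo records for an unopened id are skipped.
int count_applied(const std::vector<LogRec> &log, uint64_t create_rename_lsn) {
  std::map<int, bool> opened;  // short id -> table opened for redo?
  int applied = 0;
  for (const LogRec &r : log) {
    if (r.type == RecType::FileId)
      opened[r.short_id] = !(create_rename_lsn > r.lsn);
    else if (opened.count(r.short_id) && opened[r.short_id])
      ++applied;
  }
  return applied;
}

// Before the fix: the id assigned at LSN 10 stays in use after the bulk
// insert ended and create_rename_lsn was set to 20.
int applied_without_new_id() {
  std::vector<LogRec> log = {{RecType::FileId, 10, 57}, {RecType::Redo, 11, 57},
                             {RecType::Redo, 21, 57}, {RecType::Redo, 22, 57}};
  return count_applied(log, 20);
}

// After the fix (assumed effect): the id is dropped, so a fresh FileId
// record precedes the later redo entries.
int applied_with_new_id() {
  std::vector<LogRec> log = {{RecType::FileId, 10, 57}, {RecType::Redo, 11, 57},
                             {RecType::FileId, 21, 57}, {RecType::Redo, 22, 57},
                             {RecType::Redo, 23, 57}};
  return count_applied(log, 20);
}
```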
This was caused by a combination of factors:
* MyISAM/Aria temporary tables historically never saved the state
to disk (MYI/MAI), because the state never needed to persist
* certain ALTER TABLE operations modify the original TABLE structure
and if they fail, the original table has to be reopened to
revert all changes (m_needs_reopen=1)
As a result, when ALTER fails and a MyISAM/Aria temp table gets reopened,
it reads the stale state from the disk.
As a fix, MyISAM/Aria tables now *always* write the state to disk
on close, *unless* HA_EXTRA_PREPARE_FOR_DROP was done first. And
the server now always does HA_EXTRA_PREPARE_FOR_DROP before dropping
a temporary table.
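The close/reopen rule can be illustrated with a small model. `ToyTempTable` is hypothetical and tracks only a row count; it is a sketch of the rule "write state on close unless the table was prepared for drop", not of the real MyISAM/Aria code.

```cpp
#include <cassert>

// Hypothetical model: in-memory state versus the state stored on disk.
struct ToyTempTable {
  unsigned long rows_in_memory = 0;
  unsigned long rows_on_disk = 0;  // what a reopen would read
  bool prepared_for_drop = false;

  void prepare_for_drop() { prepared_for_drop = true; }
  // State is written on close unless the table is about to be dropped.
  void close() { if (!prepared_for_drop) rows_on_disk = rows_in_memory; }
  void reopen() { rows_in_memory = rows_on_disk; }
};

// A failed ALTER forces a close and reopen; the row count must survive.
unsigned long rows_after_failed_alter() {
  ToyTempTable t;
  t.rows_in_memory = 3;
  t.close();
  t.reopen();
  return t.rows_in_memory;
}

// Dropping skips the write, so nothing is flushed for a doomed table.
unsigned long disk_rows_after_drop() {
  ToyTempTable t;
  t.rows_in_memory = 3;
  t.prepare_for_drop();
  t.close();
  return t.rows_on_disk;
}
```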
Part of MDEV-5336 Implement LOCK FOR BACKUP
- Added new locks to MDL_BACKUP for all stages of backup locks and
a new MDL lock needed for backup stages.
- Renamed MDL_BACKUP_STMT to MDL_BACKUP_DDL
- flush_tables() takes a new parameter that decides what should be flushed.
- InnoDB, Aria (transactional tables with checksums), Blackhole, Federated
and Federatedx tables are marked to be safe for online backup. We are
using MDL_BACKUP_TRANS_DML instead of MDL_BACKUP_DML locks for these
which allows any DML to proceed for these tables during the whole
backup process until BACKUP STAGE COMMIT, which will block the final
commit.
Part of MDEV-5336 Implement LOCK FOR BACKUP
The idea is that instead of waiting in close_cached_tables() for all
tables to be closed, we instead call flush_tables() that does:
- Flush not used objects in table cache to free memory
- Collect all tables that are open
- Call HA_EXTRA_FLUSH on the objects, to get them into "closed state"
- Added HA_EXTRA_FLUSH support to archive and CSV
- Added multi-user protection to HA_EXTRA_FLUSH in MyISAM and Aria
The benefit compared to the old code is:
- FTWRL doesn't have to wait for long running read operations or
open HANDLER's
main.derived_cond_pushdown: Move all 10.3 tests to the end,
trim trailing white space, and add an "End of 10.3 tests" marker.
Add --sorted_result to tests where the ordering is not deterministic.
main.win_percentile: Add --sorted_result to tests where the
ordering is no longer deterministic.
Two bugs in Aria, related to 2-level fulltext indexes:
* REPAIR calculated the key number incorrectly
* CHECK copied the key into last_key too early and
checking the second-level btree was overwriting it
extra/mariabackup/fil_cur.cc:361:42: warning: format specifies type 'unsigned long' but the argument has type 'ib_int64_t' (aka 'long long') [-Wformat]
extra/mariabackup/fil_cur.cc:376:9: warning: format specifies type 'unsigned long' but the argument has type 'ib_int64_t' (aka 'long long') [-Wformat]
sql/handler.cc:6196:45: warning: format specifies type 'unsigned long' but the argument has type 'wsrep_trx_id_t' (aka 'unsigned long long') [-Wformat]
sql/log.cc:1681:16: warning: format specifies type 'unsigned long' but the argument has type 'size_t' (aka 'unsigned int') [-Wformat]
sql/log.cc:1687:16: warning: format specifies type 'unsigned long' but the argument has type 'size_t' (aka 'unsigned int') [-Wformat]
sql/wsrep_sst.cc:1388:86: warning: format specifies type 'long' but the argument has type 'wsrep_seqno_t' (aka 'long long') [-Wformat]
sql/wsrep_sst.cc:232:86: warning: format specifies type 'long' but the argument has type 'wsrep_seqno_t' (aka 'long long') [-Wformat]
storage/connect/filamdbf.cpp:450:47: warning: format specifies type 'short' but the argument has type 'int' [-Wformat]
storage/connect/filamdbf.cpp:970:47: warning: format specifies type 'short' but the argument has type 'int' [-Wformat]
storage/connect/inihandl.cpp:197:16: warning: address of array 'key->name' will always evaluate to 'true' [-Wpointer-bool-conversion]
storage/innobase/btr/btr0scrub.cc:151:17: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
storage/innobase/buf/buf0buf.cc:5085:8: warning: nonnull parameter 'bpage' will evaluate to 'true' on first encounter [-Wpointer-bool-conversion]
storage/innobase/fil/fil0crypt.cc:2454:5: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
storage/innobase/handler/ha_innodb.cc:18685:7: warning: format specifies type 'unsigned long' but the argument has type 'wsrep_trx_id_t' (aka 'unsigned long long') [-Wformat]
storage/innobase/row/row0mysql.cc:3319:5: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
storage/innobase/row/row0mysql.cc:3327:5: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
storage/maria/ma_norec.c:35:10: warning: implicit conversion from 'int' to 'my_bool' (aka 'char') changes value from 131 to -125 [-Wconstant-conversion]
storage/maria/ma_norec.c:42:10: warning: implicit conversion from 'int' to 'my_bool' (aka 'char') changes value from 131 to -125 [-Wconstant-conversion]
storage/maria/ma_test2.c:1009:12: warning: format specifies type 'unsigned long' but the argument has type 'size_t' (aka 'unsigned int') [-Wformat]
storage/maria/ma_test2.c:1010:12: warning: format specifies type 'unsigned long' but the argument has type 'size_t' (aka 'unsigned int') [-Wformat]
storage/mroonga/ha_mroonga.cpp:9189:44: warning: use of logical '&&' with constant operand [-Wconstant-logical-operand]
storage/mroonga/vendor/groonga/lib/expr.c:4987:22: warning: comparison of constant -1 with expression of type 'grn_operator' is always false [-Wtautological-constant-out-of-range-compare]
storage/xtradb/btr/btr0scrub.cc:151:17: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
storage/xtradb/buf/buf0buf.cc:5047:8: warning: nonnull parameter 'bpage' will evaluate to 'true' on first encounter [-Wpointer-bool-conversion]
storage/xtradb/fil/fil0crypt.cc:2454:5: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
storage/xtradb/row/row0mysql.cc:3324:5: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
storage/xtradb/row/row0mysql.cc:3332:5: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
unittest/sql/mf_iocache-t.cc:120:35: warning: format specifies type 'unsigned long' but the argument has type 'int' [-Wformat]
unittest/sql/mf_iocache-t.cc:96:35: note: expanded from macro 'INFO_TAIL'
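The -Wformat warnings listed above all come from a mismatch between the conversion specifier and the argument type. The log does not say how each one was fixed; a common approach, sketched here with the standard `snprintf` (not the server's own printf wrappers), is to cast to a type of known width or to use the matching specifier. `format_values` is a made-up helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats a 64-bit sequence number and a size without format warnings:
// the cast makes the argument match %lld on every platform, and %zu is
// the standard specifier for size_t.
std::string format_values(int64_t seqno, size_t len) {
  char buf[64];
  std::snprintf(buf, sizeof(buf), "%lld %zu",
                static_cast<long long>(seqno), len);
  return buf;
}
```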
- Made output to be aligned in aria_chk -d
- Aria engine error texts are now written instead of "Undefined error"
- When running with --check --force, tables with wrong TRN's but otherwise
correct are now zerofilled
- Fixed several bugs in check and recovery related to fulltext
- When doing recovery, store highest found TRID in aria_control_file
Before this, the
- Changed ERROR to WARNING for MyISAM/Aria messages
that are warnings in the check utilities.
This affects, for example, "client is using or
hasn't closed the table properly".
- Print "Table is fixed" if check succeeded in
fixing the table.
The problem was that a parallel open of a table overwrote info->state, which
was in use by repair.
Fixed by changing _ma_tmp_disable_logging_for_table() to use
a new state buffer state.no_logging to store the temporary state.
Other things:
- Use original number of rows when retrying repair to get rid of a
potential warning "Number of rows changed from X to Y"
- Changed maria_commit() to make it easier to merge with 10.4
- If table is not locked (like with show commands), use the global
number of rows as the local number may not be up to date.
(Minor, non-critical fix)
- Added some missing DBUG_RETURN
This was introduced by two pointers I added to TRN
as part of MDEV-16421 Make system tables crash safe
- Added code to ensure that trn_prev is not pointing
to wrong object
- A lot of new asserts and more code comments
- Simplified code in _ma_trnman_end_trans_hook()
- New back link allowed me to remove a loop
Make all system tables in mysql directory of type
engine=Aria
Privilege tables are using transactional=1
Statistical tables are using transactional=0, to allow them
to be quickly updated with low overhead.
Help tables are also using transactional=0 as these are only
updated at init time.
Other changes:
- Aria storage engine is now a required engine
- Update comment for Aria tables to reflect their new usage
- Fixed that _ma_reset_trn_for_table() removes unlocked table
from transaction table list. This was needed to allow one
to lock and unlock system tables separately from other
tables, for example when reading a procedure from mysql.proc
- Don't give a warning when using transactional=1 for engines
that use transactions. This is both logical and also
to avoid warnings/errors when doing an alter of a privilege
table to InnoDB.
- Don't abort on warnings from ALTER TABLE for changes that
would be accepted by CREATE TABLE.
- Newly created Aria transactional tables are marked as not movable
(as they include create_rename_lsn).
- bootstrap.test was changed to kill the original server, as one
can no longer have two servers started at the same time on the same
data directory and data files.
- Disable maria.small_blocksize as one can no longer change the
Aria block size after system tables are created.
- Speed up creation of help tables by using lock tables.
- wsrep_sst_resync now also copies Aria redo logs.
MDEV-10130 Assertion `share->in_trans == 0' failed in storage/maria/ma_close.c
MDEV-10378 Assertion `trn' failed in virtual int ha_maria::start_stmt
The problem was that maria_handler->trn was not properly reset
at commit/rollback and ha_maria::external_lock() could get confused
because of this.
There was some old code in ha_maria::implicit_commit() that tried
to take care of this, but it was not bullet proof.
Fixed by adding a list of all tables that are part of the maria transaction to
TRN.
A nice side effect of the fix is that the loops in
ha_maria::implicit_commit() became much simpler.
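The idea of keeping the participating tables in the transaction object can be sketched as follows. `Trn`, `TableHandle`, and both helper functions are simplified, hypothetical versions; the real TRN structure and its list handling are not shown in this message.

```cpp
#include <cassert>
#include <vector>

struct Trn;
struct TableHandle { Trn *trn = nullptr; };

// Hypothetical transaction object that remembers every attached handle.
struct Trn {
  std::vector<TableHandle *> used_tables;
};

void set_trn_for_table(TableHandle &h, Trn &trn) {
  h.trn = &trn;
  trn.used_tables.push_back(&h);
}

// At commit or rollback every table is detached in one pass, so no handle
// can keep a stale trn pointer.
void reset_tables_at_end_of_trans(Trn &trn) {
  for (TableHandle *h : trn.used_tables) h->trn = nullptr;
  trn.used_tables.clear();
}

bool all_detached_after_commit() {
  Trn trn;
  TableHandle t1, t2;
  set_trn_for_table(t1, trn);
  set_trn_for_table(t2, trn);
  assert(trn.used_tables.size() == 2);
  reset_tables_at_end_of_trans(trn);
  return t1.trn == nullptr && t2.trn == nullptr && trn.used_tables.empty();
}
```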
Other things:
- Fixed a bug in mysql_admin_table() where argument open_for_modify
was wrongly reset for the next table in the chain
- rollback admin command also in case of fatal error.
- Split _ma_set_trn_for_table() into three versions to simplify code
and debugging.
- Several new asserts to detect the original problem (that file was
not properly removed from trn before calling ma_close())
I was able to repeat the problem with an old version of randgen
Reason for crash:
- It's not safe to change share->now_transactional if there are changed
bitmaps in the pagecache as flushing these can cause redo-entries and
the bitmap flush code checks that share->now_transactional is set.
Fixed by flushing bitmaps in _ma_tmp_disable_logging_for_table() before
we set share->now_transactional to 0
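The ordering constraint can be shown with a toy model. `ToyShare` and both functions are hypothetical; `flush_bitmap` returns false where the real code would hit its check on share->now_transactional.

```cpp
#include <cassert>

// Hypothetical stand-in for the share fields involved.
struct ToyShare {
  bool now_transactional = true;
  bool bitmap_changed = false;
};

// Flushing a changed bitmap may write redo entries, so it is only valid
// while the table is still marked transactional. Returns false if that
// precondition is violated.
bool flush_bitmap(ToyShare &s) {
  if (!s.bitmap_changed) return true;
  if (!s.now_transactional) return false;
  s.bitmap_changed = false;
  return true;
}

// Order used by the fix: flush first, then clear the flag.
bool disable_logging(ToyShare &s) {
  bool ok = flush_bitmap(s);
  s.now_transactional = false;
  return ok;
}

// After disabling logging this way, a later flush finds nothing to write.
bool later_flush_is_safe() {
  ToyShare s;
  s.bitmap_changed = true;
  return disable_logging(s) && flush_bitmap(s) && !s.now_transactional;
}
```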
The problem was that copy_data_between_tables() didn't do proper
cleanup in case of failures:
- copy object was not properly freed
- end_bulk_insert() was not called
- mysql_trans_prepare_alter_copy_data() set THD->transaction.on to
false which was not properly restored
The last part caused a crash in Aria, as Aria depends on THD
being correct.
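One way to make sure such a flag is restored on every exit path is a scope guard. This is a general C++ technique shown for illustration, not necessarily how the fix was implemented; `Session`, `TransactionOffGuard`, and `copy_rows` are hypothetical names.

```cpp
#include <cassert>

// Hypothetical stand-in for the THD field that was left in the wrong state.
struct Session { bool transaction_on = true; };

// Turns the flag off for the lifetime of the guard and restores the saved
// value in the destructor, including on early returns.
class TransactionOffGuard {
  Session &session_;
  bool saved_;

public:
  explicit TransactionOffGuard(Session &s)
      : session_(s), saved_(s.transaction_on) {
    session_.transaction_on = false;
  }
  ~TransactionOffGuard() { session_.transaction_on = saved_; }
  TransactionOffGuard(const TransactionOffGuard &) = delete;
  TransactionOffGuard &operator=(const TransactionOffGuard &) = delete;
};

// Simulated copy step that may fail part-way through.
bool copy_rows(Session &s, bool fail) {
  TransactionOffGuard guard(s);
  assert(!s.transaction_on);
  if (fail) return false;  // early exit: the guard still restores the flag
  return true;
}

bool flag_restored_after_failure() {
  Session s;
  bool ok = copy_rows(s, true);
  return !ok && s.transaction_on;
}
```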
Other things:
- Reset info->switched_transactional after usage (safety)
- Reset bulk_insert_single_undo (safety)