The cause of this was several different bugs:
- When using binary logging with binlog_row_image=FULL,
all bits in read_set were set, which caused a
different (wrong) pattern for marking vcol_set
(a sketch of the marking pattern follows this list).
- TABLE::mark_virtual_columns_for_write() did not in all
cases mark vcol_set with the vcol_field.
- TABLE::update_virtual_fields() has to update all
vcol fields on REPLACE if binary logging with FULL
is used.
- VCOL_UPDATE_INDEXED should update all vcol fields that are
part of an index and were not updated by VCOL_UPDATE_FOR_READ.
- max_row_length() calculated the length of NULL and unused
fields. This did not cause any crash, but used more memory
than needed.
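To make the marking pattern concrete, here is a minimal standalone
sketch; Field, read_set and vcol_set are stand-in types that only model
the server's TABLE/Field API:

    #include <bitset>
    #include <vector>

    // Toy stand-in for the server's Field class.
    struct Field
    {
      int field_index;                  // position of this column
      bool is_virtual;                  // true for generated columns
      std::vector<int> depends_on;      // base columns this vcol reads
    };

    // Whenever a base column is in read_set, every virtual column
    // depending on it must be marked in vcol_set so it gets computed.
    static void mark_vcols(const std::vector<Field> &fields,
                           const std::bitset<64> &read_set,
                           std::bitset<64> &vcol_set)
    {
      for (const Field &f : fields)
      {
        if (!f.is_virtual)
          continue;
        for (int base : f.depends_on)
          if (read_set.test(base))
          {
            vcol_set.set(f.field_index);
            break;
          }
      }
    }

With binlog_row_image=FULL every bit of read_set is set, so the marking
must still cover every dependent vcol instead of taking a different path.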
thd_destructor_proxy(): Ensure that purge actually exits,
as the logic should have done ever since MDEV-14080.
srv_purge_shutdown(): A new function to wait for the
purge coordinator to exit. Before exiting, the
purge coordinator will ensure that all purge workers have exited.
This is the MariaDB 10.2 version of the patch.
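A minimal standalone sketch of the handshake (illustrative names and a
plain atomic flag; the real code uses InnoDB's own events and thread
machinery):

    #include <atomic>
    #include <chrono>
    #include <thread>

    static std::atomic<bool> purge_coordinator_exited{false};

    // Sketch of srv_purge_shutdown(): block until the purge coordinator
    // has reported that it is gone (it in turn waits for all purge
    // workers before setting the flag).
    static void srv_purge_shutdown_sketch()
    {
      while (!purge_coordinator_exited.load())
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }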
field_store_string(): Simplify the code.
field_store_index_name(): Remove, and use field_store_string()
instead. Starting with MariaDB 10.2.2, there is the predicate
dict_index_t::is_committed(), and dict_index_t::name never
contains the magic byte 0xff.
Correct some comments to refer to TEMP_INDEX_PREFIX_STR.
i_s_cmp_per_index_fill_low(): Use the appropriate value NULL to
identify that an index was not found. Check that storing each
column value succeeded.
i_s_innodb_buffer_page_fill(), i_s_innodb_buf_page_lru_fill():
Only invoke Field::set_notnull() if the index was found.
(This fixes the bug.)
i_s_dict_fill_sys_indexes(): Adjust the index->name that was
directly loaded from SYS_INDEXES.NAME (which can start with
the 0xff byte). This was the only function that depended
on the translation in field_store_index_name().
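A condensed sketch of the simplified helper (toy Field type; the real
function lives in the InnoDB I_S code): store a C string, or SQL NULL
when the pointer is NULL, and report whether storing succeeded:

    #include <cstring>

    // Toy stand-in for the server's Field class.
    struct Field
    {
      bool store(const char *s, std::size_t n) { (void) s; (void) n; return false; }
      void set_notnull() {}
      void set_null() {}
    };

    // Returns nonzero if storing the column value failed, so callers
    // like i_s_cmp_per_index_fill_low() can check the result.
    static int field_store_string(Field *field, const char *str)
    {
      if (str == nullptr)
      {
        field->set_null();            // e.g. the index was not found
        return 0;
      }
      field->set_notnull();
      return field->store(str, std::strlen(str)) ? 1 : 0;
    }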
MDEV-16123 ASAN heap-use-after-free handler::ha_index_or_rnd_end
MDEV-13828 Segmentation fault on RENAME TABLE
The problem was that the destructor called methods on a closed table.
Fixed by removing that code from the destructor.
The crash happened when deleting all columns that were part of a check
constraint.
The bug was that the read map of the 'from' table was used when
checking the CHECK constraint, and it was not properly reset
in copy_data_between_tables().
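A standalone sketch of the save-and-restore pattern (toy types; the
real code manipulates MY_BITMAP pointers on TABLE): point the table at
the correct read map while evaluating the CHECK constraint, then
restore it:

    #include <bitset>

    struct TABLE_sketch { std::bitset<64> *read_set; };

    // Evaluate a CHECK constraint against the 'to' table using its own
    // map; restoring the old map afterwards is what the bug was missing.
    static bool check_constraint(TABLE_sketch *to, std::bitset<64> *to_map)
    {
      std::bitset<64> *old_map= to->read_set;
      to->read_set= to_map;           // constraint must see the right columns
      bool result= true;              // ... evaluate the CHECK items here ...
      to->read_set= old_map;          // reset so later copying is unaffected
      return result;
    }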
The problem was that the detection of temporary tables was all wrong
for RENAME TABLE.
(Temporary tables were opened by a top-level call to
open_temporary_tables(), which cannot detect if a temporary table
was renamed to something and then reused.)
Fixed by adding proper parsing of the rename list to check against
the current name of a table at each rename stage (see the sketch
below).
Also changed do_rename_temporary() to check against the current
state of temporary tables, not the state at the start
of RENAME TABLE.
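A toy sketch of the tracking (a std::map stands in for the server's
temporary-table list): resolve each rename step against the current
name, then record the new one, so "RENAME t1 TO tmp, tmp TO t2" is
resolved against the state after each step:

    #include <map>
    #include <string>

    // Maps the *current* name of each temporary table to its definition.
    static bool rename_stage(std::map<std::string, std::string> &tmp_tables,
                             const std::string &from, const std::string &to)
    {
      auto it= tmp_tables.find(from);
      if (it == tmp_tables.end())
        return false;                 // not a temporary table at this stage
      std::string def= it->second;
      tmp_tables.erase(it);
      tmp_tables[to]= def;            // the table is now known under 'to'
      return true;
    }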
MDEV-10130 Assertion `share->in_trans == 0' failed in storage/maria/ma_close.c
MDEV-10378 Assertion `trn' failed in virtual int ha_maria::start_stmt
The problem was that maria_handler->trn was not properly reset
at commit/rollback, and ha_maria::external_lock() could get
confused because of that.
There was some old code in ha_maria::implicit_commit() that tried
to take care of this, but it was not bulletproof.
Fixed by adding a list of all tables that are part of the Maria
transaction to TRN, as sketched below.
A nice side effect of the fix is that the loops in
ha_maria::implicit_commit() became much simpler.
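A minimal sketch of the list (illustrative types, not Aria's actual
TRN/MARIA_SHARE structures): linking every used table into TRN lets
commit/rollback reliably reset each handler's trn pointer:

    struct MARIA_HA_sketch { void *trn; };
    struct table_link { MARIA_HA_sketch *handler; table_link *next; };
    struct TRN_sketch { table_link *used_tables= nullptr; };

    // On commit/rollback, walk the list and reset every handler's trn
    // pointer; an assert in close can then verify the table was removed.
    static void trn_reset_tables(TRN_sketch *trn)
    {
      for (table_link *l= trn->used_tables; l; )
      {
        table_link *next= l->next;
        l->handler->trn= nullptr;     // handler no longer points at trn
        delete l;
        l= next;
      }
      trn->used_tables= nullptr;
    }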
Other things:
- Fixed a bug in mysql_admin_table() where the argument open_for_modify
was wrongly reset for the next table in the chain.
- Roll back the admin command also in case of a fatal error.
- Split _ma_set_trn_for_table() into three versions to simplify the code
and debugging.
- Several new asserts to detect the original problem (that the file was
not properly removed from trn before calling ma_close()).
Step#1: RocksDB files require a special #define when they are compiled
with valgrind. Without it, valgrind fails with an 'unimplemented syscall'
error for the fcntl call.
order with Galera and encrypt-tmp-files=1
Problem: If trans_cache (IO_CACHE) uses an encrypted tmp file,
the server will crash on the next DML.
Case: Take a table t1 and do two inserts into it:
1. A really long insert, so that trans_cache has to use a temp file.
2. Just a small insert.
Analysis: The server actually crashes from inside the Galera
library:
/lib64/libc.so.6(abort+0x175)[0x7fb5ba779dc5]
/usr/lib64/galera/libgalera_smm.so(_ZN6galera3FSMINS_9TrxHandle5State...
mysys/stacktrace.c:247(my_print_stacktrace)[0x7fb5a714940e]
sql/signal_handler.cc:160(handle_fatal_signal)[0x7fb5a715c1bd]
sql/wsrep_hton.cc:257(wsrep_rollback)[0x7fb5bcce923a]
sql/wsrep_hton.cc:268(wsrep_rollback)[0x7fb5bcce9368]
sql/handler.cc:1658(ha_rollback_trans(THD*, bool))[0x7fb5bcd4f41a]
sql/handler.cc:1483(ha_commit_trans(THD*, bool))[0x7fb5bcd4f804]
But the actual issue is not in Galera but in MariaDB: for the 2nd
insert we should never call rollback. We call rollback because
log_and_order fails; it fails because write_cache fails; and that
fails because after reinit_io_cache(trans_cache), my_b_bytes_in_cache
says 0, so we look into the tmp file for data, which is obviously
wrong since the temp file was used for the previous insert and no
longer exists.
wsrep_write_cache_inc() reads the IO_CACHE in a loop, filling it with
my_b_fill() until it returns "0 bytes read". Later
MYSQL_BIN_LOG::write_cache() does the same. wsrep_write_cache_inc()
assumes that reading zero bytes past EOF leaves the old data in the
cache.
Solution: There are two issues in my_b_encr_read():
1. We should never set read_end equal to info->buffer; this does
not make sense, as read_end should always point to the end of the
buffer.
2. In most cases (apart from an async IO_CACHE), info->pos_in_file
should be equal to the position of info->buffer within the temp file;
since we are not changing info->buffer in this case, it should remain
unchanged.
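A condensed sketch of those two invariants (a simplified IO_CACHE with
mysys-like field names; the real fix lives in my_b_encr_read()):

    // Simplified IO_CACHE: buffer holds decrypted data from the temp file.
    struct IO_CACHE_sketch
    {
      unsigned char *buffer;          // start of the cache buffer
      unsigned char *read_pos;        // next byte the caller will read
      unsigned char *read_end;        // end of valid data in the buffer
      unsigned long long pos_in_file; // temp-file offset of buffer[0]
    };

    // Called when a read past EOF returns zero bytes: keep the old
    // buffer contents visible instead of pretending the cache is empty.
    static void handle_eof_read(IO_CACHE_sketch *info,
                                unsigned long long valid_bytes)
    {
      // Issue 1: read_end must point at the end of the buffered data;
      // setting it to info->buffer would make my_b_bytes_in_cache()
      // report 0 and send readers back to a stale (or gone) temp file.
      info->read_end= info->buffer + valid_bytes;
      // Issue 2: info->buffer was not refilled, so pos_in_file must
      // keep describing where info->buffer sits in the file; don't move it.
    }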
The failures with valgrind occur because Spider sometimes uses the
wrong transaction for operations in background threads that send
requests to the data nodes. The use of the wrong transaction caused
the networking to the data nodes to use the wrong thread in some
cases. Valgrind eventually detects this when such a thread is
destroyed before the wrong transaction, when freed, uses it to
disconnect from the data node.
I have fixed the problem by correcting the transaction used in each
of these cases.
Author:
Jacob Mathew.
Reviewer:
Kentoku Shiba.
Cherry-Picked:
Commit afe5a51 on branch 10.2
The failures with valgrind occur because Spider sometimes uses the
wrong transaction for operations in background threads that send
requests to the data nodes. The use of the wrong transaction caused
the networking to the data nodes to use the wrong thread in some
cases. Valgrind eventually detects this when such a thread is
destroyed before the wrong transaction, when freed, uses it to
disconnect from the data node.
I have fixed the problem by correcting the transaction used in each
of these cases.
Author:
Jacob Mathew.
Reviewer:
Kentoku Shiba.
Merged:
Commit 4d576d9 on branch bb-10.3-MDEV-12900
Explain_query must be created in the execution arena.
But JOIN::optimize_inner temporarily switches to the statement arena
under `if (sel->first_cond_optimization)`. This might cause
Explain_query to be allocated in the statement arena. Usually it is
harmless (although technically incorrect and a waste of memory), but
in the case of EXECUTE IMMEDIATE, the Prepared_statement object and
its statement arena are destroyed before the log_slow_statement()
call, which uses Explain_query.
Fix:
1. Create Explain_query before switching arenas.
2. Before filling the earlier-created Explain_query with data, set
thd->mem_root from Explain_query::mem_root.
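A toy sketch of the ordering (stand-in arena types; the real objects
are Query_arena and THD::mem_root):

    struct Arena {};                  // stand-in for a Query_arena mem_root
    struct THD_sketch { Arena *mem_root; };
    struct Explain_sketch { Arena *arena; };

    // 1) allocate the explain data while the execution arena is active;
    // 2) only then switch to the statement arena for the permanent
    //    transformations done under first_cond_optimization.
    static Explain_sketch *optimize(THD_sketch *thd, Arena *stmt_arena)
    {
      Explain_sketch *explain= new Explain_sketch{thd->mem_root};
      Arena *saved= thd->mem_root;
      thd->mem_root= stmt_arena;      // temporary switch
      // ... permanent query transformations ...
      thd->mem_root= saved;           // back to the execution arena
      return explain;                 // survives Prepared_statement death
    }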
It can happen that the connection is already gone during the window
between querying I_S.PROCESSLIST and KILL QUERY.
The fix is to tolerate ER_NO_SUCH_THREAD returned from KILL QUERY
(see the sketch below).
A small improvement: make the "Killing MDL query" message actually
output the query.
Also do not try to kill queries that are already in the Killed state.
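A sketch using the MySQL C API (connection setup and the PROCESSLIST
query are omitted): issue KILL QUERY and treat ER_NO_SUCH_THREAD as
success, since the connection may have gone away in the meantime:

    #include <mysql.h>
    #include <mysqld_error.h>
    #include <cstdio>

    static bool kill_query(MYSQL *con, unsigned long long id)
    {
      char buf[64];
      snprintf(buf, sizeof buf, "KILL QUERY %llu", id);
      if (mysql_query(con, buf) &&
          mysql_errno(con) != ER_NO_SUCH_THREAD)  // already gone: fine
      {
        fprintf(stderr, "KILL QUERY %llu failed: %s\n", id, mysql_error(con));
        return true;                              // a real error
      }
      return false;
    }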
It should work OK on all Unixes, but on Windows it only worked by
accident in the past, with the client not being Unicode safe.
It stopped working with the Visual Studio 2017 15.7 update.
The crash occurs when a thread that is closing its connection attempts
to access Spider transaction information after another thread has
freed that memory while processing Spider plugin deinit. This occurs
because Spider does not adjust the plugin's reference count when it
sets a transaction information pointer for the plugin.
The fix I implemented changes the way Spider sets the transaction
information pointer to use thd_set_ha_data(), so that Spider's plugin
reference counter is adjusted as well.
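A toy sketch of why routing the pointer through a setter matters
(stand-in types; the real call is the server's thd_set_ha_data(),
which pins the plugin for the connection):

    struct plugin_ref_sketch { int refs= 0; void *ha_data= nullptr; };

    // Setting per-connection data through a setter lets the server
    // adjust the plugin's reference count, so plugin deinit cannot free
    // the transaction data while a connection still uses it.
    static void set_ha_data(plugin_ref_sketch *plugin, void *trx)
    {
      if (trx && !plugin->ha_data)
        plugin->refs++;               // connection now pins the plugin
      else if (!trx && plugin->ha_data)
        plugin->refs--;               // data cleared: release the pin
      plugin->ha_data= trx;
    }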
Author:
Jacob Mathew.
Reviewer:
Kentoku Shiba.
Merged From:
Commit ab9d420 on branch 10.2
Fix two issues:
1. Rdb_ddl_manager::rename() loses the value of m_hidden_pk_val. The
new object used to get 0, which means "not loaded from the db yet".
2. ha_rocksdb::load_hidden_pk_value() uses the current transaction
(and its snapshot) when loading the hidden PK value from disk. This
may cause it to load an out-of-date value, as illustrated below.
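For issue 2, an illustrative sketch with the public RocksDB API (not
the actual MyRocks fix): read with options that do not pin an old
snapshot, so the latest committed counter value is seen:

    #include <rocksdb/db.h>
    #include <string>

    // Load the stored maximum hidden-PK value using the latest state,
    // not the calling transaction's (possibly stale) snapshot.
    static std::string load_latest_value(rocksdb::DB *db,
                                         const rocksdb::Slice &key)
    {
      rocksdb::ReadOptions opts;
      opts.snapshot= nullptr;         // nullptr = read latest committed data
      std::string value;
      db->Get(opts, key, &value);     // status check omitted for brevity
      return value;
    }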
The current code does not support recursive CTEs whose specifications
contain a mix of UNION ALL and UNION DISTINCT operations.
This patch catches such specifications and reports errors for them.
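A toy sketch of the detection (a simplified list node, not the
sql/sql_lex.h structures): scan the UNION parts of the CTE
specification and flag a mix:

    // Each node represents one SELECT in the UNION chain; 'distinct'
    // is the property of the UNION operator preceding it.
    struct Select_sketch { bool distinct; Select_sketch *next; };

    static bool has_mixed_union(Select_sketch *first)
    {
      bool saw_all= false, saw_distinct= false;
      for (Select_sketch *s= first->next; s; s= s->next)
      {
        if (s->distinct)
          saw_distinct= true;
        else
          saw_all= true;
      }
      return saw_all && saw_distinct;   // caller reports the error
    }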
with recursive subquery
There were two problems:
1. The code did not report that usage of global ORDER BY / LIMIT clauses
was not supported yet.
2. The code just reset fake_select_lex of the unit specifying
a recursive CTE to NULL, and that caused memory leaks in some cases.
The crash occurs when a thread that is closing its connection attempts
to access Spider transaction information after another thread has
freed that memory while processing Spider plugin deinit. This occurs
because Spider does not adjust the plugin's reference count when it
sets a transaction information pointer for the plugin.
The fix I implemented changes the way Spider sets the transaction
information pointer to use thd_set_ha_data(), so that Spider's plugin
reference counter is adjusted as well.
Author:
Jacob Mathew.
Reviewer:
Kentoku Shiba.
Merged From:
Commit eabfadc on branch bb-10.3-MDEV-7914
jemalloc > 5.0.0 doesn't like to be linked with
a dlopen-ed module.
Don't link TokuDB with jemalloc on Fedora 28;
LD_PRELOAD it instead with mysqld_safe
and with systemd.
Fixed by extending unique_table() with a flag to disallow usage of
the replaced table.
I also cleaned up find_dup_table() to not use 'goto next' and added
more comments to its code.
Analyze the core independently of the max-save-datadir and
max-save-core settings.
Increment $num_saved_cores only if a core was actually saved.
"Move any core files from e.g. mysqltest" independently of the
max-save-datadir setting. Note: it may overwrite a core from mysqld,
which might not be desired (it worked this way even before).
srv_purge_coordinator_thread(): Wait for all purge worker threads
to actually exit. An analysis of a core dump of a hung 10.3 server
revealed that one srv_worker_thread did not exit, even though the
purge coordinator had exited. This caused kill_server_thread and
mysqld_main to wait indefinitely. The main InnoDB shutdown was
never called, because unireg_end() was never called.
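A standalone sketch of the missing wait (illustrative synchronization;
InnoDB uses its own events): the coordinator blocks until the worker
count reaches zero before exiting:

    #include <condition_variable>
    #include <mutex>

    static std::mutex purge_mutex;
    static std::condition_variable purge_cv;
    static int n_workers= 4;

    static void srv_worker_exit()
    {
      std::lock_guard<std::mutex> lk(purge_mutex);
      if (--n_workers == 0)
        purge_cv.notify_all();        // last worker wakes the coordinator
    }

    static void wait_for_workers()    // called by the purge coordinator
    {
      std::unique_lock<std::mutex> lk(purge_mutex);
      purge_cv.wait(lk, []{ return n_workers == 0; });
    }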
Imported the following test cases from MySQL to MariaDB:
1) innodb.alter_kill
2) innodb.alter_foreign_crash
3) innodb.alter_rename_files
4) innodb.analyze_table
5) Appended the case in innodb-online-alter-gis
Problem:
We keep pinning pages in dict_stats_analyze_index_below_cur()
but don't release these pages. When we have a relatively small
buffer pool size and a big innodb_stats_persistent_sample_pages,
there will be no free pages left for use.
Solution:
Use a separate mtr in dict_stats_analyze_index_below_cur(),
and commit the mtr before returning (sketched below).
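A toy sketch of the pattern (mtr_t reduced to a stub): give the
sampling helper its own mini-transaction and commit it before
returning, releasing every page it pinned:

    // Stub mini-transaction: committing releases all page latches/pins.
    struct mtr_sketch
    {
      void start() {}                 // begin the mini-transaction
      void commit() {}                // release every page pinned under it
    };

    static void analyze_index_below_cur_sketch()
    {
      mtr_sketch mtr;
      mtr.start();                    // own mtr, independent of the caller's
      // ... descend to the leaf level and sample pages under 'mtr' ...
      mtr.commit();                   // pages released before returning, so
                                      // the buffer pool is not exhausted
    }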
Reviewed-by: Jimmy Yang <jimmy.yang@oracle.com>
RB: 11362