Unit prepare prematurely fixed field which must be fixed via
setup_conds() to correctly update table->covering_keys.
Call vers_setup_conds() directly instead, because actually everything
else is not needed.
Wrong assertion condition. SYSTEM_TIME_ALL indicates that
vers_setup_conds() is done. In case FOR SYSTEM_TIME ALL is specified
in command the assertion passes but not checks anything.
Don't do skip_setup_conds() unless all errors are checked.
Fixes following errors:
ER_PERIOD_NOT_FOUND
ER_VERS_QUERY_IN_PARTITION
ER_VERS_ENGINE_UNSUPPORTED
ER_VERS_NOT_VERSIONED
LIMIT history partitions cannot be checked by existing algorithm of
check_misplaced_rows() because working history partition is
incremented each time another one is filled. The existing algorithm
gets record and tries to decide partition id for it by
get_partition_id(). For LIMIT history it will just get first
non-filled partition.
To fix such partitions it is required to do REBUILD instead of REPAIR.
When view is merged by DT_MERGE_FOR_INSERT it is then skipped from
processing and doesn't update WHERE clause with
vers_setup_conds(). Note that view itself cannot work in
vers_setup_conds() because it doesn't have row_start, row_end
fields. Thus it is required to descend down to material TABLE_LIST
through calls of mysql_derived_prepare() and run vers_setup_conds()
from there. Luckily, all views (views of views, views of views of
views, etc.) are linked in one list through next_global pointer, so we
can skip all views of views and get straight to non-view TABLE_LIST by
checking its merge_underlying_list property for zero value (it is
assigned by DT_MERGE_FOR_INSERT for merged derived tables).
We have to do that only for UPDATE and DELETE. Other DML commands
don't use WHERE clause.
MDEV-21146 Assertion `m_lock_type == 2' in handler::ha_drop_table upon LOAD DATA
LOAD DATA does not use WHERE and the above call of vers_setup_conds()
is not needed. unit->prepare() led to wrongly locked temporary table.
"write set" for replication finally got its correct place
(mark_columns_per_binlog_row_image()). When done generally in
mark_columns_needed_for_update() it affects optimization
algorithm. used_key_is_modified, query_plan.using_io_buffer are
wrongly set and that leads to wrong prepare_for_keyread() which limits
read_set.
Turn read cache off for update and multi-update for versioned
table. no_cache is reinited on each TABLE open because it is
applicable for specific algorithms.
As a side fix vers_insert_history_row() honors vers_write setting.
Aria with row_format=fixed uses IO_CACHE of type READ_CACHE for
sequential read in update loop. When history row is inserted inside
this loop the cache misses it and fails with error.
TODO:
Currently maria_extra() does not support SEQ_READ_APPEND. Probably it
might be possible to use this type of cache.
Prior to this fix, when matching addresses using mask,
extra bits could be used for comparison, e.g to
match with "a.b.c.d/24" , 27 bits were compared rather than 24.
The patch fixes the calculation.
MDEV-18957 UPDATE with LIMIT clause is wrong for versioned partitioned tables
UPDATE, DELETE: replace linear search of current/historical records
with vers_setup_conds().
Additional DML cases in view.test
Fix partitioning and DS-MRR to work together
- In ha_partition::index_end(): take into account that ha_innobase (and
other engines using DS-MRR) will have inited=RND when initialized for
DS-MRR scan.
- In ha_partition::multi_range_read_next(): if the MRR scan is using
HA_MRR_NO_ASSOCIATION mode, it is not guaranteed that the partition's
handler will store anything into *range_info.
- In DsMrr_impl::choose_mrr_impl(): ha_partition will inquire partitions
about how much memory their MRR implementation needs by passing
*buffer_size=0. DS-MRR code didn't know about this (actually it used
uint for buffer size calculation and would have an under-flow).
Returning *buffer_size=0 made ha_partition assume that partitions do
not need MRR memory and pass the same buffer to each of them.
Now, this is fixed. If DS-MRR gets *buffer_size=0, it will return
the amount of buffer space needed, but not more than about
@@mrr_buffer_size.
* Fix ha_{innobase,maria,myisam}::clone. If ha_partition uses MRR on its
partitions, and partition use DS-MRR, the code will call handler->clone
with TABLE (*NOT partition*) name as an argument.
DS-MRR has no way of knowing the partition name, so the solution was
to have the ::clone() function for the affected storage engine to ignore
the name argument and get it elsewhere.
Fix incorrect change introduced in the fix for MDEV-20109.
The patch tried to compute a more precise estimate for the record_count
value in SJ-Materialization-Scan strategy (in
Sj_materialization_picker::check_qep). However the new formula is worse
as it produces extremely optimistic results in common cases where
SJ-Materialization-Scan should be used)
The old formula produces pessimistic results in cases when Sj-Materialization-
Scan is unlikely to be a good choice anyway. So, the old behavior is better.
The assert indicates that the current transaction got caught uncleaned from
the semisync master's cache when it is signaled to proceed upon its
ack receive.
The reason of missed cleanup turns out to be a flaw in the gtid
connect mode.
A submitted by connecting slave value of its last received event's
binlog file *name* was adopted into
{{Repl_semi_sync_master::m_reply_file_name}} as a part of semisync
initialization.
Notice that the initialization still refines the position part of the
submitted last received event's binlog coordinates.
The master side binlog filename:pos refinement is
specific to the gtid connect mode for purpose of computing the latest
binlog file to resume slave feeding from.
Effectively in the gtid connect mode the computed resumption filename:pos
may appear smaller in which case a new post-connect time committing
transaction may be logged with its filename:pos also less than the
submitted coordinates and that triggers the assert.
Fixed with making the semisync initialization to use the refined filename:pos.
It is guaranteed to be less than any new generated transaction's binlog:pos.
The issue here is the wrong estimate of the cardinality of a partial join,
the cardinality is too high because the function table_cond_selectivity()
returns an absurd number 100 while selectivity cannot be greater than 1.
When accessing table t by outer reference t1.a via index we do not perform any
range analysis for t. Yet we see TABLE::quick_key_parts[key] and
TABLE->quick_rows[key] contain a non-zero value though these should have been
remained untouched and equal to 0.
Thus real cause of the problem is that TABLE::init does not clean the arrays
TABLE::quick_key_parts[] and TABLE::>quick_rows[].
It should have done it because the TABLE structure created for any
instance of a table can be reused for many queries.
In the function prev_record_reads where one finds the different row combinations for a
subset of partial join, it did not take into account the selectivity of tables
involved in the subset of partial join.
Unfortunate DROP TEMPORARY..IF EXISTS on a regular table may allow
subsequent CREATE TABLE statements to steal away the PFS_table_share
instance from the dropped table.
Partition table with the AUTO_INCREMENT column we ahve to check if the
max value is properly loaded. So we need to open all tables in INSERT
PARTITION statement if necessary. Also we need to check if some
tables are pruned away and not count the max autoincrement in this case.
* use compile_time_assert instead of DBUG_ASSERT
* don't use thd->clear_error(), because
* the error was already consumed by the error handler, so there is
nothing to clear
* it's dangerous to clear errors indiscriminately, if the error came
from outside of read_statistics_for_tables() it must not be cleared
mysql_insert() first opens all affected tables (which implicitly
starts a transaction in InnoDB), then stat tables.
A failure to open a stat table caused open_tables() to abort
the current stmt transaction (trans_rollback_stmt()). So, from the
server point of view the following ha_write_row()-s happened outside
of a transactions, and the server didn't bother to commit them.
The server has a mechanism to prevent a transaction being
unexpectedly committed or rolled back in the middle of a statement -
if an operation takes place _in a sub-statement_ it cannot change
the transaction state. Operations on stat tables are exactly that -
they are not allowed to change a transaction state. Put them in
a sub-statement to make sure they don't.
Basicaly it's an uninitialized read. 165 is 0xa5 which comes from TRASH_ALLOC()
Fix by calling a class ctor which initializes problematic
TMP_TABLE_PARAM::force_copy_fields field
Lock wait can happen on secondary index when doing FK checks for wsrep.
We should just return error to upper layer and applier will retry
operation when needed.
Relates to MDEV-17863 DROP TEMPORARY TABLE creates a transaction in
binary log on read only server
Other things:
- Fixed that insert into normal_table select from tmp_table is
replicated as row events if tmp_table doesn't exists on slave.
- Use local variables table and share to simplify code
- Use sql_command_flags to detect what kind of command was used
- Added CF_DELETES_DATA to simplify detecton of delete commands
- Removed duplicate error in create_table_from_items().
- Any temporary tables created under read-only mode will never be logged
to binary log. Any usage of these tables to update normal tables, even
after read-only has been disabled, will use row base logging (as the
temporary table will not be on the slave).
- Analyze, check and repair table will not be logged in read-only mode.
Other things:
- Removed not used varaibles in
MYSQL_BIN_LOG::flush_and_set_pending_rows_event.
- Set table_share->table_creation_was_logged for all normal tables.
- THD::binlog_query() now returns -1 if statement was not logged., This
is used to update table_share->table_creation_was_logged.
- Don't log admin statements in opt_readonly is set.
- Table's that doesn't have table_creation_was_logged will set binlog format to row
logging.
- Removed not needed/wrong setting of table->s->table_creation_was_logged
in create_table_from_items()
The code in convert_charset_partition_constant() did not
take into account that the call for item->safe_charset_converter()
can return NULL when conversion is not safe.
Note, 10.2 was not affected. The test for NULL presents in 10.2,
but it disappeared in 10.3 in a mistake. Restoring the test.
Problem was in a combination of LOCK TABLES on several Aria
tables followed by an ALTER TABLE. After the ALTER TABLE there
was a force close + reopen of the alter table. The close failed
because the table was still part of a transaction.
Fixed by calling extra(HA_EXTRA_PREPARE_FOR_FORCED_CLOSE) as
part of closing the table, which ensures that the table is not
anymore part of the current transaction.
Proper C-style type erasure is done via void*, not via char* or something else.
free_key_cache()
free_rpl_filter(): types were fixed to avoid function pointer type cast which
is still undefined behavior.
Note, that casting from void* to any other pointer type is safe and correct.