TABLE::mark_columns_needed_for_update(): use_all_columns() assigns
pointer of all_set into read_set and write_set, but this is not good
since all_set is changed later by
TABLE::mark_columns_used_by_index_no_reset().
Do column_bitmaps_signal() whenever we change read_set/write_set.
Analysis
Mysqlbinlog output for encrypted binary log
#Q> insert into tab1 values (3,'row 003')
#190912 17:36:35 server id 10221 end_log_pos 980 CRC32 0x53bcb3d3 Table_map: `test`.`tab1` mapped to number 19
# at 940
#190912 17:36:35 server id 10221 end_log_pos 1026 CRC32 0xf2ae5136 Write_rows: table id 19 flags: STMT_END_F
Here we can see Table_map_log_event ends at 980 but Next event starts at 940.
And the reason for that is we do not send START_ENCRYPTION_EVENT to the slave
Solution:-
Send Start_encryption_log_event as Ignorable_log_event to slave(mysqlbinlog),
So that mysqlbinlog can update its log_pos.
Since Slave can request multiple FORMAT_DESCRIPTION_EVENT while master does not
have so We only update slave master pos when master actually have the
FORMAT_DESCRIPTION_EVENT. Similar logic should be applied for START_ENCRYPTION_EVENT.
Also added the test case when new server reads the data from old server which
does not send START_ENCRYPTION_EVENT to slave.
Master Slave Upgrade Scenario.
When Slave is updated first, Slave will have extra logic of handling
START_ENCRYPTION_EVENT But master willnot be sending START_ENCRYPTION_EVENT.
So there will be no issue.
When Master is updated first, It will send START_ENCRYPTION_EVENT to
slave , But slave will ignore this event in queue_event.
Fix rpl_skip_error test.
We cant reset Slave_skipped_errors(even with FLUSH STATUS), So instead
of absolute slave_skipped_errors we look for delta of slave_skipped_errors
Fix rpl.rpl_binlog_errors and binlog_encryption.rpl_binlog_errors
We create the $load_file and $load_file2 but we never remove them.
Fix rpl_000011.test
Instead of real value use delta value , Since flush status wont flush
LONGLONG variable.
Fix rpl_row_find_row_debug
Instead of searching whole log_error_ file we will use search_pattern_in_file
which runs pattern search only on latest test run , instead of full file.
Fix rpl_ip_mix rpl_ip_mix2
We should call reset slave all because we also want to reset master_host
otherwise show slave status wont be empty and making repeat N a failure.
Fix rpl_rotate_logs
First we have to remove master.info file (cleanup) and second we have to
call reset slave all because if we do not call reset slave all then we wont
read master.info file beacuse we already have master config in memory.
And this makes start slave to pass , which shoud fail becuase its permision
is 000
Fix circular_serverid0 test
The reason is that ++dbug_rows_event_count == 2 in queue_event does
not take --repeat into account. So I have reseted the dbug_rows_event_count
in if body.
calc_field_event_length should accurately calculate the size of BLOB type
fields, Instead of returning just the bytes taken by length it should return
length bytes + actual length.
read_statistics_for_tables_if_needed
Regression after 279a907, read_statistics_for_tables_if_needed() was
called after open_normal_and_derived_tables() failure.
Fixed by moving read_statistics_for_tables() call to a branch of
get_schema_stat_record() where result of open_normal_and_derived_tables()
is checked.
Removed THD::force_read_stats, added read_statistics_for_tables() instead.
Simplified away statistics_for_command_is_needed().
This applies to large allocations.
This maps to the way Linux does it in MDEV-10814 except FreeBSD uses
different constants.
Adjust error string to match to implementation.
Tested on FreeBSD-12.0
For release builds, do not declare unused variables.
unpack_row(): Omit a debug-only variable from WSREP diagnostic message.
create_wsrep_THD(): Fix -Wmaybe-uninitialized for the PSI_thread_key.
Analysis:
========
In general if there are three groups.
1 - Inserts 32 which fails due to local entry '32' on slave.
2 - Inserts 33
3 - Inserts 34
Each group considers itself as a waiter and it waits for prior group 'waitee'.
This is done in 'register_wait_for_prior_event_group_commit'. If there is no
other parallel group being scheduled then no waitee will be there.
Let us assume 3 groups are being scheduled in parallel.
3-> waits for 2-> waits for->1
'1' upon completion it checks is there any registered subsequent waiter. If
so it wakes up the subsequent waiter with its execution status. This execution
status is stored in wakeup_error.
If '1' failed then it sends corresponding wakeup_error to 2. Then '2' aborts
and it propagates error to '3'. So all further commits are aborted. This
mechanism works only when all transactions reach a stage where they are
waiting for their prior commit to complete.
In case of optimistic following scenario occurs.
1,2,3 are scheduled in parallel.
3 - Reaches group_commit_code waits for 2 to complete.
1 - errors out sets stop_on_error_sub_id=1.
When a group execution results in error its corresponding sub_id is set to
'stop_on_error_sub_id'. Any new groups queued for execution will check if
their sub_id is > stop_on_error_sub_id. If it is true their execution will be
skipped as prior group execution failed. 'skip_event_group=1' will be set.
Since the execution of SQL thread is about to stop we just skip execution of
all the following event groups. We still do all the normal waiting and wakeup
processing between the event groups as a simple way to ensure that everything
is stopped and cleaned up correctly.
Upon error '1' transaction checks for registered waiters. Since no one is
there it simply goes away.
2 - Starts the execution. It checks do I have a waitee.
Since wait_commit_sub_id == entry->last_committed_sub_id no waitee is set.
Secondly: 'entry->stop_on_error_sub_id' is set by '1'st execution. Now
'handle_parallel_thread' code checks if the current group 'sub_id' is greater
than the 'sub_id' set within 'stop_on_error_sub_id'.
Since the above is true 'skip_event_group=true' is set. Simply call
'wait_for_prior_commit' to wakeup all waiters. Group '2' didn't had any
waitee and its execution is skipped. Hence its wakeup_error=0.It sends a
positive wakeup signal to '3'. Which commits. This results in a missed
transaction. i.e 33 is missed and 34 is committed.
Fix:
===
When a worker learns that an earlier transaction execution has failed, and it
should not proceed for further execution, it should mark its own execution
status as failed so that it alerts its followers to abort as well.
For CMAKE_BUILD_TYPE=Debug, the default MYSQL_MAINTAINER_MODE=AUTO
implies -Werror along with other flags in cmake/maintainer.cmake,
which would break the debug builds when CMAKE_CXX_FLAGS include -O2.
This fix includes a backport of 6dd3f24090
from MariaDB 10.3.
Always initialize ScopedStatementReplication::saved_binlog_format,
so that GCC cannot emit a bogus warning about
ScopedStatementReplication::~ScopedStatementReplication() using the
variable.
The code was originally introduced in
commit d998da0306.
Also fixes:
MDEV-20560 Assertion `precision > 0' failed in decimal_bin_size upon SELECT with MOD short unsigned decimal
Changing the way how Item_func_mod calculates its max_length.
It now uses decimal_precision(), decimal_scale() and unsigned_flag
of its arguments, like all other Item_num_op descendants do.
MDEV-20589: Server still crashes in Field::set_warning_truncated_wrong_value
- Use dbug_tmp_use_all_columns() to mark that all fields can be used
- Remove field->is_stat_field (not needed)
- Remove extra arguments to Field::clone() that should not be there
- Safety fix for Field::set_warning_truncated_wrong_value() to not crash
if table is zero in production builds (We have got crashes several times
here so better to be safe than sorry).
- Threat wrong character string warnings identical to other field
conversion warnings. This removes some warnings we before got from
internal conversion errors. There is no good reason why a user would
get an error in case of 'key_field='wrong-utf8-string' but not for
'field=wrong-utf8-string'. The old code could also easily give
thousands of no-sence warnings for one single statement.
In the function test_if_cheaper_ordering we make a decision if using an index is better than
using filesort for ordering. If we chose to do range access then in test_quick_select we
should make sure that cost for table scan is set to DBL_MAX so that it is not picked.
A CTE can be defined as a table values constructor. In this case the CTE is
always materialized in a temporary table.
If the definition of the CTE contains a list of the names of the CTE
columns then the query expression that uses this CTE can refer to the CTE
columns by these names. Otherwise the names of the columns are taken from
the names of the columns in the result set of the query that specifies the
CTE.
Thus if the column names of a CTE are provided in the definition the
columns of result set should be renamed. In a general case renaming of
the columns is done in the select lists of the query specifying the CTE.
If a CTE is specified by a table value constructor then there are no such
select lists and renaming is actually done for the columns of the result
of materialization.
Now if a view is specified by a query expression that uses a CTE specified
by a table value constructor saving the column names of the CTE in the
stored view definition becomes critical: without these names the query
expression is not able to refer to the columns of the CTE.
This patch saves the given column names of CTEs in stored view definitions
that use them.
The flag is_stat_field is not set for the min_value and max_value of field items
inside table share. This is a must requirement as we don't want to throw
warnings of truncation when we read values from the statistics table to the column
statistics of table share fields.
Fix:
===
Implemented upstream fix.
commit 7d3d0fc303
Author: He Zhenxing <zhenxing.he@sun.com>
Backport Bug#45852 Semisynch: Last_IO_Error: Fatal error: Failed
to run 'after_queue_event' hook
Errors when send reply to master should never cause the IO thread
to stop, because master can fall back to async replication if it
does not get reply from slave.
The problem is fixed by deliberately ignoring the return value of
slave_reply.
Command COM_SHUTDOWN was rejected in non-Primary because
server_command_flags[COM_SHUTDOWN] had value CF_NO_COM_MULTI
instead of CF_SKIP_WSREP_CHECK.
As a fix removed assignment
server_command_flags[CF_NO_COM_MULTI]= CF_NO_COM_MULTI
which overwrote server_command_flags[COM_SHUTDOWN].
selectivity values fails
After having set the assertion that checks validity of selectivity values
returned by the function table_cond_selectivity() a test case from
order_by.tesst failed. The failure occurred because range optimizer could
return as an estimate of the cardinality of the ranges built for an index
a number exceeding the total number of records in the table.
The second bug is more subtle. It may happen when there are several
indexes with same prefix defined on the first joined table t accessed by
a constant ref access. In this case the range optimizer estimates the
number of accessed records of t for each usable index and these
estimates can be different. Only the first of these estimates is taken
into account when the selectivity of the ref access is calculated.
However the optimizer later can choose a different index that provides
a different estimate. The function table_condition_selectivity() could use
this estimate to discount the selectivity of the ref access. This could
lead to an selectivity value returned by this function that was greater
that 1.
best_access_path() is called from two optimization phases:
1. Plan choice phase, in choose_plan(). Here, the join prefix being
considered is in join->positions[]
2. Plan refinement stage, in fix_semijoin_strategies_for_picked_join_order
Here, the join prefix is in join->best_positions[]
It used to access join->positions[] from stage #2. This didnt cause any
valgrind or asan failures (as join->positions[] has been written-to before)
but the effect was similar to that of reading the random data:
The join prefix we've picked (in join->best_positions) could have
nothing in common with the join prefix that was last to be considered
(in join->positions).