Analysis:
========
When "Profiling" is enabled, server collects the resource usage of each
statement that gets executed in current session. Profiling doesn't support
nested statements. In order to ensure this behavior when profiling is enabled
for a statement, there should not be any other active query which is being
profiled. This active query information is stored in 'current' variable. When
a nested query arrives it finds 'current' being not NULL and server aborts.
When 'init_connect' and 'init_slave' system variables are set they contain a
set of statements to be executed. "execute_init_command" is the function call
which invokes "dispatch_command" for each statement provided in
'init_connect', 'init_slave' system variables. "execute_init_command" invokes
"start_new_query" and it passes the statement list to "dispatch_command". This
"dispatch_command" intern invokes "start_new_query" which leads to nesting of
queries. Hence '!current' assert is triggered.
Fix:
===
Remove profiling from "execute_init_command" as it will be done within
"dispatch_command" execution.
The code in fill_schema_schemata() did not take into account that
make_db_list() can leave empty db_names if the requested database
name was too long, so the call for db_names.at(0) crashed on assert.
- Moving the code testing if the database directory exists
into a separate function verify_database_directory_exists()
- Modifying the test to check if db_names is not empty
Problem:
=======
The "Start binlog_dump" message hasn't been updated to include the slave's
requested GTID position:
20:05:05 139836760311552 [Note] Start binlog_dump to slave_server(2), pos(, 4)
For diagnostic purposes, it would be helpful if the GTID position were
included.
Fix:
===
Imporve "Start binlog_dump" print message to include "using_gtid" and
"GTID position" requested by slave.
Ex:
[Note] Start binlog_dump to slave_server(2), pos(, 4), using_gtid(1),
gtid('1-1-201,2-2-100')
[Note] Start binlog_dump to slave_server(3), pos('mariadb-bin.004142',
507988273), using_gtid(0), gtid('')
The code erroneously used buff[100] in a fiew places to make
a GRANTEE value in the form:
'user'@'host'
Fix:
- Fixing the code to use (USER_HOST_BUFF_SIZE + 6) instead of 100.
- Adding a DBUG_ASSERT to make sure the buffer is enough
- Wrapping the code into a class Grantee_str, to reuse it easier in 4 places.
The real problem was that attempt to roll back cahnes after end of memory in QC was made incorrectly and lead to using uninitialized memory.
(bug has nothing to do with resize operation, it is just lack of resources erro processed incorrectly)
In case of SELECT without tables which returns either 0 or 1 rows,
JOIN::exec_inner() did not check if the flag representing SQL_CALC_FOUND_ROWS
is set or not and send_records was direclty assigned 0. So SELECT FOUND_ROWS()
was giving 0 in the output. Now it checks if the flag is set, if it is set
send_record=1 else 0. 1 is the number of rows that could have been sent
to the client if the SELECT query had SQL_CALC_FOUND_ROWS.
It is 0 when no rows were sent because the SELECT query did not have
SQL_CALC_FOUND_ROWS.
For low sort_buffer_size, in the cost calculation of using the Unique object the elements in the tree were evaluated to 0, make sure to have atleast 1 element in the Unique tree.
Also for the function Unique::get allocate memory for atleast MERGEBUFF2+1 keys.
For DECIMAL[(M[,D])] datatype max_sort_length was not being honoured which was leading to buffer
overflow while making the sort key. The fix to this problem would be to create sort keys for decimals
with atmost max_sort_key bytes
Important:
The minimum value of max_sort_length has been raised to 8 (previously was 4),
so fixed size datatypes like DOUBLE and BIGINIT are not truncated for
lower values of max_sort_length.
Backported the support for aborting and replaying stored procedure and fix for trigger
key assigments from 10.4 version.
Backported also two mtr tests: wsrep_sp_bf_abort and MDEV-20225
Previously multiple threads were allowed to load histograms concurrently.
There were no known problems caused by this. But given amount of data
races in this code, it'd happen sooner or later.
To avoid scalability bottleneck, histograms loading is protected by
per-TABLE_SHARE atomic variable.
Whenever histograms were loaded by preceding statement (hot-path), a
scalable load-acquire check is performed.
Whenever histograms have to be loaded anew, mutual exclusion for loaders
is established by atomic variable. If histograms are being loaded
concurrently, statement waits until load is completed.
- Table_statistics::total_hist_size moved to TABLE_STATISTICS_CB: only
meaningful within TABLE_SHARE (not used for collected stats).
- TABLE_STATISTICS_CB::histograms_can_be_read and
TABLE_STATISTICS_CB::histograms_are_read are replaced with a tri state
atomic variable.
- Simplified away alloc_histograms_for_table_share().
Note: there's still likely a data race if a thread attempts accessing
histograms data after it failed to load it (because of concurrent load).
It was there previously and goes out of the scope of this effort. One way
of fixing it could be reviving TABLE::histograms_are_read and adding
appropriate checks whenever it is needed.
Part of MDEV-19061 - table_share used for reading statistical tables is
not protected
Previously multiple threads were allowed to load statistics concurrently.
There were no known problems caused by this. But given amount of data
races in this code, it'd happen sooner or later.
To avoid scalability bottleneck, statistics loading is protected by
per-TABLE_SHARE atomic variable.
Whenever statistics were loaded by preceding statement (hot-path), a
scalable load-acquire check is performed.
Whenever statistics have to be loaded anew, mutual exclusion for loaders
is established by atomic variable. If statistics are being loaded
concurrently, statement waits until load is completed.
TABLE_STATISTICS_CB::stats_can_be_read and
TABLE_STATISTICS_CB::stats_is_read are replaced with a tri state atomic
variable.
Part of MDEV-19061 - table_share used for reading statistical tables is
not protected
Removed redundant loops, integrated logics into the caller instead.
Unified condition in read_statistics_for_tables(), less
"table_share != NULL" checks, no more potential "table_share == NULL"
dereferencing.
Part of MDEV-19061 - table_share used for reading statistical tables is
not protected
In Item_nodeset_func_predicate::val_nodeset, args[1] is not necessarily
an Item_func descendant. It can be Item_bool.
Removing a wrong cast. It was not really needed anyway.
- `SET DEFAULT ROLE xxx [FOR yyy]` should say:
"User yyy has not been granted a role xxx" if:
- The current user (not the user `yyy` in the FOR clause) can see the
role xxx. It can see the role if:
* role exists in `mysql.roles_mappings` (traverse the graph),
* If the current user has read access on `mysql.user` table - in
that case, it can see all roles, granted or not.
- Otherwise it should be "Invalid role specification".
In other words, it should not be possible to use `SET DEFAULT ROLE` to discover whether a specific role exist or not.
The immediate bug was caused by a failure to recognize a correct
position to stop the slave applier run in optimistic parallel mode.
There were the following set of issues that the analysis unveil.
1 incorrect estimate for the event binlog position passed to
is_until_satisfied
2 wait for workers to complete by the driver thread did not account non-group events
that could be left unprocessed and thus to mix up the last executed
binlog group's file and position:
the file remained old and the position related to the new rotated file
3 incorrect 'slave reached file:pos' by the parallel slave report in the error log
4 relay log UNTIL missed out the parallel slave branch in
is_until_satisfied.
The patch addresses all of them to simplify logics of log change
notification in either the master and relay-log until case.
P.1 is addressed with passing the event into is_until_satisfied()
for proper analisis by the function.
P.2 is fixed by changes in handle_queued_pos_update().
P.4 required removing relay-log change notification by workers.
Instead the driver thread updates the notion of the current relay-log
fully itself with aid of introduced
bool Relay_log_info::until_relay_log_names_defer.
An extra print out of the requested until file:pos is arranged
with --log-warning=3.
The code incorrectly assumed in multiple places that TYPELIB
values cannot have 0x00 bytes inside. In fact they can:
CREATE TABLE t1 (a ENUM(0x61, 0x0062) CHARACTER SET BINARY);
Note, the TYPELIB value encoding used in FRM is ambiguous about 0x00.
So this fix is partial.
It fixes 0x00 bytes in many (but not all) places:
- In the middle or in the end of a value:
CREATE TABLE t1 (a ENUM(0x6100) ...);
CREATE TABLE t1 (a ENUM(0x610062) ...);
- In the beginning of the first value:
CREATE TABLE t1 (a ENUM(0x0061));
CREATE TABLE t1 (a ENUM(0x0061), b ENUM('b'));
- In the beginning of the second (and following) value of the *last* ENUM/SET
in the table:
CREATE TABLE t1 (a ENUM('a',0x0061));
CREATE TABLE t1 (a ENUM('a'), b ENUM('b',0x0061));
However, it does not fix 0x00 when:
- 0x00 byte is in the beginning of a value of a non-last ENUM/SET
causes an error:
CREATE TABLE t1 (a ENUM('a',0x0061), b ENUM('b'));
ERROR 1033 (HY000): Incorrect information in file: './test/t1.frm'
This is an ambuguous case and will be fixed separately.
We need a new TYPELIB encoding to fix this.
Details:
- unireg.cc
The function pack_header() incorrectly used strlen() to detect
a TYPELIB value length. Adding a new function typelib_values_packed_length()
which uses TYPELIB::type_lengths[n] to detect the n-th value length,
and reusing the new function in pack_header() and packed_fields_length()
- table.cc
fix_type_pointers() assumed in multiple places that values cannot have
0x00 inside and used strlen(TYPELIB::type_names[n]) to set
the corresponding TYPELIB::type_lengths[n].
Also, fix_type_pointers() did not check the encoded data for consistency.
Rewriting fix_type_pointers() code to populate TYPELIB::type_names[n] and
TYPELIB::type_lengths[n] at the same time, so no additional loop
with strlen() is needed any more.
Adding many data consistency tests.
Fixing the main loop in fix_type_pointers() to use memchr() instead of
strchr() to handle 0x00 properly.
Fixing create_key_infos() to return the result in a LEX_STRING rather
that in a char*.
Analysis:
========
RESET MASTER TO # command deletes all binary log files listed in the index
file, resets the binary log index file to be empty, and creates a new binary
log with number #. When the user provided binary log number is greater than
the max allowed value '2147483647' server fails to generate a new binary log.
The RESET MASTER statement marks the binlog closure status as
'LOG_CLOSE_TO_BE_OPENED' and exits. Statements which follow RESET MASTER
try to write to binary log they find the log_state != LOG_CLOSED and
proceed to write to binary log cache and it results in crash.
Fix:
===
During MYSQL_BIN_LOG open, if generation of new binary log name fails then the
"log_state" needs to be marked as "LOG_CLOSED". With this further statements
will find binary log as closed and they will skip writing to the binary log.
The assert was caused by early cleanup of a user variable participant
in BINLOG @var,@var where it plays twice. That was unexpected by the base
code to clear its value prematurely.
Fixed with relocating the user var destruction after operations with
its value is over.
The code erroneously allowed both:
INSERT INTO t1 (vcol) VALUES (DEFAULT);
INSERT INTO t1 (vcol) VALUES (DEFAULT(non_virtual_column));
The former is OK, but the latter is not.
Adding a new virtual method in Item:
virtual bool vcol_assignment_allowed_value() const { return false; }
Item_null, Item_param and Item_default_value override it.
Item_default_value overrides it in the way to:
- allow DEFAULT
- disallow DEFAULT(col)
For the case when the optimizer does the IN-EXISTS transformation,
the equality condition is injected in the WHERE OR HAVING clause of
the subquery. If the select list of the subquery has a reference to
the parent select make sure to use the reference and not the original
item.
The DECIMAL data type branch in Item_func_int_val::fix_length_and_dec()
incorrectly used DOUBLE-style length calculation, which resulted in
a smaller data type than the actual result of FLOOR()/CEIL() needs.
only MDL-prelock but do not open FK child tables for read-only (RESTRICT)
FK actions.
Tables still needs to be opened for CASCADE actions, see 9180e8666b
The event scheduler has a THD which is used for e.g. keeping track
of the timing of the events. Thus, each scheduling of an event will
make use of this THD, which in turn allocates memory in the THD's
mem root. However, the mem root was never cleared, and hence, the
memory occupied would monotonically increase throughout the life
time of the server.
The root cause was found by Jon Olav Hauglid, and this fix clears the
THD's mem root for each event being scheduled.
Change-Id: I462d2b9fd9658c9f33ab5080f7cd0e0ea28382df
in fact, in MariaDB it cannot, but it can show spurious slaves
in SHOW SLAVE HOSTS.
slave was registered in COM_REGISTER_SLAVE and un-registered after
COM_BINLOG_DUMP. If there was no COM_BINLOG_DUMP, it would never
unregister.
Cause:
In case of version based condtional comments, if the condition evaluates
to false, it is converted to a regular comment for replication by
replacing "!" by " ".
Nested comment in a conditional comment is replicated as is. Nested
comments are supported only in case of conditional comments and when a
the comment on slave is no more a conditional comment, the statement
execution fails on the slave.
Fix:
Convert the nested comment, start from "/*" to "(*" and comment end from
"*/" to "*)" for replication.
Change-Id: I1a8e385a267b2370529eade094f0258fa96886c0
This is a backport of the applicable part of
commit 93475aff8d and
commit 2c39f69d34
from 10.4.
Before 10.4 and Galera 4, WSREP_ON is a macro that points to
a global Boolean variable, so it is not that expensive to
evaluate, but we will add an unlikely() hint around it.
WSREP_ON_NEW: Remove. This macro was introduced in
commit c863159c32
when reverting WSREP_ON to its previous definition.
We replace some use of WSREP_ON with WSREP(thd), like it was done
in 93475aff8d. Note: the macro
WSREP() in 10.1 is equivalent to WSREP_NNULL() in 10.4.
Item_func_rand::seed_random(): Avoid invoking current_thd
when WSREP is not enabled.
_ma_fetch_keypage(): Correct an assertion that used to always hold.
Thanks to clang -Wint-in-bool-context for flagging this.
double_to_datetime_with_warn(): Suppress -Wimplicit-int-float-conversion
by adding a cast. LONGLONG_MAX converted to double will actually be
LONGLONG_MAX+1.
Since commit 7198c6ab2d
the ./mtr --embedded tests would fail to start innodb_plugin
because of an undefined reference to the symbol wsrep_log().
Let us define a stub for that function. The embedded server
is never built WITH_WSREP, but there are no separate storage
engine builds for the embedded server. Hence, by default,
the dynamic InnoDB storage engine plugin would be built WITH_WSREP
and it would fail to load into the embedded server library due to
a reference to the undefined symbol.
The first patch for the bug was erroneous: it did not take into account
the fact that the modified function get_key_scans_params() was called in
different contexts. As a result the patch caused a regression bug MDEV-22191.
The patch for this bug introduced an extra parameter. Actually we can
do without this parameter and use the fourth parameter for the same
puropose - to differentiate between the calls of the function for range
access and for index merge access.
Also removed the call of get_key_scans_params() in the code of the function
merge_same_index_scans() as not needed.
Several tests that involve stored procedures fail on 10.4 kvm-asan
(clang 10) due to stack overrun. The main contributor to this stack
overrun is mysql_execute_command(), which is invoked recursively
during stored procedure execution.
Rebuilding with cmake -DWITH_WSREP=OFF shrunk the stack frame size
of mysql_execute_command() by more than 10 kilobytes in a
WITH_ASAN=ON, CMAKE_BUILD_TYPE=Debug build. The culprit
turned out to be the macro WSREP_LOG, which is allocating a
separate 1KiB buffer for every occurrence.
We replace the macro with a function, so that the stack will be
allocated only when the function is actually invoked. In this way,
no stack space will be wasted by default (when WSREP and Galera
are disabled).
This backports commit b6c5657ef2
from MariaDB 10.3.1.
Without ASAN, compilers can be smarter and optimize the stack usage.
The original commit message mentions that 1KiB was saved on GCC 5.4,
and 4KiB on Mac OS X Lion, which presumably uses a clang-based compiler.
When index_merge_sort_union is turned off only ror scans were considered for range
scans, which is wrong.
To fix the problem ensure both ror scans and non ror scans are considered for range
access
The problem happened in these line:
uval0= (ulonglong) (val0_negative ? -val0 : val0);
uval1= (ulonglong) (val1_negative ? -val1 : val1);
return check_integer_overflow(val0_negative ? -(longlong) res : res,
!val0_negative);
when unary minus was performed on -9223372036854775808.
This behavior is undefined in C/C++.
This bug could manifest itself in a very rare cases when the optimizer
chose an execution plan by which a joined table was accessed by a table
scan and the optimizer was checking whether ranges checked for each record
could improve this plan. In such cases the optimizer evaluates range
conditions over a table that depend on other tables. For such conditions
the constructed SEL_ARG trees are marked as MAYBE_KEY. If a SEL_ARG object
constructed for a sargable condition marked as RANGE_KEY had the same
first key part as a MAYBE_KEY SEL_ARG object and the key_and() function
was called for this pair of SEL_ARG objects then an invalid SEL_ARG
object could be constructed that ultimately could lead to a crash before
the execution phase.