Let simplify the test.
The update_time is stored in the table metadata (dict_table_t);
it has nothing to do with buffer pool page eviction or replacement.
(addressed review input)
The issue was introduced by @@optimizer_max_sel_arg_weight code.
key_or() calls SEL_ARG::update_weight_locally(). That function
takes O(tree->elements) time.
Without that call, key_or(big_tree, one_element_tree) would take
O(log(big_tree)) when one_element_tree doesn't overlap with elements
of big_tree.
This means, update_weight_locally() can cause a big slowdown.
The fix:
1. key_or() actually doesn't need to call update_weight_locally().
It calls SEL_ARG::tree_delete() and SEL_ARG::insert(). These functions
update SEL_ARG::weight.
It also manipulates the SEL_ARG objects directly, but these
modifications do not change the weight of the tree.
I've just removed the update_weight_locally() call.
2. and_all_keys() also calls update_weight_locally(). It manipulates the
SEL_ARG graph directly.
Removed that call and added the code to update the SEL_ARG graph weight.
Tests main.range and main.range_not_embedded already contain the queries
that have test coverage for the affected code.
look for an installed plugin with the same name _and the same type_
(in case there are many plugins with the same name and different type,
which is, technically, possible for built-in plugins).
it's not "non deterministic", it's completely defined
by @@rand_seed1 and @@rand_seed2. And as a session func it needs
to be re-fixed at the beginning of every statement.
Test fixes:
Since fix for CONC-603 (wrong error handling in TLS read/write) in case
of a read/write error client doesn't return always error 2013 (server
has gone away), so in addition we need to check for error 2026
(TLS/SSL error) and 5014 (write error).
Starting with commit da094188f6 (MDEV-24393),
MariaDB will no longer acquire advisory file locks on InnoDB data
files by default, because it would create a large number of
entries in Linux /proc/locks.
The motivation for acquiring the file locks is to prevent accidental
concurrent startup of multiple server processes on the same data files.
Such mistake still turns out to be relatively common, based on
corruption bug reports from the community.
To prevent corruption due to concurrent startup attempts, the
Aria storage engine would unconditionally acquire an advisory lock
on one of its log files.
Solution: InnoDB will always lock its system tablespace files.
(Ever since commit 685d958e38
the InnoDB log file will not necessarily be open while the
server is running, because it can be accessed via memory-mapped I/O.)
If more protection is desired, then the option --external-locking
can be used.
The mandatory advisory lock also fixes intermittent failures of
some crash recovery tests. It turns out that when the mtr test harness
kills and restarts the server, it will not actually ensure that the
old process has terminated before starting the new one.
This bug could cause a crash of the server when executing queries containing
ANY/ALL predicands with redundant subqueries in GROUP BY clauses.
These subqueries are eliminated by remove_redundant_subquery_clause()
together with elimination of GROUP BY list containing these subqueries.
However the references to the elements of the GROUP BY remained in the
JOIN::all_fields list of the right operand of of the ALL/ANY predicand.
Later these references confused make_aggr_tables_info() when forming
proper execution structures after ALL/ANY predicands had been replaced
with expressions containing MIN/MAX set functions.
The patch just removes these references from JOIN::all_fields list used
by the subquery of the ALL/ANY predicand when its GROUP BY clause is
eliminated.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
prepare_inplace_add_virtual(): Over-estimate the size of the arrays
by not subtracting table->s->virtual_fields (which may refer to
stored, not virtual generated columns). InnoDB only distinguishes
virtual columns.
- Import tablespace re-evicts and reload the table definition. During that
time, innodb has to load the table even though the secondary fts index
marked as corrupted
- InnoDB should ignore the single word followed by apostrophe while
tokenising the document. Example is that if the input string is O'brien
then right now, InnoDB seperates into two tokens as O, brien. But
after this patch, InnoDB can ignore the token 'O' and consider
only 'brien'.
Unlike GCC, clang could optimize away alloca() and thus the
ALLOCATE_MEM_ON_STACK() instrumentation. To make it harder, let us
invoke a non-inline function on the entire allocated buffer.
The hang may be caused by a 1pc branch that was fixed by MDEV-26031 in
10.6 and up. That commit did not look relevant in 10.5 and below
so was not pushed to the low branches.
To possibly tackle the reported issue
the MDEV-26031 is backported now with a test that
unlike 10.6 does not expose the former bug in 10.5.
It is only needed for checking a refined logics
inside MYSQL_BIN_LOG::write_transaction_to_binlog.
The latter is made to do away with xid-unlogging (which is suspected
to have been at fault) for xid-less transaction.
This commit replaces sprintf(buf, ...) with
snprintf(buf, sizeof(buf), ...),
specifically in the "easy" cases where buf is allocated with a size
known at compile time.
The changes make sure we are not write outside array/string bounds which
will lead to undefined behaviour. In case the code is trying to write
outside bounds - safe version of functions simply cut the string
messages so we process this gracefully.
All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the BSD-new
license. I am contributing on behalf of my employer Amazon Web Services,
Inc.
bsonudf.cpp warnings cleanup by Daniel Black
Reviewer: Daniel Black
Problem:
=======
This patch addresses two issues:
1. An incident event can be incorrectly reported for transactions
which are rolled back successfully. That is, an incident event
should only be generated for failed “non-transactional transactions”
(i.e., those which modify non-transactional tables) because they
cannot be rolled back.
2. When the mariadb slave (error) stops at receiving the incident
event there's no description of what led to it. Neither in the event
nor in the master's error log.
Solution:
========
Before reporting an incident event for a transaction, first validate
that it is “non-transactional” (i.e. cannot be safely rolled back).
To determine if a transaction is non-transactional,
lex->stmt_accessed_table(LEX::STMT_WRITES_NON_TRANS_TABLE)
is used because it is set previously in
THD::decide_logging_format().
Additionally, when an incident event is written, write an error
message to the server’s error log to indicate the underlying issue.
Reviewed by:
===========
Andrei Elkin <andrei.elkin@mariadb.com>
dict_load_foreigns(): Use a correctly sized buffer for the maximum-length
SYS_FOREIGN.ID. In case of overflow, do not crash the server but instead
return DB_CORRUPTION.
This commit is a fixup for MDEV-28762
Analysis: Some recursive json functions dont check for stack control
Fix: Add check_stack_overrun(). The last argument is NULL because it is not
used
Passing $opt_parallel as $childs is wrong: child can be killed before
it connects and you will never decrement $childs for this.
Another problem is (and that is the cause of this bug): child can be
killed and never close server socket. This can happen f.ex. after
unmaskable KILL signal. In such case the socket is closed by reaping
the child but that never happens inside reading the socket loop in
run_test_server().
The proper design is the waitless reap of children inside the socket
loop and if there is no more children we finish the socket loop. Since
there is Windows variation where we don't control the children via
waitpid(), all the clients must normally close the socket and only
this can finish the socket loop. For Unix variation we reckon that
case as all children closed the socket but not all yet died and for
that we do final waiting waitpid() (was done before the patch as
well).
To be more complete, we now handle 3 end-of-game scenarios in Unix:
1. all children closed socket, all children died: everything is
handled by the socket loop;
2. all children closed socket, not all yet died: we wait for alive
children to die after exiting the socket loop;
3. not all children closed socket, all children died: everything is
handled by the socket loop.
For Windows end-of-game scenario is only one:
All children close the socket.
66832e3a introduced change that prints core dumps in very detailed
format. That's completely out of user-friendliness but serves as a
measure for debugging hard-reproducible bugs.
The proper way to implement this:
1. it must be controlled by command-line and environment variable;
2. detailed traces must be default for buildbots only, for user
invocations normal stack traces should be printed.
Options for control are: MTR_PRINT_CORE and --print-core that accept
the following values:
no Don't print core
short Print stack trace of failed thread
medium Print stack traces of all threads
detailed Print all stack traces with debug context
custom:<code> Use debugger commands <code> to print stack trace
Default setting is: short (see env_or_default() call in pre_setup())
For environment variable wrong values are silently ignored (falls back
to default setting, see env_or_default()).
Command-line option --print-core (or -C) overrides environment
variable. Its default value is 'short' if not specified explicitly
(same env_or_default() call in pre_setup()). Explicit values are
checked for validity.
--print-method option can specify by which debugger we print
cores. For Windows there is only one choice: cdb. For Unix the values
are: gdb, dbx, lldb, auto. Default value is: auto
In 'auto' we try to use all possible debuggers until success.
setup_boot_args(), setup_client_args(), setup_args() traversing
datastructures on each invocation. Even if performance is not
important to perl script (though it definitely saves some CO2), this
nonetheless provokes some code-reading questions. Reading and
debugging such code is not convenient.
The better way is to prepare all the data in advance in an easily
readable form as well as do the validation step before any further
processing.
Use mtr_report() instead of die() like the other code does.
TODO: do_args() does even more data processing magic. Prepare that
data according the above strategy in advance in pre_setup() if possible.