Problem:-
We are able to insert duplicate value in table because cmp_binary_offset
is not able to differentiate between NULL and empty string. So
check_duplicate_long_entry_key is never called and we don't check for
duplicate.
Solution
Added a if condition with is_null() on field which can differentiate
between NULL and empty string.
when assigning the cached item to the Item_cache for the first time
make sure to use Item_cache::setup(), not Item_cache::store().
Because the former copies the metadata (and allocates memory, in case
of Item_cache_row), and Item_cache::decimal must be set for
comparisons to work correctly.
Deallocation of TABLE_LIST::dt_handler and TABLE_LIST::pushdown_derived
was performed in multiple places if code. This not only made the code
more difficult to maintain but also led to memory leaks and
ASAN heap-use-after-free errors.
This commit puts deallocation of TABLE_LIST::dt_handler and
TABLE_LIST::pushdown_derived to the single point - JOIN::cleanup()
Per the code my_set_max_open_files 3 lines earlier, we attempt
to set the nofile (number of open files), rlimit to max_open_files.
We should use this in the warning because wanted_files may not
be the number.
When a range rowid filter was used with an index ref access the cost of
accessing the index entries for the records rejected by the filter was not
taken into account. For a ref access by an index with big average number
of records per key this led to poor execution plans if selectivity of the
used filter was high.
The patch resolves this problem. It also introduces a minor optimization
that skips look-ups into a filter that turns out to be empty.
With this patch the output of ANALYZE stmt reports the number of look-ups
into used rowid filters.
The patch also back-ports from 10.5 the code that properly sets the field
TABLE::file::table for opened temporary tables.
The test cases that were supposed to use rowid filters have been adjusted
in order to use similar execution plans after this fix.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
The ALTER related code cannot do at the same time both:
- modify partitions
- change column data types
Explicit changing of a column data type together with a partition change is
prohibited by the parter, so this is not allowed and returns a syntax error:
ALTER TABLE t MODIFY ts BIGINT, DROP PARTITION p1;
This fix additionally disables implicit data type upgrade
(e.g. from "MariaDB 5.3 TIME" to "MySQL 5.6 TIME", or the other way
around according to the current mysql56_temporal_format) in case of
an ALTER modifying partitions, e.g.:
ALTER TABLE t DROP PARTITION p1;
In such commands now only the partition change happens, while
the data types stay unchanged.
One can additionally run:
ALTER TABLE t FORCE;
either before or after the ALTER modifying partitions to
upgrade data types according to mysql56_temporal_format.
Abort startup, if SSL setup fails.
Also, for the server always check that certificate matches private key
(even if ssl_cert is not set, OpenSSL will try to use default one)
the only query of the XA transaction is on a non-transactional table
errors out:
XA BEGIN 'x';
--error ER_DUP_ENTRY
INSERT INTO t1 VALUES (1),(1);
XA END 'x';
XA PREPARE 'x';
The binlogging pattern is correctly started as expected with
the errored-out Query or its ROW format events, but there is
no empty XA_prepare_log_event group.
The following
XA COMMIT 'x';
therefore should not be logged either, but it does.
The bug is fixed with proper maintaining of a read-write binlog hton
property and use it to enforce correct binlogging decisions.
Specifically in the bug description case XA COMMIT won't be binlogged
in both when given in the same connection and externally after disconnect.
The same continue to apply to an empty XA that do not change any data in all
transactional engines involved.
Read the version of the view share when we read definition to prevent
simultaniouse access to a view table SHARE (and so its MEM_ROOT)
from different threads.
OpenSSL handles memory management using **OPENSSL_xxx** API[^1]. For
allocation, there is `OPENSSL_malloc`. To free it, `OPENSSL_free` should
be called.
We've been lucky that OPENSSL (and wolfSSL)'s implementation allowed the
usage of `free` for memory cleanup. However, other OpenSSL forks, such
as AWS-LC[^2], is not this forgiving. It will cause a server crash.
Test case `openssl_1` provides good coverage for this issue. If a user
is created using:
`grant select on test.* to user1@localhost require SUBJECT "...";`
user1 will crash the instance during connection under AWS-LC.
There have been numerous OpenSSL forks[^3]. Due to FIPS[^4] and other
related regulatory requirements, MariaDB will be built using them. This
fix will increase MariaDB's adaptability by using more compliant and
generally accepted API.
All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer Amazon Web
Services, Inc.
[^1]: https://www.openssl.org/docs/man1.1.1/man3/OPENSSL_malloc.html
[^2]: https://github.com/awslabs/aws-lc
[^3]: https://en.wikipedia.org/wiki/OpenSSL#Forks
[^4]: https://en.wikipedia.org/wiki/FIPS_140-2
st_select_lex::init_query is called in the exectuion of EXECUTE
IMMEDIATE 'alter table ...'. so reset the initialization at the
same point we set join= 0.
and also MDEV-25564, MDEV-18157.
Attempt to produce EXPLAIN output caused a crash in
Explain_node::print_explain_for_children. The cause of this was that an
Explain_node (actually a derived) had a link to child select#N, but
there was no query plan present for select#N.
The query plan wasn't present because the subquery was eliminated.
- Either it was a degenerate subquery like "(SELECT 1)" in MDEV-25564.
- Or it was a subquery in a UNION subquery's ORDER BY clause:
col IN (SELECT ... UNION
SELECT ... ORDER BY (SELECT FROM t1))
In such cases, legacy code structure in subquery/union processing code(*)
makes it hard to detect that the subquery was eliminated, so we end up
with EXPLAIN data structures (Explain_node::children) having dangling
links to child subqueries.
Do make the checks and don't follow the dangling links.
(In ideal world, we should not have these dangling links. But fixing
the code (*) would have high risk for the stable versions).
The bug is that we don't have a a lock on the trigger name, so it is
possible for two threads to try to create the same trigger at the same
time and both thinks that they have succeed.
Same thing can happen with drop trigger or a combinations of create and
drop trigger.
Fixed by adding a mdl lock for the trigger name for the duration of the
create/drop.
This was caused by the short_option_1-master.opt file that had the
option -T12, which means (among other things) to use blocking for
sockets. This was supported up to MariaDB 10.4, but not in 10.5 where
we removed the code that changes blocking sockets to non blocking in
case of errors.
Fixed by ignoring the TEST_BLOCKING flag and also by not using the -T12
argument in short_option_1.
Other things:
- Added back support for valgrind (the original issue had nothing to
do with valgrind).
- While debugging I noticed that the retry loop in
handle_connections_sockets() was doing a lot of work during shutdown.
Fixed by not doing retrys during shutdown.
The population of default values in INSERT SELECT was being
performed twice. With sequences, this resulted in every
second sequence value being used.
With SELECT INSERT we remove the second invokation of
table->update_default_fields(). This was already performed
in store_values() invoking fill_record_n_invoke_before_triggers()
which invoked update_default_fields() previously.
We do need to return an error on duplicate values, so the
::store_values is extended to take the ignore option.
=========== Problem =============
- `show columns` is not working for temporary tables, even though there
is enough privilege `create temporary tables`.
=========== Solution =============
- Append `TMP_TABLE_ACLS` privilege when running `show columns` for temp
tables.
- Additionally `check_access()` for database only once, not for each
field
=========== Additionally =============
- Update comments for function `check_table_access` arguments
Reviewed by: <vicentiu@mariadb.org>
For some queries that involve tables with different but convertible
character sets for columns taking part in the query, repeatable
execution of such queries in PS mode or as part of a stored routine
would result in server abnormal termination.
For example,
CREATE TABLE t1 (a2 varchar(10));
CREATE TABLE t2 (u1 varchar(10) CHARACTER SET utf8);
CREATE TABLE t3 (u2 varchar(10) CHARACTER SET utf8);
PREPARE stmt FROM
"SELECT t1.* FROM (t1 JOIN t2 ON (t2.u1 = t1.a2))
WHERE (EXISTS (SELECT 1 FROM t3 WHERE t3.u2 = t1.a2))";
EXECUTE stmt;
EXECUTE stmt; <== Running this prepared statement the second time
results in server crash.
The reason of server crash is that an instance of the class
Item_func_conv_charset, that created for conversion of a column
from one character set to another, is allocated on execution
memory root but pointer to this instance is stored in an item
placed on prepared statement memory root. Below is calls trace to
the place where an instance of the class Item_func_conv_charset
is created.
setup_conds
Item_func::fix_fields
Item_bool_rowready_func2::fix_length_and_dec
Item_func::setup_args_and_comparator
Item_func_or_sum::agg_arg_charsets_for_comparison
Item_func_or_sum::agg_arg_charsets
Item_func_or_sum::agg_item_set_converter
Item::safe_charset_converter
And the following trace shows the place where a pointer to
the instance of the class Item_func_conv_charset is passed
to the class Item_func_eq, that is created on a memory root of
the prepared statement.
Prepared_statement::execute
mysql_execute_command
execute_sqlcom_select
handle_select
mysql_select
JOIN::optimize
JOIN::optimize_inner
convert_join_subqueries_to_semijoins
convert_subq_to_sj
To fix the issue, switch to the Prepared Statement memory root
before calling the method Item_func::setup_args_and_comparator
in order to place any created Items on permanent memory root.
It may seem that such approach would result in a memory
leakage in case the parameter marker '?' is used in the query
as in the following example
PREPARE stmt FROM
"SELECT t1.* FROM (t1 JOIN t2 ON (t2.u1 = t1.a2))
WHERE (EXISTS (SELECT 1 FROM t3 WHERE t3.u2 = ?))";
EXECUTE stmt USING convert('A' using latin1);
but it wouldn't since for such case any of the parameter markers
is treated as a constant and no subquery to semijoin optimization
is performed.
See also commits aa8a31da and 64678c for a Bug #22990029 fix.
In this scenario INSERT chose to check if delete unmarking is available for
a just deleted record. To build an update vector, it needed to calculate
the vcols as well. Since this INSERT was not IGNORE-flagged, recalculation
failed.
Solutiuon: temporarily set abort_on_warning=true, while calculating the
column for delete-unmarked insert.
As of now innodb does not store trx_id for each record in secondary index.
The idea behind is following: let us store only per-page max_trx_id, and
delete-mark the records when they are deleted/updated.
If the read starts, it rememders the lowest id of currently active
transaction. Innodb refers to it as trx->read_view->m_up_limit_id.
See also ReadView::open.
When the page is fetched, its max_trx_id is compared to m_up_limit_id.
If the value is lower, and the secondary index record is not delete-marked,
then this page is just safe to read as is. Else, a clustered index could be
needed ato access. See page_get_max_trx_id call in row_search_mvcc, and the
corresponding switch (row_search_idx_cond_check(...)) below.
Virtual columns are required to be updated in case if the record was
delete-marked. The motivation behind it is documented in
Row_sel_get_clust_rec_for_mysql::operator() near
row_sel_sec_rec_is_for_clust_rec call.
This was basically a description why virtual column computation can
normally happen during SELECT, and, generally, a vcol index access.
Sometimes stats tables are updated by innodb. This starts a new
transaction, and it can happen that it didn't finish to the moment of
SELECT execution, forcing virtual columns recomputation. If the result was
a something that normally outputs a warning, like division by zero, then
it could be outputted in a racy manner.
The solution is to suppress the warnings when a column is computed
for the described purpose.
ignore_wrnings argument is added innobase_get_computed_value.
Currently, it is only true for a call from
row_sel_sec_rec_is_for_clust_rec.
MDEV-19243 introduced a regression on Windows.
In (supposedly rare) case, where environment variable TZ was set,
@@system_time_zone no longer derives from TZ. Instead, it incorrecty
refers to system default time zone, eventhough UTC time conversion
takes TZ into account.
The fix is to restore TZ-aware handling (timezone name derives from
tzname), if TZ is set.