This patch introduces support for the system variable eq_range_index_dive_limit
that has existed in MySQL since 5.6. The variable sets a limit on
index dives into equality ranges. Index dives are performed by the optimizer
to estimate the number of rows in range scans. Index dives usually provide
good estimates, but they are pretty expensive. To estimate the number of rows
in equality ranges, statistical data on indexes can be employed instead. This
gives less accurate estimates, but it is cheap. So if the number of equality
dives required by an index scan exceeds the set limit, no dives for equality
ranges are performed by the optimizer for this index.
As the new system variable is introduced in a stable version, its default
value is set to a special value meaning there is no limit on the number
of index dives performed by the optimizer.
The patch partially uses the MySQL code for WL 5957
'Statistics-based Range optimization for many ranges'.
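A minimal usage sketch (assuming the variable can be set at session scope
and that the special "no limit" value is 0, as in MySQL):

  SET SESSION eq_range_index_dive_limit = 0;   -- assumed "no limit" setting
  SET SESSION eq_range_index_dive_limit = 10;  -- use index statistics instead
                                               -- of dives when a scan needs
                                               -- more than 10 equality ranges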
Item_subselect::is_expensive() used to return FALSE (inexpensive) whenever
it saw that one of the SELECTs in the subquery's UNION was degenerate. It
ignored the fact that other parts of the UNION might not be inexpensive,
including the case where other parts of the UNION have no query plan yet.
For a subquery of the form col >= ANY (SELECT 'foo' UNION SELECT 'bar')
this would cause the query to be considered inexpensive when there was
no query plan for the second part of the UNION, which in turn would
cause the SELECT 'foo' to compute and free itself while still inside
JOIN::optimize for that SELECT (see the MDEV comment for the full description).
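For example, a query of that shape (table and column names assumed):

  SELECT * FROM t1 WHERE t1.col >= ANY (SELECT 'foo' UNION SELECT 'bar');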
If we have a 2+ node cluster which is replicating from an async master
and the binlog_format is set to STATEMENT and multi-row inserts are executed
on a table with an auto_increment column such that values are automatically
generated by MySQL, then the server node generates wrong auto_increment
values, which are different from what was generated on the async master.
The causes and fixes:
1. We need to improve processing of changes to the auto-increment values
after changing the cluster size.
2. If wsrep_auto_increment_control is switched on during operation of
the node, then we should immediately update the auto_increment_increment
and auto_increment_offset global variables, without waiting for the next
invocation of the wsrep_view_handler_cb() callback. In the current version
these variables retain their initial values if wsrep_auto_increment_control
is switched on during operation of the node, which leads to inconsistent
results on the different nodes in some scenarios.
3. If wsrep_auto_increment_control is switched off during operation of the
node, then we must restore the original values of the auto_increment_increment
and auto_increment_offset global variables, as set by the user. To make this
possible, we need to add "shadow copies" of these variables (which store
the latest values set by the user).
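An illustrative check of the intended behaviour (the values shown are only
an assumption for a 3-node cluster):

  SET GLOBAL wsrep_auto_increment_control = ON;
  SHOW GLOBAL VARIABLES LIKE 'auto_increment%';
  -- expected: auto_increment_increment = 3 and a node-specific
  -- auto_increment_offset; switching the control OFF should restore
  -- the values the user had set
  SET GLOBAL wsrep_auto_increment_control = OFF;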
truncate incorrect values in convert_period_to_month() so that
PERIOD_DIFF never returns a value outside of the 2^23 range.
And, for safety, increase buffer sizes for int10_to_str
to be sufficiently big for any int10_to_str result.
After the commit b76b69cd5f,
loose index scan for queries with DISTINCT stopped working.
That is why that commit has to be reverted.
Additionally, this patch fixes the problem of MDEV-10880.
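An illustrative query that is expected to use loose index scan again
(table and index are assumed, not taken from the actual test case):

  CREATE TABLE t1 (a INT, b INT, KEY (a, b));
  SELECT DISTINCT a FROM t1;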
with spatial index
The issue: since it is a spatial index, at the time of searching the index
for a key (Rows_log_event::find_row) we used the wrong field image:
Field::itRAW was used while Field::itMBR should have been used.
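An illustrative shape of the affected setup (names assumed; row-based
replication implied):

  CREATE TABLE t1 (g GEOMETRY NOT NULL, SPATIAL INDEX (g));
  -- applying the row event for such an UPDATE on the slave searches
  -- the spatial index in Rows_log_event::find_row
  UPDATE t1 SET g = ST_GeomFromText('POINT(1 1)');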
Adding a FK constraint to an existing table (ALTER TABLE ADD FOREIGN
KEY) causes the applier to fail if there is a concurrent DML statement that
violates the new constraint (i.e. a DELETE or UPDATE of a record in the
parent table).
For example, the following scenario causes a crash in the applier:
1. ALTER successfully adds the FK constraint in node_1
2. On node_2 an UPDATE is in pre_commit() and has certified successfully
3. ALTER is delivered in node_2 and BF aborts the DML
4. Applying the UPDATE event causes an FK violation in node_1
To avoid this situation it is necessary for the UPDATE to fail during
certification. And for the UPDATE to fail certification it is necessary
that ALTER appends certification keys for both the child and the parent
table. Before this patch, ALTER TABLE ADD FK only appended keys for the
child table being ALTERed.
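An illustrative pair of conflicting statements (table and column names
assumed):

  -- node_1:
  ALTER TABLE child ADD FOREIGN KEY (parent_id) REFERENCES parent (id);
  -- node_2, concurrently:
  DELETE FROM parent WHERE id = 1;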
We do not accept:
1. We did not have this problem (fixed earlier and better)
d982e717ab Bug#27510150: MYSQLDUMP FAILS FOR SPECIFIC --WHERE CLAUSES
2. We do not have such options (a DBUG_ASSERT was put in, just in case)
bbc2e37fe4 Bug#27759871: BACKRONYM ISSUE IS STILL IN MYSQL 5.7
3. Serg fixed it in another way in this release:
e48d775c6f Bug#27980823: HEAP OVERFLOW VULNERABILITIES IN MYSQL CLIENT LIBRARY
The problem happened in the derived condition pushdown code:
- When Item_func_regex::build_clone() was called, it created a copy of
the original Item_func_regex, and this copy got registered in free_list.
Class-specific additional dynamic members (such as "re") were shallow-copied
rather than deep-copied into the cloned Item_func_regex.
As a result, Regexp_processor_pcre::m_pcre of the cloned Item_func_regex
and of the original Item_func_regex pointed to the same compiled regular
expression.
- On cleanup_items(), both the original and the cloned copy of Item_func_regex
called re.cleanup(), which called pcre_free(m_pcre). So the same compiled
regular expression was freed twice, which was noticed by ASAN.
The same problem was repeatable for Item_func_regexp_instr.
A similar problem happened for Item_func_sp, for the sp_result_field member.
Both the original and the cloned copy of Item_func_sp pointed to the same
Field instance and both deleted it on cleanup().
A possible solution would be to fix build_clone() to create deep
(instead of shallow) copies for the dynamic members of the affected classes
(Item_func_regex, Item_func_regexp_instr, Item_func_sp).
However, this would be too complex.
As agreed with Galina and Igor, this patch disallows using these
affected classes in derived condition pushdown by overriding get_clone()
to return NULL.
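An illustrative query shape that exercises pushdown of a REGEXP condition
into a derived table (names assumed):

  SELECT * FROM (SELECT a FROM t1) AS dt WHERE dt.a REGEXP 'foo';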
Assertion `used_tables_cache == 0' failed
This bug manifested itself when executing queries
over materialized derived tables/views and with
conjunctive always-true predicates containing
inexpensive single-row subqueries.
This bug disappeared after the patch mdev-15035
had been applied.
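An illustrative query shape for the scenario described above (names
assumed; GROUP BY forces materialization of the derived table):

  SELECT * FROM (SELECT a FROM t1 GROUP BY a) AS dt
  WHERE (SELECT 1) = 1;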
This patch fixes another problem introduced by the patch for mdev-4817.
The latter changed Item_cond::fix_fields() in such a way that it could
call the virtual method is_expensive(). On its first call
the method saves the result in Item::is_expensive_cache, and all subsequent
calls return the result from this cache. So if the item
was once determined to be expensive, the method always returns true.
This is not good for subqueries, because non-optimized subqueries are
always considered expensive.
It means that the cache should be invalidated after the call of
optimize_constant_subqueries().
In this case we are setting the field Item_func_eq::in_equality_no for the semi-join equalities.
This helps us to remove these equalities, as the inner tables are not available during parent select execution
while the outer tables are not available during the materialization phase.
We only have it set for the equalities on the fields involved with the IN subquery
and reset it for the equalities which do not belong to the IN subquery.
For example, in the case of nested IN subqueries:
SELECT t1.a FROM t1 WHERE t1.a IN
(SELECT t2.a FROM t2 WHERE t2.b IN
(SELECT t3.b FROM t3 WHERE t3.c=27))
there are two equalities involving the fields of the IN subquery:
1) t2.b = t3.b : the field Item_func_eq::in_equality_no is set when we merge the grandchild select into the child select
2) t1.a = t2.a : the field Item_func_eq::in_equality_no is set when we merge the child select into the parent select
But when we perform case 2) we should ensure that we reset the equalities in the child's WHERE clause.
with join_cache_level>2
During multiple equality propagation for a query in which we have an IN subquery, the items in the select list of the
subquery may not be part of the multiple equality, because there might be another occurrence of the same field in the
WHERE clause of the subquery.
So keyuse_is_valid_for_access_in_chosen_plan, a function which expects the items in the select list of the subquery to
be the same as the ones in the multiple equality (we create the keyuse array from these multiple equalities), does not
work as intended.
The solution is to expect the same field rather than the same Item, because when we have SEMI-JOIN MATERIALIZATION SCAN,
we use the copy-back technique that copies the materialized table fields back to the original fields of the base tables.
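An illustrative query shape (names assumed): the subquery's select-list
field also occurs in the subquery's WHERE clause:

  SET SESSION join_cache_level = 4;  -- any value > 2
  SELECT * FROM t1
  WHERE t1.a IN (SELECT t2.a FROM t2 WHERE t2.a = t2.b);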
Due to a legacy bug in the code of make_join_statistics(), the detection of
so-called constant tables could miss some of them in rare queries
that used RIGHT JOIN. As a result these queries had execution plans
different from the execution plans of the equivalent queries with
LEFT JOIN.
Besides, starting from 10.2 this could trigger an assertion failure.
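An illustrative query shape (names assumed; t2.pk is a primary key, which
makes t2 a constant table):

  SELECT * FROM t1 RIGHT JOIN t2 ON t1.a = t2.a WHERE t2.pk = 1;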
Restricted output for CREATE USER, GRANT, REVOKE and SET PASSWORD
so that it shows only the above keywords but not the rest of the query, i.e.
not the user or the password.
don't try to set the default time zone in --bootstrap:
this generally cannot be done, as the time zone tables aren't loaded,
and bootstrap scripts don't need it anyway.
Don't install server files if WITHOUT_SERVER is specified.
"Server files" are defined as files going into the MariaDB-Server RPM,
that is, files in the components Server, ManPagesServer, Server_Scripts,
IniFiles, SupportFiles, and Readme.
A MEMORY table could be renamed into a non-existent database.
rename() is documented to return ENOENT when the source file does not
exist OR when the target directory does not exist. A nonexistent source .frm
file is ok (the table can still exist in the engine); a nonexistent target
directory is not.
Make my_rename use ENOTDIR for the latter case. Make RENAME TABLE
issue an appropriate error ("unknown database" instead of "unknown table").
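An illustrative statement that should now report "unknown database"
(names assumed):

  CREATE TABLE t1 (a INT) ENGINE=MEMORY;
  RENAME TABLE t1 TO no_such_db.t1;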
snprintf returns the number of bytes it wrote (or would have written) NOT
counting the \0 terminating character.
The buffer size it accepts as an argument DOES COUNT the \0 character.
Pass the right parameter value.
After the MDEV-13118 fix there's no code in the server that
wants caseup/casedn to change the argument in place for simple
charsets. Let's remove this logic and always return the result in a
new string for all charsets, both simple and complex.
1. Removing the optimization that *some* character sets used in casedn()
and caseup(), which allowed (and required) changing the case in-place,
overwriting the string passed as the "src" argument.
Now all CHARSET_INFOs work in the same way:
none of them changes the source string in-place; all of them now convert
case from the source string to the destination string, leaving
the source string untouched.
2. Adding "const" qualifier to the "char *src" parameter
to caseup() and casedn().
3. Removing duplicate implementations in ctype-mb.c.
Now both caseup() and casedn() implementations for all CJK character sets
use internally the same function my_casefold_mb()
(the former my_casefold_mb_varlen()).
4. Removing the "unused" attribute from parameters of some my_case{up|dn}_xxx()
implementations, as the affected parameters are now *used* in the code.
Previously these parameters were used only in DBUG_ASSERT().
This problem is similar to MDEV-10306.
1. Fixing Item_str_conv::val_str(String *str) to return the result in "str",
and to use tmp_value only as a temporary buffer for args[0]->val_str().
The new code version now guarantees that the result is always returned in
"str". The trick with copy_if_not_alloced() is not used any more.
2. The change #1 revealed the same problem in SUBSTRING_INDEX(),
so some tests with combinations of UPPER()/LOWER() and SUBSTRING_INDEX()
started to fail. Fixing Item_func_substr_index::val_str() the same way,
to return the result in "str" and use tmp_value as a temporary buffer
for args[0]->val_str().
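An illustrative combination of the two functions (the literal is just an
example value):

  SELECT UPPER(SUBSTRING_INDEX('www.mariadb.org', '.', 2));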
as a separate source for data
Actually MDEV-15867 and MDEV-16192 are the same: the slave adds "or replace"
to the create table statement, so "create table t1" becomes "create or
replace" on the slave. So this bug is not specific to replication; we can get
it on a general server if we manually add "or replace" to the create query.
Problem: if we try to create table t1 (with the same name as the temporary
table t1) via
CREATE OR REPLACE TABLE t1 AS SELECT * FROM t1;
then, since this query creates the table from SELECT * FROM t1, we call the
unique_table function to check whether the source and destination tables are
the same.
But there is one issue: unique_table does not take into account that the
source table may be a temporary table, in which case the source and
destination tables are allowed to be the same.
Solution: change find_dup_table to not look for temporary tables when the
CHECK_DUP_SKIP_TEMP_TABLE flag is on.
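An illustrative scenario (a temporary table named t1 is assumed to exist):

  CREATE TEMPORARY TABLE t1 (a INT);
  CREATE OR REPLACE TABLE t1 AS SELECT * FROM t1;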
Problem: CREATE/DROP INDEX on a temporary table was logged into the binlog.
Goal: operations on temporary tables should not be binlogged when the binlog
format is ROW.
Solution:
Add CF_FORCE_ORIGINAL_BINLOG_FORMAT when there is DDL on a temporary
table.
For OPTIMIZE, ANALYZE and REPAIR we won't change anything; they will still
be logged in the binlog. But they also don't throw any error if the operation
fails, so even though the slave won't have the temporary table, these
operations on the temporary table will be processed without breaking
replication.
For RENAME we need different logic; MDEV-16728 will solve it.
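An illustrative sequence (names assumed) where the index DDL should no
longer reach the binlog:

  SET SESSION binlog_format = ROW;
  CREATE TEMPORARY TABLE tmp1 (a INT);
  CREATE INDEX i1 ON tmp1 (a);
  DROP INDEX i1 ON tmp1;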
for the small InnoDB table
This bug was introduced by the patch 6c414fcf89.
That patch did not take into account that some objects of the Field_* types
are created only for TABLE_SHARE, and the field 'table' is set to NULL
for them. In particular, such are the objects created to store statistical
min/max values for columns.
for blob column
ANALYZE TABLE <table> does not collect statistical data on min/max values
for BLOB columns of <table>. However, these values can be added into
mysql.column_stats manually by executing the proper statements.
Unfortunately this led to a memory leak because the memory allocated
for these values was never freed.
This patch provides the server with a function to free the memory allocated
for min/max statistical values of BLOB types.
Temporarily changed the test case until MDEV-16711 is fixed, as without
this fix the test case for MDEV-16757 did not fail only for 10.0.
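An illustrative manual update of min/max values for a BLOB column
(database, table and column names assumed):

  UPDATE mysql.column_stats
     SET min_value = 'aaa', max_value = 'zzz'
   WHERE db_name = 'test' AND table_name = 't1' AND column_name = 'b';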
Backport the fix f214d36512 to 10.0
Author: Sergei Golubchik <serg@mariadb.org>
Date: Tue Apr 17 00:44:34 2018 +0200
ASAN error in is_stat_table()
don't memcmp beyond the first argument's end
Also: use my_strcasecmp(table_alias_charset), like elsewhere, not memcmp
In this issue we are using the derived_with_keys optimization and using these
keys to do a hash join, which is incorrect.
We cannot create keys for derived tables whose keyparts are of BLOB or TEXT
type: TEXT or BLOB columns can only be indexed over a specified (prefix)
length.
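An illustrative query shape (names assumed; t1.txt and t2.txt are TEXT
columns):

  SELECT * FROM t1
  JOIN (SELECT txt FROM t2 GROUP BY txt) AS dt ON dt.txt = t1.txt;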
Problem:
========
Truncate operation holds MDL on the table (t1) and tries to
acquire the InnoDB dict_operation_lock. Purge holds dict_operation_lock
and tries to acquire MDL on the table (t1) to evaluate virtual
column expressions for indexed virtual columns.
This leads to a deadlock between purge and TRUNCATE TABLE (DDL).
Solution:
=========
If purge tries to acquire MDL on the table then it should do the following:
i) Purge should release all innodb latches (including dict_operation_lock)
before acquiring metadata lock on the table.
ii) After acquiring metadata lock on the table, it should check whether the
table was dropped or renamed. If the table is dropped then purge should
ignore the undo log record. If the table is renamed then it should
release the old MDL and acquire MDL on the new name.
iii) Once purge acquires MDL, it should use the SQL table handle for all
the remaining virtual indexes for the purge record.
purge_node_t: Introduce new virtual column information to know whether
the MDL was acquired successfully.
This is joint work with Marko Mäkelä.
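An illustrative schema for the scenario (names assumed): an indexed
virtual column whose value purge has to evaluate, combined with a
concurrent TRUNCATE:

  CREATE TABLE t1 (a INT, b INT AS (a + 1) VIRTUAL, KEY (b)) ENGINE=InnoDB;
  TRUNCATE TABLE t1;  -- concurrently with background purge activity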
When processing a query containing WITH clauses, a call of the function
check_dependencies_in_with_clauses() before opening the tables used in the
query is necessary if the WITH clauses include specifications of recursive
CTEs.
This call was missing if such a query belonged to a stored function.
This caused misbehavior of the server: it could report a fake error,
as in the test case for MDEV-16629, or the executed query could hang,
as in the test cases for MDEV-16661 and MDEV-15151.
"my_snprintf(NULL, 0, ...)" does not follow what snprintf does. Change
the call in wsrep_dump_rbr_buf_with_header to use snprintf (note that
wsrep_dump_rbr_buf was using snprintf all the way).
Fix an off-by-one error in comparison so it actually prints the output
One can create a table with the same name for a `field` and a `table` `check`
constraint.
For example:
`create table t(a int check(a>0), constraint a check(a>10));`
But when inserting new rows, the same error is always raised.
For example, with
```insert into t values (-1);```
and
```insert into t values (10);```
the same error `ER_CONSTRAINT_FAILED` is obtained, and it is not clear which
constraint is violated.
This patch solves this so that if a field constraint is violated, the first
parameter in the error message is `table.field_name`, and if a table
constraint is violated, the first parameter in the error message is
`constraint_name`.
When the definition of the index used for hash join was created
in create_hj_key_for_table(), it could cause a memory overwrite
due to a bug that led to an underestimation of the number of
index components.
MDEV-7257 made the dump thread read from the binlog concurrently with
writers as long as the read bytes are below a water-mark
(MYSQL_BIN_LOG::binlog_end_pos). However, it appeared to be possible for a
dump thread reader to reach for bytes past the water-mark through a
feature of IO_CACHE that fills in the internal buffer; while doing
so it could read what the reader is not supposed to see (the bytes
above MYSQL_BIN_LOG::binlog_end_pos).
The issue is fixed by constraining the IO_CACHE buffer fill to respect
the water-mark.
An added unit test proves that reading from the file is bound to an external
parameter passed to the {IO_CACHE::end_of_file} cache member.
The previous correction of the patch for mdev-16473 did not work
correctly for databases whose names started with '*'.
Added a test case with a database named "*".
For the purpose of reporting an error to the error log, the shutdown thread
was attempting to access current_thd->variables.lc_messages->errmsgs->errmsgs,
whereas current_thd was NULL.
We should log errors according to the global lc_messages setting instead of
the session setting.
Only close stdin if it was open initially. Otherwise we may close a file
descriptor that is reused for a different purpose (specifically, for the
binlog index file in the case of this bug).
MDEV-16512 Server crashes in find_field_in_table_ref on 2nd
execution of SP referring to non-existing field
The problem was in the natural join code: it changed TABLE_LIST and
Item_fields but didn't restore the changed things if something went wrong,
and so was not able to re-execute after a failure.
Some of the problems could have been avoided if we had run
fix_fields before doing the natural join transformations.
Fixed by marking functions complete AFTER they have executed, instead of at
the start.
I also had to change some tests that checked whether Item_fields are usable.
This doesn't fix all known problems, but at least avoids some crashes.
What should be done in the near future is to mark the statement in the SP
as 'not re-executable' and force a reparse of it on the next execution.
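A hypothetical scenario of the reported shape (all names assumed; this is
not the actual test case):

  CREATE TABLE t1 (a INT, b INT);
  CREATE TABLE t2 (a INT, c INT);
  CREATE PROCEDURE p1() SELECT b FROM t1 NATURAL JOIN t2;
  CALL p1();
  ALTER TABLE t1 DROP COLUMN b;
  CALL p1();  -- second execution refers to the now non-existing field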
Reviewer: Sergei Petrunia <psergey@askmonty.org>
This is a typical systemd response where it tries to shut down the
joiner (due to a "timeout") before the joiner manages to complete the SST.
wsrep_sst_wait
wsrep_SE_init_wait
While waiting for the operation to finish, use mysql_cond_timedwait
instead of mysql_cond_wait, and if the operation is not finished,
extend the systemd timeout (if needed).
Before this patch, if no default database was set, the server threw
an error for any table name reference that was not fully qualified by a
database name. In particular, this happened for table names that referenced
CTE tables. This was incorrect.
The error message was thrown at the parser stage, when the names referencing
different tables were not resolved yet.
Now, if no default database is set and a WITH clause is used in the
processed statement, any table reference is just supplied with a dummy
database name "*none*" at the parser stage. Later, after a call
of check_dependencies_in_with_clauses(), when the names for CTE tables
can be resolved, error messages are thrown only for those names that
refer to non-CTE tables. This is done in open_and_process_table().
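An illustrative statement that should now work without a default database
(no USE issued):

  WITH cte AS (SELECT 1 AS a) SELECT a FROM cte;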
The observed and described
execution time difference for a partitioned engine
between master and slave was caused by excessive invocation
of base_engine::rnd_init, which was done also for partitions
not involved in the Rows-event operation.
The bug's slave slowdown therefore scales with the number of partitions.
Fixed by applying an upstream patch.
References:
----------
https://bugs.mysql.com/bug.php?id=73648
Bug#25687813 REPLICATION REGRESSION WITH RBR AND PARTITIONED TABLES