This is a performance bug, affecting in particular the bison generated code
for the parser.
Prior to this fix, the grammar used a long chain of reduces to parse an
expression, like:
bit_expr -> bit_term
bit_term -> bit_factor
bit_factor -> value_expr
value_expr -> term
term -> factor
etc
This chain of reduces cause the internal state automaton in the generated
parser to execute more state transitions and more reduces, so that the
generated MySQLParse() function would spend a lot of time looping to execute
all the grammar reductions.
With this patch, the grammar has been reorganized so that rules are more
"flat", limiting the depth of reduces needed to parse <expr>.
Tests have been written to enforce that relative priorities and properties
of operators have not changed while changing the grammar.
See the bug report for performance data.
Bug#30418 "datadict" tests (all engines) fail: Dependency on the host name
for ordering
Bug#30420 "datadict" tests (all engines) fail: Release build has help tables loaded
Bug#30438 "{memory,myisam,ndb}__datadict" tests fail: Use "InnoDB" without checking
Bug#30440 "datadict" tests (all engines) fail: Character sets depend on configuration
Attention: Only the build team can check if Bug#30440 is really fixed.
Currently the Last_query_cost session status variable shows
only the cost of a single flat subselect. For complex queries
(with subselects or unions etc) Last_query_cost is not valid
as it was showing the cost for the last optimized subselect.
Fixed by reseting to zero Last_query_cost when the complete
cost of the query cannot be determined.
Last_query_cost will be non-zero only for single flat queries.
- Merge sslaccept and sslconnect.
- Atomically "reset" vio to VIO_TYPE_SSL when the SSL connection has
succeeded, this avoids having to revert anything and thus protects
against "close_active_vio" in the middle.
- Add some variance to the testcase
The optimization that uses a unique index to remove GROUP BY, did not
ensure that the index was actually used, thus violating the ORDER BY
that is impled by GROUP BY.
Fixed by replacing GROUP BY with ORDER BY if the GROUP BY clause contains
a unique index. In case GROUP BY ... ORDER BY null is used, GROUP BY is
simply removed.
If, after the tables are locked, one of the conditions to read from a
HANDLER table is not met, the handler code wrongly jumps to a error path
that won't unlock the tables.
The user-visible effect is that after a error in a handler read command,
all subsequent handler operations on the same table will hang.
The fix is simply to correct the code to jump to the (same) error path that
unlocks the tables.
The problem from a user's perspective: user creates table A, and then tries
to CREATE TABLE a SELECT from A - and this causes a deadlock error, a hang,
or fails with a debug assert, but only if the storage engine is InnoDB.
The origin of the problem: InnoDB uses case-insensitive collation
(system_charset_info) when looking up the internal table share, thus returning
the same share for 'a' and 'A'.
Cause of the user-visible behavior: since the same share is returned to SQL
locking subsystem, it assumes that the same table is first locked (within the
same session) for WRITE, and then for READ, and returns a deadlock error.
However, the code is wrong in not properly cleaning up upon an error, leaving
external locks in place, which leads to assertion failures and hangs.
Fix that has been implemented: the SQL layer should properly propagate the
deadlock error, cleaning up and freeing all resources.
Further work towards a more complete solution: InnoDB should not use case
insensitive collation for table share hash if table names on disk honor the case.
This is a performance bug, related to the parsing or 'OR' and 'AND' boolean
expressions.
Let N be the number of expressions involved in a OR (respectively AND).
When N=1
For example, "select 1" involve only 1 term: there is no OR operator.
In 4.0 and 4.1, parsing expressions not involving OR had no overhead.
In 5.0, parsing adds some overhead, with Select->expr_list.
With this patch, the overhead introduced in 5.0 has been removed,
so that performances for N=1 should be identical to the 4.0 performances,
which are optimal (there is no code executed at all)
The overhead in 5.0 was in fact affecting significantly some operations.
For example, loading 1 Million rows into a table with INSERTs,
for a table that has 100 columns, leads to parsing 100 Millions of
expressions, which means that the overhead related to Select->expr_list
is executed 100 Million times ...
Considering that N=1 is by far the most probable expression,
this case should be optimal.
When N=2
For example, "select a OR b" involves 2 terms in the OR operator.
In 4.0 and 4.1, parsing expressions involving 2 terms created 1 Item_cond_or
node, which is the expected result.
In 5.0, parsing these expression also produced 1 node, but with some extra
overhead related to Select->expr_list : creating 1 list in Select->expr_list
and another in Item_cond::list is inefficient.
With this patch, the overhead introduced in 5.0 has been removed
so that performances for N=2 should be identical to the 4.0 performances.
Note that the memory allocation uses the new (thd->mem_root) syntax
directly.
The cost of "is_cond_or" is estimated to be neglectable: the real problem
of the performance degradation comes from unneeded memory allocations.
When N>=3
For example, "select a OR b OR c ...", which involves 3 or more terms.
In 4.0 and 4.1, the parser had no significant cost overhead, but produced
an Item tree which is difficult to evaluate / optimize during runtime.
In 5.0, the parser produces a better Item tree, using the Item_cond
constructor that accepts a list of children directly, but at an extra cost
related to Select->expr_list.
With this patch, the code is implemented to take the best of the two
implementations:
- there is no overhead with Select->expr_list
- the Item tree generated is optimized and flattened.
This is achieved by adding children nodes into the Item tree directly,
with Item_cond::add(), which avoids the need for temporary lists and memory
allocation
Note that this patch also provide an extra optimization, that the previous
code in 5.0 did not provide: expressions are flattened in the Item tree,
based on what the expression already parsed is, and not based on the order
in which rules are reduced.
For example : "(a OR b) OR c", "a OR (b OR c)" would both be represented
with 2 Item_cond_or nodes before this patch, and with 1 node only with this
patch. The logic used is based on the mathematical properties of the OR
operator (it's associative), and produces a simpler tree.
Binlogging of the statement with a side effect like a modified non-trans table did not happen.
The artifact involved all binloggable dml queries.
Fixed with changing the binlogging conditions all over the code to exploit thd->transaction.stmt.modified_non_trans_table
introduced by the patch for bug@27417.
Multi-delete case has own specific addressed by another bug@29136. Multi-update case has been addressed by bug#27716 and
patch and will need merging.
Although the query cache doesn't support retrieval of statements containing
column level access control, it was still possible to cache such statements
thus wasting memory.
This patch extends the access control check on the target tables to avoid
caching a statement with column level restrictions.
Views are excepted and can be cached but only retrieved by super user account.
Although the query cache doesn't support retrieval of statements containing
column level access control, it was still possible to cache such statements
thus wasting memory.
This patch extends the access control check on the target tables to avoid
caching a statement with column level restrictions.
HEAP tables can't index BIT fields. Due to this when grouping by such fields is
needed they are converted to a fields of the LONG type when temporary table
is being created. But a side effect of this is that a wrong type of BIT
fields is returned to a client.
Now the JOIN::prepare and the create_distinct_group functions are create
additional hidden copy of BIT fields to preserve original fields untouched.
New hidden fields are used for grouping instead.
The bug caused memory corruption for some queries with top OR level
in the WHERE condition if they contained equality predicates and
other sargable predicates in disjunctive parts of the condition.
The corruption happened because the upper bound of the memory
allocated for KEY_FIELD and SARGABLE_PARAM internal structures
containing info about potential lookup keys was calculated incorrectly
in some cases. In particular it was calculated incorrectly when the
WHERE condition was an OR formula with disjuncts being AND formulas
including equalities and other sargable predicates.
mysql_ha_open calls mysql_ha_close on the error path (unsupported) to close the (opened) table before inserting it into the tables hash list handler_tables_hash) but mysql_ha_close only closes tables which are on the hash list, causing the table to be left open and locked.
This change moves the table close logic into a separate function that is always called on the error path of mysql_ha_open or on a normal handler close (mysql_ha_close).
ORDER BY is used
The range analysis module did not correctly signal to the
handler that a range represents a ref (EQ_RANGE flag). This causes
non-range queries like
SELECT ... FROM ... WHERE keypart_1=const, ..., keypart_n=const
ORDER BY ... FOR UPDATE
to wait for a lock unneccesarily if another running transaction uses
SELECT ... FOR UPDATE on the same table.
Fixed by setting EQ_RANGE for all range accesses that represent
an equality predicate.
ChangeSet@1.2575, 2007-08-07 19:16:06+02:00, msvensson@pilot.(none) +2 -0
Bug#26793 mysqld crashes when doing specific query on information_schema
- Drop the newly created user user1@localhost
- Cleanup testcase
under terms of bug#28875 for better performance.
The change appeared to require more changes in item_cmpfunc.cc,
which is dangerous in 5.0.
Conversion between a latin1 column and an ascii string constant
stopped to work.
Two character mappings were way off (backtick and tilde were "E"
and "Y"!), and three others were slightly rotated. The first
would cause collisions, and the latter was probably benign.
Now, assign the character mappings exactly to their normal values.
MySQL replicates the time zone only when operations that involve
it are performed. This is controlled by a flag. But this flag
is set only on successful operation.
The flag must be set also when there is an error that involves
a timezone (so the master would replicate the error to the slaves).
Fixed by moving the setting of the flag before the operation
(so it apples to errors as well).
This bug manifested itself for queries with grouping by columns of
the BIT type. It led to wrong comparisons of bit-field values and
wrong result sets.
Bit-field values never cannot be compared as binary values. Yet
the class Field_bit had an implementation of the cmp method that
compared bit-fields values as binary values.
Also the get_image and set_image methods of the base class Field
cannot be used for objects of the Field_bit class.
Now these methods are declared as virtual and specific implementations
of the methods are provided for the class Field_bit.
A test case was waiting for a fixed number of seconds for a specific
state of the slave IO thread to take place.
Fixed by waiting in a loop for that specific thread state instead
(or timeout).
- Move the code to generate test report to the test tool(in this
case mysqltest) where the best control of what failed is
- Simplify the code in mysql-test-run.pl
- mysqltest will now find what diff to use in a best effort attempt
using "diff -u", "diff -c" and finally dumping the two files verbatim
in the report.