on table creates
The problem was in incompatible syntax for key definition in CREATE
TABLE.
5.0 supports only the following syntax for key definition (see "CREATE
TABLE syntax" in the manual):
{INDEX|KEY} [index_name] [index_type] (index_col_name,...)
While 5.1 parser supports the above syntax, the "preferred" syntax was
changed to:
{INDEX|KEY} [index_name] (index_col_name,...) [index_type]
The above syntax is used in 5.1 for the SHOW CREATE TABLE output, which
led to dumps generated by 5.1 being incompatible with 5.0.
Fixed by changing the parser in 5.0 to support both 5.0 and 5.1 syntax
for key definition.
In a union without braces, the order by at the end is applied to the
overall union. It therefore should not interfere with the individual
select parts of the union.
Fixed by changing our parser rules appropriately.
subselects into account
It is forbidden to use the SELECT INTO construction inside UNION statements
unless on the last SELECT of the union. The parser records whether it
has seen INTO or not when parsing a UNION statement. But if the INTO was
legally used in an outer query, an error is thrown if UNION is seen in a
subquery. Fixed in 5.0 by remembering the nesting level of INTO tokens and
mitigate the error unless it collides with the UNION.
Anti-patch. This patch undoes the previously pushed patch. It is
null-merged in versions 5.1 and above since there the original
patch is still desired.
crashes MySQL 5.122
There was a difference in how UNIONs are handled
on top level and when in sub-query.
Because the rules for sub-queries were syntactically
allowing cases that are not currently supported by
the server we had crashes (this bug) or wrong results
(bug 32051).
Fixed by making the syntax rules for UNIONs match the
ones at top level.
These rules however do not support nesting UNIONs, e.g.
(SELECT a FROM t1 UNION ALL SELECT b FROM t2)
UNION
(SELECT c FROM t3 UNION ALL SELECT d FROM t4)
Supports for statements with nested UNIONs will be
added in a future version.
Problem: we have CHECK TABLE options allowed (by accident?) for
ANALYZE/OPTIMIZE TABLE.
Fix: disable them.
Note: it might require additional fixes in 5.1/6.0
When the server was out of memory it crashed because of invalid memory access.
This patch adds detection for failed memory allocations and make the server
output a proper error message.
all space column names.
The parser has been modified to check VIEW column names
with the check_column_name function and to report an error
on empty and all space column names (same as for TABLE
column names).
The root cause of the issue was that the CREATE FUNCTION grammar,
for User Defined Functions, was using the sp_name rule.
The sp_name rule is intended for fully qualified stored procedure names,
like either ident.ident, or just ident but with a default database
implicitly selected.
A UDF does not have a fully qualified name, only a name (ident), and should
not use the sp_name grammar fragment during parsing.
The fix is to re-organize the CREATE FUNCTION grammar, to better separate:
- creating UDF (no definer, can have AGGREGATE, simple ident)
- creating Stored Functions (definer, no AGGREGATE, fully qualified name)
With the test case provided, another issue was exposed which is also fixed:
the DROP FUNCTION statement was using sp_name and also failing when no database
is implicitly selected, when droping UDF functions.
The fix is also to change the grammar so that DROP FUNCTION works with
both the ident.ident syntax (to drop a stored function), or just the ident
syntax (to drop either a UDF or a Stored Function, in the current database)
Bug#30982 CHAR(..USING..) can return a not-well-formed string
Bug#30986 Character set introducer followed by a HEX string can return bad result
check_well_formed_result moved to Item from Item_str_func
fixed Item_func_char::val_str for proper ucs symbols converting
added check for well formed strings for correct conversion of constants with underscore
charset
Fixed the usage of spatial data (and Point in specific) with
non-spatial indexes.
Several problems :
- The length of the Point class was not updated to include the
spatial reference system identifier. Fixed by increasing with 4
bytes.
- The storage length of the spatial columns was not accounting for
the length that is prepended to it. Fixed by treating the
spatial data columns as blobs (and thus increasing the storage
length)
- When creating the key image for comparison in index read wrong
key image was created (the one needed for and r-tree search,
not the one for b-tree/other search). Fixed by treating the
spatial data columns as blobs (and creating the correct kind of
image based on the index type).
Bug#29816 Syntactically wrong query fails with misleading error message
The core problem is that an SQL-invoked function name can be a <schema
qualified routine name> that contains no <schema name>, but the mysql
parser insists that all stored procedures (function, procedures and
triggers) must have a <schema name>, which is not true for functions.
This problem is especially visible when trying to create a function
or when a query contains a syntax error after a function call (in the
same query), both will fail with a "No database selected" message if
the session is not attached to a particular schema, but the first
one should succeed and the second fail with a "syntax error" message.
Part of the fix is to revamp the sp name handling so that a schema
name may be omitted for functions -- this means that the internal
function name representation may not have a dot, which represents
that the function doesn't have a schema name. The other part is
to place schema checks after the type (function, trigger or procedure)
of the routine is known.
DELETE FROM ... USING ... statements with the following type of
ambiguous aliasing gave unexpected results:
DELETE FROM t1 AS alias USING t1, t2 AS alias WHERE t1.a = alias.a;
This query would leave table t1 intact but delete rows from t2.
Fixed by changing DELETE FROM ... USING syntax so that only alias
references (as opposed to alias declarations) may be used in FROM.
This is a performance bug, affecting in particular the bison generated code
for the parser.
Prior to this fix, the grammar used a long chain of reduces to parse an
expression, like:
bit_expr -> bit_term
bit_term -> bit_factor
bit_factor -> value_expr
value_expr -> term
term -> factor
etc
This chain of reduces cause the internal state automaton in the generated
parser to execute more state transitions and more reduces, so that the
generated MySQLParse() function would spend a lot of time looping to execute
all the grammar reductions.
With this patch, the grammar has been reorganized so that rules are more
"flat", limiting the depth of reduces needed to parse <expr>.
Tests have been written to enforce that relative priorities and properties
of operators have not changed while changing the grammar.
See the bug report for performance data.
This is a performance bug, related to the parsing or 'OR' and 'AND' boolean
expressions.
Let N be the number of expressions involved in a OR (respectively AND).
When N=1
For example, "select 1" involve only 1 term: there is no OR operator.
In 4.0 and 4.1, parsing expressions not involving OR had no overhead.
In 5.0, parsing adds some overhead, with Select->expr_list.
With this patch, the overhead introduced in 5.0 has been removed,
so that performances for N=1 should be identical to the 4.0 performances,
which are optimal (there is no code executed at all)
The overhead in 5.0 was in fact affecting significantly some operations.
For example, loading 1 Million rows into a table with INSERTs,
for a table that has 100 columns, leads to parsing 100 Millions of
expressions, which means that the overhead related to Select->expr_list
is executed 100 Million times ...
Considering that N=1 is by far the most probable expression,
this case should be optimal.
When N=2
For example, "select a OR b" involves 2 terms in the OR operator.
In 4.0 and 4.1, parsing expressions involving 2 terms created 1 Item_cond_or
node, which is the expected result.
In 5.0, parsing these expression also produced 1 node, but with some extra
overhead related to Select->expr_list : creating 1 list in Select->expr_list
and another in Item_cond::list is inefficient.
With this patch, the overhead introduced in 5.0 has been removed
so that performances for N=2 should be identical to the 4.0 performances.
Note that the memory allocation uses the new (thd->mem_root) syntax
directly.
The cost of "is_cond_or" is estimated to be neglectable: the real problem
of the performance degradation comes from unneeded memory allocations.
When N>=3
For example, "select a OR b OR c ...", which involves 3 or more terms.
In 4.0 and 4.1, the parser had no significant cost overhead, but produced
an Item tree which is difficult to evaluate / optimize during runtime.
In 5.0, the parser produces a better Item tree, using the Item_cond
constructor that accepts a list of children directly, but at an extra cost
related to Select->expr_list.
With this patch, the code is implemented to take the best of the two
implementations:
- there is no overhead with Select->expr_list
- the Item tree generated is optimized and flattened.
This is achieved by adding children nodes into the Item tree directly,
with Item_cond::add(), which avoids the need for temporary lists and memory
allocation
Note that this patch also provide an extra optimization, that the previous
code in 5.0 did not provide: expressions are flattened in the Item tree,
based on what the expression already parsed is, and not based on the order
in which rules are reduced.
For example : "(a OR b) OR c", "a OR (b OR c)" would both be represented
with 2 Item_cond_or nodes before this patch, and with 1 node only with this
patch. The logic used is based on the mathematical properties of the OR
operator (it's associative), and produces a simpler tree.
(Regression, caused by a patch for the bug 22646).
Problem: when result type of date_format() was changed from
binary string to character string, mixing date_format()
with a ascii column in CONCAT() stopped to work.
Fix:
- adding "repertoire" flag into DTCollation class,
to mark items which can return only pure ASCII strings.
- allow character set conversion from pure ASCII to other character sets.
SP with local variables with non-ASCII names crashed the server.
The server replaces SP local variable names with NAME_CONST calls
when putting statements into the binary log. It used UTF8-encoded
item names as variable names for the replacement inside NAME_CONST
calls. However, statement string may be encoded by any
known character set by the SET NAMES statement.
The server used byte length of UTF8-encoded names to increment
the position in the query string that led to array index overrun.
causes full table lock on innodb table.
Also fixes Bug#28502 Triggers that update another innodb table
will block on X lock unnecessarily (duplciate).
Code review fixes.
Both bugs' synopses are misleading: InnoDB table is
not X locked. The statements, however, cannot proceed concurrently,
but this happens due to lock conflicts for tables used in triggers,
not for the InnoDB table.
If a user had an InnoDB table, and two triggers, AFTER UPDATE and
AFTER INSERT, competing for different resources (e.g. two distinct
MyISAM tables), then these two triggers would not be able to execute
concurrently. Moreover, INSERTS/UPDATES of the InnoDB table would
not be able to run concurrently.
The problem had other side-effects (see respective bug reports).
This behavior was a consequence of a shortcoming of the pre-locking
algorithm, which would not distinguish between different DML operations
(e.g. INSERT and DELETE) and pre-lock all the tables
that are used by any trigger defined on the subject table.
The idea of the fix is to extend the pre-locking algorithm to keep track,
for each table, what DML operation it is used for and not
load triggers that are known to never be fired.
fails if a database is not selected prior.
The problem manifested itself when a user tried to
create a routine that had non-fully-qualified identifiers in its bodies
and there was no current database selected.
This is a regression introduced by the fix for Bug 19022:
The patch for Bug 19022 changes the code to always produce a warning
if we can't resolve the current database in the parser.
In this case this was not necessary, since even though the produced
parsed tree was incorrect, we never re-use sphead
that was obtained at first parsing of CREATE PROCEDURE.
The sphead that is anyhow used is always obtained through db_load_routine,
and there we change the current database to sphead->m_db before
calling yyparse.
The idea of the fix is to resolve the current database directly using
lex->sphead->m_db member when parsing a stored routine body, when
such is present.
This patch removes the need to reset the current database
when loading a trigger or routine definition into SP cache.
The redundant code will be removed in 5.1.
Problem: separator was not converted to the result character set,
so the result was a mixture of two different character sets,
which was especially bad for UCS2.
Fix: convert separator to the result character set.
ALTER VIEW is currently not supported as a prepared statement
and should be disabled as such as they otherwise could cause server crashes.
ALTER VIEW is currently not supported when called from stored
procedures or functions for related reasons and should also be disabled.
This patch disables these DDL statements and adjusts the appropriate test
cases accordingly.
Additional tests has been added to reflect on the fact that we do support
CREATE/ALTER/DROP TABLE for Prepared Statements (PS), Stored Procedures (SP)
and PS within SP.
Changed code to enforce that SQL_CACHE only in the first SELECT is used to turn on caching(as documented), but any SQL_NO_CACHE will turn off caching (not documented, but a useful behaviour, especially for machine generated queries). Added test cases to explicitly test the documented caching behaviour and test cases for the reported bug.