causes full table lock on innodb table.
Also fixes Bug#28502 Triggers that update another innodb table
will block on X lock unnecessarily (duplicate).
Code review fixes.
Both bugs' synopses are misleading: InnoDB table is
not X locked. The statements, however, cannot proceed concurrently,
but this happens due to lock conflicts for tables used in triggers,
not for the InnoDB table.
If a user had an InnoDB table, and two triggers, AFTER UPDATE and
AFTER INSERT, competing for different resources (e.g. two distinct
MyISAM tables), then these two triggers would not be able to execute
concurrently. Moreover, INSERTS/UPDATES of the InnoDB table would
not be able to run concurrently.
The problem had other side-effects (see respective bug reports).
This behavior was a consequence of a shortcoming of the pre-locking
algorithm, which would not distinguish between different DML operations
(e.g. INSERT and DELETE) and pre-lock all the tables
that are used by any trigger defined on the subject table.
The idea of the fix is to extend the pre-locking algorithm to keep track,
for each table, of which DML operation it is used for, and not to
load triggers that are known never to be fired.
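A minimal C++ sketch of the idea, not the server code. The Trigger struct,
the event bits and both function names are hypothetical and exist only for
illustration. It contrasts pre-locking every table used by any trigger with
pre-locking only the tables of triggers that the statement can fire.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Hypothetical event bits; the names are illustrative, not the server's.
enum Trg_event : std::uint8_t {
  EVENT_INSERT = 1 << 0,
  EVENT_UPDATE = 1 << 1,
  EVENT_DELETE = 1 << 2
};

struct Trigger {
  std::uint8_t event;               // the DML event that fires the trigger
  std::vector<std::string> tables;  // tables used by the trigger body
};

// Old behavior: every table used by any trigger is pre-locked.
std::set<std::string> prelock_all(const std::vector<Trigger> &triggers) {
  std::set<std::string> locked;
  for (const Trigger &t : triggers)
    locked.insert(t.tables.begin(), t.tables.end());
  return locked;
}

// Sketch of the fixed behavior: only triggers whose event is present in
// the statement's event map contribute tables to the pre-lock set.
std::set<std::string> prelock_for(const std::vector<Trigger> &triggers,
                                  std::uint8_t stmt_event_map) {
  std::set<std::string> locked;
  for (const Trigger &t : triggers)
    if (t.event & stmt_event_map)
      locked.insert(t.tables.begin(), t.tables.end());
  return locked;
}

// Usage: an INSERT no longer pre-locks the table of the UPDATE trigger.
void prelock_example() {
  std::vector<Trigger> trgs;
  trgs.push_back(Trigger{EVENT_INSERT, {"log_ins"}});
  trgs.push_back(Trigger{EVENT_UPDATE, {"log_upd"}});
  assert(prelock_all(trgs).size() == 2);
  assert(prelock_for(trgs, EVENT_INSERT).size() == 1);
}
```

With this shape, an INSERT and an UPDATE on the subject table no longer
compete for each other's trigger tables.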
The need arose when working on Bug 26141, where it became
necessary to replace TABLE_LIST with its forward declaration in a few
headers, and this involved a lot of s/TABLE_LIST/st_table_list/.
Although other workarounds exist, this patch is in line
with our general strategy of moving away from typedef-ed names.
Sometime in the future we might also rename TABLE_LIST to follow the
coding style, but this is a huge change.
fails if a database is not selected prior.
The problem manifested itself when a user tried to
create a routine that had non-fully-qualified identifiers in its body
and there was no current database selected.
This is a regression introduced by the fix for Bug 19022:
The patch for Bug 19022 changes the code to always produce a warning
if we can't resolve the current database in the parser.
In this case this was not necessary, since even though the produced
parsed tree was incorrect, we never re-use sphead
that was obtained at first parsing of CREATE PROCEDURE.
The sphead that is anyhow used is always obtained through db_load_routine,
and there we change the current database to sphead->m_db before
calling yyparse.
The idea of the fix is to resolve the current database directly using
lex->sphead->m_db member when parsing a stored routine body, when
such is present.
This patch removes the need to reset the current database
when loading a trigger or routine definition into SP cache.
The redundant code will be removed in 5.1.
1. Fix ddl_i18n_koi8r, ddl_i18n_utf8: explicitly specify character-sets
directory for mysqldump;
2. Fix crash in mysqldump if collation is not found;
3. Use proper way to compare character set names.
- BUG#11986: Stored routines and triggers can fail if the code
has a non-ascii symbol
- BUG#16291: mysqldump corrupts string-constants with non-ascii-chars
- BUG#19443: INFORMATION_SCHEMA does not support charsets properly
- BUG#21249: Character set of SP-var can be ignored
- BUG#25212: Character set of string constant is ignored (stored routines)
- BUG#25221: Character set of string constant is ignored (triggers)
There were a few general problems that caused these bugs:
1. Character set information of the original (definition) query for views,
triggers, stored routines and events was lost.
2. mysqldump output queries in the client character set, which can be
inappropriate for encoding the definition query.
3. INFORMATION_SCHEMA used strings with mixed encodings to display object
definition;
1. No query-definition-character set.
In order to compile query into execution code, some extra data (such as
environment variables or the database character set) is used. The problem
here was that this context was not preserved. So, on the next load it can
differ from the original one, thus the result will be different.
The context contains the following data:
- client character set;
- connection collation (character set and collation);
- collation of the owner database;
The fix is to store this context and use it each time we parse (compile)
and execute the object (stored routine, trigger, ...).
2. Wrong mysqldump-output.
The original query can contain several encodings (by means of character set
introducers). The problem here was that we tried to convert original query
to the mysqldump-client character set.
Moreover, we stored queries in different character sets for different
objects (views, for one, used UTF8, triggers used original character set).
The solution is
- to store definition queries in the original character set;
- to change SHOW CREATE statement to output definition query in the
binary character set (i.e. without any conversion);
- introduce SHOW CREATE TRIGGER statement;
- to dump special statements to switch the context to the original one
before dumping and restore it afterwards.
Note, in order to preserve the database collation at the creation time,
additional ALTER DATABASE might be used (to temporarily switch the database
collation back to the original value). In this case, ALTER DATABASE
privilege will be required. This is a backward-incompatible change.
3. INFORMATION_SCHEMA showed non-UTF8 strings
The fix is to generate UTF8-query during the parsing, store it in the object
and show it in the INFORMATION_SCHEMA.
Basically, the idea is to create a copy of the original query and convert it to
UTF8. Character set introducers are removed and all text literals are
converted to UTF8.
This UTF8 query is intended to provide user-readable output. It must not be
used to recreate the object. Specialized SHOW CREATE statements should be
used for this.
The reason for this limitation is the following: the original query can
contain symbols from several character sets (by means of character set
introducers).
Example:
- original query:
CREATE VIEW v1 AS SELECT _cp1251 'Hello' AS c1;
- UTF8 query (for INFORMATION_SCHEMA):
CREATE VIEW v1 AS SELECT 'Hello' AS c1;
Bug 28127 (Some valid identifiers names are not parsed correctly)
Bug 26302 (MySQL server cuts off trailing "*/" from comments in SP/func)
This patch is the second part of a major cleanup, required to fix
Bug 25411 (trigger code truncated).
The root cause of the issue stems from the function skip_rear_comments,
which was a work around to remove "extra" "*/" characters from the query
text, when parsing a query and reusing the text fragments to represent a
view, trigger, function or stored procedure.
The reason for this work around is that "special comments",
like /*!50002 XXX */, were not parsed properly, so that a query like:
AAA /*!50002 BBB */ CCC
would be seen by the parser as "AAA BBB */ CCC" when the current version
is greater than or equal to 5.0.2.
The root cause of this stems from how special comments are parsed.
Special comments are really out-of-band text that appears inside a query
and affects how the parser behaves.
In nature, /*!50002 XXX */ in MySQL is similar to the C concept
of preprocessing :
#if VERSION >= 50002
XXX
#endif
Depending on the current VERSION of the server, either the special comment
should be expanded or it should be ignored, but in all cases the "text" of
the query should be re-written to strip the "/*!50002" and "*/" markers,
which do not belong to the SQL language itself.
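The rewriting described above can be sketched in a few lines of C++. This is
an illustration only and not the server's lexer: the function name
expand_special_comments is hypothetical, and the sketch deliberately ignores
string literals and ordinary /* ... */ comments. A special comment whose
version is not newer than the server version is expanded, any other special
comment is dropped, and in both cases the markers disappear from the text.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Sketch only: handles "/*!NNNNN ... */" and nothing else.
std::string expand_special_comments(const std::string &query,
                                    unsigned long server_version) {
  std::string out;
  std::size_t pos = 0;
  while (pos < query.size()) {
    std::size_t open = query.find("/*!", pos);
    if (open == std::string::npos) {
      out.append(query, pos, std::string::npos);  // no more markers
      break;
    }
    out.append(query, pos, open - pos);           // text before the marker
    std::size_t body = open + 3;
    unsigned long version = 0;                    // no digits: always expand
    int digits = 0;
    while (digits < 5 && body < query.size() &&
           std::isdigit(static_cast<unsigned char>(query[body]))) {
      version = version * 10 + static_cast<unsigned long>(query[body] - '0');
      ++body;
      ++digits;
    }
    std::size_t close = query.find("*/", body);
    if (close == std::string::npos)
      close = query.size();                       // unterminated: take the rest
    if (version <= server_version)
      out.append(query, body, close - body);      // expand, markers dropped
    pos = (close + 2 <= query.size()) ? close + 2 : query.size();
  }
  return out;
}

// Usage: the "*/" marker never survives, whatever the server version.
void special_comment_example() {
  const std::string q = "AAA /*!50002 BBB */ CCC";
  assert(expand_special_comments(q, 50100).find("*/") == std::string::npos);
  assert(expand_special_comments(q, 40100).find("BBB") == std::string::npos);
}
```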
Prior to this fix, these markers would leak into :
- the storage format for VIEW,
- the storage format for FUNCTION,
- the storage format for FUNCTION parameters, in mysql.proc (param_list),
- the storage format for PROCEDURE,
- the storage format for PROCEDURE parameters, in mysql.proc (param_list),
- the storage format for TRIGGER,
- the binary log used for replication.
In all cases, this not only causes format corruption but also provides a vector
for dormant security issues, by allowing code to be tunneled in that will be
activated after an upgrade.
The proper solution is to deal with special comments strictly during parsing,
when accepting a query from the outside world.
Once a query is parsed and an object is created with a persistent
representation, this object should not arbitrarily mutate after an upgrade.
In short, special comments are a useful but limited feature for mysqldump,
when used at an *interface* level to facilitate import/export,
but bloating the server *internal* storage format is *not* the proper way
to deal with configuration management of the user logic.
With this fix:
- the Lex_input_stream class now acts as a comment pre-processor,
and either expands or ignores special comments on the fly.
- MYSQLlex and sql_yacc.yy have been cleaned up to strictly use the
public interface of Lex_input_stream. In particular, how the input stream
accepts or rejects a character is private to Lex_input_stream, and the
internal buffer pointers of that class are strictly private, and should not
be tampered with during parsing.
This caused many changes mostly in sql_lex.cc.
During the code cleanup in case MY_LEX_NUMBER_IDENT,
Bug 28127 (Some valid identifiers names are not parsed correctly)
was found and fixed.
By parsing special comments properly, and removing the function
'skip_rear_comments' [sic],
Bug 26302 (MySQL server cuts off trailing "*/" from comments in SP/func)
has been fixed as well.
Coding style: classes start with a capital letter.
Rename some classes related to parsing:
create_field -> Create_field
foreign_key -> Foreign_key
key_part_spec -> Key_part_spec
Bug#4968 "Stored procedure crash if cursor opened on altered table"
Bug#6895 "Prepared Statements: ALTER TABLE DROP COLUMN does nothing"
Bug#19182 "CREATE TABLE bar (m INT) SELECT n FROM foo; doesn't work from
stored procedure."
Bug#19733 "Repeated alter, or repeated create/drop, fails"
Bug#22060 "ALTER TABLE x AUTO_INCREMENT=y in SP crashes server"
Bug#24879 "Prepared Statements: CREATE TABLE (UTF8 KEY) produces a
growing key length" (this bug is not fixed in 5.0)
Re-execution of CREATE DATABASE, CREATE TABLE and ALTER TABLE
statements in stored routines or as prepared statements caused
incorrect results (and crashes in versions prior to 5.0.25).
In 5.1 the problem occurred only for CREATE DATABASE, CREATE TABLE
SELECT and CREATE TABLE with INDEX/DATA DIRECTORY options.
The problem of bugs 4968, 19733, 19182 and 6895 was that functions
mysql_prepare_table, mysql_create_table and mysql_alter_table are not
re-execution friendly: during their operation they modify contents
of LEX (members create_info, alter_info, key_list, create_list),
thus making the LEX unusable for the next execution.
In particular, these functions removed processed columns and keys from
create_list, key_list and drop_list. Search the code in sql_table.cc
for drop_it.remove() and similar patterns to find evidence.
The fix is to supply to these functions a usable copy of each of the
above structures at every re-execution of an SQL statement.
To simplify memory management, LEX::key_list and LEX::create_list
were added to LEX::alter_info, a fresh copy of which is created for
every execution.
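The following C++ sketch shows why a fresh copy per execution helps. It is
not the server code: Alter_info_sketch, process_destructively and execute are
hypothetical names. The consumer empties the lists as it works, so running it
twice on the same object does nothing the second time, while running it on a
copy leaves the parsed description intact.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>

// Hypothetical stand-in for the parsed CREATE/ALTER description.
struct Alter_info_sketch {
  std::list<std::string> create_list;  // columns to create
  std::list<std::string> drop_list;    // columns to drop
};

// Destructive consumer: like the functions described above, it removes
// entries from the lists while processing them.
std::size_t process_destructively(Alter_info_sketch &info) {
  std::size_t handled = 0;
  while (!info.create_list.empty()) {
    info.create_list.pop_front();
    ++handled;
  }
  while (!info.drop_list.empty()) {
    info.drop_list.pop_front();
    ++handled;
  }
  return handled;
}

// Re-execution friendly wrapper: the parsed description stays untouched
// and every execution works on its own copy.
std::size_t execute(const Alter_info_sketch &parsed) {
  Alter_info_sketch copy(parsed);
  return process_destructively(copy);
}

// Usage: two executions see the same input.
void reexecution_example() {
  Alter_info_sketch parsed;
  parsed.create_list.push_back("a");
  parsed.drop_list.push_back("b");
  assert(execute(parsed) == 2);
  assert(execute(parsed) == 2);
}
```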
The problem of crashing bug 22060 stemmed from the fact that the above
mentioned functions were not only modifying HA_CREATE_INFO structure
in LEX, but also were changing it to point to areas in volatile memory
of the execution memory root.
The patch solves this problem by creating and using an on-stack
copy of HA_CREATE_INFO in mysql_execute_command.
Additionally, this patch splits the part of mysql_alter_table
that analyzes and rewrites information from the parser into
a separate function - mysql_prepare_alter_table, in analogy with
mysql_prepare_table, which is renamed to mysql_prepare_create_table.
The root cause of this bug is related to the function skip_rear_comments,
in sql_lex.cc.
Recent code changes in skip_rear_comments changed the prototype from
"const uchar*" to "const char*", which had an unforeseen impact on this test:
(endp[-1] < ' ')
With unsigned characters, this code filters bytes of value [0x00 - 0x20]
With *signed* characters, this also filters bytes of value [0x80 - 0xFF].
This caused the regression reported: Cyrillic characters in the
parameter name were treated as whitespace and truncated.
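The effect of the prototype change can be reproduced with a small C++
sketch. The two function names are hypothetical. The comparison is the one
quoted above. Plain char has implementation-defined signedness, so the sketch
uses signed char and unsigned char explicitly to show both behaviors.

```cpp
#include <cassert>

// The quoted test with the byte seen as unsigned: bytes 0x80..0xFF stay
// positive and do not match.
bool filtered_unsigned(unsigned char byte) { return byte < ' '; }

// The same test with the byte seen as signed: 0x80..0xFF become negative
// and therefore also compare below ' '.
bool filtered_signed(signed char byte) { return byte < ' '; }

// Usage: 0xD0 is a lead byte of many Cyrillic letters in UTF-8.
void signedness_example() {
  const unsigned char lead = 0xD0;
  assert(!filtered_unsigned(lead));
  assert(filtered_signed(static_cast<signed char>(lead)));
}
```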
Note that the regression is present both in 5.0 and 5.1.
With this fix:
- [0x80 - 0xFF] bytes are no longer considered whitespace.
This alone fixes the regression.
In addition, filtering [0x00 - 0x20] was found bogus and abusive,
so the code now uses my_isspace when looking for whitespace.
Note that this fix is only addressing the regression affecting UTF-8
in general, but does not address a more fundamental problem with
skip_rear_comments: parsing a string *backwards*, starting at end[-1],
is not safe with multi-byte characters, so that end[-1] can confuse the
last byte of a multi-byte character with a character to filter out.
The only known impact of this remaining issue affects objects that have to
meet all the conditions below:
- the object is a FUNCTION / PROCEDURE / TRIGGER / EVENT / VIEW
- the body consists of only *1* instruction, and does *not* contain a
BEGIN-END block
- the instruction ends, lexically, with <ident> <whitespace>* ';'?
For example, "select <ident>;" or "return <ident>;"
- The last character of <ident> is a multi-byte character
- the last byte of this character is ';', '*', '/' or whitespace
In this case, the body of the object will be truncated after parsing,
and stored in an invalid format.
This last issue has not been fixed in this patch, since the real fix
will be implemented by Bug 25411 (trigger code truncated), which is caused
by the very same code.
The real problem is that the function skip_rear_comments is only a
work-around, and should be removed entirely: see the proposed patch for
bug 25411 for details.
Replacing binlog_row_based_if_mixed with variable binlog_stmt_flags
holding several flags and adding member functions to manipulate the
flags.
Added code to generate a warning when an attempt to log an unsafe
statement to the binary log was made. The warning is both pushed to the
SHOW WARNINGS table and written to the error log. To prevent flooding
the error log, the warning is just written to the error log once per
open session.
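A minimal C++ sketch of a flags word with accessor functions and a
once-per-session guard. Only the general idea comes from the description
above. The class name, the flag names and the function names are assumptions
made for illustration and do not describe the actual binlog_stmt_flags
interface.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-session state.
class Session_binlog_state {
public:
  enum Flag : std::uint32_t {
    ROW_BASED_IF_MIXED = 1u << 0,
    UNSAFE_WARNING_LOGGED = 1u << 1
  };

  void set_flag(Flag f) { flags_ |= f; }
  void clear_flag(Flag f) { flags_ &= ~static_cast<std::uint32_t>(f); }
  bool has_flag(Flag f) const { return (flags_ & f) != 0; }

  // Returns true when the caller should also write to the error log,
  // which happens at most once per session in this sketch.
  bool should_write_error_log() {
    if (has_flag(UNSAFE_WARNING_LOGGED))
      return false;
    set_flag(UNSAFE_WARNING_LOGGED);
    return true;
  }

private:
  std::uint32_t flags_ = 0;
};

// Usage: only the first unsafe statement reaches the error log.
void warning_example() {
  Session_binlog_state session;
  assert(session.should_write_error_log());
  assert(!session.should_write_error_log());
}
```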
The following type conversions were done:
- Changed byte to uchar
- Changed gptr to uchar*
- Changed my_string to char *
- Changed my_size_t to size_t
- Changed size_s to size_t
Removed declaration of byte, gptr, my_string, my_size_t and size_s.
The following function parameter changes were done:
- All string functions in mysys/strings were changed to use size_t
instead of uint for string lengths.
- All read()/write() functions changed to use size_t (including vio).
- All protocol functions changed to use size_t instead of uint
- Functions that used a pointer to a string length were changed to use size_t*
- Changed malloc(), free() and related functions from using gptr to use void *
as this requires fewer casts in the code and is more in line with how the
standard functions work.
- Added extra length argument to dirname_part() to return the length of the
created string.
- Changed (at least) following functions to take uchar* as argument:
- db_dump()
- my_net_write()
- net_write_command()
- net_store_data()
- DBUG_DUMP()
- decimal2bin() & bin2decimal()
- Changed my_compress() and my_uncompress() to use size_t. Changed one
argument to my_uncompress() from a pointer to a value as we only return
one value (makes function easier to use).
- Changed type of 'pack_data' argument to packfrm() to avoid casts.
- Changed in readfrm() and writefrm(), ha_discover and handler::discover()
the type for argument 'frmdata' to uchar** to avoid casts.
- Changed most Field functions to use uchar* instead of char* (reduced a lot of
casts).
- Changed field->val_xxx(xxx, new_ptr) to take const pointers.
Other changes:
- Removed a lot of unneeded casts
- Added a few new casts required by other changes
- Added some casts to my_multi_malloc() arguments for safety (as string lengths
need to be uint, not size_t).
- Fixed all calls to hash-get-key functions to use size_t*. (Needed to be done
explicitly as this conflict was often hidden by casting the function to
hash_get_key).
- Changed some buffers to memory regions to uchar* to avoid casts.
- Changed some string lengths from uint to size_t.
- Changed field->ptr to be uchar* instead of char*. This allowed us to
get rid of a lot of casts.
- Some changes from true -> TRUE, false -> FALSE, unsigned char -> uchar
- Include zlib.h in some files as we needed declaration of crc32()
- Changed MY_FILE_ERROR to be (size_t) -1.
- Changed many variables to hold the result of my_read() / my_write() to be
size_t. This was needed to properly detect errors (which are
returned as (size_t) -1).
- Removed some very old VMS code
- Changed packfrm()/unpackfrm() to not depend on uint size
(portability fix)
- Removed windows specific code to restore cursor position as this
causes slowdown on windows and we should not mix read() and pread()
calls anyway as this is not thread safe. Updated function comment to
reflect this. Changed function that depended on original behavior of
my_pwrite() to itself restore the cursor position (one such case).
- Added some missing checking of return value of malloc().
- Changed definition of MOD_PAD_CHAR_TO_FULL_LENGTH to avoid 'long' overflow.
- Changed type of table_def::m_size from my_size_t to ulong to reflect that
m_size is the number of elements in the array, not a string/memory
length.
- Moved THD::max_row_length() to table.cc (as it's not depending on THD).
Inlined max_row_length_blob() into this function.
- More function comments
- Fixed some compiler warnings when compiled without partitions.
- Removed setting of LEX_STRING() arguments in declaration (portability fix).
- Some trivial indentation/variable name changes.
- Some trivial code simplifications:
- Replaced some calls to alloc_root + memcpy to use
strmake_root()/strdup_root().
- Changed some calls from memdup() to strmake() (Safety fix)
- Simpler loops in client-simple.c
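The MY_FILE_ERROR item in the list above (results of my_read() / my_write()
must be held in size_t) can be illustrated with a short C++ sketch. It does
not use the mysys functions. ERROR_RESULT plays the role of MY_FILE_ERROR,
failing_read() is a hypothetical stand-in, and fixed 64-bit and 32-bit types
are used as stand-ins for size_t and uint so that the effect is the same on
every platform.

```cpp
#include <cassert>
#include <cstdint>

using size64 = std::uint64_t;  // stand-in for a 64-bit size_t
using uint32 = std::uint32_t;  // stand-in for uint

// Plays the role of MY_FILE_ERROR, i.e. (size_t) -1.
const size64 ERROR_RESULT = static_cast<size64>(-1);

// Hypothetical read that always fails.
size64 failing_read() { return ERROR_RESULT; }

// Result kept in a narrow variable: the value is truncated to 0xFFFFFFFF
// and no longer equals the 64-bit error value when compared.
bool detects_error_with_narrow_result() {
  uint32 got = static_cast<uint32>(failing_read());
  return got == ERROR_RESULT;
}

// Result kept in the wide type: the error is detected.
bool detects_error_with_wide_result() {
  size64 got = failing_read();
  return got == ERROR_RESULT;
}

// Usage: only the wide variable notices the failure.
void file_error_example() {
  assert(!detects_error_with_narrow_result());
  assert(detects_error_with_wide_result());
}
```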
Before this fix, the parser would sometimes change where a token starts by
altering Lex_input_stream::tok_start, which later confused the code in
sql_yacc.yy that needs to capture the source code of a SQL statement,
for example to represent the body of a stored procedure.
This line of code in sql_lex.cc :
case MY_LEX_USER_VARIABLE_DELIMITER:
lip->tok_start= lip->ptr; // Skip first `
would <skip the first back quote> ... and cause the bug reported.
In general, the responsibility of sql_lex.cc is to *find* where tokens are
in the SQL text, but is *not* to make up fake or incomplete tokens.
With a quoted label like `my_label`, the token starts on the first quote.
Extracting the token value should not change that (it did).
With this fix, the lexical analysis has been cleaned up to not change
lip->tok_start (in the case found for this bug).
The functions get_token() and get_quoted_token() now have an extra
parameter, used when some characters from the beginning of the token need
to be skipped when extracting a token value, like when extracting 'AB' from
'0xAB', for example, for a HEX_NUM token.
This exposed a bad assumption in Item_hex_string and Item_bin_string,
which has been fixed:
The assumption was that the string given, 'AB', was in fact preceded in
memory by '0x', which might be false (it can be preceded by "x'" and
followed by "'" -- or not be preceded by valid memory at all).
If a name is needed for Item_hex_string or Item_bin_string, the name is
taken from the original and true source code ('0xAB'), and assigned in
the select_item rule, instead of relying on assumptions related to how
memory is used.
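The separation between where a token is and what its value is can be
sketched in C++ as follows. This is not the get_token() signature from
sql_lex.cc: Token_span, token_value and token_source are hypothetical names.
The token position stays fixed, the value is extracted with a skip count, and
the name is taken from the full source text.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical, simplified token: start and length index into the query.
struct Token_span {
  std::size_t start;
  std::size_t length;
};

// Value of the token, skipping `skip` leading characters (for example the
// "0x" of a hex literal). The span itself is not modified.
std::string token_value(const std::string &query, const Token_span &tok,
                        std::size_t skip) {
  if (skip > tok.length)
    skip = tok.length;
  return query.substr(tok.start + skip, tok.length - skip);
}

// Full source text of the token, usable as an item name.
std::string token_source(const std::string &query, const Token_span &tok) {
  return query.substr(tok.start, tok.length);
}

// Usage: value and source come from the same, unchanged span.
void token_example() {
  const std::string query = "SELECT 0xAB";
  const Token_span hex = {7, 4};
  assert(token_value(query, hex, 2) == "AB");
  assert(token_source(query, hex) == "0xAB");
}
```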
The issue found with bug 25411 is due to the function skip_rear_comments()
which damages the source code while implementing a work around.
The root cause of the problem is in the lexical analyser, which does not
process special comments properly.
For special comments like :
[1] aaa /*!50000 bbb */ ccc
since 5.0 is a version older than the current code, the parser is inlining
the content of the special comment, so that the query to process is
[2] aaa bbb ccc
However, the text of the query captured when processing a stored procedure,
stored function or trigger (or event in 5.1), can be after rebuilding it:
[3] aaa bbb */ ccc
which is wrong.
To fix bug 25411 properly, the lexical analyser needs to return [2] when
inlining special comments.
In order to implement this, some preliminary cleanup is required in the code,
which is implemented by this patch.
Before this change, the structure named LEX (or st_lex) contains attributes
that belong to lexical analysis, as well as attributes that represent the
abstract syntax tree (AST) of a statement.
Creating a new LEX structure for each statement (which makes sense for the
AST part) also re-initialized the lexical analysis phase each time, which
is conceptually wrong.
With this patch, the previous st_lex structure has been split in two:
- st_lex represents the Abstract Syntax Tree for a statement. The name "lex"
has not been changed to avoid a bigger impact in the code base.
- class lex_input_stream represents the internal state of the lexical
analyser, which by definition should *not* be reinitialized when parsing
multiple statements from the same input stream.
This change is a pre-requisite for bug 25411, since the implementation of
lex_input_stream will later improve to deal properly with special comments,
and this processing cannot be done with the current implementation of
sp_head::reset_lex and sp_head::restore_lex, which interfere with the lexer.
This change set alone does not fix bug 25411.
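The split can be pictured with a small C++ sketch. It is a toy and not the
server classes: Input_stream_sketch, Statement_tree_sketch and parse_all are
hypothetical names, and statements are simply separated on ';'. The point is
only that the input position lives in one object for the whole input, while a
new per-statement object is created for each statement.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Lexer state: lives as long as the input and is never reset between
// statements.
class Input_stream_sketch {
public:
  explicit Input_stream_sketch(const std::string &text)
      : text_(text), pos_(0) {}

  // Stores the next ';'-terminated statement, or returns false at the end.
  bool next_statement(std::string *out) {
    if (pos_ >= text_.size())
      return false;
    std::size_t end = text_.find(';', pos_);
    if (end == std::string::npos)
      end = text_.size();
    *out = text_.substr(pos_, end - pos_);
    pos_ = (end < text_.size()) ? end + 1 : end;
    return true;
  }

private:
  std::string text_;
  std::size_t pos_;
};

// Per-statement result: a fresh one is created for each statement.
struct Statement_tree_sketch {
  std::string text;
};

std::vector<Statement_tree_sketch> parse_all(const std::string &input) {
  Input_stream_sketch lip(input);  // one lexer state for the whole input
  std::vector<Statement_tree_sketch> result;
  std::string stmt;
  while (lip.next_statement(&stmt)) {
    Statement_tree_sketch tree;    // new tree, lexer state untouched
    tree.text = stmt;
    result.push_back(tree);
  }
  return result;
}

// Usage: three statements from one input stream.
void split_example() {
  assert(parse_all("a;b;c").size() == 3);
}
```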
Support of views wasn't implemented for the TRUNCATE statement.
Now TRUNCATE on views has the same semantics as DELETE FROM view:
mysql_truncate() checks whether the table is a view and falls back
to delete if so.
In order to properly initialize LEX::updatable for a view,
st_lex::can_use_merged() now allows usage of merged views for the
TRUNCATE statement.
the lexer API which internally uses unsigned char variables to
address its state map. The implementation of the lexer should be
internal to the lexer, and not influence the rest of the code.