LOAD_FILE
LOAD_FILE is not safe to replicate in STATEMENT mode, because it
depends on a file (which is loaded on master and may not exist in
slave(s)). This leads to scenarios on which the slave replicates the
statement with 'load_file' and it will try to load the file from local
file system. Given that the file may not exist in the slave filesystem
the operation will not succeed (probably returning NULL), causing
master and slave(s) to diverge. However, when using MIXED mode
replication, this can be made to work, if the statement including
LOAD_FILE is marked as unsafe, triggering a switch to ROW mode,
meaning that the contents of the file are written to binlog as row
events. Consequently, the contents from the file in the master will
reach the slave via the binlog.
This patch addresses this bug by marking the load_file function as
unsafe. When in mixed mode and when LOAD_FILE is issued, there will be
a switch to row mode. Furthermore, when in statement mode, the
LOAD_FILE will raise a warning that the statement is unsafe in that
mode.
Based on contributed patch from Martin Friebe, CLA from 2007-02-24.
The parser lacked support for field sizes after signed long,
when it should extend to 2**32-1.
Now, we correct that limitation, and also make the error handling
consistent for casts.
---
Fix minor complaints of Marc Alff, for patch against B-g#15776.
---
Merge zippy.cornsilk.net:/home/cmiller/work/mysql/bug15776/my50-bug15776
into zippy.cornsilk.net:/home/cmiller/work/mysql/bug15776/my51-bug15776
---
Merge zippy.cornsilk.net:/home/cmiller/work/mysql/bug15776/my51-bug15776
into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.1-build
---
testing
Problem: when character_set_connection=utf8,
mixing SPACE() with a non-Unicode column (e.g. for concat)
produced "illegal mix of collations" error.
Fix: Item_string() corresponding to space character
is now created using "ASCII" repertoire. Previously
it was incorrectly created using "UNICODE" repertoure, which
didn't allow to convert results of SPACE() to a non-Unicode
character set.
The functions ROW_COUNT/FOUND_ROWS are indeed not safe to be used in
statement based replication.
Added code to declare them as such and switch the statement they're in
to row based logging for mixed mode.
Problem: thd->thread_specific_used flag is not set executing a statement
containig connection_id() function using PS protocol, that leads to
improper binlog event creation.
Fix: set the flag in the Item_func_connection_id::fix_fields().
restores from mysqlbinlog out
Problem: using "mysqlbinlog | mysql" for recoveries the connection_id()
result may differ from what was used when issuing the statement.
Fix: if there is a connection_id() in a statement, write to binlog
SET pseudo_thread_id= XXX; before it and use the value later on.
The cast operation ignored the cases when the precision and/or the scale exceeded
the limits, 65 and 30 respectively. No errors were reported in these cases.
For some queries this may lead to an assertion abort.
Fixed by throwing errors for such cases.
Replacing binlog_row_based_if_mixed with variable binlog_stmt_flags
holding several flags and adding member functions to manipulate the
flags.
Added code to generate a warning when an attempt to log an unsafe
statement to the binary log was made. The warning is both pushed to the
SHOW WARNINGS table and written to the error log. The prevent flooding
the error log, the warning is just written to the error log once per
open session.
The following type conversions was done:
- Changed byte to uchar
- Changed gptr to uchar*
- Change my_string to char *
- Change my_size_t to size_t
- Change size_s to size_t
Removed declaration of byte, gptr, my_string, my_size_t and size_s.
Following function parameter changes was done:
- All string functions in mysys/strings was changed to use size_t
instead of uint for string lengths.
- All read()/write() functions changed to use size_t (including vio).
- All protocoll functions changed to use size_t instead of uint
- Functions that used a pointer to a string length was changed to use size_t*
- Changed malloc(), free() and related functions from using gptr to use void *
as this requires fewer casts in the code and is more in line with how the
standard functions work.
- Added extra length argument to dirname_part() to return the length of the
created string.
- Changed (at least) following functions to take uchar* as argument:
- db_dump()
- my_net_write()
- net_write_command()
- net_store_data()
- DBUG_DUMP()
- decimal2bin() & bin2decimal()
- Changed my_compress() and my_uncompress() to use size_t. Changed one
argument to my_uncompress() from a pointer to a value as we only return
one value (makes function easier to use).
- Changed type of 'pack_data' argument to packfrm() to avoid casts.
- Changed in readfrm() and writefrom(), ha_discover and handler::discover()
the type for argument 'frmdata' to uchar** to avoid casts.
- Changed most Field functions to use uchar* instead of char* (reduced a lot of
casts).
- Changed field->val_xxx(xxx, new_ptr) to take const pointers.
Other changes:
- Removed a lot of not needed casts
- Added a few new cast required by other changes
- Added some cast to my_multi_malloc() arguments for safety (as string lengths
needs to be uint, not size_t).
- Fixed all calls to hash-get-key functions to use size_t*. (Needed to be done
explicitely as this conflict was often hided by casting the function to
hash_get_key).
- Changed some buffers to memory regions to uchar* to avoid casts.
- Changed some string lengths from uint to size_t.
- Changed field->ptr to be uchar* instead of char*. This allowed us to
get rid of a lot of casts.
- Some changes from true -> TRUE, false -> FALSE, unsigned char -> uchar
- Include zlib.h in some files as we needed declaration of crc32()
- Changed MY_FILE_ERROR to be (size_t) -1.
- Changed many variables to hold the result of my_read() / my_write() to be
size_t. This was needed to properly detect errors (which are
returned as (size_t) -1).
- Removed some very old VMS code
- Changed packfrm()/unpackfrm() to not be depending on uint size
(portability fix)
- Removed windows specific code to restore cursor position as this
causes slowdown on windows and we should not mix read() and pread()
calls anyway as this is not thread safe. Updated function comment to
reflect this. Changed function that depended on original behavior of
my_pwrite() to itself restore the cursor position (one such case).
- Added some missing checking of return value of malloc().
- Changed definition of MOD_PAD_CHAR_TO_FULL_LENGTH to avoid 'long' overflow.
- Changed type of table_def::m_size from my_size_t to ulong to reflect that
m_size is the number of elements in the array, not a string/memory
length.
- Moved THD::max_row_length() to table.cc (as it's not depending on THD).
Inlined max_row_length_blob() into this function.
- More function comments
- Fixed some compiler warnings when compiled without partitions.
- Removed setting of LEX_STRING() arguments in declaration (portability fix).
- Some trivial indentation/variable name changes.
- Some trivial code simplifications:
- Replaced some calls to alloc_root + memcpy to use
strmake_root()/strdup_root().
- Changed some calls from memdup() to strmake() (Safety fix)
- Simpler loops in client-simple.c
This pads the value of CHAR columns with spaces up to full column length (according to ANSI)
It's not makde part of oracle or ansi mode yet, as this would cause a notable behaviour change.
Added uuid_short(), a generator for increasing 'unique' longlong integers (8 bytes)
The MBR-prefixed function aliases (e.g., MBROVERLAPS()) were missing in 5.1, after
some code refactoring left them out. This simply adds them into the func_array list.
The problem was that some facilities (like CONVERT_TZ() function or
server HELP statement) may require implicit access to some tables in
'mysql' database. This access was done by ordinary means of adding
such tables to the list of tables the query is going to open.
However, if we issued LOCK TABLES before that, we would get "table
was not locked" error trying to open such implicit tables.
The solution is to treat certain tables as MySQL system tables, like
we already do for mysql.proc. Such tables may be opened for reading
at any moment regardless of any locks in effect. The cost of this is
that system table may be locked for writing only together with other
system tables, it is disallowed to lock system tables for writing and
have any other lock on any other table.
After this patch the following tables are treated as MySQL system
tables:
mysql.help_category
mysql.help_keyword
mysql.help_relation
mysql.help_topic
mysql.proc (it already was)
mysql.time_zone
mysql.time_zone_leap_second
mysql.time_zone_name
mysql.time_zone_transition
mysql.time_zone_transition_type
These tables are now opened with open_system_tables_for_read() and
closed with close_system_tables(), or one table may be opened with
open_system_table_for_update() and closed with close_thread_tables()
(the latter is used for mysql.proc table, which is updated as part of
normal MySQL server operation). These functions may be used when
some tables were opened and locked already.
NOTE: online update of time zone tables is not possible during
replication, because there's no time zone cache flush neither on LOCK
TABLES, nor on FLUSH TABLES, so the master may serve stale time zone
data from cache, while on slave updated data will be loaded from the
time zone tables.
Before this fix, a call to a User Defined Function (UDF) could,
under some circumstances, be interpreted as a call to a Stored function
instead. This occurred if a native function was invoked in the parameters
for the UDF, as in "select my_udf(abs(x))".
The root cause of this defect is the introduction, by the fix for Bug 21809,
of st_select_lex::udf_list, and it's usage in the parser in sql_yacc.yy
in the rule function_call_generic (in 5.1).
While the fix itself for Bug 21809 is correct in 5.0, the code change
merged into the 5.1 release created the issue, because the calls in 5.1 to :
- lex->current_select->udf_list.push_front(udf)
- lex->current_select->udf_list.pop()
are not balanced in case of native functions, causing the udf_list,
which is really a stack, to be out of sync with the internal stack
maintained by the bison parser.
Instead of moving the call to udf_list.pop(), which would have fixed the
symptom, this patch goes further and removes the need for udf_list.
This is motivated by two reasons:
a) Maintaining a stack in the MySQL code in sync with the stack maintained
internally in sql_yacc.cc (not .yy) is extremely dependent of the
implementation of yacc/bison, and extremely difficult to maintain.
It's also totally dependent of the structure of the grammar, and has a risk
to break with regression defects each time the grammar itself is changed.
b) The previous code did report construct like "foo(expr AS name)" as
syntax errors (ER_PARSER_ERROR), which is incorrect, and misleading.
The syntax is perfectly valid, as this expression is valid when "foo" is
a UDF. Whether this syntax is legal or not depends of the semantic of "foo".
With this change:
a) There is only one stack (in bison), and no List<udf_func> to maintain.
b) "foo(expr AS name)", when used incorrectly, is reported as semantic error:
- ER_WRONG_PARAMETERS_TO_NATIVE_FCT (for native functions)
- ER_WRONG_PARAMETERS_TO_STORED_FCT (for stored functions)
This is achieved by the changes implemented in item_create.cc
Before this change, the functions BENCHMARK, ENCODE, DECODE and FORMAT could
only accept a constant for some parameters.
After this change, this restriction has been removed. An implication is that
these functions can also be used in prepared statements.
The change consist of changing the following classes:
- Item_func_benchmark
- Item_func_encode
- Item_func_decode
- Item_func_format
to:
- only accept Item* in the constructor,
- and evaluate arguments during calls to val_xxx()
which fits the general design of all the other functions.
The 'TODO' items identified in item_create.cc during the work done for
Bug 21114 are addressed by this fix, as a natural consequence of aligning
the design.
In the 'func_str' test, a single very long test line involving an explain
extended select with many functions has been rewritten into multiple
separate tests, to improve maintainability.
The result of explain extended select decode(encode(...)) has changed,
since the encode and decode functions now print all their parameters.
Due to the complexity of this change, everything is documented in WL#3565
This patch is the third iteration, it takes into account the comments
received to date.