DATABASE WHEN USING TABLE ALIASES
Issue:
-----
When using table aliases for deleting, MySQL checks
privileges against the current database and not the
privileges on the actual table or the database the table
resides in.
SOLUTION:
---------
While checking privileges for multi-deletes,
correspondent_table should be used since it points to the
correct table and database.
Problem :
---------
The specific issue reported in this bug is with a range/list column
value that is allocated and initialized by evaluating the partition
expression (item tree) during execution. After evaluation the range
list value is marked fixed [part_column_list_val]. During the next
execution, we don't re-evaluate the expression and use the old value
since it is marked fixed.
Solution :
----------
One way to solve the issue is to mark all column values as not fixed
during clone so that the expression is always re-evaluated once we
attempt partition_info::fix_column_value_functions() after cloning
the part_info object during execution of DDL on a partitioned table.
Reviewed-by: Jimmy Yang <Jimmy.Yang@oracle.com>
Reviewed-by: Mattias Jonsson <mattias.jonsson@oracle.com>
RB: 9424
The valid minimum value for query_cache_min_res_unit is 512, but an
attempt to set a value greater than or equal to ULONG_MAX (the maximum
value) results in query_cache_min_res_unit being set to 0. This results
in a crash while searching for a memory block smaller than the valid
minimum value to store query results.
Free memory blocks in the query cache are stored in bins according
to their size, in descending size order. For a memory block request
the appropriate bin is found using a binary search. The minimum free
memory block request expected is 512 bytes, and the appropriate bin
is searched for blocks greater than or equal to 512 bytes.
Because of the bug, query_cache_min_res_unit is set to 0, so
requests may be made for memory blocks smaller than the minimum
size tracked in the free memory block bins. The bin search for such
an invalid input size fails and returns a garbage index. Accessing
the bins array element with this index causes the reported issue.
The valid value range for query_cache_min_res_unit is
512 to ULONG_MAX (when a value greater than the maximum allowed is
given, the maximum allowed value, ULONG_MAX, is used). While setting
the result unit block size (query_cache_min_res_unit), the size is
memory-aligned using the macro ALIGN_SIZE. The ALIGN_SIZE logic is:
(input_size + sizeof(double) - 1) & ~(sizeof(double) - 1)
For an unsigned long variable, when input_size is greater than or
equal to ULONG_MAX - (sizeof(double) - 1), the above expression
evaluates to 0.
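A minimal standalone demonstration of the wrap-around (the macro body is quoted from above; the surrounding program is only illustrative):

  #include <stdio.h>
  #include <limits.h>

  /* The alignment logic quoted above: round up to a multiple of
     sizeof(double). */
  #define ALIGN_SIZE(A) (((A) + sizeof(double) - 1) & ~(sizeof(double) - 1))

  int main(void)
  {
    unsigned long ok=  ALIGN_SIZE(513UL);      /* 520: normal rounding */
    unsigned long bad= ALIGN_SIZE(ULONG_MAX);  /* wraps around to 0    */
    printf("%lu %lu\n", ok, bad);
    return 0;
  }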
Fix:
-----
Compare the value set for query_cache_min_res_unit with the maximum
aligned value that can be stored in a ulong variable. If it is
greater, clamp it to that maximum aligned value.
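A hedged sketch of the clamp (names are illustrative, not the actual patch):

  #include <limits.h>

  #define ALIGN_SIZE(A) (((A) + sizeof(double) - 1) & ~(sizeof(double) - 1))

  /* Largest value that survives ALIGN_SIZE without wrapping. */
  #define MAX_ALIGNED_ULONG (ULONG_MAX & ~(sizeof(double) - 1))

  static unsigned long set_min_res_unit(unsigned long requested)
  {
    if (requested > MAX_ALIGNED_ULONG)   /* clamp before aligning */
      requested= MAX_ALIGNED_ULONG;
    return ALIGN_SIZE(requested);
  }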
- Removing use of calls to current_thd
- More DBUG_PRINT
- Code style changes
- Made some local functions static
Ensure that calls to print_keyuse are protected by a mutex so that all lines end up in the same debug packet
SELECT ... WHERE XX IN (SELECT YY)
this was transformed to something like:
SELECT ... WHERE EXISTS(SELECT ... HAVING XX=YY)
The bug was that for normal execution XX was fixed in the original outer SELECT context, while in PS execution it was fixed in the subquery context, and this confused the optimizer.
Fixed by ensuring that XX is always fixed in the outer context.
This is MDEV-7601, including its sub-tasks MDEV-7594, MDEV-7555, MDEV-7590, MDEV-7581, MDEV-7589
The problem was that select_lex->non_agg_fields was not properly reset for re-execution, and this caused an overwrite of a random memory position.
The fix was to move non_agg_fields from select_lex to JOIN, which is properly reset.
The --gtid-ignore-duplicates option was not working correctly with row-based
replication. When a row event was completed, but before committing, there
was a small window where another multi-source SQL thread could wrongly try
to re-execute the same transaction, without properly ignoring the duplicate
GTID. This would lead to duplicate key error or out-of-order GTID error or
similar.
Thanks to Matt Neth for reporting this and giving an easy way to reproduce
the issue.
Problem :
---------
Issue-1: The root cause of the issues is that (col1 > 1) is not a
valid partition function and we should have thrown an error at a much
earlier stage [partition_info::check_partition_info]. We were not
checking the sub-partition expression when the partition expression is NULL.
Issue-2: A potential future issue if any partition function needs to
change the item tree during open/fix_fields. We should release changed
items, if any, before doing closefrm when we open the partitioned table
during creation in create_table_impl.
Solution :
----------
1.check_partition_info() - Check for sub-partition expression even if no
partition expression.
[partition by ... columns(...) subpartition by hash(<expr>)]
2.create_table_impl() - Assert that the change list is empty before doing
closefrm for a partitioned table. Currently no supported partition function
appears to change the item tree during open.
Reviewed-by: Mattias Jonsson <mattias.jonsson@oracle.com>
RB: 9345
in ha_delete_table()
* only convert ENOENT and HA_ERR_NO_SUCH_TABLE to warnings
* only return real error codes (that is, not ENOENT and
not HA_ERR_NO_SUCH_TABLE)
* intercept HA_ERR_ROW_IS_REFERENCED to generate backward
compatible ER_ROW_IS_REFERENCED
in mysql_rm_table_no_locks()
* no special code to handle HA_ERR_ROW_IS_REFERENCED
* no special code to handle ENOENT and HA_ERR_NO_SUCH_TABLE
* return multi-table error ER_BAD_TABLE_ERROR <table list> only
when there were many errors, not when there were many
tables to drop (but only one table generated an error)
When RENAME TABLE is executed, it apparently does not check whether the engine
is available (unlike ALTER TABLE .. RENAME, which does). It means that if the
engine in question was not loaded for some reason, the table might become
unusable, since the engine won't know about the change.
With this patch RENAME TABLE fails if the storage engine is not available.
when --bind-address is not specified explicitly (or set to '*')
MariaDB tries all wildcard addresses. Print a warning (not an error)
if a socket cannot be created for some of them.
Still print an error if a socket cannot be created for an address
that a user has specified explicitly with --bind-address.
* take into account that example may be NULL
* use example->safe_charset_converter(), copy-paste from
Item::safe_charset_converter() (example might have its own
implementation)
* handle the case when the charset doesn't need conversion
(and return this).
semisync plugin and setting rpl_semi_sync_master_enabled
There was a race condition between INSTALL PLUGIN and SET. It was caused by a
gap in INSTALL PLUGIN where plugin variables were registered but not fully
initialized. Accessing such variables concurrently may reference uninitialized
memory, specifically sys_var_pluginvar::plugin.
Fixed by initializing sys_var_pluginvar::plugin early, before the variable is
registered.
semisync plugin and setting rpl_semi_sync_master_enabled
Cleanup:
Removed my_intern_plugin_lock() and my_intern_plugin_lock_ci() wrappers. They
were obsoleted by revision f56dd32bf.
"Range Checked for Each Record" should be only employed when the other
option would be cross-product join (i.e. the other option is so bad that
we hardly risk anything).
Previous logic was: use RCfER if there are no possible quick selects, or
the quick select would read > 100 rows. Also, it didn't always work as
expected, due to the range optimizer changing table->quick_keys while we
were looking at sel->quick_keys.
Another angle is that recent versions have enabled use of Join Buffering
in e.g. outer joins. This further reduces the range of cases where RCfER
should be used.
We are still unable to estimate the cost of RCfER with any precision, so
now changing the condition of "no quick select or quick->records> 100"
to a hopefully better condition "no quick select or quick would cost more
than full table scan".
Changing the error message to:
"...from type 'decimal(0,?)/*old*/' to type ' 'decimal(10,7)'..."
So it's now clear that the master data type is OLD decimal.
Removing Item_cache::used_table_map, Item_cache::used_tables() and
Item_cache::set_used_tables(). Using the implementations inherited from
Item_basic_constant instead.
The GEOMETRY field should be handled just as the BLOB field, so that was
fixed in field_conv. One additional bug was found and fixed meanwhile - that
the geometry field subtypes should also be merged for the UNION command.
Server may crash if sanity checks of COLUMN_GET() fail.
The COLUMN_GET() description generator expects a parent CAST item, which may
not have been created due to failure of the sanity checks. A further attempt
to report an error may then crash the server.
Fixed the COLUMN_GET() description generator to handle this case.
Factory timezone is supposed "For companies who don't want to put time zone
specification in their installation procedures. When users run date, they'll get
the message. Also useful for the "comp.sources" version."
This "message" is exposed as timezone abbreviation, which is supposed to be
short and thus may cause generated INSERT statements to fail.
Do not attempt to load Factory timezone.
remove the code that checks for correct options for
CHECK/REPAIR VIEW. Rewrite the grammar so that the parser
checks that. This changes error messages as
-ERROR 42000: You have an error ... near '' at line 1
+ERROR 42000: You have an error ... near 'quick' at line 1
When the slave processes the master restart format_description event,
parallel replication needs to complete any prior events before processing
the restart event (which closes temporary tables and such stuff).
This happens in wait_for_workers_idle(), however it was not waiting long
enough. The wait was using wait_for_prior_commit(), but at that point tables
can still be open. This led to an assertion in this case.
So change wait_for_workers_idle() to wait until all worker threads have
reached finish_event_group(), at which point all tables should have been
closed.
Do not use format function attribute for sql_print_xxx() family of
functions as they use a MariaDB-specific extension of printf instead
of one provided by the system.
AVOID DEADLOCK AFTER RESTORE
Analysis
--------
Accessing the restored NDB table in an active multi-statement
transaction was resulting in a 'deadlock found' error.
MySQL Server needs to discover the metadata of an NDB table from the
data nodes after the table is restored from backup. Metadata
discovery happens on the first access to the restored table.
The current code mandates this statement to be the first one
in the transaction. This is because discovery needs an exclusive
metadata lock on the table. A lock upgrade at this point can
lead to an MDL deadlock, and the code was written at the time
when the MDL deadlock detector was not present. When
discovery is attempted in a statement other than the first
one in the transaction, the ER_LOCK_DEADLOCK error is reported
pessimistically.
Fix:
---
Removed the constraint, as any potential deadlock will be
handled by the deadlock detector. Also changed the code in
discovery to keep the metadata locks of the active transaction.
The same issue was present in the table auto-repair scenario;
the same fix is added in the repair path also.
on REPAIR don't do table-specific stuff for views
(because even if the view has a temp table opened for it,
it's not opened all the way down to the engine. In particular,
Aria crashes in maria_status() because MARIA_HA* info - that is
table->table->file->file - is NULL)
Gave priority to password field when using a native authentication
plugin.
Also, prevented a user from setting an invalid auth_string, when using
native authentication.
including the big commit
commit 305130361bf72726de220f3d2b2787395e10be61
Author: Marc Alff <marc.alff@oracle.com>
Date: Tue Feb 10 11:31:32 2015 +0100
WL#8354 BACKPORT DIGEST IMPROVEMENTS TO MYSQL 5.6
(with the following commits) and related changes in sql/
On EOF vio_read() returns 0; this is not an error, so errno
is not reset. If the previous error was EINTR, the client
will loop forever. See also man recv.
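An illustrative POSIX read loop showing the distinction (plain recv(); not the actual vio code):

  #include <errno.h>
  #include <sys/types.h>
  #include <sys/socket.h>

  /* Returns bytes read, 0 on EOF, -1 on real error. */
  static ssize_t read_some(int fd, char *buf, size_t len)
  {
    for (;;)
    {
      ssize_t n= recv(fd, buf, len, 0);
      if (n == 0)                /* orderly EOF: errno is NOT reset, */
        return 0;                /* so it must not be examined here  */
      if (n < 0 && errno == EINTR)
        continue;                /* genuine interrupt: retry         */
      return n;
    }
  }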
1. After a period of wait (where last_master_timestamp=0)
do NOT restore the last_master_timestamp to the timestamp
of the last executed event (which would mean we've just
executed it, and we're that much behind the master).
2. Update last_master_timestamp before executing the event,
not after.
Take the approach from this commit (but with a different test
case that actually makes sense):
commit 0c75ab453fb8c5439576af8fe5add7a1b89f1569
Author: Luis Soares <luis.soares@sun.com>
Date: Thu Apr 15 17:39:31 2010 +0100
BUG#52166: Seconds_Behind_Master spikes after long idle period
As part of the fix the find_files() prototype has been modified, and
mysql-cluster uses the find_files() function. Hence the find_files() call
in ha_ndbcluster_binlog.cc was modified to keep the mysql-cluster build working.
Fixed an overflow error that caused fewer bytes to be allocated than
necessary on 64-bit Windows. This is due to ulong being 32 bit on
64-bit Windows and 64 bit on 64-bit Linux.
The slave SQL thread was clearing serial_rgi->thd before deleting
serial_rgi, which could cause access to a NULL THD.
The clearing was introduced in commit
2e100cc5a4 and is just plain wrong. So revert
that part (single line) of that commit.
Thanks to Daniel Black for bug analysis and test case.
HOST WHEN IT CONTAINS WILDCARD
Description :- Incorrect access privileges are granted to a
user due to wrong sorting of users when wildcard characters
are present in the hostname.
Analysis :- Function "get_sorts()" is used to sort the
strings of user name, hostname, database name. It is used
to arrange the users in the access privilege matching order.
When a user connects, it checks in the sorted user access
privilege list and finds a corresponding matching entry for
the user. Algorithm used in "get_sort()" sorts the strings
inappropriately. As a result, when a user connects to the
server, it is mapped to incorrect user access privileges.
Algorithm used in "get_sort()" counts the number of
characters before the first occurence of any one of the
wildcard characters (single-wildcard character '_' or
multi-wildcard character '%') and sorts in that order.
As a result of inconnect sorting it treats hostname "%" and
"%.mysql.com" as equally-specific values and therefore
the order is indeterminate.
Fix:- The "get_sort()" algorithm has been modified to treat
"%" seperately. Now "get_sort()" returns a number which, if
sorted in descending order, puts strings in the following
order:-
* strings with no wildcards
* strings containg wildcards and non-wildcard characters
* single muilt-wildcard character('%')
* empty string.
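A hedged sketch of the resulting ordering (not the actual get_sort() code); higher values sort first when ordering in descending order:

  #include <string.h>

  /* 3: no wildcards (most specific)
     2: wildcards mixed with literal characters, e.g. "%.mysql.com"
     1: the lone multi-wildcard "%"
     0: the empty string (least specific) */
  static int specificity(const char *s)
  {
    if (*s == '\0')
      return 0;
    if (strcmp(s, "%") == 0)
      return 1;
    if (strpbrk(s, "%_") != NULL)
      return 2;
    return 3;
  }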
Description: On an example MySQL instance with 28k empty
InnoDB tables, a specific query to information_schema.tables
and information_schema.columns leads to memory consumption
over 38GB RSS.
Analysis: In the get_all_tables() call, we fill the I_S tables
from frm files and the storage engine. As part of that process
we call make_table_name_list() and allocate memory for all
the 28k frm file names in the THD mem_root through
make_lex_string_root(). Since this happens around
28k * 28k times, a huge amount of memory is hogged in the
THD mem_root. This causes the RSS to grow to 38GB.
Fix: As part of the fix we create a temporary mem_root
in get_all_tables() and pass it to fill_fiels(). There we
replace the THD mem_root with the temporary mem_root,
allocate the file names in the temporary mem_root, free it
once we have filled the I_S tables in get_all_tables(), and
re-assign the original mem_root back to the THD mem_root.
Note: Checked the massif output with the fix; the memory growth is now just around 580MB at peak.
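A minimal sketch of the pattern in plain C (all names are illustrative; the server uses its MEM_ROOT machinery): copy names into a scratch buffer that is reset per table, instead of leaking every name into the long-lived per-connection arena.

  #include <stdlib.h>
  #include <string.h>

  struct scratch { char *buf; size_t used, cap; };

  static char *scratch_strdup(struct scratch *a, const char *s)
  {
    size_t n= strlen(s) + 1;
    if (a->used + n > a->cap)
      return NULL;                  /* sketch: no growth logic */
    char *p= a->buf + a->used;
    memcpy(p, s, n);
    a->used+= n;
    return p;
  }

  /* After each table's rows are filled, drop all names at once. */
  static void scratch_reset(struct scratch *a) { a->used= 0; }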
on disconnect the THD must clean the user_var_events array before
dropping temporary tables. Otherwise, when binlogging a DROP,
it'll access user_var_events, but they were allocated
in the already-freed memroot.
if we clear the error status (in THD::clear_error()),
make sure to also clear thd->killed == KILL_BAD_DATA,
because it was caused by the error that we're clearing.
Remove the too-restrictive bugfix for bug#67088.
A FIFO can be used for the general/slow logs, but lseek() and fsync() on
a FIFO fail. And open() needs to be non-blocking, in case the other
end isn't reading.
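An illustrative POSIX open for a FIFO log target (assumed flags, not the exact server code):

  #include <fcntl.h>

  /* O_NONBLOCK keeps open() from hanging when no reader is attached:
     a write-only open of a FIFO then fails with ENXIO instead of
     blocking. lseek()/fsync() must simply be skipped for FIFOs. */
  static int open_log_fifo(const char *path)
  {
    return open(path, O_WRONLY | O_APPEND | O_NONBLOCK, 0644);
  }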
Three-way deadlock:
T1: SHOW GLOBAL STATUS
-> acquire LOCK_status
T2: STOP SLAVE
-> acquire LOCK_active_mi
-> terminate_slave_thread()
-> -> cond_timedwait for handle_slave_sql to stop
T3: sql slave thread (same applies to io thread)
-> handle_slave_sql(), when exiting
-> -> THD::add_status_to_global()
-> -> -> wait for LOCK_status...
T1: SHOW GLOBAL STATUS
-> for "Slave_heartbeat_period" status variable
-> -> show_heartbeat_period()
-> -> -> wait for LOCK_active_mi
cherry-pick from 5.6:
commit fc8b395898f40387b3468122bd0dae31e29a6fde
Author: Venkatesh Duggirala <venkatesh.duggirala@oracle.com>
Date: Wed Jun 12 21:41:05 2013 +0530
BUG#16904035-SHOW STATUS - EXCESSIVE LOCKING ON LOCK_ACTIVE_MI AND
ACTIVE_MI->RLI->DATA_LOCK
Problem: Excessive locking on lock_active_mi and rli->data_lock
while executing any `show status like 'X'` command.
Analysis: SHOW_FUNCs for Slave_running, Slave_retried_transactions,
Slave_heartbeat_period, Slave_received_heartbeats,
Slave_last_heartbeat are acquiring lock_active_mi and rli->data_lock
to show their variable value. It is ok to show stale data while showing
the status variables i.e., even if they miss one update, it will
not cause any great trouble.
Fix: Remove the locks from the above mentioned SHOW_FUNC functions.
Add a test case
Added handling of the implicit dependence of a constant view field
on a nullable table of a left join.
Fixed finding the real table to check whether it turned to NULL
(materialized views & derived tables taken into account).
Removed incorrect uninitialization.
There was a rare race, where a deadlock error might not be correctly
handled, causing the slave to stop with something like this in the error
log:
150423 14:04:10 [ERROR] Slave SQL: Connection was killed, Gtid 0-1-2, Internal MariaDB error code: 1927
150423 14:04:10 [Warning] Slave: Connection was killed Error_code: 1927
150423 14:04:10 [Warning] Slave: Deadlock found when trying to get lock; try restarting transaction Error_code: 1213
150423 14:04:10 [Warning] Slave: Connection was killed Error_code: 1927
150423 14:04:10 [Warning] Slave: Connection was killed Error_code: 1927
150423 14:04:10 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'master-bin.000001 position 1234
The problem was incorrect error handling. When a deadlock is detected, it
causes a KILL CONNECTION on the offending thread. This error is then later
converted to a deadlock error, and the transaction is retried.
However, the deadlock error was not cleared at the start of the retry, nor
was the lingering kill signal. So it was possible to get another deadlock
kill early during retry. If this happened with particular thread
scheduling/timing, it was possible that the new KILL CONNECTION error was
masked by the earlier deadlock error, so that the second kill was not
properly converted into a deadlock error and retry.
This patch adds code that clears the old error and killed flag before
starting the retry. It also adds code to handle a deadlock kill caught in a
couple of places where it was not handled before.
convert_subq_to_sj() must check the result of the in_equality->fix_fields()
call. It can fail in a meaningful way when e.g. we're trying to compare
columns with incompatible collations.
This was a regression from the patch for MDEV-7668.
A test was incorrect, so the slave would not properly handle re-using
temporary tables, which led to replication failure in this case.
It is possible for Item_field to have a NULL field_name. This is true if
the Item_field is created based on a field in a temporary table that has
no name. It is thus necessary to do a null check before attempting a
strcmp.
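A minimal guard of the kind described (function name is illustrative):

  #include <string.h>

  /* field_name may be NULL for an Item_field built from an unnamed
     temporary-table column, so check both names before strcmp(). */
  static int same_field_name(const char *a, const char *b)
  {
    return a != NULL && b != NULL && strcmp(a, b) == 0;
  }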
Add some suppressions that were missing. They cover the case where a STOP
SLAVE is executed early during IO thread startup, while it is negotiating
with the master. The master connection may be killed in the middle of a
mysql_real_query(), which is not a test failure if it is a network error.
This also caught one real code error, fixed with this commit: The I/O thread
would fail to automatically reconnect if a network error happened while
fetching the value of @@GLOBAL.gtid_domain_id.
Make sure that in parallel replication, we execute wait_for_prior_commit()
before setting table->in_use for a temporary table. Otherwise we can end up
with two parallel replication worker threads competing with each other for
use of a temporary table.
Re-factor the use of find_temporary_table() to be able to handle errors
in the caller (as wait_for_prior_commit() can return error in case of
deadlock kill).
[This commit cherry-picked to be able to merge MDEV-7936, of which it
is a pre-requisite, into both 10.0 and 10.1.]
Parallel replication depends on locking (table locks, row locks, etc.) to
prevent two conflicting transactions from running and committing in parallel.
But temporary tables are designed to be visible only to one thread, and have
no such locking.
In the concrete issue, an intermediate master could commit a CREATE TEMPORARY
TABLE in the same group commit as in INSERT into that table. Thus, a
lower-level master could attempt to run them in parallel and get an error.
More generally, we need protection from parallel replication trying to run
transactions in parallel that access a common temporary table.
This patch simply causes use of a temporary table from parallel replication
to wait for all previous transactions to commit, serialising the replication
at that point.
(A more fine-grained locking could be added later, possibly. However,
using temporary tables in statement-based replication is in any case
normally undesirable; for example a restart of the server will lose
temporary tables and can break replication).
Note that row-based replication is not affected, as it does not do any
temporary tables on the slave-side.
This patch also cleans up the locking around protecting the list of
temporary tables in Relay_log_info. This used to take the
rli->data_lock at the end of every statement, which is very bad for
concurrency. With this patch, the lock is not taken unless temporary
tables (with statement-based binlogging) are in use on the slave.
> + if (lex->no_write_to_binlog && lex->only_view)
> + {
> + my_parse_error(ER(ER_SYNTAX_ERROR));
> + MYSQL_YYABORT;
Why? REPAIR NO_WRITE_TO_BINLOG VIEW makes perfect sense to me, why did
you want to disallow it?
The hangs occur when the group_commit_orderer object is freed before the last
mark_start_commit() call on it - this loses the wakeup to other waiting worker
threads, causing them to hang until killed manually.
The object was freed because wakeup_subsequent_commits() was called too early
in two places: for MDEV-7888, during ANALYZE TABLE, and for MDEV-7929, during
record_gtid() after processing a DDL event. The group_commit_orderer object
can be freed when its last transaction has called wait_for_prior_commit().
Fix by implementing a suspend/resume mechanism for wakeup_subsequent_commits()
that can be used in places where a transaction is committed without this being
the commit of the actual replication event group.
Also add a protection mechanism (that asserts in debug builds) which can
prevent the too-early free and hang if other similar bugs should remain in
other parts of the code.
This patch fixes a bug in the error handling in parallel replication, when one
worker thread gets a failure and other worker threads processing later
transactions have to rollback and abort.
The problem was with the lifetime of group_commit_orderer objects (GCOs).
A GCO is freed when we register that its last event group has committed. This
relies on register_wait_for_prior_commit() and wait_for_prior_commit() to
ensure that the fact that T2 has committed implies that any earlier T1 has
also committed, and can thus no longer execute mark_start_commit().
However, in the error case, the code was skipping the
register_wait_for_prior_commit() and wait_for_prior_commit() calls. Thus
commit ordering was not guaranteed, and a GCO could be freed too early. Then a
later mark_start_commit() would reference a deallocated GCO, which could lead
to a lost wakeup (causing slave threads to hang) or other corruption.
This patch makes also the error case respect commit order. This way, also the
error case gets the GCO lifetime correct, and the hang no longer occurs.
When a transaction in parallel replication needs to retry (eg. because of
deadlock kill), first wait for all prior transactions to commit before doing
the retry. This way, we avoid the retry once again conflicting with a prior
transaction, requiring yet another retry.
Without this patch, we saw "in the wild" that transactions had to be retried
more than 10 times to succeed, which exceeds the default
--slave_transaction_retries value and is in any case undesirable.
(We already do this in 10.1 in "optimistic" parallel replication mode; this
patch just makes the code use the same logic for "conservative" mode (only
mode in 10.0)).
Backport from mysql-5.5 to mysql-5.1
Bug# 19699237: UNINITIALIZED VARIABLE IN
ITEM_FIELD::STR_RESULT LEADS TO INCORRECT
BEHAVIOR
ISSUE:
------
When the following conditions are satisfied in a query, a
server crash occurs:
a) Two rows are compared using a NULL-safe equal-to operator.
b) Each of these rows belongs to a different charset.
SOLUTION:
---------
When one charset is converted to another for comparison,
the constructor of "Item_func_conv_charset" is called.
This will attempt to use the Item_cache if the string is a
constant. This check succeeds because the "used_table_map"
of the Item_cache class is never set to the correct value.
Since it is mistakenly assumed to be a constant, it tries
to fetch the relevant null value related fields which are
yet to be initialized. This results in valgrind issues
and wrong results.
The fix is to update the "used_table_map" of "Item_cache".
This will allow "Item_func_conv_charset" to realise that
this is not a constant.
Problem: A UDF doesn't handle the arguments properly when they
are of string type, due to a misplaced break.
The length of an argument is also not set properly
when the argument is NULL.
Solution: Fixed the code by putting the break at the right place
and setting the argument length to zero when the
argument is NULL.
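A hedged sketch of both fixes using a stand-in argument structure (the real code operates on UDF_ARGS):

  enum kind { ARG_INT, ARG_STRING };
  struct arg { enum kind type; const char *val; unsigned long length; };

  static void normalize_arg(struct arg *a)
  {
    switch (a->type)
    {
    case ARG_STRING:
      if (a->val == 0)
        a->length= 0;      /* NULL argument: report length 0        */
      break;               /* break placed here, so the string case
                              no longer falls into the next one     */
    default:
      break;
    }
  }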
Backport from mysql-5.5 to mysql-5.1
Bug#19880368 : GROUP_CONCAT CRASHES AFTER DUMP_LEAF_KEY
Problem:
find_order_by_list does not update the address of order_item
correctly after resolving.
Solution:
Change the ref_by address for an order_by field, if it is a
SUM_FUNC_ITEM, to the address of the field present in
all_fields.
when using function.
Merged upstream fix to Bug#16221433 MYSQL REJECTS QUERY DUE TO BAD
RESOLUTION OF NAMES IN HAVING; VIEW UNREADABLE
authored by Guilhem Bichot <guilhem.bichot@oracle.com>.
Backport from mysql-5.5 to mysql-5.1
Bug #19612819 : FILESORT: ASSERTION FAILED: POS->FIELD != 0 || POS->ITEM != 0
Problem:
While getting the temp table field for a REF_ITEM,
make_sortorder was using the real_item. As a result the
server fails later with an assert.
Solution:
Do not use real_item to get the temp table field.
Instead use the REF_ITEM itself as temp table fields
are created for REF_ITEM not the real_item.
If the spatial key is used within an equality comparison, the comparison
does not produce relevant results generally as identical geometry can be
stored differently. Still, we want to support the operation. In order
to allow a hash join plan, we must define a key_length for Field_geom.
Backport from mysql-5.5 to mysql-5.1 of:
Bug19770858: MYSQLD CAN BE DRIVEN TO OOM WITH TWO SIMPLE SESSION VARS
The problem was that the maximum value of the transaction_prealloc_size
session system variable was ULONG_MAX which meant that it was possible
to cause the server to allocate excessive amounts of memory.
This patch fixes the problem by reducing the maximum value of
transaction_prealloc_size and transaction_alloc_block_size down
to 128K.
Note that transactions will still be able to allocate more than
128K if needed, this patch just reduces the amount that can be
preallocated - as well as the maximum size of the incremental
allocation blocks.
(cherry picked from commit 540c9f7ebb428bbf9ec028feabe1f7f919fdefd9)
Conflicts:
mysql-test/suite/sys_vars/r/transaction_alloc_block_size_basic.result
mysql-test/suite/sys_vars/r/transaction_alloc_block_size_basic_64.result
mysql-test/suite/sys_vars/t/disabled.def
mysql-test/suite/sys_vars/t/transaction_alloc_block_size_basic.test
sql/sys_vars.cc
JOIN::cur_dups_producing_tables was not maintained correctly in
the cases of greedy optimization (search_depth < n_tables).
Moved it to POSITION structure where it will be maintained automatically.
Removed POSITION::prefix_dups_producing_tables since its value can now
be calculated.
Parallel replication (in 10.0 / "conservative" mode) relies on binlog group
commits to group transactions that can be safely run in parallel on the
slave. The --binlog-commit-wait-count and --binlog-commit-wait-usec options
exist to increase the number of commits per group. But in case of conflicts
between transactions, this can cause unnecessary delay and reduced throughput,
especially on a slave where commit order is fixed.
This patch adds a heuristic to reduce this problem. When transaction T1 goes
to commit, it will first wait for N transactions to queue up for a group
commit. However, if we detect that another transaction T2 is waiting for a row
lock held by T1, then we will skip the wait and let T1 commit immediately,
releasing locks and let T2 continue.
On a slave, this avoids the unfortunate situation where T1 is waiting for T2
to join the group commit, but T2 is waiting for T1 to release locks, causing
no work to be done for the duration of the --binlog-commit-wait-usec timeout.
(The heuristic seems reasonable on the master as well, so it is enabled for
all transactions, not just replication transactions).
BINLOGGED INCORRECTLY - BREAKS A SLAVE
Submitted an incomplete patch with my previous push;
re-submitting the extra changes required to make
the patch complete.
Analysis:
In row-based replication, the Master does not send temp table information
to the Slave. If a DDL involves both a regular table (which needs to be
sent to the Slave) and temp tables (which will not be available at the
Slave), the Master rewrites the query, replacing the temp table with its
definition.
Eg: create table regular_table like temptable.
In the rewrite logic, the server was ignoring the database of the regular
table, which can cause the problems mentioned in this bug.
Fix: don't ignore database information (if available) while
rewriting the query
Delay spawning parallel replication worker threads until a slave SQL
thread is running, and de-spawn them when the last SQL thread stops.
This is especially useful to avoid needless threads on a master in a
setup where the same my.cnf is used on masters and slaves.
Parallel replication depends on locking (table locks, row locks, etc.) to
prevent two conflicting transactions from running and committing in parallel.
But temporary tables are designed to be visible only to one thread, and have
no such locking.
In the concrete issue, an intermediate master could commit a CREATE TEMPORARY
TABLE in the same group commit as in INSERT into that table. Thus, a
lower-level master could attempt to run them in parallel and get an error.
More generally, we need protection from parallel replication trying to run
transactions in parallel that access a common temporary table.
This patch simply causes use of a temporary table from parallel replication
to wait for all previous transactions to commit, serialising the replication
at that point.
(A more fine-grained locking could be added later, possibly. However,
using temporary tables in statement-based replication is in any case
normally undesirable; for example a restart of the server will lose
temporary tables and can break replication).
Note that row-based replication is not affected, as it does not do any
temporary tables on the slave-side.
This patch also cleans up the locking around protecting the list of
temporary tables in Relay_log_info. This used to take the
rli->data_lock at the end of every statement, which is very bad for
concurrency. With this patch, the lock is not taken unless temporary
tables (with statement-based binlogging) are in use on the slave.
The binlog contains specially marked format description events to mark
when a master restart happened (which could have caused temporary
tables to be silently dropped). Such events also cause slave to close
temporary tables.
However, there was a bug: if after this the slave re-connects to the
master in GTID mode, the master can send an old format description
event again. If temporary tables are closed when such an event is seen
for the second time, it might drop temporary tables created after that
event, and cause replication failure.
With this patch, the restart flag of the format description event is
cleared by the master when it is sent to the slave in a subsequent
connection, to avoid the erroneous temp table close.
The problem occurs in parallel replication in GTID mode, when we are using
multiple replication domains. In this case, if the SQL thread stops, the
slave GTID position may refer to a different point in the relay log for each
domain.
The bug was that when the SQL thread was stopped and restarted (but the IO
thread was kept running), the SQL thread would resume applying the relay log
from the point of the most advanced replication domain, silently skipping all
earlier events within other domains. This caused replication corruption.
This patch solves the problem by storing, when the SQL thread stops with
multiple parallel replication domains active, the current GTID
position. Additionally, the current position in the relay logs is moved back
to a point known to be earlier than the current position of any replication
domain. Then when the SQL thread restarts from the earlier position, GTIDs
encountered are compared against the stored GTID position. Any GTID that was
already applied before the stop is skipped to avoid duplicate apply.
This patch should have no effect if multi-domain GTID parallel replication is
not used. Similarly, if both SQL and IO thread are stopped and restarted, the
patch has no effect, as in this case the existing relay logs are removed and
re-fetched from the master at the current global @@gtid_slave_pos.
This bug manifests due to wrong computation and evaluation of
keyinfo->key_length. The issues were:
* Using table->file->max_key_length() as an absolute value that must not be
reached for a key, while it represents the maximum number of bytes
possible for a table key.
* Incorrectly computing the keyinfo->key_length size during
KEY_PART_INFO creation. The metadata information regarding the key,
such as the field length (for strings), was added twice.
When the server starts up, check if the master-bin.state file was lost.
If it was, recover its contents by scanning the last binlog file, thus
avoiding running with a corrupt binlog state.
Temporary table count fix. The number of temporary tables was increased
even when the table was not actually created (when do_not_open was passed
as TRUE to create_tmp_table).
3RD EXECUTION OF PS
Problem:
When order by refers to a column number for a group concat function
which has an outer reference, the server fails in case of prepared
statements on the third execution
Analysis:
When a group concat function has order by, the fields in order by
are not resolved until execution if the input is a column number.
During execution they get resolved after the temp table gets created.
As a result they will be pointing to temp table fields which are
runtime created objects. This results in dangling pointers leading
to server failure.
Solution:
Reset the pointers for the order by fields to point to the original
arguments after execution as they are invalid.
Done in Item_func_group_concat::cleanup.
ISSUE:
------
There can be up to MERGEBUFF2 sorted merge chunks.
We need enough buffer space for at least one record from
each merge chunk. If the estimates are wrong (very low) and we
allocate buffer space for fewer than MERGEBUFF2 records, then we
will have an issue in merge_buffers if the actual number of rows
to be sorted is bigger than the estimate and external filesort
is chosen.
SOLUTION:
---------
Set the number of rows to sort to be at least MERGEBUFF2.
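A hedged sketch of the clamp (MERGEBUFF2 mirrors the server's merge-chunk constant; the function name is illustrative):

  #define MERGEBUFF2 15   /* merge-chunk limit, as in sql_sort.h */

  /* Size the sort buffer for at least one record per possible merge
     chunk, even when the row estimate is lower. */
  static unsigned long rows_to_size_for(unsigned long estimated_rows)
  {
    return estimated_rows < MERGEBUFF2 ? MERGEBUFF2 : estimated_rows;
  }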
partially cherry-pick from mysql/5.6.
No test case (mysql/5.6 test case is useless, the correct
test case uses too much memory)
commit e061985813db54948f99892d89f7e076242473a5
Author: <Dao-Gang.Qu@sun.com>
Date: Tue Jun 1 15:02:22 2010 +0800
Bug #49931 Incorrect type in read_log_event error
Bug #49932 mysqlbinlog max_allowed_packet hard coded to 1GB
If somehow the COMMIT or XID event in an event group was missing, the code in
parallel replication to handle this was not sufficient, leading to server
deadlock.
In parallel replication, don't rollback inside ha_commit_trans() in case of
error.
The rollback will be done later, but the parallel replication code needs to
run unmark_start_commit() before the rollback to properly control the
sequencing of transactions.
I did not manage to come up with a reliable automatic test case for this, but
I tested it manually.
When the binlog was rotated due to @@max_binlog_size, the values of
binlog_snapshot_file and binlog_snapshot_position were inconsistent in case of
non-transactional DML. The position was referring to the old file, while the
filename was of the new file after rotation. This patch makes them consistent
by making sure the position also refers to the new file.
cherry-pick the upstream fix
commit d4ba10184cd7bde9c31c610e664ecd0c93605c46
Author: Sujatha Sivakumar <sujatha.sivakumar@oracle.com>
Date: Wed Jul 2 11:34:11 2014 +0530
Bug#17453826:ASSERTION ERROR WHEN SETTING FUTURE BINLOG
FILE/POS WITH SEMISYNC
Problem:
========
When DMLs are in progress on the master stopping a slave and
setting ahead binlog name/pos will cause an assert on the
master.
...
Item_func::print() prints itself as name + "(" + arguments + ")".
Normally that works, but Item_func_interval internally implements its
arguments as one single Item_row. Item_row prints itself as
"(" + values + ")". As a result, "INTERVAL(1,2)" was being printed
as "INTERVAL((1,2))". Fixed with a custom Item_func_interval::print().
Problem:
find_order_by_list does not update the address of order_item
correctly after resolving.
Solution:
Change the ref_by address for an order_by field, if it is a
SUM_FUNC_ITEM, to the address of the field present in
all_fields.
Redefine FT_KEYPART in a way that it does not conflict with Hash Join.
Hash join stores field->field_index in KEYUSE::keypart, so we must
use a value of FT_KEYPART that's greater than MAX_FIELDS.
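Illustrative definition of the constraint (the numeric values are assumptions, not the exact header contents):

  #define MAX_FIELDS 4096              /* assumed table-field limit   */
  #define FT_KEYPART (MAX_FIELDS + 10) /* cannot collide with any
                                          field_index that hash join
                                          stores in KEYUSE::keypart   */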
The order of initialisation during server startup was incorrect. The slave
threads were started before the parallel replication worker thread pool was
initialised, allowing a race where uninitialised data could be accessed.
Problem:
While getting the temp table field for a REF_ITEM,
make_sortorder was using the real_item. As a result the
server fails later with an assert.
Solution:
Do not use real_item to get the temp table field.
Instead use the REF_ITEM itself as temp table fields
are created for REF_ITEM not the real_item.
Do not use merge_for_insert for commands which use SELECT, because the
optimizer can't work with such tables.
Fixes that make multi-delete work with normally merged views.
When the distance in ST_BUFFER is too far negative the coordinates can run out of the operational
area. We should just return an empty geometry in this case.
In versions 5.5 and 5.6 the MySQL version is not logged until the
server is started and ready to accept connections. Exiting the
server before this point leaves no server version information
in the log. But in the 5.7 code, we log the server version
just after we prepare the server_version string and logging is
initialized. This change adds the same code to 5.5 and 5.6 to print
the server version information.
Test results:
================
5.5
-----
Server version will be logged as below on server startup:
141218 8:45:48 [Note] /home/praveen/WorkDir/mysql_local/bug20052694/mysql/sql/mysqld (mysqld 5.5.42-debug-log) starting as process 19697 ...
5.6
----
Server version will be logged as below on server startup:
2014-12-18 09:08:43 0 [Note] /home/praveen/WorkDir/mysql_local/bug20052694/mysql-5.6/sql/mysqld (mysqld 5.6.23-debug-log) starting as process 18474 ...
LEADS TO INCORRECT BEHAVIOR
ISSUE:
------
When the following conditions are satisfied in a query, a
server crash occurs:
a) Two rows are compared using a NULL-safe equal-to operator.
b) Each of these rows belongs to a different charset.
SOLUTION:
---------
When one charset is converted to another for comparison,
the constructor of "Item_func_conv_charset" is called.
This will attempt to use the Item_cache if the string is a
constant. This check succeeds because the "used_table_map"
of the Item_cache class is never set to the correct value.
Since it is mistakenly assumed to be a constant, it tries
to fetch the relevant null value related fields which are
yet to be initialized. This results in valgrind issues
and wrong results.
The fix is to update the "used_table_map" of "Item_cache".
This will allow "Item_func_conv_charset" to realise that
this is not a constant.
ISSUE:
------
We pre-allocate the ref_pointer_array before we resolve outer
references. This means that in some cases the
ref_pointer_array may not be large enough to hold all
references created. One such case is aggregate functions in
the having clause of a subquery, which may add items to the
select list of the outer query. So it is necessary to consider
select_n_having_items for subqueries while allocating the
ref_pointer_array, else we will get a buffer overflow.
SOLUTION:
---------
Allocate a larger ref_pointer_array by aggregating
select_n_having_items for subqueries.
The fix in sql_yacc.yy is a backport from bug fix 18782905.
CRASHES WITH AUTO_INCREMENT COLUMN
Description:- Creating a federated table with an AUTO_INCREMENT
column using the LIKE clause results in a server crash.
Analysis:- Creating a federated table with an AUTO_INCREMENT
column using the LIKE clause results in a federated server
crash due to the uninitialized connection structure (mysql).
Also, due to the unassigned connection string for the remote
server, at the time of preparation of the "create_info"
structure the creation of any federated table using the LIKE
clause fails with the error "ERROR 1 (HY000): server name:
'' doesn't exist!". This bug is not specific to
AUTO_INCREMENT but affects all creation of federated tables
with the LIKE clause.
Fix :- In ha_federated::info(), "mysql->insert_id" assigned
to "stats.auto_increment_value" only when there is an
active connection. This fixes the crash issue. For creating
the federated table with LIKE clause, connection string is
assigned at the time of preparation of "create_info"
structure.
Call mysql_derived_reinit() if we are reusing view.
This is needed as during a previous error condition the view may not have been reset
sql/sql_derived.cc:
More DBUG_PRINT
Always reset merged_for_insert (no reason to not do that)
sql/sql_derived.h:
Added prototype
sql/sql_insert.cc:
More DBUG_PRINT
Added DBUG_ASSERT
sql/sql_view.cc:
Call mysql_derived_reinit() if we are reusing view.
This is needed as during a previous error condition the view may not have been reset
sql/table.cc:
More DBUG_PRINT
The problem was that the maximum value of the transaction_prealloc_size
session system variable was ULONG_MAX which meant that it was possible
to cause the server to allocate excessive amounts of memory.
This patch fixes the problem by reducing the maximum value of
transaction_prealloc_size and transaction_alloc_block_size down
to 128K.
Note that transactions will still be able to allocate more than
128K if needed, this patch just reduces the amount that can be
preallocated - as well as the maximum size of the incremental
allocation blocks.
Using a boolean flag for 'there is a RESET MASTER in progress' doesn't
work very well for multiple concurrent RESET MASTER statements.
Changed to a counter.
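A minimal sketch of why a counter works where a boolean does not (C11 atomics stand in for the server's own locking):

  #include <stdatomic.h>

  /* With a plain boolean, the first of two overlapping RESET MASTER
     statements to finish would clear "in progress" while the second
     is still running. A counter stays nonzero until both are done. */
  static atomic_uint reset_master_count;

  static void reset_master_begin(void)   { atomic_fetch_add(&reset_master_count, 1); }
  static void reset_master_end(void)     { atomic_fetch_sub(&reset_master_count, 1); }
  static int  reset_master_running(void) { return atomic_load(&reset_master_count) != 0; }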
Fix MDL to report an error when a wait was killed, but preserve
the old documented behavior of GET_LOCK() where killing it is not an error.
Also remove race conditions in main.create_or_replace test
Problem Description And Fix:
Inserting a fudged record in mysql.proc, with the dbname
column value as test and the name column empty, will
cause a crash in mysqld when we run the command DROP
DATABASE test.
During DROP DATABASE test, mysql_rm_db subsequently
calls lock_db_routines. In that routine we fetch the
field 'name' from mysql.proc by calling the underlying
storage engine API. This yields a NULL value for the name
field of mysql.proc, and the subsequent dereference in
MDL_request::init leads to a crash.
Modifying mysql.proc via SQL commands by the user is not
supported, but in principle there is a possibility
of mysql.proc getting corrupted, which can also lead
to empty fields and arbitrary values. The patch fixes
the crash by checking for NULL and propagating the
appropriate error code to the user.
Stage "Filling schema table" is now properly shown in 'show processlist'
mysys/mf_keycache.c:
Simple cleanup with more comments
sql/lock.cc:
Return to original stage after mysql_lock_tables
Made 'Table lock' as a true stage
sql/sql_show.cc:
Restore original stage after get_schema_tables_result
- The code that tested if
WHERE expr=value AND expr=const
can be rewritten to:
WHERE const=value AND expr=const
was incomplete in the case of STRING_RESULT.
- Moving the test into a new function, to reduce duplicate code.