Conflicts:
- mysql-test/r/mysqld--help-win.result
- sql/sys_vars.cc
Original revision (in next-mr-bugfixing):
------------------------------------------------------------
revno: 2971 [merge]
revision-id: alfranio.correia@sun.com-20100121210527-rbuheu5rnsmcakh1
committer: Alfranio Correia <alfranio.correia@sun.com>
branch nick: mysql-next-mr-bugfixing
timestamp: Thu 2010-01-21 21:05:27 +0000
message:
BUG#46364 MyISAM transbuffer problems (NTM problem)
It is well-known that due to concurrency issues, a slave can become
inconsistent when a transaction contains updates to both transactional
and non-transactional tables.
In a nutshell, the current code base tries to preserve causality among
the statements by writing non-transactional statements to the
txn-cache, which is flushed upon commit. However, modifications done to
non-transactional tables on behalf of a transaction become immediately
visible to other connections but may not immediately get into the
binary log, and therefore consistency may be broken.
In general, it is impossible to automatically detect causality/dependency
among statements by just analyzing the statements sent to the server.
This happens because a dependency may be hidden in the application code,
and it is necessary to know a priori all the statements processed in the
context of a transaction, such as in a procedure. Moreover, even for the
few cases that we could automatically address in the server, the
computational effort required could make the approach infeasible.
So, in this patch we introduce the option
- "--binlog-direct-non-transactional-updates", which can be used to
bypass the current behavior and write statements that change
non-transactional tables directly to the binary log.
Besides, it is used to enable WL#2687, which is disabled by default.
------------------------------------------------------------
revno: 2970.1.1
revision-id: alfranio.correia@sun.com-20100121131034-183r4qdyld7an5a0
parent: alik@sun.com-20100121083914-r9rz2myto3tkdya0
committer: Alfranio Correia <alfranio.correia@sun.com>
branch nick: mysql-next-mr-bugfixing
timestamp: Thu 2010-01-21 13:10:34 +0000
message:
BUG#46364 MyISAM transbuffer problems (NTM problem)
It is well-known that due to concurrency issues, a slave can become
inconsistent when a transaction contains updates to both transactional
and non-transactional tables.
In a nutshell, the current code base tries to preserve causality among
the statements by writing non-transactional statements to the
txn-cache, which is flushed upon commit. However, modifications done to
non-transactional tables on behalf of a transaction become immediately
visible to other connections but may not immediately get into the
binary log, and therefore consistency may be broken.
In general, it is impossible to automatically detect causality/dependency
among statements by just analyzing the statements sent to the server.
This happens because a dependency may be hidden in the application code,
and it is necessary to know a priori all the statements processed in the
context of a transaction, such as in a procedure. Moreover, even for the
few cases that we could automatically address in the server, the
computational effort required could make the approach infeasible.
So, in this patch we introduce the option
- "--binlog-direct-non-transactional-updates", which can be used to
bypass the current behavior and write statements that change
non-transactional tables directly to the binary log.
Besides, it is used to enable WL#2687, which is disabled by default.
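For illustration, a minimal sketch of how the new option is meant to be
used (the table names innodb_t and myisam_t are hypothetical):

  SET SESSION binlog_direct_non_transactional_updates = ON;
  BEGIN;
  UPDATE innodb_t SET a = 1 WHERE id = 10;  -- transactional: cached and
                                            -- written to the binlog on
                                            -- commit
  UPDATE myisam_t SET b = 2 WHERE id = 10;  -- non-transactional: with the
                                            -- option ON, written directly
                                            -- to the binlog
  COMMIT;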
There were two problems:
The first was the symptom, caused by bad error handling in
ha_partition. It did not handle print_error() etc. when it had
no partitions (i.e., when used as a dummy handler).
The second was the real problem: when dropping tables, it reused
the table type (storage engine) from when the lock was asked for,
not the table type that it had when gaining the exclusive name
lock, so it tried to delete tables from the wrong storage engines.
The solution for the first problem was to accept some handler
calls to the partitioning handler even if it was not set up
with any partitions, and, where possible, to fall back to the
base handler's default functions.
The solution for the second problem was to remove the optimization
that reused the definition from the cache and instead always check
the frm-file while holding the LOCK_open mutex
(updated with a fix for a debug print crash and better
comments as required by the reviewer, and with the optimization
to avoid reading the frm-file removed).
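A hypothetical two-session interleaving consistent with the second
problem (table and engines are illustrative only):

  -- session 1 (t1 is MyISAM here)     -- session 2
  DROP TABLE t1;
  -- asks for the exclusive name
  -- lock, noting engine = MyISAM
                                       ALTER TABLE t1 ENGINE = InnoDB;
  -- gains the name lock only now; before the fix it still tried to
  -- delete t1 via the cached MyISAM handler instead of re-reading
  -- the frm-file under LOCK_open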
caused by the patch which implemented new type-of-operation-aware
metadata locks and added a wait-for-graph-based deadlock
detector to the MDL subsystem (this patch fixed bug #46272
"MySQL 5.4.4, new MDL: unnecessary deadlock" and bug #37346
"innodb does not detect deadlock between update and alter
table").
Crashes were caused by a race in MDL_context::try_acquire_lock().
This method added an MDL_ticket to the list of granted tickets and
released the lock protecting the list before setting
MDL_ticket::m_lock. Thus some other thread was able to see a ticket
without a properly set m_lock member for a short period of time. If
that thread called a method involving this member during this period,
a crash happened.
This fix ensures that MDL_ticket::m_lock is set in all cases
when a ticket is added to the granted/pending lists in MDL_lock.
Add a wait-for-graph-based deadlock detector to the
MDL subsystem.
Fixes bug #46272 "MySQL 5.4.4, new MDL: unnecessary deadlock" and
bug #37346 "innodb does not detect deadlock between update and
alter table".
The first bug manifested itself as an unwarranted abort of a
transaction with an ER_LOCK_DEADLOCK error by a concurrent ALTER
statement, when this transaction tried to repeat use of a
table which it had already used in a similar fashion before the
ALTER started.
The second bug showed up as a deadlock between table-level
locks and InnoDB row locks, which was "detected" only after
innodb_lock_wait_timeout timeout.
A transaction would start using the table and modify a few
rows.
Then ALTER TABLE would come in, and start copying rows
into a temporary table. Eventually it would stumble on
the modified records and get blocked on a row lock.
The first transaction would try to do more updates, and get
blocked on thr_lock.c lock.
This situation of circular wait would only get resolved
by a timeout.
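The same circular wait as a two-session timeline (t1 is a hypothetical
InnoDB table):

  -- session 1                         -- session 2
  BEGIN;
  UPDATE t1 SET a = a + 1
  WHERE id = 1;
                                       ALTER TABLE t1 ADD COLUMN b INT;
                                       -- copies rows, blocks on the row
                                       -- lock held by session 1
  UPDATE t1 SET a = a + 1
  WHERE id = 2;
  -- blocks on the thr_lock.c lock taken by ALTER: a circular wait
  -- that, before this patch, was broken only by
  -- innodb_lock_wait_timeout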
Both these bugs stemmed from inadequate solutions to the
problem of deadlocks occurring between different
locking subsystems.
In the first case we tried to avoid deadlocks between the metadata
locking and table-level locking subsystems when upgrading a shared
metadata lock to an exclusive one.
Transactions holding the shared lock on the table and waiting for
some table-level lock used to be aborted too aggressively.
We also allowed ALTER TABLE to start in the presence of transactions
that modify the subject table. ALTER TABLE acquires a
TL_WRITE_ALLOW_READ lock at start, and that blocks all writes
against the table (naturally, we don't want any writes to be lost
when switching the old and the new table). The TL_WRITE_ALLOW_READ
lock, in turn, would block the started transactions on a thr_lock.c
lock, should they do more updates. This, again, led to the need
to abort such transactions.
The second bug occurred simply because we didn't have any
mechanism to detect deadlocks between the table-level locks
in thr_lock.c and row-level locks in InnoDB, other than
innodb_lock_wait_timeout.
This patch solves both these problems by moving lock conflicts
which are causing these deadlocks into the metadata locking
subsystem, thus making it possible to avoid or detect such
deadlocks inside MDL.
To do this we introduce new type-of-operation-aware metadata
locks, which allow the MDL subsystem to know not only the fact that
a transaction has used or is going to use some object, but also what
kind of operation it has carried out, or is going to carry out, on
the object.
This, along with the addition of a special kind of upgradable
metadata lock, allows ALTER TABLE to wait until all
transactions which have updated the table have gone away.
This solves the second issue.
Another special type of upgradable metadata lock is acquired
by LOCK TABLE WRITE. This second lock type allows us to solve the
first issue, since aborting table-level locks in the event of
DDL under LOCK TABLES also becomes unnecessary.
Below follows the list of incompatible changes introduced by
this patch:
- From now on, ALTER TABLE and CREATE/DROP TRIGGER SQL (i.e. those
statements that acquire a TL_WRITE_ALLOW_READ lock)
wait for all transactions which have *updated* the table to
complete (see the sketch after this list).
- From now on, LOCK TABLES ... WRITE, REPAIR/OPTIMIZE TABLE
(i.e. all statements which acquire a TL_WRITE table-level lock) wait
for all transactions which *updated or read from* the table
to complete.
As a consequence, the innodb_table_locks=0 option no longer applies
to LOCK TABLES ... WRITE.
- DROP DATABASE, DROP TABLE, RENAME TABLE no longer abort
statements or transactions which use tables being dropped or
renamed, and instead wait for these transactions to complete.
- Since LOCK TABLES WRITE now takes a special metadata lock,
not compatible with reads or writes against the subject table
and transaction-wide, the thr_lock.c deadlock avoidance algorithm
that used to ensure the absence of deadlocks between LOCK TABLES
WRITE and other statements is no longer sufficient, even for
MyISAM. The wait-for-graph-based deadlock detector of the MDL
subsystem may sometimes be necessary and is therefore involved.
This may lead to an ER_LOCK_DEADLOCK error produced for
multi-statement transactions even if these only use MyISAM:
  session 1:                           session 2:
  begin;
  update t1 ...
                                       lock table t2 write, t1 write;
                                       -- gets a lock on t2, blocks on t1
  update t2 ...
  (ER_LOCK_DEADLOCK)
- Finally, support of the LOW_PRIORITY option for LOCK TABLES ... WRITE
was abandoned.
LOCK TABLE ... LOW_PRIORITY WRITE from now on has the same
priority as the usual LOCK TABLE ... WRITE.
SELECT HIGH_PRIORITY no longer trumps LOCK TABLE ... WRITE in
the wait queue.
- We do not take upgradable metadata locks on implicitly
locked tables. So if one has, say, a view v1 that uses
table t1, and issues:
LOCK TABLE v1 WRITE;
FLUSH TABLE t1; -- (or just 'FLUSH TABLES'),
an error is produced.
In order to be able to perform DDL on a table under LOCK TABLES,
the table must be locked explicitly in the LOCK TABLES list.
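The sketch referred to in the first incompatible change above (t1 is a
hypothetical InnoDB table):

  -- session 1                         -- session 2
  BEGIN;
  UPDATE t1 SET a = 1 WHERE id = 1;
                                       ALTER TABLE t1 ADD INDEX (a);
                                       -- waits: session 1 has *updated*
                                       -- t1 in an open transaction
  COMMIT;
                                       -- the ALTER proceeds only now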
The root cause of the crash is that a TranxNode is freed before it is used.
A TranxNode is allocated and inserted into the active list each time
a log event is written and flushed into the binlog file.
The memory for a TranxNode is allocated with thd_alloc and is freed
at the end of the statement. The after_commit/after_rollback callback
was supposed to be called before the end of each statement and remove
the node from the active list. However, this assumption does not hold
in all cases (e.g. calling 'CREATE TEMPORARY TABLE myisam_t SELECT *
FROM innodb_t' in a transaction, or deleting all temporary tables
automatically when a session closes), and the memory allocated for a
TranxNode could be freed before it was removed from the active list.
The TranxNode pointer in the active list would then become a wild
pointer and cause the crash.
After this patch, a class called TranxNodeAllocator manages the memory
for allocating and freeing TranxNodes. It uses my_malloc to allocate
memory.
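A sketch of the reported triggering case (semisynchronous replication
enabled so the after_commit/after_rollback hooks are in use; innodb_t
is a hypothetical InnoDB table):

  BEGIN;
  -- the statement below writes and flushes a binlog event, so a
  -- TranxNode is allocated with thd_alloc and linked into the
  -- active list
  CREATE TEMPORARY TABLE myisam_t SELECT * FROM innodb_t;
  -- the node's memory is released at the end of the statement, but
  -- the callback that unlinks it has not run yet, leaving a wild
  -- pointer in the active list
  COMMIT;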
REVOKE/GRANT; ALTER EVENT.
The following statements support CURRENT_USER() where a user is needed.
DROP USER
RENAME USER CURRENT_USER() ...
GRANT ... TO CURRENT_USER()
REVOKE ... FROM CURRENT_USER()
ALTER DEFINER = CURRENT_USER() EVENT
However, when these statements were binlogged, CURRENT_USER() was
written simply as 'CURRENT_USER()'; it was not expanded to the real
user name. When the slave executed the log event, 'CURRENT_USER()' was
expanded to the user of the slave SQL thread, but the SQL thread's
user name is always NULL. This broke replication.
After this patch, all of the above statements are rewritten when they
are binlogged, with CURRENT_USER() expanded to the real user's name
and host.
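For example (the user name is hypothetical):

  -- executed on the master while logged in as 'admin'@'localhost':
  GRANT SELECT ON db1.* TO CURRENT_USER();
  -- before this patch the statement was binlogged verbatim; after it,
  -- the binlogged statement is roughly:
  GRANT SELECT ON db1.* TO 'admin'@'localhost';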
Fixed two problems:
1. test_if_order_by_key() was continuing on the primary key
as if it had a primary key suffix (as the secondary keys do).
This led to crashes in ORDER BY <pk>,<pk>.
Fixed by not treating the primary key as a secondary one
and not depending on it being clustered with a primary key.
2. The cost calculation was trying to read the records
per key when operating on ORDER BYs that order on all of the
secondary key + some of the primary key.
This led to crashes because of out-of-bounds array access.
Fixed by assuming we'll find one record per key in such cases.
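A minimal sketch of the first crashing pattern (table and data are
hypothetical):

  CREATE TABLE t1 (pk INT PRIMARY KEY, a INT, KEY k1 (a))
    ENGINE = InnoDB;
  INSERT INTO t1 VALUES (1, 1), (2, 2);
  SELECT * FROM t1 ORDER BY pk, pk;  -- crashed while the primary key
                                     -- was treated as if it had a
                                     -- primary key suffix, like a
                                     -- secondary key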
The problem was that the dbug facility was being used after the
per-thread dbug state had already been finalized. The problem was
present in a few functions which invoked decrement_handler_count,
which in turn invokes my_thread_end on Windows. In my_thread_end, the
per-thread dbug state is finalized. Any use after the state is
finalized ends up creating a new state.
The solution is to process the exit of a function before the
decrement_handler_count function is called.
The auto-inc unsafe warning makes sense even when it is just one
auto-inc table that could be involved via a trigger or a stored
function.
However, its content was not updated by the bug@45677 fixes and
continued to mention two tables, whereas those fixes refined the
semantics of replication of auto_increment in stored routines.
Fixed by updating the error message and renaming the error and the
internal unsafe-condition constants.
A documentation notice
======================
Inserting into an autoincrement column in a stored function or a trigger
is unsafe for replication.
Even with just one autoincrement column, if the routine is invoked
more than once, the slave is not guaranteed to execute the statement
graph the same way as the master.
And since it is impossible to estimate how many times a routine may be
invoked at the query pre-execution phase (see lock_tables), the
statement is marked pessimistically unsafe.
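A sketch of a statement that is marked unsafe for this reason (all
names hypothetical):

  CREATE TABLE t (id INT AUTO_INCREMENT PRIMARY KEY, v INT);
  CREATE FUNCTION f(x INT) RETURNS INT
  BEGIN
    INSERT INTO t (v) VALUES (x);  -- auto-increment insert inside
                                   -- a stored routine
    RETURN x;
  END;
  -- how many times f() runs cannot be estimated before execution,
  -- so the whole statement is pessimistically marked unsafe:
  SELECT f(a) FROM t2;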
column is used for ORDER BY
Problem: filesort isn't meant for zero-length sort data
(e.g. char(0)), which leads to a server crash.
Fix: disregard the sort order if the sort data record length is 0
(nothing to sort).
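A minimal reproduction sketch:

  CREATE TABLE t1 (c CHAR(0) NOT NULL);
  INSERT INTO t1 VALUES (''), ('');
  SELECT * FROM t1 ORDER BY c;  -- zero-length sort data; crashed
                                -- filesort before the fix, now the
                                -- sort order is simply disregarded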
The problem was that a DROP TRIGGER statement inside a stored
procedure could cause a crash in subsequent invocations. This
was due to the addition, on the first execution, of a temporary
table reference to the stored procedure query table list. In
a subsequent invocation, there would be an attempt to reinitialize
the temporary table reference, which by then was already gone.
The solution is to back up and reset the query table list each
time a trigger needs to be dropped. This ensures that any temporary
changes to the query table list are discarded. It is safe to
do so at this time as DROP TRIGGER is restricted from more
complicated scenarios (i.e., not allowed within stored functions,
etc.).
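A minimal sketch of the crashing pattern (names hypothetical):

  CREATE TABLE t1 (a INT);
  CREATE TRIGGER tr1 BEFORE INSERT ON t1 FOR EACH ROW SET @x = 1;
  CREATE PROCEDURE p1() DROP TRIGGER tr1;
  CALL p1();  -- first execution adds a temporary table reference
              -- to the procedure's query table list
  CREATE TRIGGER tr1 BEFORE INSERT ON t1 FOR EACH ROW SET @x = 1;
  CALL p1();  -- crashed before the fix: the stale reference was
              -- reinitialized although it was already gone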
In the case of 'CREATE VIEW', the subselect transformation does not
happen (see JOIN::prepare). During fix_fields(), Item_row may call
the is_null() method for its arguments, which leads to item
evaluation (of the wrong subselect in our case, as the transformation
did not happen before). This is_null() call does not make sense for
'CREATE VIEW'.
Note:
Only Item_row is affected because other items don't call is_null()
during fix_fields() for their arguments.
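A sketch of the kind of pattern described, with a subselect as an
Item_row argument inside CREATE VIEW (tables hypothetical):

  CREATE VIEW v1 AS
    SELECT 1 FROM t1
    WHERE (1, (SELECT a FROM t2 LIMIT 1)) = (1, 1);
  -- Item_row::fix_fields() used to call is_null() on the
  -- untransformed subselect here, evaluating it prematurely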
SHOW CREATE TABLE on a view (v1) that contains a function whose
statement uses another view (v2) could trigger an infinite loop
if the view referenced within the function causes a warning to
be raised while opening the said view (v2).
The problem was an infinite loop over the stack of internal error
handlers. The problem would be triggered if the stack contained
two or more handlers and the first two handlers didn't handle the
raised condition. In this case, the loop variable would always
point to the second handler in the stack.
The solution is to correct the loop variable assignment so that
the loop is able to iterate over all handlers in the stack.
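A hypothetical arrangement matching the description (names
illustrative; the condition is provoked here by dropping the base
table of v2):

  CREATE TABLE t1 (a INT);
  CREATE VIEW v2 AS SELECT a FROM t1;
  CREATE FUNCTION f1() RETURNS INT RETURN (SELECT COUNT(*) FROM v2);
  CREATE VIEW v1 AS SELECT f1() AS c;
  DROP TABLE t1;         -- opening v2 now raises a condition
  SHOW CREATE TABLE v1;  -- looped forever before the fix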
in multitable delete/subquery
SQL_BUFFER_RESULT should not have an effect on non-SELECT
statements according to our documentation.
Fixed by not passing it through to multi-table DELETE (similarly
to how it's done for multi-table UPDATE).
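For illustration (tables hypothetical; sql_buffer_result is the
corresponding session variable):

  SET SESSION sql_buffer_result = 1;  -- should affect only SELECT
  DELETE t1 FROM t1, t2 WHERE t1.a = t2.a;
  -- before the fix the option leaked into the multi-table DELETE
  -- path; it is now ignored there, as it already was for
  -- multi-table UPDATE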
Rename a method so as not to hide a base class method.
Reorder attribute initialization.
Remove an unused variable.
Rework code to silence a warning due to an assignment used as a truth
value.
The problem was that a failure to open a view wasn't being
properly handled. When opening a view with unknown definer,
the open procedure would be treated as successful and would
later crash when attempting to lock the view (which wasn't
opened to begin with).
The solution is to skip further processing when opening a
table if it fails with a fatal error.
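A minimal sketch (user and view names hypothetical):

  CREATE DEFINER = 'no_such_user'@'localhost' SQL SECURITY DEFINER
    VIEW v1 AS SELECT 1;
  -- the definer does not exist, so opening v1 fails; before the fix
  -- the failure was treated as success and a later attempt to lock
  -- the never-opened view crashed
  SELECT * FROM v1;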