The problem is that a SELECT on one thread is blocked by INSERT ... ON
DUPLICATE KEY UPDATE on another thread even when low_priority_updates is
activated.
The solution is to possibly downgrade the lock type to the setting of
low_priority_updates if the INSERT cannot be concurrent.
Binlogging of the statement with a side effect like a modified non-trans table did not happen.
The artifact involved all binloggable dml queries.
Fixed with changing the binlogging conditions all over the code to exploit thd->transaction.stmt.modified_non_trans_table
introduced by the patch for bug@27417.
Multi-delete case has own specific addressed by another bug@29136. Multi-update case has been addressed by bug#27716 and
patch and will need merging.
INSERT DELAYED on a replication slave was converted to regular INSERT,
whereas it should try concurrent INSERT first.
With this patch we try to convert delayed insert to concurrent insert on
a replication slave. If it is impossible for some reason, we fall back to
regular insert.
No test case for this fix. I do not see anything indicating this is
regression - we behave this way since Nov 2000.
Bug #27417 thd->no_trans_update.stmt lost value inside of SF-exec-stack
Once had been set the flag might later got reset inside of a stored routine
execution stack.
The reason was in that there was no check if a new statement started at time
of resetting.
The artifact affects most of binlogable DML queries. Notice, that multi-update
is wrapped up within
bug@27716 fix, multi-delete bug@29136.
Fixed with saving parent's statement flag of whether the statement modified
non-transactional table, and unioning (merging) the value with that was gained
in mysql_execute_command.
Resettling thd->no_trans_update members into thd->transaction.`member`;
Asserting code;
Effectively the following properties are held.
1. At the end of a substatement thd->transaction.stmt.modified_non_trans_table
reflects the fact if such a table got modified by the substatement.
That also respects THD::really_abort_on_warnin() requirements.
2. Eventually thd->transaction.stmt.modified_non_trans_table will be computed as
the union of the values of all invoked sub-statements.
That fixes this bug#27417;
Computing of thd->transaction.all.modified_non_trans_table is refined to base to
the stmt's value for all the case including insert .. select statement which
before the patch had an extra issue bug@28960.
Minor issues are covered with mysql_load, mysql_delete, and binloggin of insert in
to temp_table select.
The supplied test verifies limitely, mostly asserts. The ultimate testing is defered
for bug@13270, bug@23333.
the master but on the slave
MySQL can decide to "downgrade" a INSERT DELAYED statement
to normal insert in certain situations.
One such situation is when the slave is replaying a
replication feed.
However INSERT DELAYED is logged even if there're no updates
whereas the NORMAL INSERT is not logged in such cases.
Fixed by always logging a "downgraded" INSERT DELAYED: even
if there were no updates.
No test case, since the bug requires a stress case with 30 INSERT DELAYED
threads and 1 killer thread to repeat. The patch is verified
manually.
Review fixes.
The server that is running DELAYED inserts would deadlock itself
or crash under high load if some of the delayed threads were KILLed
in the meanwhile.
The fix is to change internal lock acquisition order of delayed inserts
subsystem and to ensure that
Delayed_insert::table_list::db does not point to volatile memory in some
cases.
For details, please see a comment for sql_insert.cc.
"Federated INSERT failures"
Federated does not correctly handle "INSERT...ON DUPLICATE KEY UPDATE"
However, implementing such support is not reasonably possible without
increasing complexity of the storage engine: checking that constraints
on remote server match local server and parsing error messages.
This patch causes 'ON DUPLICATE KEY' to fail with ER_DUP_KEY message
if a conflict occurs and not to fail silently.
The method select_insert::send_error does two things, it rolls back a statement
being executed and outputs an error message. But when a
nonexistent column is referenced, an error message has been published already and
there is no need to publish another.
Fixed by moving all functionality beyond publishing an error message into
select_insert::abort() and calling only that function.
When the INSERT .. ON DUPLICATE KEY UPDATE has to update a matched row but
the new data is the same as in the record then it returns as if
no rows were inserted or updated. Nevertheless the row is silently
updated. This leads to a situation when zero updated rows are reported
in the case when data has actually been changed.
Now the write_record function updates a row only if new data differs from
that in the record.
flag is set.
When the CLIENT_FOUND_ROWS flag is set then the server should return
found number of rows independently whether they were updated or not.
But this wasn't the case for the INSERT statement which always returned
number of rows that were actually changed thus providing wrong info to
the user.
Now the select_insert::send_eof method and the mysql_insert function
are sending the number of touched rows if the CLIENT_FOUND_ROWS flag is set.
Refining the tests since pb revealed the older version's fragality - the error from SF() due to killed
may be different on different env:s.
DBUG_ASSERT instead of assert.
The reason for the bug was that replaying of a query on slave could not be possible since its event
was recorded with the killed error. Due to the specific of handling INSERT, which per-row-while-loop is
unbreakable to killing, the query on transactional table should have not appeared in binlog unless
there was a call to a stored routine that got interrupted with killing (and then there must be an error
returned out of the loop).
The offered solution added the following rule for binlogging of INSERT that accounts the above
specifics:
For INSERT on transactional-table if the error was not set the only raised flag
is harmless and is ignored via masking out on time of creation of binlog event.
For both table types the combination of raised error and KILLED flag indicates that there
was potentially partial execution on master and consistency is under the question.
In that case the code continues to binlog an event with an appropriate killed error.
The fix relies on the specified behaviour of stored routine that must propagate the error
to the top level query handling if the thd->killed flag was raised in the routine execution.
The patch adds an arg with the default killed-status-unset value to Query_log_event::Query_log_event.
Bug#21483 "Server abort or deadlock on INSERT DELAYED with another
implicit insert"
Also fixes and adds test cases for bugs:
20497 "Trigger with INSERT DELAYED causes Error 1165"
21714 "Wrong NEW.value and server abort on INSERT DELAYED to a
table with a trigger".
Post-review fixes.
Problem:
In MySQL INSERT DELAYED is a way to pipe all inserts into a
given table through a dedicated thread. This is necessary for
simplistic storage engines like MyISAM, which do not have internal
concurrency control or threading and thus can not
achieve efficient INSERT throughput without support from SQL layer.
DELAYED INSERT works as follows:
For every distinct table, which can accept DELAYED inserts and has
pending data to insert, a dedicated thread is created to write data
to disk. All user connection threads that attempt to
delayed-insert into this table interact with the dedicated thread in
producer/consumer fashion: all records to-be inserted are pushed
into a queue of the dedicated thread, which fetches the records and
writes them.
In this design, client connection threads never open or lock
the delayed insert table.
This functionality was introduced in version 3.23 and does not take
into account existence of triggers, views, or pre-locking.
E.g. if INSERT DELAYED is called from a stored function, which,
in turn, is called from another stored function that uses the delayed
table, a deadlock can occur, because delayed locking by-passes
pre-locking. Besides:
* the delayed thread works directly with the subject table through
the storage engine API and does not invoke triggers
* even if it was patched to invoke triggers, if triggers,
in turn, used other tables, the delayed thread would
have to open and lock involved tables (use pre-locking).
* even if it was patched to use pre-locking, without deadlock
detection the delayed thread could easily lock out user
connection threads in case when the same table is used both
in a trigger and on the right side of the insert query:
the delayed thread would not release locks until all inserts
are complete, and user connection can not complete inserts
without having locks on the tables used on the right side of the
query.
Solution:
These considerations suggest two general alternatives for the
future of INSERT DELAYED:
* it is considered a full-fledged alternative to normal INSERT
* it is regarded as an optimisation that is only relevant
for simplistic engines.
Since we missed our chance to provide complete support of new
features when 5.0 was in development, the first alternative
currently renders infeasible.
However, even the second alternative, which is to detect
new features and convert DELAYED insert into a normal insert,
is not easy to implement.
The catch-22 is that we don't know if the subject table has triggers
or is a view before we open it, and we only open it in the
delayed thread. We don't know if the query involves pre-locking
until we have opened all tables, and we always first create
the delayed thread, and only then open the remaining tables.
This patch detects the problematic scenarios and converts
DELAYED INSERT to a normal INSERT using the following approach:
* if the statement is executed under pre-locking (e.g. from
within a stored function or trigger) or the right
side may require pre-locking, we detect the situation
before creating a delayed insert thread and convert the statement
to a conventional INSERT.
* if the subject table is a view or has triggers, we shutdown
the delayed thread and convert the statement to a conventional
INSERT.
Bug #20662 "Infinite loop in CREATE TABLE IF NOT EXISTS ... SELECT
with locked tables"
Bug #20903 "Crash when using CREATE TABLE .. SELECT and triggers"
Bug #24738 "CREATE TABLE ... SELECT is not isolated properly"
Bug #24508 "Inconsistent results of CREATE TABLE ... SELECT when
temporary table exists"
Deadlock occured when one tried to execute CREATE TABLE IF NOT
EXISTS ... SELECT statement under LOCK TABLES which held
read lock on target table.
Attempt to execute the same statement for already existing
target table with triggers caused server crashes.
Also concurrent execution of CREATE TABLE ... SELECT statement
and other statements involving target table suffered from
various races (some of which might've led to deadlocks).
Finally, attempt to execute CREATE TABLE ... SELECT in case
when a temporary table with same name was already present
led to the insertion of data into this temporary table and
creation of empty non-temporary table.
All above problems stemmed from the old implementation of CREATE
TABLE ... SELECT in which we created, opened and locked target
table without any special protection in a separate step and not
with the rest of tables used by this statement.
This underminded deadlock-avoidance approach used in server
and created window for races. It also excluded target table
from prelocking causing problems with trigger execution.
The patch solves these problems by implementing new approach to
handling of CREATE TABLE ... SELECT for base tables.
We try to open and lock table to be created at the same time as
the rest of tables used by this statement. If such table does not
exist at this moment we create and place in the table cache special
placeholder for it which prevents its creation or any other usage
by other threads.
We still use old approach for creation of temporary tables.
Also note that we decided to postpone introduction of some tests
for concurrent behaviour of CREATE TABLE ... SELECT till 5.1.
The main reason for this is absence in 5.0 ability to set @@debug
variable at runtime, which can be circumvented only by using several
test files with individual .opt files. Since the latter is likely
to slowdown test-suite unnecessary we chose not to push this tests
into 5.0, but run them manually for this version and later push
their optimized version into 5.1
Bug occurs in INSERT IGNORE ... SELECT ... ON DUPLICATE KEY UPDATE
statements, when SELECT returns duplicated values and UPDATE clause
tries to assign NULL values to NOT NULL fields.
NOTE: By current design MySQL server treats INSERT IGNORE ... ON
DUPLICATE statements as INSERT ... ON DUPLICATE with update of
duplicated records, but MySQL manual lacks this information.
After this fix such behaviour becomes legalized.
The write_record() function was returning error values even within
INSERT IGNORE, because ignore_errors parameter of
the fill_record_n_invoke_before_triggers() function call was
always set to FALSE. FALSE is replaced by info->ignore.
In certain cases AFTER UPDATE/DELETE triggers on NDB tables that referenced
subject table didn't see the results of operation which caused invocation
of those triggers. In other words AFTER trigger invoked as result of update
(or deletion) of particular row saw version of this row before update (or
deletion).
The problem occured because NDB handler in those cases postponed actual
update/delete operations to be able to perform them later as one batch.
This fix solves the problem by disabling this optimization for particular
operation if subject table has AFTER trigger for this operation defined.
To achieve this we introduce two new flags for handler::extra() method:
HA_EXTRA_DELETE_CANNOT_BATCH and HA_EXTRA_UPDATE_CANNOT_BATCH.
These are called if there exists AFTER DELETE/UPDATE triggers during a
statement that potentially can generate calls to delete_row()/update_row().
This includes multi_delete/multi_update statements as well as insert statements
that do delete/update as part of an ON DUPLICATE statement.
NO_AUTO_VALUE_ON_ZERO mode.
In the NO_AUTO_VALUE_ON_ZERO mode the table->auto_increment_field_not_null
variable is used to indicate that a non-NULL value was specified by the user
for an auto_increment column. When an INSERT .. ON DUPLICATE updates the
auto_increment field this variable is set to true and stays unchanged for the
next insert operation. This makes the next inserted row sometimes wrongly have
0 as the value of the auto_increment field.
Now the fill_record() function resets the table->auto_increment_field_not_null
variable before filling the record.
The table->auto_increment_field_not_null variable is also reset by the
open_table() function for a case if we missed some auto_increment_field_not_null
handling bug.
Now the table->auto_increment_field_not_null is reset at the end of the
mysql_load() function.
Reset the table->auto_increment_field_not_null variable after each
write_row() call in the copy_data_between_tables() function.
thd->options' OPTION_STATUS_NO_TRANS_UPDATE bit was not restored at the end of SF() invocation, where
SF() modified non-ta table.
As the result of this artifact it was not possible to detect whether there were any side-effects when
top-level query ends.
If the top level query table was not modified and the bit is lost there would be no binlogging.
Fixed with preserving the bit inside of thd->no_trans_update struct. The struct agregates two bool flags
telling whether the current query and the current transaction modified any non-ta table.
The flags stmt, all are dropped at the end of the query and the transaction.
context was used as an argument of GROUP_CONCAT.
Ensured correct setting of the depended_from field in references
generated for set functions aggregated in outer selects.
A wrong value of this field resulted in wrong maps returned by
used_tables() for these references.
Made sure that a temporary table field is added for any set function
aggregated in outer context when creation of a temporary table is
needed to execute the inner subquery.
To correctly decide which predicates can be evaluated with a given table
the optimizer must know the exact set of tables that a predicate depends
on. If that mask is too wide (refer to non-existing tables) the optimizer
can erroneously skip a predicate.
One such case of wrong table usage mask were the aggregate functions.
The have a all-1 mask (meaning depend on all tables, including non-existent
ones).
Fixed by making a real used_tables mask for the aggregates. The mask is
constructed in the following way :
1. OR the table dependency masks of all the arguments of the aggregate.
2. If all the arguments of the function are from the local name resolution
context and it is evaluated in the same name resolution
context where it is referenced all the tables from that name resolution
context are OR-ed to the dependency mask. This is to denote that an
aggregate function depends on the number of rows it processes.
3. Handle correctly the case of an aggregate function optimization (such that
the aggregate function can be pre-calculated and made a constant).
Made sure that an aggregate function is never a constant (unless subject of a
specific optimization and pre-calculation).
One other flaw was revealed and fixed in the process : references were
not calling the recalculation method for used_tables of their targets.
Removed wrong fix for the bug#27006.
The bug was added by the fix for the bug#19978 and fixed by Monty on 2007/02/21.
trigger.test, trigger.result:
Corrected test case for the bug#27006.
UPDATE if the row wasn't actually changed.
This bug was caused by fix for bug#19978. It causes AFTER UPDATE triggers
not firing if a row wasn't actually changed by the update part of the
INSERT .. ON DUPLICATE KEY UPDATE.
Now triggers are always fired if a row is touched by the INSERT ... ON
DUPLICATE KEY UPDATE.
INSERT uses query_id to verify what fields are
mentioned in the fields list of the INSERT command.
However the check for that is made after the
ON DUPLICATE KEY is processed. This causes all
the fields mentioned in ON DUPLICATE KEY to be
considered as mentioned in the fields list of
INSERT.
Moved the check up, right after processing the
fields list.
touched but not actually changed.
The LAST_INSERT_ID() is reset to 0 if no rows were inserted or changed.
This is the case when an INSERT ... ON DUPLICATE KEY UPDATE updates a row
with the same values as the row contains.
Now the LAST_INSERT_ID() values is reset to 0 only if there were no rows
successfully inserted or touched.
The new 'touched' field is added to the COPY_INFO structure. It holds the
number of rows that were touched no matter whether they were actually
changed or not.
When INSERT is done over a view the table being inserted into is
checked to be unique among all views tables. But if the view contains
self-joined table an error will be thrown even if all tables are used under
different aliases.
The unique_table() function now also checks tables' aliases when needed.