mariadb/sql/sql_insert.cc

4102 lines
135 KiB
C++
Raw Normal View History

Fix for BUG#11755168 '46895: test "outfile_loaddata" fails (reproducible)'. In sql_class.cc, 'row_count', of type 'ha_rows', was used as last argument for ER_TRUNCATED_WRONG_VALUE_FOR_FIELD which is "Incorrect %-.32s value: '%-.128s' for column '%.192s' at row %ld". So 'ha_rows' was used as 'long'. On SPARC32 Solaris builds, 'long' is 4 bytes and 'ha_rows' is 'longlong' i.e. 8 bytes. So the printf-like code was reading only the first 4 bytes. Because the CPU is big-endian, 1LL is 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x01 so the first four bytes yield 0. So the warning message had "row 0" instead of "row 1" in test outfile_loaddata.test: -Warning 1366 Incorrect string value: '\xE1\xE2\xF7' for column 'b' at row 1 +Warning 1366 Incorrect string value: '\xE1\xE2\xF7' for column 'b' at row 0 All error-messaging functions which internally invoke some printf-life function are potential candidate for such mistakes. One apparently easy way to catch such mistakes is to use ATTRIBUTE_FORMAT (from my_attribute.h). But this works only when call site has both: a) the format as a string literal b) the types of arguments. So: func(ER(ER_BLAH), 10); will silently not be checked, because ER(ER_BLAH) is not known at compile time (it is known at run-time, and depends on the chosen language). And func("%s", a va_list argument); has the same problem, as the *real* type of arguments is not known at this site at compile time (it's known in some caller). Moreover, func(ER(ER_BLAH)); though possibly correct (if ER(ER_BLAH) has no '%' markers), will not compile (gcc says "error: format not a string literal and no format arguments"). Consequences: 1) ATTRIBUTE_FORMAT is here added only to functions which in practice take "string literal" formats: "my_error_reporter" and "print_admin_msg". 2) it cannot be added to the other functions: my_error(), push_warning_printf(), Table_check_intact::report_error(), general_log_print(). To do a one-time check of functions listed in (2), the following "static code analysis" has been done: 1) replace my_error(ER_xxx, arguments for substitution in format) with the equivalent my_printf_error(ER_xxx,ER(ER_xxx), arguments for substitution in format), so that we have ER(ER_xxx) and the arguments *in the same call site* 2) add ATTRIBUTE_FORMAT to push_warning_printf(), Table_check_intact::report_error(), general_log_print() 3) replace ER(xxx) with the hard-coded English text found in errmsg.txt (like: ER(ER_UNKNOWN_ERROR) is replaced with "Unknown error"), so that a call site has the format as string literal 4) this way, ATTRIBUTE_FORMAT can effectively do its job 5) compile, fix errors detected by ATTRIBUTE_FORMAT 6) revert steps 1-2-3. The present patch has no compiler error when submitted again to the static code analysis above. It cannot catch all problems though: see Field::set_warning(), in which a call to push_warning_printf() has a variable error (thus, not replacable by a string literal); I checked set_warning() calls by hand though. See also WL 5883 for one proposal to avoid such bugs from appearing again in the future. The issues fixed in the patch are: a) mismatch in types (like 'int' passed to '%ld') b) more arguments passed than specified in the format. This patch resolves mismatches by changing the type/number of arguments, not by changing error messages of sql/share/errmsg.txt. The latter would be wrong, per the following old rule: errmsg.txt must be as stable as possible; no insertions or deletions of messages, no changes of type or number of printf-like format specifiers, are allowed, as long as the change impacts a message already released in a GA version. If this rule is not followed: - Connectors, which use error message numbers, will be confused (by insertions/deletions of messages) - using errmsg.sys of MySQL 5.1.n with mysqld of MySQL 5.1.(n+1) could produce wrong messages or crash; such usage can easily happen if installing 5.1.(n+1) while /etc/my.cnf still has --language=/path/to/5.1.n/xxx; or if copying mysqld from 5.1.(n+1) into a 5.1.n installation. When fixing b), I have verified that the superfluous arguments were not used in the format in the first 5.1 GA (5.1.30 'bteam@astra04-20081114162938-z8mctjp6st27uobm'). Had they been used, then passing them today, even if the message doesn't use them anymore, would have been necessary, as explained above.
2011-05-16 22:04:01 +02:00
/* Copyright (c) 2000, 2011, Oracle and/or its affiliates. All rights reserved.
2000-07-31 21:29:14 +02:00
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
2000-07-31 21:29:14 +02:00
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
2000-07-31 21:29:14 +02:00
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Fix for BUG#11755168 '46895: test "outfile_loaddata" fails (reproducible)'. In sql_class.cc, 'row_count', of type 'ha_rows', was used as last argument for ER_TRUNCATED_WRONG_VALUE_FOR_FIELD which is "Incorrect %-.32s value: '%-.128s' for column '%.192s' at row %ld". So 'ha_rows' was used as 'long'. On SPARC32 Solaris builds, 'long' is 4 bytes and 'ha_rows' is 'longlong' i.e. 8 bytes. So the printf-like code was reading only the first 4 bytes. Because the CPU is big-endian, 1LL is 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x01 so the first four bytes yield 0. So the warning message had "row 0" instead of "row 1" in test outfile_loaddata.test: -Warning 1366 Incorrect string value: '\xE1\xE2\xF7' for column 'b' at row 1 +Warning 1366 Incorrect string value: '\xE1\xE2\xF7' for column 'b' at row 0 All error-messaging functions which internally invoke some printf-life function are potential candidate for such mistakes. One apparently easy way to catch such mistakes is to use ATTRIBUTE_FORMAT (from my_attribute.h). But this works only when call site has both: a) the format as a string literal b) the types of arguments. So: func(ER(ER_BLAH), 10); will silently not be checked, because ER(ER_BLAH) is not known at compile time (it is known at run-time, and depends on the chosen language). And func("%s", a va_list argument); has the same problem, as the *real* type of arguments is not known at this site at compile time (it's known in some caller). Moreover, func(ER(ER_BLAH)); though possibly correct (if ER(ER_BLAH) has no '%' markers), will not compile (gcc says "error: format not a string literal and no format arguments"). Consequences: 1) ATTRIBUTE_FORMAT is here added only to functions which in practice take "string literal" formats: "my_error_reporter" and "print_admin_msg". 2) it cannot be added to the other functions: my_error(), push_warning_printf(), Table_check_intact::report_error(), general_log_print(). To do a one-time check of functions listed in (2), the following "static code analysis" has been done: 1) replace my_error(ER_xxx, arguments for substitution in format) with the equivalent my_printf_error(ER_xxx,ER(ER_xxx), arguments for substitution in format), so that we have ER(ER_xxx) and the arguments *in the same call site* 2) add ATTRIBUTE_FORMAT to push_warning_printf(), Table_check_intact::report_error(), general_log_print() 3) replace ER(xxx) with the hard-coded English text found in errmsg.txt (like: ER(ER_UNKNOWN_ERROR) is replaced with "Unknown error"), so that a call site has the format as string literal 4) this way, ATTRIBUTE_FORMAT can effectively do its job 5) compile, fix errors detected by ATTRIBUTE_FORMAT 6) revert steps 1-2-3. The present patch has no compiler error when submitted again to the static code analysis above. It cannot catch all problems though: see Field::set_warning(), in which a call to push_warning_printf() has a variable error (thus, not replacable by a string literal); I checked set_warning() calls by hand though. See also WL 5883 for one proposal to avoid such bugs from appearing again in the future. The issues fixed in the patch are: a) mismatch in types (like 'int' passed to '%ld') b) more arguments passed than specified in the format. This patch resolves mismatches by changing the type/number of arguments, not by changing error messages of sql/share/errmsg.txt. The latter would be wrong, per the following old rule: errmsg.txt must be as stable as possible; no insertions or deletions of messages, no changes of type or number of printf-like format specifiers, are allowed, as long as the change impacts a message already released in a GA version. If this rule is not followed: - Connectors, which use error message numbers, will be confused (by insertions/deletions of messages) - using errmsg.sys of MySQL 5.1.n with mysqld of MySQL 5.1.(n+1) could produce wrong messages or crash; such usage can easily happen if installing 5.1.(n+1) while /etc/my.cnf still has --language=/path/to/5.1.n/xxx; or if copying mysqld from 5.1.(n+1) into a 5.1.n installation. When fixing b), I have verified that the superfluous arguments were not used in the format in the first 5.1 GA (5.1.30 'bteam@astra04-20081114162938-z8mctjp6st27uobm'). Had they been used, then passing them today, even if the message doesn't use them anymore, would have been necessary, as explained above.
2011-05-16 22:04:01 +02:00
Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
2000-07-31 21:29:14 +02:00
/* Insert of records */
Bug#16218 - Crash on insert delayed Bug#17294 - INSERT DELAYED puting an \n before data Bug#16611 - INSERT DELAYED corrupts data Bug#13707 - Server crash with INSERT DELAYED on MyISAM table Combined as Bug#16218. INSERT DELAYED crashed in 5.0 on a table with a varchar that could be NULL and was created pre-5.0 (Bugs 16218 and 13707). INSERT DELAYED corrupted data in 5.0 on a table with varchar fields that was created pre-5.0 (Bugs 17294 and 16611). In case of INSERT DELAYED the open table is copied from the delayed insert thread to be able to create a record for the queue. When copying the fields, a method was used that did convert old varchar to new varchar fields and did not set up some pointers into the record buffer of the table. The field conversion was guilty for the misinterpretation of the record contents by the delayed insert thread. The wrong pointer setup was guilty for the crashes. For Bug 13707 (Server crash with INSERT DELAYED on MyISAM table) I fixed the above mentioned method to set up one of the pointers. For Bug 16218 I set up the other pointers too. But when looking at the corruptions I got aware that converting the field type was totally wrong for INSERT DELAYED. The copied table is used to create a record that is to be sent to the delayed insert thread. Of course it can interpret the record correctly only if all field types are the same in both table objects. So I revoked the fix for Bug 13707 and changed the new_field() method so that it can suppress conversions. No test case as this is a migration problem. One needs to create a table with 4.x and use it with 5.x. I added two test scripts to the bug report.
2006-06-26 20:57:18 +02:00
/*
INSERT DELAYED
Insert delayed is distinguished from a normal insert by lock_type ==
TL_WRITE_DELAYED instead of TL_WRITE. It first tries to open a
"delayed" table (delayed_get_table()), but falls back to
open_and_lock_tables() on error and proceeds as normal insert then.
Opening a "delayed" table means to find a delayed insert thread that
has the table open already. If this fails, a new thread is created and
waited for to open and lock the table.
If accessing the thread succeeded, in
Delayed_insert::get_local_table() the table of the thread is copied
Bug#16218 - Crash on insert delayed Bug#17294 - INSERT DELAYED puting an \n before data Bug#16611 - INSERT DELAYED corrupts data Bug#13707 - Server crash with INSERT DELAYED on MyISAM table Combined as Bug#16218. INSERT DELAYED crashed in 5.0 on a table with a varchar that could be NULL and was created pre-5.0 (Bugs 16218 and 13707). INSERT DELAYED corrupted data in 5.0 on a table with varchar fields that was created pre-5.0 (Bugs 17294 and 16611). In case of INSERT DELAYED the open table is copied from the delayed insert thread to be able to create a record for the queue. When copying the fields, a method was used that did convert old varchar to new varchar fields and did not set up some pointers into the record buffer of the table. The field conversion was guilty for the misinterpretation of the record contents by the delayed insert thread. The wrong pointer setup was guilty for the crashes. For Bug 13707 (Server crash with INSERT DELAYED on MyISAM table) I fixed the above mentioned method to set up one of the pointers. For Bug 16218 I set up the other pointers too. But when looking at the corruptions I got aware that converting the field type was totally wrong for INSERT DELAYED. The copied table is used to create a record that is to be sent to the delayed insert thread. Of course it can interpret the record correctly only if all field types are the same in both table objects. So I revoked the fix for Bug 13707 and changed the new_field() method so that it can suppress conversions. No test case as this is a migration problem. One needs to create a table with 4.x and use it with 5.x. I added two test scripts to the bug report.
2006-06-26 20:57:18 +02:00
for local use. A copy is required because the normal insert logic
works on a target table, but the other threads table object must not
be used. The insert logic uses the record buffer to create a record.
And the delayed insert thread uses the record buffer to pass the
record to the table handler. So there must be different objects. Also
the copied table is not included in the lock, so that the statement
can proceed even if the real table cannot be accessed at this moment.
Copying a table object is not a trivial operation. Besides the TABLE
object there are the field pointer array, the field objects and the
record buffer. After copying the field objects, their pointers into
the record must be "moved" to point to the new record buffer.
After this setup the normal insert logic is used. Only that for
delayed inserts write_delayed() is called instead of write_record().
It inserts the rows into a queue and signals the delayed insert thread
instead of writing directly to the table.
The delayed insert thread awakes from the signal. It locks the table,
inserts the rows from the queue, unlocks the table, and waits for the
next signal. It does normally live until a FLUSH TABLES or SHUTDOWN.
*/
#include "my_global.h" /* NO_EMBEDDED_ACCESS_CHECKS */
#include "sql_priv.h"
#include "unireg.h" // REQUIRED: for other includes
#include "sql_insert.h"
#include "sql_update.h" // compare_record
#include "sql_base.h" // close_thread_tables
#include "sql_cache.h" // query_cache_*
#include "key.h" // key_copy
#include "lock.h" // mysql_unlock_tables
#include "sp_head.h"
#include "sql_view.h" // check_key_in_view, insert_view_fields
#include "sql_table.h" // mysql_create_table_no_lock
#include "sql_acl.h" // *_ACL, check_grant_all_columns
#include "sql_trigger.h"
#include "sql_select.h"
#include "sql_show.h"
Manual merge from 5.0-rpl, of fixes for: 1) BUG#25507 "multi-row insert delayed + auto increment causes duplicate key entries on slave" (two concurrrent connections doing multi-row INSERT DELAYED to insert into an auto_increment column, caused replication slave to stop with "duplicate key error" (and binlog was wrong), and BUG#26116 "If multi-row INSERT DELAYED has errors, statement-based binlogging breaks" (the binlog was not accounting for all rows inserted, or slave could stop). The fix is that: in statement-based binlogging, a multi-row INSERT DELAYED is silently converted to a non-delayed INSERT. This is supposed to not affect many 5.1 users as in 5.1, the default binlog format is "mixed", which does not have the bug (the bug is only with binlog_format=STATEMENT). We should document how the system delayed_insert thread decides of its binlog format (which is not modified by this patch): this decision is taken when the thread is created and holds until it is terminated (is not affected by any later change via SET GLOBAL BINLOG_FORMAT). It is also not affected by the binlog format of the connection which issues INSERT DELAYED (this binlog format does not affect how the row will be binlogged). If one wants to change the binlog format of its server with SET GLOBAL BINLOG_FORMAT, it should do FLUSH TABLES to be sure all delayed_insert threads terminate and thus new threads are created, taking into account the new format. 2) BUG#24432 "INSERT... ON DUPLICATE KEY UPDATE skips auto_increment values". When in an INSERT ON DUPLICATE KEY UPDATE, using an autoincrement column, we inserted some autogenerated values and also updated some rows, some autogenerated values were not used (for example, even if 10 was the largest autoinc value in the table at the start of the statement, 12 could be the first autogenerated value inserted by the statement, instead of 11). One autogenerated value was lost per updated row. Led to exhausting the range of the autoincrement column faster. Bug introduced by fix of BUG#20188; present since 5.0.24 and 5.1.12. This bug breaks replication from a pre-5.0.24/pre-5.1.12 master. But the present bugfix, as it makes INSERT ON DUP KEY UPDATE behave like pre-5.0.24/pre-5.1.12, breaks replication from a [5.0.24,5.0.34]/[5.1.12,5.1.15] master to a fixed (5.0.36/5.1.16) slave! To warn users against this when they upgrade their slave, as agreed with the support team, we add code for a fixed slave to detect that it is connected to a buggy master in a situation (INSERT ON DUP KEY UPDATE into autoinc column) likely to break replication, in which case it cannot replicate so stops and prints a message to the slave's error log and to SHOW SLAVE STATUS. For 5.0.36->[5.0.24,5.0.34] replication or 5.1.16->[5.1.12,5.1.15] replication we cannot warn as master does not know the slave's version (but we always recommended to users to have slave at least as new as master). As agreed with support, I have asked for an alert to be put into the MySQL Network Monitoring and Advisory Service. 3) note that I'll re-enable rpl_insert_id as soon as 5.1-rpl gets the changes from the main 5.1.
2007-02-15 20:28:58 +01:00
#include "slave.h"
#include "sql_parse.h" // end_active_trans
#include "rpl_mi.h"
#include "transaction.h"
#include "sql_audit.h"
2000-07-31 21:29:14 +02:00
#ifndef EMBEDDED_LIBRARY
Patch that refactors global read lock implementation and fixes bug #57006 "Deadlock between HANDLER and FLUSH TABLES WITH READ LOCK" and bug #54673 "It takes too long to get readlock for 'FLUSH TABLES WITH READ LOCK'". The first bug manifested itself as a deadlock which occurred when a connection, which had some table open through HANDLER statement, tried to update some data through DML statement while another connection tried to execute FLUSH TABLES WITH READ LOCK concurrently. What happened was that FTWRL in the second connection managed to perform first step of GRL acquisition and thus blocked all upcoming DML. After that it started to wait for table open through HANDLER statement to be flushed. When the first connection tried to execute DML it has started to wait for GRL/the second connection creating deadlock. The second bug manifested itself as starvation of FLUSH TABLES WITH READ LOCK statements in cases when there was a constant stream of concurrent DML statements (in two or more connections). This has happened because requests for protection against GRL which were acquired by DML statements were ignoring presence of pending GRL and thus the latter was starved. This patch solves both these problems by re-implementing GRL using metadata locks. Similar to the old implementation acquisition of GRL in new implementation is two-step. During the first step we block all concurrent DML and DDL statements by acquiring global S metadata lock (each DML and DDL statement acquires global IX lock for its duration). During the second step we block commits by acquiring global S lock in COMMIT namespace (commit code acquires global IX lock in this namespace). Note that unlike in old implementation acquisition of protection against GRL in DML and DDL is semi-automatic. We assume that any statement which should be blocked by GRL will either open and acquires write-lock on tables or acquires metadata locks on objects it is going to modify. For any such statement global IX metadata lock is automatically acquired for its duration. The first problem is solved because waits for GRL become visible to deadlock detector in metadata locking subsystem and thus deadlocks like one in the first bug become impossible. The second problem is solved because global S locks which are used for GRL implementation are given preference over IX locks which are acquired by concurrent DML (and we can switch to fair scheduling in future if needed). Important change: FTWRL/GRL no longer blocks DML and DDL on temporary tables. Before this patch behavior was not consistent in this respect: in some cases DML/DDL statements on temporary tables were blocked while in others they were not. Since the main use cases for FTWRL are various forms of backups and temporary tables are not preserved during backups we have opted for consistently allowing DML/DDL on temporary tables during FTWRL/GRL. Important change: This patch changes thread state names which are used when DML/DDL of FTWRL is waiting for global read lock. It is now either "Waiting for global read lock" or "Waiting for commit lock" depending on the stage on which FTWRL is. Incompatible change: To solve deadlock in events code which was exposed by this patch we have to replace LOCK_event_metadata mutex with metadata locks on events. As result we have to prohibit DDL on events under LOCK TABLES. This patch also adds extensive test coverage for interaction of DML/DDL and FTWRL. Performance of new and old global read lock implementations in sysbench tests were compared. There were no significant difference between new and old implementations.
2010-11-11 18:11:05 +01:00
static bool delayed_get_table(THD *thd, MDL_request *grl_protection_request,
TABLE_LIST *table_list);
2007-05-16 08:32:47 +02:00
static int write_delayed(THD *thd, TABLE *table, enum_duplicates duplic,
LEX_STRING query, bool ignore, bool log_on);
2000-07-31 21:29:14 +02:00
static void end_delayed_insert(THD *thd);
pthread_handler_t handle_delayed_insert(void *arg);
2000-07-31 21:29:14 +02:00
static void unlink_blobs(register TABLE *table);
#endif
static bool check_view_insertability(THD *thd, TABLE_LIST *view);
2000-07-31 21:29:14 +02:00
/* Define to force use of my_malloc() if the allocated memory block is big */
#ifndef HAVE_ALLOCA
#define my_safe_alloca(size, min_length) my_alloca(size)
#define my_safe_afree(ptr, size, min_length) my_afree(ptr)
#else
#define my_safe_alloca(size, min_length) ((size <= min_length) ? my_alloca(size) : my_malloc(size,MYF(0)))
Bug#34043: Server loops excessively in _checkchunk() when safemalloc is enabled Essentially, the problem is that safemalloc is excruciatingly slow as it checks all allocated blocks for overrun at each memory management primitive, yielding a almost exponential slowdown for the memory management functions (malloc, realloc, free). The overrun check basically consists of verifying some bytes of a block for certain magic keys, which catches some simple forms of overrun. Another minor problem is violation of aliasing rules and that its own internal list of blocks is prone to corruption. Another issue with safemalloc is rather the maintenance cost as the tool has a significant impact on the server code. Given the magnitude of memory debuggers available nowadays, especially those that are provided with the platform malloc implementation, maintenance of a in-house and largely obsolete memory debugger becomes a burden that is not worth the effort due to its slowness and lack of support for detecting more common forms of heap corruption. Since there are third-party tools that can provide the same functionality at a lower or comparable performance cost, the solution is to simply remove safemalloc. Third-party tools can provide the same functionality at a lower or comparable performance cost. The removal of safemalloc also allows a simplification of the malloc wrappers, removing quite a bit of kludge: redefinition of my_malloc, my_free and the removal of the unused second argument of my_free. Since free() always check whether the supplied pointer is null, redudant checks are also removed. Also, this patch adds unit testing for my_malloc and moves my_realloc implementation into the same file as the other memory allocation primitives.
2010-07-08 23:20:08 +02:00
#define my_safe_afree(ptr, size, min_length) if (size > min_length) my_free(ptr)
2000-07-31 21:29:14 +02:00
#endif
/*
Check that insert/update fields are from the same single table of a view.
SYNOPSIS
check_view_single_update()
fields The insert/update fields to be checked.
view The view for insert.
map [in/out] The insert table map.
DESCRIPTION
This function is called in 2 cases:
1. to check insert fields. In this case *map will be set to 0.
Insert fields are checked to be all from the same single underlying
table of the given view. Otherwise the error is thrown. Found table
map is returned in the map parameter.
2. to check update fields of the ON DUPLICATE KEY UPDATE clause.
In this case *map contains table_map found on the previous call of
the function to check insert fields. Update fields are checked to be
from the same table as the insert fields.
RETURN
0 OK
1 Error
*/
bool check_view_single_update(List<Item> &fields, List<Item> *values,
TABLE_LIST *view, table_map *map)
{
/* it is join view => we need to find the table for update */
List_iterator_fast<Item> it(fields);
Item *item;
TABLE_LIST *tbl= 0; // reset for call to check_single_table()
table_map tables= 0;
while ((item= it++))
tables|= item->used_tables();
if (values)
{
it.init(*values);
while ((item= it++))
tables|= item->used_tables();
}
/* Convert to real table bits */
tables&= ~PSEUDO_TABLE_BITS;
/* Check found map against provided map */
if (*map)
{
if (tables != *map)
goto error;
return FALSE;
}
if (view->check_single_table(&tbl, tables, view) || tbl == 0)
goto error;
view->table= tbl->table;
*map= tables;
return FALSE;
error:
my_error(ER_VIEW_MULTIUPDATE, MYF(0),
view->view_db.str, view->view_name.str);
return TRUE;
}
2000-07-31 21:29:14 +02:00
/*
Check if insert fields are correct.
SYNOPSIS
check_insert_fields()
thd The current thread.
table The table for insert.
fields The insert fields.
values The insert values.
2005-04-21 10:29:05 +02:00
check_unique If duplicate values should be rejected.
NOTE
Clears TIMESTAMP_AUTO_SET_ON_INSERT from table->timestamp_field_type
or leaves it as is, depending on if timestamp should be updated or
not.
RETURN
0 OK
-1 Error
2000-07-31 21:29:14 +02:00
*/
2005-04-21 10:29:05 +02:00
static int check_insert_fields(THD *thd, TABLE_LIST *table_list,
List<Item> &fields, List<Item> &values,
bool check_unique,
bool fields_and_values_from_different_maps,
table_map *map)
2000-07-31 21:29:14 +02:00
{
2004-07-16 00:15:55 +02:00
TABLE *table= table_list->table;
if (!table_list->updatable)
{
my_error(ER_NON_INSERTABLE_TABLE, MYF(0), table_list->alias, "INSERT");
return -1;
}
2000-07-31 21:29:14 +02:00
if (fields.elements == 0 && values.elements != 0)
{
if (!table)
{
my_error(ER_VIEW_NO_INSERT_FIELD_LIST, MYF(0),
table_list->view_db.str, table_list->view_name.str);
return -1;
}
if (values.elements != table->s->fields)
2000-07-31 21:29:14 +02:00
{
2005-04-30 08:46:08 +02:00
my_error(ER_WRONG_VALUE_COUNT_ON_ROW, MYF(0), 1L);
2000-07-31 21:29:14 +02:00
return -1;
}
#ifndef NO_EMBEDDED_ACCESS_CHECKS
Field_iterator_table_ref field_it;
field_it.set(table_list);
if (check_grant_all_columns(thd, INSERT_ACL, &field_it))
return -1;
#endif
clear_timestamp_auto_bits(table->timestamp_field_type,
TIMESTAMP_AUTO_SET_ON_INSERT);
/*
No fields are provided so all fields must be provided in the values.
Thus we set all bits in the write set.
*/
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
bitmap_set_all(table->write_set);
2000-07-31 21:29:14 +02:00
}
else
{ // Part field list
SELECT_LEX *select_lex= &thd->lex->select_lex;
Name_resolution_context *context= &select_lex->context;
Name_resolution_context_state ctx_state;
2004-07-23 08:20:58 +02:00
int res;
2000-07-31 21:29:14 +02:00
if (fields.elements != values.elements)
{
2005-04-30 08:46:08 +02:00
my_error(ER_WRONG_VALUE_COUNT_ON_ROW, MYF(0), 1L);
2000-07-31 21:29:14 +02:00
return -1;
}
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
thd->dup_field= 0;
select_lex->no_wrap_view_item= TRUE;
/* Save the state of the current name resolution context. */
ctx_state.save_state(context, table_list);
/*
Perform name resolution only in the first table - 'table_list',
which is the table that is inserted into.
*/
table_list->next_local= 0;
context->resolve_in_table_list_only(table_list);
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
res= setup_fields(thd, 0, fields, MARK_COLUMNS_WRITE, 0, 0);
/* Restore the current context. */
ctx_state.restore_state(context, table_list);
thd->lex->select_lex.no_wrap_view_item= FALSE;
2004-07-23 08:20:58 +02:00
if (res)
2000-07-31 21:29:14 +02:00
return -1;
if (table_list->effective_algorithm == VIEW_ALGORITHM_MERGE)
{
if (check_view_single_update(fields,
fields_and_values_from_different_maps ?
(List<Item>*) 0 : &values,
table_list, map))
return -1;
table= table_list->table;
}
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
if (check_unique && thd->dup_field)
2000-07-31 21:29:14 +02:00
{
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
my_error(ER_FIELD_SPECIFIED_TWICE, MYF(0), thd->dup_field->field_name);
2000-07-31 21:29:14 +02:00
return -1;
}
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
if (table->timestamp_field) // Don't automaticly set timestamp if used
{
if (bitmap_is_set(table->write_set,
table->timestamp_field->field_index))
clear_timestamp_auto_bits(table->timestamp_field_type,
TIMESTAMP_AUTO_SET_ON_INSERT);
else
{
bitmap_set_bit(table->write_set,
table->timestamp_field->field_index);
}
}
2000-07-31 21:29:14 +02:00
}
// For the values we need select_priv
#ifndef NO_EMBEDDED_ACCESS_CHECKS
table->grant.want_privilege= (SELECT_ACL & ~table->grant.privilege);
#endif
if (check_key_in_view(thd, table_list) ||
(table_list->view &&
check_view_insertability(thd, table_list)))
{
my_error(ER_NON_INSERTABLE_TABLE, MYF(0), table_list->alias, "INSERT");
return -1;
}
2000-07-31 21:29:14 +02:00
return 0;
}
/*
Check update fields for the timestamp field.
SYNOPSIS
check_update_fields()
thd The current thread.
insert_table_list The insert table list.
table The table for update.
update_fields The update fields.
NOTE
If the update fields include the timestamp field,
remove TIMESTAMP_AUTO_SET_ON_UPDATE from table->timestamp_field_type.
RETURN
0 OK
-1 Error
*/
2005-04-21 10:29:05 +02:00
static int check_update_fields(THD *thd, TABLE_LIST *insert_table_list,
List<Item> &update_fields,
List<Item> &update_values, table_map *map)
{
2005-04-21 10:29:05 +02:00
TABLE *table= insert_table_list->table;
2009-06-17 16:56:44 +02:00
my_bool timestamp_mark= 0;
if (table->timestamp_field)
{
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
/*
Unmark the timestamp field so that we can check if this is modified
by update_fields
*/
timestamp_mark= bitmap_test_and_clear(table->write_set,
table->timestamp_field->field_index);
}
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
/* Check the fields we are going to modify */
if (setup_fields(thd, 0, update_fields, MARK_COLUMNS_WRITE, 0, 0))
return -1;
if (insert_table_list->effective_algorithm == VIEW_ALGORITHM_MERGE &&
check_view_single_update(update_fields, &update_values,
insert_table_list, map))
return -1;
if (table->timestamp_field)
{
/* Don't set timestamp column if this is modified. */
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
if (bitmap_is_set(table->write_set,
table->timestamp_field->field_index))
clear_timestamp_auto_bits(table->timestamp_field_type,
TIMESTAMP_AUTO_SET_ON_UPDATE);
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
if (timestamp_mark)
bitmap_set_bit(table->write_set,
table->timestamp_field->field_index);
}
return 0;
}
2007-04-04 16:58:25 +02:00
/*
Prepare triggers for INSERT-like statement.
SYNOPSIS
prepare_triggers_for_insert_stmt()
table Table to which insert will happen
NOTE
Prepare triggers for INSERT-like statement by marking fields
used by triggers and inform handlers that batching of UPDATE/DELETE
cannot be done if there are BEFORE UPDATE/DELETE triggers.
*/
2007-04-04 16:58:25 +02:00
void prepare_triggers_for_insert_stmt(TABLE *table)
{
if (table->triggers)
{
if (table->triggers->has_triggers(TRG_EVENT_DELETE,
TRG_ACTION_AFTER))
{
/*
The table has AFTER DELETE triggers that might access to
subject table and therefore might need delete to be done
immediately. So we turn-off the batching.
*/
(void) table->file->extra(HA_EXTRA_DELETE_CANNOT_BATCH);
}
if (table->triggers->has_triggers(TRG_EVENT_UPDATE,
TRG_ACTION_AFTER))
{
/*
The table has AFTER UPDATE triggers that might access to subject
table and therefore might need update to be done immediately.
So we turn-off the batching.
*/
(void) table->file->extra(HA_EXTRA_UPDATE_CANNOT_BATCH);
}
}
2007-04-04 16:58:25 +02:00
table->mark_columns_needed_for_insert();
}
/**
Upgrade table-level lock of INSERT statement to TL_WRITE if
a more concurrent lock is infeasible for some reason. This is
necessary for engines without internal locking support (MyISAM).
An engine with internal locking implementation might later
downgrade the lock in handler::store_lock() method.
*/
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
2007-05-16 07:51:05 +02:00
static
void upgrade_lock_type(THD *thd, thr_lock_type *lock_type,
enum_duplicates duplic)
{
if (duplic == DUP_UPDATE ||
(duplic == DUP_REPLACE && *lock_type == TL_WRITE_CONCURRENT_INSERT))
{
*lock_type= TL_WRITE_DEFAULT;
return;
}
if (*lock_type == TL_WRITE_DELAYED)
{
/*
We do not use delayed threads if:
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
2007-05-16 07:51:05 +02:00
- we're running in the safe mode or skip-new mode -- the
feature is disabled in these modes
- we're executing this statement on a replication slave --
we need to ensure serial execution of queries on the
slave
- it is INSERT .. ON DUPLICATE KEY UPDATE - in this case the
insert cannot be concurrent
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
2007-05-16 07:51:05 +02:00
- this statement is directly or indirectly invoked from
a stored function or trigger (under pre-locking) - to
avoid deadlocks, since INSERT DELAYED involves a lock
upgrade (TL_WRITE_DELAYED -> TL_WRITE) which we should not
attempt while keeping other table level locks.
- this statement itself may require pre-locking.
We should upgrade the lock even though in most cases
delayed functionality may work. Unfortunately, we can't
easily identify whether the subject table is not used in
the statement indirectly via a stored function or trigger:
if it is used, that will lead to a deadlock between the
client connection and the delayed thread.
*/
if (specialflag & (SPECIAL_NO_NEW_FUNC | SPECIAL_SAFE_MODE) ||
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
2007-05-16 07:51:05 +02:00
thd->variables.max_insert_delayed_threads == 0 ||
thd->locked_tables_mode > LTM_LOCK_TABLES ||
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
2007-05-16 07:51:05 +02:00
thd->lex->uses_stored_routines())
{
*lock_type= TL_WRITE;
return;
}
if (thd->slave_thread)
{
/* Try concurrent insert */
*lock_type= (duplic == DUP_UPDATE || duplic == DUP_REPLACE) ?
TL_WRITE : TL_WRITE_CONCURRENT_INSERT;
return;
}
bool log_on= (thd->variables.option_bits & OPTION_BIN_LOG);
if (global_system_variables.binlog_format == BINLOG_FORMAT_STMT &&
log_on && mysql_bin_log.is_open())
{
/*
Statement-based binary logging does not work in this case, because:
a) two concurrent statements may have their rows intermixed in the
queue, leading to autoincrement replication problems on slave (because
the values generated used for one statement don't depend only on the
value generated for the first row of this statement, so are not
replicable)
b) if first row of the statement has an error the full statement is
not binlogged, while next rows of the statement may be inserted.
c) if first row succeeds, statement is binlogged immediately with a
zero error code (i.e. "no error"), if then second row fails, query
will fail on slave too and slave will stop (wrongly believing that the
master got no error).
So we fallback to non-delayed INSERT.
Note that to be fully correct, we should test the "binlog format which
the delayed thread is going to use for this row". But in the common case
where the global binlog format is not changed and the session binlog
format may be changed, that is equal to the global binlog format.
We test it without mutex for speed reasons (condition rarely true), and
in the common case (global not changed) it is as good as without mutex;
if global value is changed, anyway there is uncertainty as the delayed
thread may be old and use the before-the-change value.
*/
*lock_type= TL_WRITE;
}
}
}
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
2007-05-16 07:51:05 +02:00
/**
Find or create a delayed insert thread for the first table in
the table list, then open and lock the remaining tables.
If a table can not be used with insert delayed, upgrade the lock
and open and lock all tables using the standard mechanism.
@param thd thread context
@param table_list list of "descriptors" for tables referenced
directly in statement SQL text.
The first element in the list corresponds to
the destination table for inserts, remaining
tables, if any, are usually tables referenced
by sub-queries in the right part of the
INSERT.
@return Status of the operation. In case of success 'table'
member of every table_list element points to an instance of
class TABLE.
@sa open_and_lock_tables for more information about MySQL table
level locking
*/
static
bool open_and_lock_for_insert_delayed(THD *thd, TABLE_LIST *table_list)
{
Patch that refactors global read lock implementation and fixes bug #57006 "Deadlock between HANDLER and FLUSH TABLES WITH READ LOCK" and bug #54673 "It takes too long to get readlock for 'FLUSH TABLES WITH READ LOCK'". The first bug manifested itself as a deadlock which occurred when a connection, which had some table open through HANDLER statement, tried to update some data through DML statement while another connection tried to execute FLUSH TABLES WITH READ LOCK concurrently. What happened was that FTWRL in the second connection managed to perform first step of GRL acquisition and thus blocked all upcoming DML. After that it started to wait for table open through HANDLER statement to be flushed. When the first connection tried to execute DML it has started to wait for GRL/the second connection creating deadlock. The second bug manifested itself as starvation of FLUSH TABLES WITH READ LOCK statements in cases when there was a constant stream of concurrent DML statements (in two or more connections). This has happened because requests for protection against GRL which were acquired by DML statements were ignoring presence of pending GRL and thus the latter was starved. This patch solves both these problems by re-implementing GRL using metadata locks. Similar to the old implementation acquisition of GRL in new implementation is two-step. During the first step we block all concurrent DML and DDL statements by acquiring global S metadata lock (each DML and DDL statement acquires global IX lock for its duration). During the second step we block commits by acquiring global S lock in COMMIT namespace (commit code acquires global IX lock in this namespace). Note that unlike in old implementation acquisition of protection against GRL in DML and DDL is semi-automatic. We assume that any statement which should be blocked by GRL will either open and acquires write-lock on tables or acquires metadata locks on objects it is going to modify. For any such statement global IX metadata lock is automatically acquired for its duration. The first problem is solved because waits for GRL become visible to deadlock detector in metadata locking subsystem and thus deadlocks like one in the first bug become impossible. The second problem is solved because global S locks which are used for GRL implementation are given preference over IX locks which are acquired by concurrent DML (and we can switch to fair scheduling in future if needed). Important change: FTWRL/GRL no longer blocks DML and DDL on temporary tables. Before this patch behavior was not consistent in this respect: in some cases DML/DDL statements on temporary tables were blocked while in others they were not. Since the main use cases for FTWRL are various forms of backups and temporary tables are not preserved during backups we have opted for consistently allowing DML/DDL on temporary tables during FTWRL/GRL. Important change: This patch changes thread state names which are used when DML/DDL of FTWRL is waiting for global read lock. It is now either "Waiting for global read lock" or "Waiting for commit lock" depending on the stage on which FTWRL is. Incompatible change: To solve deadlock in events code which was exposed by this patch we have to replace LOCK_event_metadata mutex with metadata locks on events. As result we have to prohibit DDL on events under LOCK TABLES. This patch also adds extensive test coverage for interaction of DML/DDL and FTWRL. Performance of new and old global read lock implementations in sysbench tests were compared. There were no significant difference between new and old implementations.
2010-11-11 18:11:05 +01:00
MDL_request protection_request;
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
2007-05-16 07:51:05 +02:00
DBUG_ENTER("open_and_lock_for_insert_delayed");
#ifndef EMBEDDED_LIBRARY
/*
In order for the deadlock detector to be able to find any deadlocks
Patch that refactors global read lock implementation and fixes bug #57006 "Deadlock between HANDLER and FLUSH TABLES WITH READ LOCK" and bug #54673 "It takes too long to get readlock for 'FLUSH TABLES WITH READ LOCK'". The first bug manifested itself as a deadlock which occurred when a connection, which had some table open through HANDLER statement, tried to update some data through DML statement while another connection tried to execute FLUSH TABLES WITH READ LOCK concurrently. What happened was that FTWRL in the second connection managed to perform first step of GRL acquisition and thus blocked all upcoming DML. After that it started to wait for table open through HANDLER statement to be flushed. When the first connection tried to execute DML it has started to wait for GRL/the second connection creating deadlock. The second bug manifested itself as starvation of FLUSH TABLES WITH READ LOCK statements in cases when there was a constant stream of concurrent DML statements (in two or more connections). This has happened because requests for protection against GRL which were acquired by DML statements were ignoring presence of pending GRL and thus the latter was starved. This patch solves both these problems by re-implementing GRL using metadata locks. Similar to the old implementation acquisition of GRL in new implementation is two-step. During the first step we block all concurrent DML and DDL statements by acquiring global S metadata lock (each DML and DDL statement acquires global IX lock for its duration). During the second step we block commits by acquiring global S lock in COMMIT namespace (commit code acquires global IX lock in this namespace). Note that unlike in old implementation acquisition of protection against GRL in DML and DDL is semi-automatic. We assume that any statement which should be blocked by GRL will either open and acquires write-lock on tables or acquires metadata locks on objects it is going to modify. For any such statement global IX metadata lock is automatically acquired for its duration. The first problem is solved because waits for GRL become visible to deadlock detector in metadata locking subsystem and thus deadlocks like one in the first bug become impossible. The second problem is solved because global S locks which are used for GRL implementation are given preference over IX locks which are acquired by concurrent DML (and we can switch to fair scheduling in future if needed). Important change: FTWRL/GRL no longer blocks DML and DDL on temporary tables. Before this patch behavior was not consistent in this respect: in some cases DML/DDL statements on temporary tables were blocked while in others they were not. Since the main use cases for FTWRL are various forms of backups and temporary tables are not preserved during backups we have opted for consistently allowing DML/DDL on temporary tables during FTWRL/GRL. Important change: This patch changes thread state names which are used when DML/DDL of FTWRL is waiting for global read lock. It is now either "Waiting for global read lock" or "Waiting for commit lock" depending on the stage on which FTWRL is. Incompatible change: To solve deadlock in events code which was exposed by this patch we have to replace LOCK_event_metadata mutex with metadata locks on events. As result we have to prohibit DDL on events under LOCK TABLES. This patch also adds extensive test coverage for interaction of DML/DDL and FTWRL. Performance of new and old global read lock implementations in sysbench tests were compared. There were no significant difference between new and old implementations.
2010-11-11 18:11:05 +01:00
caused by the handler thread waiting for GRL or this table, we acquire
protection against GRL (global IX metadata lock) and metadata lock on
table to being inserted into inside the connection thread.
If this goes ok, the tickets are cloned and added to the list of granted
locks held by the handler thread.
*/
Patch that refactors global read lock implementation and fixes bug #57006 "Deadlock between HANDLER and FLUSH TABLES WITH READ LOCK" and bug #54673 "It takes too long to get readlock for 'FLUSH TABLES WITH READ LOCK'". The first bug manifested itself as a deadlock which occurred when a connection, which had some table open through HANDLER statement, tried to update some data through DML statement while another connection tried to execute FLUSH TABLES WITH READ LOCK concurrently. What happened was that FTWRL in the second connection managed to perform first step of GRL acquisition and thus blocked all upcoming DML. After that it started to wait for table open through HANDLER statement to be flushed. When the first connection tried to execute DML it has started to wait for GRL/the second connection creating deadlock. The second bug manifested itself as starvation of FLUSH TABLES WITH READ LOCK statements in cases when there was a constant stream of concurrent DML statements (in two or more connections). This has happened because requests for protection against GRL which were acquired by DML statements were ignoring presence of pending GRL and thus the latter was starved. This patch solves both these problems by re-implementing GRL using metadata locks. Similar to the old implementation acquisition of GRL in new implementation is two-step. During the first step we block all concurrent DML and DDL statements by acquiring global S metadata lock (each DML and DDL statement acquires global IX lock for its duration). During the second step we block commits by acquiring global S lock in COMMIT namespace (commit code acquires global IX lock in this namespace). Note that unlike in old implementation acquisition of protection against GRL in DML and DDL is semi-automatic. We assume that any statement which should be blocked by GRL will either open and acquires write-lock on tables or acquires metadata locks on objects it is going to modify. For any such statement global IX metadata lock is automatically acquired for its duration. The first problem is solved because waits for GRL become visible to deadlock detector in metadata locking subsystem and thus deadlocks like one in the first bug become impossible. The second problem is solved because global S locks which are used for GRL implementation are given preference over IX locks which are acquired by concurrent DML (and we can switch to fair scheduling in future if needed). Important change: FTWRL/GRL no longer blocks DML and DDL on temporary tables. Before this patch behavior was not consistent in this respect: in some cases DML/DDL statements on temporary tables were blocked while in others they were not. Since the main use cases for FTWRL are various forms of backups and temporary tables are not preserved during backups we have opted for consistently allowing DML/DDL on temporary tables during FTWRL/GRL. Important change: This patch changes thread state names which are used when DML/DDL of FTWRL is waiting for global read lock. It is now either "Waiting for global read lock" or "Waiting for commit lock" depending on the stage on which FTWRL is. Incompatible change: To solve deadlock in events code which was exposed by this patch we have to replace LOCK_event_metadata mutex with metadata locks on events. As result we have to prohibit DDL on events under LOCK TABLES. This patch also adds extensive test coverage for interaction of DML/DDL and FTWRL. Performance of new and old global read lock implementations in sysbench tests were compared. There were no significant difference between new and old implementations.
2010-11-11 18:11:05 +01:00
if (thd->global_read_lock.can_acquire_protection())
DBUG_RETURN(TRUE);
protection_request.init(MDL_key::GLOBAL, "", "", MDL_INTENTION_EXCLUSIVE,
MDL_STATEMENT);
if (thd->mdl_context.acquire_lock(&protection_request,
thd->variables.lock_wait_timeout))
DBUG_RETURN(TRUE);
if (thd->mdl_context.acquire_lock(&table_list->mdl_request,
thd->variables.lock_wait_timeout))
/*
If a lock can't be acquired, it makes no sense to try normal insert.
Therefore we just abort the statement.
*/
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
2007-05-16 07:51:05 +02:00
DBUG_RETURN(TRUE);
bool error= FALSE;
Patch that refactors global read lock implementation and fixes bug #57006 "Deadlock between HANDLER and FLUSH TABLES WITH READ LOCK" and bug #54673 "It takes too long to get readlock for 'FLUSH TABLES WITH READ LOCK'". The first bug manifested itself as a deadlock which occurred when a connection, which had some table open through HANDLER statement, tried to update some data through DML statement while another connection tried to execute FLUSH TABLES WITH READ LOCK concurrently. What happened was that FTWRL in the second connection managed to perform first step of GRL acquisition and thus blocked all upcoming DML. After that it started to wait for table open through HANDLER statement to be flushed. When the first connection tried to execute DML it has started to wait for GRL/the second connection creating deadlock. The second bug manifested itself as starvation of FLUSH TABLES WITH READ LOCK statements in cases when there was a constant stream of concurrent DML statements (in two or more connections). This has happened because requests for protection against GRL which were acquired by DML statements were ignoring presence of pending GRL and thus the latter was starved. This patch solves both these problems by re-implementing GRL using metadata locks. Similar to the old implementation acquisition of GRL in new implementation is two-step. During the first step we block all concurrent DML and DDL statements by acquiring global S metadata lock (each DML and DDL statement acquires global IX lock for its duration). During the second step we block commits by acquiring global S lock in COMMIT namespace (commit code acquires global IX lock in this namespace). Note that unlike in old implementation acquisition of protection against GRL in DML and DDL is semi-automatic. We assume that any statement which should be blocked by GRL will either open and acquires write-lock on tables or acquires metadata locks on objects it is going to modify. For any such statement global IX metadata lock is automatically acquired for its duration. The first problem is solved because waits for GRL become visible to deadlock detector in metadata locking subsystem and thus deadlocks like one in the first bug become impossible. The second problem is solved because global S locks which are used for GRL implementation are given preference over IX locks which are acquired by concurrent DML (and we can switch to fair scheduling in future if needed). Important change: FTWRL/GRL no longer blocks DML and DDL on temporary tables. Before this patch behavior was not consistent in this respect: in some cases DML/DDL statements on temporary tables were blocked while in others they were not. Since the main use cases for FTWRL are various forms of backups and temporary tables are not preserved during backups we have opted for consistently allowing DML/DDL on temporary tables during FTWRL/GRL. Important change: This patch changes thread state names which are used when DML/DDL of FTWRL is waiting for global read lock. It is now either "Waiting for global read lock" or "Waiting for commit lock" depending on the stage on which FTWRL is. Incompatible change: To solve deadlock in events code which was exposed by this patch we have to replace LOCK_event_metadata mutex with metadata locks on events. As result we have to prohibit DDL on events under LOCK TABLES. This patch also adds extensive test coverage for interaction of DML/DDL and FTWRL. Performance of new and old global read lock implementations in sysbench tests were compared. There were no significant difference between new and old implementations.
2010-11-11 18:11:05 +01:00
if (delayed_get_table(thd, &protection_request, table_list))
error= TRUE;
else if (table_list->table)
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
2007-05-16 07:51:05 +02:00
{
/*
Open tables used for sub-selects or in stored functions, will also
cache these functions.
*/
if (open_and_lock_tables(thd, table_list->next_global, TRUE, 0))
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
2007-05-16 07:51:05 +02:00
{
end_delayed_insert(thd);
error= TRUE;
}
else
{
/*
First table was not processed by open_and_lock_tables(),
we need to set updatability flag "by hand".
*/
if (!table_list->derived && !table_list->view)
table_list->updatable= 1; // usual table
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
2007-05-16 07:51:05 +02:00
}
}
/*
Patch that refactors global read lock implementation and fixes bug #57006 "Deadlock between HANDLER and FLUSH TABLES WITH READ LOCK" and bug #54673 "It takes too long to get readlock for 'FLUSH TABLES WITH READ LOCK'". The first bug manifested itself as a deadlock which occurred when a connection, which had some table open through HANDLER statement, tried to update some data through DML statement while another connection tried to execute FLUSH TABLES WITH READ LOCK concurrently. What happened was that FTWRL in the second connection managed to perform first step of GRL acquisition and thus blocked all upcoming DML. After that it started to wait for table open through HANDLER statement to be flushed. When the first connection tried to execute DML it has started to wait for GRL/the second connection creating deadlock. The second bug manifested itself as starvation of FLUSH TABLES WITH READ LOCK statements in cases when there was a constant stream of concurrent DML statements (in two or more connections). This has happened because requests for protection against GRL which were acquired by DML statements were ignoring presence of pending GRL and thus the latter was starved. This patch solves both these problems by re-implementing GRL using metadata locks. Similar to the old implementation acquisition of GRL in new implementation is two-step. During the first step we block all concurrent DML and DDL statements by acquiring global S metadata lock (each DML and DDL statement acquires global IX lock for its duration). During the second step we block commits by acquiring global S lock in COMMIT namespace (commit code acquires global IX lock in this namespace). Note that unlike in old implementation acquisition of protection against GRL in DML and DDL is semi-automatic. We assume that any statement which should be blocked by GRL will either open and acquires write-lock on tables or acquires metadata locks on objects it is going to modify. For any such statement global IX metadata lock is automatically acquired for its duration. The first problem is solved because waits for GRL become visible to deadlock detector in metadata locking subsystem and thus deadlocks like one in the first bug become impossible. The second problem is solved because global S locks which are used for GRL implementation are given preference over IX locks which are acquired by concurrent DML (and we can switch to fair scheduling in future if needed). Important change: FTWRL/GRL no longer blocks DML and DDL on temporary tables. Before this patch behavior was not consistent in this respect: in some cases DML/DDL statements on temporary tables were blocked while in others they were not. Since the main use cases for FTWRL are various forms of backups and temporary tables are not preserved during backups we have opted for consistently allowing DML/DDL on temporary tables during FTWRL/GRL. Important change: This patch changes thread state names which are used when DML/DDL of FTWRL is waiting for global read lock. It is now either "Waiting for global read lock" or "Waiting for commit lock" depending on the stage on which FTWRL is. Incompatible change: To solve deadlock in events code which was exposed by this patch we have to replace LOCK_event_metadata mutex with metadata locks on events. As result we have to prohibit DDL on events under LOCK TABLES. This patch also adds extensive test coverage for interaction of DML/DDL and FTWRL. Performance of new and old global read lock implementations in sysbench tests were compared. There were no significant difference between new and old implementations.
2010-11-11 18:11:05 +01:00
We can't release protection against GRL and metadata lock on the table
being inserted into here. These locks might be required, for example,
because this INSERT DELAYED calls functions which may try to update
this or another tables (updating the same table is of course illegal,
but such an attempt can be discovered only later during statement
execution).
*/
/*
Reset the ticket in case we end up having to use normal insert and
therefore will reopen the table and reacquire the metadata lock.
*/
table_list->mdl_request.ticket= NULL;
if (error || table_list->table)
DBUG_RETURN(error);
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
2007-05-16 07:51:05 +02:00
#endif
/*
* This is embedded library and we don't have auxiliary
threads OR
* a lock upgrade was requested inside delayed_get_table
because
- there are too many delayed insert threads OR
- the table has triggers.
Use a normal insert.
*/
table_list->lock_type= TL_WRITE;
DBUG_RETURN(open_and_lock_tables(thd, table_list, TRUE, 0));
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
2007-05-16 07:51:05 +02:00
}
/**
Create a new query string for removing DELAYED keyword for
multi INSERT DEALAYED statement.
@param[in] thd Thread handler
@param[in] buf Query string
@return
0 ok
1 error
*/
static int
create_insert_stmt_from_insert_delayed(THD *thd, String *buf)
{
/* Make a copy of thd->query() and then remove the "DELAYED" keyword */
if (buf->append(thd->query()) ||
buf->replace(thd->lex->keyword_delayed_begin_offset,
thd->lex->keyword_delayed_end_offset -
thd->lex->keyword_delayed_begin_offset, 0))
return 1;
return 0;
}
/**
INSERT statement implementation
@note Like implementations of other DDL/DML in MySQL, this function
relies on the caller to close the thread tables. This is done in the
end of dispatch_command().
*/
bool mysql_insert(THD *thd,TABLE_LIST *table_list,
List<Item> &fields,
List<List_item> &values_list,
List<Item> &update_fields,
List<Item> &update_values,
enum_duplicates duplic,
bool ignore)
2000-07-31 21:29:14 +02:00
{
int error, res;
bool transactional_table, joins_freed= FALSE;
bool changed;
bool was_insert_delayed= (table_list->lock_type == TL_WRITE_DELAYED);
2000-07-31 21:29:14 +02:00
uint value_count;
ulong counter = 1;
ulonglong id;
COPY_INFO info;
TABLE *table= 0;
List_iterator_fast<List_item> its(values_list);
2000-07-31 21:29:14 +02:00
List_item *values;
Name_resolution_context *context;
Name_resolution_context_state ctx_state;
#ifndef EMBEDDED_LIBRARY
char *query= thd->query();
/*
log_on is about delayed inserts only.
By default, both logs are enabled (this won't cause problems if the server
runs without --log-bin).
*/
bool log_on= (thd->variables.option_bits & OPTION_BIN_LOG);
2007-05-14 08:56:23 +02:00
#endif
thr_lock_type lock_type;
2004-12-30 23:44:00 +01:00
Item *unused_conds= 0;
2000-07-31 21:29:14 +02:00
DBUG_ENTER("mysql_insert");
/*
Upgrade lock type if the requested lock is incompatible with
the current connection mode or table operation.
*/
upgrade_lock_type(thd, &table_list->lock_type, duplic);
/*
We can't write-delayed into a table locked with LOCK TABLES:
this will lead to a deadlock, since the delayed thread will
never be able to get a lock on the table. QQQ: why not
upgrade the lock here instead?
*/
if (table_list->lock_type == TL_WRITE_DELAYED &&
thd->locked_tables_mode &&
Initial import of WL#3726 "DDL locking for all metadata objects". Backport of: ------------------------------------------------------------ revno: 2630.4.1 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Fri 2008-05-23 17:54:03 +0400 message: WL#3726 "DDL locking for all metadata objects". After review fixes in progress. ------------------------------------------------------------ This is the first patch in series. It transforms the metadata locking subsystem to use a dedicated module (mdl.h,cc). No significant changes in the locking protocol. The import passes the test suite with the exception of deprecated/removed 6.0 features, and MERGE tables. The latter are subject to a fix by WL#4144. Unfortunately, the original changeset comments got lost in a merge, thus this import has its own (largely insufficient) comments. This patch fixes Bug#25144 "replication / binlog with view breaks". Warning: this patch introduces an incompatible change: Under LOCK TABLES, it's no longer possible to FLUSH a table that was not locked for WRITE. Under LOCK TABLES, it's no longer possible to DROP a table or VIEW that was not locked for WRITE. ****** Backport of: ------------------------------------------------------------ revno: 2630.4.2 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Sat 2008-05-24 14:03:45 +0400 message: WL#3726 "DDL locking for all metadata objects". After review fixes in progress. ****** Backport of: ------------------------------------------------------------ revno: 2630.4.3 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Sat 2008-05-24 14:08:51 +0400 message: WL#3726 "DDL locking for all metadata objects" Fixed failing Windows builds by adding mdl.cc to the lists of files needed to build server/libmysqld on Windows. ****** Backport of: ------------------------------------------------------------ revno: 2630.4.4 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Sat 2008-05-24 21:57:58 +0400 message: WL#3726 "DDL locking for all metadata objects". Fix for assert failures in kill.test which occured when one tried to kill ALTER TABLE statement on merge table while it was waiting in wait_while_table_is_used() for other connections to close this table. These assert failures stemmed from the fact that cleanup code in this case assumed that temporary table representing new version of table was open with adding to THD::temporary_tables list while code which were opening this temporary table wasn't always fulfilling this. This patch changes code that opens new version of table to always do this linking in. It also streamlines cleanup process for cases when error occurs while we have new version of table open. ****** WL#3726 "DDL locking for all metadata objects" Add libmysqld/mdl.cc to .bzrignore. ****** Backport of: ------------------------------------------------------------ revno: 2630.4.6 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Sun 2008-05-25 00:33:22 +0400 message: WL#3726 "DDL locking for all metadata objects". Addition to the fix of assert failures in kill.test caused by changes for this worklog. Make sure we close the new table only once.
2009-11-30 16:55:03 +01:00
find_locked_table(thd->open_tables, table_list->db,
table_list->table_name))
Manual merge from 5.0-rpl, of fixes for: 1) BUG#25507 "multi-row insert delayed + auto increment causes duplicate key entries on slave" (two concurrrent connections doing multi-row INSERT DELAYED to insert into an auto_increment column, caused replication slave to stop with "duplicate key error" (and binlog was wrong), and BUG#26116 "If multi-row INSERT DELAYED has errors, statement-based binlogging breaks" (the binlog was not accounting for all rows inserted, or slave could stop). The fix is that: in statement-based binlogging, a multi-row INSERT DELAYED is silently converted to a non-delayed INSERT. This is supposed to not affect many 5.1 users as in 5.1, the default binlog format is "mixed", which does not have the bug (the bug is only with binlog_format=STATEMENT). We should document how the system delayed_insert thread decides of its binlog format (which is not modified by this patch): this decision is taken when the thread is created and holds until it is terminated (is not affected by any later change via SET GLOBAL BINLOG_FORMAT). It is also not affected by the binlog format of the connection which issues INSERT DELAYED (this binlog format does not affect how the row will be binlogged). If one wants to change the binlog format of its server with SET GLOBAL BINLOG_FORMAT, it should do FLUSH TABLES to be sure all delayed_insert threads terminate and thus new threads are created, taking into account the new format. 2) BUG#24432 "INSERT... ON DUPLICATE KEY UPDATE skips auto_increment values". When in an INSERT ON DUPLICATE KEY UPDATE, using an autoincrement column, we inserted some autogenerated values and also updated some rows, some autogenerated values were not used (for example, even if 10 was the largest autoinc value in the table at the start of the statement, 12 could be the first autogenerated value inserted by the statement, instead of 11). One autogenerated value was lost per updated row. Led to exhausting the range of the autoincrement column faster. Bug introduced by fix of BUG#20188; present since 5.0.24 and 5.1.12. This bug breaks replication from a pre-5.0.24/pre-5.1.12 master. But the present bugfix, as it makes INSERT ON DUP KEY UPDATE behave like pre-5.0.24/pre-5.1.12, breaks replication from a [5.0.24,5.0.34]/[5.1.12,5.1.15] master to a fixed (5.0.36/5.1.16) slave! To warn users against this when they upgrade their slave, as agreed with the support team, we add code for a fixed slave to detect that it is connected to a buggy master in a situation (INSERT ON DUP KEY UPDATE into autoinc column) likely to break replication, in which case it cannot replicate so stops and prints a message to the slave's error log and to SHOW SLAVE STATUS. For 5.0.36->[5.0.24,5.0.34] replication or 5.1.16->[5.1.12,5.1.15] replication we cannot warn as master does not know the slave's version (but we always recommended to users to have slave at least as new as master). As agreed with support, I have asked for an alert to be put into the MySQL Network Monitoring and Advisory Service. 3) note that I'll re-enable rpl_insert_id as soon as 5.1-rpl gets the changes from the main 5.1.
2007-02-15 20:28:58 +01:00
{
my_error(ER_DELAYED_INSERT_TABLE_LOCKED, MYF(0),
table_list->table_name);
DBUG_RETURN(TRUE);
Manual merge from 5.0-rpl, of fixes for: 1) BUG#25507 "multi-row insert delayed + auto increment causes duplicate key entries on slave" (two concurrrent connections doing multi-row INSERT DELAYED to insert into an auto_increment column, caused replication slave to stop with "duplicate key error" (and binlog was wrong), and BUG#26116 "If multi-row INSERT DELAYED has errors, statement-based binlogging breaks" (the binlog was not accounting for all rows inserted, or slave could stop). The fix is that: in statement-based binlogging, a multi-row INSERT DELAYED is silently converted to a non-delayed INSERT. This is supposed to not affect many 5.1 users as in 5.1, the default binlog format is "mixed", which does not have the bug (the bug is only with binlog_format=STATEMENT). We should document how the system delayed_insert thread decides of its binlog format (which is not modified by this patch): this decision is taken when the thread is created and holds until it is terminated (is not affected by any later change via SET GLOBAL BINLOG_FORMAT). It is also not affected by the binlog format of the connection which issues INSERT DELAYED (this binlog format does not affect how the row will be binlogged). If one wants to change the binlog format of its server with SET GLOBAL BINLOG_FORMAT, it should do FLUSH TABLES to be sure all delayed_insert threads terminate and thus new threads are created, taking into account the new format. 2) BUG#24432 "INSERT... ON DUPLICATE KEY UPDATE skips auto_increment values". When in an INSERT ON DUPLICATE KEY UPDATE, using an autoincrement column, we inserted some autogenerated values and also updated some rows, some autogenerated values were not used (for example, even if 10 was the largest autoinc value in the table at the start of the statement, 12 could be the first autogenerated value inserted by the statement, instead of 11). One autogenerated value was lost per updated row. Led to exhausting the range of the autoincrement column faster. Bug introduced by fix of BUG#20188; present since 5.0.24 and 5.1.12. This bug breaks replication from a pre-5.0.24/pre-5.1.12 master. But the present bugfix, as it makes INSERT ON DUP KEY UPDATE behave like pre-5.0.24/pre-5.1.12, breaks replication from a [5.0.24,5.0.34]/[5.1.12,5.1.15] master to a fixed (5.0.36/5.1.16) slave! To warn users against this when they upgrade their slave, as agreed with the support team, we add code for a fixed slave to detect that it is connected to a buggy master in a situation (INSERT ON DUP KEY UPDATE into autoinc column) likely to break replication, in which case it cannot replicate so stops and prints a message to the slave's error log and to SHOW SLAVE STATUS. For 5.0.36->[5.0.24,5.0.34] replication or 5.1.16->[5.1.12,5.1.15] replication we cannot warn as master does not know the slave's version (but we always recommended to users to have slave at least as new as master). As agreed with support, I have asked for an alert to be put into the MySQL Network Monitoring and Advisory Service. 3) note that I'll re-enable rpl_insert_id as soon as 5.1-rpl gets the changes from the main 5.1.
2007-02-15 20:28:58 +01:00
}
2000-07-31 21:29:14 +02:00
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
2007-05-16 07:51:05 +02:00
if (table_list->lock_type == TL_WRITE_DELAYED)
2000-07-31 21:29:14 +02:00
{
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
2007-05-16 07:51:05 +02:00
if (open_and_lock_for_insert_delayed(thd, table_list))
DBUG_RETURN(TRUE);
2000-07-31 21:29:14 +02:00
}
else
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
2007-05-16 07:51:05 +02:00
{
if (open_and_lock_tables(thd, table_list, TRUE, 0))
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
2007-05-16 07:51:05 +02:00
DBUG_RETURN(TRUE);
}
lock_type= table_list->lock_type;
2002-11-26 00:00:05 +01:00
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
2007-02-22 16:03:08 +01:00
thd_proc_info(thd, "init");
thd->used_tables=0;
2000-07-31 21:29:14 +02:00
values= its++;
value_count= values->elements;
2003-05-03 01:16:56 +02:00
2004-07-16 00:15:55 +02:00
if (mysql_prepare_insert(thd, table_list, table, fields, values,
2004-12-30 23:44:00 +01:00
update_fields, update_values, duplic, &unused_conds,
FALSE,
(fields.elements || !value_count ||
table_list->view != 0),
!ignore && (thd->variables.sql_mode &
(MODE_STRICT_TRANS_TABLES |
MODE_STRICT_ALL_TABLES))))
2000-07-31 21:29:14 +02:00
goto abort;
/* mysql_prepare_insert set table_list->table if it was not set */
table= table_list->table;
context= &thd->lex->select_lex.context;
/*
These three asserts test the hypothesis that the resetting of the name
resolution context below is not necessary at all since the list of local
tables for INSERT always consists of one table.
*/
DBUG_ASSERT(!table_list->next_local);
DBUG_ASSERT(!context->table_list->next_local);
DBUG_ASSERT(!context->first_name_resolution_table->next_name_resolution_table);
/* Save the state of the current name resolution context. */
ctx_state.save_state(context, table_list);
/*
Perform name resolution only in the first table - 'table_list',
which is the table that is inserted into.
*/
table_list->next_local= 0;
context->resolve_in_table_list_only(table_list);
2002-11-26 00:00:05 +01:00
while ((values= its++))
2000-07-31 21:29:14 +02:00
{
counter++;
if (values->elements != value_count)
{
my_error(ER_WRONG_VALUE_COUNT_ON_ROW, MYF(0), counter);
2000-07-31 21:29:14 +02:00
goto abort;
}
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
if (setup_fields(thd, 0, *values, MARK_COLUMNS_READ, 0, 0))
2000-07-31 21:29:14 +02:00
goto abort;
}
its.rewind ();
/* Restore the current context. */
ctx_state.restore_state(context, table_list);
2000-07-31 21:29:14 +02:00
/*
2002-07-25 00:00:56 +02:00
Fill in the given fields and dump it to the table file
2000-07-31 21:29:14 +02:00
*/
bzero((char*) &info,sizeof(info));
info.ignore= ignore;
2000-07-31 21:29:14 +02:00
info.handle_duplicates=duplic;
info.update_fields= &update_fields;
info.update_values= &update_values;
info.view= (table_list->view ? table_list : 0);
/*
Count warnings for all inserts.
For single line insert, generate an error if try to set a NOT NULL field
to NULL.
*/
thd->count_cuted_fields= ((values_list.elements == 1 &&
!ignore) ?
CHECK_FIELD_ERROR_FOR_NULL :
CHECK_FIELD_WARN);
2000-07-31 21:29:14 +02:00
thd->cuted_fields = 0L;
table->next_number_field=table->found_next_number_field;
Manual merge from 5.0-rpl, of fixes for: 1) BUG#25507 "multi-row insert delayed + auto increment causes duplicate key entries on slave" (two concurrrent connections doing multi-row INSERT DELAYED to insert into an auto_increment column, caused replication slave to stop with "duplicate key error" (and binlog was wrong), and BUG#26116 "If multi-row INSERT DELAYED has errors, statement-based binlogging breaks" (the binlog was not accounting for all rows inserted, or slave could stop). The fix is that: in statement-based binlogging, a multi-row INSERT DELAYED is silently converted to a non-delayed INSERT. This is supposed to not affect many 5.1 users as in 5.1, the default binlog format is "mixed", which does not have the bug (the bug is only with binlog_format=STATEMENT). We should document how the system delayed_insert thread decides of its binlog format (which is not modified by this patch): this decision is taken when the thread is created and holds until it is terminated (is not affected by any later change via SET GLOBAL BINLOG_FORMAT). It is also not affected by the binlog format of the connection which issues INSERT DELAYED (this binlog format does not affect how the row will be binlogged). If one wants to change the binlog format of its server with SET GLOBAL BINLOG_FORMAT, it should do FLUSH TABLES to be sure all delayed_insert threads terminate and thus new threads are created, taking into account the new format. 2) BUG#24432 "INSERT... ON DUPLICATE KEY UPDATE skips auto_increment values". When in an INSERT ON DUPLICATE KEY UPDATE, using an autoincrement column, we inserted some autogenerated values and also updated some rows, some autogenerated values were not used (for example, even if 10 was the largest autoinc value in the table at the start of the statement, 12 could be the first autogenerated value inserted by the statement, instead of 11). One autogenerated value was lost per updated row. Led to exhausting the range of the autoincrement column faster. Bug introduced by fix of BUG#20188; present since 5.0.24 and 5.1.12. This bug breaks replication from a pre-5.0.24/pre-5.1.12 master. But the present bugfix, as it makes INSERT ON DUP KEY UPDATE behave like pre-5.0.24/pre-5.1.12, breaks replication from a [5.0.24,5.0.34]/[5.1.12,5.1.15] master to a fixed (5.0.36/5.1.16) slave! To warn users against this when they upgrade their slave, as agreed with the support team, we add code for a fixed slave to detect that it is connected to a buggy master in a situation (INSERT ON DUP KEY UPDATE into autoinc column) likely to break replication, in which case it cannot replicate so stops and prints a message to the slave's error log and to SHOW SLAVE STATUS. For 5.0.36->[5.0.24,5.0.34] replication or 5.1.16->[5.1.12,5.1.15] replication we cannot warn as master does not know the slave's version (but we always recommended to users to have slave at least as new as master). As agreed with support, I have asked for an alert to be put into the MySQL Network Monitoring and Advisory Service. 3) note that I'll re-enable rpl_insert_id as soon as 5.1-rpl gets the changes from the main 5.1.
2007-02-15 20:28:58 +01:00
#ifdef HAVE_REPLICATION
if (thd->slave_thread &&
(info.handle_duplicates == DUP_UPDATE) &&
(table->next_number_field != NULL) &&
rpl_master_has_bug(&active_mi->rli, 24432, TRUE, NULL, NULL))
Manual merge from 5.0-rpl, of fixes for: 1) BUG#25507 "multi-row insert delayed + auto increment causes duplicate key entries on slave" (two concurrrent connections doing multi-row INSERT DELAYED to insert into an auto_increment column, caused replication slave to stop with "duplicate key error" (and binlog was wrong), and BUG#26116 "If multi-row INSERT DELAYED has errors, statement-based binlogging breaks" (the binlog was not accounting for all rows inserted, or slave could stop). The fix is that: in statement-based binlogging, a multi-row INSERT DELAYED is silently converted to a non-delayed INSERT. This is supposed to not affect many 5.1 users as in 5.1, the default binlog format is "mixed", which does not have the bug (the bug is only with binlog_format=STATEMENT). We should document how the system delayed_insert thread decides of its binlog format (which is not modified by this patch): this decision is taken when the thread is created and holds until it is terminated (is not affected by any later change via SET GLOBAL BINLOG_FORMAT). It is also not affected by the binlog format of the connection which issues INSERT DELAYED (this binlog format does not affect how the row will be binlogged). If one wants to change the binlog format of its server with SET GLOBAL BINLOG_FORMAT, it should do FLUSH TABLES to be sure all delayed_insert threads terminate and thus new threads are created, taking into account the new format. 2) BUG#24432 "INSERT... ON DUPLICATE KEY UPDATE skips auto_increment values". When in an INSERT ON DUPLICATE KEY UPDATE, using an autoincrement column, we inserted some autogenerated values and also updated some rows, some autogenerated values were not used (for example, even if 10 was the largest autoinc value in the table at the start of the statement, 12 could be the first autogenerated value inserted by the statement, instead of 11). One autogenerated value was lost per updated row. Led to exhausting the range of the autoincrement column faster. Bug introduced by fix of BUG#20188; present since 5.0.24 and 5.1.12. This bug breaks replication from a pre-5.0.24/pre-5.1.12 master. But the present bugfix, as it makes INSERT ON DUP KEY UPDATE behave like pre-5.0.24/pre-5.1.12, breaks replication from a [5.0.24,5.0.34]/[5.1.12,5.1.15] master to a fixed (5.0.36/5.1.16) slave! To warn users against this when they upgrade their slave, as agreed with the support team, we add code for a fixed slave to detect that it is connected to a buggy master in a situation (INSERT ON DUP KEY UPDATE into autoinc column) likely to break replication, in which case it cannot replicate so stops and prints a message to the slave's error log and to SHOW SLAVE STATUS. For 5.0.36->[5.0.24,5.0.34] replication or 5.1.16->[5.1.12,5.1.15] replication we cannot warn as master does not know the slave's version (but we always recommended to users to have slave at least as new as master). As agreed with support, I have asked for an alert to be put into the MySQL Network Monitoring and Advisory Service. 3) note that I'll re-enable rpl_insert_id as soon as 5.1-rpl gets the changes from the main 5.1.
2007-02-15 20:28:58 +01:00
goto abort;
#endif
2000-07-31 21:29:14 +02:00
error=0;
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
2007-02-22 16:03:08 +01:00
thd_proc_info(thd, "update");
if (duplic == DUP_REPLACE &&
(!table->triggers || !table->triggers->has_delete_triggers()))
table->file->extra(HA_EXTRA_WRITE_CAN_REPLACE);
if (duplic == DUP_UPDATE)
table->file->extra(HA_EXTRA_INSERT_WITH_UPDATE);
/*
let's *try* to start bulk inserts. It won't necessary
start them as values_list.elements should be greater than
some - handler dependent - threshold.
We should not start bulk inserts if this statement uses
functions or invokes triggers since they may access
to the same table and therefore should not see its
inconsistent state created by this optimization.
So we call start_bulk_insert to perform nesessary checks on
values_list.elements, and - if nothing else - to initialize
the code to make the call of end_bulk_insert() below safe.
*/
#ifndef EMBEDDED_LIBRARY
if (lock_type != TL_WRITE_DELAYED)
#endif /* EMBEDDED_LIBRARY */
{
if (duplic != DUP_ERROR || ignore)
table->file->extra(HA_EXTRA_IGNORE_DUP_KEY);
/**
This is a simple check for the case when the table has a trigger
that reads from it, or when the statement invokes a stored function
that reads from the table being inserted to.
Engines can't handle a bulk insert in parallel with a read form the
same table in the same connection.
*/
if (thd->locked_tables_mode <= LTM_LOCK_TABLES)
table->file->ha_start_bulk_insert(values_list.elements);
}
thd->abort_on_warning= (!ignore && (thd->variables.sql_mode &
(MODE_STRICT_TRANS_TABLES |
MODE_STRICT_ALL_TABLES)));
2007-04-04 16:58:25 +02:00
prepare_triggers_for_insert_stmt(table);
Fix for bug#18437 "Wrong values inserted with a before update trigger on NDB table". SQL-layer was not marking fields which were used in triggers as such. As result these fields were not always properly retrieved/stored by handler layer. So one might got wrong values or lost changes in triggers for NDB, Federated and possibly InnoDB tables. This fix solves the problem by marking fields used in triggers appropriately. Also this patch contains the following cleanup of ha_ndbcluster code: We no longer rely on reading LEX::sql_command value in handler in order to determine if we can enable optimization which allows us to handle REPLACE statement in more efficient way by doing replaces directly in write_row() method without reporting error to SQL-layer. Instead we rely on SQL-layer informing us whether this optimization applicable by calling handler::extra() method with HA_EXTRA_WRITE_CAN_REPLACE flag. As result we no longer apply this optimzation in cases when it should not be used (e.g. if we have on delete triggers on table) and use in some additional cases when it is applicable (e.g. for LOAD DATA REPLACE). Finally this patch includes fix for bug#20728 "REPLACE does not work correctly for NDB table with PK and unique index". This was yet another problem which was caused by improper field mark-up. During row replacement fields which weren't explicity used in REPLACE statement were not marked as fields to be saved (updated) so they have retained values from old row version. The fix is to mark all table fields as set for REPLACE statement. Note that in 5.1 we already solve this problem by notifying handler that it should save values from all fields only in case when real replacement happens.
2006-07-01 23:51:10 +02:00
if (table_list->prepare_where(thd, 0, TRUE) ||
table_list->prepare_check_option(thd))
error= 1;
2002-07-25 00:00:56 +02:00
while ((values= its++))
2000-07-31 21:29:14 +02:00
{
if (fields.elements || !value_count)
{
restore_record(table,s->default_values); // Get empty record
if (fill_record_n_invoke_before_triggers(thd, fields, *values, 0,
table->triggers,
TRG_EVENT_INSERT))
2000-07-31 21:29:14 +02:00
{
if (values_list.elements != 1 && ! thd->is_error())
2000-07-31 21:29:14 +02:00
{
info.records++;
continue;
}
/*
TODO: set thd->abort_on_warning if values_list.elements == 1
and check that all items return warning in case of problem with
storing field.
*/
2000-07-31 21:29:14 +02:00
error=1;
break;
}
}
else
{
if (thd->used_tables) // Column used in values()
restore_record(table,s->default_values); // Get empty record
else
{
TABLE_SHARE *share= table->s;
/*
Fix delete marker. No need to restore rest of record since it will
be overwritten by fill_record() anyway (and fill_record() does not
use default values in this case).
*/
table->record[0][0]= share->default_values[0];
/* Fix undefined null_bits. */
if (share->null_bytes > 1 && share->last_null_bit_pos)
{
table->record[0][share->null_bytes - 1]=
share->default_values[share->null_bytes - 1];
}
}
if (fill_record_n_invoke_before_triggers(thd, table->field, *values, 0,
table->triggers,
TRG_EVENT_INSERT))
2000-07-31 21:29:14 +02:00
{
if (values_list.elements != 1 && ! thd->is_error())
2000-07-31 21:29:14 +02:00
{
info.records++;
continue;
}
error=1;
break;
}
}
if ((res= table_list->view_check_option(thd,
(values_list.elements == 1 ?
0 :
ignore))) ==
VIEW_CHECK_SKIP)
continue;
else if (res == VIEW_CHECK_ERROR)
2004-09-03 14:18:40 +02:00
{
error= 1;
break;
2004-09-03 14:18:40 +02:00
}
#ifndef EMBEDDED_LIBRARY
if (lock_type == TL_WRITE_DELAYED)
2000-07-31 21:29:14 +02:00
{
LEX_STRING const st_query = { query, thd->query_length() };
error=write_delayed(thd, table, duplic, st_query, ignore, log_on);
2000-07-31 21:29:14 +02:00
query=0;
}
else
#endif
error=write_record(thd, table ,&info);
if (error)
break;
thd->warning_info->inc_current_row_for_warning();
2000-07-31 21:29:14 +02:00
}
free_underlaid_joins(thd, &thd->lex->select_lex);
joins_freed= TRUE;
/*
Now all rows are inserted. Time to update logs and sends response to
user
*/
#ifndef EMBEDDED_LIBRARY
2000-07-31 21:29:14 +02:00
if (lock_type == TL_WRITE_DELAYED)
{
if (!error)
{
info.copied=values_list.elements;
end_delayed_insert(thd);
}
2000-07-31 21:29:14 +02:00
}
else
#endif
2000-07-31 21:29:14 +02:00
{
/*
Do not do this release if this is a delayed insert, it would steal
auto_inc values from the delayed_insert thread as they share TABLE.
*/
table->file->ha_release_auto_increment();
if (thd->locked_tables_mode <= LTM_LOCK_TABLES &&
table->file->ha_end_bulk_insert() && !error)
{
table->file->print_error(my_errno,MYF(0));
error=1;
}
if (duplic != DUP_ERROR || ignore)
table->file->extra(HA_EXTRA_NO_IGNORE_DUP_KEY);
transactional_table= table->file->has_transactions();
if ((changed= (info.copied || info.deleted || info.updated)))
2000-07-31 21:29:14 +02:00
{
/*
Invalidate the table in the query cache if something changed.
For the transactional algorithm to work the invalidation must be
before binlog writing and ha_autocommit_or_rollback
*/
query_cache_invalidate3(thd, table_list, 1);
}
if (thd->transaction.stmt.modified_non_trans_table)
thd->transaction.all.modified_non_trans_table= TRUE;
if (error <= 0 ||
2009-06-17 16:56:44 +02:00
thd->transaction.stmt.modified_non_trans_table ||
was_insert_delayed)
{
if (mysql_bin_log.is_open())
{
int errcode= 0;
if (error <= 0)
{
/*
[Guilhem wrote] Temporary errors may have filled
thd->net.last_error/errno. For example if there has
been a disk full error when writing the row, and it was
MyISAM, then thd->net.last_error/errno will be set to
"disk full"... and the mysql_file_pwrite() will wait until free
space appears, and so when it finishes then the
write_row() was entirely successful
*/
/* todo: consider removing */
thd->clear_error();
}
else
errcode= query_error_code(thd, thd->killed == THD::NOT_KILLED);
/* bug#22725:
A query which per-row-loop can not be interrupted with
KILLED, like INSERT, and that does not invoke stored
routines can be binlogged with neglecting the KILLED error.
If there was no error (error == zero) until after the end of
inserting loop the KILLED flag that appeared later can be
disregarded since previously possible invocation of stored
routines did not result in any error due to the KILLED. In
such case the flag is ignored for constructing binlog event.
*/
DBUG_ASSERT(thd->killed != THD::KILL_BAD_DATA || error > 0);
if (was_insert_delayed && table_list->lock_type == TL_WRITE)
{
/* Binlog INSERT DELAYED as INSERT without DELAYED. */
String log_query;
if (create_insert_stmt_from_insert_delayed(thd, &log_query))
{
sql_print_error("Event Error: An error occurred while creating query string"
"for INSERT DELAYED stmt, before writing it into binary log.");
error= 1;
}
else if (thd->binlog_query(THD::ROW_QUERY_TYPE,
log_query.c_ptr(), log_query.length(),
transactional_table, FALSE, FALSE,
errcode))
error= 1;
}
else if (thd->binlog_query(THD::ROW_QUERY_TYPE,
thd->query(), thd->query_length(),
transactional_table, FALSE, FALSE,
errcode))
merge mysql-5.1-rep+2-delivery1 --> mysql-5.1-rpl-merge Conflicts: Text conflict in .bzr-mysql/default.conf Text conflict in mysql-test/extra/rpl_tests/rpl_loaddata.test Text conflict in mysql-test/r/mysqlbinlog2.result Text conflict in mysql-test/suite/binlog/r/binlog_stm_mix_innodb_myisam.result Text conflict in mysql-test/suite/binlog/r/binlog_unsafe.result Text conflict in mysql-test/suite/rpl/r/rpl_insert_id.result Text conflict in mysql-test/suite/rpl/r/rpl_loaddata.result Text conflict in mysql-test/suite/rpl/r/rpl_stm_auto_increment_bug33029.result Text conflict in mysql-test/suite/rpl/r/rpl_udf.result Text conflict in mysql-test/suite/rpl/t/rpl_slow_query_log.test Text conflict in sql/field.h Text conflict in sql/log.cc Text conflict in sql/log_event.cc Text conflict in sql/log_event_old.cc Text conflict in sql/mysql_priv.h Text conflict in sql/share/errmsg.txt Text conflict in sql/sp.cc Text conflict in sql/sql_acl.cc Text conflict in sql/sql_base.cc Text conflict in sql/sql_class.h Text conflict in sql/sql_db.cc Text conflict in sql/sql_delete.cc Text conflict in sql/sql_insert.cc Text conflict in sql/sql_lex.cc Text conflict in sql/sql_lex.h Text conflict in sql/sql_load.cc Text conflict in sql/sql_table.cc Text conflict in sql/sql_update.cc Text conflict in sql/sql_view.cc Conflict adding files to storage/innobase. Created directory. Conflict because storage/innobase is not versioned, but has versioned children. Versioned directory. Conflict adding file storage/innobase. Moved existing file to storage/innobase.moved. Conflict adding files to storage/innobase/handler. Created directory. Conflict because storage/innobase/handler is not versioned, but has versioned children. Versioned directory. Contents conflict in storage/innobase/handler/ha_innodb.cc
2010-01-07 16:39:11 +01:00
error= 1;
}
2000-07-31 21:29:14 +02:00
}
(pushing for Andrei) Bug #27417 thd->no_trans_update.stmt lost value inside of SF-exec-stack Once had been set the flag might later got reset inside of a stored routine execution stack. The reason was in that there was no check if a new statement started at time of resetting. The artifact affects most of binlogable DML queries. Notice, that multi-update is wrapped up within bug@27716 fix, multi-delete bug@29136. Fixed with saving parent's statement flag of whether the statement modified non-transactional table, and unioning (merging) the value with that was gained in mysql_execute_command. Resettling thd->no_trans_update members into thd->transaction.`member`; Asserting code; Effectively the following properties are held. 1. At the end of a substatement thd->transaction.stmt.modified_non_trans_table reflects the fact if such a table got modified by the substatement. That also respects THD::really_abort_on_warnin() requirements. 2. Eventually thd->transaction.stmt.modified_non_trans_table will be computed as the union of the values of all invoked sub-statements. That fixes this bug#27417; Computing of thd->transaction.all.modified_non_trans_table is refined to base to the stmt's value for all the case including insert .. select statement which before the patch had an extra issue bug@28960. Minor issues are covered with mysql_load, mysql_delete, and binloggin of insert in to temp_table select. The supplied test verifies limitely, mostly asserts. The ultimate testing is defered for bug@13270, bug@23333.
2007-07-30 17:27:36 +02:00
DBUG_ASSERT(transactional_table || !changed ||
thd->transaction.stmt.modified_non_trans_table);
2000-07-31 21:29:14 +02:00
}
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
2007-02-22 16:03:08 +01:00
thd_proc_info(thd, "end");
WL#3146 "less locking in auto_increment": this is a cleanup patch for our current auto_increment handling: new names for auto_increment variables in THD, new methods to manipulate them (see sql_class.h), some move into handler::, causing less backup/restore work when executing substatements. This makes the logic hopefully clearer, less work is is needed in mysql_insert(). By cleaning up, using different variables for different purposes (instead of one for 3 things...), we fix those bugs, which someone may want to fix in 5.0 too: BUG#20339 "stored procedure using LAST_INSERT_ID() does not replicate statement-based" BUG#20341 "stored function inserting into one auto_increment puts bad data in slave" BUG#19243 "wrong LAST_INSERT_ID() after ON DUPLICATE KEY UPDATE" (now if a row is updated, LAST_INSERT_ID() will return its id) and re-fixes: BUG#6880 "LAST_INSERT_ID() value changes during multi-row INSERT" (already fixed differently by Ramil in 4.1) Test of documented behaviour of mysql_insert_id() (there was no test). The behaviour changes introduced are: - LAST_INSERT_ID() now returns "the first autogenerated auto_increment value successfully inserted", instead of "the first autogenerated auto_increment value if any row was successfully inserted", see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() returns the id of the updated row if ON DUPLICATE KEY UPDATE, see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() does not change if no autogenerated value was successfully inserted (it used to then be 0), see auto_increment.test. - if in INSERT SELECT no autogenerated value was successfully inserted, mysql_insert_id() now returns the id of the last inserted row (it already did this for INSERT VALUES), see mysql_client_test.c. - if INSERT SELECT uses LAST_INSERT_ID(X), mysql_insert_id() now returns X (it already did this for INSERT VALUES), see mysql_client_test.c. - NDB now behaves like other engines wrt SET INSERT_ID: with INSERT IGNORE, the id passed in SET INSERT_ID is re-used until a row succeeds; SET INSERT_ID influences not only the first row now. Additionally, when unlocking a table we check that the thread is not keeping a next_insert_id (as the table is unlocked that id is potentially out-of-date); forgetting about this next_insert_id is done in a new handler::ha_release_auto_increment(). Finally we prepare for engines capable of reserving finite-length intervals of auto_increment values: we store such intervals in THD. The next step (to be done by the replication team in 5.1) is to read those intervals from THD and actually store them in the statement-based binary log. NDB will be a good engine to test that.
2006-07-09 17:52:19 +02:00
/*
We'll report to the client this id:
- if the table contains an autoincrement column and we successfully
inserted an autogenerated value, the autogenerated value.
- if the table contains no autoincrement column and LAST_INSERT_ID(X) was
called, X.
- if the table contains an autoincrement column, and some rows were
inserted, the id of the last "inserted" row (if IGNORE, that value may not
have been really inserted but ignored).
*/
id= (thd->first_successful_insert_id_in_cur_stmt > 0) ?
thd->first_successful_insert_id_in_cur_stmt :
(thd->arg_of_last_insert_id_function ?
thd->first_successful_insert_id_in_prev_stmt :
((table->next_number_field && info.copied) ?
table->next_number_field->val_int() : 0));
2000-07-31 21:29:14 +02:00
table->next_number_field=0;
thd->count_cuted_fields= CHECK_FIELD_IGNORE;
table->auto_increment_field_not_null= FALSE;
Fix for bug#18437 "Wrong values inserted with a before update trigger on NDB table". SQL-layer was not marking fields which were used in triggers as such. As result these fields were not always properly retrieved/stored by handler layer. So one might got wrong values or lost changes in triggers for NDB, Federated and possibly InnoDB tables. This fix solves the problem by marking fields used in triggers appropriately. Also this patch contains the following cleanup of ha_ndbcluster code: We no longer rely on reading LEX::sql_command value in handler in order to determine if we can enable optimization which allows us to handle REPLACE statement in more efficient way by doing replaces directly in write_row() method without reporting error to SQL-layer. Instead we rely on SQL-layer informing us whether this optimization applicable by calling handler::extra() method with HA_EXTRA_WRITE_CAN_REPLACE flag. As result we no longer apply this optimzation in cases when it should not be used (e.g. if we have on delete triggers on table) and use in some additional cases when it is applicable (e.g. for LOAD DATA REPLACE). Finally this patch includes fix for bug#20728 "REPLACE does not work correctly for NDB table with PK and unique index". This was yet another problem which was caused by improper field mark-up. During row replacement fields which weren't explicity used in REPLACE statement were not marked as fields to be saved (updated) so they have retained values from old row version. The fix is to mark all table fields as set for REPLACE statement. Note that in 5.1 we already solve this problem by notifying handler that it should save values from all fields only in case when real replacement happens.
2006-07-01 23:51:10 +02:00
if (duplic == DUP_REPLACE &&
(!table->triggers || !table->triggers->has_delete_triggers()))
table->file->extra(HA_EXTRA_WRITE_CANNOT_REPLACE);
2000-07-31 21:29:14 +02:00
if (error)
goto abort;
WL#4738 streamline/simplify @@variable creation process Bug#16565 mysqld --help --verbose does not order variablesBug#20413 sql_slave_skip_counter is not shown in show variables Bug#20415 Output of mysqld --help --verbose is incomplete Bug#25430 variable not found in SELECT @@global.ft_max_word_len; Bug#32902 plugin variables don't know their names Bug#34599 MySQLD Option and Variable Reference need to be consistent in formatting! Bug#34829 No default value for variable and setting default does not raise error Bug#34834 ? Is accepted as a valid sql mode Bug#34878 Few variables have default value according to documentation but error occurs Bug#34883 ft_boolean_syntax cant be assigned from user variable to global var. Bug#37187 `INFORMATION_SCHEMA`.`GLOBAL_VARIABLES`: inconsistent status Bug#40988 log_output_basic.test succeeded though syntactically false. Bug#41010 enum-style command-line options are not honoured (maria.maria-recover fails) Bug#42103 Setting key_buffer_size to a negative value may lead to very large allocations Bug#44691 Some plugins configured as MYSQL_PLUGIN_MANDATORY in can be disabled Bug#44797 plugins w/o command-line options have no disabling option in --help Bug#46314 string system variables don't support expressions Bug#46470 sys_vars.max_binlog_cache_size_basic_32 is broken Bug#46586 When using the plugin interface the type "set" for options caused a crash. Bug#47212 Crash in DBUG_PRINT in mysqltest.cc when trying to print octal number Bug#48758 mysqltest crashes on sys_vars.collation_server_basic in gcov builds Bug#49417 some complaints about mysqld --help --verbose output Bug#49540 DEFAULT value of binlog_format isn't the default value Bug#49640 ambiguous option '--skip-skip-myisam' (double skip prefix) Bug#49644 init_connect and \0 Bug#49645 init_slave and multi-byte characters Bug#49646 mysql --show-warnings crashes when server dies
2009-12-22 10:35:56 +01:00
if (values_list.elements == 1 && (!(thd->variables.option_bits & OPTION_WARNINGS) ||
2000-07-31 21:29:14 +02:00
!thd->cuted_fields))
{
my_ok(thd, info.copied + info.deleted +
((thd->client_capabilities & CLIENT_FOUND_ROWS) ?
info.touched : info.updated),
id);
}
else
{
2000-07-31 21:29:14 +02:00
char buff[160];
ha_rows updated=((thd->client_capabilities & CLIENT_FOUND_ROWS) ?
info.touched : info.updated);
if (ignore)
sprintf(buff, ER(ER_INSERT_INFO), (ulong) info.records,
(lock_type == TL_WRITE_DELAYED) ? (ulong) 0 :
(ulong) (info.records - info.copied),
(ulong) thd->warning_info->statement_warn_count());
2000-07-31 21:29:14 +02:00
else
sprintf(buff, ER(ER_INSERT_INFO), (ulong) info.records,
(ulong) (info.deleted + updated),
(ulong) thd->warning_info->statement_warn_count());
::my_ok(thd, info.copied + info.deleted + updated, id, buff);
2000-07-31 21:29:14 +02:00
}
thd->abort_on_warning= 0;
DBUG_RETURN(FALSE);
2000-07-31 21:29:14 +02:00
abort:
#ifndef EMBEDDED_LIBRARY
2000-07-31 21:29:14 +02:00
if (lock_type == TL_WRITE_DELAYED)
end_delayed_insert(thd);
#endif
if (table != NULL)
WL#3146 "less locking in auto_increment": this is a cleanup patch for our current auto_increment handling: new names for auto_increment variables in THD, new methods to manipulate them (see sql_class.h), some move into handler::, causing less backup/restore work when executing substatements. This makes the logic hopefully clearer, less work is is needed in mysql_insert(). By cleaning up, using different variables for different purposes (instead of one for 3 things...), we fix those bugs, which someone may want to fix in 5.0 too: BUG#20339 "stored procedure using LAST_INSERT_ID() does not replicate statement-based" BUG#20341 "stored function inserting into one auto_increment puts bad data in slave" BUG#19243 "wrong LAST_INSERT_ID() after ON DUPLICATE KEY UPDATE" (now if a row is updated, LAST_INSERT_ID() will return its id) and re-fixes: BUG#6880 "LAST_INSERT_ID() value changes during multi-row INSERT" (already fixed differently by Ramil in 4.1) Test of documented behaviour of mysql_insert_id() (there was no test). The behaviour changes introduced are: - LAST_INSERT_ID() now returns "the first autogenerated auto_increment value successfully inserted", instead of "the first autogenerated auto_increment value if any row was successfully inserted", see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() returns the id of the updated row if ON DUPLICATE KEY UPDATE, see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() does not change if no autogenerated value was successfully inserted (it used to then be 0), see auto_increment.test. - if in INSERT SELECT no autogenerated value was successfully inserted, mysql_insert_id() now returns the id of the last inserted row (it already did this for INSERT VALUES), see mysql_client_test.c. - if INSERT SELECT uses LAST_INSERT_ID(X), mysql_insert_id() now returns X (it already did this for INSERT VALUES), see mysql_client_test.c. - NDB now behaves like other engines wrt SET INSERT_ID: with INSERT IGNORE, the id passed in SET INSERT_ID is re-used until a row succeeds; SET INSERT_ID influences not only the first row now. Additionally, when unlocking a table we check that the thread is not keeping a next_insert_id (as the table is unlocked that id is potentially out-of-date); forgetting about this next_insert_id is done in a new handler::ha_release_auto_increment(). Finally we prepare for engines capable of reserving finite-length intervals of auto_increment values: we store such intervals in THD. The next step (to be done by the replication team in 5.1) is to read those intervals from THD and actually store them in the statement-based binary log. NDB will be a good engine to test that.
2006-07-09 17:52:19 +02:00
table->file->ha_release_auto_increment();
if (!joins_freed)
free_underlaid_joins(thd, &thd->lex->select_lex);
thd->abort_on_warning= 0;
DBUG_RETURN(TRUE);
2000-07-31 21:29:14 +02:00
}
2004-07-16 00:15:55 +02:00
/*
Additional check for insertability for VIEW
SYNOPSIS
check_view_insertability()
thd - thread handler
2004-07-16 00:15:55 +02:00
view - reference on VIEW
IMPLEMENTATION
A view is insertable if the folloings are true:
- All columns in the view are columns from a table
- All not used columns in table have a default values
- All field in view are unique (not referring to the same column)
2004-07-16 00:15:55 +02:00
RETURN
FALSE - OK
view->contain_auto_increment is 1 if and only if the view contains an
auto_increment field
2004-07-16 00:15:55 +02:00
TRUE - can't be used for insert
*/
static bool check_view_insertability(THD * thd, TABLE_LIST *view)
2004-07-16 00:15:55 +02:00
{
uint num= view->view->select_lex.item_list.elements;
2004-07-16 00:15:55 +02:00
TABLE *table= view->table;
Field_translator *trans_start= view->field_translation,
*trans_end= trans_start + num;
Field_translator *trans;
uint used_fields_buff_size= bitmap_buffer_size(table->s->fields);
2005-07-13 01:25:36 +02:00
uint32 *used_fields_buff= (uint32*)thd->alloc(used_fields_buff_size);
MY_BITMAP used_fields;
2006-08-01 06:49:43 +02:00
enum_mark_columns save_mark_used_columns= thd->mark_used_columns;
DBUG_ENTER("check_key_in_view");
if (!used_fields_buff)
DBUG_RETURN(TRUE); // EOM
2004-07-16 00:15:55 +02:00
DBUG_ASSERT(view->table != 0 && view->field_translation != 0);
(void) bitmap_init(&used_fields, used_fields_buff, table->s->fields, 0);
bitmap_clear_all(&used_fields);
2004-07-16 00:15:55 +02:00
view->contain_auto_increment= 0;
/*
we must not set query_id for fields as they're not
really used in this context
*/
2006-08-01 06:49:43 +02:00
thd->mark_used_columns= MARK_COLUMNS_NONE;
2004-07-16 00:15:55 +02:00
/* check simplicity and prepare unique test of view */
for (trans= trans_start; trans != trans_end; trans++)
2004-07-16 00:15:55 +02:00
{
if (!trans->item->fixed && trans->item->fix_fields(thd, &trans->item))
{
2006-08-01 06:49:43 +02:00
thd->mark_used_columns= save_mark_used_columns;
DBUG_RETURN(TRUE);
}
Item_field *field;
2004-07-16 00:15:55 +02:00
/* simple SELECT list entry (field without expression) */
2004-10-05 12:41:51 +02:00
if (!(field= trans->item->filed_for_view_update()))
{
2006-08-01 06:49:43 +02:00
thd->mark_used_columns= save_mark_used_columns;
2004-07-16 00:15:55 +02:00
DBUG_RETURN(TRUE);
}
if (field->field->unireg_check == Field::NEXT_NUMBER)
2004-07-16 00:15:55 +02:00
view->contain_auto_increment= 1;
/* prepare unique test */
2004-10-05 14:07:31 +02:00
/*
remove collation (or other transparent for update function) if we have
it
*/
trans->item= field;
2004-07-16 00:15:55 +02:00
}
2006-08-01 06:49:43 +02:00
thd->mark_used_columns= save_mark_used_columns;
2004-07-16 00:15:55 +02:00
/* unique test */
for (trans= trans_start; trans != trans_end; trans++)
2004-07-16 00:15:55 +02:00
{
/* Thanks to test above, we know that all columns are of type Item_field */
Item_field *field= (Item_field *)trans->item;
/* check fields belong to table in which we are inserting */
if (field->field->table == table &&
bitmap_fast_test_and_set(&used_fields, field->field->field_index))
2004-07-16 00:15:55 +02:00
DBUG_RETURN(TRUE);
}
DBUG_RETURN(FALSE);
}
2004-04-10 00:14:32 +02:00
/*
Check if table can be updated
2004-04-10 00:14:32 +02:00
SYNOPSIS
mysql_prepare_insert_check_table()
thd Thread handle
2004-11-25 01:23:13 +01:00
table_list Table list
fields List of fields to be updated
where Pointer to where clause
2004-11-25 01:23:13 +01:00
select_insert Check is making for SELECT ... INSERT
RETURN
2004-11-25 01:23:13 +01:00
FALSE ok
TRUE ERROR
2004-04-10 00:14:32 +02:00
*/
2004-11-25 01:23:13 +01:00
static bool mysql_prepare_insert_check_table(THD *thd, TABLE_LIST *table_list,
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
List<Item> &fields,
2004-11-25 01:23:13 +01:00
bool select_insert)
2004-04-10 00:14:32 +02:00
{
2004-07-16 00:15:55 +02:00
bool insert_into_view= (table_list->view != 0);
DBUG_ENTER("mysql_prepare_insert_check_table");
2004-07-16 00:15:55 +02:00
/*
first table in list is the one we'll INSERT into, requires INSERT_ACL.
all others require SELECT_ACL only. the ACL requirement below is for
new leaves only anyway (view-constituents), so check for SELECT rather
than INSERT.
*/
if (setup_tables_and_check_access(thd, &thd->lex->select_lex.context,
&thd->lex->select_lex.top_join_list,
table_list,
&thd->lex->select_lex.leaf_tables,
select_insert, INSERT_ACL, SELECT_ACL))
2004-11-25 01:23:13 +01:00
DBUG_RETURN(TRUE);
2004-07-16 00:15:55 +02:00
if (insert_into_view && !fields.elements)
{
thd->lex->empty_field_list_on_rset= 1;
if (!table_list->table)
{
my_error(ER_VIEW_NO_INSERT_FIELD_LIST, MYF(0),
table_list->view_db.str, table_list->view_name.str);
2004-11-25 01:23:13 +01:00
DBUG_RETURN(TRUE);
}
DBUG_RETURN(insert_view_fields(thd, &fields, table_list));
2004-07-16 00:15:55 +02:00
}
2004-11-25 01:23:13 +01:00
DBUG_RETURN(FALSE);
}
/*
Get extra info for tables we insert into
@param table table(TABLE object) we insert into,
might be NULL in case of view
@param table(TABLE_LIST object) or view we insert into
*/
static void prepare_for_positional_update(TABLE *table, TABLE_LIST *tables)
{
if (table)
{
if(table->reginfo.lock_type != TL_WRITE_DELAYED)
2009-07-03 10:39:22 +02:00
table->prepare_for_position();
return;
}
DBUG_ASSERT(tables->view);
List_iterator<TABLE_LIST> it(*tables->view_tables);
TABLE_LIST *tbl;
while ((tbl= it++))
prepare_for_positional_update(tbl->table, tbl);
return;
}
/*
Prepare items in INSERT statement
SYNOPSIS
mysql_prepare_insert()
thd Thread handler
table_list Global/local table list
2005-07-03 13:17:52 +02:00
table Table to insert into (can be NULL if table should
be taken from table_list->table)
2004-12-30 23:44:00 +01:00
where Where clause (for insert ... select)
select_insert TRUE if INSERT ... SELECT statement
check_fields TRUE if need to check that all INSERT fields are
given values.
abort_on_warning whether to report if some INSERT field is not
assigned as an error (TRUE) or as a warning (FALSE).
TODO (in far future)
In cases of:
INSERT INTO t1 SELECT a, sum(a) as sum1 from t2 GROUP BY a
ON DUPLICATE KEY ...
we should be able to refer to sum1 in the ON DUPLICATE KEY part
2004-04-10 00:14:32 +02:00
WARNING
You MUST set table->insert_values to 0 after calling this function
before releasing the table object.
2005-07-03 13:17:52 +02:00
RETURN VALUE
FALSE OK
TRUE error
*/
2005-07-03 13:17:52 +02:00
bool mysql_prepare_insert(THD *thd, TABLE_LIST *table_list,
TABLE *table, List<Item> &fields, List_item *values,
2004-12-22 12:54:39 +01:00
List<Item> &update_fields, List<Item> &update_values,
2004-12-30 23:44:00 +01:00
enum_duplicates duplic,
COND **where, bool select_insert,
bool check_fields, bool abort_on_warning)
{
SELECT_LEX *select_lex= &thd->lex->select_lex;
Name_resolution_context *context= &select_lex->context;
Name_resolution_context_state ctx_state;
bool insert_into_view= (table_list->view != 0);
bool res= 0;
table_map map= 0;
DBUG_ENTER("mysql_prepare_insert");
2004-11-25 01:23:13 +01:00
DBUG_PRINT("enter", ("table_list 0x%lx, table 0x%lx, view %d",
(ulong)table_list, (ulong)table,
(int)insert_into_view));
/* INSERT should have a SELECT or VALUES clause */
DBUG_ASSERT (!select_insert || !values);
2004-12-22 12:54:39 +01:00
/*
For subqueries in VALUES() we should not see the table in which we are
inserting (for INSERT ... SELECT this is done by changing table_list,
because INSERT ... SELECT share SELECT_LEX it with SELECT.
*/
if (!select_insert)
{
2005-07-03 13:17:52 +02:00
for (SELECT_LEX_UNIT *un= select_lex->first_inner_unit();
un;
un= un->next_unit())
{
for (SELECT_LEX *sl= un->first_select();
sl;
sl= sl->next_select())
{
sl->context.outer_context= 0;
}
}
}
2004-12-22 12:54:39 +01:00
if (duplic == DUP_UPDATE)
{
/* it should be allocated before Item::fix_fields() */
2004-12-22 12:54:39 +01:00
if (table_list->set_insert_values(thd->mem_root))
2004-12-30 23:44:00 +01:00
DBUG_RETURN(TRUE);
}
2004-12-22 12:54:39 +01:00
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
if (mysql_prepare_insert_check_table(thd, table_list, fields, select_insert))
2004-11-25 01:23:13 +01:00
DBUG_RETURN(TRUE);
2004-07-16 00:15:55 +02:00
/* Prepare the fields in the statement. */
if (values)
2004-04-10 00:14:32 +02:00
{
/* if we have INSERT ... VALUES () we cannot have a GROUP BY clause */
DBUG_ASSERT (!select_lex->group_list.elements);
/* Save the state of the current name resolution context. */
ctx_state.save_state(context, table_list);
/*
Perform name resolution only in the first table - 'table_list',
which is the table that is inserted into.
*/
table_list->next_local= 0;
context->resolve_in_table_list_only(table_list);
res= (setup_fields(thd, 0, *values, MARK_COLUMNS_READ, 0, 0) ||
check_insert_fields(thd, context->table_list, fields, *values,
!insert_into_view, 0, &map));
if (!res && check_fields)
{
bool saved_abort_on_warning= thd->abort_on_warning;
thd->abort_on_warning= abort_on_warning;
res= check_that_all_fields_are_given_values(thd,
table ? table :
context->table_list->table,
context->table_list);
thd->abort_on_warning= saved_abort_on_warning;
}
if (!res)
res= setup_fields(thd, 0, update_values, MARK_COLUMNS_READ, 0, 0);
if (!res && duplic == DUP_UPDATE)
{
select_lex->no_wrap_view_item= TRUE;
res= check_update_fields(thd, context->table_list, update_fields,
update_values, &map);
select_lex->no_wrap_view_item= FALSE;
}
/* Restore the current context. */
ctx_state.restore_state(context, table_list);
2005-07-03 13:17:52 +02:00
}
2005-07-03 13:17:52 +02:00
if (res)
DBUG_RETURN(res);
2004-07-16 00:15:55 +02:00
2004-11-25 01:23:13 +01:00
if (!table)
table= table_list->table;
if (!select_insert)
2004-04-10 00:14:32 +02:00
{
Item *fake_conds= 0;
TABLE_LIST *duplicate;
if ((duplicate= unique_table(thd, table_list, table_list->next_global, 1)))
{
update_non_unique_table_error(table_list, "INSERT", duplicate);
DBUG_RETURN(TRUE);
}
select_lex->fix_prepare_information(thd, &fake_conds, &fake_conds);
2005-07-03 13:17:52 +02:00
select_lex->first_execution= 0;
2004-04-10 00:14:32 +02:00
}
/*
Only call prepare_for_posistion() if we are not performing a DELAYED
operation. It will instead be executed by delayed insert thread.
*/
if (duplic == DUP_UPDATE || duplic == DUP_REPLACE)
prepare_for_positional_update(table, table_list);
DBUG_RETURN(FALSE);
2004-04-10 00:14:32 +02:00
}
2000-07-31 21:29:14 +02:00
/* Check if there is more uniq keys after field */
static int last_uniq_key(TABLE *table,uint keynr)
{
/*
When an underlying storage engine informs that the unique key
conflicts are not reported in the ascending order by setting
the HA_DUPLICATE_KEY_NOT_IN_ORDER flag, we cannot rely on this
information to determine the last key conflict.
The information about the last key conflict will be used to
do a replace of the new row on the conflicting row, rather
than doing a delete (of old row) + insert (of new row).
Hence check for this flag and disable replacing the last row
by returning 0 always. Returning 0 will result in doing
a delete + insert always.
*/
if (table->file->ha_table_flags() & HA_DUPLICATE_KEY_NOT_IN_ORDER)
return 0;
while (++keynr < table->s->keys)
2000-07-31 21:29:14 +02:00
if (table->key_info[keynr].flags & HA_NOSAME)
return 0;
return 1;
}
/*
Write a record to table with optional deleting of conflicting records,
invoke proper triggers if needed.
SYNOPSIS
write_record()
thd - thread context
table - table to which record should be written
info - COPY_INFO structure describing handling of duplicates
and which is used for counting number of records inserted
and deleted.
NOTE
Once this record will be written to table after insert trigger will
be invoked. If instead of inserting new record we will update old one
then both on update triggers will work instead. Similarly both on
delete triggers will be invoked if we will delete conflicting records.
(pushing for Andrei) Bug #27417 thd->no_trans_update.stmt lost value inside of SF-exec-stack Once had been set the flag might later got reset inside of a stored routine execution stack. The reason was in that there was no check if a new statement started at time of resetting. The artifact affects most of binlogable DML queries. Notice, that multi-update is wrapped up within bug@27716 fix, multi-delete bug@29136. Fixed with saving parent's statement flag of whether the statement modified non-transactional table, and unioning (merging) the value with that was gained in mysql_execute_command. Resettling thd->no_trans_update members into thd->transaction.`member`; Asserting code; Effectively the following properties are held. 1. At the end of a substatement thd->transaction.stmt.modified_non_trans_table reflects the fact if such a table got modified by the substatement. That also respects THD::really_abort_on_warnin() requirements. 2. Eventually thd->transaction.stmt.modified_non_trans_table will be computed as the union of the values of all invoked sub-statements. That fixes this bug#27417; Computing of thd->transaction.all.modified_non_trans_table is refined to base to the stmt's value for all the case including insert .. select statement which before the patch had an extra issue bug@28960. Minor issues are covered with mysql_load, mysql_delete, and binloggin of insert in to temp_table select. The supplied test verifies limitely, mostly asserts. The ultimate testing is defered for bug@13270, bug@23333.
2007-07-30 17:27:36 +02:00
Sets thd->transaction.stmt.modified_non_trans_table to TRUE if table which is updated didn't have
transactions.
RETURN VALUE
0 - success
non-0 - error
2000-07-31 21:29:14 +02:00
*/
int write_record(THD *thd, TABLE *table,COPY_INFO *info)
2000-07-31 21:29:14 +02:00
{
int error, trg_error= 0;
2000-07-31 21:29:14 +02:00
char *key=0;
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
MY_BITMAP *save_read_set, *save_write_set;
WL#3146 "less locking in auto_increment": this is a cleanup patch for our current auto_increment handling: new names for auto_increment variables in THD, new methods to manipulate them (see sql_class.h), some move into handler::, causing less backup/restore work when executing substatements. This makes the logic hopefully clearer, less work is is needed in mysql_insert(). By cleaning up, using different variables for different purposes (instead of one for 3 things...), we fix those bugs, which someone may want to fix in 5.0 too: BUG#20339 "stored procedure using LAST_INSERT_ID() does not replicate statement-based" BUG#20341 "stored function inserting into one auto_increment puts bad data in slave" BUG#19243 "wrong LAST_INSERT_ID() after ON DUPLICATE KEY UPDATE" (now if a row is updated, LAST_INSERT_ID() will return its id) and re-fixes: BUG#6880 "LAST_INSERT_ID() value changes during multi-row INSERT" (already fixed differently by Ramil in 4.1) Test of documented behaviour of mysql_insert_id() (there was no test). The behaviour changes introduced are: - LAST_INSERT_ID() now returns "the first autogenerated auto_increment value successfully inserted", instead of "the first autogenerated auto_increment value if any row was successfully inserted", see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() returns the id of the updated row if ON DUPLICATE KEY UPDATE, see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() does not change if no autogenerated value was successfully inserted (it used to then be 0), see auto_increment.test. - if in INSERT SELECT no autogenerated value was successfully inserted, mysql_insert_id() now returns the id of the last inserted row (it already did this for INSERT VALUES), see mysql_client_test.c. - if INSERT SELECT uses LAST_INSERT_ID(X), mysql_insert_id() now returns X (it already did this for INSERT VALUES), see mysql_client_test.c. - NDB now behaves like other engines wrt SET INSERT_ID: with INSERT IGNORE, the id passed in SET INSERT_ID is re-used until a row succeeds; SET INSERT_ID influences not only the first row now. Additionally, when unlocking a table we check that the thread is not keeping a next_insert_id (as the table is unlocked that id is potentially out-of-date); forgetting about this next_insert_id is done in a new handler::ha_release_auto_increment(). Finally we prepare for engines capable of reserving finite-length intervals of auto_increment values: we store such intervals in THD. The next step (to be done by the replication team in 5.1) is to read those intervals from THD and actually store them in the statement-based binary log. NDB will be a good engine to test that.
2006-07-09 17:52:19 +02:00
ulonglong prev_insert_id= table->file->next_insert_id;
ulonglong insert_id_for_cur_row= 0;
DBUG_ENTER("write_record");
2000-07-31 21:29:14 +02:00
info->records++;
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
save_read_set= table->read_set;
save_write_set= table->write_set;
2002-12-02 20:38:00 +01:00
if (info->handle_duplicates == DUP_REPLACE ||
info->handle_duplicates == DUP_UPDATE)
2000-07-31 21:29:14 +02:00
{
while ((error=table->file->ha_write_row(table->record[0])))
2000-07-31 21:29:14 +02:00
{
uint key_nr;
WL#3146 "less locking in auto_increment": this is a cleanup patch for our current auto_increment handling: new names for auto_increment variables in THD, new methods to manipulate them (see sql_class.h), some move into handler::, causing less backup/restore work when executing substatements. This makes the logic hopefully clearer, less work is is needed in mysql_insert(). By cleaning up, using different variables for different purposes (instead of one for 3 things...), we fix those bugs, which someone may want to fix in 5.0 too: BUG#20339 "stored procedure using LAST_INSERT_ID() does not replicate statement-based" BUG#20341 "stored function inserting into one auto_increment puts bad data in slave" BUG#19243 "wrong LAST_INSERT_ID() after ON DUPLICATE KEY UPDATE" (now if a row is updated, LAST_INSERT_ID() will return its id) and re-fixes: BUG#6880 "LAST_INSERT_ID() value changes during multi-row INSERT" (already fixed differently by Ramil in 4.1) Test of documented behaviour of mysql_insert_id() (there was no test). The behaviour changes introduced are: - LAST_INSERT_ID() now returns "the first autogenerated auto_increment value successfully inserted", instead of "the first autogenerated auto_increment value if any row was successfully inserted", see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() returns the id of the updated row if ON DUPLICATE KEY UPDATE, see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() does not change if no autogenerated value was successfully inserted (it used to then be 0), see auto_increment.test. - if in INSERT SELECT no autogenerated value was successfully inserted, mysql_insert_id() now returns the id of the last inserted row (it already did this for INSERT VALUES), see mysql_client_test.c. - if INSERT SELECT uses LAST_INSERT_ID(X), mysql_insert_id() now returns X (it already did this for INSERT VALUES), see mysql_client_test.c. - NDB now behaves like other engines wrt SET INSERT_ID: with INSERT IGNORE, the id passed in SET INSERT_ID is re-used until a row succeeds; SET INSERT_ID influences not only the first row now. Additionally, when unlocking a table we check that the thread is not keeping a next_insert_id (as the table is unlocked that id is potentially out-of-date); forgetting about this next_insert_id is done in a new handler::ha_release_auto_increment(). Finally we prepare for engines capable of reserving finite-length intervals of auto_increment values: we store such intervals in THD. The next step (to be done by the replication team in 5.1) is to read those intervals from THD and actually store them in the statement-based binary log. NDB will be a good engine to test that.
2006-07-09 17:52:19 +02:00
/*
If we do more than one iteration of this loop, from the second one the
row will have an explicit value in the autoinc field, which was set at
the first call of handler::update_auto_increment(). So we must save
the autogenerated value to avoid thd->insert_id_for_cur_row to become
0.
*/
if (table->file->insert_id_for_cur_row > 0)
insert_id_for_cur_row= table->file->insert_id_for_cur_row;
else
table->file->insert_id_for_cur_row= insert_id_for_cur_row;
bool is_duplicate_key_error;
if (table->file->is_fatal_error(error, HA_CHECK_DUP))
2000-07-31 21:29:14 +02:00
goto err;
is_duplicate_key_error= table->file->is_fatal_error(error, 0);
if (!is_duplicate_key_error)
{
/*
We come here when we had an ignorable error which is not a duplicate
key error. In this we ignore error if ignore flag is set, otherwise
report error as usual. We will not do any duplicate key processing.
*/
if (info->ignore)
goto ok_or_after_trg_err; /* Ignoring a not fatal error, return 0 */
goto err;
}
2000-07-31 21:29:14 +02:00
if ((int) (key_nr = table->file->get_dup_key(error)) < 0)
{
error= HA_ERR_FOUND_DUPP_KEY; /* Database can't find key */
2000-07-31 21:29:14 +02:00
goto err;
}
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
/* Read all columns for the row we are going to replace */
table->use_all_columns();
/*
Don't allow REPLACE to replace a row when a auto_increment column
was used. This ensures that we don't get a problem when the
whole range of the key has been used.
*/
2002-12-02 20:38:00 +01:00
if (info->handle_duplicates == DUP_REPLACE &&
table->next_number_field &&
key_nr == table->s->next_number_index &&
WL#3146 "less locking in auto_increment": this is a cleanup patch for our current auto_increment handling: new names for auto_increment variables in THD, new methods to manipulate them (see sql_class.h), some move into handler::, causing less backup/restore work when executing substatements. This makes the logic hopefully clearer, less work is is needed in mysql_insert(). By cleaning up, using different variables for different purposes (instead of one for 3 things...), we fix those bugs, which someone may want to fix in 5.0 too: BUG#20339 "stored procedure using LAST_INSERT_ID() does not replicate statement-based" BUG#20341 "stored function inserting into one auto_increment puts bad data in slave" BUG#19243 "wrong LAST_INSERT_ID() after ON DUPLICATE KEY UPDATE" (now if a row is updated, LAST_INSERT_ID() will return its id) and re-fixes: BUG#6880 "LAST_INSERT_ID() value changes during multi-row INSERT" (already fixed differently by Ramil in 4.1) Test of documented behaviour of mysql_insert_id() (there was no test). The behaviour changes introduced are: - LAST_INSERT_ID() now returns "the first autogenerated auto_increment value successfully inserted", instead of "the first autogenerated auto_increment value if any row was successfully inserted", see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() returns the id of the updated row if ON DUPLICATE KEY UPDATE, see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() does not change if no autogenerated value was successfully inserted (it used to then be 0), see auto_increment.test. - if in INSERT SELECT no autogenerated value was successfully inserted, mysql_insert_id() now returns the id of the last inserted row (it already did this for INSERT VALUES), see mysql_client_test.c. - if INSERT SELECT uses LAST_INSERT_ID(X), mysql_insert_id() now returns X (it already did this for INSERT VALUES), see mysql_client_test.c. - NDB now behaves like other engines wrt SET INSERT_ID: with INSERT IGNORE, the id passed in SET INSERT_ID is re-used until a row succeeds; SET INSERT_ID influences not only the first row now. Additionally, when unlocking a table we check that the thread is not keeping a next_insert_id (as the table is unlocked that id is potentially out-of-date); forgetting about this next_insert_id is done in a new handler::ha_release_auto_increment(). Finally we prepare for engines capable of reserving finite-length intervals of auto_increment values: we store such intervals in THD. The next step (to be done by the replication team in 5.1) is to read those intervals from THD and actually store them in the statement-based binary log. NDB will be a good engine to test that.
2006-07-09 17:52:19 +02:00
(insert_id_for_cur_row > 0))
goto err;
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
if (table->file->ha_table_flags() & HA_DUPLICATE_POS)
2000-07-31 21:29:14 +02:00
{
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
if (table->file->rnd_pos(table->record[1],table->file->dup_ref))
2000-07-31 21:29:14 +02:00
goto err;
}
else
{
if (table->file->extra(HA_EXTRA_FLUSH_CACHE)) /* Not needed with NISAM */
2000-07-31 21:29:14 +02:00
{
error=my_errno;
goto err;
}
2000-07-31 21:29:14 +02:00
if (!key)
{
if (!(key=(char*) my_safe_alloca(table->s->max_unique_length,
2000-07-31 21:29:14 +02:00
MAX_KEY_LENGTH)))
{
error=ENOMEM;
goto err;
}
}
WL#3817: Simplify string / memory area types and make things more consistent (first part) The following type conversions was done: - Changed byte to uchar - Changed gptr to uchar* - Change my_string to char * - Change my_size_t to size_t - Change size_s to size_t Removed declaration of byte, gptr, my_string, my_size_t and size_s. Following function parameter changes was done: - All string functions in mysys/strings was changed to use size_t instead of uint for string lengths. - All read()/write() functions changed to use size_t (including vio). - All protocoll functions changed to use size_t instead of uint - Functions that used a pointer to a string length was changed to use size_t* - Changed malloc(), free() and related functions from using gptr to use void * as this requires fewer casts in the code and is more in line with how the standard functions work. - Added extra length argument to dirname_part() to return the length of the created string. - Changed (at least) following functions to take uchar* as argument: - db_dump() - my_net_write() - net_write_command() - net_store_data() - DBUG_DUMP() - decimal2bin() & bin2decimal() - Changed my_compress() and my_uncompress() to use size_t. Changed one argument to my_uncompress() from a pointer to a value as we only return one value (makes function easier to use). - Changed type of 'pack_data' argument to packfrm() to avoid casts. - Changed in readfrm() and writefrom(), ha_discover and handler::discover() the type for argument 'frmdata' to uchar** to avoid casts. - Changed most Field functions to use uchar* instead of char* (reduced a lot of casts). - Changed field->val_xxx(xxx, new_ptr) to take const pointers. Other changes: - Removed a lot of not needed casts - Added a few new cast required by other changes - Added some cast to my_multi_malloc() arguments for safety (as string lengths needs to be uint, not size_t). - Fixed all calls to hash-get-key functions to use size_t*. (Needed to be done explicitely as this conflict was often hided by casting the function to hash_get_key). - Changed some buffers to memory regions to uchar* to avoid casts. - Changed some string lengths from uint to size_t. - Changed field->ptr to be uchar* instead of char*. This allowed us to get rid of a lot of casts. - Some changes from true -> TRUE, false -> FALSE, unsigned char -> uchar - Include zlib.h in some files as we needed declaration of crc32() - Changed MY_FILE_ERROR to be (size_t) -1. - Changed many variables to hold the result of my_read() / my_write() to be size_t. This was needed to properly detect errors (which are returned as (size_t) -1). - Removed some very old VMS code - Changed packfrm()/unpackfrm() to not be depending on uint size (portability fix) - Removed windows specific code to restore cursor position as this causes slowdown on windows and we should not mix read() and pread() calls anyway as this is not thread safe. Updated function comment to reflect this. Changed function that depended on original behavior of my_pwrite() to itself restore the cursor position (one such case). - Added some missing checking of return value of malloc(). - Changed definition of MOD_PAD_CHAR_TO_FULL_LENGTH to avoid 'long' overflow. - Changed type of table_def::m_size from my_size_t to ulong to reflect that m_size is the number of elements in the array, not a string/memory length. - Moved THD::max_row_length() to table.cc (as it's not depending on THD). Inlined max_row_length_blob() into this function. - More function comments - Fixed some compiler warnings when compiled without partitions. - Removed setting of LEX_STRING() arguments in declaration (portability fix). - Some trivial indentation/variable name changes. - Some trivial code simplifications: - Replaced some calls to alloc_root + memcpy to use strmake_root()/strdup_root(). - Changed some calls from memdup() to strmake() (Safety fix) - Simpler loops in client-simple.c
2007-05-10 11:59:39 +02:00
key_copy((uchar*) key,table->record[0],table->key_info+key_nr,0);
if ((error=(table->file->index_read_idx_map(table->record[1],key_nr,
(uchar*) key, HA_WHOLE_KEY,
HA_READ_KEY_EXACT))))
2000-07-31 21:29:14 +02:00
goto err;
}
2002-12-02 20:38:00 +01:00
if (info->handle_duplicates == DUP_UPDATE)
2000-07-31 21:29:14 +02:00
{
int res= 0;
/*
We don't check for other UNIQUE keys - the first row
that matches, is updated. If update causes a conflict again,
an error is returned
2002-12-02 20:38:00 +01:00
*/
DBUG_ASSERT(table->insert_values != NULL);
2003-05-03 01:16:56 +02:00
store_record(table,insert_values);
restore_record(table,record[1]);
2004-12-30 23:44:00 +01:00
DBUG_ASSERT(info->update_fields->elements ==
info->update_values->elements);
if (fill_record_n_invoke_before_triggers(thd, *info->update_fields,
*info->update_values,
info->ignore,
table->triggers,
TRG_EVENT_UPDATE))
goto before_trg_err;
/* CHECK OPTION for VIEW ... ON DUPLICATE KEY UPDATE ... */
if (info->view &&
(res= info->view->view_check_option(current_thd, info->ignore)) ==
VIEW_CHECK_SKIP)
goto ok_or_after_trg_err;
2004-12-30 23:44:00 +01:00
if (res == VIEW_CHECK_ERROR)
goto before_trg_err;
Manual merge from 5.0-rpl, of fixes for: 1) BUG#25507 "multi-row insert delayed + auto increment causes duplicate key entries on slave" (two concurrrent connections doing multi-row INSERT DELAYED to insert into an auto_increment column, caused replication slave to stop with "duplicate key error" (and binlog was wrong), and BUG#26116 "If multi-row INSERT DELAYED has errors, statement-based binlogging breaks" (the binlog was not accounting for all rows inserted, or slave could stop). The fix is that: in statement-based binlogging, a multi-row INSERT DELAYED is silently converted to a non-delayed INSERT. This is supposed to not affect many 5.1 users as in 5.1, the default binlog format is "mixed", which does not have the bug (the bug is only with binlog_format=STATEMENT). We should document how the system delayed_insert thread decides of its binlog format (which is not modified by this patch): this decision is taken when the thread is created and holds until it is terminated (is not affected by any later change via SET GLOBAL BINLOG_FORMAT). It is also not affected by the binlog format of the connection which issues INSERT DELAYED (this binlog format does not affect how the row will be binlogged). If one wants to change the binlog format of its server with SET GLOBAL BINLOG_FORMAT, it should do FLUSH TABLES to be sure all delayed_insert threads terminate and thus new threads are created, taking into account the new format. 2) BUG#24432 "INSERT... ON DUPLICATE KEY UPDATE skips auto_increment values". When in an INSERT ON DUPLICATE KEY UPDATE, using an autoincrement column, we inserted some autogenerated values and also updated some rows, some autogenerated values were not used (for example, even if 10 was the largest autoinc value in the table at the start of the statement, 12 could be the first autogenerated value inserted by the statement, instead of 11). One autogenerated value was lost per updated row. Led to exhausting the range of the autoincrement column faster. Bug introduced by fix of BUG#20188; present since 5.0.24 and 5.1.12. This bug breaks replication from a pre-5.0.24/pre-5.1.12 master. But the present bugfix, as it makes INSERT ON DUP KEY UPDATE behave like pre-5.0.24/pre-5.1.12, breaks replication from a [5.0.24,5.0.34]/[5.1.12,5.1.15] master to a fixed (5.0.36/5.1.16) slave! To warn users against this when they upgrade their slave, as agreed with the support team, we add code for a fixed slave to detect that it is connected to a buggy master in a situation (INSERT ON DUP KEY UPDATE into autoinc column) likely to break replication, in which case it cannot replicate so stops and prints a message to the slave's error log and to SHOW SLAVE STATUS. For 5.0.36->[5.0.24,5.0.34] replication or 5.1.16->[5.1.12,5.1.15] replication we cannot warn as master does not know the slave's version (but we always recommended to users to have slave at least as new as master). As agreed with support, I have asked for an alert to be put into the MySQL Network Monitoring and Advisory Service. 3) note that I'll re-enable rpl_insert_id as soon as 5.1-rpl gets the changes from the main 5.1.
2007-02-15 20:28:58 +01:00
table->file->restore_auto_increment(prev_insert_id);
if (table->next_number_field)
table->file->adjust_next_insert_id_after_explicit_value(
table->next_number_field->val_int());
info->touched++;
if (!records_are_comparable(table) || compare_records(table))
{
if ((error=table->file->ha_update_row(table->record[1],
table->record[0])) &&
error != HA_ERR_RECORD_IS_THE_SAME)
{
if (info->ignore &&
!table->file->is_fatal_error(error, HA_CHECK_DUP_KEY))
{
goto ok_or_after_trg_err;
}
goto err;
}
if (error != HA_ERR_RECORD_IS_THE_SAME)
info->updated++;
else
error= 0;
2007-03-19 23:37:33 +01:00
/*
If ON DUP KEY UPDATE updates a row instead of inserting one, it's
like a regular UPDATE statement: it should not affect the value of a
next SELECT LAST_INSERT_ID() or mysql_insert_id().
Except if LAST_INSERT_ID(#) was in the INSERT query, which is
handled separately by THD::arg_of_last_insert_id_function.
*/
insert_id_for_cur_row= table->file->insert_id_for_cur_row= 0;
trg_error= (table->triggers &&
table->triggers->process_triggers(thd, TRG_EVENT_UPDATE,
2007-03-19 23:37:33 +01:00
TRG_ACTION_AFTER, TRUE));
info->copied++;
}
if (table->next_number_field)
table->file->adjust_next_insert_id_after_explicit_value(
table->next_number_field->val_int());
info->touched++;
goto ok_or_after_trg_err;
2002-12-02 20:38:00 +01:00
}
else /* DUP_REPLACE */
{
2004-02-11 00:06:46 +01:00
/*
The manual defines the REPLACE semantics that it is either
an INSERT or DELETE(s) + INSERT; FOREIGN KEY checks in
InnoDB do not function in the defined way if we allow MySQL
to convert the latter operation internally to an UPDATE.
We also should not perform this conversion if we have
timestamp field with ON UPDATE which is different from DEFAULT.
Another case when conversion should not be performed is when
we have ON DELETE trigger on table so user may notice that
we cheat here. Note that it is ok to do such conversion for
tables which have ON UPDATE but have no ON DELETE triggers,
we just should not expose this fact to users by invoking
ON UPDATE triggers.
2004-02-11 00:06:46 +01:00
*/
if (last_uniq_key(table,key_nr) &&
!table->file->referenced_by_foreign_key() &&
(table->timestamp_field_type == TIMESTAMP_NO_AUTO_SET ||
table->timestamp_field_type == TIMESTAMP_AUTO_SET_ON_BOTH) &&
(!table->triggers || !table->triggers->has_delete_triggers()))
2002-12-02 20:38:00 +01:00
{
if ((error=table->file->ha_update_row(table->record[1],
table->record[0])) &&
error != HA_ERR_RECORD_IS_THE_SAME)
2002-12-02 20:38:00 +01:00
goto err;
if (error != HA_ERR_RECORD_IS_THE_SAME)
info->deleted++;
else
error= 0;
WL#3146 "less locking in auto_increment": this is a cleanup patch for our current auto_increment handling: new names for auto_increment variables in THD, new methods to manipulate them (see sql_class.h), some move into handler::, causing less backup/restore work when executing substatements. This makes the logic hopefully clearer, less work is is needed in mysql_insert(). By cleaning up, using different variables for different purposes (instead of one for 3 things...), we fix those bugs, which someone may want to fix in 5.0 too: BUG#20339 "stored procedure using LAST_INSERT_ID() does not replicate statement-based" BUG#20341 "stored function inserting into one auto_increment puts bad data in slave" BUG#19243 "wrong LAST_INSERT_ID() after ON DUPLICATE KEY UPDATE" (now if a row is updated, LAST_INSERT_ID() will return its id) and re-fixes: BUG#6880 "LAST_INSERT_ID() value changes during multi-row INSERT" (already fixed differently by Ramil in 4.1) Test of documented behaviour of mysql_insert_id() (there was no test). The behaviour changes introduced are: - LAST_INSERT_ID() now returns "the first autogenerated auto_increment value successfully inserted", instead of "the first autogenerated auto_increment value if any row was successfully inserted", see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() returns the id of the updated row if ON DUPLICATE KEY UPDATE, see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() does not change if no autogenerated value was successfully inserted (it used to then be 0), see auto_increment.test. - if in INSERT SELECT no autogenerated value was successfully inserted, mysql_insert_id() now returns the id of the last inserted row (it already did this for INSERT VALUES), see mysql_client_test.c. - if INSERT SELECT uses LAST_INSERT_ID(X), mysql_insert_id() now returns X (it already did this for INSERT VALUES), see mysql_client_test.c. - NDB now behaves like other engines wrt SET INSERT_ID: with INSERT IGNORE, the id passed in SET INSERT_ID is re-used until a row succeeds; SET INSERT_ID influences not only the first row now. Additionally, when unlocking a table we check that the thread is not keeping a next_insert_id (as the table is unlocked that id is potentially out-of-date); forgetting about this next_insert_id is done in a new handler::ha_release_auto_increment(). Finally we prepare for engines capable of reserving finite-length intervals of auto_increment values: we store such intervals in THD. The next step (to be done by the replication team in 5.1) is to read those intervals from THD and actually store them in the statement-based binary log. NDB will be a good engine to test that.
2006-07-09 17:52:19 +02:00
thd->record_first_successful_insert_id_in_cur_stmt(table->file->insert_id_for_cur_row);
/*
Since we pretend that we have done insert we should call
its after triggers.
*/
goto after_trg_n_copied_inc;
}
else
{
if (table->triggers &&
table->triggers->process_triggers(thd, TRG_EVENT_DELETE,
TRG_ACTION_BEFORE, TRUE))
goto before_trg_err;
if ((error=table->file->ha_delete_row(table->record[1])))
goto err;
info->deleted++;
if (!table->file->has_transactions())
(pushing for Andrei) Bug #27417 thd->no_trans_update.stmt lost value inside of SF-exec-stack Once had been set the flag might later got reset inside of a stored routine execution stack. The reason was in that there was no check if a new statement started at time of resetting. The artifact affects most of binlogable DML queries. Notice, that multi-update is wrapped up within bug@27716 fix, multi-delete bug@29136. Fixed with saving parent's statement flag of whether the statement modified non-transactional table, and unioning (merging) the value with that was gained in mysql_execute_command. Resettling thd->no_trans_update members into thd->transaction.`member`; Asserting code; Effectively the following properties are held. 1. At the end of a substatement thd->transaction.stmt.modified_non_trans_table reflects the fact if such a table got modified by the substatement. That also respects THD::really_abort_on_warnin() requirements. 2. Eventually thd->transaction.stmt.modified_non_trans_table will be computed as the union of the values of all invoked sub-statements. That fixes this bug#27417; Computing of thd->transaction.all.modified_non_trans_table is refined to base to the stmt's value for all the case including insert .. select statement which before the patch had an extra issue bug@28960. Minor issues are covered with mysql_load, mysql_delete, and binloggin of insert in to temp_table select. The supplied test verifies limitely, mostly asserts. The ultimate testing is defered for bug@13270, bug@23333.
2007-07-30 17:27:36 +02:00
thd->transaction.stmt.modified_non_trans_table= TRUE;
if (table->triggers &&
table->triggers->process_triggers(thd, TRG_EVENT_DELETE,
TRG_ACTION_AFTER, TRUE))
{
trg_error= 1;
goto ok_or_after_trg_err;
}
/* Let us attempt do write_row() once more */
2002-12-02 20:38:00 +01:00
}
2000-07-31 21:29:14 +02:00
}
}
/*
If more than one iteration of the above while loop is done, from the second
one the row being inserted will have an explicit value in the autoinc field,
which was set at the first call of handler::update_auto_increment(). This
value is saved to avoid thd->insert_id_for_cur_row becoming 0. Use this saved
autoinc value.
*/
if (table->file->insert_id_for_cur_row == 0)
table->file->insert_id_for_cur_row= insert_id_for_cur_row;
WL#3146 "less locking in auto_increment": this is a cleanup patch for our current auto_increment handling: new names for auto_increment variables in THD, new methods to manipulate them (see sql_class.h), some move into handler::, causing less backup/restore work when executing substatements. This makes the logic hopefully clearer, less work is is needed in mysql_insert(). By cleaning up, using different variables for different purposes (instead of one for 3 things...), we fix those bugs, which someone may want to fix in 5.0 too: BUG#20339 "stored procedure using LAST_INSERT_ID() does not replicate statement-based" BUG#20341 "stored function inserting into one auto_increment puts bad data in slave" BUG#19243 "wrong LAST_INSERT_ID() after ON DUPLICATE KEY UPDATE" (now if a row is updated, LAST_INSERT_ID() will return its id) and re-fixes: BUG#6880 "LAST_INSERT_ID() value changes during multi-row INSERT" (already fixed differently by Ramil in 4.1) Test of documented behaviour of mysql_insert_id() (there was no test). The behaviour changes introduced are: - LAST_INSERT_ID() now returns "the first autogenerated auto_increment value successfully inserted", instead of "the first autogenerated auto_increment value if any row was successfully inserted", see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() returns the id of the updated row if ON DUPLICATE KEY UPDATE, see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() does not change if no autogenerated value was successfully inserted (it used to then be 0), see auto_increment.test. - if in INSERT SELECT no autogenerated value was successfully inserted, mysql_insert_id() now returns the id of the last inserted row (it already did this for INSERT VALUES), see mysql_client_test.c. - if INSERT SELECT uses LAST_INSERT_ID(X), mysql_insert_id() now returns X (it already did this for INSERT VALUES), see mysql_client_test.c. - NDB now behaves like other engines wrt SET INSERT_ID: with INSERT IGNORE, the id passed in SET INSERT_ID is re-used until a row succeeds; SET INSERT_ID influences not only the first row now. Additionally, when unlocking a table we check that the thread is not keeping a next_insert_id (as the table is unlocked that id is potentially out-of-date); forgetting about this next_insert_id is done in a new handler::ha_release_auto_increment(). Finally we prepare for engines capable of reserving finite-length intervals of auto_increment values: we store such intervals in THD. The next step (to be done by the replication team in 5.1) is to read those intervals from THD and actually store them in the statement-based binary log. NDB will be a good engine to test that.
2006-07-09 17:52:19 +02:00
thd->record_first_successful_insert_id_in_cur_stmt(table->file->insert_id_for_cur_row);
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
/*
Restore column maps if they where replaced during an duplicate key
problem.
*/
if (table->read_set != save_read_set ||
table->write_set != save_write_set)
table->column_bitmaps_set(save_read_set, save_write_set);
2000-07-31 21:29:14 +02:00
}
else if ((error=table->file->ha_write_row(table->record[0])))
2000-07-31 21:29:14 +02:00
{
if (!info->ignore ||
table->file->is_fatal_error(error, HA_CHECK_DUP))
2000-07-31 21:29:14 +02:00
goto err;
WL#3146 "less locking in auto_increment": this is a cleanup patch for our current auto_increment handling: new names for auto_increment variables in THD, new methods to manipulate them (see sql_class.h), some move into handler::, causing less backup/restore work when executing substatements. This makes the logic hopefully clearer, less work is is needed in mysql_insert(). By cleaning up, using different variables for different purposes (instead of one for 3 things...), we fix those bugs, which someone may want to fix in 5.0 too: BUG#20339 "stored procedure using LAST_INSERT_ID() does not replicate statement-based" BUG#20341 "stored function inserting into one auto_increment puts bad data in slave" BUG#19243 "wrong LAST_INSERT_ID() after ON DUPLICATE KEY UPDATE" (now if a row is updated, LAST_INSERT_ID() will return its id) and re-fixes: BUG#6880 "LAST_INSERT_ID() value changes during multi-row INSERT" (already fixed differently by Ramil in 4.1) Test of documented behaviour of mysql_insert_id() (there was no test). The behaviour changes introduced are: - LAST_INSERT_ID() now returns "the first autogenerated auto_increment value successfully inserted", instead of "the first autogenerated auto_increment value if any row was successfully inserted", see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() returns the id of the updated row if ON DUPLICATE KEY UPDATE, see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() does not change if no autogenerated value was successfully inserted (it used to then be 0), see auto_increment.test. - if in INSERT SELECT no autogenerated value was successfully inserted, mysql_insert_id() now returns the id of the last inserted row (it already did this for INSERT VALUES), see mysql_client_test.c. - if INSERT SELECT uses LAST_INSERT_ID(X), mysql_insert_id() now returns X (it already did this for INSERT VALUES), see mysql_client_test.c. - NDB now behaves like other engines wrt SET INSERT_ID: with INSERT IGNORE, the id passed in SET INSERT_ID is re-used until a row succeeds; SET INSERT_ID influences not only the first row now. Additionally, when unlocking a table we check that the thread is not keeping a next_insert_id (as the table is unlocked that id is potentially out-of-date); forgetting about this next_insert_id is done in a new handler::ha_release_auto_increment(). Finally we prepare for engines capable of reserving finite-length intervals of auto_increment values: we store such intervals in THD. The next step (to be done by the replication team in 5.1) is to read those intervals from THD and actually store them in the statement-based binary log. NDB will be a good engine to test that.
2006-07-09 17:52:19 +02:00
table->file->restore_auto_increment(prev_insert_id);
goto ok_or_after_trg_err;
2000-07-31 21:29:14 +02:00
}
after_trg_n_copied_inc:
info->copied++;
WL#3146 "less locking in auto_increment": this is a cleanup patch for our current auto_increment handling: new names for auto_increment variables in THD, new methods to manipulate them (see sql_class.h), some move into handler::, causing less backup/restore work when executing substatements. This makes the logic hopefully clearer, less work is is needed in mysql_insert(). By cleaning up, using different variables for different purposes (instead of one for 3 things...), we fix those bugs, which someone may want to fix in 5.0 too: BUG#20339 "stored procedure using LAST_INSERT_ID() does not replicate statement-based" BUG#20341 "stored function inserting into one auto_increment puts bad data in slave" BUG#19243 "wrong LAST_INSERT_ID() after ON DUPLICATE KEY UPDATE" (now if a row is updated, LAST_INSERT_ID() will return its id) and re-fixes: BUG#6880 "LAST_INSERT_ID() value changes during multi-row INSERT" (already fixed differently by Ramil in 4.1) Test of documented behaviour of mysql_insert_id() (there was no test). The behaviour changes introduced are: - LAST_INSERT_ID() now returns "the first autogenerated auto_increment value successfully inserted", instead of "the first autogenerated auto_increment value if any row was successfully inserted", see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() returns the id of the updated row if ON DUPLICATE KEY UPDATE, see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() does not change if no autogenerated value was successfully inserted (it used to then be 0), see auto_increment.test. - if in INSERT SELECT no autogenerated value was successfully inserted, mysql_insert_id() now returns the id of the last inserted row (it already did this for INSERT VALUES), see mysql_client_test.c. - if INSERT SELECT uses LAST_INSERT_ID(X), mysql_insert_id() now returns X (it already did this for INSERT VALUES), see mysql_client_test.c. - NDB now behaves like other engines wrt SET INSERT_ID: with INSERT IGNORE, the id passed in SET INSERT_ID is re-used until a row succeeds; SET INSERT_ID influences not only the first row now. Additionally, when unlocking a table we check that the thread is not keeping a next_insert_id (as the table is unlocked that id is potentially out-of-date); forgetting about this next_insert_id is done in a new handler::ha_release_auto_increment(). Finally we prepare for engines capable of reserving finite-length intervals of auto_increment values: we store such intervals in THD. The next step (to be done by the replication team in 5.1) is to read those intervals from THD and actually store them in the statement-based binary log. NDB will be a good engine to test that.
2006-07-09 17:52:19 +02:00
thd->record_first_successful_insert_id_in_cur_stmt(table->file->insert_id_for_cur_row);
trg_error= (table->triggers &&
table->triggers->process_triggers(thd, TRG_EVENT_INSERT,
TRG_ACTION_AFTER, TRUE));
ok_or_after_trg_err:
2000-07-31 21:29:14 +02:00
if (key)
my_safe_afree(key,table->s->max_unique_length,MAX_KEY_LENGTH);
if (!table->file->has_transactions())
(pushing for Andrei) Bug #27417 thd->no_trans_update.stmt lost value inside of SF-exec-stack Once had been set the flag might later got reset inside of a stored routine execution stack. The reason was in that there was no check if a new statement started at time of resetting. The artifact affects most of binlogable DML queries. Notice, that multi-update is wrapped up within bug@27716 fix, multi-delete bug@29136. Fixed with saving parent's statement flag of whether the statement modified non-transactional table, and unioning (merging) the value with that was gained in mysql_execute_command. Resettling thd->no_trans_update members into thd->transaction.`member`; Asserting code; Effectively the following properties are held. 1. At the end of a substatement thd->transaction.stmt.modified_non_trans_table reflects the fact if such a table got modified by the substatement. That also respects THD::really_abort_on_warnin() requirements. 2. Eventually thd->transaction.stmt.modified_non_trans_table will be computed as the union of the values of all invoked sub-statements. That fixes this bug#27417; Computing of thd->transaction.all.modified_non_trans_table is refined to base to the stmt's value for all the case including insert .. select statement which before the patch had an extra issue bug@28960. Minor issues are covered with mysql_load, mysql_delete, and binloggin of insert in to temp_table select. The supplied test verifies limitely, mostly asserts. The ultimate testing is defered for bug@13270, bug@23333.
2007-07-30 17:27:36 +02:00
thd->transaction.stmt.modified_non_trans_table= TRUE;
DBUG_RETURN(trg_error);
2000-07-31 21:29:14 +02:00
err:
info->last_errno= error;
/* current_select is NULL if this is a delayed insert */
if (thd->lex->current_select)
thd->lex->current_select->no_error= 0; // Give error
2000-07-31 21:29:14 +02:00
table->file->print_error(error,MYF(0));
before_trg_err:
table->file->restore_auto_increment(prev_insert_id);
if (key)
my_safe_afree(key, table->s->max_unique_length, MAX_KEY_LENGTH);
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
table->column_bitmaps_set(save_read_set, save_write_set);
DBUG_RETURN(1);
2000-07-31 21:29:14 +02:00
}
/******************************************************************************
2002-07-25 00:00:56 +02:00
Check that all fields with arn't null_fields are used
2000-07-31 21:29:14 +02:00
******************************************************************************/
int check_that_all_fields_are_given_values(THD *thd, TABLE *entry,
TABLE_LIST *table_list)
2000-07-31 21:29:14 +02:00
{
int err= 0;
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
MY_BITMAP *write_set= entry->write_set;
2000-07-31 21:29:14 +02:00
for (Field **field=entry->field ; *field ; field++)
{
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
if (!bitmap_is_set(write_set, (*field)->field_index) &&
((*field)->flags & NO_DEFAULT_VALUE_FLAG) &&
((*field)->real_type() != MYSQL_TYPE_ENUM))
2000-07-31 21:29:14 +02:00
{
bool view= FALSE;
if (table_list)
{
2005-09-02 08:50:17 +02:00
table_list= table_list->top_table();
view= test(table_list->view);
}
if (view)
{
push_warning_printf(thd, MYSQL_ERROR::WARN_LEVEL_WARN,
ER_NO_DEFAULT_FOR_VIEW_FIELD,
ER(ER_NO_DEFAULT_FOR_VIEW_FIELD),
table_list->view_db.str,
table_list->view_name.str);
}
else
{
push_warning_printf(thd, MYSQL_ERROR::WARN_LEVEL_WARN,
ER_NO_DEFAULT_FOR_FIELD,
ER(ER_NO_DEFAULT_FOR_FIELD),
(*field)->field_name);
}
err= 1;
2000-07-31 21:29:14 +02:00
}
}
return thd->abort_on_warning ? err : 0;
2000-07-31 21:29:14 +02:00
}
/*****************************************************************************
2002-07-25 00:00:56 +02:00
Handling of delayed inserts
A thread is created for each table that one uses with the DELAYED attribute.
2000-07-31 21:29:14 +02:00
*****************************************************************************/
#ifndef EMBEDDED_LIBRARY
2000-07-31 21:29:14 +02:00
class delayed_row :public ilink {
public:
char *record;
2000-07-31 21:29:14 +02:00
enum_duplicates dup;
time_t start_time;
ulong sql_mode;
bool auto_increment_field_not_null;
WL#3146 "less locking in auto_increment": this is a cleanup patch for our current auto_increment handling: new names for auto_increment variables in THD, new methods to manipulate them (see sql_class.h), some move into handler::, causing less backup/restore work when executing substatements. This makes the logic hopefully clearer, less work is is needed in mysql_insert(). By cleaning up, using different variables for different purposes (instead of one for 3 things...), we fix those bugs, which someone may want to fix in 5.0 too: BUG#20339 "stored procedure using LAST_INSERT_ID() does not replicate statement-based" BUG#20341 "stored function inserting into one auto_increment puts bad data in slave" BUG#19243 "wrong LAST_INSERT_ID() after ON DUPLICATE KEY UPDATE" (now if a row is updated, LAST_INSERT_ID() will return its id) and re-fixes: BUG#6880 "LAST_INSERT_ID() value changes during multi-row INSERT" (already fixed differently by Ramil in 4.1) Test of documented behaviour of mysql_insert_id() (there was no test). The behaviour changes introduced are: - LAST_INSERT_ID() now returns "the first autogenerated auto_increment value successfully inserted", instead of "the first autogenerated auto_increment value if any row was successfully inserted", see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() returns the id of the updated row if ON DUPLICATE KEY UPDATE, see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() does not change if no autogenerated value was successfully inserted (it used to then be 0), see auto_increment.test. - if in INSERT SELECT no autogenerated value was successfully inserted, mysql_insert_id() now returns the id of the last inserted row (it already did this for INSERT VALUES), see mysql_client_test.c. - if INSERT SELECT uses LAST_INSERT_ID(X), mysql_insert_id() now returns X (it already did this for INSERT VALUES), see mysql_client_test.c. - NDB now behaves like other engines wrt SET INSERT_ID: with INSERT IGNORE, the id passed in SET INSERT_ID is re-used until a row succeeds; SET INSERT_ID influences not only the first row now. Additionally, when unlocking a table we check that the thread is not keeping a next_insert_id (as the table is unlocked that id is potentially out-of-date); forgetting about this next_insert_id is done in a new handler::ha_release_auto_increment(). Finally we prepare for engines capable of reserving finite-length intervals of auto_increment values: we store such intervals in THD. The next step (to be done by the replication team in 5.1) is to read those intervals from THD and actually store them in the statement-based binary log. NDB will be a good engine to test that.
2006-07-09 17:52:19 +02:00
bool query_start_used, ignore, log_query;
bool stmt_depends_on_first_successful_insert_id_in_prev_stmt;
ulonglong first_successful_insert_id_in_prev_stmt;
ulonglong forced_insert_id;
ulong auto_increment_increment;
ulong auto_increment_offset;
timestamp_auto_set_type timestamp_field_type;
LEX_STRING query;
Time_zone *time_zone;
2000-07-31 21:29:14 +02:00
delayed_row(LEX_STRING const query_arg, enum_duplicates dup_arg,
bool ignore_arg, bool log_query_arg)
: record(0), dup(dup_arg), ignore(ignore_arg), log_query(log_query_arg),
2009-03-24 15:52:35 +01:00
forced_insert_id(0), query(query_arg), time_zone(0)
{}
2000-07-31 21:29:14 +02:00
~delayed_row()
{
Bug#34043: Server loops excessively in _checkchunk() when safemalloc is enabled Essentially, the problem is that safemalloc is excruciatingly slow as it checks all allocated blocks for overrun at each memory management primitive, yielding a almost exponential slowdown for the memory management functions (malloc, realloc, free). The overrun check basically consists of verifying some bytes of a block for certain magic keys, which catches some simple forms of overrun. Another minor problem is violation of aliasing rules and that its own internal list of blocks is prone to corruption. Another issue with safemalloc is rather the maintenance cost as the tool has a significant impact on the server code. Given the magnitude of memory debuggers available nowadays, especially those that are provided with the platform malloc implementation, maintenance of a in-house and largely obsolete memory debugger becomes a burden that is not worth the effort due to its slowness and lack of support for detecting more common forms of heap corruption. Since there are third-party tools that can provide the same functionality at a lower or comparable performance cost, the solution is to simply remove safemalloc. Third-party tools can provide the same functionality at a lower or comparable performance cost. The removal of safemalloc also allows a simplification of the malloc wrappers, removing quite a bit of kludge: redefinition of my_malloc, my_free and the removal of the unused second argument of my_free. Since free() always check whether the supplied pointer is null, redudant checks are also removed. Also, this patch adds unit testing for my_malloc and moves my_realloc implementation into the same file as the other memory allocation primitives.
2010-07-08 23:20:08 +02:00
my_free(query.str);
my_free(record);
2000-07-31 21:29:14 +02:00
}
};
/**
Delayed_insert - context of a thread responsible for delayed insert
into one table. When processing delayed inserts, we create an own
thread for every distinct table. Later on all delayed inserts directed
into that table are handled by a dedicated thread.
*/
2000-07-31 21:29:14 +02:00
class Delayed_insert :public ilink {
2000-07-31 21:29:14 +02:00
uint locks_in_memory;
thr_lock_type delayed_lock;
2000-07-31 21:29:14 +02:00
public:
THD thd;
TABLE *table;
mysql_mutex_t mutex;
mysql_cond_t cond, cond_client;
2000-07-31 21:29:14 +02:00
volatile uint tables_in_use,stacked_inserts;
volatile bool status;
Patch that refactors global read lock implementation and fixes bug #57006 "Deadlock between HANDLER and FLUSH TABLES WITH READ LOCK" and bug #54673 "It takes too long to get readlock for 'FLUSH TABLES WITH READ LOCK'". The first bug manifested itself as a deadlock which occurred when a connection, which had some table open through HANDLER statement, tried to update some data through DML statement while another connection tried to execute FLUSH TABLES WITH READ LOCK concurrently. What happened was that FTWRL in the second connection managed to perform first step of GRL acquisition and thus blocked all upcoming DML. After that it started to wait for table open through HANDLER statement to be flushed. When the first connection tried to execute DML it has started to wait for GRL/the second connection creating deadlock. The second bug manifested itself as starvation of FLUSH TABLES WITH READ LOCK statements in cases when there was a constant stream of concurrent DML statements (in two or more connections). This has happened because requests for protection against GRL which were acquired by DML statements were ignoring presence of pending GRL and thus the latter was starved. This patch solves both these problems by re-implementing GRL using metadata locks. Similar to the old implementation acquisition of GRL in new implementation is two-step. During the first step we block all concurrent DML and DDL statements by acquiring global S metadata lock (each DML and DDL statement acquires global IX lock for its duration). During the second step we block commits by acquiring global S lock in COMMIT namespace (commit code acquires global IX lock in this namespace). Note that unlike in old implementation acquisition of protection against GRL in DML and DDL is semi-automatic. We assume that any statement which should be blocked by GRL will either open and acquires write-lock on tables or acquires metadata locks on objects it is going to modify. For any such statement global IX metadata lock is automatically acquired for its duration. The first problem is solved because waits for GRL become visible to deadlock detector in metadata locking subsystem and thus deadlocks like one in the first bug become impossible. The second problem is solved because global S locks which are used for GRL implementation are given preference over IX locks which are acquired by concurrent DML (and we can switch to fair scheduling in future if needed). Important change: FTWRL/GRL no longer blocks DML and DDL on temporary tables. Before this patch behavior was not consistent in this respect: in some cases DML/DDL statements on temporary tables were blocked while in others they were not. Since the main use cases for FTWRL are various forms of backups and temporary tables are not preserved during backups we have opted for consistently allowing DML/DDL on temporary tables during FTWRL/GRL. Important change: This patch changes thread state names which are used when DML/DDL of FTWRL is waiting for global read lock. It is now either "Waiting for global read lock" or "Waiting for commit lock" depending on the stage on which FTWRL is. Incompatible change: To solve deadlock in events code which was exposed by this patch we have to replace LOCK_event_metadata mutex with metadata locks on events. As result we have to prohibit DDL on events under LOCK TABLES. This patch also adds extensive test coverage for interaction of DML/DDL and FTWRL. Performance of new and old global read lock implementations in sysbench tests were compared. There were no significant difference between new and old implementations.
2010-11-11 18:11:05 +01:00
/**
When the handler thread starts, it clones a metadata lock ticket
Patch that refactors global read lock implementation and fixes bug #57006 "Deadlock between HANDLER and FLUSH TABLES WITH READ LOCK" and bug #54673 "It takes too long to get readlock for 'FLUSH TABLES WITH READ LOCK'". The first bug manifested itself as a deadlock which occurred when a connection, which had some table open through HANDLER statement, tried to update some data through DML statement while another connection tried to execute FLUSH TABLES WITH READ LOCK concurrently. What happened was that FTWRL in the second connection managed to perform first step of GRL acquisition and thus blocked all upcoming DML. After that it started to wait for table open through HANDLER statement to be flushed. When the first connection tried to execute DML it has started to wait for GRL/the second connection creating deadlock. The second bug manifested itself as starvation of FLUSH TABLES WITH READ LOCK statements in cases when there was a constant stream of concurrent DML statements (in two or more connections). This has happened because requests for protection against GRL which were acquired by DML statements were ignoring presence of pending GRL and thus the latter was starved. This patch solves both these problems by re-implementing GRL using metadata locks. Similar to the old implementation acquisition of GRL in new implementation is two-step. During the first step we block all concurrent DML and DDL statements by acquiring global S metadata lock (each DML and DDL statement acquires global IX lock for its duration). During the second step we block commits by acquiring global S lock in COMMIT namespace (commit code acquires global IX lock in this namespace). Note that unlike in old implementation acquisition of protection against GRL in DML and DDL is semi-automatic. We assume that any statement which should be blocked by GRL will either open and acquires write-lock on tables or acquires metadata locks on objects it is going to modify. For any such statement global IX metadata lock is automatically acquired for its duration. The first problem is solved because waits for GRL become visible to deadlock detector in metadata locking subsystem and thus deadlocks like one in the first bug become impossible. The second problem is solved because global S locks which are used for GRL implementation are given preference over IX locks which are acquired by concurrent DML (and we can switch to fair scheduling in future if needed). Important change: FTWRL/GRL no longer blocks DML and DDL on temporary tables. Before this patch behavior was not consistent in this respect: in some cases DML/DDL statements on temporary tables were blocked while in others they were not. Since the main use cases for FTWRL are various forms of backups and temporary tables are not preserved during backups we have opted for consistently allowing DML/DDL on temporary tables during FTWRL/GRL. Important change: This patch changes thread state names which are used when DML/DDL of FTWRL is waiting for global read lock. It is now either "Waiting for global read lock" or "Waiting for commit lock" depending on the stage on which FTWRL is. Incompatible change: To solve deadlock in events code which was exposed by this patch we have to replace LOCK_event_metadata mutex with metadata locks on events. As result we have to prohibit DDL on events under LOCK TABLES. This patch also adds extensive test coverage for interaction of DML/DDL and FTWRL. Performance of new and old global read lock implementations in sysbench tests were compared. There were no significant difference between new and old implementations.
2010-11-11 18:11:05 +01:00
which protects against GRL and ticket for the table to be inserted.
This is done to allow the deadlock detector to detect deadlocks
resulting from these locks.
Before this is done, the connection thread cannot safely exit
without causing problems for clone_ticket().
Once handler_thread_initialized has been set, it is safe for the
connection thread to exit.
Access to handler_thread_initialized is protected by di->mutex.
*/
bool handler_thread_initialized;
2000-07-31 21:29:14 +02:00
COPY_INFO info;
I_List<delayed_row> rows;
ulong group_count;
2001-09-01 09:38:16 +02:00
TABLE_LIST table_list; // Argument
Patch that refactors global read lock implementation and fixes bug #57006 "Deadlock between HANDLER and FLUSH TABLES WITH READ LOCK" and bug #54673 "It takes too long to get readlock for 'FLUSH TABLES WITH READ LOCK'". The first bug manifested itself as a deadlock which occurred when a connection, which had some table open through HANDLER statement, tried to update some data through DML statement while another connection tried to execute FLUSH TABLES WITH READ LOCK concurrently. What happened was that FTWRL in the second connection managed to perform first step of GRL acquisition and thus blocked all upcoming DML. After that it started to wait for table open through HANDLER statement to be flushed. When the first connection tried to execute DML it has started to wait for GRL/the second connection creating deadlock. The second bug manifested itself as starvation of FLUSH TABLES WITH READ LOCK statements in cases when there was a constant stream of concurrent DML statements (in two or more connections). This has happened because requests for protection against GRL which were acquired by DML statements were ignoring presence of pending GRL and thus the latter was starved. This patch solves both these problems by re-implementing GRL using metadata locks. Similar to the old implementation acquisition of GRL in new implementation is two-step. During the first step we block all concurrent DML and DDL statements by acquiring global S metadata lock (each DML and DDL statement acquires global IX lock for its duration). During the second step we block commits by acquiring global S lock in COMMIT namespace (commit code acquires global IX lock in this namespace). Note that unlike in old implementation acquisition of protection against GRL in DML and DDL is semi-automatic. We assume that any statement which should be blocked by GRL will either open and acquires write-lock on tables or acquires metadata locks on objects it is going to modify. For any such statement global IX metadata lock is automatically acquired for its duration. The first problem is solved because waits for GRL become visible to deadlock detector in metadata locking subsystem and thus deadlocks like one in the first bug become impossible. The second problem is solved because global S locks which are used for GRL implementation are given preference over IX locks which are acquired by concurrent DML (and we can switch to fair scheduling in future if needed). Important change: FTWRL/GRL no longer blocks DML and DDL on temporary tables. Before this patch behavior was not consistent in this respect: in some cases DML/DDL statements on temporary tables were blocked while in others they were not. Since the main use cases for FTWRL are various forms of backups and temporary tables are not preserved during backups we have opted for consistently allowing DML/DDL on temporary tables during FTWRL/GRL. Important change: This patch changes thread state names which are used when DML/DDL of FTWRL is waiting for global read lock. It is now either "Waiting for global read lock" or "Waiting for commit lock" depending on the stage on which FTWRL is. Incompatible change: To solve deadlock in events code which was exposed by this patch we have to replace LOCK_event_metadata mutex with metadata locks on events. As result we have to prohibit DDL on events under LOCK TABLES. This patch also adds extensive test coverage for interaction of DML/DDL and FTWRL. Performance of new and old global read lock implementations in sysbench tests were compared. There were no significant difference between new and old implementations.
2010-11-11 18:11:05 +01:00
/**
Request for IX metadata lock protecting against GRL which is
passed from connection thread to the handler thread.
*/
MDL_request grl_protection;
2000-07-31 21:29:14 +02:00
Delayed_insert()
:locks_in_memory(0), table(0),tables_in_use(0),stacked_inserts(0),
status(0), handler_thread_initialized(FALSE), group_count(0)
2000-07-31 21:29:14 +02:00
{
BUG#39934: Slave stops for engine that only support row-based logging General overview: The logic for switching to row format when binlog_format=MIXED had numerous flaws. The underlying problem was the lack of a consistent architecture. General purpose of this changeset: This changeset introduces an architecture for switching to row format when binlog_format=MIXED. It enforces the architecture where it has to. It leaves some bugs to be fixed later. It adds extensive tests to verify that unsafe statements work as expected and that appropriate errors are produced by problems with the selection of binlog format. It was not practical to split this into smaller pieces of work. Problem 1: To determine the logging mode, the code has to take several parameters into account (namely: (1) the value of binlog_format; (2) the capabilities of the engines; (3) the type of the current statement: normal, unsafe, or row injection). These parameters may conflict in several ways, namely: - binlog_format=STATEMENT for a row injection - binlog_format=STATEMENT for an unsafe statement - binlog_format=STATEMENT for an engine only supporting row logging - binlog_format=ROW for an engine only supporting statement logging - statement is unsafe and engine does not support row logging - row injection in a table that does not support statement logging - statement modifies one table that does not support row logging and one that does not support statement logging Several of these conflicts were not detected, or were detected with an inappropriate error message. The problem of BUG#39934 was that no appropriate error message was written for the case when an engine only supporting row logging executed a row injection with binlog_format=ROW. However, all above cases must be handled. Fix 1: Introduce new error codes (sql/share/errmsg.txt). Ensure that all conditions are detected and handled in decide_logging_format() Problem 2: The binlog format shall be determined once per statement, in decide_logging_format(). It shall not be changed before or after that. Before decide_logging_format() is called, all information necessary to determine the logging format must be available. This principle ensures that all unsafe statements are handled in a consistent way. However, this principle is not followed: thd->set_current_stmt_binlog_row_based_if_mixed() is called in several places, including from code executing UPDATE..LIMIT, INSERT..SELECT..LIMIT, DELETE..LIMIT, INSERT DELAYED, and SET @@binlog_format. After Problem 1 was fixed, that caused inconsistencies where these unsafe statements would not print the appropriate warnings or errors for some of the conflicts. Fix 2: Remove calls to THD::set_current_stmt_binlog_row_based_if_mixed() from code executed after decide_logging_format(). Compensate by calling the set_current_stmt_unsafe() at parse time. This way, all unsafe statements are detected by decide_logging_format(). Problem 3: INSERT DELAYED is not unsafe: it is logged in statement format even if binlog_format=MIXED, and no warning is printed even if binlog_format=STATEMENT. This is BUG#45825. Fix 3: Made INSERT DELAYED set itself to unsafe at parse time. This allows decide_logging_format() to detect that a warning should be printed or the binlog_format changed. Problem 4: LIMIT clause were not marked as unsafe when executed inside stored functions/triggers/views/prepared statements. This is BUG#45785. Fix 4: Make statements containing the LIMIT clause marked as unsafe at parse time, instead of at execution time. This allows propagating unsafe-ness to the view.
2009-07-14 21:31:19 +02:00
DBUG_ENTER("Delayed_insert constructor");
thd.security_ctx->user=(char*) delayed_user;
thd.security_ctx->host=(char*) my_localhost;
strmake(thd.security_ctx->priv_user, thd.security_ctx->user,
USERNAME_LENGTH);
2000-07-31 21:29:14 +02:00
thd.current_tablenr=0;
thd.command=COM_DELAYED_INSERT;
thd.lex->current_select= 0; // for my_message_sql
thd.lex->sql_command= SQLCOM_INSERT; // For innodb::store_lock()
Bug #45225 Locking: hang if drop table with no timeout This patch introduces timeouts for metadata locks. The timeout is specified in seconds using the new dynamic system variable "lock_wait_timeout" which has both GLOBAL and SESSION scopes. Allowed values range from 1 to 31536000 seconds (= 1 year). The default value is 1 year. The new server parameter "lock-wait-timeout" can be used to set the default value parameter upon server startup. "lock_wait_timeout" applies to all statements that use metadata locks. These include DML and DDL operations on tables, views, stored procedures and stored functions. They also include LOCK TABLES, FLUSH TABLES WITH READ LOCK and HANDLER statements. The patch also changes thr_lock.c code (table data locks used by MyISAM and other simplistic engines) to use the same system variable. InnoDB row locks are unaffected. One exception to the handling of the "lock_wait_timeout" variable is delayed inserts. All delayed inserts are executed with a timeout of 1 year regardless of the setting for the global variable. As the connection issuing the delayed insert gets no notification of delayed insert timeouts, we want to avoid unnecessary timeouts. It's important to note that the timeout value is used for each lock acquired and that one statement can take more than one lock. A statement can therefore block for longer than the lock_wait_timeout value before reporting a timeout error. When lock timeout occurs, ER_LOCK_WAIT_TIMEOUT is reported. Test case added to lock_multi.test.
2010-02-11 11:23:39 +01:00
/*
Prevent changes to global.lock_wait_timeout from affecting
delayed insert threads as any timeouts in delayed inserts
are not communicated to the client.
*/
thd.variables.lock_wait_timeout= LONG_TIMEOUT;
2000-07-31 21:29:14 +02:00
bzero((char*) &thd.net, sizeof(thd.net)); // Safety
bzero((char*) &table_list, sizeof(table_list)); // Safety
thd.system_thread= SYSTEM_THREAD_DELAYED_INSERT;
thd.security_ctx->host_or_ip= "";
2000-07-31 21:29:14 +02:00
bzero((char*) &info,sizeof(info));
mysql_mutex_init(key_delayed_insert_mutex, &mutex, MY_MUTEX_INIT_FAST);
mysql_cond_init(key_delayed_insert_cond, &cond, NULL);
mysql_cond_init(key_delayed_insert_cond_client, &cond_client, NULL);
mysql_mutex_lock(&LOCK_thread_count);
2000-07-31 21:29:14 +02:00
delayed_insert_threads++;
delayed_lock= global_system_variables.low_priority_updates ?
TL_WRITE_LOW_PRIORITY : TL_WRITE;
mysql_mutex_unlock(&LOCK_thread_count);
BUG#39934: Slave stops for engine that only support row-based logging General overview: The logic for switching to row format when binlog_format=MIXED had numerous flaws. The underlying problem was the lack of a consistent architecture. General purpose of this changeset: This changeset introduces an architecture for switching to row format when binlog_format=MIXED. It enforces the architecture where it has to. It leaves some bugs to be fixed later. It adds extensive tests to verify that unsafe statements work as expected and that appropriate errors are produced by problems with the selection of binlog format. It was not practical to split this into smaller pieces of work. Problem 1: To determine the logging mode, the code has to take several parameters into account (namely: (1) the value of binlog_format; (2) the capabilities of the engines; (3) the type of the current statement: normal, unsafe, or row injection). These parameters may conflict in several ways, namely: - binlog_format=STATEMENT for a row injection - binlog_format=STATEMENT for an unsafe statement - binlog_format=STATEMENT for an engine only supporting row logging - binlog_format=ROW for an engine only supporting statement logging - statement is unsafe and engine does not support row logging - row injection in a table that does not support statement logging - statement modifies one table that does not support row logging and one that does not support statement logging Several of these conflicts were not detected, or were detected with an inappropriate error message. The problem of BUG#39934 was that no appropriate error message was written for the case when an engine only supporting row logging executed a row injection with binlog_format=ROW. However, all above cases must be handled. Fix 1: Introduce new error codes (sql/share/errmsg.txt). Ensure that all conditions are detected and handled in decide_logging_format() Problem 2: The binlog format shall be determined once per statement, in decide_logging_format(). It shall not be changed before or after that. Before decide_logging_format() is called, all information necessary to determine the logging format must be available. This principle ensures that all unsafe statements are handled in a consistent way. However, this principle is not followed: thd->set_current_stmt_binlog_row_based_if_mixed() is called in several places, including from code executing UPDATE..LIMIT, INSERT..SELECT..LIMIT, DELETE..LIMIT, INSERT DELAYED, and SET @@binlog_format. After Problem 1 was fixed, that caused inconsistencies where these unsafe statements would not print the appropriate warnings or errors for some of the conflicts. Fix 2: Remove calls to THD::set_current_stmt_binlog_row_based_if_mixed() from code executed after decide_logging_format(). Compensate by calling the set_current_stmt_unsafe() at parse time. This way, all unsafe statements are detected by decide_logging_format(). Problem 3: INSERT DELAYED is not unsafe: it is logged in statement format even if binlog_format=MIXED, and no warning is printed even if binlog_format=STATEMENT. This is BUG#45825. Fix 3: Made INSERT DELAYED set itself to unsafe at parse time. This allows decide_logging_format() to detect that a warning should be printed or the binlog_format changed. Problem 4: LIMIT clause were not marked as unsafe when executed inside stored functions/triggers/views/prepared statements. This is BUG#45785. Fix 4: Make statements containing the LIMIT clause marked as unsafe at parse time, instead of at execution time. This allows propagating unsafe-ness to the view.
2009-07-14 21:31:19 +02:00
DBUG_VOID_RETURN;
2000-07-31 21:29:14 +02:00
}
~Delayed_insert()
2000-07-31 21:29:14 +02:00
{
/* The following is not really needed, but just for safety */
2000-07-31 21:29:14 +02:00
delayed_row *row;
while ((row=rows.get()))
delete row;
if (table)
{
2000-07-31 21:29:14 +02:00
close_thread_tables(&thd);
thd.mdl_context.release_transactional_locks();
}
mysql_mutex_lock(&LOCK_thread_count);
mysql_mutex_destroy(&mutex);
mysql_cond_destroy(&cond);
mysql_cond_destroy(&cond_client);
2000-07-31 21:29:14 +02:00
thd.unlink(); // Must be unlinked under lock
Bug#34043: Server loops excessively in _checkchunk() when safemalloc is enabled Essentially, the problem is that safemalloc is excruciatingly slow as it checks all allocated blocks for overrun at each memory management primitive, yielding a almost exponential slowdown for the memory management functions (malloc, realloc, free). The overrun check basically consists of verifying some bytes of a block for certain magic keys, which catches some simple forms of overrun. Another minor problem is violation of aliasing rules and that its own internal list of blocks is prone to corruption. Another issue with safemalloc is rather the maintenance cost as the tool has a significant impact on the server code. Given the magnitude of memory debuggers available nowadays, especially those that are provided with the platform malloc implementation, maintenance of a in-house and largely obsolete memory debugger becomes a burden that is not worth the effort due to its slowness and lack of support for detecting more common forms of heap corruption. Since there are third-party tools that can provide the same functionality at a lower or comparable performance cost, the solution is to simply remove safemalloc. Third-party tools can provide the same functionality at a lower or comparable performance cost. The removal of safemalloc also allows a simplification of the malloc wrappers, removing quite a bit of kludge: redefinition of my_malloc, my_free and the removal of the unused second argument of my_free. Since free() always check whether the supplied pointer is null, redudant checks are also removed. Also, this patch adds unit testing for my_malloc and moves my_realloc implementation into the same file as the other memory allocation primitives.
2010-07-08 23:20:08 +02:00
my_free(thd.query());
thd.security_ctx->user= thd.security_ctx->host=0;
2000-07-31 21:29:14 +02:00
thread_count--;
delayed_insert_threads--;
mysql_mutex_unlock(&LOCK_thread_count);
mysql_cond_broadcast(&COND_thread_count); /* Tell main we are ready */
2000-07-31 21:29:14 +02:00
}
/* The following is for checking when we can delete ourselves */
inline void lock()
{
locks_in_memory++; // Assume LOCK_delay_insert
}
void unlock()
{
mysql_mutex_lock(&LOCK_delayed_insert);
2000-07-31 21:29:14 +02:00
if (!--locks_in_memory)
{
mysql_mutex_lock(&mutex);
2000-07-31 21:29:14 +02:00
if (thd.killed && ! stacked_inserts && ! tables_in_use)
{
mysql_cond_signal(&cond);
2000-07-31 21:29:14 +02:00
status=1;
}
mysql_mutex_unlock(&mutex);
2000-07-31 21:29:14 +02:00
}
mysql_mutex_unlock(&LOCK_delayed_insert);
2000-07-31 21:29:14 +02:00
}
inline uint lock_count() { return locks_in_memory; }
TABLE* get_local_table(THD* client_thd);
bool open_and_lock_table();
2000-07-31 21:29:14 +02:00
bool handle_inserts(void);
};
I_List<Delayed_insert> delayed_threads;
2000-07-31 21:29:14 +02:00
/**
Return an instance of delayed insert thread that can handle
inserts into a given table, if it exists. Otherwise return NULL.
*/
2000-07-31 21:29:14 +02:00
static
Delayed_insert *find_handler(THD *thd, TABLE_LIST *table_list)
2000-07-31 21:29:14 +02:00
{
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
2007-02-22 16:03:08 +01:00
thd_proc_info(thd, "waiting for delay_list");
mysql_mutex_lock(&LOCK_delayed_insert); // Protect master list
I_List_iterator<Delayed_insert> it(delayed_threads);
Delayed_insert *di;
while ((di= it++))
2000-07-31 21:29:14 +02:00
{
if (!strcmp(table_list->db, di->table_list.db) &&
!strcmp(table_list->table_name, di->table_list.table_name))
2000-07-31 21:29:14 +02:00
{
di->lock();
2000-07-31 21:29:14 +02:00
break;
}
}
mysql_mutex_unlock(&LOCK_delayed_insert); // For unlink from list
return di;
2000-07-31 21:29:14 +02:00
}
/**
Attempt to find or create a delayed insert thread to handle inserts
into this table.
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
2007-05-16 07:51:05 +02:00
@return In case of success, table_list->table points to a local copy
of the delayed table or is set to NULL, which indicates a
request for lock upgrade. In case of failure, value of
table_list->table is undefined.
@retval TRUE - this thread ran out of resources OR
- a newly created delayed insert thread ran out of
resources OR
- the created thread failed to open and lock the table
(e.g. because it does not exist) OR
- the table opened in the created thread turned out to
be a view
@retval FALSE - table successfully opened OR
- too many delayed insert threads OR
- the table has triggers and we have to fall back to
a normal INSERT
Two latter cases indicate a request for lock upgrade.
XXX: why do we regard INSERT DELAYED into a view as an error and
do not simply perform a lock upgrade?
TODO: The approach with using two mutexes to work with the
delayed thread list -- LOCK_delayed_insert and
LOCK_delayed_create -- is redundant, and we only need one of
them to protect the list. The reason we have two locks is that
we do not want to block look-ups in the list while we're waiting
for the newly created thread to open the delayed table. However,
this wait itself is redundant -- we always call get_local_table
later on, and there wait again until the created thread acquires
a table lock.
As is redundant the concept of locks_in_memory, since we already
have another counter with similar semantics - tables_in_use,
both of them are devoted to counting the number of producers for
a given consumer (delayed insert thread), only at different
stages of producer-consumer relationship.
The 'status' variable in Delayed_insert is redundant
too, since there is already di->stacked_inserts.
*/
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
2007-05-16 07:51:05 +02:00
static
Patch that refactors global read lock implementation and fixes bug #57006 "Deadlock between HANDLER and FLUSH TABLES WITH READ LOCK" and bug #54673 "It takes too long to get readlock for 'FLUSH TABLES WITH READ LOCK'". The first bug manifested itself as a deadlock which occurred when a connection, which had some table open through HANDLER statement, tried to update some data through DML statement while another connection tried to execute FLUSH TABLES WITH READ LOCK concurrently. What happened was that FTWRL in the second connection managed to perform first step of GRL acquisition and thus blocked all upcoming DML. After that it started to wait for table open through HANDLER statement to be flushed. When the first connection tried to execute DML it has started to wait for GRL/the second connection creating deadlock. The second bug manifested itself as starvation of FLUSH TABLES WITH READ LOCK statements in cases when there was a constant stream of concurrent DML statements (in two or more connections). This has happened because requests for protection against GRL which were acquired by DML statements were ignoring presence of pending GRL and thus the latter was starved. This patch solves both these problems by re-implementing GRL using metadata locks. Similar to the old implementation acquisition of GRL in new implementation is two-step. During the first step we block all concurrent DML and DDL statements by acquiring global S metadata lock (each DML and DDL statement acquires global IX lock for its duration). During the second step we block commits by acquiring global S lock in COMMIT namespace (commit code acquires global IX lock in this namespace). Note that unlike in old implementation acquisition of protection against GRL in DML and DDL is semi-automatic. We assume that any statement which should be blocked by GRL will either open and acquires write-lock on tables or acquires metadata locks on objects it is going to modify. For any such statement global IX metadata lock is automatically acquired for its duration. The first problem is solved because waits for GRL become visible to deadlock detector in metadata locking subsystem and thus deadlocks like one in the first bug become impossible. The second problem is solved because global S locks which are used for GRL implementation are given preference over IX locks which are acquired by concurrent DML (and we can switch to fair scheduling in future if needed). Important change: FTWRL/GRL no longer blocks DML and DDL on temporary tables. Before this patch behavior was not consistent in this respect: in some cases DML/DDL statements on temporary tables were blocked while in others they were not. Since the main use cases for FTWRL are various forms of backups and temporary tables are not preserved during backups we have opted for consistently allowing DML/DDL on temporary tables during FTWRL/GRL. Important change: This patch changes thread state names which are used when DML/DDL of FTWRL is waiting for global read lock. It is now either "Waiting for global read lock" or "Waiting for commit lock" depending on the stage on which FTWRL is. Incompatible change: To solve deadlock in events code which was exposed by this patch we have to replace LOCK_event_metadata mutex with metadata locks on events. As result we have to prohibit DDL on events under LOCK TABLES. This patch also adds extensive test coverage for interaction of DML/DDL and FTWRL. Performance of new and old global read lock implementations in sysbench tests were compared. There were no significant difference between new and old implementations.
2010-11-11 18:11:05 +01:00
bool delayed_get_table(THD *thd, MDL_request *grl_protection_request,
TABLE_LIST *table_list)
2000-07-31 21:29:14 +02:00
{
int error;
Delayed_insert *di;
2000-07-31 21:29:14 +02:00
DBUG_ENTER("delayed_get_table");
A fix and a test case for Bug#19022 "Memory bug when switching db during trigger execution" Bug#17199 "Problem when view calls function from another database." Bug#18444 "Fully qualified stored function names don't work correctly in SELECT statements" Documentation note: this patch introduces a change in behaviour of prepared statements. This patch adds a few new invariants with regard to how THD::db should be used. These invariants should be preserved in future: - one should never refer to THD::db by pointer and always make a deep copy (strmake, strdup) - one should never compare two databases by pointer, but use strncmp or my_strncasecmp - TABLE_LIST object table->db should be always initialized in the parser or by creator of the object. For prepared statements it means that if the current database is changed after a statement is prepared, the database that was current at prepare remains active. This also means that you can not prepare a statement that implicitly refers to the current database if the latter is not set. This is not documented, and therefore needs documentation. This is NOT a change in behavior for almost all SQL statements except: - ALTER TABLE t1 RENAME t2 - OPTIMIZE TABLE t1 - ANALYZE TABLE t1 - TRUNCATE TABLE t1 -- until this patch t1 or t2 could be evaluated at the first execution of prepared statement. CURRENT_DATABASE() still works OK and is evaluated at every execution of prepared statement. Note, that in stored routines this is not an issue as the default database is the database of the stored procedure and "use" statement is prohibited in stored routines. This patch makes obsolete the use of check_db_used (it was never used in the old code too) and all other places that check for table->db and assign it from THD::db if it's NULL, except the parser. How this patch was created: THD::{db,db_length} were replaced with a LEX_STRING, THD::db. All the places that refer to THD::{db,db_length} were manually checked and: - if the place uses thd->db by pointer, it was fixed to make a deep copy - if a place compared two db pointers, it was fixed to compare them by value (via strcmp/my_strcasecmp, whatever was approproate) Then this intermediate patch was used to write a smaller patch that does the same thing but without a rename. TODO in 5.1: - remove check_db_used - deploy THD::set_db in mysql_change_db See also comments to individual files.
2006-06-26 22:47:52 +02:00
/* Must be set in the parser */
DBUG_ASSERT(table_list->db);
2000-07-31 21:29:14 +02:00
/* Find the thread which handles this table. */
if (!(di= find_handler(thd, table_list)))
2000-07-31 21:29:14 +02:00
{
/*
No match. Create a new thread to handle the table, but
no more than max_insert_delayed_threads.
*/
if (delayed_insert_threads >= thd->variables.max_insert_delayed_threads)
DBUG_RETURN(0);
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
2007-02-22 16:03:08 +01:00
thd_proc_info(thd, "Creating delayed handler");
mysql_mutex_lock(&LOCK_delayed_create);
/*
The first search above was done without LOCK_delayed_create.
Another thread might have created the handler in between. Search again.
*/
if (! (di= find_handler(thd, table_list)))
2000-07-31 21:29:14 +02:00
{
if (!(di= new Delayed_insert()))
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
2007-05-16 07:51:05 +02:00
goto end_create;
mysql_mutex_lock(&LOCK_thread_count);
2000-10-23 14:35:42 +02:00
thread_count++;
mysql_mutex_unlock(&LOCK_thread_count);
di->thd.set_db(table_list->db, (uint) strlen(table_list->db));
di->thd.set_query(my_strdup(table_list->table_name,
Bug#57306 SHOW PROCESSLIST does not display string literals well. Problem: Extended characters outside of ASCII range where not displayed properly in SHOW PROCESSLIST, because thd_info->query was always sent as system_character_set (utf8). This was wrong, because query buffer is never converted to utf8 - it is always have client character set. Fix: sending query buffer using query character set @ sql/sql_class.cc @ sql/sql_class.h Introducing a new class CSET_STRING, a LEX_STRING with character set. Adding set_query(&CSET_STRING) Adding reset_query(), to use instead of set_query(0, NULL). @ sql/event_data_objects.cc Using reset_query() @ sql/log_event.cc Using reset_query() Adding charset argument to set_query_and_id(). @ sql/slave.cc Using reset_query(). @ sql/sp_head.cc Changing backing up and restore code to use CSET_STRING. @ sql/sql_audit.h Using CSET_STRING. In the "else" branch it's OK not to use global_system_variables.character_set_client. &my_charset_latin1, which is set in constructor, is fine (verified with Sergey Vojtovich). @ sql/sql_insert.cc Using set_query() with proper character set: table_name is utf8. @ sql/sql_parse.cc Adding character set argument to set_query_and_id(). (This is the main point where thd->charset() is stored into thd->query_string.cs, for use in "SHOW PROCESSLIST".) Using reset_query(). @ sql/sql_prepare.cc Storing client character set into thd->query_string.cs. @ sql/sql_show.cc Using CSET_STRING to fetch and send charset-aware query information from threads. @ storage/myisam/ha_myisam.cc Using set_query() with proper character set: table_name is utf8. @ mysql-test/r/show_check.result @ mysql-test/t/show_check.test Adding tests
2010-11-18 15:08:32 +01:00
MYF(MY_WME | ME_FATALERROR)),
0, system_charset_info);
if (di->thd.db == NULL || di->thd.query() == NULL)
2000-07-31 21:29:14 +02:00
{
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
2007-05-16 07:51:05 +02:00
/* The error is reported */
delete di;
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
2007-05-16 07:51:05 +02:00
goto end_create;
2000-07-31 21:29:14 +02:00
}
di->table_list= *table_list; // Needed to open table
/* Replace volatile strings with local copies */
di->table_list.alias= di->table_list.table_name= di->thd.query();
di->table_list.db= di->thd.db;
Patch that refactors global read lock implementation and fixes bug #57006 "Deadlock between HANDLER and FLUSH TABLES WITH READ LOCK" and bug #54673 "It takes too long to get readlock for 'FLUSH TABLES WITH READ LOCK'". The first bug manifested itself as a deadlock which occurred when a connection, which had some table open through HANDLER statement, tried to update some data through DML statement while another connection tried to execute FLUSH TABLES WITH READ LOCK concurrently. What happened was that FTWRL in the second connection managed to perform first step of GRL acquisition and thus blocked all upcoming DML. After that it started to wait for table open through HANDLER statement to be flushed. When the first connection tried to execute DML it has started to wait for GRL/the second connection creating deadlock. The second bug manifested itself as starvation of FLUSH TABLES WITH READ LOCK statements in cases when there was a constant stream of concurrent DML statements (in two or more connections). This has happened because requests for protection against GRL which were acquired by DML statements were ignoring presence of pending GRL and thus the latter was starved. This patch solves both these problems by re-implementing GRL using metadata locks. Similar to the old implementation acquisition of GRL in new implementation is two-step. During the first step we block all concurrent DML and DDL statements by acquiring global S metadata lock (each DML and DDL statement acquires global IX lock for its duration). During the second step we block commits by acquiring global S lock in COMMIT namespace (commit code acquires global IX lock in this namespace). Note that unlike in old implementation acquisition of protection against GRL in DML and DDL is semi-automatic. We assume that any statement which should be blocked by GRL will either open and acquires write-lock on tables or acquires metadata locks on objects it is going to modify. For any such statement global IX metadata lock is automatically acquired for its duration. The first problem is solved because waits for GRL become visible to deadlock detector in metadata locking subsystem and thus deadlocks like one in the first bug become impossible. The second problem is solved because global S locks which are used for GRL implementation are given preference over IX locks which are acquired by concurrent DML (and we can switch to fair scheduling in future if needed). Important change: FTWRL/GRL no longer blocks DML and DDL on temporary tables. Before this patch behavior was not consistent in this respect: in some cases DML/DDL statements on temporary tables were blocked while in others they were not. Since the main use cases for FTWRL are various forms of backups and temporary tables are not preserved during backups we have opted for consistently allowing DML/DDL on temporary tables during FTWRL/GRL. Important change: This patch changes thread state names which are used when DML/DDL of FTWRL is waiting for global read lock. It is now either "Waiting for global read lock" or "Waiting for commit lock" depending on the stage on which FTWRL is. Incompatible change: To solve deadlock in events code which was exposed by this patch we have to replace LOCK_event_metadata mutex with metadata locks on events. As result we have to prohibit DDL on events under LOCK TABLES. This patch also adds extensive test coverage for interaction of DML/DDL and FTWRL. Performance of new and old global read lock implementations in sysbench tests were compared. There were no significant difference between new and old implementations.
2010-11-11 18:11:05 +01:00
/* We need the tickets so that they can be cloned in handle_delayed_insert */
di->grl_protection.init(MDL_key::GLOBAL, "", "",
MDL_INTENTION_EXCLUSIVE, MDL_STATEMENT);
di->grl_protection.ticket= grl_protection_request->ticket;
init_mdl_requests(&di->table_list);
di->table_list.mdl_request.ticket= table_list->mdl_request.ticket;
di->lock();
mysql_mutex_lock(&di->mutex);
if ((error= mysql_thread_create(key_thread_delayed_insert,
&di->thd.real_id, &connection_attrib,
handle_delayed_insert, (void*) di)))
2000-07-31 21:29:14 +02:00
{
DBUG_PRINT("error",
("Can't create thread to handle delayed insert (error %d)",
error));
mysql_mutex_unlock(&di->mutex);
di->unlock();
delete di;
my_error(ER_CANT_CREATE_THREAD, MYF(ME_FATALERROR), error);
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
2007-05-16 07:51:05 +02:00
goto end_create;
2000-07-31 21:29:14 +02:00
}
/*
Wait until table is open unless the handler thread or the connection
thread has been killed. Note that we in all cases must wait until the
handler thread has been properly initialized before exiting. Otherwise
we risk doing clone_ticket() on a ticket that is no longer valid.
*/
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
2007-02-22 16:03:08 +01:00
thd_proc_info(thd, "waiting for handler open");
while (!di->handler_thread_initialized ||
(!di->thd.killed && !di->table && !thd->killed))
2000-07-31 21:29:14 +02:00
{
mysql_cond_wait(&di->cond_client, &di->mutex);
2000-07-31 21:29:14 +02:00
}
mysql_mutex_unlock(&di->mutex);
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
2007-02-22 16:03:08 +01:00
thd_proc_info(thd, "got old table");
Backport of revno: 2617.69.34 Bug #45949 Assertion `!tables->table' in open_tables() on ALTER + INSERT DELAYED The assertion was caused by improperly closing tables when INSERT DELAYED needed to reopen tables. This patch replaces the call to close_thread_tables with close_tables_for_reopen which fixes the problem. The only way I was able to trigger the reopen code path and thus the assertion, was if ALTER TABLE killed the delayed insert thread and the delayed insert thread was able to enter the reopen code path before it noticed that thd->killed had been set. Note that in these cases reopen will always fail since open_table() will check thd->killed and return. This patch therefore adds two more thd->killed checks to minimize the chance of entering the reopen code path without hope for success. The patch also changes it so that if the delayed insert is killed using KILL_CONNECTION, the error message that is copied to the connection thread is ER_QUERY_INTERRUPTED rather than ER_SERVER_SHUTDOWN. This means that if INSERT DELAYED fails, the user will now see "Query execution was interrupted" rather than the misleading "Server shutdown in progress". No test case is supplied. This is for two reasons: 1) Unable to reproduce the error without having the delayed insert thread in a killed state which means that reopen is futile and was not supposed to be attempted. 2) Difficulty of using sync points in other threads than the connection thread. The patch has been successfully tested with the RQG and the grammar supplied in the bug description.
2009-12-09 12:51:45 +01:00
if (thd->killed)
{
di->unlock();
goto end_create;
}
if (di->thd.killed)
2000-07-31 21:29:14 +02:00
{
if (di->thd.is_error())
{
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
2007-05-16 07:51:05 +02:00
/*
Copy the error message. Note that we don't treat fatal
errors in the delayed thread as fatal errors in the
Backport of revno: 2617.69.34 Bug #45949 Assertion `!tables->table' in open_tables() on ALTER + INSERT DELAYED The assertion was caused by improperly closing tables when INSERT DELAYED needed to reopen tables. This patch replaces the call to close_thread_tables with close_tables_for_reopen which fixes the problem. The only way I was able to trigger the reopen code path and thus the assertion, was if ALTER TABLE killed the delayed insert thread and the delayed insert thread was able to enter the reopen code path before it noticed that thd->killed had been set. Note that in these cases reopen will always fail since open_table() will check thd->killed and return. This patch therefore adds two more thd->killed checks to minimize the chance of entering the reopen code path without hope for success. The patch also changes it so that if the delayed insert is killed using KILL_CONNECTION, the error message that is copied to the connection thread is ER_QUERY_INTERRUPTED rather than ER_SERVER_SHUTDOWN. This means that if INSERT DELAYED fails, the user will now see "Query execution was interrupted" rather than the misleading "Server shutdown in progress". No test case is supplied. This is for two reasons: 1) Unable to reproduce the error without having the delayed insert thread in a killed state which means that reopen is futile and was not supposed to be attempted. 2) Difficulty of using sync points in other threads than the connection thread. The patch has been successfully tested with the RQG and the grammar supplied in the bug description.
2009-12-09 12:51:45 +01:00
main thread. If delayed thread was killed, we don't
want to send "Server shutdown in progress" in the
INSERT THREAD.
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
2007-05-16 07:51:05 +02:00
*/
Backport of revno: 2617.69.34 Bug #45949 Assertion `!tables->table' in open_tables() on ALTER + INSERT DELAYED The assertion was caused by improperly closing tables when INSERT DELAYED needed to reopen tables. This patch replaces the call to close_thread_tables with close_tables_for_reopen which fixes the problem. The only way I was able to trigger the reopen code path and thus the assertion, was if ALTER TABLE killed the delayed insert thread and the delayed insert thread was able to enter the reopen code path before it noticed that thd->killed had been set. Note that in these cases reopen will always fail since open_table() will check thd->killed and return. This patch therefore adds two more thd->killed checks to minimize the chance of entering the reopen code path without hope for success. The patch also changes it so that if the delayed insert is killed using KILL_CONNECTION, the error message that is copied to the connection thread is ER_QUERY_INTERRUPTED rather than ER_SERVER_SHUTDOWN. This means that if INSERT DELAYED fails, the user will now see "Query execution was interrupted" rather than the misleading "Server shutdown in progress". No test case is supplied. This is for two reasons: 1) Unable to reproduce the error without having the delayed insert thread in a killed state which means that reopen is futile and was not supposed to be attempted. 2) Difficulty of using sync points in other threads than the connection thread. The patch has been successfully tested with the RQG and the grammar supplied in the bug description.
2009-12-09 12:51:45 +01:00
if (di->thd.stmt_da->sql_errno() == ER_SERVER_SHUTDOWN)
my_message(ER_QUERY_INTERRUPTED, ER(ER_QUERY_INTERRUPTED), MYF(0));
else
my_message(di->thd.stmt_da->sql_errno(), di->thd.stmt_da->message(),
MYF(0));
}
di->unlock();
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
2007-05-16 07:51:05 +02:00
goto end_create;
2000-07-31 21:29:14 +02:00
}
mysql_mutex_lock(&LOCK_delayed_insert);
delayed_threads.append(di);
mysql_mutex_unlock(&LOCK_delayed_insert);
2000-07-31 21:29:14 +02:00
}
mysql_mutex_unlock(&LOCK_delayed_create);
2000-07-31 21:29:14 +02:00
}
mysql_mutex_lock(&di->mutex);
table_list->table= di->get_local_table(thd);
mysql_mutex_unlock(&di->mutex);
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
2007-05-16 07:51:05 +02:00
if (table_list->table)
{
DBUG_ASSERT(! thd->is_error());
thd->di= di;
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
2007-05-16 07:51:05 +02:00
}
/* Unlock the delayed insert object after its last access. */
di->unlock();
DBUG_RETURN((table_list->table == NULL));
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
2007-05-16 07:51:05 +02:00
end_create:
mysql_mutex_unlock(&LOCK_delayed_create);
DBUG_RETURN(thd->is_error());
2000-07-31 21:29:14 +02:00
}
/**
As we can't let many client threads modify the same TABLE
structure of the dedicated delayed insert thread, we create an
own structure for each client thread. This includes a row
buffer to save the column values and new fields that point to
the new row buffer. The memory is allocated in the client
thread and is freed automatically.
@pre This function is called from the client thread. Delayed
insert thread mutex must be acquired before invoking this
function.
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
2007-05-16 07:51:05 +02:00
@return Not-NULL table object on success. NULL in case of an error,
which is set in client_thd.
2000-07-31 21:29:14 +02:00
*/
TABLE *Delayed_insert::get_local_table(THD* client_thd)
2000-07-31 21:29:14 +02:00
{
my_ptrdiff_t adjust_ptrs;
Field **field,**org_field, *found_next_number_field;
2000-07-31 21:29:14 +02:00
TABLE *copy;
TABLE_SHARE *share;
WL#3817: Simplify string / memory area types and make things more consistent (first part) The following type conversions was done: - Changed byte to uchar - Changed gptr to uchar* - Change my_string to char * - Change my_size_t to size_t - Change size_s to size_t Removed declaration of byte, gptr, my_string, my_size_t and size_s. Following function parameter changes was done: - All string functions in mysys/strings was changed to use size_t instead of uint for string lengths. - All read()/write() functions changed to use size_t (including vio). - All protocoll functions changed to use size_t instead of uint - Functions that used a pointer to a string length was changed to use size_t* - Changed malloc(), free() and related functions from using gptr to use void * as this requires fewer casts in the code and is more in line with how the standard functions work. - Added extra length argument to dirname_part() to return the length of the created string. - Changed (at least) following functions to take uchar* as argument: - db_dump() - my_net_write() - net_write_command() - net_store_data() - DBUG_DUMP() - decimal2bin() & bin2decimal() - Changed my_compress() and my_uncompress() to use size_t. Changed one argument to my_uncompress() from a pointer to a value as we only return one value (makes function easier to use). - Changed type of 'pack_data' argument to packfrm() to avoid casts. - Changed in readfrm() and writefrom(), ha_discover and handler::discover() the type for argument 'frmdata' to uchar** to avoid casts. - Changed most Field functions to use uchar* instead of char* (reduced a lot of casts). - Changed field->val_xxx(xxx, new_ptr) to take const pointers. Other changes: - Removed a lot of not needed casts - Added a few new cast required by other changes - Added some cast to my_multi_malloc() arguments for safety (as string lengths needs to be uint, not size_t). - Fixed all calls to hash-get-key functions to use size_t*. (Needed to be done explicitely as this conflict was often hided by casting the function to hash_get_key). - Changed some buffers to memory regions to uchar* to avoid casts. - Changed some string lengths from uint to size_t. - Changed field->ptr to be uchar* instead of char*. This allowed us to get rid of a lot of casts. - Some changes from true -> TRUE, false -> FALSE, unsigned char -> uchar - Include zlib.h in some files as we needed declaration of crc32() - Changed MY_FILE_ERROR to be (size_t) -1. - Changed many variables to hold the result of my_read() / my_write() to be size_t. This was needed to properly detect errors (which are returned as (size_t) -1). - Removed some very old VMS code - Changed packfrm()/unpackfrm() to not be depending on uint size (portability fix) - Removed windows specific code to restore cursor position as this causes slowdown on windows and we should not mix read() and pread() calls anyway as this is not thread safe. Updated function comment to reflect this. Changed function that depended on original behavior of my_pwrite() to itself restore the cursor position (one such case). - Added some missing checking of return value of malloc(). - Changed definition of MOD_PAD_CHAR_TO_FULL_LENGTH to avoid 'long' overflow. - Changed type of table_def::m_size from my_size_t to ulong to reflect that m_size is the number of elements in the array, not a string/memory length. - Moved THD::max_row_length() to table.cc (as it's not depending on THD). Inlined max_row_length_blob() into this function. - More function comments - Fixed some compiler warnings when compiled without partitions. - Removed setting of LEX_STRING() arguments in declaration (portability fix). - Some trivial indentation/variable name changes. - Some trivial code simplifications: - Replaced some calls to alloc_root + memcpy to use strmake_root()/strdup_root(). - Changed some calls from memdup() to strmake() (Safety fix) - Simpler loops in client-simple.c
2007-05-10 11:59:39 +02:00
uchar *bitmap;
DBUG_ENTER("Delayed_insert::get_local_table");
2000-07-31 21:29:14 +02:00
/* First request insert thread to get a lock */
status=1;
tables_in_use++;
if (!thd.lock) // Table is not locked
{
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
2007-02-22 16:03:08 +01:00
thd_proc_info(client_thd, "waiting for handler lock");
2010-02-02 00:22:16 +01:00
mysql_cond_signal(&cond); // Tell handler to lock table
while (!thd.killed && !thd.lock && ! client_thd->killed)
2000-07-31 21:29:14 +02:00
{
mysql_cond_wait(&cond_client, &mutex);
2000-07-31 21:29:14 +02:00
}
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
2007-02-22 16:03:08 +01:00
thd_proc_info(client_thd, "got handler lock");
2000-07-31 21:29:14 +02:00
if (client_thd->killed)
goto error;
if (thd.killed)
2000-07-31 21:29:14 +02:00
{
/*
Copy the error message. Note that we don't treat fatal
errors in the delayed thread as fatal errors in the
main thread. If delayed thread was killed, we don't
want to send "Server shutdown in progress" in the
INSERT THREAD.
The thread could be killed with an error message if
di->handle_inserts() or di->open_and_lock_table() fails.
The thread could be killed without an error message if
killed using mysql_notify_thread_having_shared_lock() or
kill_delayed_threads_for_table().
*/
if (!thd.is_error() || thd.stmt_da->sql_errno() == ER_SERVER_SHUTDOWN)
Backport of revno: 2617.69.34 Bug #45949 Assertion `!tables->table' in open_tables() on ALTER + INSERT DELAYED The assertion was caused by improperly closing tables when INSERT DELAYED needed to reopen tables. This patch replaces the call to close_thread_tables with close_tables_for_reopen which fixes the problem. The only way I was able to trigger the reopen code path and thus the assertion, was if ALTER TABLE killed the delayed insert thread and the delayed insert thread was able to enter the reopen code path before it noticed that thd->killed had been set. Note that in these cases reopen will always fail since open_table() will check thd->killed and return. This patch therefore adds two more thd->killed checks to minimize the chance of entering the reopen code path without hope for success. The patch also changes it so that if the delayed insert is killed using KILL_CONNECTION, the error message that is copied to the connection thread is ER_QUERY_INTERRUPTED rather than ER_SERVER_SHUTDOWN. This means that if INSERT DELAYED fails, the user will now see "Query execution was interrupted" rather than the misleading "Server shutdown in progress". No test case is supplied. This is for two reasons: 1) Unable to reproduce the error without having the delayed insert thread in a killed state which means that reopen is futile and was not supposed to be attempted. 2) Difficulty of using sync points in other threads than the connection thread. The patch has been successfully tested with the RQG and the grammar supplied in the bug description.
2009-12-09 12:51:45 +01:00
my_message(ER_QUERY_INTERRUPTED, ER(ER_QUERY_INTERRUPTED), MYF(0));
else
my_message(thd.stmt_da->sql_errno(), thd.stmt_da->message(), MYF(0));
goto error;
2000-07-31 21:29:14 +02:00
}
}
share= table->s;
2000-07-31 21:29:14 +02:00
Bug#16218 - Crash on insert delayed Bug#17294 - INSERT DELAYED puting an \n before data Bug#16611 - INSERT DELAYED corrupts data Bug#13707 - Server crash with INSERT DELAYED on MyISAM table Combined as Bug#16218. INSERT DELAYED crashed in 5.0 on a table with a varchar that could be NULL and was created pre-5.0 (Bugs 16218 and 13707). INSERT DELAYED corrupted data in 5.0 on a table with varchar fields that was created pre-5.0 (Bugs 17294 and 16611). In case of INSERT DELAYED the open table is copied from the delayed insert thread to be able to create a record for the queue. When copying the fields, a method was used that did convert old varchar to new varchar fields and did not set up some pointers into the record buffer of the table. The field conversion was guilty for the misinterpretation of the record contents by the delayed insert thread. The wrong pointer setup was guilty for the crashes. For Bug 13707 (Server crash with INSERT DELAYED on MyISAM table) I fixed the above mentioned method to set up one of the pointers. For Bug 16218 I set up the other pointers too. But when looking at the corruptions I got aware that converting the field type was totally wrong for INSERT DELAYED. The copied table is used to create a record that is to be sent to the delayed insert thread. Of course it can interpret the record correctly only if all field types are the same in both table objects. So I revoked the fix for Bug 13707 and changed the new_field() method so that it can suppress conversions. No test case as this is a migration problem. One needs to create a table with 4.x and use it with 5.x. I added two test scripts to the bug report.
2006-06-26 20:57:18 +02:00
/*
Allocate memory for the TABLE object, the field pointers array, and
one record buffer of reclength size. Normally a table has three
record buffers of rec_buff_length size, which includes alignment
bytes. Since the table copy is used for creating one record only,
the other record buffers and alignment are unnecessary.
*/
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
2007-02-22 16:03:08 +01:00
thd_proc_info(client_thd, "allocating local table");
copy= (TABLE*) client_thd->alloc(sizeof(*copy)+
(share->fields+1)*sizeof(Field**)+
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
share->reclength +
share->column_bitmap_size*2);
2000-07-31 21:29:14 +02:00
if (!copy)
goto error;
Bug#16218 - Crash on insert delayed Bug#17294 - INSERT DELAYED puting an \n before data Bug#16611 - INSERT DELAYED corrupts data Bug#13707 - Server crash with INSERT DELAYED on MyISAM table Combined as Bug#16218. INSERT DELAYED crashed in 5.0 on a table with a varchar that could be NULL and was created pre-5.0 (Bugs 16218 and 13707). INSERT DELAYED corrupted data in 5.0 on a table with varchar fields that was created pre-5.0 (Bugs 17294 and 16611). In case of INSERT DELAYED the open table is copied from the delayed insert thread to be able to create a record for the queue. When copying the fields, a method was used that did convert old varchar to new varchar fields and did not set up some pointers into the record buffer of the table. The field conversion was guilty for the misinterpretation of the record contents by the delayed insert thread. The wrong pointer setup was guilty for the crashes. For Bug 13707 (Server crash with INSERT DELAYED on MyISAM table) I fixed the above mentioned method to set up one of the pointers. For Bug 16218 I set up the other pointers too. But when looking at the corruptions I got aware that converting the field type was totally wrong for INSERT DELAYED. The copied table is used to create a record that is to be sent to the delayed insert thread. Of course it can interpret the record correctly only if all field types are the same in both table objects. So I revoked the fix for Bug 13707 and changed the new_field() method so that it can suppress conversions. No test case as this is a migration problem. One needs to create a table with 4.x and use it with 5.x. I added two test scripts to the bug report.
2006-06-26 20:57:18 +02:00
/* Copy the TABLE object. */
2000-07-31 21:29:14 +02:00
*copy= *table;
/* We don't need to change the file handler here */
Bug#16218 - Crash on insert delayed Bug#17294 - INSERT DELAYED puting an \n before data Bug#16611 - INSERT DELAYED corrupts data Bug#13707 - Server crash with INSERT DELAYED on MyISAM table Combined as Bug#16218. INSERT DELAYED crashed in 5.0 on a table with a varchar that could be NULL and was created pre-5.0 (Bugs 16218 and 13707). INSERT DELAYED corrupted data in 5.0 on a table with varchar fields that was created pre-5.0 (Bugs 17294 and 16611). In case of INSERT DELAYED the open table is copied from the delayed insert thread to be able to create a record for the queue. When copying the fields, a method was used that did convert old varchar to new varchar fields and did not set up some pointers into the record buffer of the table. The field conversion was guilty for the misinterpretation of the record contents by the delayed insert thread. The wrong pointer setup was guilty for the crashes. For Bug 13707 (Server crash with INSERT DELAYED on MyISAM table) I fixed the above mentioned method to set up one of the pointers. For Bug 16218 I set up the other pointers too. But when looking at the corruptions I got aware that converting the field type was totally wrong for INSERT DELAYED. The copied table is used to create a record that is to be sent to the delayed insert thread. Of course it can interpret the record correctly only if all field types are the same in both table objects. So I revoked the fix for Bug 13707 and changed the new_field() method so that it can suppress conversions. No test case as this is a migration problem. One needs to create a table with 4.x and use it with 5.x. I added two test scripts to the bug report.
2006-06-26 20:57:18 +02:00
/* Assign the pointers for the field pointers array and the record. */
field= copy->field= (Field**) (copy + 1);
WL#3817: Simplify string / memory area types and make things more consistent (first part) The following type conversions was done: - Changed byte to uchar - Changed gptr to uchar* - Change my_string to char * - Change my_size_t to size_t - Change size_s to size_t Removed declaration of byte, gptr, my_string, my_size_t and size_s. Following function parameter changes was done: - All string functions in mysys/strings was changed to use size_t instead of uint for string lengths. - All read()/write() functions changed to use size_t (including vio). - All protocoll functions changed to use size_t instead of uint - Functions that used a pointer to a string length was changed to use size_t* - Changed malloc(), free() and related functions from using gptr to use void * as this requires fewer casts in the code and is more in line with how the standard functions work. - Added extra length argument to dirname_part() to return the length of the created string. - Changed (at least) following functions to take uchar* as argument: - db_dump() - my_net_write() - net_write_command() - net_store_data() - DBUG_DUMP() - decimal2bin() & bin2decimal() - Changed my_compress() and my_uncompress() to use size_t. Changed one argument to my_uncompress() from a pointer to a value as we only return one value (makes function easier to use). - Changed type of 'pack_data' argument to packfrm() to avoid casts. - Changed in readfrm() and writefrom(), ha_discover and handler::discover() the type for argument 'frmdata' to uchar** to avoid casts. - Changed most Field functions to use uchar* instead of char* (reduced a lot of casts). - Changed field->val_xxx(xxx, new_ptr) to take const pointers. Other changes: - Removed a lot of not needed casts - Added a few new cast required by other changes - Added some cast to my_multi_malloc() arguments for safety (as string lengths needs to be uint, not size_t). - Fixed all calls to hash-get-key functions to use size_t*. (Needed to be done explicitely as this conflict was often hided by casting the function to hash_get_key). - Changed some buffers to memory regions to uchar* to avoid casts. - Changed some string lengths from uint to size_t. - Changed field->ptr to be uchar* instead of char*. This allowed us to get rid of a lot of casts. - Some changes from true -> TRUE, false -> FALSE, unsigned char -> uchar - Include zlib.h in some files as we needed declaration of crc32() - Changed MY_FILE_ERROR to be (size_t) -1. - Changed many variables to hold the result of my_read() / my_write() to be size_t. This was needed to properly detect errors (which are returned as (size_t) -1). - Removed some very old VMS code - Changed packfrm()/unpackfrm() to not be depending on uint size (portability fix) - Removed windows specific code to restore cursor position as this causes slowdown on windows and we should not mix read() and pread() calls anyway as this is not thread safe. Updated function comment to reflect this. Changed function that depended on original behavior of my_pwrite() to itself restore the cursor position (one such case). - Added some missing checking of return value of malloc(). - Changed definition of MOD_PAD_CHAR_TO_FULL_LENGTH to avoid 'long' overflow. - Changed type of table_def::m_size from my_size_t to ulong to reflect that m_size is the number of elements in the array, not a string/memory length. - Moved THD::max_row_length() to table.cc (as it's not depending on THD). Inlined max_row_length_blob() into this function. - More function comments - Fixed some compiler warnings when compiled without partitions. - Removed setting of LEX_STRING() arguments in declaration (portability fix). - Some trivial indentation/variable name changes. - Some trivial code simplifications: - Replaced some calls to alloc_root + memcpy to use strmake_root()/strdup_root(). - Changed some calls from memdup() to strmake() (Safety fix) - Simpler loops in client-simple.c
2007-05-10 11:59:39 +02:00
bitmap= (uchar*) (field + share->fields + 1);
copy->record[0]= (bitmap + share->column_bitmap_size * 2);
memcpy((char*) copy->record[0], (char*) table->record[0], share->reclength);
Bug#16218 - Crash on insert delayed Bug#17294 - INSERT DELAYED puting an \n before data Bug#16611 - INSERT DELAYED corrupts data Bug#13707 - Server crash with INSERT DELAYED on MyISAM table Combined as Bug#16218. INSERT DELAYED crashed in 5.0 on a table with a varchar that could be NULL and was created pre-5.0 (Bugs 16218 and 13707). INSERT DELAYED corrupted data in 5.0 on a table with varchar fields that was created pre-5.0 (Bugs 17294 and 16611). In case of INSERT DELAYED the open table is copied from the delayed insert thread to be able to create a record for the queue. When copying the fields, a method was used that did convert old varchar to new varchar fields and did not set up some pointers into the record buffer of the table. The field conversion was guilty for the misinterpretation of the record contents by the delayed insert thread. The wrong pointer setup was guilty for the crashes. For Bug 13707 (Server crash with INSERT DELAYED on MyISAM table) I fixed the above mentioned method to set up one of the pointers. For Bug 16218 I set up the other pointers too. But when looking at the corruptions I got aware that converting the field type was totally wrong for INSERT DELAYED. The copied table is used to create a record that is to be sent to the delayed insert thread. Of course it can interpret the record correctly only if all field types are the same in both table objects. So I revoked the fix for Bug 13707 and changed the new_field() method so that it can suppress conversions. No test case as this is a migration problem. One needs to create a table with 4.x and use it with 5.x. I added two test scripts to the bug report.
2006-06-26 20:57:18 +02:00
/*
Make a copy of all fields.
The copied fields need to point into the copied record. This is done
by copying the field objects with their old pointer values and then
"move" the pointers by the distance between the original and copied
records. That way we preserve the relative positions in the records.
*/
adjust_ptrs= PTR_BYTE_DIFF(copy->record[0], table->record[0]);
found_next_number_field= table->found_next_number_field;
for (org_field= table->field; *org_field; org_field++, field++)
2000-07-31 21:29:14 +02:00
{
Bug#16218 - Crash on insert delayed Bug#17294 - INSERT DELAYED puting an \n before data Bug#16611 - INSERT DELAYED corrupts data Bug#13707 - Server crash with INSERT DELAYED on MyISAM table Combined as Bug#16218. INSERT DELAYED crashed in 5.0 on a table with a varchar that could be NULL and was created pre-5.0 (Bugs 16218 and 13707). INSERT DELAYED corrupted data in 5.0 on a table with varchar fields that was created pre-5.0 (Bugs 17294 and 16611). In case of INSERT DELAYED the open table is copied from the delayed insert thread to be able to create a record for the queue. When copying the fields, a method was used that did convert old varchar to new varchar fields and did not set up some pointers into the record buffer of the table. The field conversion was guilty for the misinterpretation of the record contents by the delayed insert thread. The wrong pointer setup was guilty for the crashes. For Bug 13707 (Server crash with INSERT DELAYED on MyISAM table) I fixed the above mentioned method to set up one of the pointers. For Bug 16218 I set up the other pointers too. But when looking at the corruptions I got aware that converting the field type was totally wrong for INSERT DELAYED. The copied table is used to create a record that is to be sent to the delayed insert thread. Of course it can interpret the record correctly only if all field types are the same in both table objects. So I revoked the fix for Bug 13707 and changed the new_field() method so that it can suppress conversions. No test case as this is a migration problem. One needs to create a table with 4.x and use it with 5.x. I added two test scripts to the bug report.
2006-06-26 20:57:18 +02:00
if (!(*field= (*org_field)->new_field(client_thd->mem_root, copy, 1)))
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
2007-05-16 07:51:05 +02:00
goto error;
(*field)->orig_table= copy; // Remove connection
(*field)->move_field_offset(adjust_ptrs); // Point at copy->record[0]
if (*org_field == found_next_number_field)
(*field)->table->found_next_number_field= *field;
2000-07-31 21:29:14 +02:00
}
*field=0;
/* Adjust timestamp */
if (table->timestamp_field)
{
/* Restore offset as this may have been reset in handle_inserts */
copy->timestamp_field=
(Field_timestamp*) copy->field[share->timestamp_field_offset];
copy->timestamp_field->unireg_check= table->timestamp_field->unireg_check;
copy->timestamp_field_type= copy->timestamp_field->get_auto_set_type();
2000-07-31 21:29:14 +02:00
}
/* Adjust in_use for pointing to client thread */
copy->in_use= client_thd;
/* Adjust lock_count. This table object is not part of a lock. */
copy->lock_count= 0;
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
/* Adjust bitmaps */
copy->def_read_set.bitmap= (my_bitmap_map*) bitmap;
copy->def_write_set.bitmap= ((my_bitmap_map*)
(bitmap + share->column_bitmap_size));
copy->tmp_set.bitmap= 0; // To catch errors
bzero((char*) bitmap, share->column_bitmap_size*2);
copy->read_set= &copy->def_read_set;
copy->write_set= &copy->def_write_set;
Bug#16218 - Crash on insert delayed Bug#17294 - INSERT DELAYED puting an \n before data Bug#16611 - INSERT DELAYED corrupts data Bug#13707 - Server crash with INSERT DELAYED on MyISAM table Combined as Bug#16218. INSERT DELAYED crashed in 5.0 on a table with a varchar that could be NULL and was created pre-5.0 (Bugs 16218 and 13707). INSERT DELAYED corrupted data in 5.0 on a table with varchar fields that was created pre-5.0 (Bugs 17294 and 16611). In case of INSERT DELAYED the open table is copied from the delayed insert thread to be able to create a record for the queue. When copying the fields, a method was used that did convert old varchar to new varchar fields and did not set up some pointers into the record buffer of the table. The field conversion was guilty for the misinterpretation of the record contents by the delayed insert thread. The wrong pointer setup was guilty for the crashes. For Bug 13707 (Server crash with INSERT DELAYED on MyISAM table) I fixed the above mentioned method to set up one of the pointers. For Bug 16218 I set up the other pointers too. But when looking at the corruptions I got aware that converting the field type was totally wrong for INSERT DELAYED. The copied table is used to create a record that is to be sent to the delayed insert thread. Of course it can interpret the record correctly only if all field types are the same in both table objects. So I revoked the fix for Bug 13707 and changed the new_field() method so that it can suppress conversions. No test case as this is a migration problem. One needs to create a table with 4.x and use it with 5.x. I added two test scripts to the bug report.
2006-06-26 20:57:18 +02:00
DBUG_RETURN(copy);
2000-07-31 21:29:14 +02:00
/* Got fatal error */
error:
tables_in_use--;
status=1;
mysql_cond_signal(&cond); // Inform thread about abort
Bug#16218 - Crash on insert delayed Bug#17294 - INSERT DELAYED puting an \n before data Bug#16611 - INSERT DELAYED corrupts data Bug#13707 - Server crash with INSERT DELAYED on MyISAM table Combined as Bug#16218. INSERT DELAYED crashed in 5.0 on a table with a varchar that could be NULL and was created pre-5.0 (Bugs 16218 and 13707). INSERT DELAYED corrupted data in 5.0 on a table with varchar fields that was created pre-5.0 (Bugs 17294 and 16611). In case of INSERT DELAYED the open table is copied from the delayed insert thread to be able to create a record for the queue. When copying the fields, a method was used that did convert old varchar to new varchar fields and did not set up some pointers into the record buffer of the table. The field conversion was guilty for the misinterpretation of the record contents by the delayed insert thread. The wrong pointer setup was guilty for the crashes. For Bug 13707 (Server crash with INSERT DELAYED on MyISAM table) I fixed the above mentioned method to set up one of the pointers. For Bug 16218 I set up the other pointers too. But when looking at the corruptions I got aware that converting the field type was totally wrong for INSERT DELAYED. The copied table is used to create a record that is to be sent to the delayed insert thread. Of course it can interpret the record correctly only if all field types are the same in both table objects. So I revoked the fix for Bug 13707 and changed the new_field() method so that it can suppress conversions. No test case as this is a migration problem. One needs to create a table with 4.x and use it with 5.x. I added two test scripts to the bug report.
2006-06-26 20:57:18 +02:00
DBUG_RETURN(0);
2000-07-31 21:29:14 +02:00
}
/* Put a question in queue */
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
2007-05-16 07:51:05 +02:00
static
2007-05-16 08:32:47 +02:00
int write_delayed(THD *thd, TABLE *table, enum_duplicates duplic,
LEX_STRING query, bool ignore, bool log_on)
2000-07-31 21:29:14 +02:00
{
delayed_row *row= 0;
Delayed_insert *di=thd->di;
const Discrete_interval *forced_auto_inc;
2000-07-31 21:29:14 +02:00
DBUG_ENTER("write_delayed");
WL#3817: Simplify string / memory area types and make things more consistent (first part) The following type conversions was done: - Changed byte to uchar - Changed gptr to uchar* - Change my_string to char * - Change my_size_t to size_t - Change size_s to size_t Removed declaration of byte, gptr, my_string, my_size_t and size_s. Following function parameter changes was done: - All string functions in mysys/strings was changed to use size_t instead of uint for string lengths. - All read()/write() functions changed to use size_t (including vio). - All protocoll functions changed to use size_t instead of uint - Functions that used a pointer to a string length was changed to use size_t* - Changed malloc(), free() and related functions from using gptr to use void * as this requires fewer casts in the code and is more in line with how the standard functions work. - Added extra length argument to dirname_part() to return the length of the created string. - Changed (at least) following functions to take uchar* as argument: - db_dump() - my_net_write() - net_write_command() - net_store_data() - DBUG_DUMP() - decimal2bin() & bin2decimal() - Changed my_compress() and my_uncompress() to use size_t. Changed one argument to my_uncompress() from a pointer to a value as we only return one value (makes function easier to use). - Changed type of 'pack_data' argument to packfrm() to avoid casts. - Changed in readfrm() and writefrom(), ha_discover and handler::discover() the type for argument 'frmdata' to uchar** to avoid casts. - Changed most Field functions to use uchar* instead of char* (reduced a lot of casts). - Changed field->val_xxx(xxx, new_ptr) to take const pointers. Other changes: - Removed a lot of not needed casts - Added a few new cast required by other changes - Added some cast to my_multi_malloc() arguments for safety (as string lengths needs to be uint, not size_t). - Fixed all calls to hash-get-key functions to use size_t*. (Needed to be done explicitely as this conflict was often hided by casting the function to hash_get_key). - Changed some buffers to memory regions to uchar* to avoid casts. - Changed some string lengths from uint to size_t. - Changed field->ptr to be uchar* instead of char*. This allowed us to get rid of a lot of casts. - Some changes from true -> TRUE, false -> FALSE, unsigned char -> uchar - Include zlib.h in some files as we needed declaration of crc32() - Changed MY_FILE_ERROR to be (size_t) -1. - Changed many variables to hold the result of my_read() / my_write() to be size_t. This was needed to properly detect errors (which are returned as (size_t) -1). - Removed some very old VMS code - Changed packfrm()/unpackfrm() to not be depending on uint size (portability fix) - Removed windows specific code to restore cursor position as this causes slowdown on windows and we should not mix read() and pread() calls anyway as this is not thread safe. Updated function comment to reflect this. Changed function that depended on original behavior of my_pwrite() to itself restore the cursor position (one such case). - Added some missing checking of return value of malloc(). - Changed definition of MOD_PAD_CHAR_TO_FULL_LENGTH to avoid 'long' overflow. - Changed type of table_def::m_size from my_size_t to ulong to reflect that m_size is the number of elements in the array, not a string/memory length. - Moved THD::max_row_length() to table.cc (as it's not depending on THD). Inlined max_row_length_blob() into this function. - More function comments - Fixed some compiler warnings when compiled without partitions. - Removed setting of LEX_STRING() arguments in declaration (portability fix). - Some trivial indentation/variable name changes. - Some trivial code simplifications: - Replaced some calls to alloc_root + memcpy to use strmake_root()/strdup_root(). - Changed some calls from memdup() to strmake() (Safety fix) - Simpler loops in client-simple.c
2007-05-10 11:59:39 +02:00
DBUG_PRINT("enter", ("query = '%s' length %lu", query.str,
(ulong) query.length));
2000-07-31 21:29:14 +02:00
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
2007-02-22 16:03:08 +01:00
thd_proc_info(thd, "waiting for handler insert");
mysql_mutex_lock(&di->mutex);
2000-07-31 21:29:14 +02:00
while (di->stacked_inserts >= delayed_queue_size && !thd->killed)
mysql_cond_wait(&di->cond_client, &di->mutex);
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
2007-02-22 16:03:08 +01:00
thd_proc_info(thd, "storing row into queue");
2000-07-31 21:29:14 +02:00
if (thd->killed)
2000-07-31 21:29:14 +02:00
goto err;
/*
Take a copy of the query string, if there is any. The string will
be free'ed when the row is destroyed. If there is no query string,
we don't do anything special.
*/
if (query.str)
{
char *str;
if (!(str= my_strndup(query.str, query.length, MYF(MY_WME))))
goto err;
query.str= str;
}
row= new delayed_row(query, duplic, ignore, log_on);
if (row == NULL)
{
Bug#34043: Server loops excessively in _checkchunk() when safemalloc is enabled Essentially, the problem is that safemalloc is excruciatingly slow as it checks all allocated blocks for overrun at each memory management primitive, yielding a almost exponential slowdown for the memory management functions (malloc, realloc, free). The overrun check basically consists of verifying some bytes of a block for certain magic keys, which catches some simple forms of overrun. Another minor problem is violation of aliasing rules and that its own internal list of blocks is prone to corruption. Another issue with safemalloc is rather the maintenance cost as the tool has a significant impact on the server code. Given the magnitude of memory debuggers available nowadays, especially those that are provided with the platform malloc implementation, maintenance of a in-house and largely obsolete memory debugger becomes a burden that is not worth the effort due to its slowness and lack of support for detecting more common forms of heap corruption. Since there are third-party tools that can provide the same functionality at a lower or comparable performance cost, the solution is to simply remove safemalloc. Third-party tools can provide the same functionality at a lower or comparable performance cost. The removal of safemalloc also allows a simplification of the malloc wrappers, removing quite a bit of kludge: redefinition of my_malloc, my_free and the removal of the unused second argument of my_free. Since free() always check whether the supplied pointer is null, redudant checks are also removed. Also, this patch adds unit testing for my_malloc and moves my_realloc implementation into the same file as the other memory allocation primitives.
2010-07-08 23:20:08 +02:00
my_free(query.str);
2000-07-31 21:29:14 +02:00
goto err;
}
2000-07-31 21:29:14 +02:00
if (!(row->record= (char*) my_malloc(table->s->reclength, MYF(MY_WME))))
2000-07-31 21:29:14 +02:00
goto err;
memcpy(row->record, table->record[0], table->s->reclength);
2000-07-31 21:29:14 +02:00
row->start_time= thd->start_time;
row->query_start_used= thd->query_start_used;
WL#3146 "less locking in auto_increment": this is a cleanup patch for our current auto_increment handling: new names for auto_increment variables in THD, new methods to manipulate them (see sql_class.h), some move into handler::, causing less backup/restore work when executing substatements. This makes the logic hopefully clearer, less work is is needed in mysql_insert(). By cleaning up, using different variables for different purposes (instead of one for 3 things...), we fix those bugs, which someone may want to fix in 5.0 too: BUG#20339 "stored procedure using LAST_INSERT_ID() does not replicate statement-based" BUG#20341 "stored function inserting into one auto_increment puts bad data in slave" BUG#19243 "wrong LAST_INSERT_ID() after ON DUPLICATE KEY UPDATE" (now if a row is updated, LAST_INSERT_ID() will return its id) and re-fixes: BUG#6880 "LAST_INSERT_ID() value changes during multi-row INSERT" (already fixed differently by Ramil in 4.1) Test of documented behaviour of mysql_insert_id() (there was no test). The behaviour changes introduced are: - LAST_INSERT_ID() now returns "the first autogenerated auto_increment value successfully inserted", instead of "the first autogenerated auto_increment value if any row was successfully inserted", see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() returns the id of the updated row if ON DUPLICATE KEY UPDATE, see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() does not change if no autogenerated value was successfully inserted (it used to then be 0), see auto_increment.test. - if in INSERT SELECT no autogenerated value was successfully inserted, mysql_insert_id() now returns the id of the last inserted row (it already did this for INSERT VALUES), see mysql_client_test.c. - if INSERT SELECT uses LAST_INSERT_ID(X), mysql_insert_id() now returns X (it already did this for INSERT VALUES), see mysql_client_test.c. - NDB now behaves like other engines wrt SET INSERT_ID: with INSERT IGNORE, the id passed in SET INSERT_ID is re-used until a row succeeds; SET INSERT_ID influences not only the first row now. Additionally, when unlocking a table we check that the thread is not keeping a next_insert_id (as the table is unlocked that id is potentially out-of-date); forgetting about this next_insert_id is done in a new handler::ha_release_auto_increment(). Finally we prepare for engines capable of reserving finite-length intervals of auto_increment values: we store such intervals in THD. The next step (to be done by the replication team in 5.1) is to read those intervals from THD and actually store them in the statement-based binary log. NDB will be a good engine to test that.
2006-07-09 17:52:19 +02:00
/*
those are for the binlog: LAST_INSERT_ID() has been evaluated at this
time, so record does not need it, but statement-based binlogging of the
INSERT will need when the row is actually inserted.
As for SET INSERT_ID, DELAYED does not honour it (BUG#20830).
*/
row->stmt_depends_on_first_successful_insert_id_in_prev_stmt=
thd->stmt_depends_on_first_successful_insert_id_in_prev_stmt;
row->first_successful_insert_id_in_prev_stmt=
thd->first_successful_insert_id_in_prev_stmt;
row->timestamp_field_type= table->timestamp_field_type;
2000-07-31 21:29:14 +02:00
/* Add session variable timezone
Time_zone object will not be freed even the thread is ended.
So we can get time_zone object from thread which handling delayed statement.
See the comment of my_tz_find() for detail.
*/
if (thd->time_zone_used)
{
row->time_zone = thd->variables.time_zone;
}
else
{
row->time_zone = NULL;
}
/* Copy session variables. */
row->auto_increment_increment= thd->variables.auto_increment_increment;
row->auto_increment_offset= thd->variables.auto_increment_offset;
row->sql_mode= thd->variables.sql_mode;
row->auto_increment_field_not_null= table->auto_increment_field_not_null;
/* Copy the next forced auto increment value, if any. */
if ((forced_auto_inc= thd->auto_inc_intervals_forced.get_next()))
{
row->forced_insert_id= forced_auto_inc->minimum();
DBUG_PRINT("delayed", ("transmitting auto_inc: %lu",
(ulong) row->forced_insert_id));
}
2000-07-31 21:29:14 +02:00
di->rows.push_back(row);
di->stacked_inserts++;
di->status=1;
if (table->s->blob_fields)
2000-07-31 21:29:14 +02:00
unlink_blobs(table);
mysql_cond_signal(&di->cond);
2000-07-31 21:29:14 +02:00
thread_safe_increment(delayed_rows_in_use,&LOCK_delayed_status);
mysql_mutex_unlock(&di->mutex);
2000-07-31 21:29:14 +02:00
DBUG_RETURN(0);
err:
delete row;
mysql_mutex_unlock(&di->mutex);
2000-07-31 21:29:14 +02:00
DBUG_RETURN(1);
}
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
2007-05-16 07:51:05 +02:00
/**
Signal the delayed insert thread that this user connection
is finished using it for this statement.
*/
2000-07-31 21:29:14 +02:00
static void end_delayed_insert(THD *thd)
{
DBUG_ENTER("end_delayed_insert");
Delayed_insert *di=thd->di;
mysql_mutex_lock(&di->mutex);
DBUG_PRINT("info",("tables in use: %d",di->tables_in_use));
2000-07-31 21:29:14 +02:00
if (!--di->tables_in_use || di->thd.killed)
{ // Unlock table
di->status=1;
mysql_cond_signal(&di->cond);
2000-07-31 21:29:14 +02:00
}
mysql_mutex_unlock(&di->mutex);
DBUG_VOID_RETURN;
2000-07-31 21:29:14 +02:00
}
/* We kill all delayed threads when doing flush-tables */
void kill_delayed_threads(void)
{
mysql_mutex_lock(&LOCK_delayed_insert); // For unlink from list
2000-07-31 21:29:14 +02:00
I_List_iterator<Delayed_insert> it(delayed_threads);
Delayed_insert *di;
while ((di= it++))
2000-07-31 21:29:14 +02:00
{
di->thd.killed= THD::KILL_CONNECTION;
if (di->thd.mysys_var)
2000-07-31 21:29:14 +02:00
{
mysql_mutex_lock(&di->thd.mysys_var->mutex);
if (di->thd.mysys_var->current_cond)
2000-07-31 21:29:14 +02:00
{
/*
We need the following test because the main mutex may be locked
in handle_delayed_insert()
*/
if (&di->mutex != di->thd.mysys_var->current_mutex)
mysql_mutex_lock(di->thd.mysys_var->current_mutex);
mysql_cond_broadcast(di->thd.mysys_var->current_cond);
if (&di->mutex != di->thd.mysys_var->current_mutex)
mysql_mutex_unlock(di->thd.mysys_var->current_mutex);
2000-07-31 21:29:14 +02:00
}
mysql_mutex_unlock(&di->thd.mysys_var->mutex);
2000-07-31 21:29:14 +02:00
}
}
mysql_mutex_unlock(&LOCK_delayed_insert); // For unlink from list
2000-07-31 21:29:14 +02:00
}
/**
A strategy for the prelocking algorithm which prevents the
delayed insert thread from opening tables with engines which
do not support delayed inserts.
Particularly it allows to abort open_tables() as soon as we
discover that we have opened a MERGE table, without acquiring
metadata locks on underlying tables.
*/
class Delayed_prelocking_strategy : public Prelocking_strategy
{
public:
virtual bool handle_routine(THD *thd, Query_tables_list *prelocking_ctx,
Sroutine_hash_entry *rt, sp_head *sp,
bool *need_prelocking);
virtual bool handle_table(THD *thd, Query_tables_list *prelocking_ctx,
TABLE_LIST *table_list, bool *need_prelocking);
virtual bool handle_view(THD *thd, Query_tables_list *prelocking_ctx,
TABLE_LIST *table_list, bool *need_prelocking);
};
bool Delayed_prelocking_strategy::
handle_table(THD *thd, Query_tables_list *prelocking_ctx,
TABLE_LIST *table_list, bool *need_prelocking)
{
DBUG_ASSERT(table_list->lock_type == TL_WRITE_DELAYED);
if (!(table_list->table->file->ha_table_flags() & HA_CAN_INSERT_DELAYED))
{
my_error(ER_DELAYED_NOT_SUPPORTED, MYF(0), table_list->table_name);
return TRUE;
}
return FALSE;
}
bool Delayed_prelocking_strategy::
handle_routine(THD *thd, Query_tables_list *prelocking_ctx,
Sroutine_hash_entry *rt, sp_head *sp,
bool *need_prelocking)
{
/* LEX used by the delayed insert thread has no routines. */
DBUG_ASSERT(0);
return FALSE;
}
bool Delayed_prelocking_strategy::
handle_view(THD *thd, Query_tables_list *prelocking_ctx,
TABLE_LIST *table_list, bool *need_prelocking)
{
/* We don't open views in the delayed insert thread. */
DBUG_ASSERT(0);
return FALSE;
}
/**
Open and lock table for use by delayed thread and check that
this table is suitable for delayed inserts.
@retval FALSE - Success.
@retval TRUE - Failure.
*/
bool Delayed_insert::open_and_lock_table()
{
Delayed_prelocking_strategy prelocking_strategy;
/*
Use special prelocking strategy to get ER_DELAYED_NOT_SUPPORTED
error for tables with engines which don't support delayed inserts.
*/
if (!(table= open_n_lock_single_table(&thd, &table_list,
TL_WRITE_DELAYED,
MYSQL_OPEN_IGNORE_GLOBAL_READ_LOCK,
&prelocking_strategy)))
{
thd.fatal_error(); // Abort waiting inserts
return TRUE;
}
if (table->triggers)
{
/*
Table has triggers. This is not an error, but we do
not support triggers with delayed insert. Terminate the delayed
thread without an error and thus request lock upgrade.
*/
return TRUE;
}
table->copy_blobs= 1;
return FALSE;
}
/*
* Create a new delayed insert thread
*/
pthread_handler_t handle_delayed_insert(void *arg)
{
Delayed_insert *di=(Delayed_insert*) arg;
THD *thd= &di->thd;
pthread_detach_this_thread();
/* Add thread to THD list so that's it's visible in 'show processlist' */
mysql_mutex_lock(&LOCK_thread_count);
thd->thread_id= thd->variables.pseudo_thread_id= thread_id++;
thd->set_current_time();
threads.append(thd);
thd->killed=abort_loop ? THD::KILL_CONNECTION : THD::NOT_KILLED;
mysql_mutex_unlock(&LOCK_thread_count);
mysql_thread_set_psi_id(thd->thread_id);
/*
Wait until the client runs into mysql_cond_wait(),
where we free it after the table is opened and di linked in the list.
If we did not wait here, the client might detect the opened table
before it is linked to the list. It would release LOCK_delayed_create
and allow another thread to create another handler for the same table,
since it does not find one in the list.
*/
mysql_mutex_lock(&di->mutex);
if (my_thread_init())
{
/* Can't use my_error since store_globals has not yet been called */
thd->stmt_da->set_error_status(thd, ER_OUT_OF_RESOURCES,
ER(ER_OUT_OF_RESOURCES), NULL);
di->handler_thread_initialized= TRUE;
}
2009-11-24 07:27:16 +01:00
else
{
DBUG_ENTER("handle_delayed_insert");
thd->thread_stack= (char*) &thd;
if (init_thr_lock() || thd->store_globals())
{
/* Can't use my_error since store_globals has perhaps failed */
thd->stmt_da->set_error_status(thd, ER_OUT_OF_RESOURCES,
ER(ER_OUT_OF_RESOURCES), NULL);
di->handler_thread_initialized= TRUE;
2009-11-24 07:27:16 +01:00
thd->fatal_error();
goto err;
}
thd->lex->sql_command= SQLCOM_INSERT; // For innodb::store_lock()
2009-11-24 07:27:16 +01:00
/*
INSERT DELAYED has to go to row-based format because the time
at which rows are inserted cannot be determined in mixed mode.
2009-11-24 07:27:16 +01:00
*/
merge mysql-5.1-rep+2-delivery1 --> mysql-5.1-rpl-merge Conflicts: Text conflict in .bzr-mysql/default.conf Text conflict in mysql-test/extra/rpl_tests/rpl_loaddata.test Text conflict in mysql-test/r/mysqlbinlog2.result Text conflict in mysql-test/suite/binlog/r/binlog_stm_mix_innodb_myisam.result Text conflict in mysql-test/suite/binlog/r/binlog_unsafe.result Text conflict in mysql-test/suite/rpl/r/rpl_insert_id.result Text conflict in mysql-test/suite/rpl/r/rpl_loaddata.result Text conflict in mysql-test/suite/rpl/r/rpl_stm_auto_increment_bug33029.result Text conflict in mysql-test/suite/rpl/r/rpl_udf.result Text conflict in mysql-test/suite/rpl/t/rpl_slow_query_log.test Text conflict in sql/field.h Text conflict in sql/log.cc Text conflict in sql/log_event.cc Text conflict in sql/log_event_old.cc Text conflict in sql/mysql_priv.h Text conflict in sql/share/errmsg.txt Text conflict in sql/sp.cc Text conflict in sql/sql_acl.cc Text conflict in sql/sql_base.cc Text conflict in sql/sql_class.h Text conflict in sql/sql_db.cc Text conflict in sql/sql_delete.cc Text conflict in sql/sql_insert.cc Text conflict in sql/sql_lex.cc Text conflict in sql/sql_lex.h Text conflict in sql/sql_load.cc Text conflict in sql/sql_table.cc Text conflict in sql/sql_update.cc Text conflict in sql/sql_view.cc Conflict adding files to storage/innobase. Created directory. Conflict because storage/innobase is not versioned, but has versioned children. Versioned directory. Conflict adding file storage/innobase. Moved existing file to storage/innobase.moved. Conflict adding files to storage/innobase/handler. Created directory. Conflict because storage/innobase/handler is not versioned, but has versioned children. Versioned directory. Contents conflict in storage/innobase/handler/ha_innodb.cc
2010-01-07 16:39:11 +01:00
thd->set_current_stmt_binlog_format_row_if_mixed();
2009-11-24 07:27:16 +01:00
/*
Patch that refactors global read lock implementation and fixes bug #57006 "Deadlock between HANDLER and FLUSH TABLES WITH READ LOCK" and bug #54673 "It takes too long to get readlock for 'FLUSH TABLES WITH READ LOCK'". The first bug manifested itself as a deadlock which occurred when a connection, which had some table open through HANDLER statement, tried to update some data through DML statement while another connection tried to execute FLUSH TABLES WITH READ LOCK concurrently. What happened was that FTWRL in the second connection managed to perform first step of GRL acquisition and thus blocked all upcoming DML. After that it started to wait for table open through HANDLER statement to be flushed. When the first connection tried to execute DML it has started to wait for GRL/the second connection creating deadlock. The second bug manifested itself as starvation of FLUSH TABLES WITH READ LOCK statements in cases when there was a constant stream of concurrent DML statements (in two or more connections). This has happened because requests for protection against GRL which were acquired by DML statements were ignoring presence of pending GRL and thus the latter was starved. This patch solves both these problems by re-implementing GRL using metadata locks. Similar to the old implementation acquisition of GRL in new implementation is two-step. During the first step we block all concurrent DML and DDL statements by acquiring global S metadata lock (each DML and DDL statement acquires global IX lock for its duration). During the second step we block commits by acquiring global S lock in COMMIT namespace (commit code acquires global IX lock in this namespace). Note that unlike in old implementation acquisition of protection against GRL in DML and DDL is semi-automatic. We assume that any statement which should be blocked by GRL will either open and acquires write-lock on tables or acquires metadata locks on objects it is going to modify. For any such statement global IX metadata lock is automatically acquired for its duration. The first problem is solved because waits for GRL become visible to deadlock detector in metadata locking subsystem and thus deadlocks like one in the first bug become impossible. The second problem is solved because global S locks which are used for GRL implementation are given preference over IX locks which are acquired by concurrent DML (and we can switch to fair scheduling in future if needed). Important change: FTWRL/GRL no longer blocks DML and DDL on temporary tables. Before this patch behavior was not consistent in this respect: in some cases DML/DDL statements on temporary tables were blocked while in others they were not. Since the main use cases for FTWRL are various forms of backups and temporary tables are not preserved during backups we have opted for consistently allowing DML/DDL on temporary tables during FTWRL/GRL. Important change: This patch changes thread state names which are used when DML/DDL of FTWRL is waiting for global read lock. It is now either "Waiting for global read lock" or "Waiting for commit lock" depending on the stage on which FTWRL is. Incompatible change: To solve deadlock in events code which was exposed by this patch we have to replace LOCK_event_metadata mutex with metadata locks on events. As result we have to prohibit DDL on events under LOCK TABLES. This patch also adds extensive test coverage for interaction of DML/DDL and FTWRL. Performance of new and old global read lock implementations in sysbench tests were compared. There were no significant difference between new and old implementations.
2010-11-11 18:11:05 +01:00
Clone tickets representing protection against GRL and the lock on
the target table for the insert and add them to the list of granted
metadata locks held by the handler thread. This is safe since the
handler thread is not holding nor waiting on any metadata locks.
*/
Patch that refactors global read lock implementation and fixes bug #57006 "Deadlock between HANDLER and FLUSH TABLES WITH READ LOCK" and bug #54673 "It takes too long to get readlock for 'FLUSH TABLES WITH READ LOCK'". The first bug manifested itself as a deadlock which occurred when a connection, which had some table open through HANDLER statement, tried to update some data through DML statement while another connection tried to execute FLUSH TABLES WITH READ LOCK concurrently. What happened was that FTWRL in the second connection managed to perform first step of GRL acquisition and thus blocked all upcoming DML. After that it started to wait for table open through HANDLER statement to be flushed. When the first connection tried to execute DML it has started to wait for GRL/the second connection creating deadlock. The second bug manifested itself as starvation of FLUSH TABLES WITH READ LOCK statements in cases when there was a constant stream of concurrent DML statements (in two or more connections). This has happened because requests for protection against GRL which were acquired by DML statements were ignoring presence of pending GRL and thus the latter was starved. This patch solves both these problems by re-implementing GRL using metadata locks. Similar to the old implementation acquisition of GRL in new implementation is two-step. During the first step we block all concurrent DML and DDL statements by acquiring global S metadata lock (each DML and DDL statement acquires global IX lock for its duration). During the second step we block commits by acquiring global S lock in COMMIT namespace (commit code acquires global IX lock in this namespace). Note that unlike in old implementation acquisition of protection against GRL in DML and DDL is semi-automatic. We assume that any statement which should be blocked by GRL will either open and acquires write-lock on tables or acquires metadata locks on objects it is going to modify. For any such statement global IX metadata lock is automatically acquired for its duration. The first problem is solved because waits for GRL become visible to deadlock detector in metadata locking subsystem and thus deadlocks like one in the first bug become impossible. The second problem is solved because global S locks which are used for GRL implementation are given preference over IX locks which are acquired by concurrent DML (and we can switch to fair scheduling in future if needed). Important change: FTWRL/GRL no longer blocks DML and DDL on temporary tables. Before this patch behavior was not consistent in this respect: in some cases DML/DDL statements on temporary tables were blocked while in others they were not. Since the main use cases for FTWRL are various forms of backups and temporary tables are not preserved during backups we have opted for consistently allowing DML/DDL on temporary tables during FTWRL/GRL. Important change: This patch changes thread state names which are used when DML/DDL of FTWRL is waiting for global read lock. It is now either "Waiting for global read lock" or "Waiting for commit lock" depending on the stage on which FTWRL is. Incompatible change: To solve deadlock in events code which was exposed by this patch we have to replace LOCK_event_metadata mutex with metadata locks on events. As result we have to prohibit DDL on events under LOCK TABLES. This patch also adds extensive test coverage for interaction of DML/DDL and FTWRL. Performance of new and old global read lock implementations in sysbench tests were compared. There were no significant difference between new and old implementations.
2010-11-11 18:11:05 +01:00
if (thd->mdl_context.clone_ticket(&di->grl_protection) ||
thd->mdl_context.clone_ticket(&di->table_list.mdl_request))
{
Patch that refactors global read lock implementation and fixes bug #57006 "Deadlock between HANDLER and FLUSH TABLES WITH READ LOCK" and bug #54673 "It takes too long to get readlock for 'FLUSH TABLES WITH READ LOCK'". The first bug manifested itself as a deadlock which occurred when a connection, which had some table open through HANDLER statement, tried to update some data through DML statement while another connection tried to execute FLUSH TABLES WITH READ LOCK concurrently. What happened was that FTWRL in the second connection managed to perform first step of GRL acquisition and thus blocked all upcoming DML. After that it started to wait for table open through HANDLER statement to be flushed. When the first connection tried to execute DML it has started to wait for GRL/the second connection creating deadlock. The second bug manifested itself as starvation of FLUSH TABLES WITH READ LOCK statements in cases when there was a constant stream of concurrent DML statements (in two or more connections). This has happened because requests for protection against GRL which were acquired by DML statements were ignoring presence of pending GRL and thus the latter was starved. This patch solves both these problems by re-implementing GRL using metadata locks. Similar to the old implementation acquisition of GRL in new implementation is two-step. During the first step we block all concurrent DML and DDL statements by acquiring global S metadata lock (each DML and DDL statement acquires global IX lock for its duration). During the second step we block commits by acquiring global S lock in COMMIT namespace (commit code acquires global IX lock in this namespace). Note that unlike in old implementation acquisition of protection against GRL in DML and DDL is semi-automatic. We assume that any statement which should be blocked by GRL will either open and acquires write-lock on tables or acquires metadata locks on objects it is going to modify. For any such statement global IX metadata lock is automatically acquired for its duration. The first problem is solved because waits for GRL become visible to deadlock detector in metadata locking subsystem and thus deadlocks like one in the first bug become impossible. The second problem is solved because global S locks which are used for GRL implementation are given preference over IX locks which are acquired by concurrent DML (and we can switch to fair scheduling in future if needed). Important change: FTWRL/GRL no longer blocks DML and DDL on temporary tables. Before this patch behavior was not consistent in this respect: in some cases DML/DDL statements on temporary tables were blocked while in others they were not. Since the main use cases for FTWRL are various forms of backups and temporary tables are not preserved during backups we have opted for consistently allowing DML/DDL on temporary tables during FTWRL/GRL. Important change: This patch changes thread state names which are used when DML/DDL of FTWRL is waiting for global read lock. It is now either "Waiting for global read lock" or "Waiting for commit lock" depending on the stage on which FTWRL is. Incompatible change: To solve deadlock in events code which was exposed by this patch we have to replace LOCK_event_metadata mutex with metadata locks on events. As result we have to prohibit DDL on events under LOCK TABLES. This patch also adds extensive test coverage for interaction of DML/DDL and FTWRL. Performance of new and old global read lock implementations in sysbench tests were compared. There were no significant difference between new and old implementations.
2010-11-11 18:11:05 +01:00
thd->mdl_context.release_transactional_locks();
di->handler_thread_initialized= TRUE;
goto err;
}
/*
Now that the ticket has been cloned, it is safe for the connection
thread to exit.
*/
di->handler_thread_initialized= TRUE;
di->table_list.mdl_request.ticket= NULL;
Initial import of WL#3726 "DDL locking for all metadata objects". Backport of: ------------------------------------------------------------ revno: 2630.4.1 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Fri 2008-05-23 17:54:03 +0400 message: WL#3726 "DDL locking for all metadata objects". After review fixes in progress. ------------------------------------------------------------ This is the first patch in series. It transforms the metadata locking subsystem to use a dedicated module (mdl.h,cc). No significant changes in the locking protocol. The import passes the test suite with the exception of deprecated/removed 6.0 features, and MERGE tables. The latter are subject to a fix by WL#4144. Unfortunately, the original changeset comments got lost in a merge, thus this import has its own (largely insufficient) comments. This patch fixes Bug#25144 "replication / binlog with view breaks". Warning: this patch introduces an incompatible change: Under LOCK TABLES, it's no longer possible to FLUSH a table that was not locked for WRITE. Under LOCK TABLES, it's no longer possible to DROP a table or VIEW that was not locked for WRITE. ****** Backport of: ------------------------------------------------------------ revno: 2630.4.2 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Sat 2008-05-24 14:03:45 +0400 message: WL#3726 "DDL locking for all metadata objects". After review fixes in progress. ****** Backport of: ------------------------------------------------------------ revno: 2630.4.3 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Sat 2008-05-24 14:08:51 +0400 message: WL#3726 "DDL locking for all metadata objects" Fixed failing Windows builds by adding mdl.cc to the lists of files needed to build server/libmysqld on Windows. ****** Backport of: ------------------------------------------------------------ revno: 2630.4.4 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Sat 2008-05-24 21:57:58 +0400 message: WL#3726 "DDL locking for all metadata objects". Fix for assert failures in kill.test which occured when one tried to kill ALTER TABLE statement on merge table while it was waiting in wait_while_table_is_used() for other connections to close this table. These assert failures stemmed from the fact that cleanup code in this case assumed that temporary table representing new version of table was open with adding to THD::temporary_tables list while code which were opening this temporary table wasn't always fulfilling this. This patch changes code that opens new version of table to always do this linking in. It also streamlines cleanup process for cases when error occurs while we have new version of table open. ****** WL#3726 "DDL locking for all metadata objects" Add libmysqld/mdl.cc to .bzrignore. ****** Backport of: ------------------------------------------------------------ revno: 2630.4.6 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Sun 2008-05-25 00:33:22 +0400 message: WL#3726 "DDL locking for all metadata objects". Addition to the fix of assert failures in kill.test caused by changes for this worklog. Make sure we close the new table only once.
2009-11-30 16:55:03 +01:00
if (di->open_and_lock_table())
2009-11-24 07:27:16 +01:00
goto err;
/* Tell client that the thread is initialized */
mysql_cond_signal(&di->cond_client);
2009-11-24 07:27:16 +01:00
/* Now wait until we get an insert or lock to handle */
/* We will not abort as long as a client thread uses this thread */
for (;;)
{
Backport of revno: 2617.69.34 Bug #45949 Assertion `!tables->table' in open_tables() on ALTER + INSERT DELAYED The assertion was caused by improperly closing tables when INSERT DELAYED needed to reopen tables. This patch replaces the call to close_thread_tables with close_tables_for_reopen which fixes the problem. The only way I was able to trigger the reopen code path and thus the assertion, was if ALTER TABLE killed the delayed insert thread and the delayed insert thread was able to enter the reopen code path before it noticed that thd->killed had been set. Note that in these cases reopen will always fail since open_table() will check thd->killed and return. This patch therefore adds two more thd->killed checks to minimize the chance of entering the reopen code path without hope for success. The patch also changes it so that if the delayed insert is killed using KILL_CONNECTION, the error message that is copied to the connection thread is ER_QUERY_INTERRUPTED rather than ER_SERVER_SHUTDOWN. This means that if INSERT DELAYED fails, the user will now see "Query execution was interrupted" rather than the misleading "Server shutdown in progress". No test case is supplied. This is for two reasons: 1) Unable to reproduce the error without having the delayed insert thread in a killed state which means that reopen is futile and was not supposed to be attempted. 2) Difficulty of using sync points in other threads than the connection thread. The patch has been successfully tested with the RQG and the grammar supplied in the bug description.
2009-12-09 12:51:45 +01:00
if (thd->killed)
2009-11-24 07:27:16 +01:00
{
uint lock_count;
/*
Remove this from delay insert list so that no one can request a
table from this
*/
mysql_mutex_unlock(&di->mutex);
mysql_mutex_lock(&LOCK_delayed_insert);
2009-11-24 07:27:16 +01:00
di->unlink();
lock_count=di->lock_count();
mysql_mutex_unlock(&LOCK_delayed_insert);
mysql_mutex_lock(&di->mutex);
2009-11-24 07:27:16 +01:00
if (!lock_count && !di->tables_in_use && !di->stacked_inserts)
break; // Time to die
}
/* Shouldn't wait if killed or an insert is waiting. */
if (!thd->killed && !di->status && !di->stacked_inserts)
2009-11-24 07:27:16 +01:00
{
struct timespec abstime;
set_timespec(abstime, delayed_insert_timeout);
/* Information for pthread_kill */
di->thd.mysys_var->current_mutex= &di->mutex;
di->thd.mysys_var->current_cond= &di->cond;
thd_proc_info(&(di->thd), "Waiting for INSERT");
DBUG_PRINT("info",("Waiting for someone to insert rows"));
while (!thd->killed && !di->status)
2009-11-24 07:27:16 +01:00
{
int error;
mysql_audit_release(thd);
2009-11-24 07:27:16 +01:00
#if defined(HAVE_BROKEN_COND_TIMEDWAIT)
error= mysql_cond_wait(&di->cond, &di->mutex);
2009-11-24 07:27:16 +01:00
#else
error= mysql_cond_timedwait(&di->cond, &di->mutex, &abstime);
2009-11-24 07:27:16 +01:00
#ifdef EXTRA_DEBUG
if (error && error != EINTR && error != ETIMEDOUT)
{
fprintf(stderr, "Got error %d from mysql_cond_timedwait\n", error);
DBUG_PRINT("error", ("Got error %d from mysql_cond_timedwait",
error));
2009-11-24 07:27:16 +01:00
}
#endif
#endif
if (error == ETIMEDOUT || error == ETIME)
thd->killed= THD::KILL_CONNECTION;
}
/* We can't lock di->mutex and mysys_var->mutex at the same time */
mysql_mutex_unlock(&di->mutex);
mysql_mutex_lock(&di->thd.mysys_var->mutex);
2009-11-24 07:27:16 +01:00
di->thd.mysys_var->current_mutex= 0;
di->thd.mysys_var->current_cond= 0;
mysql_mutex_unlock(&di->thd.mysys_var->mutex);
mysql_mutex_lock(&di->mutex);
2009-11-24 07:27:16 +01:00
}
thd_proc_info(&(di->thd), 0);
Backport of revno: 2617.69.34 Bug #45949 Assertion `!tables->table' in open_tables() on ALTER + INSERT DELAYED The assertion was caused by improperly closing tables when INSERT DELAYED needed to reopen tables. This patch replaces the call to close_thread_tables with close_tables_for_reopen which fixes the problem. The only way I was able to trigger the reopen code path and thus the assertion, was if ALTER TABLE killed the delayed insert thread and the delayed insert thread was able to enter the reopen code path before it noticed that thd->killed had been set. Note that in these cases reopen will always fail since open_table() will check thd->killed and return. This patch therefore adds two more thd->killed checks to minimize the chance of entering the reopen code path without hope for success. The patch also changes it so that if the delayed insert is killed using KILL_CONNECTION, the error message that is copied to the connection thread is ER_QUERY_INTERRUPTED rather than ER_SERVER_SHUTDOWN. This means that if INSERT DELAYED fails, the user will now see "Query execution was interrupted" rather than the misleading "Server shutdown in progress". No test case is supplied. This is for two reasons: 1) Unable to reproduce the error without having the delayed insert thread in a killed state which means that reopen is futile and was not supposed to be attempted. 2) Difficulty of using sync points in other threads than the connection thread. The patch has been successfully tested with the RQG and the grammar supplied in the bug description.
2009-12-09 12:51:45 +01:00
if (di->tables_in_use && ! thd->lock && !thd->killed)
2009-11-24 07:27:16 +01:00
{
/*
Request for new delayed insert.
Lock the table, but avoid to be blocked by a global read lock.
If we got here while a global read lock exists, then one or more
inserts started before the lock was requested. These are allowed
to complete their work before the server returns control to the
client which requested the global read lock. The delayed insert
handler will close the table and finish when the outstanding
inserts are done.
*/
if (! (thd->lock= mysql_lock_tables(thd, &di->table, 1, 0)))
2009-11-24 07:27:16 +01:00
{
/* Fatal error */
thd->killed= THD::KILL_CONNECTION;
2009-11-24 07:27:16 +01:00
}
mysql_cond_broadcast(&di->cond_client);
2009-11-24 07:27:16 +01:00
}
if (di->stacked_inserts)
{
if (di->handle_inserts())
{
/* Some fatal error */
thd->killed= THD::KILL_CONNECTION;
}
}
di->status=0;
if (!di->stacked_inserts && !di->tables_in_use && thd->lock)
{
/*
No one is doing a insert delayed
Unlock table so that other threads can use it
*/
MYSQL_LOCK *lock=thd->lock;
thd->lock=0;
mysql_mutex_unlock(&di->mutex);
2009-11-24 07:27:16 +01:00
/*
We need to release next_insert_id before unlocking. This is
enforced by handler::ha_external_lock().
*/
di->table->file->ha_release_auto_increment();
mysql_unlock_tables(thd, lock);
trans_commit_stmt(thd);
2009-11-24 07:27:16 +01:00
di->group_count=0;
mysql_audit_release(thd);
mysql_mutex_lock(&di->mutex);
2009-11-24 07:27:16 +01:00
}
if (di->tables_in_use)
mysql_cond_broadcast(&di->cond_client); // If waiting clients
2009-11-24 07:27:16 +01:00
}
err:
DBUG_LEAVE;
}
2000-07-31 21:29:14 +02:00
close_thread_tables(thd); // Free the table
thd->mdl_context.release_transactional_locks();
2000-07-31 21:29:14 +02:00
di->table=0;
thd->killed= THD::KILL_CONNECTION; // If error
mysql_cond_broadcast(&di->cond_client); // Safety
mysql_mutex_unlock(&di->mutex);
2000-07-31 21:29:14 +02:00
mysql_mutex_lock(&LOCK_delayed_create); // Because of delayed_get_table
mysql_mutex_lock(&LOCK_delayed_insert);
/*
di should be unlinked from the thread handler list and have no active
clients
*/
2000-07-31 21:29:14 +02:00
delete di;
mysql_mutex_unlock(&LOCK_delayed_insert);
mysql_mutex_unlock(&LOCK_delayed_create);
2000-07-31 21:29:14 +02:00
my_thread_end();
pthread_exit(0);
return 0;
2000-07-31 21:29:14 +02:00
}
/* Remove pointers from temporary fields to allocated values */
static void unlink_blobs(register TABLE *table)
{
for (Field **ptr=table->field ; *ptr ; ptr++)
{
if ((*ptr)->flags & BLOB_FLAG)
((Field_blob *) (*ptr))->clear_temporary();
}
}
/* Free blobs stored in current row */
static void free_delayed_insert_blobs(register TABLE *table)
{
for (Field **ptr=table->field ; *ptr ; ptr++)
{
if ((*ptr)->flags & BLOB_FLAG)
{
WL#3817: Simplify string / memory area types and make things more consistent (first part) The following type conversions was done: - Changed byte to uchar - Changed gptr to uchar* - Change my_string to char * - Change my_size_t to size_t - Change size_s to size_t Removed declaration of byte, gptr, my_string, my_size_t and size_s. Following function parameter changes was done: - All string functions in mysys/strings was changed to use size_t instead of uint for string lengths. - All read()/write() functions changed to use size_t (including vio). - All protocoll functions changed to use size_t instead of uint - Functions that used a pointer to a string length was changed to use size_t* - Changed malloc(), free() and related functions from using gptr to use void * as this requires fewer casts in the code and is more in line with how the standard functions work. - Added extra length argument to dirname_part() to return the length of the created string. - Changed (at least) following functions to take uchar* as argument: - db_dump() - my_net_write() - net_write_command() - net_store_data() - DBUG_DUMP() - decimal2bin() & bin2decimal() - Changed my_compress() and my_uncompress() to use size_t. Changed one argument to my_uncompress() from a pointer to a value as we only return one value (makes function easier to use). - Changed type of 'pack_data' argument to packfrm() to avoid casts. - Changed in readfrm() and writefrom(), ha_discover and handler::discover() the type for argument 'frmdata' to uchar** to avoid casts. - Changed most Field functions to use uchar* instead of char* (reduced a lot of casts). - Changed field->val_xxx(xxx, new_ptr) to take const pointers. Other changes: - Removed a lot of not needed casts - Added a few new cast required by other changes - Added some cast to my_multi_malloc() arguments for safety (as string lengths needs to be uint, not size_t). - Fixed all calls to hash-get-key functions to use size_t*. (Needed to be done explicitely as this conflict was often hided by casting the function to hash_get_key). - Changed some buffers to memory regions to uchar* to avoid casts. - Changed some string lengths from uint to size_t. - Changed field->ptr to be uchar* instead of char*. This allowed us to get rid of a lot of casts. - Some changes from true -> TRUE, false -> FALSE, unsigned char -> uchar - Include zlib.h in some files as we needed declaration of crc32() - Changed MY_FILE_ERROR to be (size_t) -1. - Changed many variables to hold the result of my_read() / my_write() to be size_t. This was needed to properly detect errors (which are returned as (size_t) -1). - Removed some very old VMS code - Changed packfrm()/unpackfrm() to not be depending on uint size (portability fix) - Removed windows specific code to restore cursor position as this causes slowdown on windows and we should not mix read() and pread() calls anyway as this is not thread safe. Updated function comment to reflect this. Changed function that depended on original behavior of my_pwrite() to itself restore the cursor position (one such case). - Added some missing checking of return value of malloc(). - Changed definition of MOD_PAD_CHAR_TO_FULL_LENGTH to avoid 'long' overflow. - Changed type of table_def::m_size from my_size_t to ulong to reflect that m_size is the number of elements in the array, not a string/memory length. - Moved THD::max_row_length() to table.cc (as it's not depending on THD). Inlined max_row_length_blob() into this function. - More function comments - Fixed some compiler warnings when compiled without partitions. - Removed setting of LEX_STRING() arguments in declaration (portability fix). - Some trivial indentation/variable name changes. - Some trivial code simplifications: - Replaced some calls to alloc_root + memcpy to use strmake_root()/strdup_root(). - Changed some calls from memdup() to strmake() (Safety fix) - Simpler loops in client-simple.c
2007-05-10 11:59:39 +02:00
uchar *str;
2000-07-31 21:29:14 +02:00
((Field_blob *) (*ptr))->get_ptr(&str);
Bug#34043: Server loops excessively in _checkchunk() when safemalloc is enabled Essentially, the problem is that safemalloc is excruciatingly slow as it checks all allocated blocks for overrun at each memory management primitive, yielding a almost exponential slowdown for the memory management functions (malloc, realloc, free). The overrun check basically consists of verifying some bytes of a block for certain magic keys, which catches some simple forms of overrun. Another minor problem is violation of aliasing rules and that its own internal list of blocks is prone to corruption. Another issue with safemalloc is rather the maintenance cost as the tool has a significant impact on the server code. Given the magnitude of memory debuggers available nowadays, especially those that are provided with the platform malloc implementation, maintenance of a in-house and largely obsolete memory debugger becomes a burden that is not worth the effort due to its slowness and lack of support for detecting more common forms of heap corruption. Since there are third-party tools that can provide the same functionality at a lower or comparable performance cost, the solution is to simply remove safemalloc. Third-party tools can provide the same functionality at a lower or comparable performance cost. The removal of safemalloc also allows a simplification of the malloc wrappers, removing quite a bit of kludge: redefinition of my_malloc, my_free and the removal of the unused second argument of my_free. Since free() always check whether the supplied pointer is null, redudant checks are also removed. Also, this patch adds unit testing for my_malloc and moves my_realloc implementation into the same file as the other memory allocation primitives.
2010-07-08 23:20:08 +02:00
my_free(str);
2000-07-31 21:29:14 +02:00
((Field_blob *) (*ptr))->reset();
}
}
}
bool Delayed_insert::handle_inserts(void)
2000-07-31 21:29:14 +02:00
{
int error;
ulong max_rows;
bool has_trans = TRUE;
bool using_ignore= 0, using_opt_replace= 0,
using_bin_log= mysql_bin_log.is_open();
delayed_row *row;
2000-07-31 21:29:14 +02:00
DBUG_ENTER("handle_inserts");
/* Allow client to insert new rows */
mysql_mutex_unlock(&mutex);
2000-07-31 21:29:14 +02:00
table->next_number_field=table->found_next_number_field;
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
table->use_all_columns();
2000-07-31 21:29:14 +02:00
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
2007-02-22 16:03:08 +01:00
thd_proc_info(&thd, "upgrading lock");
Bug #45225 Locking: hang if drop table with no timeout This patch introduces timeouts for metadata locks. The timeout is specified in seconds using the new dynamic system variable "lock_wait_timeout" which has both GLOBAL and SESSION scopes. Allowed values range from 1 to 31536000 seconds (= 1 year). The default value is 1 year. The new server parameter "lock-wait-timeout" can be used to set the default value parameter upon server startup. "lock_wait_timeout" applies to all statements that use metadata locks. These include DML and DDL operations on tables, views, stored procedures and stored functions. They also include LOCK TABLES, FLUSH TABLES WITH READ LOCK and HANDLER statements. The patch also changes thr_lock.c code (table data locks used by MyISAM and other simplistic engines) to use the same system variable. InnoDB row locks are unaffected. One exception to the handling of the "lock_wait_timeout" variable is delayed inserts. All delayed inserts are executed with a timeout of 1 year regardless of the setting for the global variable. As the connection issuing the delayed insert gets no notification of delayed insert timeouts, we want to avoid unnecessary timeouts. It's important to note that the timeout value is used for each lock acquired and that one statement can take more than one lock. A statement can therefore block for longer than the lock_wait_timeout value before reporting a timeout error. When lock timeout occurs, ER_LOCK_WAIT_TIMEOUT is reported. Test case added to lock_multi.test.
2010-02-11 11:23:39 +01:00
if (thr_upgrade_write_delay_lock(*thd.lock->locks, delayed_lock,
thd.variables.lock_wait_timeout))
2000-07-31 21:29:14 +02:00
{
/*
This can happen if thread is killed either by a shutdown
or if another thread is removing the current table definition
from the table cache.
*/
my_error(ER_DELAYED_CANT_CHANGE_LOCK,MYF(ME_FATALERROR),
table->s->table_name.str);
2000-07-31 21:29:14 +02:00
goto err;
}
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
2007-02-22 16:03:08 +01:00
thd_proc_info(&thd, "insert");
max_rows= delayed_insert_limit;
if (thd.killed || table->s->has_old_version())
2000-07-31 21:29:14 +02:00
{
thd.killed= THD::KILL_CONNECTION;
max_rows= ULONG_MAX; // Do as much as possible
2000-07-31 21:29:14 +02:00
}
/*
We can't use row caching when using the binary log because if
we get a crash, then binary log will contain rows that are not yet
written to disk, which will cause problems in replication.
*/
if (!using_bin_log)
table->file->extra(HA_EXTRA_WRITE_CACHE);
mysql_mutex_lock(&mutex);
2000-07-31 21:29:14 +02:00
while ((row=rows.get()))
{
stacked_inserts--;
mysql_mutex_unlock(&mutex);
memcpy(table->record[0],row->record,table->s->reclength);
2000-07-31 21:29:14 +02:00
thd.start_time=row->start_time;
thd.query_start_used=row->query_start_used;
/*
To get the exact auto_inc interval to store in the binlog we must not
use values from the previous interval (of the previous rows).
*/
bool log_query= (row->log_query && row->query.str != NULL);
WL#3817: Simplify string / memory area types and make things more consistent (first part) The following type conversions was done: - Changed byte to uchar - Changed gptr to uchar* - Change my_string to char * - Change my_size_t to size_t - Change size_s to size_t Removed declaration of byte, gptr, my_string, my_size_t and size_s. Following function parameter changes was done: - All string functions in mysys/strings was changed to use size_t instead of uint for string lengths. - All read()/write() functions changed to use size_t (including vio). - All protocoll functions changed to use size_t instead of uint - Functions that used a pointer to a string length was changed to use size_t* - Changed malloc(), free() and related functions from using gptr to use void * as this requires fewer casts in the code and is more in line with how the standard functions work. - Added extra length argument to dirname_part() to return the length of the created string. - Changed (at least) following functions to take uchar* as argument: - db_dump() - my_net_write() - net_write_command() - net_store_data() - DBUG_DUMP() - decimal2bin() & bin2decimal() - Changed my_compress() and my_uncompress() to use size_t. Changed one argument to my_uncompress() from a pointer to a value as we only return one value (makes function easier to use). - Changed type of 'pack_data' argument to packfrm() to avoid casts. - Changed in readfrm() and writefrom(), ha_discover and handler::discover() the type for argument 'frmdata' to uchar** to avoid casts. - Changed most Field functions to use uchar* instead of char* (reduced a lot of casts). - Changed field->val_xxx(xxx, new_ptr) to take const pointers. Other changes: - Removed a lot of not needed casts - Added a few new cast required by other changes - Added some cast to my_multi_malloc() arguments for safety (as string lengths needs to be uint, not size_t). - Fixed all calls to hash-get-key functions to use size_t*. (Needed to be done explicitely as this conflict was often hided by casting the function to hash_get_key). - Changed some buffers to memory regions to uchar* to avoid casts. - Changed some string lengths from uint to size_t. - Changed field->ptr to be uchar* instead of char*. This allowed us to get rid of a lot of casts. - Some changes from true -> TRUE, false -> FALSE, unsigned char -> uchar - Include zlib.h in some files as we needed declaration of crc32() - Changed MY_FILE_ERROR to be (size_t) -1. - Changed many variables to hold the result of my_read() / my_write() to be size_t. This was needed to properly detect errors (which are returned as (size_t) -1). - Removed some very old VMS code - Changed packfrm()/unpackfrm() to not be depending on uint size (portability fix) - Removed windows specific code to restore cursor position as this causes slowdown on windows and we should not mix read() and pread() calls anyway as this is not thread safe. Updated function comment to reflect this. Changed function that depended on original behavior of my_pwrite() to itself restore the cursor position (one such case). - Added some missing checking of return value of malloc(). - Changed definition of MOD_PAD_CHAR_TO_FULL_LENGTH to avoid 'long' overflow. - Changed type of table_def::m_size from my_size_t to ulong to reflect that m_size is the number of elements in the array, not a string/memory length. - Moved THD::max_row_length() to table.cc (as it's not depending on THD). Inlined max_row_length_blob() into this function. - More function comments - Fixed some compiler warnings when compiled without partitions. - Removed setting of LEX_STRING() arguments in declaration (portability fix). - Some trivial indentation/variable name changes. - Some trivial code simplifications: - Replaced some calls to alloc_root + memcpy to use strmake_root()/strdup_root(). - Changed some calls from memdup() to strmake() (Safety fix) - Simpler loops in client-simple.c
2007-05-10 11:59:39 +02:00
DBUG_PRINT("delayed", ("query: '%s' length: %lu", row->query.str ?
row->query.str : "[NULL]",
(ulong) row->query.length));
if (log_query)
{
/*
Guaranteed that the INSERT DELAYED STMT will not be here
in SBR when mysql binlog is enabled.
*/
DBUG_ASSERT(!(mysql_bin_log.is_open() &&
!thd.is_current_stmt_binlog_format_row()));
/*
This is the first value of an INSERT statement.
It is the right place to clear a forced insert_id.
This is usually done after the last value of an INSERT statement,
but we won't know this in the insert delayed thread. But before
the first value is sufficiently equivalent to after the last
value of the previous statement.
*/
table->file->ha_release_auto_increment();
thd.auto_inc_intervals_in_cur_stmt_for_binlog.empty();
}
WL#3146 "less locking in auto_increment": this is a cleanup patch for our current auto_increment handling: new names for auto_increment variables in THD, new methods to manipulate them (see sql_class.h), some move into handler::, causing less backup/restore work when executing substatements. This makes the logic hopefully clearer, less work is is needed in mysql_insert(). By cleaning up, using different variables for different purposes (instead of one for 3 things...), we fix those bugs, which someone may want to fix in 5.0 too: BUG#20339 "stored procedure using LAST_INSERT_ID() does not replicate statement-based" BUG#20341 "stored function inserting into one auto_increment puts bad data in slave" BUG#19243 "wrong LAST_INSERT_ID() after ON DUPLICATE KEY UPDATE" (now if a row is updated, LAST_INSERT_ID() will return its id) and re-fixes: BUG#6880 "LAST_INSERT_ID() value changes during multi-row INSERT" (already fixed differently by Ramil in 4.1) Test of documented behaviour of mysql_insert_id() (there was no test). The behaviour changes introduced are: - LAST_INSERT_ID() now returns "the first autogenerated auto_increment value successfully inserted", instead of "the first autogenerated auto_increment value if any row was successfully inserted", see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() returns the id of the updated row if ON DUPLICATE KEY UPDATE, see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() does not change if no autogenerated value was successfully inserted (it used to then be 0), see auto_increment.test. - if in INSERT SELECT no autogenerated value was successfully inserted, mysql_insert_id() now returns the id of the last inserted row (it already did this for INSERT VALUES), see mysql_client_test.c. - if INSERT SELECT uses LAST_INSERT_ID(X), mysql_insert_id() now returns X (it already did this for INSERT VALUES), see mysql_client_test.c. - NDB now behaves like other engines wrt SET INSERT_ID: with INSERT IGNORE, the id passed in SET INSERT_ID is re-used until a row succeeds; SET INSERT_ID influences not only the first row now. Additionally, when unlocking a table we check that the thread is not keeping a next_insert_id (as the table is unlocked that id is potentially out-of-date); forgetting about this next_insert_id is done in a new handler::ha_release_auto_increment(). Finally we prepare for engines capable of reserving finite-length intervals of auto_increment values: we store such intervals in THD. The next step (to be done by the replication team in 5.1) is to read those intervals from THD and actually store them in the statement-based binary log. NDB will be a good engine to test that.
2006-07-09 17:52:19 +02:00
thd.first_successful_insert_id_in_prev_stmt=
row->first_successful_insert_id_in_prev_stmt;
thd.stmt_depends_on_first_successful_insert_id_in_prev_stmt=
row->stmt_depends_on_first_successful_insert_id_in_prev_stmt;
table->timestamp_field_type= row->timestamp_field_type;
table->auto_increment_field_not_null= row->auto_increment_field_not_null;
2000-07-31 21:29:14 +02:00
/* Copy the session variables. */
thd.variables.auto_increment_increment= row->auto_increment_increment;
thd.variables.auto_increment_offset= row->auto_increment_offset;
thd.variables.sql_mode= row->sql_mode;
/* Copy a forced insert_id, if any. */
if (row->forced_insert_id)
{
DBUG_PRINT("delayed", ("received auto_inc: %lu",
(ulong) row->forced_insert_id));
thd.force_one_auto_inc_interval(row->forced_insert_id);
}
info.ignore= row->ignore;
2000-07-31 21:29:14 +02:00
info.handle_duplicates= row->dup;
if (info.ignore ||
2005-01-10 20:55:05 +01:00
info.handle_duplicates != DUP_ERROR)
{
table->file->extra(HA_EXTRA_IGNORE_DUP_KEY);
using_ignore=1;
}
Fix for bug#18437 "Wrong values inserted with a before update trigger on NDB table". SQL-layer was not marking fields which were used in triggers as such. As result these fields were not always properly retrieved/stored by handler layer. So one might got wrong values or lost changes in triggers for NDB, Federated and possibly InnoDB tables. This fix solves the problem by marking fields used in triggers appropriately. Also this patch contains the following cleanup of ha_ndbcluster code: We no longer rely on reading LEX::sql_command value in handler in order to determine if we can enable optimization which allows us to handle REPLACE statement in more efficient way by doing replaces directly in write_row() method without reporting error to SQL-layer. Instead we rely on SQL-layer informing us whether this optimization applicable by calling handler::extra() method with HA_EXTRA_WRITE_CAN_REPLACE flag. As result we no longer apply this optimzation in cases when it should not be used (e.g. if we have on delete triggers on table) and use in some additional cases when it is applicable (e.g. for LOAD DATA REPLACE). Finally this patch includes fix for bug#20728 "REPLACE does not work correctly for NDB table with PK and unique index". This was yet another problem which was caused by improper field mark-up. During row replacement fields which weren't explicity used in REPLACE statement were not marked as fields to be saved (updated) so they have retained values from old row version. The fix is to mark all table fields as set for REPLACE statement. Note that in 5.1 we already solve this problem by notifying handler that it should save values from all fields only in case when real replacement happens.
2006-07-01 23:51:10 +02:00
if (info.handle_duplicates == DUP_REPLACE &&
(!table->triggers ||
!table->triggers->has_delete_triggers()))
{
table->file->extra(HA_EXTRA_WRITE_CAN_REPLACE);
using_opt_replace= 1;
}
if (info.handle_duplicates == DUP_UPDATE)
table->file->extra(HA_EXTRA_INSERT_WITH_UPDATE);
2002-11-03 23:56:25 +01:00
thd.clear_error(); // reset error for binlog
if (write_record(&thd, table, &info))
2000-07-31 21:29:14 +02:00
{
info.error_count++; // Ignore errors
thread_safe_increment(delayed_insert_errors,&LOCK_delayed_status);
2000-11-11 22:57:35 +01:00
row->log_query = 0;
2000-07-31 21:29:14 +02:00
}
if (using_ignore)
{
using_ignore=0;
table->file->extra(HA_EXTRA_NO_IGNORE_DUP_KEY);
}
Fix for bug#18437 "Wrong values inserted with a before update trigger on NDB table". SQL-layer was not marking fields which were used in triggers as such. As result these fields were not always properly retrieved/stored by handler layer. So one might got wrong values or lost changes in triggers for NDB, Federated and possibly InnoDB tables. This fix solves the problem by marking fields used in triggers appropriately. Also this patch contains the following cleanup of ha_ndbcluster code: We no longer rely on reading LEX::sql_command value in handler in order to determine if we can enable optimization which allows us to handle REPLACE statement in more efficient way by doing replaces directly in write_row() method without reporting error to SQL-layer. Instead we rely on SQL-layer informing us whether this optimization applicable by calling handler::extra() method with HA_EXTRA_WRITE_CAN_REPLACE flag. As result we no longer apply this optimzation in cases when it should not be used (e.g. if we have on delete triggers on table) and use in some additional cases when it is applicable (e.g. for LOAD DATA REPLACE). Finally this patch includes fix for bug#20728 "REPLACE does not work correctly for NDB table with PK and unique index". This was yet another problem which was caused by improper field mark-up. During row replacement fields which weren't explicity used in REPLACE statement were not marked as fields to be saved (updated) so they have retained values from old row version. The fix is to mark all table fields as set for REPLACE statement. Note that in 5.1 we already solve this problem by notifying handler that it should save values from all fields only in case when real replacement happens.
2006-07-01 23:51:10 +02:00
if (using_opt_replace)
{
using_opt_replace= 0;
table->file->extra(HA_EXTRA_WRITE_CANNOT_REPLACE);
}
if (table->s->blob_fields)
2000-07-31 21:29:14 +02:00
free_delayed_insert_blobs(table);
thread_safe_decrement(delayed_rows_in_use,&LOCK_delayed_status);
2000-07-31 21:29:14 +02:00
thread_safe_increment(delayed_insert_writes,&LOCK_delayed_status);
mysql_mutex_lock(&mutex);
2000-07-31 21:29:14 +02:00
/*
Reset the table->auto_increment_field_not_null as it is valid for
only one row.
*/
table->auto_increment_field_not_null= FALSE;
2000-07-31 21:29:14 +02:00
delete row;
/*
Let READ clients do something once in a while
We should however not break in the middle of a multi-line insert
if we have binary logging enabled as we don't want other commands
on this table until all entries has been processed
*/
if (group_count++ >= max_rows && (row= rows.head()) &&
(!(row->log_query & using_bin_log)))
2000-07-31 21:29:14 +02:00
{
group_count=0;
if (stacked_inserts || tables_in_use) // Let these wait a while
{
if (tables_in_use)
mysql_cond_broadcast(&cond_client); // If waiting clients
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
2007-02-22 16:03:08 +01:00
thd_proc_info(&thd, "reschedule");
mysql_mutex_unlock(&mutex);
2000-07-31 21:29:14 +02:00
if ((error=table->file->extra(HA_EXTRA_NO_CACHE)))
{
/* This should never happen */
table->file->print_error(error,MYF(0));
sql_print_error("%s", thd.stmt_da->message());
DBUG_PRINT("error", ("HA_EXTRA_NO_CACHE failed in loop"));
2000-07-31 21:29:14 +02:00
goto err;
}
2002-03-22 21:55:08 +01:00
query_cache_invalidate3(&thd, table, 1);
Bug #45225 Locking: hang if drop table with no timeout This patch introduces timeouts for metadata locks. The timeout is specified in seconds using the new dynamic system variable "lock_wait_timeout" which has both GLOBAL and SESSION scopes. Allowed values range from 1 to 31536000 seconds (= 1 year). The default value is 1 year. The new server parameter "lock-wait-timeout" can be used to set the default value parameter upon server startup. "lock_wait_timeout" applies to all statements that use metadata locks. These include DML and DDL operations on tables, views, stored procedures and stored functions. They also include LOCK TABLES, FLUSH TABLES WITH READ LOCK and HANDLER statements. The patch also changes thr_lock.c code (table data locks used by MyISAM and other simplistic engines) to use the same system variable. InnoDB row locks are unaffected. One exception to the handling of the "lock_wait_timeout" variable is delayed inserts. All delayed inserts are executed with a timeout of 1 year regardless of the setting for the global variable. As the connection issuing the delayed insert gets no notification of delayed insert timeouts, we want to avoid unnecessary timeouts. It's important to note that the timeout value is used for each lock acquired and that one statement can take more than one lock. A statement can therefore block for longer than the lock_wait_timeout value before reporting a timeout error. When lock timeout occurs, ER_LOCK_WAIT_TIMEOUT is reported. Test case added to lock_multi.test.
2010-02-11 11:23:39 +01:00
if (thr_reschedule_write_lock(*thd.lock->locks,
thd.variables.lock_wait_timeout))
2000-07-31 21:29:14 +02:00
{
/* This is not known to happen. */
my_error(ER_DELAYED_CANT_CHANGE_LOCK,MYF(ME_FATALERROR),
table->s->table_name.str);
goto err;
2000-07-31 21:29:14 +02:00
}
if (!using_bin_log)
table->file->extra(HA_EXTRA_WRITE_CACHE);
mysql_mutex_lock(&mutex);
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
2007-02-22 16:03:08 +01:00
thd_proc_info(&thd, "insert");
2000-07-31 21:29:14 +02:00
}
if (tables_in_use)
mysql_cond_broadcast(&cond_client); // If waiting clients
2000-07-31 21:29:14 +02:00
}
}
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
2007-02-22 16:03:08 +01:00
thd_proc_info(&thd, 0);
mysql_mutex_unlock(&mutex);
/*
We need to flush the pending event when using row-based
replication since the flushing normally done in binlog_query() is
not done last in the statement: for delayed inserts, the insert
statement is logged *before* all rows are inserted.
We can flush the pending event without checking the thd->lock
since the delayed insert *thread* is not inside a stored function
or trigger.
TODO: Move the logging to last in the sequence of rows.
*/
has_trans= thd.lex->sql_command == SQLCOM_CREATE_TABLE ||
table->file->has_transactions();
merge mysql-5.1-rep+2-delivery1 --> mysql-5.1-rpl-merge Conflicts: Text conflict in .bzr-mysql/default.conf Text conflict in mysql-test/extra/rpl_tests/rpl_loaddata.test Text conflict in mysql-test/r/mysqlbinlog2.result Text conflict in mysql-test/suite/binlog/r/binlog_stm_mix_innodb_myisam.result Text conflict in mysql-test/suite/binlog/r/binlog_unsafe.result Text conflict in mysql-test/suite/rpl/r/rpl_insert_id.result Text conflict in mysql-test/suite/rpl/r/rpl_loaddata.result Text conflict in mysql-test/suite/rpl/r/rpl_stm_auto_increment_bug33029.result Text conflict in mysql-test/suite/rpl/r/rpl_udf.result Text conflict in mysql-test/suite/rpl/t/rpl_slow_query_log.test Text conflict in sql/field.h Text conflict in sql/log.cc Text conflict in sql/log_event.cc Text conflict in sql/log_event_old.cc Text conflict in sql/mysql_priv.h Text conflict in sql/share/errmsg.txt Text conflict in sql/sp.cc Text conflict in sql/sql_acl.cc Text conflict in sql/sql_base.cc Text conflict in sql/sql_class.h Text conflict in sql/sql_db.cc Text conflict in sql/sql_delete.cc Text conflict in sql/sql_insert.cc Text conflict in sql/sql_lex.cc Text conflict in sql/sql_lex.h Text conflict in sql/sql_load.cc Text conflict in sql/sql_table.cc Text conflict in sql/sql_update.cc Text conflict in sql/sql_view.cc Conflict adding files to storage/innobase. Created directory. Conflict because storage/innobase is not versioned, but has versioned children. Versioned directory. Conflict adding file storage/innobase. Moved existing file to storage/innobase.moved. Conflict adding files to storage/innobase/handler. Created directory. Conflict because storage/innobase/handler is not versioned, but has versioned children. Versioned directory. Contents conflict in storage/innobase/handler/ha_innodb.cc
2010-01-07 16:39:11 +01:00
if (thd.is_current_stmt_binlog_format_row() &&
thd.binlog_flush_pending_rows_event(TRUE, has_trans))
goto err;
2000-07-31 21:29:14 +02:00
if ((error=table->file->extra(HA_EXTRA_NO_CACHE)))
{ // This shouldn't happen
table->file->print_error(error,MYF(0));
sql_print_error("%s", thd.stmt_da->message());
DBUG_PRINT("error", ("HA_EXTRA_NO_CACHE failed after loop"));
2000-07-31 21:29:14 +02:00
goto err;
}
2002-03-22 21:55:08 +01:00
query_cache_invalidate3(&thd, table, 1);
mysql_mutex_lock(&mutex);
2000-07-31 21:29:14 +02:00
DBUG_RETURN(0);
err:
#ifndef DBUG_OFF
max_rows= 0; // For DBUG output
#endif
/* Remove all not used rows */
while ((row=rows.get()))
{
if (table->s->blob_fields)
{
memcpy(table->record[0],row->record,table->s->reclength);
free_delayed_insert_blobs(table);
}
delete row;
thread_safe_increment(delayed_insert_errors,&LOCK_delayed_status);
stacked_inserts--;
#ifndef DBUG_OFF
max_rows++;
#endif
}
DBUG_PRINT("error", ("dropped %lu rows after an error", max_rows));
2000-07-31 21:29:14 +02:00
thread_safe_increment(delayed_insert_errors, &LOCK_delayed_status);
mysql_mutex_lock(&mutex);
2000-07-31 21:29:14 +02:00
DBUG_RETURN(1);
}
#endif /* EMBEDDED_LIBRARY */
2000-07-31 21:29:14 +02:00
/***************************************************************************
2002-07-25 00:00:56 +02:00
Store records in INSERT ... SELECT *
2000-07-31 21:29:14 +02:00
***************************************************************************/
2004-07-16 00:15:55 +02:00
/*
make insert specific preparation and checks after opening tables
SYNOPSIS
mysql_insert_select_prepare()
thd thread handler
RETURN
FALSE OK
TRUE Error
2004-07-16 00:15:55 +02:00
*/
bool mysql_insert_select_prepare(THD *thd)
2004-07-16 00:15:55 +02:00
{
LEX *lex= thd->lex;
SELECT_LEX *select_lex= &lex->select_lex;
2004-12-30 23:44:00 +01:00
TABLE_LIST *first_select_leaf_table;
2004-07-16 00:15:55 +02:00
DBUG_ENTER("mysql_insert_select_prepare");
2005-07-03 13:17:52 +02:00
2004-09-03 14:18:40 +02:00
/*
SELECT_LEX do not belong to INSERT statement, so we can't add WHERE
2004-12-30 23:44:00 +01:00
clause if table is VIEW
2004-09-03 14:18:40 +02:00
*/
2005-07-03 13:17:52 +02:00
2004-12-30 23:44:00 +01:00
if (mysql_prepare_insert(thd, lex->query_tables,
lex->query_tables->table, lex->field_list, 0,
lex->update_list, lex->value_list,
lex->duplicates,
&select_lex->where, TRUE, FALSE, FALSE))
2004-11-25 01:23:13 +01:00
DBUG_RETURN(TRUE);
2004-12-30 23:44:00 +01:00
/*
exclude first table from leaf tables list, because it belong to
INSERT
*/
DBUG_ASSERT(select_lex->leaf_tables != 0);
lex->leaf_tables_insert= select_lex->leaf_tables;
/* skip all leaf tables belonged to view where we are insert */
for (first_select_leaf_table= select_lex->leaf_tables->next_leaf;
first_select_leaf_table &&
first_select_leaf_table->belong_to_view &&
first_select_leaf_table->belong_to_view ==
lex->leaf_tables_insert->belong_to_view;
first_select_leaf_table= first_select_leaf_table->next_leaf)
{}
select_lex->leaf_tables= first_select_leaf_table;
DBUG_RETURN(FALSE);
2004-07-16 00:15:55 +02:00
}
select_insert::select_insert(TABLE_LIST *table_list_par, TABLE *table_par,
2004-12-22 12:54:39 +01:00
List<Item> *fields_par,
List<Item> *update_fields,
List<Item> *update_values,
2004-12-22 12:54:39 +01:00
enum_duplicates duplic,
bool ignore_check_option_errors)
:table_list(table_list_par), table(table_par), fields(fields_par),
WL#3146 "less locking in auto_increment": this is a cleanup patch for our current auto_increment handling: new names for auto_increment variables in THD, new methods to manipulate them (see sql_class.h), some move into handler::, causing less backup/restore work when executing substatements. This makes the logic hopefully clearer, less work is is needed in mysql_insert(). By cleaning up, using different variables for different purposes (instead of one for 3 things...), we fix those bugs, which someone may want to fix in 5.0 too: BUG#20339 "stored procedure using LAST_INSERT_ID() does not replicate statement-based" BUG#20341 "stored function inserting into one auto_increment puts bad data in slave" BUG#19243 "wrong LAST_INSERT_ID() after ON DUPLICATE KEY UPDATE" (now if a row is updated, LAST_INSERT_ID() will return its id) and re-fixes: BUG#6880 "LAST_INSERT_ID() value changes during multi-row INSERT" (already fixed differently by Ramil in 4.1) Test of documented behaviour of mysql_insert_id() (there was no test). The behaviour changes introduced are: - LAST_INSERT_ID() now returns "the first autogenerated auto_increment value successfully inserted", instead of "the first autogenerated auto_increment value if any row was successfully inserted", see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() returns the id of the updated row if ON DUPLICATE KEY UPDATE, see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() does not change if no autogenerated value was successfully inserted (it used to then be 0), see auto_increment.test. - if in INSERT SELECT no autogenerated value was successfully inserted, mysql_insert_id() now returns the id of the last inserted row (it already did this for INSERT VALUES), see mysql_client_test.c. - if INSERT SELECT uses LAST_INSERT_ID(X), mysql_insert_id() now returns X (it already did this for INSERT VALUES), see mysql_client_test.c. - NDB now behaves like other engines wrt SET INSERT_ID: with INSERT IGNORE, the id passed in SET INSERT_ID is re-used until a row succeeds; SET INSERT_ID influences not only the first row now. Additionally, when unlocking a table we check that the thread is not keeping a next_insert_id (as the table is unlocked that id is potentially out-of-date); forgetting about this next_insert_id is done in a new handler::ha_release_auto_increment(). Finally we prepare for engines capable of reserving finite-length intervals of auto_increment values: we store such intervals in THD. The next step (to be done by the replication team in 5.1) is to read those intervals from THD and actually store them in the statement-based binary log. NDB will be a good engine to test that.
2006-07-09 17:52:19 +02:00
autoinc_value_of_last_inserted_row(0),
insert_into_view(table_list_par && table_list_par->view != 0)
{
bzero((char*) &info,sizeof(info));
2004-12-22 12:54:39 +01:00
info.handle_duplicates= duplic;
info.ignore= ignore_check_option_errors;
info.update_fields= update_fields;
info.update_values= update_values;
if (table_list_par)
info.view= (table_list_par->view ? table_list_par : 0);
}
2000-07-31 21:29:14 +02:00
int
select_insert::prepare(List<Item> &values, SELECT_LEX_UNIT *u)
2000-07-31 21:29:14 +02:00
{
LEX *lex= thd->lex;
int res;
table_map map= 0;
SELECT_LEX *lex_current_select_save= lex->current_select;
2000-07-31 21:29:14 +02:00
DBUG_ENTER("select_insert::prepare");
unit= u;
/*
Since table in which we are going to insert is added to the first
select, LEX::current_select should point to the first select while
we are fixing fields from insert list.
*/
lex->current_select= &lex->select_lex;
/* Errors during check_insert_fields() should not be ignored. */
lex->current_select->no_error= FALSE;
res= (setup_fields(thd, 0, values, MARK_COLUMNS_READ, 0, 0) ||
check_insert_fields(thd, table_list, *fields, values,
!insert_into_view, 1, &map));
if (!res && fields->elements)
{
bool saved_abort_on_warning= thd->abort_on_warning;
thd->abort_on_warning= !info.ignore && (thd->variables.sql_mode &
(MODE_STRICT_TRANS_TABLES |
MODE_STRICT_ALL_TABLES));
res= check_that_all_fields_are_given_values(thd, table_list->table,
table_list);
thd->abort_on_warning= saved_abort_on_warning;
}
if (info.handle_duplicates == DUP_UPDATE && !res)
{
Name_resolution_context *context= &lex->select_lex.context;
Name_resolution_context_state ctx_state;
/* Save the state of the current name resolution context. */
ctx_state.save_state(context, table_list);
/* Perform name resolution only in the first table - 'table_list'. */
table_list->next_local= 0;
context->resolve_in_table_list_only(table_list);
lex->select_lex.no_wrap_view_item= TRUE;
res= res || check_update_fields(thd, context->table_list,
*info.update_fields, *info.update_values,
&map);
lex->select_lex.no_wrap_view_item= FALSE;
/*
When we are not using GROUP BY and there are no ungrouped aggregate functions
we can refer to other tables in the ON DUPLICATE KEY part.
We use next_name_resolution_table descructively, so check it first (views?)
*/
DBUG_ASSERT (!table_list->next_name_resolution_table);
if (lex->select_lex.group_list.elements == 0 &&
!lex->select_lex.with_sum_func)
/*
We must make a single context out of the two separate name resolution contexts :
the INSERT table and the tables in the SELECT part of INSERT ... SELECT.
To do that we must concatenate the two lists
*/
2007-02-26 13:52:54 +01:00
table_list->next_name_resolution_table=
ctx_state.get_first_name_resolution_table();
2007-02-26 13:52:54 +01:00
res= res || setup_fields(thd, 0, *info.update_values,
MARK_COLUMNS_READ, 0, 0);
if (!res)
{
/*
Traverse the update values list and substitute fields from the
select for references (Item_ref objects) to them. This is done in
order to get correct values from those fields when the select
employs a temporary table.
*/
List_iterator<Item> li(*info.update_values);
Item *item;
while ((item= li++))
{
item->transform(&Item::update_value_transformer,
WL#3817: Simplify string / memory area types and make things more consistent (first part) The following type conversions was done: - Changed byte to uchar - Changed gptr to uchar* - Change my_string to char * - Change my_size_t to size_t - Change size_s to size_t Removed declaration of byte, gptr, my_string, my_size_t and size_s. Following function parameter changes was done: - All string functions in mysys/strings was changed to use size_t instead of uint for string lengths. - All read()/write() functions changed to use size_t (including vio). - All protocoll functions changed to use size_t instead of uint - Functions that used a pointer to a string length was changed to use size_t* - Changed malloc(), free() and related functions from using gptr to use void * as this requires fewer casts in the code and is more in line with how the standard functions work. - Added extra length argument to dirname_part() to return the length of the created string. - Changed (at least) following functions to take uchar* as argument: - db_dump() - my_net_write() - net_write_command() - net_store_data() - DBUG_DUMP() - decimal2bin() & bin2decimal() - Changed my_compress() and my_uncompress() to use size_t. Changed one argument to my_uncompress() from a pointer to a value as we only return one value (makes function easier to use). - Changed type of 'pack_data' argument to packfrm() to avoid casts. - Changed in readfrm() and writefrom(), ha_discover and handler::discover() the type for argument 'frmdata' to uchar** to avoid casts. - Changed most Field functions to use uchar* instead of char* (reduced a lot of casts). - Changed field->val_xxx(xxx, new_ptr) to take const pointers. Other changes: - Removed a lot of not needed casts - Added a few new cast required by other changes - Added some cast to my_multi_malloc() arguments for safety (as string lengths needs to be uint, not size_t). - Fixed all calls to hash-get-key functions to use size_t*. (Needed to be done explicitely as this conflict was often hided by casting the function to hash_get_key). - Changed some buffers to memory regions to uchar* to avoid casts. - Changed some string lengths from uint to size_t. - Changed field->ptr to be uchar* instead of char*. This allowed us to get rid of a lot of casts. - Some changes from true -> TRUE, false -> FALSE, unsigned char -> uchar - Include zlib.h in some files as we needed declaration of crc32() - Changed MY_FILE_ERROR to be (size_t) -1. - Changed many variables to hold the result of my_read() / my_write() to be size_t. This was needed to properly detect errors (which are returned as (size_t) -1). - Removed some very old VMS code - Changed packfrm()/unpackfrm() to not be depending on uint size (portability fix) - Removed windows specific code to restore cursor position as this causes slowdown on windows and we should not mix read() and pread() calls anyway as this is not thread safe. Updated function comment to reflect this. Changed function that depended on original behavior of my_pwrite() to itself restore the cursor position (one such case). - Added some missing checking of return value of malloc(). - Changed definition of MOD_PAD_CHAR_TO_FULL_LENGTH to avoid 'long' overflow. - Changed type of table_def::m_size from my_size_t to ulong to reflect that m_size is the number of elements in the array, not a string/memory length. - Moved THD::max_row_length() to table.cc (as it's not depending on THD). Inlined max_row_length_blob() into this function. - More function comments - Fixed some compiler warnings when compiled without partitions. - Removed setting of LEX_STRING() arguments in declaration (portability fix). - Some trivial indentation/variable name changes. - Some trivial code simplifications: - Replaced some calls to alloc_root + memcpy to use strmake_root()/strdup_root(). - Changed some calls from memdup() to strmake() (Safety fix) - Simpler loops in client-simple.c
2007-05-10 11:59:39 +02:00
(uchar*)lex->current_select);
}
}
/* Restore the current context. */
ctx_state.restore_state(context, table_list);
}
lex->current_select= lex_current_select_save;
if (res)
2000-07-31 21:29:14 +02:00
DBUG_RETURN(1);
/*
if it is INSERT into join view then check_insert_fields already found
real table for insert
*/
table= table_list->table;
/*
Is table which we are changing used somewhere in other parts of
query
*/
if (unique_table(thd, table_list, table_list->next_global, 0))
{
/* Using same table for INSERT and SELECT */
lex->current_select->options|= OPTION_BUFFER_RESULT;
lex->current_select->join->select_options|= OPTION_BUFFER_RESULT;
}
else if (!(lex->current_select->options & OPTION_BUFFER_RESULT) &&
thd->locked_tables_mode <= LTM_LOCK_TABLES)
{
/*
We must not yet prepare the result table if it is the same as one of the
source tables (INSERT SELECT). The preparation may disable
indexes on the result table, which may be used during the select, if it
is the same table (Bug #6034). Do the preparation after the select phase
in select_insert::prepare2().
We won't start bulk inserts at all if this statement uses functions or
should invoke triggers since they may access to the same table too.
*/
table->file->ha_start_bulk_insert((ha_rows) 0);
}
restore_record(table,s->default_values); // Get empty record
2000-08-16 04:14:02 +02:00
table->next_number_field=table->found_next_number_field;
Manual merge from 5.0-rpl, of fixes for: 1) BUG#25507 "multi-row insert delayed + auto increment causes duplicate key entries on slave" (two concurrrent connections doing multi-row INSERT DELAYED to insert into an auto_increment column, caused replication slave to stop with "duplicate key error" (and binlog was wrong), and BUG#26116 "If multi-row INSERT DELAYED has errors, statement-based binlogging breaks" (the binlog was not accounting for all rows inserted, or slave could stop). The fix is that: in statement-based binlogging, a multi-row INSERT DELAYED is silently converted to a non-delayed INSERT. This is supposed to not affect many 5.1 users as in 5.1, the default binlog format is "mixed", which does not have the bug (the bug is only with binlog_format=STATEMENT). We should document how the system delayed_insert thread decides of its binlog format (which is not modified by this patch): this decision is taken when the thread is created and holds until it is terminated (is not affected by any later change via SET GLOBAL BINLOG_FORMAT). It is also not affected by the binlog format of the connection which issues INSERT DELAYED (this binlog format does not affect how the row will be binlogged). If one wants to change the binlog format of its server with SET GLOBAL BINLOG_FORMAT, it should do FLUSH TABLES to be sure all delayed_insert threads terminate and thus new threads are created, taking into account the new format. 2) BUG#24432 "INSERT... ON DUPLICATE KEY UPDATE skips auto_increment values". When in an INSERT ON DUPLICATE KEY UPDATE, using an autoincrement column, we inserted some autogenerated values and also updated some rows, some autogenerated values were not used (for example, even if 10 was the largest autoinc value in the table at the start of the statement, 12 could be the first autogenerated value inserted by the statement, instead of 11). One autogenerated value was lost per updated row. Led to exhausting the range of the autoincrement column faster. Bug introduced by fix of BUG#20188; present since 5.0.24 and 5.1.12. This bug breaks replication from a pre-5.0.24/pre-5.1.12 master. But the present bugfix, as it makes INSERT ON DUP KEY UPDATE behave like pre-5.0.24/pre-5.1.12, breaks replication from a [5.0.24,5.0.34]/[5.1.12,5.1.15] master to a fixed (5.0.36/5.1.16) slave! To warn users against this when they upgrade their slave, as agreed with the support team, we add code for a fixed slave to detect that it is connected to a buggy master in a situation (INSERT ON DUP KEY UPDATE into autoinc column) likely to break replication, in which case it cannot replicate so stops and prints a message to the slave's error log and to SHOW SLAVE STATUS. For 5.0.36->[5.0.24,5.0.34] replication or 5.1.16->[5.1.12,5.1.15] replication we cannot warn as master does not know the slave's version (but we always recommended to users to have slave at least as new as master). As agreed with support, I have asked for an alert to be put into the MySQL Network Monitoring and Advisory Service. 3) note that I'll re-enable rpl_insert_id as soon as 5.1-rpl gets the changes from the main 5.1.
2007-02-15 20:28:58 +01:00
#ifdef HAVE_REPLICATION
if (thd->slave_thread &&
(info.handle_duplicates == DUP_UPDATE) &&
(table->next_number_field != NULL) &&
rpl_master_has_bug(&active_mi->rli, 24432, TRUE, NULL, NULL))
Manual merge from 5.0-rpl, of fixes for: 1) BUG#25507 "multi-row insert delayed + auto increment causes duplicate key entries on slave" (two concurrrent connections doing multi-row INSERT DELAYED to insert into an auto_increment column, caused replication slave to stop with "duplicate key error" (and binlog was wrong), and BUG#26116 "If multi-row INSERT DELAYED has errors, statement-based binlogging breaks" (the binlog was not accounting for all rows inserted, or slave could stop). The fix is that: in statement-based binlogging, a multi-row INSERT DELAYED is silently converted to a non-delayed INSERT. This is supposed to not affect many 5.1 users as in 5.1, the default binlog format is "mixed", which does not have the bug (the bug is only with binlog_format=STATEMENT). We should document how the system delayed_insert thread decides of its binlog format (which is not modified by this patch): this decision is taken when the thread is created and holds until it is terminated (is not affected by any later change via SET GLOBAL BINLOG_FORMAT). It is also not affected by the binlog format of the connection which issues INSERT DELAYED (this binlog format does not affect how the row will be binlogged). If one wants to change the binlog format of its server with SET GLOBAL BINLOG_FORMAT, it should do FLUSH TABLES to be sure all delayed_insert threads terminate and thus new threads are created, taking into account the new format. 2) BUG#24432 "INSERT... ON DUPLICATE KEY UPDATE skips auto_increment values". When in an INSERT ON DUPLICATE KEY UPDATE, using an autoincrement column, we inserted some autogenerated values and also updated some rows, some autogenerated values were not used (for example, even if 10 was the largest autoinc value in the table at the start of the statement, 12 could be the first autogenerated value inserted by the statement, instead of 11). One autogenerated value was lost per updated row. Led to exhausting the range of the autoincrement column faster. Bug introduced by fix of BUG#20188; present since 5.0.24 and 5.1.12. This bug breaks replication from a pre-5.0.24/pre-5.1.12 master. But the present bugfix, as it makes INSERT ON DUP KEY UPDATE behave like pre-5.0.24/pre-5.1.12, breaks replication from a [5.0.24,5.0.34]/[5.1.12,5.1.15] master to a fixed (5.0.36/5.1.16) slave! To warn users against this when they upgrade their slave, as agreed with the support team, we add code for a fixed slave to detect that it is connected to a buggy master in a situation (INSERT ON DUP KEY UPDATE into autoinc column) likely to break replication, in which case it cannot replicate so stops and prints a message to the slave's error log and to SHOW SLAVE STATUS. For 5.0.36->[5.0.24,5.0.34] replication or 5.1.16->[5.1.12,5.1.15] replication we cannot warn as master does not know the slave's version (but we always recommended to users to have slave at least as new as master). As agreed with support, I have asked for an alert to be put into the MySQL Network Monitoring and Advisory Service. 3) note that I'll re-enable rpl_insert_id as soon as 5.1-rpl gets the changes from the main 5.1.
2007-02-15 20:28:58 +01:00
DBUG_RETURN(1);
#endif
2000-07-31 21:29:14 +02:00
thd->cuted_fields=0;
2005-02-02 21:39:21 +01:00
if (info.ignore || info.handle_duplicates != DUP_ERROR)
table->file->extra(HA_EXTRA_IGNORE_DUP_KEY);
if (info.handle_duplicates == DUP_REPLACE &&
(!table->triggers || !table->triggers->has_delete_triggers()))
table->file->extra(HA_EXTRA_WRITE_CAN_REPLACE);
if (info.handle_duplicates == DUP_UPDATE)
table->file->extra(HA_EXTRA_INSERT_WITH_UPDATE);
thd->abort_on_warning= (!info.ignore &&
(thd->variables.sql_mode &
(MODE_STRICT_TRANS_TABLES |
MODE_STRICT_ALL_TABLES)));
res= (table_list->prepare_where(thd, 0, TRUE) ||
table_list->prepare_check_option(thd));
Fix for bug#18437 "Wrong values inserted with a before update trigger on NDB table". SQL-layer was not marking fields which were used in triggers as such. As result these fields were not always properly retrieved/stored by handler layer. So one might got wrong values or lost changes in triggers for NDB, Federated and possibly InnoDB tables. This fix solves the problem by marking fields used in triggers appropriately. Also this patch contains the following cleanup of ha_ndbcluster code: We no longer rely on reading LEX::sql_command value in handler in order to determine if we can enable optimization which allows us to handle REPLACE statement in more efficient way by doing replaces directly in write_row() method without reporting error to SQL-layer. Instead we rely on SQL-layer informing us whether this optimization applicable by calling handler::extra() method with HA_EXTRA_WRITE_CAN_REPLACE flag. As result we no longer apply this optimzation in cases when it should not be used (e.g. if we have on delete triggers on table) and use in some additional cases when it is applicable (e.g. for LOAD DATA REPLACE). Finally this patch includes fix for bug#20728 "REPLACE does not work correctly for NDB table with PK and unique index". This was yet another problem which was caused by improper field mark-up. During row replacement fields which weren't explicity used in REPLACE statement were not marked as fields to be saved (updated) so they have retained values from old row version. The fix is to mark all table fields as set for REPLACE statement. Note that in 5.1 we already solve this problem by notifying handler that it should save values from all fields only in case when real replacement happens.
2006-07-01 23:51:10 +02:00
if (!res)
2007-04-04 16:58:25 +02:00
prepare_triggers_for_insert_stmt(table);
DBUG_RETURN(res);
2000-07-31 21:29:14 +02:00
}
/*
Finish the preparation of the result table.
SYNOPSIS
select_insert::prepare2()
void
DESCRIPTION
If the result table is the same as one of the source tables (INSERT SELECT),
the result table is not finally prepared at the join prepair phase.
Do the final preparation now.
RETURN
0 OK
*/
int select_insert::prepare2(void)
{
DBUG_ENTER("select_insert::prepare2");
if (thd->lex->current_select->options & OPTION_BUFFER_RESULT &&
thd->locked_tables_mode <= LTM_LOCK_TABLES)
table->file->ha_start_bulk_insert((ha_rows) 0);
2005-02-02 21:39:21 +01:00
DBUG_RETURN(0);
}
void select_insert::cleanup()
{
/* select_insert/select_create are never re-used in prepared statement */
DBUG_ASSERT(0);
}
2000-07-31 21:29:14 +02:00
select_insert::~select_insert()
{
DBUG_ENTER("~select_insert");
2000-07-31 21:29:14 +02:00
if (table)
{
table->next_number_field=0;
table->auto_increment_field_not_null= FALSE;
table->file->ha_reset();
2000-07-31 21:29:14 +02:00
}
thd->count_cuted_fields= CHECK_FIELD_IGNORE;
thd->abort_on_warning= 0;
DBUG_VOID_RETURN;
2000-07-31 21:29:14 +02:00
}
bool select_insert::send_data(List<Item> &values)
{
DBUG_ENTER("select_insert::send_data");
bool error=0;
if (unit->offset_limit_cnt)
2000-07-31 21:29:14 +02:00
{ // using limit offset,count
unit->offset_limit_cnt--;
DBUG_RETURN(0);
2000-07-31 21:29:14 +02:00
}
2004-12-06 10:38:56 +01:00
thd->count_cuted_fields= CHECK_FIELD_WARN; // Calculate cuted fields
store_values(values);
thd->count_cuted_fields= CHECK_FIELD_ERROR_FOR_NULL;
if (thd->is_error())
{
table->auto_increment_field_not_null= FALSE;
2004-12-06 10:38:56 +01:00
DBUG_RETURN(1);
}
if (table_list) // Not CREATE ... SELECT
{
switch (table_list->view_check_option(thd, info.ignore)) {
case VIEW_CHECK_SKIP:
DBUG_RETURN(0);
case VIEW_CHECK_ERROR:
DBUG_RETURN(1);
}
2004-09-03 14:18:40 +02:00
}
// Release latches in case bulk insert takes a long time
ha_release_temporary_latches(thd);
error= write_record(thd, table, &info);
table->auto_increment_field_not_null= FALSE;
if (!error)
2000-07-31 21:29:14 +02:00
{
2005-06-22 19:07:28 +02:00
if (table->triggers || info.handle_duplicates == DUP_UPDATE)
{
/*
2005-06-22 19:07:28 +02:00
Restore fields of the record since it is possible that they were
changed by ON DUPLICATE KEY UPDATE clause.
If triggers exist then whey can modify some fields which were not
originally touched by INSERT ... SELECT, so we have to restore
their original values for the next row.
*/
restore_record(table, s->default_values);
}
if (table->next_number_field)
{
WL#3146 "less locking in auto_increment": this is a cleanup patch for our current auto_increment handling: new names for auto_increment variables in THD, new methods to manipulate them (see sql_class.h), some move into handler::, causing less backup/restore work when executing substatements. This makes the logic hopefully clearer, less work is is needed in mysql_insert(). By cleaning up, using different variables for different purposes (instead of one for 3 things...), we fix those bugs, which someone may want to fix in 5.0 too: BUG#20339 "stored procedure using LAST_INSERT_ID() does not replicate statement-based" BUG#20341 "stored function inserting into one auto_increment puts bad data in slave" BUG#19243 "wrong LAST_INSERT_ID() after ON DUPLICATE KEY UPDATE" (now if a row is updated, LAST_INSERT_ID() will return its id) and re-fixes: BUG#6880 "LAST_INSERT_ID() value changes during multi-row INSERT" (already fixed differently by Ramil in 4.1) Test of documented behaviour of mysql_insert_id() (there was no test). The behaviour changes introduced are: - LAST_INSERT_ID() now returns "the first autogenerated auto_increment value successfully inserted", instead of "the first autogenerated auto_increment value if any row was successfully inserted", see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() returns the id of the updated row if ON DUPLICATE KEY UPDATE, see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() does not change if no autogenerated value was successfully inserted (it used to then be 0), see auto_increment.test. - if in INSERT SELECT no autogenerated value was successfully inserted, mysql_insert_id() now returns the id of the last inserted row (it already did this for INSERT VALUES), see mysql_client_test.c. - if INSERT SELECT uses LAST_INSERT_ID(X), mysql_insert_id() now returns X (it already did this for INSERT VALUES), see mysql_client_test.c. - NDB now behaves like other engines wrt SET INSERT_ID: with INSERT IGNORE, the id passed in SET INSERT_ID is re-used until a row succeeds; SET INSERT_ID influences not only the first row now. Additionally, when unlocking a table we check that the thread is not keeping a next_insert_id (as the table is unlocked that id is potentially out-of-date); forgetting about this next_insert_id is done in a new handler::ha_release_auto_increment(). Finally we prepare for engines capable of reserving finite-length intervals of auto_increment values: we store such intervals in THD. The next step (to be done by the replication team in 5.1) is to read those intervals from THD and actually store them in the statement-based binary log. NDB will be a good engine to test that.
2006-07-09 17:52:19 +02:00
/*
If no value has been autogenerated so far, we need to remember the
value we just saw, we may need to send it to client in the end.
*/
if (thd->first_successful_insert_id_in_cur_stmt == 0) // optimization
autoinc_value_of_last_inserted_row=
table->next_number_field->val_int();
/*
Clear auto-increment field for the next record, if triggers are used
we will clear it twice, but this should be cheap.
*/
table->next_number_field->reset();
}
2000-07-31 21:29:14 +02:00
}
DBUG_RETURN(error);
2000-07-31 21:29:14 +02:00
}
void select_insert::store_values(List<Item> &values)
{
if (fields->elements)
fill_record_n_invoke_before_triggers(thd, *fields, values, 1,
table->triggers, TRG_EVENT_INSERT);
else
fill_record_n_invoke_before_triggers(thd, table->field, values, 1,
table->triggers, TRG_EVENT_INSERT);
}
2000-07-31 21:29:14 +02:00
void select_insert::send_error(uint errcode,const char *err)
{
2003-10-08 11:01:58 +02:00
DBUG_ENTER("select_insert::send_error");
my_message(errcode, err, MYF(0));
2003-10-08 11:01:58 +02:00
DBUG_VOID_RETURN;
2000-07-31 21:29:14 +02:00
}
bool select_insert::send_eof()
{
int error;
bool const trans_table= table->file->has_transactions();
ulonglong id, row_count;
bool changed;
THD::killed_state killed_status= thd->killed;
DBUG_ENTER("select_insert::send_eof");
DBUG_PRINT("enter", ("trans_table=%d, table_type='%s'",
trans_table, table->file->table_type()));
error= (thd->locked_tables_mode <= LTM_LOCK_TABLES ?
table->file->ha_end_bulk_insert() : 0);
if (!error && thd->is_error())
error= thd->stmt_da->sql_errno();
table->file->extra(HA_EXTRA_NO_IGNORE_DUP_KEY);
Fix for bug#18437 "Wrong values inserted with a before update trigger on NDB table". SQL-layer was not marking fields which were used in triggers as such. As result these fields were not always properly retrieved/stored by handler layer. So one might got wrong values or lost changes in triggers for NDB, Federated and possibly InnoDB tables. This fix solves the problem by marking fields used in triggers appropriately. Also this patch contains the following cleanup of ha_ndbcluster code: We no longer rely on reading LEX::sql_command value in handler in order to determine if we can enable optimization which allows us to handle REPLACE statement in more efficient way by doing replaces directly in write_row() method without reporting error to SQL-layer. Instead we rely on SQL-layer informing us whether this optimization applicable by calling handler::extra() method with HA_EXTRA_WRITE_CAN_REPLACE flag. As result we no longer apply this optimzation in cases when it should not be used (e.g. if we have on delete triggers on table) and use in some additional cases when it is applicable (e.g. for LOAD DATA REPLACE). Finally this patch includes fix for bug#20728 "REPLACE does not work correctly for NDB table with PK and unique index". This was yet another problem which was caused by improper field mark-up. During row replacement fields which weren't explicity used in REPLACE statement were not marked as fields to be saved (updated) so they have retained values from old row version. The fix is to mark all table fields as set for REPLACE statement. Note that in 5.1 we already solve this problem by notifying handler that it should save values from all fields only in case when real replacement happens.
2006-07-01 23:51:10 +02:00
table->file->extra(HA_EXTRA_WRITE_CANNOT_REPLACE);
changed= (info.copied || info.deleted || info.updated);
if (changed)
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
{
/*
We must invalidate the table in the query cache before binlog writing
and ha_autocommit_or_rollback.
*/
query_cache_invalidate3(thd, table, 1);
2003-08-29 12:44:35 +02:00
}
if (thd->transaction.stmt.modified_non_trans_table)
thd->transaction.all.modified_non_trans_table= TRUE;
DBUG_ASSERT(trans_table || !changed ||
(pushing for Andrei) Bug #27417 thd->no_trans_update.stmt lost value inside of SF-exec-stack Once had been set the flag might later got reset inside of a stored routine execution stack. The reason was in that there was no check if a new statement started at time of resetting. The artifact affects most of binlogable DML queries. Notice, that multi-update is wrapped up within bug@27716 fix, multi-delete bug@29136. Fixed with saving parent's statement flag of whether the statement modified non-transactional table, and unioning (merging) the value with that was gained in mysql_execute_command. Resettling thd->no_trans_update members into thd->transaction.`member`; Asserting code; Effectively the following properties are held. 1. At the end of a substatement thd->transaction.stmt.modified_non_trans_table reflects the fact if such a table got modified by the substatement. That also respects THD::really_abort_on_warnin() requirements. 2. Eventually thd->transaction.stmt.modified_non_trans_table will be computed as the union of the values of all invoked sub-statements. That fixes this bug#27417; Computing of thd->transaction.all.modified_non_trans_table is refined to base to the stmt's value for all the case including insert .. select statement which before the patch had an extra issue bug@28960. Minor issues are covered with mysql_load, mysql_delete, and binloggin of insert in to temp_table select. The supplied test verifies limitely, mostly asserts. The ultimate testing is defered for bug@13270, bug@23333.
2007-07-30 17:27:36 +02:00
thd->transaction.stmt.modified_non_trans_table);
/*
Write to binlog before commiting transaction. No statement will
be written by the binlog_query() below in RBR mode. All the
events are in the transaction cache and will be written when
ha_autocommit_or_rollback() is issued below.
*/
if (mysql_bin_log.is_open() &&
(!error || thd->transaction.stmt.modified_non_trans_table))
{
int errcode= 0;
if (!error)
thd->clear_error();
else
errcode= query_error_code(thd, killed_status == THD::NOT_KILLED);
if (thd->binlog_query(THD::ROW_QUERY_TYPE,
thd->query(), thd->query_length(),
merge mysql-5.1-rep+2-delivery1 --> mysql-5.1-rpl-merge Conflicts: Text conflict in .bzr-mysql/default.conf Text conflict in mysql-test/extra/rpl_tests/rpl_loaddata.test Text conflict in mysql-test/r/mysqlbinlog2.result Text conflict in mysql-test/suite/binlog/r/binlog_stm_mix_innodb_myisam.result Text conflict in mysql-test/suite/binlog/r/binlog_unsafe.result Text conflict in mysql-test/suite/rpl/r/rpl_insert_id.result Text conflict in mysql-test/suite/rpl/r/rpl_loaddata.result Text conflict in mysql-test/suite/rpl/r/rpl_stm_auto_increment_bug33029.result Text conflict in mysql-test/suite/rpl/r/rpl_udf.result Text conflict in mysql-test/suite/rpl/t/rpl_slow_query_log.test Text conflict in sql/field.h Text conflict in sql/log.cc Text conflict in sql/log_event.cc Text conflict in sql/log_event_old.cc Text conflict in sql/mysql_priv.h Text conflict in sql/share/errmsg.txt Text conflict in sql/sp.cc Text conflict in sql/sql_acl.cc Text conflict in sql/sql_base.cc Text conflict in sql/sql_class.h Text conflict in sql/sql_db.cc Text conflict in sql/sql_delete.cc Text conflict in sql/sql_insert.cc Text conflict in sql/sql_lex.cc Text conflict in sql/sql_lex.h Text conflict in sql/sql_load.cc Text conflict in sql/sql_table.cc Text conflict in sql/sql_update.cc Text conflict in sql/sql_view.cc Conflict adding files to storage/innobase. Created directory. Conflict because storage/innobase is not versioned, but has versioned children. Versioned directory. Conflict adding file storage/innobase. Moved existing file to storage/innobase.moved. Conflict adding files to storage/innobase/handler. Created directory. Conflict because storage/innobase/handler is not versioned, but has versioned children. Versioned directory. Contents conflict in storage/innobase/handler/ha_innodb.cc
2010-01-07 16:39:11 +01:00
trans_table, FALSE, FALSE, errcode))
{
table->file->ha_release_auto_increment();
DBUG_RETURN(1);
}
}
table->file->ha_release_auto_increment();
if (error)
2000-07-31 21:29:14 +02:00
{
table->file->print_error(error,MYF(0));
DBUG_RETURN(1);
2000-07-31 21:29:14 +02:00
}
char buff[160];
if (info.ignore)
sprintf(buff, ER(ER_INSERT_INFO), (ulong) info.records,
(ulong) (info.records - info.copied),
(ulong) thd->warning_info->statement_warn_count());
2000-07-31 21:29:14 +02:00
else
sprintf(buff, ER(ER_INSERT_INFO), (ulong) info.records,
(ulong) (info.deleted+info.updated),
(ulong) thd->warning_info->statement_warn_count());
row_count= info.copied + info.deleted +
((thd->client_capabilities & CLIENT_FOUND_ROWS) ?
info.touched : info.updated);
WL#3146 "less locking in auto_increment": this is a cleanup patch for our current auto_increment handling: new names for auto_increment variables in THD, new methods to manipulate them (see sql_class.h), some move into handler::, causing less backup/restore work when executing substatements. This makes the logic hopefully clearer, less work is is needed in mysql_insert(). By cleaning up, using different variables for different purposes (instead of one for 3 things...), we fix those bugs, which someone may want to fix in 5.0 too: BUG#20339 "stored procedure using LAST_INSERT_ID() does not replicate statement-based" BUG#20341 "stored function inserting into one auto_increment puts bad data in slave" BUG#19243 "wrong LAST_INSERT_ID() after ON DUPLICATE KEY UPDATE" (now if a row is updated, LAST_INSERT_ID() will return its id) and re-fixes: BUG#6880 "LAST_INSERT_ID() value changes during multi-row INSERT" (already fixed differently by Ramil in 4.1) Test of documented behaviour of mysql_insert_id() (there was no test). The behaviour changes introduced are: - LAST_INSERT_ID() now returns "the first autogenerated auto_increment value successfully inserted", instead of "the first autogenerated auto_increment value if any row was successfully inserted", see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() returns the id of the updated row if ON DUPLICATE KEY UPDATE, see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() does not change if no autogenerated value was successfully inserted (it used to then be 0), see auto_increment.test. - if in INSERT SELECT no autogenerated value was successfully inserted, mysql_insert_id() now returns the id of the last inserted row (it already did this for INSERT VALUES), see mysql_client_test.c. - if INSERT SELECT uses LAST_INSERT_ID(X), mysql_insert_id() now returns X (it already did this for INSERT VALUES), see mysql_client_test.c. - NDB now behaves like other engines wrt SET INSERT_ID: with INSERT IGNORE, the id passed in SET INSERT_ID is re-used until a row succeeds; SET INSERT_ID influences not only the first row now. Additionally, when unlocking a table we check that the thread is not keeping a next_insert_id (as the table is unlocked that id is potentially out-of-date); forgetting about this next_insert_id is done in a new handler::ha_release_auto_increment(). Finally we prepare for engines capable of reserving finite-length intervals of auto_increment values: we store such intervals in THD. The next step (to be done by the replication team in 5.1) is to read those intervals from THD and actually store them in the statement-based binary log. NDB will be a good engine to test that.
2006-07-09 17:52:19 +02:00
id= (thd->first_successful_insert_id_in_cur_stmt > 0) ?
thd->first_successful_insert_id_in_cur_stmt :
(thd->arg_of_last_insert_id_function ?
thd->first_successful_insert_id_in_prev_stmt :
(info.copied ? autoinc_value_of_last_inserted_row : 0));
::my_ok(thd, row_count, id, buff);
DBUG_RETURN(0);
2000-07-31 21:29:14 +02:00
}
void select_insert::abort_result_set() {
DBUG_ENTER("select_insert::abort_result_set");
/*
If the creation of the table failed (due to a syntax error, for
example), no table will have been opened and therefore 'table'
will be NULL. In that case, we still need to execute the rollback
and the end of the function.
*/
if (table)
{
bool changed, transactional_table;
/*
If we are not in prelocked mode, we end the bulk insert started
before.
*/
if (thd->locked_tables_mode <= LTM_LOCK_TABLES)
table->file->ha_end_bulk_insert();
/*
If at least one row has been inserted/modified and will stay in
the table (the table doesn't have transactions) we must write to
the binlog (and the error code will make the slave stop).
For many errors (example: we got a duplicate key error while
inserting into a MyISAM table), no row will be added to the table,
so passing the error to the slave will not help since there will
be an error code mismatch (the inserts will succeed on the slave
with no error).
If table creation failed, the number of rows modified will also be
zero, so no check for that is made.
*/
changed= (info.copied || info.deleted || info.updated);
transactional_table= table->file->has_transactions();
if (thd->transaction.stmt.modified_non_trans_table)
{
if (!can_rollback_data())
thd->transaction.all.modified_non_trans_table= TRUE;
if (mysql_bin_log.is_open())
{
int errcode= query_error_code(thd, thd->killed == THD::NOT_KILLED);
/* error of writing binary log is ignored */
(void) thd->binlog_query(THD::ROW_QUERY_TYPE, thd->query(),
thd->query_length(),
merge mysql-5.1-rep+2-delivery1 --> mysql-5.1-rpl-merge Conflicts: Text conflict in .bzr-mysql/default.conf Text conflict in mysql-test/extra/rpl_tests/rpl_loaddata.test Text conflict in mysql-test/r/mysqlbinlog2.result Text conflict in mysql-test/suite/binlog/r/binlog_stm_mix_innodb_myisam.result Text conflict in mysql-test/suite/binlog/r/binlog_unsafe.result Text conflict in mysql-test/suite/rpl/r/rpl_insert_id.result Text conflict in mysql-test/suite/rpl/r/rpl_loaddata.result Text conflict in mysql-test/suite/rpl/r/rpl_stm_auto_increment_bug33029.result Text conflict in mysql-test/suite/rpl/r/rpl_udf.result Text conflict in mysql-test/suite/rpl/t/rpl_slow_query_log.test Text conflict in sql/field.h Text conflict in sql/log.cc Text conflict in sql/log_event.cc Text conflict in sql/log_event_old.cc Text conflict in sql/mysql_priv.h Text conflict in sql/share/errmsg.txt Text conflict in sql/sp.cc Text conflict in sql/sql_acl.cc Text conflict in sql/sql_base.cc Text conflict in sql/sql_class.h Text conflict in sql/sql_db.cc Text conflict in sql/sql_delete.cc Text conflict in sql/sql_insert.cc Text conflict in sql/sql_lex.cc Text conflict in sql/sql_lex.h Text conflict in sql/sql_load.cc Text conflict in sql/sql_table.cc Text conflict in sql/sql_update.cc Text conflict in sql/sql_view.cc Conflict adding files to storage/innobase. Created directory. Conflict because storage/innobase is not versioned, but has versioned children. Versioned directory. Conflict adding file storage/innobase. Moved existing file to storage/innobase.moved. Conflict adding files to storage/innobase/handler. Created directory. Conflict because storage/innobase/handler is not versioned, but has versioned children. Versioned directory. Contents conflict in storage/innobase/handler/ha_innodb.cc
2010-01-07 16:39:11 +01:00
transactional_table, FALSE, FALSE, errcode);
}
if (changed)
query_cache_invalidate3(thd, table, 1);
}
DBUG_ASSERT(transactional_table || !changed ||
thd->transaction.stmt.modified_non_trans_table);
table->file->ha_release_auto_increment();
}
DBUG_VOID_RETURN;
}
2000-07-31 21:29:14 +02:00
/***************************************************************************
2002-07-25 00:00:56 +02:00
CREATE TABLE (SELECT) ...
2000-07-31 21:29:14 +02:00
***************************************************************************/
/**
Fix for: Bug #20662 "Infinite loop in CREATE TABLE IF NOT EXISTS ... SELECT with locked tables" Bug #20903 "Crash when using CREATE TABLE .. SELECT and triggers" Bug #24738 "CREATE TABLE ... SELECT is not isolated properly" Bug #24508 "Inconsistent results of CREATE TABLE ... SELECT when temporary table exists" Deadlock occured when one tried to execute CREATE TABLE IF NOT EXISTS ... SELECT statement under LOCK TABLES which held read lock on target table. Attempt to execute the same statement for already existing target table with triggers caused server crashes. Also concurrent execution of CREATE TABLE ... SELECT statement and other statements involving target table suffered from various races (some of which might've led to deadlocks). Finally, attempt to execute CREATE TABLE ... SELECT in case when a temporary table with same name was already present led to the insertion of data into this temporary table and creation of empty non-temporary table. All above problems stemmed from the old implementation of CREATE TABLE ... SELECT in which we created, opened and locked target table without any special protection in a separate step and not with the rest of tables used by this statement. This underminded deadlock-avoidance approach used in server and created window for races. It also excluded target table from prelocking causing problems with trigger execution. The patch solves these problems by implementing new approach to handling of CREATE TABLE ... SELECT for base tables. We try to open and lock table to be created at the same time as the rest of tables used by this statement. If such table does not exist at this moment we create and place in the table cache special placeholder for it which prevents its creation or any other usage by other threads. We still use old approach for creation of temporary tables. Note that we have separate fix for 5.0 since there we use slightly different less intrusive approach.
2007-05-11 19:51:03 +02:00
Create table from lists of fields and items (or just return TABLE
object for pre-opened existing table).
@param thd [in] Thread object
@param create_info [in] Create information (like MAX_ROWS, ENGINE or
temporary table flag)
@param create_table [in] Pointer to TABLE_LIST object providing database
and name for table to be created or to be open
@param alter_info [in/out] Initial list of columns and indexes for the
table to be created
@param items [in] List of items which should be used to produce
rest of fields for the table (corresponding
fields will be added to the end of
alter_info->create_list)
@param lock [out] Pointer to the MYSQL_LOCK object for table
created will be returned in this parameter.
Since this table is not included in THD::lock
caller is responsible for explicitly unlocking
this table.
@param hooks [in] Hooks to be invoked before and after obtaining
table lock on the table being created.
@note
This function assumes that either table exists and was pre-opened and
locked at open_and_lock_tables() stage (and in this case we just emit
error or warning and return pre-opened TABLE object) or an exclusive
metadata lock was acquired on table so we can safely create, open and
lock table in it (we don't acquire metadata lock if this create is
for temporary table).
@note
Fix for: Bug #20662 "Infinite loop in CREATE TABLE IF NOT EXISTS ... SELECT with locked tables" Bug #20903 "Crash when using CREATE TABLE .. SELECT and triggers" Bug #24738 "CREATE TABLE ... SELECT is not isolated properly" Bug #24508 "Inconsistent results of CREATE TABLE ... SELECT when temporary table exists" Deadlock occured when one tried to execute CREATE TABLE IF NOT EXISTS ... SELECT statement under LOCK TABLES which held read lock on target table. Attempt to execute the same statement for already existing target table with triggers caused server crashes. Also concurrent execution of CREATE TABLE ... SELECT statement and other statements involving target table suffered from various races (some of which might've led to deadlocks). Finally, attempt to execute CREATE TABLE ... SELECT in case when a temporary table with same name was already present led to the insertion of data into this temporary table and creation of empty non-temporary table. All above problems stemmed from the old implementation of CREATE TABLE ... SELECT in which we created, opened and locked target table without any special protection in a separate step and not with the rest of tables used by this statement. This underminded deadlock-avoidance approach used in server and created window for races. It also excluded target table from prelocking causing problems with trigger execution. The patch solves these problems by implementing new approach to handling of CREATE TABLE ... SELECT for base tables. We try to open and lock table to be created at the same time as the rest of tables used by this statement. If such table does not exist at this moment we create and place in the table cache special placeholder for it which prevents its creation or any other usage by other threads. We still use old approach for creation of temporary tables. Note that we have separate fix for 5.0 since there we use slightly different less intrusive approach.
2007-05-11 19:51:03 +02:00
Since this function contains some logic specific to CREATE TABLE ...
SELECT it should be changed before it can be used in other contexts.
@retval non-zero Pointer to TABLE object for table created or opened
@retval 0 Error
*/
static TABLE *create_table_from_items(THD *thd, HA_CREATE_INFO *create_info,
TABLE_LIST *create_table,
5.1 version of a fix and test cases for bugs: Bug#4968 ""Stored procedure crash if cursor opened on altered table" Bug#6895 "Prepared Statements: ALTER TABLE DROP COLUMN does nothing" Bug#19182 "CREATE TABLE bar (m INT) SELECT n FROM foo; doesn't work from stored procedure." Bug#19733 "Repeated alter, or repeated create/drop, fails" Bug#22060 "ALTER TABLE x AUTO_INCREMENT=y in SP crashes server" Bug#24879 "Prepared Statements: CREATE TABLE (UTF8 KEY) produces a growing key length" (this bug is not fixed in 5.0) Re-execution of CREATE DATABASE, CREATE TABLE and ALTER TABLE statements in stored routines or as prepared statements caused incorrect results (and crashes in versions prior to 5.0.25). In 5.1 the problem occured only for CREATE DATABASE, CREATE TABLE SELECT and CREATE TABLE with INDEX/DATA DIRECTOY options). The problem of bugs 4968, 19733, 19282 and 6895 was that functions mysql_prepare_table, mysql_create_table and mysql_alter_table are not re-execution friendly: during their operation they modify contents of LEX (members create_info, alter_info, key_list, create_list), thus making the LEX unusable for the next execution. In particular, these functions removed processed columns and keys from create_list, key_list and drop_list. Search the code in sql_table.cc for drop_it.remove() and similar patterns to find evidence. The fix is to supply to these functions a usable copy of each of the above structures at every re-execution of an SQL statement. To simplify memory management, LEX::key_list and LEX::create_list were added to LEX::alter_info, a fresh copy of which is created for every execution. The problem of crashing bug 22060 stemmed from the fact that the above metnioned functions were not only modifying HA_CREATE_INFO structure in LEX, but also were changing it to point to areas in volatile memory of the execution memory root. The patch solves this problem by creating and using an on-stack copy of HA_CREATE_INFO in mysql_execute_command. Additionally, this patch splits the part of mysql_alter_table that analizes and rewrites information from the parser into a separate function - mysql_prepare_alter_table, in analogy with mysql_prepare_table, which is renamed to mysql_prepare_create_table.
2007-05-28 13:30:01 +02:00
Alter_info *alter_info,
2006-05-18 13:35:15 +02:00
List<Item> *items,
MYSQL_LOCK **lock,
TABLEOP_HOOKS *hooks)
{
TABLE tmp_table; // Used during 'Create_field()'
2006-05-18 13:35:15 +02:00
TABLE_SHARE share;
TABLE *table= 0;
uint select_field_count= items->elements;
/* Add selected items to field list */
List_iterator_fast<Item> it(*items);
Item *item;
Field *tmp_field;
DBUG_ENTER("create_table_from_items");
tmp_table.alias= 0;
tmp_table.timestamp_field= 0;
2006-05-18 13:35:15 +02:00
tmp_table.s= &share;
Bug#26379 - Combination of FLUSH TABLE and REPAIR TABLE corrupts a MERGE table Bug 26867 - LOCK TABLES + REPAIR + merge table result in memory/cpu hogging Bug 26377 - Deadlock with MERGE and FLUSH TABLE Bug 25038 - Waiting TRUNCATE Bug 25700 - merge base tables get corrupted by optimize/analyze/repair table Bug 30275 - Merge tables: flush tables or unlock tables causes server to crash Bug 19627 - temporary merge table locking Bug 27660 - Falcon: merge table possible Bug 30273 - merge tables: Can't lock file (errno: 155) The problems were: Bug 26379 - Combination of FLUSH TABLE and REPAIR TABLE corrupts a MERGE table 1. A thread trying to lock a MERGE table performs busy waiting while REPAIR TABLE or a similar table administration task is ongoing on one or more of its MyISAM tables. 2. A thread trying to lock a MERGE table performs busy waiting until all threads that did REPAIR TABLE or similar table administration tasks on one or more of its MyISAM tables in LOCK TABLES segments do UNLOCK TABLES. The difference against problem #1 is that the busy waiting takes place *after* the administration task. It is terminated by UNLOCK TABLES only. 3. Two FLUSH TABLES within a LOCK TABLES segment can invalidate the lock. This does *not* require a MERGE table. The first FLUSH TABLES can be replaced by any statement that requires other threads to reopen the table. In 5.0 and 5.1 a single FLUSH TABLES can provoke the problem. Bug 26867 - LOCK TABLES + REPAIR + merge table result in memory/cpu hogging Trying DML on a MERGE table, which has a child locked and repaired by another thread, made an infinite loop in the server. Bug 26377 - Deadlock with MERGE and FLUSH TABLE Locking a MERGE table and its children in parent-child order and flushing the child deadlocked the server. Bug 25038 - Waiting TRUNCATE Truncating a MERGE child, while the MERGE table was in use, let the truncate fail instead of waiting for the table to become free. Bug 25700 - merge base tables get corrupted by optimize/analyze/repair table Repairing a child of an open MERGE table corrupted the child. It was necessary to FLUSH the child first. Bug 30275 - Merge tables: flush tables or unlock tables causes server to crash Flushing and optimizing locked MERGE children crashed the server. Bug 19627 - temporary merge table locking Use of a temporary MERGE table with non-temporary children could corrupt the children. Temporary tables are never locked. So we do now prohibit non-temporary chidlren of a temporary MERGE table. Bug 27660 - Falcon: merge table possible It was possible to create a MERGE table with non-MyISAM children. Bug 30273 - merge tables: Can't lock file (errno: 155) This was a Windows-only bug. Table administration statements sometimes failed with "Can't lock file (errno: 155)". These bugs are fixed by a new implementation of MERGE table open. When opening a MERGE table in open_tables() we do now add the child tables to the list of tables to be opened by open_tables() (the "query_list"). The children are not opened in the handler at this stage. After opening the parent, open_tables() opens each child from the now extended query_list. When the last child is opened, we remove the children from the query_list again and attach the children to the parent. This behaves similar to the old open. However it does not open the MyISAM tables directly, but grabs them from the already open children. When closing a MERGE table in close_thread_table() we detach the children only. Closing of the children is done implicitly because they are in thd->open_tables. For more detail see the comment at the top of ha_myisammrg.cc. Changed from open_ltable() to open_and_lock_tables() in all places that can be relevant for MERGE tables. The latter can handle tables added to the list on the fly. When open_ltable() was used in a loop over a list of tables, the list must be temporarily terminated after every table for open_and_lock_tables(). table_list->required_type is set to FRMTYPE_TABLE to avoid open of special tables. Handling of derived tables is suppressed. These details are handled by the new function open_n_lock_single_table(), which has nearly the same signature as open_ltable() and can replace it in most cases. In reopen_tables() some of the tables open by a thread can be closed and reopened. When a MERGE child is affected, the parent must be closed and reopened too. Closing of the parent is forced before the first child is closed. Reopen happens in the order of thd->open_tables. MERGE parents do not attach their children automatically at open. This is done after all tables are reopened. So all children are open when attaching them. Special lock handling like mysql_lock_abort() or mysql_lock_remove() needs to be suppressed for MERGE children or forwarded to the parent. This depends on the situation. In loops over all open tables one suppresses child lock handling. When a single table is touched, forwarding is done. Behavioral changes: =================== This patch changes the behavior of temporary MERGE tables. Temporary MERGE must have temporary children. The old behavior was wrong. A temporary table is not locked. Hence even non-temporary children were not locked. See Bug 19627 - temporary merge table locking. You cannot change the union list of a non-temporary MERGE table when LOCK TABLES is in effect. The following does *not* work: CREATE TABLE m1 ... ENGINE=MRG_MYISAM ...; LOCK TABLES t1 WRITE, t2 WRITE, m1 WRITE; ALTER TABLE m1 ... UNION=(t1,t2) ...; However, you can do this with a temporary MERGE table. You cannot create a MERGE table with CREATE ... SELECT, neither as a temporary MERGE table, nor as a non-temporary MERGE table. CREATE TABLE m1 ... ENGINE=MRG_MYISAM ... SELECT ...; Gives error message: table is not BASE TABLE.
2007-11-15 20:25:43 +01:00
init_tmp_table_share(thd, &share, "", 0, "", "");
2006-05-18 13:35:15 +02:00
tmp_table.s->db_create_options=0;
tmp_table.s->blob_ptr_size= portable_sizeof_char_ptr;
2006-05-18 13:35:15 +02:00
tmp_table.s->db_low_byte_first=
test(create_info->db_type == myisam_hton ||
create_info->db_type == heap_hton);
tmp_table.null_row=tmp_table.maybe_null=0;
while ((item=it++))
{
Create_field *cr_field;
Field *field, *def_field;
if (item->type() == Item::FUNC_ITEM)
if (item->result_type() != STRING_RESULT)
field= item->tmp_table_field(&tmp_table);
else
field= item->tmp_table_field_from_field_type(&tmp_table, 0);
else
field= create_tmp_field(thd, &tmp_table, item, item->type(),
(Item ***) 0, &tmp_field, &def_field, 0, 0, 0, 0,
0);
if (!field ||
!(cr_field=new Create_field(field,(item->type() == Item::FIELD_ITEM ?
((Item_field *)item)->field :
(Field*) 0))))
DBUG_RETURN(0);
if (item->maybe_null)
cr_field->flags &= ~NOT_NULL_FLAG;
5.1 version of a fix and test cases for bugs: Bug#4968 ""Stored procedure crash if cursor opened on altered table" Bug#6895 "Prepared Statements: ALTER TABLE DROP COLUMN does nothing" Bug#19182 "CREATE TABLE bar (m INT) SELECT n FROM foo; doesn't work from stored procedure." Bug#19733 "Repeated alter, or repeated create/drop, fails" Bug#22060 "ALTER TABLE x AUTO_INCREMENT=y in SP crashes server" Bug#24879 "Prepared Statements: CREATE TABLE (UTF8 KEY) produces a growing key length" (this bug is not fixed in 5.0) Re-execution of CREATE DATABASE, CREATE TABLE and ALTER TABLE statements in stored routines or as prepared statements caused incorrect results (and crashes in versions prior to 5.0.25). In 5.1 the problem occured only for CREATE DATABASE, CREATE TABLE SELECT and CREATE TABLE with INDEX/DATA DIRECTOY options). The problem of bugs 4968, 19733, 19282 and 6895 was that functions mysql_prepare_table, mysql_create_table and mysql_alter_table are not re-execution friendly: during their operation they modify contents of LEX (members create_info, alter_info, key_list, create_list), thus making the LEX unusable for the next execution. In particular, these functions removed processed columns and keys from create_list, key_list and drop_list. Search the code in sql_table.cc for drop_it.remove() and similar patterns to find evidence. The fix is to supply to these functions a usable copy of each of the above structures at every re-execution of an SQL statement. To simplify memory management, LEX::key_list and LEX::create_list were added to LEX::alter_info, a fresh copy of which is created for every execution. The problem of crashing bug 22060 stemmed from the fact that the above metnioned functions were not only modifying HA_CREATE_INFO structure in LEX, but also were changing it to point to areas in volatile memory of the execution memory root. The patch solves this problem by creating and using an on-stack copy of HA_CREATE_INFO in mysql_execute_command. Additionally, this patch splits the part of mysql_alter_table that analizes and rewrites information from the parser into a separate function - mysql_prepare_alter_table, in analogy with mysql_prepare_table, which is renamed to mysql_prepare_create_table.
2007-05-28 13:30:01 +02:00
alter_info->create_list.push_back(cr_field);
}
Fix for: Bug #20662 "Infinite loop in CREATE TABLE IF NOT EXISTS ... SELECT with locked tables" Bug #20903 "Crash when using CREATE TABLE .. SELECT and triggers" Bug #24738 "CREATE TABLE ... SELECT is not isolated properly" Bug #24508 "Inconsistent results of CREATE TABLE ... SELECT when temporary table exists" Deadlock occured when one tried to execute CREATE TABLE IF NOT EXISTS ... SELECT statement under LOCK TABLES which held read lock on target table. Attempt to execute the same statement for already existing target table with triggers caused server crashes. Also concurrent execution of CREATE TABLE ... SELECT statement and other statements involving target table suffered from various races (some of which might've led to deadlocks). Finally, attempt to execute CREATE TABLE ... SELECT in case when a temporary table with same name was already present led to the insertion of data into this temporary table and creation of empty non-temporary table. All above problems stemmed from the old implementation of CREATE TABLE ... SELECT in which we created, opened and locked target table without any special protection in a separate step and not with the rest of tables used by this statement. This underminded deadlock-avoidance approach used in server and created window for races. It also excluded target table from prelocking causing problems with trigger execution. The patch solves these problems by implementing new approach to handling of CREATE TABLE ... SELECT for base tables. We try to open and lock table to be created at the same time as the rest of tables used by this statement. If such table does not exist at this moment we create and place in the table cache special placeholder for it which prevents its creation or any other usage by other threads. We still use old approach for creation of temporary tables. Note that we have separate fix for 5.0 since there we use slightly different less intrusive approach.
2007-05-11 19:51:03 +02:00
DBUG_EXECUTE_IF("sleep_create_select_before_create", my_sleep(6000000););
/*
Fix for: Bug #20662 "Infinite loop in CREATE TABLE IF NOT EXISTS ... SELECT with locked tables" Bug #20903 "Crash when using CREATE TABLE .. SELECT and triggers" Bug #24738 "CREATE TABLE ... SELECT is not isolated properly" Bug #24508 "Inconsistent results of CREATE TABLE ... SELECT when temporary table exists" Deadlock occured when one tried to execute CREATE TABLE IF NOT EXISTS ... SELECT statement under LOCK TABLES which held read lock on target table. Attempt to execute the same statement for already existing target table with triggers caused server crashes. Also concurrent execution of CREATE TABLE ... SELECT statement and other statements involving target table suffered from various races (some of which might've led to deadlocks). Finally, attempt to execute CREATE TABLE ... SELECT in case when a temporary table with same name was already present led to the insertion of data into this temporary table and creation of empty non-temporary table. All above problems stemmed from the old implementation of CREATE TABLE ... SELECT in which we created, opened and locked target table without any special protection in a separate step and not with the rest of tables used by this statement. This underminded deadlock-avoidance approach used in server and created window for races. It also excluded target table from prelocking causing problems with trigger execution. The patch solves these problems by implementing new approach to handling of CREATE TABLE ... SELECT for base tables. We try to open and lock table to be created at the same time as the rest of tables used by this statement. If such table does not exist at this moment we create and place in the table cache special placeholder for it which prevents its creation or any other usage by other threads. We still use old approach for creation of temporary tables. Note that we have separate fix for 5.0 since there we use slightly different less intrusive approach.
2007-05-11 19:51:03 +02:00
Create and lock table.
Note that we either creating (or opening existing) temporary table or
creating base table on which name we have exclusive lock. So code below
should not cause deadlocks or races.
We don't log the statement, it will be logged later.
If this is a HEAP table, the automatic DELETE FROM which is written to the
binlog when a HEAP table is opened for the first time since startup, must
not be written: 1) it would be wrong (imagine we're in CREATE SELECT: we
don't want to delete from it) 2) it would be written before the CREATE
TABLE, which is a wrong order. So we keep binary logging disabled when we
open_table().
*/
{
Fix for: Bug #20662 "Infinite loop in CREATE TABLE IF NOT EXISTS ... SELECT with locked tables" Bug #20903 "Crash when using CREATE TABLE .. SELECT and triggers" Bug #24738 "CREATE TABLE ... SELECT is not isolated properly" Bug #24508 "Inconsistent results of CREATE TABLE ... SELECT when temporary table exists" Deadlock occured when one tried to execute CREATE TABLE IF NOT EXISTS ... SELECT statement under LOCK TABLES which held read lock on target table. Attempt to execute the same statement for already existing target table with triggers caused server crashes. Also concurrent execution of CREATE TABLE ... SELECT statement and other statements involving target table suffered from various races (some of which might've led to deadlocks). Finally, attempt to execute CREATE TABLE ... SELECT in case when a temporary table with same name was already present led to the insertion of data into this temporary table and creation of empty non-temporary table. All above problems stemmed from the old implementation of CREATE TABLE ... SELECT in which we created, opened and locked target table without any special protection in a separate step and not with the rest of tables used by this statement. This underminded deadlock-avoidance approach used in server and created window for races. It also excluded target table from prelocking causing problems with trigger execution. The patch solves these problems by implementing new approach to handling of CREATE TABLE ... SELECT for base tables. We try to open and lock table to be created at the same time as the rest of tables used by this statement. If such table does not exist at this moment we create and place in the table cache special placeholder for it which prevents its creation or any other usage by other threads. We still use old approach for creation of temporary tables. Note that we have separate fix for 5.0 since there we use slightly different less intrusive approach.
2007-05-11 19:51:03 +02:00
if (!mysql_create_table_no_lock(thd, create_table->db,
create_table->table_name,
5.1 version of a fix and test cases for bugs: Bug#4968 ""Stored procedure crash if cursor opened on altered table" Bug#6895 "Prepared Statements: ALTER TABLE DROP COLUMN does nothing" Bug#19182 "CREATE TABLE bar (m INT) SELECT n FROM foo; doesn't work from stored procedure." Bug#19733 "Repeated alter, or repeated create/drop, fails" Bug#22060 "ALTER TABLE x AUTO_INCREMENT=y in SP crashes server" Bug#24879 "Prepared Statements: CREATE TABLE (UTF8 KEY) produces a growing key length" (this bug is not fixed in 5.0) Re-execution of CREATE DATABASE, CREATE TABLE and ALTER TABLE statements in stored routines or as prepared statements caused incorrect results (and crashes in versions prior to 5.0.25). In 5.1 the problem occured only for CREATE DATABASE, CREATE TABLE SELECT and CREATE TABLE with INDEX/DATA DIRECTOY options). The problem of bugs 4968, 19733, 19282 and 6895 was that functions mysql_prepare_table, mysql_create_table and mysql_alter_table are not re-execution friendly: during their operation they modify contents of LEX (members create_info, alter_info, key_list, create_list), thus making the LEX unusable for the next execution. In particular, these functions removed processed columns and keys from create_list, key_list and drop_list. Search the code in sql_table.cc for drop_it.remove() and similar patterns to find evidence. The fix is to supply to these functions a usable copy of each of the above structures at every re-execution of an SQL statement. To simplify memory management, LEX::key_list and LEX::create_list were added to LEX::alter_info, a fresh copy of which is created for every execution. The problem of crashing bug 22060 stemmed from the fact that the above metnioned functions were not only modifying HA_CREATE_INFO structure in LEX, but also were changing it to point to areas in volatile memory of the execution memory root. The patch solves this problem by creating and using an on-stack copy of HA_CREATE_INFO in mysql_execute_command. Additionally, this patch splits the part of mysql_alter_table that analizes and rewrites information from the parser into a separate function - mysql_prepare_alter_table, in analogy with mysql_prepare_table, which is renamed to mysql_prepare_create_table.
2007-05-28 13:30:01 +02:00
create_info, alter_info, 0,
select_field_count, NULL))
{
Fix for: Bug #20662 "Infinite loop in CREATE TABLE IF NOT EXISTS ... SELECT with locked tables" Bug #20903 "Crash when using CREATE TABLE .. SELECT and triggers" Bug #24738 "CREATE TABLE ... SELECT is not isolated properly" Bug #24508 "Inconsistent results of CREATE TABLE ... SELECT when temporary table exists" Deadlock occured when one tried to execute CREATE TABLE IF NOT EXISTS ... SELECT statement under LOCK TABLES which held read lock on target table. Attempt to execute the same statement for already existing target table with triggers caused server crashes. Also concurrent execution of CREATE TABLE ... SELECT statement and other statements involving target table suffered from various races (some of which might've led to deadlocks). Finally, attempt to execute CREATE TABLE ... SELECT in case when a temporary table with same name was already present led to the insertion of data into this temporary table and creation of empty non-temporary table. All above problems stemmed from the old implementation of CREATE TABLE ... SELECT in which we created, opened and locked target table without any special protection in a separate step and not with the rest of tables used by this statement. This underminded deadlock-avoidance approach used in server and created window for races. It also excluded target table from prelocking causing problems with trigger execution. The patch solves these problems by implementing new approach to handling of CREATE TABLE ... SELECT for base tables. We try to open and lock table to be created at the same time as the rest of tables used by this statement. If such table does not exist at this moment we create and place in the table cache special placeholder for it which prevents its creation or any other usage by other threads. We still use old approach for creation of temporary tables. Note that we have separate fix for 5.0 since there we use slightly different less intrusive approach.
2007-05-11 19:51:03 +02:00
DBUG_EXECUTE_IF("sleep_create_select_before_open", my_sleep(6000000););
2006-05-18 13:35:15 +02:00
Fix for: Bug #20662 "Infinite loop in CREATE TABLE IF NOT EXISTS ... SELECT with locked tables" Bug #20903 "Crash when using CREATE TABLE .. SELECT and triggers" Bug #24738 "CREATE TABLE ... SELECT is not isolated properly" Bug #24508 "Inconsistent results of CREATE TABLE ... SELECT when temporary table exists" Deadlock occured when one tried to execute CREATE TABLE IF NOT EXISTS ... SELECT statement under LOCK TABLES which held read lock on target table. Attempt to execute the same statement for already existing target table with triggers caused server crashes. Also concurrent execution of CREATE TABLE ... SELECT statement and other statements involving target table suffered from various races (some of which might've led to deadlocks). Finally, attempt to execute CREATE TABLE ... SELECT in case when a temporary table with same name was already present led to the insertion of data into this temporary table and creation of empty non-temporary table. All above problems stemmed from the old implementation of CREATE TABLE ... SELECT in which we created, opened and locked target table without any special protection in a separate step and not with the rest of tables used by this statement. This underminded deadlock-avoidance approach used in server and created window for races. It also excluded target table from prelocking causing problems with trigger execution. The patch solves these problems by implementing new approach to handling of CREATE TABLE ... SELECT for base tables. We try to open and lock table to be created at the same time as the rest of tables used by this statement. If such table does not exist at this moment we create and place in the table cache special placeholder for it which prevents its creation or any other usage by other threads. We still use old approach for creation of temporary tables. Note that we have separate fix for 5.0 since there we use slightly different less intrusive approach.
2007-05-11 19:51:03 +02:00
if (!(create_info->options & HA_LEX_CREATE_TMP_TABLE))
{
Open_table_context ot_ctx(thd, MYSQL_OPEN_REOPEN);
/*
Here we open the destination table, on which we already have
an exclusive metadata lock.
*/
if (open_table(thd, create_table, thd->mem_root, &ot_ctx))
Fix for: Bug #20662 "Infinite loop in CREATE TABLE IF NOT EXISTS ... SELECT with locked tables" Bug #20903 "Crash when using CREATE TABLE .. SELECT and triggers" Bug #24738 "CREATE TABLE ... SELECT is not isolated properly" Bug #24508 "Inconsistent results of CREATE TABLE ... SELECT when temporary table exists" Deadlock occured when one tried to execute CREATE TABLE IF NOT EXISTS ... SELECT statement under LOCK TABLES which held read lock on target table. Attempt to execute the same statement for already existing target table with triggers caused server crashes. Also concurrent execution of CREATE TABLE ... SELECT statement and other statements involving target table suffered from various races (some of which might've led to deadlocks). Finally, attempt to execute CREATE TABLE ... SELECT in case when a temporary table with same name was already present led to the insertion of data into this temporary table and creation of empty non-temporary table. All above problems stemmed from the old implementation of CREATE TABLE ... SELECT in which we created, opened and locked target table without any special protection in a separate step and not with the rest of tables used by this statement. This underminded deadlock-avoidance approach used in server and created window for races. It also excluded target table from prelocking causing problems with trigger execution. The patch solves these problems by implementing new approach to handling of CREATE TABLE ... SELECT for base tables. We try to open and lock table to be created at the same time as the rest of tables used by this statement. If such table does not exist at this moment we create and place in the table cache special placeholder for it which prevents its creation or any other usage by other threads. We still use old approach for creation of temporary tables. Note that we have separate fix for 5.0 since there we use slightly different less intrusive approach.
2007-05-11 19:51:03 +02:00
{
quick_rm_table(create_info->db_type, create_table->db,
table_case_name(create_info, create_table->table_name),
0);
}
else
table= create_table->table;
}
else
{
Open_table_context ot_ctx(thd, MYSQL_OPEN_TEMPORARY_ONLY);
if (open_table(thd, create_table, thd->mem_root, &ot_ctx))
Fix for: Bug #20662 "Infinite loop in CREATE TABLE IF NOT EXISTS ... SELECT with locked tables" Bug #20903 "Crash when using CREATE TABLE .. SELECT and triggers" Bug #24738 "CREATE TABLE ... SELECT is not isolated properly" Bug #24508 "Inconsistent results of CREATE TABLE ... SELECT when temporary table exists" Deadlock occured when one tried to execute CREATE TABLE IF NOT EXISTS ... SELECT statement under LOCK TABLES which held read lock on target table. Attempt to execute the same statement for already existing target table with triggers caused server crashes. Also concurrent execution of CREATE TABLE ... SELECT statement and other statements involving target table suffered from various races (some of which might've led to deadlocks). Finally, attempt to execute CREATE TABLE ... SELECT in case when a temporary table with same name was already present led to the insertion of data into this temporary table and creation of empty non-temporary table. All above problems stemmed from the old implementation of CREATE TABLE ... SELECT in which we created, opened and locked target table without any special protection in a separate step and not with the rest of tables used by this statement. This underminded deadlock-avoidance approach used in server and created window for races. It also excluded target table from prelocking causing problems with trigger execution. The patch solves these problems by implementing new approach to handling of CREATE TABLE ... SELECT for base tables. We try to open and lock table to be created at the same time as the rest of tables used by this statement. If such table does not exist at this moment we create and place in the table cache special placeholder for it which prevents its creation or any other usage by other threads. We still use old approach for creation of temporary tables. Note that we have separate fix for 5.0 since there we use slightly different less intrusive approach.
2007-05-11 19:51:03 +02:00
{
/*
This shouldn't happen as creation of temporary table should make
it preparable for open. But let us do close_temporary_table() here
just in case.
*/
drop_temporary_table(thd, create_table, NULL);
Fix for: Bug #20662 "Infinite loop in CREATE TABLE IF NOT EXISTS ... SELECT with locked tables" Bug #20903 "Crash when using CREATE TABLE .. SELECT and triggers" Bug #24738 "CREATE TABLE ... SELECT is not isolated properly" Bug #24508 "Inconsistent results of CREATE TABLE ... SELECT when temporary table exists" Deadlock occured when one tried to execute CREATE TABLE IF NOT EXISTS ... SELECT statement under LOCK TABLES which held read lock on target table. Attempt to execute the same statement for already existing target table with triggers caused server crashes. Also concurrent execution of CREATE TABLE ... SELECT statement and other statements involving target table suffered from various races (some of which might've led to deadlocks). Finally, attempt to execute CREATE TABLE ... SELECT in case when a temporary table with same name was already present led to the insertion of data into this temporary table and creation of empty non-temporary table. All above problems stemmed from the old implementation of CREATE TABLE ... SELECT in which we created, opened and locked target table without any special protection in a separate step and not with the rest of tables used by this statement. This underminded deadlock-avoidance approach used in server and created window for races. It also excluded target table from prelocking causing problems with trigger execution. The patch solves these problems by implementing new approach to handling of CREATE TABLE ... SELECT for base tables. We try to open and lock table to be created at the same time as the rest of tables used by this statement. If such table does not exist at this moment we create and place in the table cache special placeholder for it which prevents its creation or any other usage by other threads. We still use old approach for creation of temporary tables. Note that we have separate fix for 5.0 since there we use slightly different less intrusive approach.
2007-05-11 19:51:03 +02:00
}
else
table= create_table->table;
Fix for: Bug #20662 "Infinite loop in CREATE TABLE IF NOT EXISTS ... SELECT with locked tables" Bug #20903 "Crash when using CREATE TABLE .. SELECT and triggers" Bug #24738 "CREATE TABLE ... SELECT is not isolated properly" Bug #24508 "Inconsistent results of CREATE TABLE ... SELECT when temporary table exists" Deadlock occured when one tried to execute CREATE TABLE IF NOT EXISTS ... SELECT statement under LOCK TABLES which held read lock on target table. Attempt to execute the same statement for already existing target table with triggers caused server crashes. Also concurrent execution of CREATE TABLE ... SELECT statement and other statements involving target table suffered from various races (some of which might've led to deadlocks). Finally, attempt to execute CREATE TABLE ... SELECT in case when a temporary table with same name was already present led to the insertion of data into this temporary table and creation of empty non-temporary table. All above problems stemmed from the old implementation of CREATE TABLE ... SELECT in which we created, opened and locked target table without any special protection in a separate step and not with the rest of tables used by this statement. This underminded deadlock-avoidance approach used in server and created window for races. It also excluded target table from prelocking causing problems with trigger execution. The patch solves these problems by implementing new approach to handling of CREATE TABLE ... SELECT for base tables. We try to open and lock table to be created at the same time as the rest of tables used by this statement. If such table does not exist at this moment we create and place in the table cache special placeholder for it which prevents its creation or any other usage by other threads. We still use old approach for creation of temporary tables. Note that we have separate fix for 5.0 since there we use slightly different less intrusive approach.
2007-05-11 19:51:03 +02:00
}
}
if (!table) // open failed
DBUG_RETURN(0);
}
Fix for: Bug #20662 "Infinite loop in CREATE TABLE IF NOT EXISTS ... SELECT with locked tables" Bug #20903 "Crash when using CREATE TABLE .. SELECT and triggers" Bug #24738 "CREATE TABLE ... SELECT is not isolated properly" Bug #24508 "Inconsistent results of CREATE TABLE ... SELECT when temporary table exists" Deadlock occured when one tried to execute CREATE TABLE IF NOT EXISTS ... SELECT statement under LOCK TABLES which held read lock on target table. Attempt to execute the same statement for already existing target table with triggers caused server crashes. Also concurrent execution of CREATE TABLE ... SELECT statement and other statements involving target table suffered from various races (some of which might've led to deadlocks). Finally, attempt to execute CREATE TABLE ... SELECT in case when a temporary table with same name was already present led to the insertion of data into this temporary table and creation of empty non-temporary table. All above problems stemmed from the old implementation of CREATE TABLE ... SELECT in which we created, opened and locked target table without any special protection in a separate step and not with the rest of tables used by this statement. This underminded deadlock-avoidance approach used in server and created window for races. It also excluded target table from prelocking causing problems with trigger execution. The patch solves these problems by implementing new approach to handling of CREATE TABLE ... SELECT for base tables. We try to open and lock table to be created at the same time as the rest of tables used by this statement. If such table does not exist at this moment we create and place in the table cache special placeholder for it which prevents its creation or any other usage by other threads. We still use old approach for creation of temporary tables. Note that we have separate fix for 5.0 since there we use slightly different less intrusive approach.
2007-05-11 19:51:03 +02:00
DBUG_EXECUTE_IF("sleep_create_select_before_lock", my_sleep(6000000););
table->reginfo.lock_type=TL_WRITE;
2006-05-18 13:35:15 +02:00
hooks->prelock(&table, 1); // Call prelock hooks
/*
mysql_lock_tables() below should never fail with request to reopen table
since it won't wait for the table lock (we have exclusive metadata lock on
A prerequisite patch for the fix for Bug#46224 "HANDLER statements within a transaction might lead to deadlocks". Introduce a notion of a sentinel to MDL_context. A sentinel is a ticket that separates all tickets in the context into two groups: before and after it. Currently we can have (and need) only one designated sentinel -- it separates all locks taken by LOCK TABLE or HANDLER statement, which must survive COMMIT and ROLLBACK and all other locks, which must be released at COMMIT or ROLLBACK. The tricky part is maintaining the sentinel up to date when someone release its corresponding ticket. This can happen, e.g. if someone issues DROP TABLE under LOCK TABLES (generally, see all calls to release_all_locks_for_name()). MDL_context::release_ticket() is modified to take care of it. ****** A fix and a test case for Bug#46224 "HANDLER statements within a transaction might lead to deadlocks". An attempt to mix HANDLER SQL statements, which are transaction- agnostic, an open multi-statement transaction, and DDL against the involved tables (in a concurrent connection) could lead to a deadlock. The deadlock would occur when HANDLER OPEN or HANDLER READ would have to wait on a conflicting metadata lock. If the connection that issued HANDLER statement also had other metadata locks (say, acquired in scope of a transaction), a classical deadlock situation of mutual wait could occur. Incompatible change: entering LOCK TABLES mode automatically closes all open HANDLERs in the current connection. Incompatible change: previously an attempt to wait on a lock in a connection that has an open HANDLER statement could wait indefinitely/deadlock. After this patch, an error ER_LOCK_DEADLOCK is produced. The idea of the fix is to merge thd->handler_mdl_context with the main mdl_context of the connection, used for transactional locks. This makes deadlock detection possible, since all waits with locks are "visible" and available to analysis in a single MDL context of the connection. Since HANDLER locks and transactional locks have a different life cycle -- HANDLERs are explicitly open and closed, and so are HANDLER locks, explicitly acquired and released, whereas transactional locks "accumulate" till the end of a transaction and are released only with COMMIT, ROLLBACK and ROLLBACK TO SAVEPOINT, a concept of "sentinel" was introduced to MDL_context. All locks, HANDLER and others, reside in the same linked list. However, a selected element of the list separates locks with different life cycle. HANDLER locks always reside at the end of the list, after the sentinel. Transactional locks are prepended to the beginning of the list, before the sentinel. Thus, ROLLBACK, COMMIT or ROLLBACK TO SAVEPOINT, only release those locks that reside before the sentinel. HANDLER locks must be released explicitly as part of HANDLER CLOSE statement, or an implicit close. The same approach with sentinel is also employed for LOCK TABLES locks. Since HANDLER and LOCK TABLES statement has never worked together, the implementation is made simple and only maintains one sentinel, which is used either for HANDLER locks, or for LOCK TABLES locks.
2009-12-22 17:09:15 +01:00
the table) and thus can't get aborted.
*/
if (! ((*lock)= mysql_lock_tables(thd, &table, 1, 0)) ||
hooks->postlock(&table, 1))
{
if (*lock)
{
mysql_unlock_tables(thd, *lock);
*lock= 0;
}
drop_open_table(thd, table, create_table->db, create_table->table_name);
DBUG_RETURN(0);
}
DBUG_RETURN(table);
}
2000-07-31 21:29:14 +02:00
int
select_create::prepare(List<Item> &values, SELECT_LEX_UNIT *u)
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
{
MYSQL_LOCK *extra_lock= NULL;
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
DBUG_ENTER("select_create::prepare");
TABLEOP_HOOKS *hook_ptr= NULL;
BUG#22864 (Rollback following CREATE... SELECT discards 'CREATE TABLE' from log): When row-based logging is used, the CREATE-SELECT is written as two parts: as a CREATE TABLE statement and as the rows for the table. For both transactional and non-transactional tables, the CREATE TABLE statement was written to the transaction cache, as were the rows, and on statement end, the entire transaction cache was written to the binary log if the table was non-transactional. For transactional tables, the events were kept in the transaction cache until end of transaction (or statement that were not part of a transaction). For the case when AUTOCOMMIT=0 and we are creating a transactional table using a create select, we would then keep the CREATE TABLE statement and the rows for the CREATE-SELECT, while executing the following statements. On a rollback, the transaction cache would then be cleared, which would also remove the CREATE TABLE statement. Hence no table would be created on the slave, while there is an empty table on the master. This relates to BUG#22865 where the table being created exists on the master, but not on the slave during insertion of rows into the newly created table. This occurs since the CREATE TABLE statement were still in the transaction cache until the statement finished executing, and possibly longer if the table was transactional. This patch changes the behaviour of the CREATE-SELECT statement by adding an implicit commit at the end of the statement when creating non-temporary tables. Hence, non-temporary tables will be written to the binary log on completion, and in the even of AUTOCOMMIT=0, a new transaction will be started. Temporary tables do not commit an ongoing transaction: neither as a pre- not a post-commit. The events for both transactional and non-transactional tables are saved in the transaction cache, and written to the binary log at end of the statement.
2006-12-21 09:29:02 +01:00
/*
For row-based replication, the CREATE-SELECT statement is written
in two pieces: the first one contain the CREATE TABLE statement
necessary to create the table and the second part contain the rows
that should go into the table.
For non-temporary tables, the start of the CREATE-SELECT
implicitly commits the previous transaction, and all events
forming the statement will be stored the transaction cache. At end
of the statement, the entire statement is committed as a
transaction, and all events are written to the binary log.
On the master, the table is locked for the duration of the
statement, but since the CREATE part is replicated as a simple
statement, there is no way to lock the table for accesses on the
slave. Hence, we have to hold on to the CREATE part of the
statement until the statement has finished.
*/
2006-03-24 12:24:31 +01:00
class MY_HOOKS : public TABLEOP_HOOKS {
public:
MY_HOOKS(select_create *x, TABLE_LIST *create_table_arg,
TABLE_LIST *select_tables_arg)
: ptr(x),
create_table(create_table_arg),
select_tables(select_tables_arg)
{
}
private:
virtual int do_postlock(TABLE **tables, uint count)
2006-03-24 12:24:31 +01:00
{
int error;
THD *thd= const_cast<THD*>(ptr->get_thd());
TABLE_LIST *save_next_global= create_table->next_global;
create_table->next_global= select_tables;
2010-02-04 23:08:08 +01:00
error= thd->decide_logging_format(create_table);
create_table->next_global= save_next_global;
if (error)
return error;
TABLE const *const table = *tables;
BUG#39934: Slave stops for engine that only support row-based logging General overview: The logic for switching to row format when binlog_format=MIXED had numerous flaws. The underlying problem was the lack of a consistent architecture. General purpose of this changeset: This changeset introduces an architecture for switching to row format when binlog_format=MIXED. It enforces the architecture where it has to. It leaves some bugs to be fixed later. It adds extensive tests to verify that unsafe statements work as expected and that appropriate errors are produced by problems with the selection of binlog format. It was not practical to split this into smaller pieces of work. Problem 1: To determine the logging mode, the code has to take several parameters into account (namely: (1) the value of binlog_format; (2) the capabilities of the engines; (3) the type of the current statement: normal, unsafe, or row injection). These parameters may conflict in several ways, namely: - binlog_format=STATEMENT for a row injection - binlog_format=STATEMENT for an unsafe statement - binlog_format=STATEMENT for an engine only supporting row logging - binlog_format=ROW for an engine only supporting statement logging - statement is unsafe and engine does not support row logging - row injection in a table that does not support statement logging - statement modifies one table that does not support row logging and one that does not support statement logging Several of these conflicts were not detected, or were detected with an inappropriate error message. The problem of BUG#39934 was that no appropriate error message was written for the case when an engine only supporting row logging executed a row injection with binlog_format=ROW. However, all above cases must be handled. Fix 1: Introduce new error codes (sql/share/errmsg.txt). Ensure that all conditions are detected and handled in decide_logging_format() Problem 2: The binlog format shall be determined once per statement, in decide_logging_format(). It shall not be changed before or after that. Before decide_logging_format() is called, all information necessary to determine the logging format must be available. This principle ensures that all unsafe statements are handled in a consistent way. However, this principle is not followed: thd->set_current_stmt_binlog_row_based_if_mixed() is called in several places, including from code executing UPDATE..LIMIT, INSERT..SELECT..LIMIT, DELETE..LIMIT, INSERT DELAYED, and SET @@binlog_format. After Problem 1 was fixed, that caused inconsistencies where these unsafe statements would not print the appropriate warnings or errors for some of the conflicts. Fix 2: Remove calls to THD::set_current_stmt_binlog_row_based_if_mixed() from code executed after decide_logging_format(). Compensate by calling the set_current_stmt_unsafe() at parse time. This way, all unsafe statements are detected by decide_logging_format(). Problem 3: INSERT DELAYED is not unsafe: it is logged in statement format even if binlog_format=MIXED, and no warning is printed even if binlog_format=STATEMENT. This is BUG#45825. Fix 3: Made INSERT DELAYED set itself to unsafe at parse time. This allows decide_logging_format() to detect that a warning should be printed or the binlog_format changed. Problem 4: LIMIT clause were not marked as unsafe when executed inside stored functions/triggers/views/prepared statements. This is BUG#45785. Fix 4: Make statements containing the LIMIT clause marked as unsafe at parse time, instead of at execution time. This allows propagating unsafe-ness to the view.
2009-07-14 21:31:19 +02:00
if (thd->is_current_stmt_binlog_format_row() &&
!table->s->tmp_table)
{
if (int error= ptr->binlog_show_create_table(tables, count))
return error;
}
return 0;
2006-03-24 12:24:31 +01:00
}
select_create *ptr;
TABLE_LIST *create_table;
TABLE_LIST *select_tables;
2006-03-24 12:24:31 +01:00
};
MY_HOOKS hooks(this, create_table, select_tables);
hook_ptr= &hooks;
unit= u;
/*
BUG#22864 (Rollback following CREATE... SELECT discards 'CREATE TABLE' from log): When row-based logging is used, the CREATE-SELECT is written as two parts: as a CREATE TABLE statement and as the rows for the table. For both transactional and non-transactional tables, the CREATE TABLE statement was written to the transaction cache, as were the rows, and on statement end, the entire transaction cache was written to the binary log if the table was non-transactional. For transactional tables, the events were kept in the transaction cache until end of transaction (or statement that were not part of a transaction). For the case when AUTOCOMMIT=0 and we are creating a transactional table using a create select, we would then keep the CREATE TABLE statement and the rows for the CREATE-SELECT, while executing the following statements. On a rollback, the transaction cache would then be cleared, which would also remove the CREATE TABLE statement. Hence no table would be created on the slave, while there is an empty table on the master. This relates to BUG#22865 where the table being created exists on the master, but not on the slave during insertion of rows into the newly created table. This occurs since the CREATE TABLE statement were still in the transaction cache until the statement finished executing, and possibly longer if the table was transactional. This patch changes the behaviour of the CREATE-SELECT statement by adding an implicit commit at the end of the statement when creating non-temporary tables. Hence, non-temporary tables will be written to the binary log on completion, and in the even of AUTOCOMMIT=0, a new transaction will be started. Temporary tables do not commit an ongoing transaction: neither as a pre- not a post-commit. The events for both transactional and non-transactional tables are saved in the transaction cache, and written to the binary log at end of the statement.
2006-12-21 09:29:02 +01:00
Start a statement transaction before the create if we are using
row-based replication for the statement. If we are creating a
temporary table, we need to start a statement transaction.
*/
if ((thd->lex->create_info.options & HA_LEX_CREATE_TMP_TABLE) == 0 &&
BUG#39934: Slave stops for engine that only support row-based logging General overview: The logic for switching to row format when binlog_format=MIXED had numerous flaws. The underlying problem was the lack of a consistent architecture. General purpose of this changeset: This changeset introduces an architecture for switching to row format when binlog_format=MIXED. It enforces the architecture where it has to. It leaves some bugs to be fixed later. It adds extensive tests to verify that unsafe statements work as expected and that appropriate errors are produced by problems with the selection of binlog format. It was not practical to split this into smaller pieces of work. Problem 1: To determine the logging mode, the code has to take several parameters into account (namely: (1) the value of binlog_format; (2) the capabilities of the engines; (3) the type of the current statement: normal, unsafe, or row injection). These parameters may conflict in several ways, namely: - binlog_format=STATEMENT for a row injection - binlog_format=STATEMENT for an unsafe statement - binlog_format=STATEMENT for an engine only supporting row logging - binlog_format=ROW for an engine only supporting statement logging - statement is unsafe and engine does not support row logging - row injection in a table that does not support statement logging - statement modifies one table that does not support row logging and one that does not support statement logging Several of these conflicts were not detected, or were detected with an inappropriate error message. The problem of BUG#39934 was that no appropriate error message was written for the case when an engine only supporting row logging executed a row injection with binlog_format=ROW. However, all above cases must be handled. Fix 1: Introduce new error codes (sql/share/errmsg.txt). Ensure that all conditions are detected and handled in decide_logging_format() Problem 2: The binlog format shall be determined once per statement, in decide_logging_format(). It shall not be changed before or after that. Before decide_logging_format() is called, all information necessary to determine the logging format must be available. This principle ensures that all unsafe statements are handled in a consistent way. However, this principle is not followed: thd->set_current_stmt_binlog_row_based_if_mixed() is called in several places, including from code executing UPDATE..LIMIT, INSERT..SELECT..LIMIT, DELETE..LIMIT, INSERT DELAYED, and SET @@binlog_format. After Problem 1 was fixed, that caused inconsistencies where these unsafe statements would not print the appropriate warnings or errors for some of the conflicts. Fix 2: Remove calls to THD::set_current_stmt_binlog_row_based_if_mixed() from code executed after decide_logging_format(). Compensate by calling the set_current_stmt_unsafe() at parse time. This way, all unsafe statements are detected by decide_logging_format(). Problem 3: INSERT DELAYED is not unsafe: it is logged in statement format even if binlog_format=MIXED, and no warning is printed even if binlog_format=STATEMENT. This is BUG#45825. Fix 3: Made INSERT DELAYED set itself to unsafe at parse time. This allows decide_logging_format() to detect that a warning should be printed or the binlog_format changed. Problem 4: LIMIT clause were not marked as unsafe when executed inside stored functions/triggers/views/prepared statements. This is BUG#45785. Fix 4: Make statements containing the LIMIT clause marked as unsafe at parse time, instead of at execution time. This allows propagating unsafe-ness to the view.
2009-07-14 21:31:19 +02:00
thd->is_current_stmt_binlog_format_row() &&
mysql_bin_log.is_open())
{
thd->binlog_start_trans_and_stmt();
}
DBUG_ASSERT(create_table->table == NULL);
DBUG_EXECUTE_IF("sleep_create_select_before_check_if_exists", my_sleep(6000000););
if (!(table= create_table_from_items(thd, create_info, create_table,
alter_info, &values,
&extra_lock, hook_ptr)))
/* abort() deletes table */
DBUG_RETURN(-1);
2000-07-31 21:29:14 +02:00
if (extra_lock)
{
DBUG_ASSERT(m_plock == NULL);
if (create_info->options & HA_LEX_CREATE_TMP_TABLE)
m_plock= &m_lock;
else
m_plock= &thd->extra_lock;
*m_plock= extra_lock;
}
if (table->s->fields < values.elements)
2003-10-06 20:02:27 +02:00
{
Fix for BUG#11755168 '46895: test "outfile_loaddata" fails (reproducible)'. In sql_class.cc, 'row_count', of type 'ha_rows', was used as last argument for ER_TRUNCATED_WRONG_VALUE_FOR_FIELD which is "Incorrect %-.32s value: '%-.128s' for column '%.192s' at row %ld". So 'ha_rows' was used as 'long'. On SPARC32 Solaris builds, 'long' is 4 bytes and 'ha_rows' is 'longlong' i.e. 8 bytes. So the printf-like code was reading only the first 4 bytes. Because the CPU is big-endian, 1LL is 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x01 so the first four bytes yield 0. So the warning message had "row 0" instead of "row 1" in test outfile_loaddata.test: -Warning 1366 Incorrect string value: '\xE1\xE2\xF7' for column 'b' at row 1 +Warning 1366 Incorrect string value: '\xE1\xE2\xF7' for column 'b' at row 0 All error-messaging functions which internally invoke some printf-life function are potential candidate for such mistakes. One apparently easy way to catch such mistakes is to use ATTRIBUTE_FORMAT (from my_attribute.h). But this works only when call site has both: a) the format as a string literal b) the types of arguments. So: func(ER(ER_BLAH), 10); will silently not be checked, because ER(ER_BLAH) is not known at compile time (it is known at run-time, and depends on the chosen language). And func("%s", a va_list argument); has the same problem, as the *real* type of arguments is not known at this site at compile time (it's known in some caller). Moreover, func(ER(ER_BLAH)); though possibly correct (if ER(ER_BLAH) has no '%' markers), will not compile (gcc says "error: format not a string literal and no format arguments"). Consequences: 1) ATTRIBUTE_FORMAT is here added only to functions which in practice take "string literal" formats: "my_error_reporter" and "print_admin_msg". 2) it cannot be added to the other functions: my_error(), push_warning_printf(), Table_check_intact::report_error(), general_log_print(). To do a one-time check of functions listed in (2), the following "static code analysis" has been done: 1) replace my_error(ER_xxx, arguments for substitution in format) with the equivalent my_printf_error(ER_xxx,ER(ER_xxx), arguments for substitution in format), so that we have ER(ER_xxx) and the arguments *in the same call site* 2) add ATTRIBUTE_FORMAT to push_warning_printf(), Table_check_intact::report_error(), general_log_print() 3) replace ER(xxx) with the hard-coded English text found in errmsg.txt (like: ER(ER_UNKNOWN_ERROR) is replaced with "Unknown error"), so that a call site has the format as string literal 4) this way, ATTRIBUTE_FORMAT can effectively do its job 5) compile, fix errors detected by ATTRIBUTE_FORMAT 6) revert steps 1-2-3. The present patch has no compiler error when submitted again to the static code analysis above. It cannot catch all problems though: see Field::set_warning(), in which a call to push_warning_printf() has a variable error (thus, not replacable by a string literal); I checked set_warning() calls by hand though. See also WL 5883 for one proposal to avoid such bugs from appearing again in the future. The issues fixed in the patch are: a) mismatch in types (like 'int' passed to '%ld') b) more arguments passed than specified in the format. This patch resolves mismatches by changing the type/number of arguments, not by changing error messages of sql/share/errmsg.txt. The latter would be wrong, per the following old rule: errmsg.txt must be as stable as possible; no insertions or deletions of messages, no changes of type or number of printf-like format specifiers, are allowed, as long as the change impacts a message already released in a GA version. If this rule is not followed: - Connectors, which use error message numbers, will be confused (by insertions/deletions of messages) - using errmsg.sys of MySQL 5.1.n with mysqld of MySQL 5.1.(n+1) could produce wrong messages or crash; such usage can easily happen if installing 5.1.(n+1) while /etc/my.cnf still has --language=/path/to/5.1.n/xxx; or if copying mysqld from 5.1.(n+1) into a 5.1.n installation. When fixing b), I have verified that the superfluous arguments were not used in the format in the first 5.1 GA (5.1.30 'bteam@astra04-20081114162938-z8mctjp6st27uobm'). Had they been used, then passing them today, even if the message doesn't use them anymore, would have been necessary, as explained above.
2011-05-16 22:04:01 +02:00
my_error(ER_WRONG_VALUE_COUNT_ON_ROW, MYF(0), 1L);
2003-10-06 20:02:27 +02:00
DBUG_RETURN(-1);
}
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
/* First field to copy */
field= table->field+table->s->fields - values.elements;
/* Mark all fields that are given values */
for (Field **f= field ; *f ; f++)
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
bitmap_set_bit(table->write_set, (*f)->field_index);
2000-07-31 21:29:14 +02:00
/* Don't set timestamp if used */
table->timestamp_field_type= TIMESTAMP_NO_AUTO_SET;
2000-07-31 21:29:14 +02:00
table->next_number_field=table->found_next_number_field;
restore_record(table,s->default_values); // Get empty record
2000-07-31 21:29:14 +02:00
thd->cuted_fields=0;
2005-01-10 20:55:05 +01:00
if (info.ignore || info.handle_duplicates != DUP_ERROR)
table->file->extra(HA_EXTRA_IGNORE_DUP_KEY);
if (info.handle_duplicates == DUP_REPLACE &&
(!table->triggers || !table->triggers->has_delete_triggers()))
table->file->extra(HA_EXTRA_WRITE_CAN_REPLACE);
if (info.handle_duplicates == DUP_UPDATE)
table->file->extra(HA_EXTRA_INSERT_WITH_UPDATE);
if (thd->locked_tables_mode <= LTM_LOCK_TABLES)
table->file->ha_start_bulk_insert((ha_rows) 0);
thd->abort_on_warning= (!info.ignore &&
(thd->variables.sql_mode &
(MODE_STRICT_TRANS_TABLES |
MODE_STRICT_ALL_TABLES)));
if (check_that_all_fields_are_given_values(thd, table, table_list))
DBUG_RETURN(1);
table->mark_columns_needed_for_insert();
Fix for: Bug #20662 "Infinite loop in CREATE TABLE IF NOT EXISTS ... SELECT with locked tables" Bug #20903 "Crash when using CREATE TABLE .. SELECT and triggers" Bug #24738 "CREATE TABLE ... SELECT is not isolated properly" Bug #24508 "Inconsistent results of CREATE TABLE ... SELECT when temporary table exists" Deadlock occured when one tried to execute CREATE TABLE IF NOT EXISTS ... SELECT statement under LOCK TABLES which held read lock on target table. Attempt to execute the same statement for already existing target table with triggers caused server crashes. Also concurrent execution of CREATE TABLE ... SELECT statement and other statements involving target table suffered from various races (some of which might've led to deadlocks). Finally, attempt to execute CREATE TABLE ... SELECT in case when a temporary table with same name was already present led to the insertion of data into this temporary table and creation of empty non-temporary table. All above problems stemmed from the old implementation of CREATE TABLE ... SELECT in which we created, opened and locked target table without any special protection in a separate step and not with the rest of tables used by this statement. This underminded deadlock-avoidance approach used in server and created window for races. It also excluded target table from prelocking causing problems with trigger execution. The patch solves these problems by implementing new approach to handling of CREATE TABLE ... SELECT for base tables. We try to open and lock table to be created at the same time as the rest of tables used by this statement. If such table does not exist at this moment we create and place in the table cache special placeholder for it which prevents its creation or any other usage by other threads. We still use old approach for creation of temporary tables. Note that we have separate fix for 5.0 since there we use slightly different less intrusive approach.
2007-05-11 19:51:03 +02:00
table->file->extra(HA_EXTRA_WRITE_CACHE);
DBUG_RETURN(0);
2000-07-31 21:29:14 +02:00
}
int
select_create::binlog_show_create_table(TABLE **tables, uint count)
{
/*
Note 1: In RBR mode, we generate a CREATE TABLE statement for the
created table by calling store_create_info() (behaves as SHOW
CREATE TABLE). In the event of an error, nothing should be
written to the binary log, even if the table is non-transactional;
therefore we pretend that the generated CREATE TABLE statement is
for a transactional table. The event will then be put in the
transaction cache, and any subsequent events (e.g., table-map
events and binrow events) will also be put there. We can then use
ha_autocommit_or_rollback() to either throw away the entire
kaboodle of events, or write them to the binary log.
We write the CREATE TABLE statement here and not in prepare()
since there potentially are sub-selects or accesses to information
schema that will do a close_thread_tables(), destroying the
statement transaction cache.
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
*/
BUG#39934: Slave stops for engine that only support row-based logging General overview: The logic for switching to row format when binlog_format=MIXED had numerous flaws. The underlying problem was the lack of a consistent architecture. General purpose of this changeset: This changeset introduces an architecture for switching to row format when binlog_format=MIXED. It enforces the architecture where it has to. It leaves some bugs to be fixed later. It adds extensive tests to verify that unsafe statements work as expected and that appropriate errors are produced by problems with the selection of binlog format. It was not practical to split this into smaller pieces of work. Problem 1: To determine the logging mode, the code has to take several parameters into account (namely: (1) the value of binlog_format; (2) the capabilities of the engines; (3) the type of the current statement: normal, unsafe, or row injection). These parameters may conflict in several ways, namely: - binlog_format=STATEMENT for a row injection - binlog_format=STATEMENT for an unsafe statement - binlog_format=STATEMENT for an engine only supporting row logging - binlog_format=ROW for an engine only supporting statement logging - statement is unsafe and engine does not support row logging - row injection in a table that does not support statement logging - statement modifies one table that does not support row logging and one that does not support statement logging Several of these conflicts were not detected, or were detected with an inappropriate error message. The problem of BUG#39934 was that no appropriate error message was written for the case when an engine only supporting row logging executed a row injection with binlog_format=ROW. However, all above cases must be handled. Fix 1: Introduce new error codes (sql/share/errmsg.txt). Ensure that all conditions are detected and handled in decide_logging_format() Problem 2: The binlog format shall be determined once per statement, in decide_logging_format(). It shall not be changed before or after that. Before decide_logging_format() is called, all information necessary to determine the logging format must be available. This principle ensures that all unsafe statements are handled in a consistent way. However, this principle is not followed: thd->set_current_stmt_binlog_row_based_if_mixed() is called in several places, including from code executing UPDATE..LIMIT, INSERT..SELECT..LIMIT, DELETE..LIMIT, INSERT DELAYED, and SET @@binlog_format. After Problem 1 was fixed, that caused inconsistencies where these unsafe statements would not print the appropriate warnings or errors for some of the conflicts. Fix 2: Remove calls to THD::set_current_stmt_binlog_row_based_if_mixed() from code executed after decide_logging_format(). Compensate by calling the set_current_stmt_unsafe() at parse time. This way, all unsafe statements are detected by decide_logging_format(). Problem 3: INSERT DELAYED is not unsafe: it is logged in statement format even if binlog_format=MIXED, and no warning is printed even if binlog_format=STATEMENT. This is BUG#45825. Fix 3: Made INSERT DELAYED set itself to unsafe at parse time. This allows decide_logging_format() to detect that a warning should be printed or the binlog_format changed. Problem 4: LIMIT clause were not marked as unsafe when executed inside stored functions/triggers/views/prepared statements. This is BUG#45785. Fix 4: Make statements containing the LIMIT clause marked as unsafe at parse time, instead of at execution time. This allows propagating unsafe-ness to the view.
2009-07-14 21:31:19 +02:00
DBUG_ASSERT(thd->is_current_stmt_binlog_format_row());
DBUG_ASSERT(tables && *tables && count > 0);
char buf[2048];
String query(buf, sizeof(buf), system_charset_info);
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
int result;
TABLE_LIST tmp_table_list;
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
memset(&tmp_table_list, 0, sizeof(tmp_table_list));
tmp_table_list.table = *tables;
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
query.length(0); // Have to zero it since constructor doesn't
result= store_create_info(thd, &tmp_table_list, &query, create_info,
/* show_database */ TRUE);
DBUG_ASSERT(result == 0); /* store_create_info() always return 0 */
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
2006-06-04 17:52:22 +02:00
if (mysql_bin_log.is_open())
{
int errcode= query_error_code(thd, thd->killed == THD::NOT_KILLED);
result= thd->binlog_query(THD::STMT_QUERY_TYPE,
query.ptr(), query.length(),
/* is_trans */ TRUE,
merge mysql-5.1-rep+2-delivery1 --> mysql-5.1-rpl-merge Conflicts: Text conflict in .bzr-mysql/default.conf Text conflict in mysql-test/extra/rpl_tests/rpl_loaddata.test Text conflict in mysql-test/r/mysqlbinlog2.result Text conflict in mysql-test/suite/binlog/r/binlog_stm_mix_innodb_myisam.result Text conflict in mysql-test/suite/binlog/r/binlog_unsafe.result Text conflict in mysql-test/suite/rpl/r/rpl_insert_id.result Text conflict in mysql-test/suite/rpl/r/rpl_loaddata.result Text conflict in mysql-test/suite/rpl/r/rpl_stm_auto_increment_bug33029.result Text conflict in mysql-test/suite/rpl/r/rpl_udf.result Text conflict in mysql-test/suite/rpl/t/rpl_slow_query_log.test Text conflict in sql/field.h Text conflict in sql/log.cc Text conflict in sql/log_event.cc Text conflict in sql/log_event_old.cc Text conflict in sql/mysql_priv.h Text conflict in sql/share/errmsg.txt Text conflict in sql/sp.cc Text conflict in sql/sql_acl.cc Text conflict in sql/sql_base.cc Text conflict in sql/sql_class.h Text conflict in sql/sql_db.cc Text conflict in sql/sql_delete.cc Text conflict in sql/sql_insert.cc Text conflict in sql/sql_lex.cc Text conflict in sql/sql_lex.h Text conflict in sql/sql_load.cc Text conflict in sql/sql_table.cc Text conflict in sql/sql_update.cc Text conflict in sql/sql_view.cc Conflict adding files to storage/innobase. Created directory. Conflict because storage/innobase is not versioned, but has versioned children. Versioned directory. Conflict adding file storage/innobase. Moved existing file to storage/innobase.moved. Conflict adding files to storage/innobase/handler. Created directory. Conflict because storage/innobase/handler is not versioned, but has versioned children. Versioned directory. Contents conflict in storage/innobase/handler/ha_innodb.cc
2010-01-07 16:39:11 +01:00
/* direct */ FALSE,
/* suppress_use */ FALSE,
errcode);
}
return result;
}
void select_create::store_values(List<Item> &values)
2000-07-31 21:29:14 +02:00
{
fill_record_n_invoke_before_triggers(thd, field, values, 1,
table->triggers, TRG_EVENT_INSERT);
2000-07-31 21:29:14 +02:00
}
void select_create::send_error(uint errcode,const char *err)
{
BUG#22864 (Rollback following CREATE... SELECT discards 'CREATE TABLE' from log): When row-based logging is used, the CREATE-SELECT is written as two parts: as a CREATE TABLE statement and as the rows for the table. For both transactional and non-transactional tables, the CREATE TABLE statement was written to the transaction cache, as were the rows, and on statement end, the entire transaction cache was written to the binary log if the table was non-transactional. For transactional tables, the events were kept in the transaction cache until end of transaction (or statement that were not part of a transaction). For the case when AUTOCOMMIT=0 and we are creating a transactional table using a create select, we would then keep the CREATE TABLE statement and the rows for the CREATE-SELECT, while executing the following statements. On a rollback, the transaction cache would then be cleared, which would also remove the CREATE TABLE statement. Hence no table would be created on the slave, while there is an empty table on the master. This relates to BUG#22865 where the table being created exists on the master, but not on the slave during insertion of rows into the newly created table. This occurs since the CREATE TABLE statement were still in the transaction cache until the statement finished executing, and possibly longer if the table was transactional. This patch changes the behaviour of the CREATE-SELECT statement by adding an implicit commit at the end of the statement when creating non-temporary tables. Hence, non-temporary tables will be written to the binary log on completion, and in the even of AUTOCOMMIT=0, a new transaction will be started. Temporary tables do not commit an ongoing transaction: neither as a pre- not a post-commit. The events for both transactional and non-transactional tables are saved in the transaction cache, and written to the binary log at end of the statement.
2006-12-21 09:29:02 +01:00
DBUG_ENTER("select_create::send_error");
DBUG_PRINT("info",
("Current statement %s row-based",
BUG#39934: Slave stops for engine that only support row-based logging General overview: The logic for switching to row format when binlog_format=MIXED had numerous flaws. The underlying problem was the lack of a consistent architecture. General purpose of this changeset: This changeset introduces an architecture for switching to row format when binlog_format=MIXED. It enforces the architecture where it has to. It leaves some bugs to be fixed later. It adds extensive tests to verify that unsafe statements work as expected and that appropriate errors are produced by problems with the selection of binlog format. It was not practical to split this into smaller pieces of work. Problem 1: To determine the logging mode, the code has to take several parameters into account (namely: (1) the value of binlog_format; (2) the capabilities of the engines; (3) the type of the current statement: normal, unsafe, or row injection). These parameters may conflict in several ways, namely: - binlog_format=STATEMENT for a row injection - binlog_format=STATEMENT for an unsafe statement - binlog_format=STATEMENT for an engine only supporting row logging - binlog_format=ROW for an engine only supporting statement logging - statement is unsafe and engine does not support row logging - row injection in a table that does not support statement logging - statement modifies one table that does not support row logging and one that does not support statement logging Several of these conflicts were not detected, or were detected with an inappropriate error message. The problem of BUG#39934 was that no appropriate error message was written for the case when an engine only supporting row logging executed a row injection with binlog_format=ROW. However, all above cases must be handled. Fix 1: Introduce new error codes (sql/share/errmsg.txt). Ensure that all conditions are detected and handled in decide_logging_format() Problem 2: The binlog format shall be determined once per statement, in decide_logging_format(). It shall not be changed before or after that. Before decide_logging_format() is called, all information necessary to determine the logging format must be available. This principle ensures that all unsafe statements are handled in a consistent way. However, this principle is not followed: thd->set_current_stmt_binlog_row_based_if_mixed() is called in several places, including from code executing UPDATE..LIMIT, INSERT..SELECT..LIMIT, DELETE..LIMIT, INSERT DELAYED, and SET @@binlog_format. After Problem 1 was fixed, that caused inconsistencies where these unsafe statements would not print the appropriate warnings or errors for some of the conflicts. Fix 2: Remove calls to THD::set_current_stmt_binlog_row_based_if_mixed() from code executed after decide_logging_format(). Compensate by calling the set_current_stmt_unsafe() at parse time. This way, all unsafe statements are detected by decide_logging_format(). Problem 3: INSERT DELAYED is not unsafe: it is logged in statement format even if binlog_format=MIXED, and no warning is printed even if binlog_format=STATEMENT. This is BUG#45825. Fix 3: Made INSERT DELAYED set itself to unsafe at parse time. This allows decide_logging_format() to detect that a warning should be printed or the binlog_format changed. Problem 4: LIMIT clause were not marked as unsafe when executed inside stored functions/triggers/views/prepared statements. This is BUG#45785. Fix 4: Make statements containing the LIMIT clause marked as unsafe at parse time, instead of at execution time. This allows propagating unsafe-ness to the view.
2009-07-14 21:31:19 +02:00
thd->is_current_stmt_binlog_format_row() ? "is" : "is NOT"));
BUG#22864 (Rollback following CREATE... SELECT discards 'CREATE TABLE' from log): When row-based logging is used, the CREATE-SELECT is written as two parts: as a CREATE TABLE statement and as the rows for the table. For both transactional and non-transactional tables, the CREATE TABLE statement was written to the transaction cache, as were the rows, and on statement end, the entire transaction cache was written to the binary log if the table was non-transactional. For transactional tables, the events were kept in the transaction cache until end of transaction (or statement that were not part of a transaction). For the case when AUTOCOMMIT=0 and we are creating a transactional table using a create select, we would then keep the CREATE TABLE statement and the rows for the CREATE-SELECT, while executing the following statements. On a rollback, the transaction cache would then be cleared, which would also remove the CREATE TABLE statement. Hence no table would be created on the slave, while there is an empty table on the master. This relates to BUG#22865 where the table being created exists on the master, but not on the slave during insertion of rows into the newly created table. This occurs since the CREATE TABLE statement were still in the transaction cache until the statement finished executing, and possibly longer if the table was transactional. This patch changes the behaviour of the CREATE-SELECT statement by adding an implicit commit at the end of the statement when creating non-temporary tables. Hence, non-temporary tables will be written to the binary log on completion, and in the even of AUTOCOMMIT=0, a new transaction will be started. Temporary tables do not commit an ongoing transaction: neither as a pre- not a post-commit. The events for both transactional and non-transactional tables are saved in the transaction cache, and written to the binary log at end of the statement.
2006-12-21 09:29:02 +01:00
DBUG_PRINT("info",
("Current table (at 0x%lu) %s a temporary (or non-existant) table",
(ulong) table,
BUG#22864 (Rollback following CREATE... SELECT discards 'CREATE TABLE' from log): When row-based logging is used, the CREATE-SELECT is written as two parts: as a CREATE TABLE statement and as the rows for the table. For both transactional and non-transactional tables, the CREATE TABLE statement was written to the transaction cache, as were the rows, and on statement end, the entire transaction cache was written to the binary log if the table was non-transactional. For transactional tables, the events were kept in the transaction cache until end of transaction (or statement that were not part of a transaction). For the case when AUTOCOMMIT=0 and we are creating a transactional table using a create select, we would then keep the CREATE TABLE statement and the rows for the CREATE-SELECT, while executing the following statements. On a rollback, the transaction cache would then be cleared, which would also remove the CREATE TABLE statement. Hence no table would be created on the slave, while there is an empty table on the master. This relates to BUG#22865 where the table being created exists on the master, but not on the slave during insertion of rows into the newly created table. This occurs since the CREATE TABLE statement were still in the transaction cache until the statement finished executing, and possibly longer if the table was transactional. This patch changes the behaviour of the CREATE-SELECT statement by adding an implicit commit at the end of the statement when creating non-temporary tables. Hence, non-temporary tables will be written to the binary log on completion, and in the even of AUTOCOMMIT=0, a new transaction will be started. Temporary tables do not commit an ongoing transaction: neither as a pre- not a post-commit. The events for both transactional and non-transactional tables are saved in the transaction cache, and written to the binary log at end of the statement.
2006-12-21 09:29:02 +01:00
table && !table->s->tmp_table ? "is NOT" : "is"));
/*
BUG#22864 (Rollback following CREATE... SELECT discards 'CREATE TABLE' from log): When row-based logging is used, the CREATE-SELECT is written as two parts: as a CREATE TABLE statement and as the rows for the table. For both transactional and non-transactional tables, the CREATE TABLE statement was written to the transaction cache, as were the rows, and on statement end, the entire transaction cache was written to the binary log if the table was non-transactional. For transactional tables, the events were kept in the transaction cache until end of transaction (or statement that were not part of a transaction). For the case when AUTOCOMMIT=0 and we are creating a transactional table using a create select, we would then keep the CREATE TABLE statement and the rows for the CREATE-SELECT, while executing the following statements. On a rollback, the transaction cache would then be cleared, which would also remove the CREATE TABLE statement. Hence no table would be created on the slave, while there is an empty table on the master. This relates to BUG#22865 where the table being created exists on the master, but not on the slave during insertion of rows into the newly created table. This occurs since the CREATE TABLE statement were still in the transaction cache until the statement finished executing, and possibly longer if the table was transactional. This patch changes the behaviour of the CREATE-SELECT statement by adding an implicit commit at the end of the statement when creating non-temporary tables. Hence, non-temporary tables will be written to the binary log on completion, and in the even of AUTOCOMMIT=0, a new transaction will be started. Temporary tables do not commit an ongoing transaction: neither as a pre- not a post-commit. The events for both transactional and non-transactional tables are saved in the transaction cache, and written to the binary log at end of the statement.
2006-12-21 09:29:02 +01:00
This will execute any rollbacks that are necessary before writing
the transcation cache.
We disable the binary log since nothing should be written to the
binary log. This disabling is important, since we potentially do
a "roll back" of non-transactional tables by removing the table,
and the actual rollback might generate events that should not be
written to the binary log.
*/
tmp_disable_binlog(thd);
select_insert::send_error(errcode, err);
reenable_binlog(thd);
BUG#22864 (Rollback following CREATE... SELECT discards 'CREATE TABLE' from log): When row-based logging is used, the CREATE-SELECT is written as two parts: as a CREATE TABLE statement and as the rows for the table. For both transactional and non-transactional tables, the CREATE TABLE statement was written to the transaction cache, as were the rows, and on statement end, the entire transaction cache was written to the binary log if the table was non-transactional. For transactional tables, the events were kept in the transaction cache until end of transaction (or statement that were not part of a transaction). For the case when AUTOCOMMIT=0 and we are creating a transactional table using a create select, we would then keep the CREATE TABLE statement and the rows for the CREATE-SELECT, while executing the following statements. On a rollback, the transaction cache would then be cleared, which would also remove the CREATE TABLE statement. Hence no table would be created on the slave, while there is an empty table on the master. This relates to BUG#22865 where the table being created exists on the master, but not on the slave during insertion of rows into the newly created table. This occurs since the CREATE TABLE statement were still in the transaction cache until the statement finished executing, and possibly longer if the table was transactional. This patch changes the behaviour of the CREATE-SELECT statement by adding an implicit commit at the end of the statement when creating non-temporary tables. Hence, non-temporary tables will be written to the binary log on completion, and in the even of AUTOCOMMIT=0, a new transaction will be started. Temporary tables do not commit an ongoing transaction: neither as a pre- not a post-commit. The events for both transactional and non-transactional tables are saved in the transaction cache, and written to the binary log at end of the statement.
2006-12-21 09:29:02 +01:00
DBUG_VOID_RETURN;
2000-07-31 21:29:14 +02:00
}
2000-07-31 21:29:14 +02:00
bool select_create::send_eof()
{
bool tmp=select_insert::send_eof();
if (tmp)
abort_result_set();
2000-07-31 21:29:14 +02:00
else
{
BUG#22864 (Rollback following CREATE... SELECT discards 'CREATE TABLE' from log): When row-based logging is used, the CREATE-SELECT is written as two parts: as a CREATE TABLE statement and as the rows for the table. For both transactional and non-transactional tables, the CREATE TABLE statement was written to the transaction cache, as were the rows, and on statement end, the entire transaction cache was written to the binary log if the table was non-transactional. For transactional tables, the events were kept in the transaction cache until end of transaction (or statement that were not part of a transaction). For the case when AUTOCOMMIT=0 and we are creating a transactional table using a create select, we would then keep the CREATE TABLE statement and the rows for the CREATE-SELECT, while executing the following statements. On a rollback, the transaction cache would then be cleared, which would also remove the CREATE TABLE statement. Hence no table would be created on the slave, while there is an empty table on the master. This relates to BUG#22865 where the table being created exists on the master, but not on the slave during insertion of rows into the newly created table. This occurs since the CREATE TABLE statement were still in the transaction cache until the statement finished executing, and possibly longer if the table was transactional. This patch changes the behaviour of the CREATE-SELECT statement by adding an implicit commit at the end of the statement when creating non-temporary tables. Hence, non-temporary tables will be written to the binary log on completion, and in the even of AUTOCOMMIT=0, a new transaction will be started. Temporary tables do not commit an ongoing transaction: neither as a pre- not a post-commit. The events for both transactional and non-transactional tables are saved in the transaction cache, and written to the binary log at end of the statement.
2006-12-21 09:29:02 +01:00
/*
Do an implicit commit at end of statement for non-temporary
tables. This can fail, but we should unlock the table
nevertheless.
*/
if (!table->s->tmp_table)
{
trans_commit_stmt(thd);
trans_commit_implicit(thd);
}
BUG#22864 (Rollback following CREATE... SELECT discards 'CREATE TABLE' from log): When row-based logging is used, the CREATE-SELECT is written as two parts: as a CREATE TABLE statement and as the rows for the table. For both transactional and non-transactional tables, the CREATE TABLE statement was written to the transaction cache, as were the rows, and on statement end, the entire transaction cache was written to the binary log if the table was non-transactional. For transactional tables, the events were kept in the transaction cache until end of transaction (or statement that were not part of a transaction). For the case when AUTOCOMMIT=0 and we are creating a transactional table using a create select, we would then keep the CREATE TABLE statement and the rows for the CREATE-SELECT, while executing the following statements. On a rollback, the transaction cache would then be cleared, which would also remove the CREATE TABLE statement. Hence no table would be created on the slave, while there is an empty table on the master. This relates to BUG#22865 where the table being created exists on the master, but not on the slave during insertion of rows into the newly created table. This occurs since the CREATE TABLE statement were still in the transaction cache until the statement finished executing, and possibly longer if the table was transactional. This patch changes the behaviour of the CREATE-SELECT statement by adding an implicit commit at the end of the statement when creating non-temporary tables. Hence, non-temporary tables will be written to the binary log on completion, and in the even of AUTOCOMMIT=0, a new transaction will be started. Temporary tables do not commit an ongoing transaction: neither as a pre- not a post-commit. The events for both transactional and non-transactional tables are saved in the transaction cache, and written to the binary log at end of the statement.
2006-12-21 09:29:02 +01:00
table->file->extra(HA_EXTRA_NO_IGNORE_DUP_KEY);
Fix for bug#18437 "Wrong values inserted with a before update trigger on NDB table". SQL-layer was not marking fields which were used in triggers as such. As result these fields were not always properly retrieved/stored by handler layer. So one might got wrong values or lost changes in triggers for NDB, Federated and possibly InnoDB tables. This fix solves the problem by marking fields used in triggers appropriately. Also this patch contains the following cleanup of ha_ndbcluster code: We no longer rely on reading LEX::sql_command value in handler in order to determine if we can enable optimization which allows us to handle REPLACE statement in more efficient way by doing replaces directly in write_row() method without reporting error to SQL-layer. Instead we rely on SQL-layer informing us whether this optimization applicable by calling handler::extra() method with HA_EXTRA_WRITE_CAN_REPLACE flag. As result we no longer apply this optimzation in cases when it should not be used (e.g. if we have on delete triggers on table) and use in some additional cases when it is applicable (e.g. for LOAD DATA REPLACE). Finally this patch includes fix for bug#20728 "REPLACE does not work correctly for NDB table with PK and unique index". This was yet another problem which was caused by improper field mark-up. During row replacement fields which weren't explicity used in REPLACE statement were not marked as fields to be saved (updated) so they have retained values from old row version. The fix is to mark all table fields as set for REPLACE statement. Note that in 5.1 we already solve this problem by notifying handler that it should save values from all fields only in case when real replacement happens.
2006-07-01 23:51:10 +02:00
table->file->extra(HA_EXTRA_WRITE_CANNOT_REPLACE);
if (m_plock)
{
mysql_unlock_tables(thd, *m_plock);
*m_plock= NULL;
m_plock= NULL;
}
2000-07-31 21:29:14 +02:00
}
return tmp;
}
Fix for: Bug #20662 "Infinite loop in CREATE TABLE IF NOT EXISTS ... SELECT with locked tables" Bug #20903 "Crash when using CREATE TABLE .. SELECT and triggers" Bug #24738 "CREATE TABLE ... SELECT is not isolated properly" Bug #24508 "Inconsistent results of CREATE TABLE ... SELECT when temporary table exists" Deadlock occured when one tried to execute CREATE TABLE IF NOT EXISTS ... SELECT statement under LOCK TABLES which held read lock on target table. Attempt to execute the same statement for already existing target table with triggers caused server crashes. Also concurrent execution of CREATE TABLE ... SELECT statement and other statements involving target table suffered from various races (some of which might've led to deadlocks). Finally, attempt to execute CREATE TABLE ... SELECT in case when a temporary table with same name was already present led to the insertion of data into this temporary table and creation of empty non-temporary table. All above problems stemmed from the old implementation of CREATE TABLE ... SELECT in which we created, opened and locked target table without any special protection in a separate step and not with the rest of tables used by this statement. This underminded deadlock-avoidance approach used in server and created window for races. It also excluded target table from prelocking causing problems with trigger execution. The patch solves these problems by implementing new approach to handling of CREATE TABLE ... SELECT for base tables. We try to open and lock table to be created at the same time as the rest of tables used by this statement. If such table does not exist at this moment we create and place in the table cache special placeholder for it which prevents its creation or any other usage by other threads. We still use old approach for creation of temporary tables. Note that we have separate fix for 5.0 since there we use slightly different less intrusive approach.
2007-05-11 19:51:03 +02:00
void select_create::abort_result_set()
2000-07-31 21:29:14 +02:00
{
DBUG_ENTER("select_create::abort_result_set");
BUG#22864 (Rollback following CREATE... SELECT discards 'CREATE TABLE' from log): When row-based logging is used, the CREATE-SELECT is written as two parts: as a CREATE TABLE statement and as the rows for the table. For both transactional and non-transactional tables, the CREATE TABLE statement was written to the transaction cache, as were the rows, and on statement end, the entire transaction cache was written to the binary log if the table was non-transactional. For transactional tables, the events were kept in the transaction cache until end of transaction (or statement that were not part of a transaction). For the case when AUTOCOMMIT=0 and we are creating a transactional table using a create select, we would then keep the CREATE TABLE statement and the rows for the CREATE-SELECT, while executing the following statements. On a rollback, the transaction cache would then be cleared, which would also remove the CREATE TABLE statement. Hence no table would be created on the slave, while there is an empty table on the master. This relates to BUG#22865 where the table being created exists on the master, but not on the slave during insertion of rows into the newly created table. This occurs since the CREATE TABLE statement were still in the transaction cache until the statement finished executing, and possibly longer if the table was transactional. This patch changes the behaviour of the CREATE-SELECT statement by adding an implicit commit at the end of the statement when creating non-temporary tables. Hence, non-temporary tables will be written to the binary log on completion, and in the even of AUTOCOMMIT=0, a new transaction will be started. Temporary tables do not commit an ongoing transaction: neither as a pre- not a post-commit. The events for both transactional and non-transactional tables are saved in the transaction cache, and written to the binary log at end of the statement.
2006-12-21 09:29:02 +01:00
/*
In select_insert::abort_result_set() we roll back the statement, including
truncating the transaction cache of the binary log. To do this, we
pretend that the statement is transactional, even though it might
be the case that it was not.
BUG#22864 (Rollback following CREATE... SELECT discards 'CREATE TABLE' from log): When row-based logging is used, the CREATE-SELECT is written as two parts: as a CREATE TABLE statement and as the rows for the table. For both transactional and non-transactional tables, the CREATE TABLE statement was written to the transaction cache, as were the rows, and on statement end, the entire transaction cache was written to the binary log if the table was non-transactional. For transactional tables, the events were kept in the transaction cache until end of transaction (or statement that were not part of a transaction). For the case when AUTOCOMMIT=0 and we are creating a transactional table using a create select, we would then keep the CREATE TABLE statement and the rows for the CREATE-SELECT, while executing the following statements. On a rollback, the transaction cache would then be cleared, which would also remove the CREATE TABLE statement. Hence no table would be created on the slave, while there is an empty table on the master. This relates to BUG#22865 where the table being created exists on the master, but not on the slave during insertion of rows into the newly created table. This occurs since the CREATE TABLE statement were still in the transaction cache until the statement finished executing, and possibly longer if the table was transactional. This patch changes the behaviour of the CREATE-SELECT statement by adding an implicit commit at the end of the statement when creating non-temporary tables. Hence, non-temporary tables will be written to the binary log on completion, and in the even of AUTOCOMMIT=0, a new transaction will be started. Temporary tables do not commit an ongoing transaction: neither as a pre- not a post-commit. The events for both transactional and non-transactional tables are saved in the transaction cache, and written to the binary log at end of the statement.
2006-12-21 09:29:02 +01:00
We roll back the statement prior to deleting the table and prior
to releasing the lock on the table, since there might be potential
for failure if the rollback is executed after the drop or after
unlocking the table.
We also roll back the statement regardless of whether the creation
of the table succeeded or not, since we need to reset the binary
log state.
*/
tmp_disable_binlog(thd);
select_insert::abort_result_set();
thd->transaction.stmt.modified_non_trans_table= FALSE;
reenable_binlog(thd);
/* possible error of writing binary log is ignored deliberately */
merge mysql-5.1-rep+2-delivery1 --> mysql-5.1-rpl-merge Conflicts: Text conflict in .bzr-mysql/default.conf Text conflict in mysql-test/extra/rpl_tests/rpl_loaddata.test Text conflict in mysql-test/r/mysqlbinlog2.result Text conflict in mysql-test/suite/binlog/r/binlog_stm_mix_innodb_myisam.result Text conflict in mysql-test/suite/binlog/r/binlog_unsafe.result Text conflict in mysql-test/suite/rpl/r/rpl_insert_id.result Text conflict in mysql-test/suite/rpl/r/rpl_loaddata.result Text conflict in mysql-test/suite/rpl/r/rpl_stm_auto_increment_bug33029.result Text conflict in mysql-test/suite/rpl/r/rpl_udf.result Text conflict in mysql-test/suite/rpl/t/rpl_slow_query_log.test Text conflict in sql/field.h Text conflict in sql/log.cc Text conflict in sql/log_event.cc Text conflict in sql/log_event_old.cc Text conflict in sql/mysql_priv.h Text conflict in sql/share/errmsg.txt Text conflict in sql/sp.cc Text conflict in sql/sql_acl.cc Text conflict in sql/sql_base.cc Text conflict in sql/sql_class.h Text conflict in sql/sql_db.cc Text conflict in sql/sql_delete.cc Text conflict in sql/sql_insert.cc Text conflict in sql/sql_lex.cc Text conflict in sql/sql_lex.h Text conflict in sql/sql_load.cc Text conflict in sql/sql_table.cc Text conflict in sql/sql_update.cc Text conflict in sql/sql_view.cc Conflict adding files to storage/innobase. Created directory. Conflict because storage/innobase is not versioned, but has versioned children. Versioned directory. Conflict adding file storage/innobase. Moved existing file to storage/innobase.moved. Conflict adding files to storage/innobase/handler. Created directory. Conflict because storage/innobase/handler is not versioned, but has versioned children. Versioned directory. Contents conflict in storage/innobase/handler/ha_innodb.cc
2010-01-07 16:39:11 +01:00
(void) thd->binlog_flush_pending_rows_event(TRUE, TRUE);
BUG#22864 (Rollback following CREATE... SELECT discards 'CREATE TABLE' from log): When row-based logging is used, the CREATE-SELECT is written as two parts: as a CREATE TABLE statement and as the rows for the table. For both transactional and non-transactional tables, the CREATE TABLE statement was written to the transaction cache, as were the rows, and on statement end, the entire transaction cache was written to the binary log if the table was non-transactional. For transactional tables, the events were kept in the transaction cache until end of transaction (or statement that were not part of a transaction). For the case when AUTOCOMMIT=0 and we are creating a transactional table using a create select, we would then keep the CREATE TABLE statement and the rows for the CREATE-SELECT, while executing the following statements. On a rollback, the transaction cache would then be cleared, which would also remove the CREATE TABLE statement. Hence no table would be created on the slave, while there is an empty table on the master. This relates to BUG#22865 where the table being created exists on the master, but not on the slave during insertion of rows into the newly created table. This occurs since the CREATE TABLE statement were still in the transaction cache until the statement finished executing, and possibly longer if the table was transactional. This patch changes the behaviour of the CREATE-SELECT statement by adding an implicit commit at the end of the statement when creating non-temporary tables. Hence, non-temporary tables will be written to the binary log on completion, and in the even of AUTOCOMMIT=0, a new transaction will be started. Temporary tables do not commit an ongoing transaction: neither as a pre- not a post-commit. The events for both transactional and non-transactional tables are saved in the transaction cache, and written to the binary log at end of the statement.
2006-12-21 09:29:02 +01:00
if (m_plock)
2000-07-31 21:29:14 +02:00
{
mysql_unlock_tables(thd, *m_plock);
*m_plock= NULL;
m_plock= NULL;
2000-07-31 21:29:14 +02:00
}
BUG#22864 (Rollback following CREATE... SELECT discards 'CREATE TABLE' from log): When row-based logging is used, the CREATE-SELECT is written as two parts: as a CREATE TABLE statement and as the rows for the table. For both transactional and non-transactional tables, the CREATE TABLE statement was written to the transaction cache, as were the rows, and on statement end, the entire transaction cache was written to the binary log if the table was non-transactional. For transactional tables, the events were kept in the transaction cache until end of transaction (or statement that were not part of a transaction). For the case when AUTOCOMMIT=0 and we are creating a transactional table using a create select, we would then keep the CREATE TABLE statement and the rows for the CREATE-SELECT, while executing the following statements. On a rollback, the transaction cache would then be cleared, which would also remove the CREATE TABLE statement. Hence no table would be created on the slave, while there is an empty table on the master. This relates to BUG#22865 where the table being created exists on the master, but not on the slave during insertion of rows into the newly created table. This occurs since the CREATE TABLE statement were still in the transaction cache until the statement finished executing, and possibly longer if the table was transactional. This patch changes the behaviour of the CREATE-SELECT statement by adding an implicit commit at the end of the statement when creating non-temporary tables. Hence, non-temporary tables will be written to the binary log on completion, and in the even of AUTOCOMMIT=0, a new transaction will be started. Temporary tables do not commit an ongoing transaction: neither as a pre- not a post-commit. The events for both transactional and non-transactional tables are saved in the transaction cache, and written to the binary log at end of the statement.
2006-12-21 09:29:02 +01:00
2000-07-31 21:29:14 +02:00
if (table)
{
table->file->extra(HA_EXTRA_NO_IGNORE_DUP_KEY);
Fix for bug#18437 "Wrong values inserted with a before update trigger on NDB table". SQL-layer was not marking fields which were used in triggers as such. As result these fields were not always properly retrieved/stored by handler layer. So one might got wrong values or lost changes in triggers for NDB, Federated and possibly InnoDB tables. This fix solves the problem by marking fields used in triggers appropriately. Also this patch contains the following cleanup of ha_ndbcluster code: We no longer rely on reading LEX::sql_command value in handler in order to determine if we can enable optimization which allows us to handle REPLACE statement in more efficient way by doing replaces directly in write_row() method without reporting error to SQL-layer. Instead we rely on SQL-layer informing us whether this optimization applicable by calling handler::extra() method with HA_EXTRA_WRITE_CAN_REPLACE flag. As result we no longer apply this optimzation in cases when it should not be used (e.g. if we have on delete triggers on table) and use in some additional cases when it is applicable (e.g. for LOAD DATA REPLACE). Finally this patch includes fix for bug#20728 "REPLACE does not work correctly for NDB table with PK and unique index". This was yet another problem which was caused by improper field mark-up. During row replacement fields which weren't explicity used in REPLACE statement were not marked as fields to be saved (updated) so they have retained values from old row version. The fix is to mark all table fields as set for REPLACE statement. Note that in 5.1 we already solve this problem by notifying handler that it should save values from all fields only in case when real replacement happens.
2006-07-01 23:51:10 +02:00
table->file->extra(HA_EXTRA_WRITE_CANNOT_REPLACE);
table->auto_increment_field_not_null= FALSE;
drop_open_table(thd, table, create_table->db, create_table->table_name);
table=0; // Safety
2000-07-31 21:29:14 +02:00
}
BUG#22864 (Rollback following CREATE... SELECT discards 'CREATE TABLE' from log): When row-based logging is used, the CREATE-SELECT is written as two parts: as a CREATE TABLE statement and as the rows for the table. For both transactional and non-transactional tables, the CREATE TABLE statement was written to the transaction cache, as were the rows, and on statement end, the entire transaction cache was written to the binary log if the table was non-transactional. For transactional tables, the events were kept in the transaction cache until end of transaction (or statement that were not part of a transaction). For the case when AUTOCOMMIT=0 and we are creating a transactional table using a create select, we would then keep the CREATE TABLE statement and the rows for the CREATE-SELECT, while executing the following statements. On a rollback, the transaction cache would then be cleared, which would also remove the CREATE TABLE statement. Hence no table would be created on the slave, while there is an empty table on the master. This relates to BUG#22865 where the table being created exists on the master, but not on the slave during insertion of rows into the newly created table. This occurs since the CREATE TABLE statement were still in the transaction cache until the statement finished executing, and possibly longer if the table was transactional. This patch changes the behaviour of the CREATE-SELECT statement by adding an implicit commit at the end of the statement when creating non-temporary tables. Hence, non-temporary tables will be written to the binary log on completion, and in the even of AUTOCOMMIT=0, a new transaction will be started. Temporary tables do not commit an ongoing transaction: neither as a pre- not a post-commit. The events for both transactional and non-transactional tables are saved in the transaction cache, and written to the binary log at end of the statement.
2006-12-21 09:29:02 +01:00
DBUG_VOID_RETURN;
2000-07-31 21:29:14 +02:00
}
/*****************************************************************************
2002-07-25 00:00:56 +02:00
Instansiate templates
2000-07-31 21:29:14 +02:00
*****************************************************************************/
#ifdef HAVE_EXPLICIT_TEMPLATE_INSTANTIATION
template class List_iterator_fast<List_item>;
#ifndef EMBEDDED_LIBRARY
template class I_List<Delayed_insert>;
template class I_List_iterator<Delayed_insert>;
2000-07-31 21:29:14 +02:00
template class I_List<delayed_row>;
#endif /* EMBEDDED_LIBRARY */
#endif /* HAVE_EXPLICIT_TEMPLATE_INSTANTIATION */