#ifndef TABLE_INCLUDED
#define TABLE_INCLUDED

/* Copyright 2000-2008 MySQL AB, 2008 Sun Microsystems, Inc.

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; version 2 of the License.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software
   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */

#include "sql_plist.h"
#include "mdl.h"

/* Structs that define the TABLE */

class Item;                             /* Needed by ORDER */
class Item_subselect;
class Item_field;
class GRANT_TABLE;
class st_select_lex_unit;
class st_select_lex;
class partition_info;
class COND_EQUAL;
class Security_context;

Patch for the following bugs:
- BUG#11986: Stored routines and triggers can fail if the code
has a non-ascii symbol
- BUG#16291: mysqldump corrupts string-constants with non-ascii-chars
- BUG#19443: INFORMATION_SCHEMA does not support charsets properly
- BUG#21249: Character set of SP-var can be ignored
- BUG#25212: Character set of string constant is ignored (stored routines)
- BUG#25221: Character set of string constant is ignored (triggers)
There were a few general problems that caused these bugs:
1. Character set information of the original (definition) query for views,
triggers, stored routines and events was lost.
2. mysqldump output query in client character set, which can be
inappropriate to encode definition-query.
3. INFORMATION_SCHEMA used strings with mixed encodings to display object
definition;
1. No query-definition-character set.
In order to compile query into execution code, some extra data (such as
environment variables or the database character set) is used. The problem
here was that this context was not preserved. So, on the next load it can
differ from the original one, thus the result will be different.
The context contains the following data:
- client character set;
- connection collation (character set and collation);
- collation of the owner database;
The fix is to store this context and use it each time we parse (compile)
and execute the object (stored routine, trigger, ...).
2. Wrong mysqldump-output.
The original query can contain several encodings (by means of character set
introducers). The problem here was that we tried to convert original query
to the mysqldump-client character set.
Moreover, we stored queries in different character sets for different
objects (views, for one, used UTF8, triggers used original character set).
The solution is
- to store definition queries in the original character set;
- to change SHOW CREATE statement to output definition query in the
binary character set (i.e. without any conversion);
- introduce SHOW CREATE TRIGGER statement;
- to dump special statements to switch the context to the original one
before dumping and restore it afterwards.
Note, in order to preserve the database collation at the creation time,
additional ALTER DATABASE might be used (to temporarily switch the database
collation back to the original value). In this case, ALTER DATABASE
privilege will be required. This is a backward-incompatible change.
3. INFORMATION_SCHEMA showed non-UTF8 strings
The fix is to generate UTF8-query during the parsing, store it in the object
and show it in the INFORMATION_SCHEMA.
Basically, the idea is to create a copy of the original query and convert it
to UTF8. Character set introducers are removed and all text literals are
converted to UTF8.
This UTF8 query is intended to provide user-readable output. It must not be
used to recreate the object. Specialized SHOW CREATE statements should be
used for this.
The reason for this limitation is the following: the original query can
contain symbols from several character sets (by means of character set
introducers).
Example:
- original query:
CREATE VIEW v1 AS SELECT _cp1251 'Hello' AS c1;
- UTF8 query (for INFORMATION_SCHEMA):
CREATE VIEW v1 AS SELECT 'Hello' AS c1;
2007-06-28 19:34:54 +02:00
/*************************************************************************/

/**
  View_creation_ctx -- creation context of view objects.
*/

class View_creation_ctx : public Default_object_creation_ctx,
                          public Sql_alloc
{
public:
  static View_creation_ctx *create(THD *thd);

  static View_creation_ctx *create(THD *thd,
                                   TABLE_LIST *view);

private:
  View_creation_ctx(THD *thd)
   : Default_object_creation_ctx(thd)
  { }
};

/*************************************************************************/
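
The commit message above describes storing a creation context (client character set, connection collation, database collation) and re-applying it each time the object is parsed. A minimal sketch of that idea follows. `CreationContext`, `Session`, `ContextSwitch`, and `parse_with_context` are hypothetical names for illustration only; they are not the server's classes, and the real `View_creation_ctx` works with `THD` and charset objects rather than strings.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the three settings the stored context preserves.
struct CreationContext {
  std::string client_charset;
  std::string connection_collation;
  std::string database_collation;
};

// Hypothetical session that holds the currently active settings.
struct Session {
  CreationContext active;
};

// RAII guard: installs the stored context on construction and restores the
// session's previous settings when it goes out of scope.
class ContextSwitch {
public:
  ContextSwitch(Session &session, const CreationContext &stored)
    : session_(session), saved_(session.active)
  {
    session_.active = stored;
  }
  ~ContextSwitch() { session_.active = saved_; }
  ContextSwitch(const ContextSwitch &) = delete;
  ContextSwitch &operator=(const ContextSwitch &) = delete;

private:
  Session &session_;
  CreationContext saved_;
};

// Pretends to parse an object definition and returns the client character
// set that was in effect while the stored context was installed.
std::string parse_with_context(Session &session, const CreationContext &stored)
{
  ContextSwitch guard(session, stored);
  return session.active.client_charset;
}
```

The guard makes the restore step automatic, so the session's own settings come back even if parsing returns early.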

/* Order clause list element */

typedef struct st_order {
  struct st_order *next;
  Item   **item;                        /* Point at item in select fields */
  Item   *item_ptr;                     /* Storage for initial item */
  Item   **item_copy;                   /* For SPs; the original item ptr */
  int    counter;                       /* position in SELECT list, correct
                                           only if counter_used is true */
  bool   asc;                           /* true if ascending */
  bool   free_me;                       /* true if item isn't shared */
  bool   in_field_list;                 /* true if in select field list */
  bool   counter_used;                  /* parameter was counter of columns */
  Field  *field;                        /* If tmp-table group */
  char   *buff;                         /* If tmp-table group */
  table_map used, depend_map;
} ORDER;
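
ORDER elements form a singly linked list through `next`, and `counter` is meaningful only when `counter_used` is set. The sketch below walks such a list. `Order_sketch` is a simplified, hypothetical stand-in that keeps only four of the members above, and `count_positional` is an illustrative helper, not a server function.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-in for ORDER: only the members this sketch needs.
struct Order_sketch {
  Order_sketch *next;   // next element of the ORDER BY / GROUP BY list
  int counter;          // position in the SELECT list
  bool counter_used;    // true if the element was given as a column position
  bool asc;             // true if ascending
};

// Counts the elements that were specified by position (e.g. ORDER BY 2),
// which are the only ones whose counter value may be trusted.
std::size_t count_positional(const Order_sketch *head)
{
  std::size_t n = 0;
  for (const Order_sketch *ord = head; ord != nullptr; ord = ord->next)
  {
    if (ord->counter_used)
      ++n;
  }
  return n;
}
```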

/**
  @brief The current state of the privilege checking process for the current
  user, SQL statement and SQL object.

  @details The privilege checking process is divided into phases depending on
  the level of the privilege to be checked and the type of object to be
  accessed. Because privilege checking is scattered across these phases,
  it is necessary to keep track of the state of the process. This
  information is stored in privilege, want_privilege, and
  orig_want_privilege.

  A GRANT_INFO also serves as a cache of the privilege hash tables. Relevant
  members are grant_table and version.
 */
typedef struct st_grant_info
{
  /**
    @brief A copy of the privilege information regarding the current host,
    database, object and user.

    @details The version of this copy is found in GRANT_INFO::version.
   */
  GRANT_TABLE *grant_table;
  /**
    @brief Used for cache invalidation when caching privilege information.

    @details The privilege information is stored on disk, with dedicated
    caches residing in memory: table-level and column-level privileges,
    respectively, have their own dedicated caches.

    The GRANT_INFO works as a level 1 cache with this member updated to the
    current value of the global variable @c grant_version (@c static variable
    in sql_acl.cc). It is updated whenever the GRANT_INFO is refreshed from
    the level 2 cache. The level 2 cache is the @c column_priv_hash structure
    (@c static variable in sql_acl.cc).

    @see grant_version
   */
  uint version;
  /**
    @brief The set of privileges that the current user has fulfilled for a
    certain host, database, and object.

    @details This field is continually updated throughout the access checking
    process. In each step the "wanted privilege" is checked against the
    fulfilled privileges. When/if every wanted privilege is contained in the
    fulfilled set (that is, no wanted privilege remains unfulfilled),
    access is granted.

    The set is implemented as a bitmap, with the bits defined in sql_acl.h.
   */
  ulong privilege;
  /**
    @brief The set of privileges that the current user needs to fulfil in
    order to carry out the requested operation.
   */
  ulong want_privilege;
  /**
    Stores the requested access acl of top level tables list. Is used to
    check access rights to the underlying tables of a view.
  */
  ulong orig_want_privilege;
} GRANT_INFO;
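
The interplay of `privilege` and `want_privilege` can be illustrated with plain bitmaps. This is a sketch under stated assumptions: the `SKETCH_*` bit values, `Grant_sketch`, and `access_granted_after` are hypothetical, and the real bits live in sql_acl.h. It shows one checking step that records newly fulfilled privileges and reports whether anything wanted is still missing.

```cpp
#include <cassert>

// Hypothetical privilege bits; the real definitions are in sql_acl.h.
const unsigned long SKETCH_SELECT_ACL = 1UL << 0;
const unsigned long SKETCH_INSERT_ACL = 1UL << 1;
const unsigned long SKETCH_UPDATE_ACL = 1UL << 2;

// Reduced stand-in for GRANT_INFO with only the two bitmaps.
struct Grant_sketch {
  unsigned long privilege;       // privileges fulfilled so far
  unsigned long want_privilege;  // privileges still needed
};

// One checking step: add the privileges fulfilled at this level, remove them
// from the wanted set, and return true once nothing wanted is left.
bool access_granted_after(Grant_sketch &grant, unsigned long fulfilled_now)
{
  grant.privilege |= fulfilled_now;
  grant.want_privilege &= ~grant.privilege;
  return grant.want_privilege == 0;
}
```

Keeping the remaining wanted bits in the structure lets a later phase (for example a column-level check) continue where an earlier phase stopped.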

enum tmp_table_type
{
  NO_TMP_TABLE, NON_TRANSACTIONAL_TMP_TABLE, TRANSACTIONAL_TMP_TABLE,
  INTERNAL_TMP_TABLE, SYSTEM_TMP_TABLE
};

A fix and a test case for Bug#26141 mixing table types in trigger
causes full table lock on innodb table.
Also fixes Bug#28502 Triggers that update another innodb table
will block on X lock unnecessarily (duplicate).
Code review fixes.
Both bugs' synopses are misleading: InnoDB table is
not X locked. The statements, however, cannot proceed concurrently,
but this happens due to lock conflicts for tables used in triggers,
not for the InnoDB table.
If a user had an InnoDB table, and two triggers, AFTER UPDATE and
AFTER INSERT, competing for different resources (e.g. two distinct
MyISAM tables), then these two triggers would not be able to execute
concurrently. Moreover, INSERTS/UPDATES of the InnoDB table would
not be able to run concurrently.
The problem had other side-effects (see respective bug reports).
This behavior was a consequence of a shortcoming of the pre-locking
algorithm, which would not distinguish between different DML operations
(e.g. INSERT and DELETE) and pre-lock all the tables
that are used by any trigger defined on the subject table.
The idea of the fix is to extend the pre-locking algorithm to keep track,
for each table, of the DML operation it is used for, and not to
load triggers that are known to never be fired.
2007-07-12 20:26:41 +02:00
/** Event on which trigger is invoked. */
enum trg_event_type
{
  TRG_EVENT_INSERT= 0,
  TRG_EVENT_UPDATE= 1,
  TRG_EVENT_DELETE= 2,
  TRG_EVENT_MAX
};
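
Because the event values are small consecutive integers, they can serve as bit positions when recording which DML operations a statement performs on a table, which is the bookkeeping the commit message above describes. The following is a sketch only: `event_bit` and `trigger_may_fire` are hypothetical helpers, and the enum is repeated so the block compiles on its own.

```cpp
#include <cassert>

// Repeated from the header so this sketch is self-contained.
enum trg_event_type
{
  TRG_EVENT_INSERT= 0,
  TRG_EVENT_UPDATE= 1,
  TRG_EVENT_DELETE= 2,
  TRG_EVENT_MAX
};

// Hypothetical helper: the bit that represents one event in an event map.
inline unsigned event_bit(trg_event_type event)
{
  return 1u << static_cast<unsigned>(event);
}

// Hypothetical helper: given the events a statement can cause on a table,
// decide whether a trigger defined for trigger_event could ever fire.
inline bool trigger_may_fire(unsigned statement_event_map,
                             trg_event_type trigger_event)
{
  return (statement_event_map & event_bit(trigger_event)) != 0;
}
```

With such a map, pre-locking can skip the tables used only by triggers whose event bit is not set for the statement.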

enum frm_type_enum
{
  FRMTYPE_ERROR= 0,
  FRMTYPE_TABLE,
  FRMTYPE_VIEW
};

enum release_type { RELEASE_NORMAL, RELEASE_WAIT_FOR_DROP };

typedef struct st_filesort_info
{
WL#3817: Simplify string / memory area types and make things more consistent (first part)
The following type conversions were done:
- Changed byte to uchar
- Changed gptr to uchar*
- Change my_string to char *
- Change my_size_t to size_t
- Change size_s to size_t
Removed declaration of byte, gptr, my_string, my_size_t and size_s.
The following function parameter changes were done:
- All string functions in mysys/strings was changed to use size_t
instead of uint for string lengths.
- All read()/write() functions changed to use size_t (including vio).
- All protocol functions changed to use size_t instead of uint
- Functions that used a pointer to a string length was changed to use size_t*
- Changed malloc(), free() and related functions from using gptr to use void *
as this requires fewer casts in the code and is more in line with how the
standard functions work.
- Added extra length argument to dirname_part() to return the length of the
created string.
- Changed (at least) following functions to take uchar* as argument:
- db_dump()
- my_net_write()
- net_write_command()
- net_store_data()
- DBUG_DUMP()
- decimal2bin() & bin2decimal()
- Changed my_compress() and my_uncompress() to use size_t. Changed one
argument to my_uncompress() from a pointer to a value as we only return
one value (makes function easier to use).
- Changed type of 'pack_data' argument to packfrm() to avoid casts.
- Changed in readfrm() and writefrm(), ha_discover and handler::discover()
the type for argument 'frmdata' to uchar** to avoid casts.
- Changed most Field functions to use uchar* instead of char* (reduced a lot of
casts).
- Changed field->val_xxx(xxx, new_ptr) to take const pointers.
Other changes:
- Removed a lot of not needed casts
- Added a few new cast required by other changes
- Added some cast to my_multi_malloc() arguments for safety (as string lengths
needs to be uint, not size_t).
- Fixed all calls to hash-get-key functions to use size_t*. (Needed to be done
explicitly as this conflict was often hidden by casting the function to
hash_get_key).
- Changed some buffers to memory regions to uchar* to avoid casts.
- Changed some string lengths from uint to size_t.
- Changed field->ptr to be uchar* instead of char*. This allowed us to
get rid of a lot of casts.
- Some changes from true -> TRUE, false -> FALSE, unsigned char -> uchar
- Include zlib.h in some files as we needed declaration of crc32()
- Changed MY_FILE_ERROR to be (size_t) -1.
- Changed many variables to hold the result of my_read() / my_write() to be
size_t. This was needed to properly detect errors (which are
returned as (size_t) -1).
- Removed some very old VMS code
- Changed packfrm()/unpackfrm() to not be depending on uint size
(portability fix)
- Removed windows specific code to restore cursor position as this
causes slowdown on windows and we should not mix read() and pread()
calls anyway as this is not thread safe. Updated function comment to
reflect this. Changed function that depended on original behavior of
my_pwrite() to itself restore the cursor position (one such case).
- Added some missing checking of return value of malloc().
- Changed definition of MOD_PAD_CHAR_TO_FULL_LENGTH to avoid 'long' overflow.
- Changed type of table_def::m_size from my_size_t to ulong to reflect that
m_size is the number of elements in the array, not a string/memory
length.
- Moved THD::max_row_length() to table.cc (as it's not depending on THD).
Inlined max_row_length_blob() into this function.
- More function comments
- Fixed some compiler warnings when compiled without partitions.
- Removed setting of LEX_STRING() arguments in declaration (portability fix).
- Some trivial indentation/variable name changes.
- Some trivial code simplifications:
- Replaced some calls to alloc_root + memcpy to use
strmake_root()/strdup_root().
- Changed some calls from memdup() to strmake() (Safety fix)
- Simpler loops in client-simple.c
2007-05-10 11:59:39 +02:00
  IO_CACHE *io_cache;           /* If sorted through filesort */
  uchar     **sort_keys;        /* Buffer for sorting keys */
  uchar     *buffpek;           /* Buffer for buffpek structures */
  uint      buffpek_len;        /* Max number of buffpeks in the buffer */
  uchar     *addon_buf;         /* Pointer to a buffer if sorted with fields */
  size_t    addon_length;       /* Length of the buffer */
  struct st_sort_addon_field *addon_field;     /* Pointer to the fields info */
  void    (*unpack)(struct st_sort_addon_field *, uchar *); /* To unpack back */
  uchar     *record_pointers;   /* If sorted in memory */
  ha_rows   found_records;      /* How many records in sort */
} FILESORT_INFO;


/*
  Values in this enum are used to indicate how a table's TIMESTAMP field
  should be treated. It can be set to the current timestamp on insert or
  update or both.
  WARNING: The values are used for bit operations. If you change the
  enum, you must keep the bitwise relation of the values. For example:
  (int) TIMESTAMP_AUTO_SET_ON_BOTH must be equal to
  (int) TIMESTAMP_AUTO_SET_ON_INSERT | (int) TIMESTAMP_AUTO_SET_ON_UPDATE.
  We use an enum here so that the debugger can display the value names.
*/
enum timestamp_auto_set_type
{
  TIMESTAMP_NO_AUTO_SET= 0, TIMESTAMP_AUTO_SET_ON_INSERT= 1,
  TIMESTAMP_AUTO_SET_ON_UPDATE= 2, TIMESTAMP_AUTO_SET_ON_BOTH= 3
};
#define clear_timestamp_auto_bits(_target_, _bits_) \
  (_target_)= (enum timestamp_auto_set_type)((int)(_target_) & ~(int)(_bits_))
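
The bitwise relation the warning insists on is what makes `clear_timestamp_auto_bits` work: clearing one bit of `TIMESTAMP_AUTO_SET_ON_BOTH` must leave the other named value. A small sketch, with the enum and macro repeated so it compiles on its own; `drop_update_bit` is a hypothetical helper added only for the demonstration.

```cpp
#include <cassert>

// Repeated from the header so this sketch is self-contained.
enum timestamp_auto_set_type
{
  TIMESTAMP_NO_AUTO_SET= 0, TIMESTAMP_AUTO_SET_ON_INSERT= 1,
  TIMESTAMP_AUTO_SET_ON_UPDATE= 2, TIMESTAMP_AUTO_SET_ON_BOTH= 3
};
#define clear_timestamp_auto_bits(_target_, _bits_) \
  (_target_)= (enum timestamp_auto_set_type)((int)(_target_) & ~(int)(_bits_))

// Hypothetical helper: returns the setting with the "on update" bit removed,
// as one might do when a statement assigns the TIMESTAMP column explicitly.
timestamp_auto_set_type drop_update_bit(timestamp_auto_set_type setting)
{
  clear_timestamp_auto_bits(setting, TIMESTAMP_AUTO_SET_ON_UPDATE);
  return setting;
}
```

If the enum values were renumbered without preserving `BOTH == INSERT | UPDATE`, the cast in the macro could produce a value with no matching enumerator.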

class Field_timestamp;
class Field_blob;
class Table_triggers_list;

WL#3984 (Revise locking of mysql.general_log and mysql.slow_log)
Bug#25422 (Hang with log tables)
Bug 17876 (Truncating mysql.slow_log in a SP after using cursor locks the
thread)
Bug 23044 (Warnings on flush of a log table)
Bug 29129 (Resetting general_log while the GLOBAL READ LOCK is set causes
a deadlock)
Prior to this fix, the server would hang when performing concurrent
ALTER TABLE or TRUNCATE TABLE statements against the LOG TABLES,
which are mysql.general_log and mysql.slow_log.
The root cause traces to the following code:
in sql_base.cc, open_table()
if (table->in_use != thd)
{
/* wait_for_condition will unlock LOCK_open for us */
wait_for_condition(thd, &LOCK_open, &COND_refresh);
}
The problem with this code is that the current implementation of the
LOGGER creates 'fake' THD objects, like
- Log_to_csv_event_handler::general_log_thd
- Log_to_csv_event_handler::slow_log_thd
which are not associated to a real thread running in the server,
so that waiting for these non-existing threads to release table locks
causes the deadlock.
In general, the design of Log_to_csv_event_handler does not fit into the
general architecture of the server, so that the concept of general_log_thd
and slow_log_thd has to be abandoned:
- this implementation does not work with table locking
- it will not work with commands like SHOW PROCESSLIST
- having the log tables always opened does not integrate well with DDL
operations / FLUSH TABLES / SET GLOBAL READ_ONLY
With this patch, the fundamental design of the LOGGER has been changed to:
- always open and close a log table when writing a log
- remove totally the usage of fake THD objects
- clarify how locking of log tables is implemented in general.
See WL#3984 for details related to the new locking design.
Additional changes (misc bugs exposed and fixed):
1)
mysqldump would ignore some tables in dump_all_tables_in_db(),
but forget to ignore the same in dump_all_views_in_db().
2)
mysqldump would also issue an empty "LOCK TABLE" command when all the tables
to lock are to be ignored (numrows == 0), instead of not issuing the query.
3)
Internal error handlers could intercept errors but not warnings
(see sql_error.cc).
4)
Implementing a nested call to open tables, for the performance schema tables,
exposed an existing bug in remove_table_from_cache(), which would perform:
in_use->some_tables_deleted=1;
against another thread, without any consideration about thread locking.
This call inside remove_table_from_cache() was not required anyway,
since calling mysql_lock_abort() takes care of aborting -- cleanly -- threads
that might hold a lock on a table.
This line (in_use->some_tables_deleted=1) has been removed.
2007-07-27 08:31:06 +02:00
/**
  Category of table found in the table share.
*/
enum enum_table_category
{
  /**
    Unknown value.
  */
  TABLE_UNKNOWN_CATEGORY=0,

  /**
    Temporary table.
    The table is visible only in the session.
    Therefore,
    - FLUSH TABLES WITH READ LOCK
    - SET GLOBAL READ_ONLY = ON
    do not apply to this table.
    Note that LOCK TABLE t FOR READ/WRITE
    can be used on temporary tables.
    Temporary tables are not part of the table cache.
  */
  TABLE_CATEGORY_TEMPORARY=1,

  /**
    User table.
    These tables do honor:
    - LOCK TABLE t FOR READ/WRITE
WL#3984 (Revise locking of mysql.general_log and mysql.slow_log)
Bug#25422 (Hang with log tables)
Bug 17876 (Truncating mysql.slow_log in a SP after using cursor locks the
thread)
Bug 23044 (Warnings on flush of a log table)
Bug 29129 (Resetting general_log while the GLOBAL READ LOCK is set causes
a deadlock)
Prior to this fix, the server would hang when performing concurrent
ALTER TABLE or TRUNCATE TABLE statements against the LOG TABLES,
which are mysql.general_log and mysql.slow_log.
The root cause traces to the following code:
in sql_base.cc, open_table()
if (table->in_use != thd)
{
/* wait_for_condition will unlock LOCK_open for us */
wait_for_condition(thd, &LOCK_open, &COND_refresh);
}
The problem with this code is that the current implementation of the
LOGGER creates 'fake' THD objects, like
- Log_to_csv_event_handler::general_log_thd
- Log_to_csv_event_handler::slow_log_thd
which are not associated with a real thread running in the server,
so waiting for these non-existent threads to release table locks
causes the deadlock.
In general, the design of Log_to_csv_event_handler does not fit into the
general architecture of the server, so that the concept of general_log_thd
and slow_log_thd has to be abandoned:
- this implementation does not work with table locking
- it will not work with commands like SHOW PROCESSLIST
- having the log tables always opened does not integrate well with DDL
operations / FLUSH TABLES / SET GLOBAL READ_ONLY
With this patch, the fundamental design of the LOGGER has been changed to:
- always open and close a log table when writing a log
- completely remove the use of fake THD objects
- clarify how locking of log tables is implemented in general.
See WL#3984 for details related to the new locking design.
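The new design listed above (open and close the log table around every write, with no long-lived fake THD) can be pictured with a small RAII sketch. Everything in it is hypothetical: `LogTable`, `LogTableGuard` and `write_log_row` are illustrative names, not server code, and the counters exist only so the open/close pairing can be checked.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a log table such as mysql.general_log.
struct LogTable {
  int open_count = 0;             // number of times the table was opened
  int close_count = 0;            // number of times the table was closed
  bool is_open = false;           // true only while a write is in progress
  std::vector<std::string> rows;  // rows written so far
};

// Hypothetical RAII guard: opens the table in the constructor and closes it
// in the destructor, so the table is never left open between two log writes.
class LogTableGuard {
public:
  explicit LogTableGuard(LogTable &table) : table_(table) {
    assert(!table_.is_open);      // no nested open in this simplified sketch
    table_.is_open = true;
    ++table_.open_count;
  }
  ~LogTableGuard() {
    table_.is_open = false;
    ++table_.close_count;
  }
  LogTableGuard(const LogTableGuard &) = delete;
  LogTableGuard &operator=(const LogTableGuard &) = delete;

  void append(const std::string &row) { table_.rows.push_back(row); }

private:
  LogTable &table_;
};

// One log write = one open, one append, one close, all done by the caller.
inline void write_log_row(LogTable &table, const std::string &row) {
  LogTableGuard guard(table);
  guard.append(row);
}
```

Because the table is closed again before `write_log_row()` returns, nothing in this sketch holds the table open while other statements (DDL, FLUSH TABLES) run, which is the property the patch description is after.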
Additional changes (misc bugs exposed and fixed):
1)
mysqldump would ignore some tables in dump_all_tables_in_db(),
but forget to ignore the same tables in dump_all_views_in_db().
2)
mysqldump would also issue an empty "LOCK TABLE" command when all the tables
to lock are to be ignored (numrows == 0), instead of not issuing the query.
3)
Internal error handlers could intercept errors but not warnings
(see sql_error.cc).
4)
Implementing a nested call to open tables, for the performance schema tables,
exposed an existing bug in remove_table_from_cache(), which would perform:
in_use->some_tables_deleted=1;
against another thread, without any consideration about thread locking.
This call inside remove_table_from_cache() was not required anyway,
since calling mysql_lock_abort() takes care of aborting -- cleanly -- threads
that might hold a lock on a table.
This line (in_use->some_tables_deleted=1) has been removed.
2007-07-27 08:31:06 +02:00
|
|
|
- FLUSH TABLES WITH READ LOCK
|
|
|
|
- SET GLOBAL READ_ONLY = ON
|
|
|
|
User tables are cached in the table cache.
|
|
|
|
*/
|
|
|
|
TABLE_CATEGORY_USER=2,
|
|
|
|
|
|
|
|
/**
|
|
|
|
System table, maintained by the server.
|
|
|
|
These tables do honor:
|
2007-08-15 15:43:08 +02:00
|
|
|
- LOCK TABLE t FOR READ/WRITE
|
2007-07-27 08:31:06 +02:00
|
|
|
- FLUSH TABLES WITH READ LOCK
|
|
|
|
- SET GLOBAL READ_ONLY = ON
|
|
|
|
Typically, writes to system tables are performed by
|
|
|
|
the server implementation, not explicitly by a user.
|
|
|
|
System tables are cached in the table cache.
|
|
|
|
*/
|
|
|
|
TABLE_CATEGORY_SYSTEM=3,
|
|
|
|
|
|
|
|
/**
|
|
|
|
Information schema tables.
|
|
|
|
These tables are an interface provided by the system
|
|
|
|
to inspect the system metadata.
|
|
|
|
These tables do *not* honor:
|
2007-08-15 15:43:08 +02:00
|
|
|
- LOCK TABLE t FOR READ/WRITE
|
2007-07-27 08:31:06 +02:00
|
|
|
- FLUSH TABLES WITH READ LOCK
|
|
|
|
- SET GLOBAL READ_ONLY = ON
|
|
|
|
as there is no point in locking explicitly
|
|
|
|
an INFORMATION_SCHEMA table.
|
|
|
|
Nothing is directly written to information schema tables.
|
|
|
|
Note that this value is not used currently,
|
|
|
|
since information schema tables are not shared,
|
|
|
|
but implemented as session specific temporary tables.
|
|
|
|
*/
|
|
|
|
/*
|
|
|
|
TODO: Fixing the performance issues of I_S will lead
|
|
|
|
to I_S tables in the table cache, which should use
|
|
|
|
this table type.
|
|
|
|
*/
|
|
|
|
TABLE_CATEGORY_INFORMATION=4,
|
|
|
|
|
|
|
|
/**
|
|
|
|
Performance schema tables.
|
|
|
|
These tables are an interface provided by the system
|
|
|
|
to inspect the system performance data.
|
|
|
|
These tables do *not* honor:
|
2007-08-15 15:43:08 +02:00
|
|
|
- LOCK TABLE t FOR READ/WRITE
|
2007-07-27 08:31:06 +02:00
|
|
|
- FLUSH TABLES WITH READ LOCK
|
|
|
|
- SET GLOBAL READ_ONLY = ON
|
|
|
|
as there is no point in locking explicitly
|
|
|
|
a PERFORMANCE_SCHEMA table.
|
|
|
|
Examples of PERFORMANCE_SCHEMA tables are:
|
|
|
|
- mysql.slow_log
|
|
|
|
- mysql.general_log,
|
|
|
|
which *are* updated even when there is either
|
|
|
|
a GLOBAL READ LOCK or a GLOBAL READ_ONLY in effect.
|
|
|
|
User queries do not write directly to these tables
|
|
|
|
(there are exceptions for log tables).
|
|
|
|
The server implementation performs writes.
|
|
|
|
Performance tables are cached in the table cache.
|
|
|
|
*/
|
|
|
|
TABLE_CATEGORY_PERFORMANCE=5
|
|
|
|
};
|
|
|
|
typedef enum enum_table_category TABLE_CATEGORY;
|
|
|
|
|
|
|
|
TABLE_CATEGORY get_table_category(const LEX_STRING *db,
|
|
|
|
const LEX_STRING *name);
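As a reading aid for the category comments above, the sketch below encodes which categories honor FLUSH TABLES WITH READ LOCK and SET GLOBAL READ_ONLY = ON. The helper `honors_global_read_lock()` is hypothetical and is not declared in this header; the enum values are copied from the definition above. The answer for `TABLE_CATEGORY_TEMPORARY` is an assumption, since the comment for that category is only partly visible in this section.

```cpp
#include <cassert>

// Values copied from enum_table_category as declared above.
enum enum_table_category {
  TABLE_CATEGORY_TEMPORARY = 1,
  TABLE_CATEGORY_USER = 2,
  TABLE_CATEGORY_SYSTEM = 3,
  TABLE_CATEGORY_INFORMATION = 4,
  TABLE_CATEGORY_PERFORMANCE = 5
};
typedef enum enum_table_category TABLE_CATEGORY;

// Hypothetical helper (not part of table.h): returns true when tables of the
// given category honor FLUSH TABLES WITH READ LOCK and
// SET GLOBAL READ_ONLY = ON, as stated in the comments on each enumerator.
inline bool honors_global_read_lock(TABLE_CATEGORY category) {
  switch (category) {
    case TABLE_CATEGORY_USER:
    case TABLE_CATEGORY_SYSTEM:
      return true;   // "These tables do honor: ..."
    case TABLE_CATEGORY_INFORMATION:
    case TABLE_CATEGORY_PERFORMANCE:
      return false;  // "These tables do *not* honor: ..."
    case TABLE_CATEGORY_TEMPORARY:
      return false;  // Assumption: session-private tables are unaffected.
  }
  return false;      // Unknown value: treated conservatively in this sketch.
}
```

Keeping the rule in one switch makes the asymmetry visible: log tables fall under `TABLE_CATEGORY_PERFORMANCE`, which is why they can still be written while a global read lock is in effect.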
|
|
|
|
|
2009-11-30 16:55:03 +01:00
|
|
|
|
|
|
|
struct TABLE_share;
|
|
|
|
|
2005-08-12 16:57:19 +02:00
|
|
|
/*
|
|
|
|
This structure is shared between different table objects. There is one
|
|
|
|
instance of table share per one table in the database.
|
|
|
|
*/
|
2005-01-06 12:00:13 +01:00
|
|
|
|
2009-10-14 13:14:58 +02:00
|
|
|
struct TABLE_SHARE
|
2005-01-06 12:00:13 +01:00
|
|
|
{
|
2009-10-14 13:14:58 +02:00
|
|
|
TABLE_SHARE() {} /* Remove gcc warning */
|
2007-07-27 08:31:06 +02:00
|
|
|
|
|
|
|
/** Category of this table. */
|
|
|
|
TABLE_CATEGORY table_category;
|
|
|
|
|
2004-06-25 15:52:01 +02:00
|
|
|
/* hash of field names (contains pointers to elements of field array) */
|
2005-01-06 12:00:13 +01:00
|
|
|
HASH name_hash; /* hash of field names */
|
|
|
|
MEM_ROOT mem_root;
|
2000-07-31 21:29:14 +02:00
|
|
|
TYPELIB keynames; /* Pointers to keynames */
|
|
|
|
TYPELIB fieldnames; /* Pointer to fieldnames */
|
|
|
|
TYPELIB *intervals; /* pointer to interval info */
|
2009-11-30 20:38:25 +01:00
|
|
|
pthread_mutex_t LOCK_ha_data; /* To protect access to ha_data */
|
2009-10-14 13:14:58 +02:00
|
|
|
TABLE_SHARE *next, **prev; /* Link to unused shares */
|
2005-11-23 21:45:02 +01:00
|
|
|
|
2009-11-30 16:55:03 +01:00
|
|
|
/*
|
|
|
|
Doubly-linked (back-linked) lists of used and unused TABLE objects
|
|
|
|
for this share.
|
|
|
|
*/
|
|
|
|
I_P_List <TABLE, TABLE_share> used_tables;
|
|
|
|
I_P_List <TABLE, TABLE_share> free_tables;
|
|
|
|
|
2005-01-06 12:00:13 +01:00
|
|
|
/* The following is copied to each TABLE on OPEN */
|
|
|
|
Field **field;
|
2005-11-23 21:45:02 +01:00
|
|
|
Field **found_next_number_field;
|
|
|
|
Field *timestamp_field; /* Used only during open */
|
2005-01-06 12:00:13 +01:00
|
|
|
KEY *key_info; /* data of keys in database */
|
|
|
|
uint *blob_field; /* Index to blobs in Field array */
|
2005-11-23 21:45:02 +01:00
|
|
|
|
WL#3817: Simplify string / memory area types and make things more consistent (first part)
The following type conversions were done:
- Changed byte to uchar
- Changed gptr to uchar*
- Change my_string to char *
- Change my_size_t to size_t
- Change size_s to size_t
Removed declaration of byte, gptr, my_string, my_size_t and size_s.
The following function parameter changes were done:
- All string functions in mysys/strings were changed to use size_t
instead of uint for string lengths.
- All read()/write() functions changed to use size_t (including vio).
- All protocol functions changed to use size_t instead of uint
- Functions that used a pointer to a string length were changed to use size_t*
- Changed malloc(), free() and related functions from using gptr to use void *
as this requires fewer casts in the code and is more in line with how the
standard functions work.
- Added extra length argument to dirname_part() to return the length of the
created string.
- Changed (at least) following functions to take uchar* as argument:
- db_dump()
- my_net_write()
- net_write_command()
- net_store_data()
- DBUG_DUMP()
- decimal2bin() & bin2decimal()
- Changed my_compress() and my_uncompress() to use size_t. Changed one
argument to my_uncompress() from a pointer to a value as we only return
one value (makes function easier to use).
- Changed type of 'pack_data' argument to packfrm() to avoid casts.
- Changed in readfrm() and writefrm(), ha_discover and handler::discover()
the type for argument 'frmdata' to uchar** to avoid casts.
- Changed most Field functions to use uchar* instead of char* (reduced a lot of
casts).
- Changed field->val_xxx(xxx, new_ptr) to take const pointers.
Other changes:
- Removed a lot of unneeded casts
- Added a few new casts required by other changes
- Added some casts to my_multi_malloc() arguments for safety (as string lengths
need to be uint, not size_t).
- Fixed all calls to hash-get-key functions to use size_t*. (Needed to be done
explicitly as this conflict was often hidden by casting the function to
hash_get_key).
- Changed some buffers to memory regions to uchar* to avoid casts.
- Changed some string lengths from uint to size_t.
- Changed field->ptr to be uchar* instead of char*. This allowed us to
get rid of a lot of casts.
- Some changes from true -> TRUE, false -> FALSE, unsigned char -> uchar
- Include zlib.h in some files as we needed declaration of crc32()
- Changed MY_FILE_ERROR to be (size_t) -1.
- Changed many variables to hold the result of my_read() / my_write() to be
size_t. This was needed to properly detect errors (which are
returned as (size_t) -1).
- Removed some very old VMS code
- Changed packfrm()/unpackfrm() to not be depending on uint size
(portability fix)
- Removed windows specific code to restore cursor position as this
causes slowdown on windows and we should not mix read() and pread()
calls anyway as this is not thread safe. Updated function comment to
reflect this. Changed function that depended on original behavior of
my_pwrite() to itself restore the cursor position (one such case).
- Added some missing checking of return value of malloc().
- Changed definition of MOD_PAD_CHAR_TO_FULL_LENGTH to avoid 'long' overflow.
- Changed type of table_def::m_size from my_size_t to ulong to reflect that
m_size is the number of elements in the array, not a string/memory
length.
- Moved THD::max_row_length() to table.cc (as it's not depending on THD).
Inlined max_row_length_blob() into this function.
- More function comments
- Fixed some compiler warnings when compiled without partitions.
- Removed setting of LEX_STRING() arguments in declaration (portability fix).
- Some trivial indentation/variable name changes.
- Some trivial code simplifications:
- Replaced some calls to alloc_root + memcpy to use
strmake_root()/strdup_root().
- Changed some calls from memdup() to strmake() (Safety fix)
- Simpler loops in client-simple.c
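The change of `MY_FILE_ERROR` to `(size_t) -1`, together with holding `my_read()` / `my_write()` results in `size_t`, is easier to see with a small sketch. The constant below mirrors the definition described above, while `failing_read()` and the two checker functions are hypothetical names used only for illustration. On platforms where `size_t` is wider than `unsigned int`, storing the result in the narrower type loses the error value.

```cpp
#include <cassert>
#include <cstddef>

// Mirrors the definition described above: the error value is (size_t) -1.
static const size_t MY_FILE_ERROR = (size_t) -1;

// Hypothetical read function that always reports failure.
inline size_t failing_read() { return MY_FILE_ERROR; }

// Result held in size_t: the comparison detects the error on every platform.
inline bool error_detected_with_size_t() {
  size_t bytes = failing_read();
  return bytes == MY_FILE_ERROR;
}

// Result held in unsigned int: the value is truncated first, so the
// comparison only succeeds where unsigned int is as wide as size_t.
inline bool error_detected_with_uint() {
  unsigned int bytes = (unsigned int) failing_read();
  return bytes == MY_FILE_ERROR;
}
```

On a typical 64-bit build the second function returns false, which is the kind of missed error that the switch to `size_t` variables is meant to prevent.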
2007-05-10 11:59:39 +02:00
|
|
|
uchar *default_values; /* row with default values */
|
2006-06-29 15:39:34 +02:00
|
|
|
LEX_STRING comment; /* Comment about table */
|
2005-01-06 12:00:13 +01:00
|
|
|
CHARSET_INFO *table_charset; /* Default charset of string fields */
|
|
|
|
|
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that were found necessary while testing the handler changes.
Changes that require code changes in other storage engines.
(Note that all changes are very straightforward and one should find all issues
by compiling a --debug build and fixing all compiler errors and all
asserts in field.cc while running the test suite).
- New optional handler function introduced: reset()
This is called after every DML statement to make it easy for a handler to
do statement-specific cleanups.
(The only case it's not called is if we force the file to be closed)
- handler::extra(HA_EXTRA_RESET) is removed. Code that was there before
should be moved to handler::reset()
- table->read_set contains a bitmap over all columns that are needed
in the query. read_row() and similar functions only needs to read these
columns
- table->write_set contains a bitmap over all columns that will be updated
in the query. write_row() and update_row() only needs to update these
columns.
The above bitmaps should now be up to date in all context
(including ALTER TABLE, filesort()).
The handler is informed of any changes to the bitmap after
fix_fields() by calling the virtual function
handler::column_bitmaps_signal(). If the handler does caching of
these bitmaps (instead of using table->read_set, table->write_set),
it should redo the caching in this code. as the signal() may be sent
several times, it's probably best to set a variable in the signal
and redo the caching on read_row() / write_row() if the variable was
set.
- Removed the read_set and write_set bitmap objects from the handler class
- Removed all column bit handling functions from the handler class.
(Now one instead uses the normal bitmap functions in my_bitmap.c instead
of handler dedicated bitmap functions)
- field->query_id is removed. One should instead check
table->read_set and table->write_set if a field is used in the query.
- handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and
handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now
instead use table->read_set to check for which columns to retrieve.
- If a handler needs to call Field->val() or Field->store() on columns
that are not used in the query, one should install a temporary
all-columns-used map while doing so. For this, we provide the following
functions:
my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set);
field->val();
dbug_tmp_restore_column_map(table->read_set, old_map);
and similar for the write map:
my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set);
field->val();
dbug_tmp_restore_column_map(table->write_set, old_map);
If this is not done, you will sooner or later hit a DBUG_ASSERT
in the field store() / val() functions.
(For non-DBUG binaries, dbug_tmp_use_all_columns() and
dbug_tmp_restore_column_map() are inline dummy functions and should
be optimized away by the compiler).
- If one needs to temporarily set the column map for all binaries (and not
just to avoid the DBUG_ASSERT() in the Field::store() / Field::val()
methods) one should use the functions tmp_use_all_columns() and
tmp_restore_column_map() instead of the above dbug_ variants.
- All 'status' fields in the handler base class (like records,
data_file_length etc) are now stored in a 'stats' struct. This makes
it easier to know what status variables are provided by the base
handler. This requires some trivial variable name changes in the extra()
function.
- New virtual function handler::records(). This is called to optimize
COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS) is true.
(stats.records is not supposed to be an exact value. It only has to
be 'reasonable enough' for the optimizer to be able to choose a good
optimization path).
- Non virtual handler::init() function added for caching of virtual
constants from engine.
- Removed has_transactions() virtual method. Now one should instead return
HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support
transactions.
- The 'xxxx_create_handler()' function now has a 'MEM_ROOT *mem_root' argument
that is to be used with 'new handler_name()' to allocate the handler
in the right area. The xxxx_create_handler() function is also
responsible for any initialization of the object before returning.
For example, one should change:
static handler *myisam_create_handler(TABLE_SHARE *table)
{
return new ha_myisam(table);
}
->
static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root)
{
return new (mem_root) ha_myisam(table);
}
- New optional virtual function: use_hidden_primary_key().
This is called in case of an update/delete when
(table_flags() & HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is set
but we don't have a primary key. This allows the handler to take precautions
in remembering any hidden primary key to be able to update/delete any
found row. The default handler marks all columns to be read.
- handler::table_flags() now returns a ulonglong (to allow for more flags).
- New/changed table_flags()
- HA_HAS_RECORDS Set if ::records() is supported
- HA_NO_TRANSACTIONS Set if engine doesn't support transactions
- HA_PRIMARY_KEY_REQUIRED_FOR_DELETE
Set if we should mark all primary key columns for
read when reading rows as part of a DELETE
statement. If there is no primary key,
all columns are marked for read.
- HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some
cases (based on table->read_set)
- HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS
Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION.
- HA_DUPP_POS Renamed to HA_DUPLICATE_POS
- HA_REQUIRES_KEY_COLUMNS_FOR_DELETE
Set this if we should mark ALL key columns for
read when reading rows as part of a DELETE
statement. In case of an update we will mark
all keys for read for which key part changed
value.
- HA_STATS_RECORDS_IS_EXACT
Set this if stats.records is exact.
(This saves us some extra records() calls
when optimizing COUNT(*))
- Removed table_flags()
- HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if
handler::records() gives an exact count() and
HA_STATS_RECORDS_IS_EXACT if stats.records is exact.
- HA_READ_RND_SAME Removed (no one supported this one)
- Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk()
- Renamed handler::dupp_pos to handler::dup_pos
- Removed not used variable handler::sortkey
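The temporary all-columns map described in the list above (install a map with every column marked, access the field, then restore the previous map) can be sketched in isolation. This is a simplified model and not the server API: `ColumnMap`, `SketchTable`, `sketch_use_all_columns()`, `sketch_restore_column_map()` and `column_readable()` are hypothetical names, and `std::vector<bool>` stands in for the real bitmap type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<bool> ColumnMap;  // hypothetical stand-in for a column bitmap

// Hypothetical, heavily simplified table: one default read map, one map with
// every column set, and a pointer to whichever map is currently active.
struct SketchTable {
  ColumnMap all_set;        // every column marked as used
  ColumnMap def_read_set;   // columns the current query actually reads
  ColumnMap *read_set;      // active read map

  explicit SketchTable(size_t columns)
      : all_set(columns, true),
        def_read_set(columns, false),
        read_set(&def_read_set) {}
  SketchTable(const SketchTable &) = delete;
  SketchTable &operator=(const SketchTable &) = delete;
};

// Installs the all-columns map and returns the previously active map.
inline ColumnMap *sketch_use_all_columns(SketchTable &table) {
  ColumnMap *old_map = table.read_set;
  table.read_set = &table.all_set;
  return old_map;
}

// Restores the map that was returned by sketch_use_all_columns().
inline void sketch_restore_column_map(SketchTable &table, ColumnMap *old_map) {
  table.read_set = old_map;
}

// Models the check made before reading a field: the column's bit must be set.
inline bool column_readable(const SketchTable &table, size_t column) {
  return (*table.read_set)[column];
}
```

The save-and-restore shape matters because the caller's original map, and therefore the optimizer's view of which columns the query needs, is unchanged once the temporary access is finished.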
Upper level handler changes:
- ha_reset() now does some overall checks and calls ::reset()
- ha_table_flags() added. This is a cached version of table_flags(). The
cache is updated on engine creation time and updated on open.
MySQL level changes (not obvious from the above):
- DBUG_ASSERT() added to check that column usage matches what is set
in the column usage bit maps. (This found a LOT of bugs in current
column marking code).
- Previously in 5.1, all used columns were marked in read_set and only updated
columns were marked in write_set. Now we only mark columns for which we
need a value in read_set.
- Column bitmaps are created in open_binary_frm() and open_table_from_share().
(Before this was in table.cc)
- handler::table_flags() calls are replaced with handler::ha_table_flags()
- For calling field->val() you must have the corresponding bit set in
table->read_set. For calling field->store() you must have the
corresponding bit set in table->write_set. (There are asserts in
all store()/val() functions to catch wrong usage)
- thd->set_query_id is renamed to thd->mark_used_columns and instead
of setting this to an integer value, this has now the values:
MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE
Changed also all variables named 'set_query_id' to mark_used_columns.
- In filesort() we now inform the handler of exactly which columns are needed
doing the sort and choosing the rows.
- The TABLE_SHARE object has a 'all_set' column bitmap one can use
when one needs a column bitmap with all columns set.
(This is used for table->use_all_columns() and other places)
- The TABLE object has 3 column bitmaps:
- def_read_set Default bitmap for columns to be read
- def_write_set Default bitmap for columns to be written
- tmp_set Can be used as a temporary bitmap when needed.
The table object also has two pointers to bitmaps, read_set and write_set,
that the handler should use to find out which columns are used in which way.
- count() optimization now calls handler::records() instead of using
handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true).
- Added extra argument to Item::walk() to indicate if we should also
traverse sub queries.
- Added TABLE parameter to cp_buffer_from_ref()
- Don't close tables created with CREATE ... SELECT but keep them in
the table cache. (Faster usage of newly created tables).
New interfaces:
- table->clear_column_bitmaps() to initialize the bitmaps for tables
at start of new statements.
- table->column_bitmaps_set() to set up new column bitmaps and signal
the handler about this.
- table->column_bitmaps_set_no_signal() for the few cases where we need
to set up new column bitmaps but don't signal the handler (as the handler
has already been signaled about these before). Used for the moment
only in opt_range.cc when doing ROR scans.
- table->use_all_columns() to install a bitmap where all columns are marked
as use in the read and the write set.
- table->default_column_bitmaps() to install the normal read and write
column bitmaps, but not signaling the handler about this.
This is mainly used when creating TABLE instances.
- table->mark_columns_needed_for_delete(),
table->mark_columns_needed_for_update() and
table->mark_columns_needed_for_insert() to allow us to put additional
columns in column usage maps if the handler so requires.
(The handler indicates what it needs in handler->table_flags())
- table->prepare_for_position() to allow us to tell handler that it
needs to read primary key parts to be able to store them in
future table->position() calls.
(This replaces the table->file->ha_retrieve_all_pk function)
- table->mark_auto_increment_column() to tell the handler that we are going to
update columns that are part of any auto_increment key.
- table->mark_columns_used_by_index() to mark all columns that are part of
an index. It will also send extra(HA_EXTRA_KEYREAD) to the handler to allow
it to quickly know that it only needs to read columns that are part
of the key. (The handler can also use the column map for detecting this,
but a simpler/faster handler can just monitor the extra() call).
- table->mark_columns_used_by_index_no_reset() to, in addition to other columns,
also mark all columns that are used by the given key.
- table->restore_column_maps_after_mark_index() to restore to default
column maps after a call to table->mark_columns_used_by_index().
- New item function register_field_in_read_map(), for marking used columns
in table->read_map. Used by filesort() to mark all used columns
- Maintain in TABLE->merge_keys set of all keys that are used in query.
(Simplifies some optimization loops)
- Maintain Field->part_of_key_not_clustered which is like Field->part_of_key
but the field in the clustered key is not assumed to be part of all indexes.
(used in opt_range.cc for faster loops)
- dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map(),
tmp_use_all_columns() and tmp_restore_column_map() functions to temporarily
mark all columns as usable. The 'dbug_' versions are primarily intended for use
inside a handler when it just wants to call the Field::store() & Field::val()
functions, but doesn't need the column maps set for any other usage.
(i.e. bitmap_is_set() is never called)
- We can't use compare_records() to skip updates for handlers that return
a partial column set when the read_set doesn't cover all columns in the
write set. The reason for this is that if we have a column marked only for
write we can't at the MySQL level know if the value changed or not.
The reason this worked before was that MySQL marked all to-be-written
columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden
bug'.
- open_table_from_share() no longer sets up a temporary MEM_ROOT
object as a thread specific variable for the handler. Instead we
send the to-be-used MEM_ROOT to get_new_handler().
(Simpler, faster code)
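The column bitmap interfaces listed above can be illustrated with a small, self-contained sketch. It is not the server code: ToyShare, ToyTable, ColumnMap and the toy_ functions are hypothetical stand-ins (std::bitset replaces MY_BITMAP), and the sketch only assumes what the message states, namely that the share holds an all_set map, that the table holds default read/write maps plus read_set/write_set pointers, and that the tmp_ helpers swap a pointer and later restore it.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins; not the server's MY_BITMAP, TABLE_SHARE or TABLE.
constexpr std::size_t kMaxColumns = 64;
using ColumnMap = std::bitset<kMaxColumns>;

struct ToyShare {
  ColumnMap all_set;  // every column of the table is marked
  explicit ToyShare(std::size_t fields) {
    assert(fields <= kMaxColumns);
    for (std::size_t i = 0; i < fields; ++i) all_set.set(i);
  }
};

struct ToyTable {
  ToyShare *s;
  ColumnMap def_read_set, def_write_set;  // default per-table maps
  ColumnMap *read_set, *write_set;        // the maps a handler would consult

  explicit ToyTable(ToyShare *share)
      : s(share), read_set(&def_read_set), write_set(&def_write_set) {}
  ToyTable(const ToyTable &) = delete;  // pointers refer to our own members
  ToyTable &operator=(const ToyTable &) = delete;

  // Start of a new statement: empty default maps, pointers back to defaults.
  void clear_column_bitmaps() {
    def_read_set.reset();
    def_write_set.reset();
    read_set = &def_read_set;
    write_set = &def_write_set;
  }

  // Install the share's all-columns map for both reading and writing.
  void use_all_columns() {
    read_set = &s->all_set;
    write_set = &s->all_set;
  }
};

// Temporarily point one map at all_set and hand back the previous map.
inline ColumnMap *toy_tmp_use_all_columns(ToyTable *table, ColumnMap **map) {
  ColumnMap *old_map = *map;
  *map = &table->s->all_set;
  return old_map;
}

// Undo toy_tmp_use_all_columns().
inline void toy_tmp_restore_column_map(ColumnMap **map, ColumnMap *old_map) {
  *map = old_map;
}
```

Swapping a pointer rather than copying bits keeps the temporary switch cheap, which is presumably why the message describes install/restore pairs instead of bitmap copies.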
Bugs fixed:
- Column marking was not done correctly in a lot of cases.
(ALTER TABLE, when using triggers, auto_increment fields etc)
(Could potentially result in wrong values inserted in table handlers
relying on that the old column maps or field->set_query_id was correct)
Especially when it comes to triggers, there may be cases where the
old code would cause lost/wrong values for NDB and/or InnoDB tables.
- Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags:
OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG.
This allowed me to remove some wrong warnings about:
"Some non-transactional changed tables couldn't be rolled back"
- Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset
(thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to lose
some warnings about
"Some non-transactional changed tables couldn't be rolled back"
- Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table()
which could cause delete_table to report random failures.
- Fixed core dumps for some tests when running with --debug
- Added missing FN_LIBCHAR in mysql_rm_tmp_tables()
(This has probably caused us to not properly remove temporary files after
crash)
- slow_logs was not properly initialized, which could possibly cause
extra/lost entries in the slow log.
- If we get a duplicate row on insert, change column map to read and
write all columns while retrying the operation. This is required by
the definition of REPLACE and also ensures that fields that are only
part of UPDATE are properly handled. This fixed a bug in NDB and
REPLACE where REPLACE wrongly copied some column values from the replaced
row.
- For table handlers that don't support NULL in keys, we would give an error
when creating a primary key with NULL fields, even after the fields had been
automatically converted to NOT NULL.
- Creating a primary key on a SPATIAL key, would fail if field was not
declared as NOT NULL.
Cleanups:
- Removed unused condition argument to setup_tables
- Removed unneeded item function reset_query_id_processor().
- Field->add_index is removed. Now this is instead maintained in
(field->flags & FIELD_IN_ADD_INDEX)
- Field->fieldnr is removed (use field->field_index instead)
- New argument to filesort() to indicate that it should return a set of
row pointers (not used columns). This allowed me to remove some references
to sql_command in filesort and should also enable us to return column
results in some cases where we couldn't before.
- Changed column bitmap handling in opt_range.cc to be aligned with TABLE
bitmap, which allowed me to use bitmap functions instead of looping over
all fields to create some needed bitmaps. (Faster and smaller code)
- Broke up lines that were too long
- Moved some variable declarations to the start of functions for better code
readability.
- Removed some unused arguments from functions.
(setup_fields(), mysql_prepare_insert_check_table())
- setup_fields() now takes an enum instead of an int for marking columns
usage.
- For internal temporary tables, use handler::write_row(),
handler::delete_row() and handler::update_row() instead of
handler::ha_xxxx() for faster execution.
- Changed some constants to enum's and define's.
- Using separate column read and write sets allows for easier checking
of whether a timestamp field was set by the statement.
- Remove calls to free_io_cache() as this is now done automatically in ha_reset()
- Don't build table->normalized_path as this is now identical to table->path
(after bar's fixes to convert filenames)
- Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to
do comparison with the 'convert-dbug-for-diff' tool.
Things left to do in 5.1:
- We wrongly log failed CREATE TABLE ... SELECT in some cases when using
row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result)
Mats has promised to look into this.
- Test that my fix for CREATE TABLE ... SELECT is indeed correct.
(I added several test cases for this, but in this case it's better that
someone else also tests this thoroughly).
Lars has promised to do this.
2006-06-04 17:52:22 +02:00
|
|
|
MY_BITMAP all_set;
|
2006-08-21 17:02:11 +02:00
|
|
|
/*
  Key which is used for looking-up table in table cache and in the list
  of thread's temporary tables. Has the form of:
    "database_name\0table_name\0" + optional part for temporary tables.

  Note that all three 'table_cache_key', 'db' and 'table_name' members
  must be set (and be non-zero) for tables in table cache. They also
  should correspond to each other.
  To ensure this one can use set_table_cache() methods.
*/
|
2005-11-23 21:45:02 +01:00
|
|
|
LEX_STRING table_cache_key;
|
|
|
|
LEX_STRING db; /* Pointer to db */
|
|
|
|
LEX_STRING table_name; /* Table name (for open) */
|
|
|
|
LEX_STRING path; /* Path to .frm file (from datadir) */
|
|
|
|
LEX_STRING normalized_path; /* unpack_filename(path) */
|
2005-09-13 03:02:17 +02:00
|
|
|
LEX_STRING connect_string;
|
2007-01-29 15:07:11 +01:00
|
|
|
|
|
|
|
/*
  Set of keys in use, implemented as a Bitmap.
  Excludes keys disabled by ALTER TABLE ... DISABLE KEYS.
*/
|
|
|
|
key_map keys_in_use;
|
2005-01-06 12:00:13 +01:00
|
|
|
key_map keys_for_keyread;
|
2005-11-23 21:45:02 +01:00
|
|
|
ha_rows min_rows, max_rows; /* create information */
|
2005-01-06 12:00:13 +01:00
|
|
|
ulong avg_row_length; /* create information */
|
|
|
|
ulong raid_chunksize;
|
2007-02-28 22:25:50 +01:00
|
|
|
ulong version, mysql_version;
|
2005-01-06 12:00:13 +01:00
|
|
|
ulong timestamp_offset; /* Set to offset+1 of record */
|
|
|
|
ulong reclength; /* Recordlength */
|
|
|
|
|
2007-03-02 17:43:45 +01:00
|
|
|
plugin_ref db_plugin; /* storage engine plugin */
|
|
|
|
inline handlerton *db_type() const /* table_type for handler */
{
  // DBUG_ASSERT(db_plugin);
  return db_plugin ? plugin_data(db_plugin, handlerton*) : NULL;
}
|
2000-07-31 21:29:14 +02:00
|
|
|
enum row_type row_type; /* How rows are stored */
|
2005-01-06 12:00:13 +01:00
|
|
|
enum tmp_table_type tmp_table;
|
2009-08-12 15:11:06 +02:00
|
|
|
enum enum_ha_unused unused1;
|
|
|
|
enum enum_ha_unused unused2;
|
2005-01-06 12:00:13 +01:00
|
|
|
|
2005-11-23 21:45:02 +01:00
|
|
|
uint ref_count; /* How many TABLE objects uses this */
|
|
|
|
uint open_count; /* Number of tables in open list */
|
2005-01-06 12:00:13 +01:00
|
|
|
uint blob_ptr_size; /* 4 or 8 */
|
2006-05-03 14:59:17 +02:00
|
|
|
uint key_block_size; /* create key_block_size, if used */
|
2005-10-04 17:04:20 +02:00
|
|
|
uint null_bytes, last_null_bit_pos;
|
2005-01-06 12:00:13 +01:00
|
|
|
uint fields; /* Number of fields */
|
|
|
|
uint rec_buff_length; /* Size of table->record[] buffer */
|
|
|
|
uint keys, key_parts;
|
|
|
|
uint max_key_length, max_unique_length, total_key_length;
|
|
|
|
uint uniques; /* Number of UNIQUE index */
|
|
|
|
uint null_fields; /* number of null fields */
|
|
|
|
uint blob_fields; /* number of blob fields */
|
2005-11-23 21:45:02 +01:00
|
|
|
uint timestamp_field_offset; /* Field number for timestamp field */
|
2005-01-12 02:38:53 +01:00
|
|
|
uint varchar_fields; /* number of varchar fields */
|
2000-07-31 21:29:14 +02:00
|
|
|
uint db_create_options; /* Create options from database */
|
|
|
|
uint db_options_in_use; /* Options in use */
|
|
|
|
uint db_record_offset; /* if HA_REC_IN_SEQ */
|
2005-01-06 12:00:13 +01:00
|
|
|
uint raid_type, raid_chunks;
|
2005-11-23 21:45:02 +01:00
|
|
|
uint rowid_field_offset; /* Field_nr +1 to rowid field */
|
2005-01-06 12:00:13 +01:00
|
|
|
/* Index of auto-updated TIMESTAMP field in field array */
|
|
|
|
uint primary_key;
|
2007-03-17 00:13:25 +01:00
|
|
|
uint next_number_index; /* autoincrement key number */
|
|
|
|
uint next_number_key_offset; /* autoinc keypart offset in a key */
|
|
|
|
uint next_number_keypart; /* autoinc keypart number in a key */
|
2005-11-23 21:45:02 +01:00
|
|
|
uint error, open_errno, errarg; /* error from open_table_def() */
|
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that were found necessary while testing the handler changes
Changes that require code changes in other storage engines.
(Note that all changes are very straightforward and one should find all issues
by compiling a --debug build and fixing all compiler errors and all
asserts in field.cc while running the test suite):
- New optional handler function introduced: reset()
This is called after every DML statement to make it easy for a handler to
do statement specific cleanups.
(The only case it's not called is if we force the file to be closed)
- handler::extra(HA_EXTRA_RESET) is removed. Code that was there before
should be moved to handler::reset()
- table->read_set contains a bitmap over all columns that are needed
in the query. read_row() and similar functions only needs to read these
columns
- table->write_set contains a bitmap over all columns that will be updated
in the query. write_row() and update_row() only needs to update these
columns.
The above bitmaps should now be up to date in all context
(including ALTER TABLE, filesort()).
The handler is informed of any changes to the bitmap after
fix_fields() by calling the virtual function
handler::column_bitmaps_signal(). If the handler does caching of
these bitmaps (instead of using table->read_set, table->write_set),
it should redo the caching in this code. As the signal() may be sent
several times, it's probably best to set a variable in the signal
and redo the caching on read_row() / write_row() if the variable was
set.
- Removed the read_set and write_set bitmap objects from the handler class
- Removed all column bit handling functions from the handler class.
(Now one instead uses the normal bitmap functions in my_bitmap.c instead
of handler dedicated bitmap functions)
- field->query_id is removed. One should instead check
table->read_set and table->write_set if a field is used in the query.
- handler::extra(HA_EXTRA_RETRIEVE_ALL_COLS) and
handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now
instead use table->read_set to check for which columns to retrieve.
- If a handler needs to call Field->val() or Field->store() on columns
that are not used in the query, one should install a temporary
all-columns-used map while doing so. For this, we provide the following
functions:
my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set);
field->val();
dbug_tmp_restore_column_map(table->read_set, old_map);
and similarly for the write map:
my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set);
field->store();
dbug_tmp_restore_column_map(table->write_set, old_map);
If this is not done, you will sooner or later hit a DBUG_ASSERT
in the field store() / val() functions.
(For non-DBUG binaries, dbug_tmp_use_all_columns() and
dbug_tmp_restore_column_map() are inline dummy functions and should
be optimized away by the compiler).
- If one needs to temporarily set the column map for all binaries (and not
just to avoid the DBUG_ASSERT() in the Field::store() / Field::val()
methods) one should use the functions tmp_use_all_columns() and
tmp_restore_column_map() instead of the above dbug_ variants.
- All 'status' fields in the handler base class (like records,
data_file_length etc) are now stored in a 'stats' struct. This makes
it easier to know what status variables are provided by the base
handler. This requires some trivial variable name changes in the extra()
function.
- New virtual function handler::records(). This is called to optimize
COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS) is true.
(stats.records is not supposed to be an exact value. It only has to
be 'reasonable enough' for the optimizer to be able to choose a good
optimization path).
- Non virtual handler::init() function added for caching of virtual
constants from engine.
- Removed has_transactions() virtual method. Now one should instead return
HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support
transactions.
- The 'xxxx_create_handler()' function now has a MEM_ROOT *mem_root argument
that is to be used with 'new handler_name()' to allocate the handler
in the right area. The xxxx_create_handler() function is also
responsible for any initialization of the object before returning.
For example, one should change:
static handler *myisam_create_handler(TABLE_SHARE *table)
{
return new ha_myisam(table);
}
->
static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root)
{
return new (mem_root) ha_myisam(table);
}
- New optional virtual function: use_hidden_primary_key().
This is called in case of an update/delete when
(table_flags() & HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is set
but we don't have a primary key. This allows the handler to take precautions
in remembering any hidden primary key to be able to update/delete any
found row. The default handler marks all columns to be read.
- handler::table_flags() now returns a ulonglong (to allow for more flags).
- New/changed table_flags()
- HA_HAS_RECORDS Set if ::records() is supported
- HA_NO_TRANSACTIONS Set if engine doesn't support transactions
- HA_PRIMARY_KEY_REQUIRED_FOR_DELETE
Set if we should mark all primary key columns for
read when reading rows as part of a DELETE
statement. If there is no primary key,
all columns are marked for read.
- HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some
cases (based on table->read_set)
- HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS
Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION.
- HA_DUPP_POS Renamed to HA_DUPLICATE_POS
- HA_REQUIRES_KEY_COLUMNS_FOR_DELETE
Set this if we should mark ALL key columns for
read when reading rows as part of a DELETE
statement. In case of an update we will mark
all keys for read for which key part changed
value.
- HA_STATS_RECORDS_IS_EXACT
Set this if stats.records is exact.
(This saves us some extra records() calls
when optimizing COUNT(*))
- Removed table_flags()
- HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if
handler::records() gives an exact count() and
HA_STATS_RECORDS_IS_EXACT if stats.records is exact.
- HA_READ_RND_SAME Removed (no one supported this one)
- Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk()
- Renamed handler::dupp_pos to handler::dup_pos
- Removed not used variable handler::sortkey
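The advice above for handlers that cache the column bitmaps (set a variable in column_bitmaps_signal(), redo the caching on the next read_row() / write_row()) can be sketched as follows. This is only an illustration under stated assumptions: ToyHandler and ColumnMap are hypothetical, std::bitset stands in for the server bitmap type, and the real handler class has a much larger interface.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a column bitmap; not the server's MY_BITMAP.
constexpr std::size_t kMaxColumns = 64;
using ColumnMap = std::bitset<kMaxColumns>;

class ToyHandler {
 public:
  ToyHandler(const ColumnMap *read_set, std::size_t fields)
      : read_set_(read_set), fields_(fields) {
    assert(fields <= kMaxColumns);
  }

  // May be called several times per statement, so it only sets a flag.
  void column_bitmaps_signal() { cache_stale_ = true; }

  // Rebuilds the cached column list at most once per batch of signals,
  // then reports how many columns this read would fetch.
  std::size_t read_row() {
    if (cache_stale_) rebuild_cache();
    return columns_to_read_.size();
  }

  std::size_t rebuild_count() const { return rebuild_count_; }

 private:
  void rebuild_cache() {
    columns_to_read_.clear();
    for (std::size_t i = 0; i < fields_; ++i)
      if (read_set_->test(i)) columns_to_read_.push_back(i);
    cache_stale_ = false;
    ++rebuild_count_;
  }

  const ColumnMap *read_set_;
  std::size_t fields_;
  std::vector<std::size_t> columns_to_read_;
  bool cache_stale_ = true;  // nothing has been cached yet
  std::size_t rebuild_count_ = 0;
};
```

Two signals followed by one read cause a single rebuild here, which is the point of deferring the work to the row call.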
Upper level handler changes:
- ha_reset() now does some overall checks and calls ::reset()
- ha_table_flags() added. This is a cached version of table_flags(). The
cache is updated on engine creation time and updated on open.
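As a rough illustration of the cached flags and of how HA_HAS_RECORDS and HA_STATS_RECORDS_IS_EXACT might steer a COUNT(*) optimization, consider the sketch below. Everything in it is an assumption made for the example: the TOY_ bit values, the ToyEngine class and the choose_count_source() helper are hypothetical, and falling back to a full scan when neither flag is set is a guess at the remaining case, not something the message states.

```cpp
#include <cassert>
#include <cstdint>

// The message says table_flags() now returns a ulonglong; a 64-bit
// unsigned type stands in for it here. Bit values are made up.
using TableFlags = std::uint64_t;
constexpr TableFlags TOY_HA_HAS_RECORDS = 1ULL << 0;
constexpr TableFlags TOY_HA_STATS_RECORDS_IS_EXACT = 1ULL << 1;

class ToyEngine {
 public:
  explicit ToyEngine(TableFlags flags) : flags_(flags) {}
  virtual ~ToyEngine() = default;

  // Virtual and possibly costly in a real engine; counted for illustration.
  virtual TableFlags table_flags() const {
    ++table_flags_calls;
    return flags_;
  }

  // Non-virtual init(): cache the virtual "constant" once.
  void init() { cached_flags_ = table_flags(); }

  // Cached version of table_flags().
  TableFlags ha_table_flags() const { return cached_flags_; }

  mutable int table_flags_calls = 0;

 private:
  TableFlags flags_;
  TableFlags cached_flags_ = 0;
};

enum class CountSource { StatsRecords, RecordsCall, FullScan };

// Decide where a COUNT(*) result could come from, using only cached flags.
inline CountSource choose_count_source(const ToyEngine &engine) {
  const TableFlags flags = engine.ha_table_flags();
  if (flags & TOY_HA_STATS_RECORDS_IS_EXACT) return CountSource::StatsRecords;
  if (flags & TOY_HA_HAS_RECORDS) return CountSource::RecordsCall;
  return CountSource::FullScan;  // assumed fallback
}
```

Checking the exact-stats flag first mirrors the note that HA_STATS_RECORDS_IS_EXACT saves extra records() calls.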
MySQL level changes (not obvious from the above):
- DBUG_ASSERT() added to check that column usage matches what is set
in the column usage bit maps. (This found a LOT of bugs in current
column marking code).
- In 5.1 before, all used columns were marked in read_set and only updated
columns were marked in write_set. Now we only mark columns for which we
need a value in read_set.
- Column bitmaps are created in open_binary_frm() and open_table_from_share().
(Before this was in table.cc)
- handler::table_flags() calls are replaced with handler::ha_table_flags()
- For calling field->val() you must have the corresponding bit set in
table->read_set. For calling field->store() you must have the
corresponding bit set in table->write_set. (There are asserts in
all store()/val() functions to catch wrong usage)
- thd->set_query_id is renamed to thd->mark_used_columns and instead
of setting this to an integer value, this has now the values:
MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE
Changed also all variables named 'set_query_id' to mark_used_columns.
- In filesort() we now inform the handler of exactly which columns are needed
doing the sort and choosing the rows.
- The TABLE_SHARE object has a 'all_set' column bitmap one can use
when one needs a column bitmap with all columns set.
(This is used for table->use_all_columns() and other places)
- The TABLE object has 3 column bitmaps:
- def_read_set Default bitmap for columns to be read
- def_write_set Default bitmap for columns to be written
- tmp_set Can be used as a temporary bitmap when needed.
The table object also has two pointers to bitmaps, read_set and write_set,
that the handler should use to find out which columns are used in which way.
2006-06-04 17:52:22 +02:00
|
|
|
uint column_bitmap_size;
|
2005-11-23 21:45:02 +01:00
|
|
|
uchar frm_version;
|
|
|
|
bool null_field_first;
|
|
|
|
bool system; /* Set if system table (one record) */
|
|
|
|
bool crypted; /* If .frm file is crypted */
|
|
|
|
bool db_low_byte_first; /* Portable row format */
|
|
|
|
bool crashed;
|
|
|
|
bool is_view;
|
2005-12-22 06:39:02 +01:00
|
|
|
ulong table_map_id; /* for row-based replication */
|
|
|
|
ulonglong table_map_version;
|
2006-03-17 18:11:07 +01:00
|
|
|
|
|
|
|
/*
  Cache for row-based replication table share checks that does not
  need to be repeated. Possible values are: -1 when cache value is
  not calculated yet, 0 when table *shall not* be replicated, 1 when
  table *may* be replicated.
*/
|
|
|
|
int cached_row_logging_check;
|
|
|
|
|
2005-11-23 21:45:02 +01:00
|
|
|
#ifdef WITH_PARTITION_STORAGE_ENGINE
|
Bug#38804: Query deadlock causes all tables to be inaccessible.
Problem was a mutex added in bug n 27405 for solving a problem
with auto_increment in partitioned innodb tables.
(in ha_partition::write_row over partitions file->ha_write_row)
Solution is to use the patch for bug#33479, which refines the
usage of mutexes for auto_increment.
Backport of bug-33479 from 6.0:
Bug-33479: auto_increment failures in partitioning
Several problems with auto_increment in partitioning
(with MyISAM, InnoDB. Locking issues, not handling
multi-row INSERTs properly etc.)
Changed the auto_increment handling for partitioning:
Added a ha_data variable in table_share for storage engine specific data
such as auto_increment value handling in partitioning, also see WL 4305
and using the ha_data->mutex to lock around read + update.
The idea is this:
Store the table's reserved auto_increment value in
the TABLE_SHARE and use a mutex to lock it, read and update it,
and unlock it, in one block. Only access all partitions
when it is not initialized.
Also allow reservations of ranges, and if no one has done a reservation
afterwards, lower the reservation to what was actually used after
the statement is done (via release_auto_increment from WL 3146).
The lock is kept from the first reservation if it is statement based
replication and a multi-row INSERT statement where the number of
candidate rows to insert is not known in advance (like INSERT SELECT,
LOAD DATA, unlike INSERT VALUES (row1), (row2),,(rowN)).
This should also lead to better concurrency (no need to have a mutex
protection around write_row in all cases)
and work with any local storage engine.
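The reservation idea described above can be sketched in a few lines. This is a simplified model, not the partitioning code: ToyAutoIncShare, Reservation, reserve_range() and release_unused() are hypothetical names, the partition scan is passed in as a callback, and the statement-based-replication case (keeping the lock for the whole statement) is not modelled.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <mutex>

// Hypothetical share-level state; the real data lives behind ha_data.
struct ToyAutoIncShare {
  std::mutex mutex;
  std::uint64_t next_value = 1;  // first value nobody has reserved yet
  bool initialized = false;
};

struct Reservation {
  std::uint64_t first;  // first reserved value
  std::uint64_t count;  // number of reserved values
};

// Lock, read and update the shared counter in one block. The (expensive)
// scan over all partitions runs only while the share is uninitialized.
inline Reservation reserve_range(
    ToyAutoIncShare &share, std::uint64_t wanted,
    const std::function<std::uint64_t()> &max_value_in_partitions) {
  std::lock_guard<std::mutex> guard(share.mutex);
  if (!share.initialized) {
    share.next_value = max_value_in_partitions() + 1;
    share.initialized = true;
  }
  Reservation r{share.next_value, wanted};
  share.next_value += wanted;
  return r;
}

// After the statement: if nobody reserved after us, lower the reservation
// to what was actually used.
inline void release_unused(ToyAutoIncShare &share, const Reservation &r,
                           std::uint64_t used) {
  std::lock_guard<std::mutex> guard(share.mutex);
  if (used <= r.count && share.next_value == r.first + r.count)
    share.next_value = r.first + used;
}
```

The mutex is held only around the counter update, which matches the stated goal of not needing mutex protection around write_row in all cases.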
2008-09-08 15:30:01 +02:00
|
|
|
/** @todo: Move into *ha_data for partitioning */
|
2006-05-10 18:53:40 +02:00
|
|
|
bool auto_partitioned;
|
2007-03-27 19:09:56 +02:00
|
|
|
const char *partition_info;
|
2005-11-23 21:45:02 +01:00
|
|
|
uint partition_info_len;
|
2007-06-01 16:44:09 +02:00
|
|
|
uint partition_info_buffer_size;
|
2007-03-27 19:09:56 +02:00
|
|
|
const char *part_state;
|
2006-01-17 08:40:00 +01:00
|
|
|
uint part_state_len;
|
2005-12-21 19:18:40 +01:00
|
|
|
handlerton *default_part_db_type;
|
2005-11-23 21:45:02 +01:00
|
|
|
#endif
|
2006-08-21 17:02:11 +02:00
|
|
|
|
2008-09-08 15:30:01 +02:00
|
|
|
/** place to store storage engine specific data */
|
|
|
|
void *ha_data;
|
Bug #37433 Deadlock between open_table, close_open_tables,
get_table_share, drop_open_table
In the partition handler code, LOCK_open and share->LOCK_ha_data
are acquired in the wrong order in certain cases. When doing a
multi-row INSERT (i.e. an INSERT..SELECT) in a table with auto-increment
column(s), the increments must be in a monotonically
continuous increasing sequence (i.e. it can't have "holes"). To
achieve this, a lock is held for the duration of the operation.
share->LOCK_ha_data was used for this purpose.
Whenever there was a need to open a view _during_ the operation
(views are not currently pre-opened the way tables are), and
LOCK_open was grabbed, a deadlock could occur, because share->LOCK_ha_data
is in other places taken _while_ holding LOCK_open.
A new mutex was introduced in the HA_DATA_PARTITION structure,
for exclusive use of the autoincrement data fields, so we don't
need to overload the use of LOCK_ha_data here.
A module test case has not been supplied, since the problem occurs
as a result of a race condition, and testing for this condition
is thus not deterministic. Testing for it could be done by
setting up a test case as described in the bug report.
2009-10-15 13:07:04 +02:00
  void (*ha_data_destroy)(void *); /* An optional destructor for ha_data */
Bug#38804: Query deadlock causes all tables to be inaccessible.
Problem was a mutex added in bug n 27405 for solving a problem
with auto_increment in partitioned innodb tables.
(in ha_partition::write_row over partitions file->ha_write_row)
Solution is to use the patch for bug#33479, which refines the
usage of mutexes for auto_increment.
Backport of bug-33479 from 6.0:
Bug-33479: auto_increment failures in partitioning
Several problems with auto_increment in partitioning
(with MyISAM, InnoDB. Locking issues, not handling
multi-row INSERTs properly etc.)
Changed the auto_increment handling for partitioning:
Added a ha_data variable in table_share for storage engine specific data
such as auto_increment value handling in partitioning, also see WL 4305
and using the ha_data->mutex to lock around read + update.
The idea is this:
Store the table's reserved auto_increment value in
the TABLE_SHARE and use a mutex to lock it, read and update it,
and unlock it, all in one block. All partitions are accessed only
when the value is not yet initialized.
Also allow reservations of ranges, and if no one has done a reservation
afterwards, lower the reservation to what was actually used after
the statement is done (via release_auto_increment from WL 3146).
The lock is kept from the first reservation if it is statement-based
replication and a multi-row INSERT statement where the number of
candidate rows to insert is not known in advance (like INSERT SELECT,
LOAD DATA, unlike INSERT VALUES (row1), (row2), ..., (rowN)).
This should also lead to better concurrency (no need to have a mutex
protection around write_row in all cases)
and work with any local storage engine.
2008-09-08 15:30:01 +02:00
|
|
|
|
2006-08-21 17:02:11 +02:00
  /*
    Set share's table cache key and update its db and table name appropriately.

    SYNOPSIS
      set_table_cache_key()
        key_buff    Buffer with already built table cache key to be
                    referenced from share.
        key_length  Key length.

    NOTES
      Since 'key_buff' buffer will be referenced from share it should have the
      same lifetime as the share itself.
      This method automatically ensures that TABLE_SHARE::table_name/db have
      appropriate values by using table cache key as their source.
  */

  void set_table_cache_key(char *key_buff, uint key_length)
  {
    table_cache_key.str= key_buff;
    table_cache_key.length= key_length;
    /*
      Let us use the fact that the key is "db/0/table_name/0" + optional
      part for temporary tables.
    */
    db.str= table_cache_key.str;
    db.length= strlen(db.str);
    table_name.str= db.str + db.length + 1;
    table_name.length= strlen(table_name.str);
  }


  /*
    Set share's table cache key and update its db and table name appropriately.

    SYNOPSIS
      set_table_cache_key()
        key_buff    Buffer to be used as storage for table cache key
                    (should be at least key_length bytes).
        key         Value for table cache key.
        key_length  Key length.

    NOTE
      Since 'key_buff' buffer will be used as storage for table cache key
      it should have the same lifetime as the share itself.
  */

  void set_table_cache_key(char *key_buff, const char *key, uint key_length)
  {
    memcpy(key_buff, key, key_length);
    set_table_cache_key(key_buff, key_length);
  }
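To make the key layout concrete, here is a small standalone sketch of how a key of the form database name, NUL byte, table name, NUL byte can be split the same way the first overload does with `strlen`. `KeyParts` and `split_table_cache_key()` are hypothetical helpers written for this illustration; they are not part of the header.

```cpp
#include <cassert>
#include <string>

// Hypothetical result type for the illustration.
struct KeyParts {
  std::string db;
  std::string table_name;
};

// Splits a key laid out as "db\0table_name\0". The caller must make sure
// both NUL terminators are present, as the real cache key guarantees.
KeyParts split_table_cache_key(const char *key) {
  KeyParts parts;
  parts.db = key;                                  // reads up to the first '\0'
  const char *table = key + parts.db.size() + 1;   // skip db and its '\0'
  parts.table_name = table;                        // reads up to the second '\0'
  return parts;
}
```

For a key built from the literal `"test\0t1"` (the compiler appends the second NUL), the helper yields `db == "test"` and `table_name == "t1"`. Any optional trailing part for temporary tables is simply ignored by this sketch.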
|
|
|
|
|
WL#3984 (Revise locking of mysql.general_log and mysql.slow_log)
Bug#25422 (Hang with log tables)
Bug 17876 (Truncating mysql.slow_log in a SP after using cursor locks the
thread)
Bug 23044 (Warnings on flush of a log table)
Bug 29129 (Resetting general_log while the GLOBAL READ LOCK is set causes
a deadlock)
Prior to this fix, the server would hang when performing concurrent
ALTER TABLE or TRUNCATE TABLE statements against the LOG TABLES,
which are mysql.general_log and mysql.slow_log.
The root cause traces to the following code:
in sql_base.cc, open_table()
if (table->in_use != thd)
{
/* wait_for_condition will unlock LOCK_open for us */
wait_for_condition(thd, &LOCK_open, &COND_refresh);
}
The problem with this code is that the current implementation of the
LOGGER creates 'fake' THD objects, like
- Log_to_csv_event_handler::general_log_thd
- Log_to_csv_event_handler::slow_log_thd
which are not associated with a real thread running in the server,
so that waiting for these non-existent threads to release table locks
causes the deadlock.
In general, the design of Log_to_csv_event_handler does not fit into the
general architecture of the server, so that the concept of general_log_thd
and slow_log_thd has to be abandoned:
- this implementation does not work with table locking
- it will not work with commands like SHOW PROCESSLIST
- having the log tables always opened does not integrate well with DDL
operations / FLUSH TABLES / SET GLOBAL READ_ONLY
With this patch, the fundamental design of the LOGGER has been changed to:
- always open and close a log table when writing a log
- remove totally the usage of fake THD objects
- clarify how locking of log tables is implemented in general.
See WL#3984 for details related to the new locking design.
Additional changes (misc bugs exposed and fixed):
1)
mysqldump would ignore some tables in dump_all_tables_in_db(),
but forget to ignore the same tables in dump_all_views_in_db().
2)
mysqldump would also issue an empty "LOCK TABLE" command when all the tables
to lock are to be ignored (numrows == 0), instead of not issuing the query.
3)
Internal error handlers could intercept errors but not warnings
(see sql_error.cc).
4)
Implementing a nested call to open tables, for the performance schema tables,
exposed an existing bug in remove_table_from_cache(), which would perform:
in_use->some_tables_deleted=1;
against another thread, without any consideration about thread locking.
This call inside remove_table_from_cache() was not required anyway,
since calling mysql_lock_abort() takes care of aborting -- cleanly -- threads
that might hold a lock on a table.
This line (in_use->some_tables_deleted=1) has been removed.
2007-07-27 08:31:06 +02:00
  inline bool honor_global_locks()
  {
    return ((table_category == TABLE_CATEGORY_USER)
            || (table_category == TABLE_CATEGORY_SYSTEM));
  }

  inline bool require_write_privileges()
  {
    return (table_category == TABLE_CATEGORY_PERFORMANCE);
  }
Bug#26379 - Combination of FLUSH TABLE and REPAIR TABLE
corrupts a MERGE table
Bug 26867 - LOCK TABLES + REPAIR + merge table result in
memory/cpu hogging
Bug 26377 - Deadlock with MERGE and FLUSH TABLE
Bug 25038 - Waiting TRUNCATE
Bug 25700 - merge base tables get corrupted by
optimize/analyze/repair table
Bug 30275 - Merge tables: flush tables or unlock tables
causes server to crash
Bug 19627 - temporary merge table locking
Bug 27660 - Falcon: merge table possible
Bug 30273 - merge tables: Can't lock file (errno: 155)
The problems were:
Bug 26379 - Combination of FLUSH TABLE and REPAIR TABLE
corrupts a MERGE table
1. A thread trying to lock a MERGE table performs busy waiting while
REPAIR TABLE or a similar table administration task is ongoing on
one or more of its MyISAM tables.
2. A thread trying to lock a MERGE table performs busy waiting until all
threads that did REPAIR TABLE or similar table administration tasks
on one or more of its MyISAM tables in LOCK TABLES segments do UNLOCK
TABLES. The difference against problem #1 is that the busy waiting
takes place *after* the administration task. It is terminated by
UNLOCK TABLES only.
3. Two FLUSH TABLES within a LOCK TABLES segment can invalidate the
lock. This does *not* require a MERGE table. The first FLUSH TABLES
can be replaced by any statement that requires other threads to
reopen the table. In 5.0 and 5.1 a single FLUSH TABLES can provoke
the problem.
Bug 26867 - LOCK TABLES + REPAIR + merge table result in
memory/cpu hogging
Trying DML on a MERGE table, which has a child locked and
repaired by another thread, made an infinite loop in the server.
Bug 26377 - Deadlock with MERGE and FLUSH TABLE
Locking a MERGE table and its children in parent-child order
and flushing the child deadlocked the server.
Bug 25038 - Waiting TRUNCATE
Truncating a MERGE child, while the MERGE table was in use,
let the truncate fail instead of waiting for the table to
become free.
Bug 25700 - merge base tables get corrupted by
optimize/analyze/repair table
Repairing a child of an open MERGE table corrupted the child.
It was necessary to FLUSH the child first.
Bug 30275 - Merge tables: flush tables or unlock tables
causes server to crash
Flushing and optimizing locked MERGE children crashed the server.
Bug 19627 - temporary merge table locking
Use of a temporary MERGE table with non-temporary children
could corrupt the children.
Temporary tables are never locked. So we now prohibit
non-temporary children of a temporary MERGE table.
Bug 27660 - Falcon: merge table possible
It was possible to create a MERGE table with non-MyISAM children.
Bug 30273 - merge tables: Can't lock file (errno: 155)
This was a Windows-only bug. Table administration statements
sometimes failed with "Can't lock file (errno: 155)".
These bugs are fixed by a new implementation of MERGE table open.
When opening a MERGE table in open_tables() we do now add the
child tables to the list of tables to be opened by open_tables()
(the "query_list"). The children are not opened in the handler at
this stage.
After opening the parent, open_tables() opens each child from the
now extended query_list. When the last child is opened, we remove
the children from the query_list again and attach the children to
the parent. This behaves similar to the old open. However it does
not open the MyISAM tables directly, but grabs them from the already
open children.
When closing a MERGE table in close_thread_table() we detach the
children only. Closing of the children is done implicitly because
they are in thd->open_tables.
For more detail see the comment at the top of ha_myisammrg.cc.
Changed from open_ltable() to open_and_lock_tables() in all places
that can be relevant for MERGE tables. The latter can handle tables
added to the list on the fly. When open_ltable() was used in a loop
over a list of tables, the list must be temporarily terminated
after every table for open_and_lock_tables().
table_list->required_type is set to FRMTYPE_TABLE to avoid open of
special tables. Handling of derived tables is suppressed.
These details are handled by the new function
open_n_lock_single_table(), which has nearly the same signature as
open_ltable() and can replace it in most cases.
In reopen_tables() some of the tables open by a thread can be
closed and reopened. When a MERGE child is affected, the parent
must be closed and reopened too. Closing of the parent is forced
before the first child is closed. Reopen happens in the order of
thd->open_tables. MERGE parents do not attach their children
automatically at open. This is done after all tables are reopened.
So all children are open when attaching them.
Special lock handling like mysql_lock_abort() or mysql_lock_remove()
needs to be suppressed for MERGE children or forwarded to the parent.
This depends on the situation. In loops over all open tables one
suppresses child lock handling. When a single table is touched,
forwarding is done.
Behavioral changes:
===================
This patch changes the behavior of temporary MERGE tables.
Temporary MERGE must have temporary children.
The old behavior was wrong. A temporary table is not locked. Hence
even non-temporary children were not locked. See
Bug 19627 - temporary merge table locking.
You cannot change the union list of a non-temporary MERGE table
when LOCK TABLES is in effect. The following does *not* work:
CREATE TABLE m1 ... ENGINE=MRG_MYISAM ...;
LOCK TABLES t1 WRITE, t2 WRITE, m1 WRITE;
ALTER TABLE m1 ... UNION=(t1,t2) ...;
However, you can do this with a temporary MERGE table.
You cannot create a MERGE table with CREATE ... SELECT, neither
as a temporary MERGE table, nor as a non-temporary MERGE table.
CREATE TABLE m1 ... ENGINE=MRG_MYISAM ... SELECT ...;
Gives error message: table is not BASE TABLE.
2007-11-15 20:25:43 +01:00

  inline ulong get_table_def_version()
  {
    return table_map_id;
  }

2008-04-08 18:01:20 +02:00
  /**
    Convert unrelated members of TABLE_SHARE to one enum
2008-05-20 09:29:16 +02:00
    representing its type.
2008-04-08 18:01:20 +02:00

    @todo perhaps we need to have a member instead of a function.
  */
2008-05-20 09:29:16 +02:00
  enum enum_table_ref_type get_table_ref_type() const
2008-04-08 18:01:20 +02:00
  {
    if (is_view)
2008-05-20 09:29:16 +02:00
      return TABLE_REF_VIEW;
2008-04-08 18:01:20 +02:00
    switch (tmp_table) {
    case NO_TMP_TABLE:
2008-05-20 09:29:16 +02:00
      return TABLE_REF_BASE_TABLE;
2008-04-08 18:01:20 +02:00
    case SYSTEM_TMP_TABLE:
2008-05-20 09:29:16 +02:00
      return TABLE_REF_I_S_TABLE;
2008-04-08 18:01:20 +02:00
    default:
2008-05-20 09:29:16 +02:00
      return TABLE_REF_TMP_TABLE;
2008-04-08 18:01:20 +02:00
    }
  }
  /**
    Return a table metadata version.
     * for base tables, we return table_map_id.
       It is assigned from a global counter incremented for each
       new table loaded into the table definition cache (TDC).
     * for temporary tables it's table_map_id again. But for
       temporary tables table_map_id is assigned from
       thd->query_id. The latter is assigned from a thread local
       counter incremented for every new SQL statement. Since
       temporary tables are thread-local, each temporary table
       gets a unique id.
     * for everything else (views, information schema tables),
       the version id is zero.

   This choice of version id is a large compromise
   to have a working prepared statement validation in 5.1. In
   future version ids will be persistent, as described in WL#4180.

   Let's try to explain why and how this limited solution allows
   to validate prepared statements.

2008-05-17 23:51:18 +02:00
   Firstly, sets (in mathematical sense) of version numbers
2008-05-20 09:29:16 +02:00
   never intersect for different table types. Therefore,
2008-04-08 18:01:20 +02:00
   version id of a temporary table is never compared with
2008-05-17 23:51:18 +02:00
   a version id of a view, and vice versa.
2008-04-08 18:01:20 +02:00

   Secondly, for base tables, we know that each DDL flushes the
   respective share from the TDC. This ensures that whenever
   a table is altered or dropped and recreated, it gets a new
   version id.
   Unfortunately, since elements of the TDC are also flushed on
   LRU basis, this choice of version ids leads to false positives.
   E.g. when the TDC size is too small, we may have a SELECT
   * FROM INFORMATION_SCHEMA.TABLES flush all its elements, which
   in turn will lead to a validation error and a subsequent
   reprepare of all prepared statements.  This is
   considered acceptable, since as long as prepared statements are
   automatically reprepared, spurious invalidation is only
   a performance hit. Besides, no better simple solution exists.

   For temporary tables, using thd->query_id ensures that if
   a temporary table was altered or recreated, a new version id is
   assigned. This suits validation needs very well and will perhaps
   never change.

   Metadata of information schema tables never changes.
   Thus we can safely assume 0 for a good enough version id.

   Views are a special and tricky case. A view is always inlined
   into the parse tree of a prepared statement at prepare.
   Thus, when we execute a prepared statement, the parse tree
   will not get modified even if the view is replaced with another
   view.  Therefore, we can safely choose 0 for version id of
   views and effectively never invalidate a prepared statement
   when a view definition is altered. Note, that this leads to
   wrong binary log in statement-based replication, since we log
   prepared statement execution in form Query_log_events
   containing conventional statements. But since there is no
   metadata locking for views, the very same problem exists for
   conventional statements alone, as reported in Bug#25144. The only
   difference between prepared and conventional execution is,
   effectively, that for prepared statements the race condition
   window is much wider.
   In 6.0 we plan to support view metadata locking (WL#3726) and
   extend table definition cache to cache views (WL#4298).
   When this is done, views will be handled in the same fashion
   as the base tables.

2008-05-20 09:29:16 +02:00
   Finally, by taking into account table type, we always
2008-04-08 18:01:20 +02:00
   track that a change has taken place when a view is replaced
   with a base table, a base table is replaced with a temporary
   table and so on.

2008-05-20 09:29:16 +02:00
   @sa TABLE_LIST::is_table_ref_id_equal()
2008-04-08 18:01:20 +02:00
  */
2008-05-20 09:29:16 +02:00
  ulong get_table_ref_version() const
2008-04-08 18:01:20 +02:00
  {
2008-05-17 23:51:18 +02:00
    return (tmp_table == SYSTEM_TMP_TABLE || is_view) ? 0 : table_map_id;
2008-04-08 18:01:20 +02:00
  }
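The comment above argues that a (type, version) pair is enough to notice that a table reference changed between prepare and execute. A minimal sketch of that comparison might look like the code below. The enum values, `TableRefId` and `is_same_table_ref()` are illustrative assumptions; the real check lives in `TABLE_LIST::is_table_ref_id_equal()`, whose implementation is not shown in this file.

```cpp
#include <cassert>

// Illustrative mirror of the four reference types returned by
// get_table_ref_type(); the names are assumptions made for this sketch.
enum RefType { REF_VIEW, REF_BASE_TABLE, REF_I_S_TABLE, REF_TMP_TABLE };

// What a prepared statement would remember about a table at prepare time.
struct TableRefId {
  RefType type;
  unsigned long version;
};

// The reference is considered unchanged only if both the type and the
// version match. Comparing the type first is what keeps version ids of
// different table types from ever being compared with each other.
bool is_same_table_ref(const TableRefId &remembered, const TableRefId &current) {
  return remembered.type == current.type &&
         remembered.version == current.version;
}
```

With this shape, a view (version 0) replaced by a base table is detected through the type mismatch, and a base table that was dropped and recreated is detected through the new version id.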

2009-10-14 13:14:58 +02:00
};
2005-01-06 12:00:13 +01:00


2007-05-11 19:51:03 +02:00
extern ulong refresh_version;

2005-01-06 12:00:13 +01:00
/* Information for one open table */
2007-03-05 18:08:41 +01:00
enum index_hint_type
{
  INDEX_HINT_IGNORE,
  INDEX_HINT_USE,
  INDEX_HINT_FORCE
};
2005-01-06 12:00:13 +01:00

2009-10-14 13:14:58 +02:00
struct TABLE
{
  TABLE() {}                               /* Remove gcc warning */
2006-02-25 16:46:30 +01:00

2005-01-06 12:00:13 +01:00
  TABLE_SHARE *s;
  handler *file;
2009-10-14 13:14:58 +02:00
  TABLE *next, *prev;
2005-01-06 12:00:13 +01:00

2009-11-30 16:55:03 +01:00
private:
  /**
     Links for the lists of used/unused TABLE objects for this share.
     Declared as private to avoid direct manipulation with those objects.
     One should use methods of I_P_List template instead.
  */
  TABLE *share_next, **share_prev;

  friend struct TABLE_share;

public:

2005-01-06 12:00:13 +01:00
  THD *in_use;                          /* Which thread uses this */
  Field **field;                        /* Pointer to fields */

WL#3817: Simplify string / memory area types and make things more consistent (first part)
The following type conversions were done:
- Changed byte to uchar
- Changed gptr to uchar*
- Change my_string to char *
- Change my_size_t to size_t
- Change size_s to size_t
Removed declaration of byte, gptr, my_string, my_size_t and size_s.
The following function parameter changes were done:
- All string functions in mysys/strings were changed to use size_t
instead of uint for string lengths.
- All read()/write() functions changed to use size_t (including vio).
- All protocol functions changed to use size_t instead of uint
- Functions that used a pointer to a string length were changed to use size_t*
- Changed malloc(), free() and related functions from using gptr to use void *
as this requires fewer casts in the code and is more in line with how the
standard functions work.
- Added extra length argument to dirname_part() to return the length of the
created string.
- Changed (at least) following functions to take uchar* as argument:
- db_dump()
- my_net_write()
- net_write_command()
- net_store_data()
- DBUG_DUMP()
- decimal2bin() & bin2decimal()
- Changed my_compress() and my_uncompress() to use size_t. Changed one
argument to my_uncompress() from a pointer to a value as we only return
one value (makes function easier to use).
- Changed type of 'pack_data' argument to packfrm() to avoid casts.
- Changed in readfrm() and writefrom(), ha_discover and handler::discover()
the type for argument 'frmdata' to uchar** to avoid casts.
- Changed most Field functions to use uchar* instead of char* (reduced a lot of
casts).
- Changed field->val_xxx(xxx, new_ptr) to take const pointers.
Other changes:
- Removed a lot of unneeded casts
- Added a few new casts required by other changes
- Added some casts to my_multi_malloc() arguments for safety (as string lengths
need to be uint, not size_t).
- Fixed all calls to hash-get-key functions to use size_t*. (Needed to be done
explicitly as this conflict was often hidden by casting the function to
hash_get_key).
- Changed some buffers to memory regions to uchar* to avoid casts.
- Changed some string lengths from uint to size_t.
- Changed field->ptr to be uchar* instead of char*. This allowed us to
get rid of a lot of casts.
- Some changes from true -> TRUE, false -> FALSE, unsigned char -> uchar
- Include zlib.h in some files as we needed declaration of crc32()
- Changed MY_FILE_ERROR to be (size_t) -1.
- Changed many variables to hold the result of my_read() / my_write() to be
size_t. This was needed to properly detect errors (which are
returned as (size_t) -1).
- Removed some very old VMS code
- Changed packfrm()/unpackfrm() to not be depending on uint size
(portability fix)
- Removed windows specific code to restore cursor position as this
causes slowdown on windows and we should not mix read() and pread()
calls anyway as this is not thread safe. Updated function comment to
reflect this. Changed function that depended on original behavior of
my_pwrite() to itself restore the cursor position (one such case).
- Added some missing checking of return value of malloc().
- Changed definition of MOD_PAD_CHAR_TO_FULL_LENGTH to avoid 'long' overflow.
- Changed type of table_def::m_size from my_size_t to ulong to reflect that
m_size is the number of elements in the array, not a string/memory
length.
- Moved THD::max_row_length() to table.cc (as it's not depending on THD).
Inlined max_row_length_blob() into this function.
- More function comments
- Fixed some compiler warnings when compiled without partitions.
- Removed setting of LEX_STRING() arguments in declaration (portability fix).
- Some trivial indentation/variable name changes.
- Some trivial code simplifications:
- Replaced some calls to alloc_root + memcpy to use
strmake_root()/strdup_root().
- Changed some calls from memdup() to strmake() (Safety fix)
- Simpler loops in client-simple.c
2007-05-10 11:59:39 +02:00
  uchar *record[2];                     /* Pointer to records */
  uchar *write_row_record;              /* Used as optimisation in
2005-12-22 06:39:02 +01:00
                                           THD::write_row */
2007-05-10 11:59:39 +02:00
  uchar *insert_values;                  /* used by INSERT ... UPDATE */
2007-03-05 18:08:41 +01:00
  /*
    Map of keys that can be used to retrieve all data from this table
    needed by the query without reading the row.
  */
  key_map covering_keys;
  key_map quick_keys, merge_keys;
2007-01-29 15:07:11 +01:00
  /*
    A set of keys that can be used in the query that references this
2007-01-30 18:07:41 +01:00
    table.
2007-01-29 15:07:11 +01:00

    All indexes disabled on the table's TABLE_SHARE (see TABLE::s) will be
    subtracted from this set upon instantiation. Thus for any TABLE t it holds
    that t.keys_in_use_for_query is a subset of t.s.keys_in_use. Generally we
    must not introduce any new keys here (see setup_tables).

    The set is implemented as a bitmap.
  */
  key_map keys_in_use_for_query;
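The subset invariant stated in the comment above can be illustrated with a plain bitmap. The sketch below uses `std::bitset` as a stand-in for `key_map`, and `kMaxKeys`, `usable_keys()` and `is_subset()` are hypothetical names chosen for this example only.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Assumed upper bound on the number of keys, for illustration only.
constexpr std::size_t kMaxKeys = 64;
using KeyMap = std::bitset<kMaxKeys>;

// Keep only those requested keys that are enabled on the share, which is
// the same as subtracting the disabled indexes from the requested set.
KeyMap usable_keys(const KeyMap &requested, const KeyMap &share_keys_in_use) {
  return requested & share_keys_in_use;
}

// True if every bit set in `a` is also set in `b`.
bool is_subset(const KeyMap &a, const KeyMap &b) {
  return (a & ~b).none();
}
```

If the share has keys 0 to 2 enabled and a query asks for keys 0 to 3, `usable_keys()` drops key 3, and the result satisfies `is_subset(result, share_keys_in_use)`.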
2007-03-05 18:08:41 +01:00
  /* Map of keys that can be used to calculate GROUP BY without sorting */
  key_map keys_in_use_for_group_by;
  /* Map of keys that can be used to calculate ORDER BY without sorting */
  key_map keys_in_use_for_order_by;
2005-01-06 12:00:13 +01:00
  KEY *key_info;                        /* data of keys in database */

2005-11-23 21:45:02 +01:00
  Field *next_number_field;             /* Set if next_number is activated */
  Field *found_next_number_field;       /* Set on open */
2005-01-06 12:00:13 +01:00
  Field_timestamp *timestamp_field;

  /* Table's triggers, 0 if there are none */
  Table_triggers_list *triggers;
2007-07-06 14:18:49 +02:00
  TABLE_LIST *pos_in_table_list;        /* Element referring to this table */
2009-12-02 16:22:15 +01:00
  /* Position in thd->locked_table_list under LOCK TABLES */
  TABLE_LIST *pos_in_locked_tables;
2005-01-06 12:00:13 +01:00
  ORDER *group;
  const char *alias;                    /* alias or table name */
  uchar *null_flags;
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that were found necessary while testing the handler changes
Changes that require code changes in other storage engines.
(Note that all changes are very straightforward and one should find all issues
by compiling a --debug build and fixing all compiler errors and all
asserts in field.cc while running the test suite),
- New optional handler function introduced: reset()
This is called after every DML statement to make it easy for a handler to
do statement-specific cleanups.
(The only case where it's not called is if the file is forced to be closed)
- handler::extra(HA_EXTRA_RESET) is removed. Code that was there before
should be moved to handler::reset()
- table->read_set contains a bitmap over all columns that are needed
in the query. read_row() and similar functions only needs to read these
columns
- table->write_set contains a bitmap over all columns that will be updated
in the query. write_row() and update_row() only needs to update these
columns.
The above bitmaps should now be up to date in all context
(including ALTER TABLE, filesort()).
The handler is informed of any changes to the bitmap after
fix_fields() by calling the virtual function
handler::column_bitmaps_signal(). If the handler does caching of
these bitmaps (instead of using table->read_set, table->write_set),
it should redo the caching in this code. as the signal() may be sent
several times, it's probably best to set a variable in the signal
and redo the caching on read_row() / write_row() if the variable was
set.
- Removed the read_set and write_set bitmap objects from the handler class
- Removed all column bit handling functions from the handler class.
(Now one instead uses the normal bitmap functions in my_bitmap.c instead
of handler dedicated bitmap functions)
- field->query_id is removed. One should instead check
table->read_set and table->write_set to see if a field is used in the query.
- handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and
handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now
instead use table->read_set to check for which columns to retrieve.
- If a handler needs to call Field->val() or Field->store() on columns
that are not used in the query, one should install a temporary
all-columns-used map while doing so. For this, we provide the following
functions:
my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set);
field->val();
dbug_tmp_restore_column_map(table->read_set, old_map);
and similar for the write map:
my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set);
field->val();
dbug_tmp_restore_column_map(table->write_set, old_map);
If this is not done, you will sooner or later hit a DBUG_ASSERT
in the field store() / val() functions.
(For non-DBUG binaries, dbug_tmp_use_all_columns() and
dbug_tmp_restore_column_map() are inline dummy functions and should
be optimized away by the compiler).
- If one needs to temporarily set the column map for all binaries (and not
just to avoid the DBUG_ASSERT() in the Field::store() / Field::val()
methods) one should use the functions tmp_use_all_columns() and
tmp_restore_column_map() instead of the above dbug_ variants.
- All 'status' fields in the handler base class (like records,
data_file_length etc) are now stored in a 'stats' struct. This makes
it easier to know what status variables are provided by the base
handler. This requires some trivial variable name changes in the extra()
function.
- New virtual function handler::records(). This is called to optimize
COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS) is true.
(stats.records is not supposed to be an exact value. It only has to
be 'reasonable enough' for the optimizer to be able to choose a good
optimization path).
- Non virtual handler::init() function added for caching of virtual
constants from engine.
- Removed has_transactions() virtual method. Now one should instead return
HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support
transactions.
- The 'xxxx_create_handler()' function now has a MEM_ROOT *mem_root argument
that is to be used with 'new handler_name()' to allocate the handler
in the right area. The xxxx_create_handler() function is also
responsible for any initialization of the object before returning.
For example, one should change:
static handler *myisam_create_handler(TABLE_SHARE *table)
{
return new ha_myisam(table);
}
->
static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root)
{
return new (mem_root) ha_myisam(table);
}
- New optional virtual function: use_hidden_primary_key().
This is called in case of an update/delete when
(table_flags() & HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is set
but we don't have a primary key. This allows the handler to take precautions
in remembering any hidden primary key to be able to update/delete any
found row. The default handler marks all columns to be read.
- handler::table_flags() now returns a ulonglong (to allow for more flags).
- New/changed table_flags()
- HA_HAS_RECORDS Set if ::records() is supported
- HA_NO_TRANSACTIONS Set if engine doesn't support transactions
- HA_PRIMARY_KEY_REQUIRED_FOR_DELETE
Set if we should mark all primary key columns for
read when reading rows as part of a DELETE
statement. If there is no primary key,
all columns are marked for read.
- HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some
cases (based on table->read_set)
- HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS
Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION.
- HA_DUPP_POS Renamed to HA_DUPLICATE_POS
- HA_REQUIRES_KEY_COLUMNS_FOR_DELETE
Set this if we should mark ALL key columns for
read when reading rows as part of a DELETE
statement. In case of an update we will mark
all keys for read for which key part changed
value.
- HA_STATS_RECORDS_IS_EXACT
Set this if stats.records is exact.
(This saves us some extra records() calls
when optimizing COUNT(*))
- Removed table_flags()
- HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if
handler::records() gives an exact count() and
HA_STATS_RECORDS_IS_EXACT if stats.records is exact.
- HA_READ_RND_SAME Removed (no one supported this one)
- Removed unneeded functions ha_retrieve_all_cols() and ha_retrieve_all_pk()
- Renamed handler::dupp_pos to handler::dup_pos
- Removed unused variable handler::sortkey
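The temporary all-columns map described above can be pictured with a small stand-alone sketch. Everything in it is an assumption made for illustration: `SketchTable`, `ColumnMap`, `UseAllColumns` and `read_column()` are hypothetical stand-ins for TABLE, MY_BITMAP, the dbug_tmp_use_all_columns() / dbug_tmp_restore_column_map() pair and Field::val(); none of them is server code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for MY_BITMAP: one bool per column.
using ColumnMap = std::vector<bool>;

// Hypothetical stand-in for TABLE, reduced to the read side.
struct SketchTable {
  ColumnMap def_read_set;  // default read map owned by the table
  ColumnMap all_set;       // map with every column marked
  ColumnMap *read_set;     // currently active read map

  explicit SketchTable(std::size_t columns)
    : def_read_set(columns, false), all_set(columns, true),
      read_set(&def_read_set) {}
  SketchTable(const SketchTable &) = delete;
  SketchTable &operator=(const SketchTable &) = delete;
};

// Reading a column is only legal when its bit is set in the active read
// map, mirroring the assert the message describes for val().
inline int read_column(const SketchTable &t, const std::vector<int> &row,
                       std::size_t col) {
  assert((*t.read_set)[col] && "column not marked in read_set");
  return row[col];
}

// Installs the all-columns map on construction and restores the previous
// map on destruction, so the restore cannot be forgotten.
class UseAllColumns {
public:
  explicit UseAllColumns(SketchTable &t) : table_(t), old_(t.read_set) {
    t.read_set = &t.all_set;
  }
  ~UseAllColumns() { table_.read_set = old_; }
  UseAllColumns(const UseAllColumns &) = delete;
  UseAllColumns &operator=(const UseAllColumns &) = delete;
private:
  SketchTable &table_;
  ColumnMap *old_;
};

// Sums every column of a row, including columns the query did not mark.
inline int sum_all_columns(SketchTable &t, const std::vector<int> &row) {
  UseAllColumns guard(t);
  int total = 0;
  for (std::size_t i = 0; i < row.size(); ++i)
    total += read_column(t, row, i);
  return total;
}
```

The guard object is a design choice of this sketch only; the message itself describes a plain save/restore pair of function calls.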
Upper level handler changes:
- ha_reset() now does some overall checks and calls ::reset()
- ha_table_flags() added. This is a cached version of table_flags(). The
cache is set at engine creation time and updated on open.
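A minimal sketch of the caching idea behind ha_table_flags(), assuming a non-virtual init() step that reads the virtual table_flags() once. The class names, the flag values and the `flag_lookups` counter are hypothetical and exist only to make the behaviour observable.

```cpp
#include <cassert>
#include <cstdint>

// The message says table_flags() now returns a ulonglong; a 64-bit
// unsigned type plays that role here.
using TableFlags = std::uint64_t;

// Hypothetical flag values; the real ones are defined by the server.
constexpr TableFlags SKETCH_HAS_RECORDS     = 1ULL << 0;
constexpr TableFlags SKETCH_NO_TRANSACTIONS = 1ULL << 1;

class SketchHandler {
public:
  virtual ~SketchHandler() = default;
  // Non-virtual step that refreshes the cached copy of the flags.
  void init() { cached_flags_ = table_flags(); }
  // Cheap accessor: no virtual call, just the cached value.
  TableFlags ha_table_flags() const { return cached_flags_; }
protected:
  virtual TableFlags table_flags() const = 0;
private:
  TableFlags cached_flags_ = 0;
};

// Example engine that counts how often its virtual function is invoked.
class SketchEngine : public SketchHandler {
public:
  mutable int flag_lookups = 0;
protected:
  TableFlags table_flags() const override {
    ++flag_lookups;
    return SKETCH_HAS_RECORDS | SKETCH_NO_TRANSACTIONS;
  }
};
```

Callers that test flags in a hot path then pay for one virtual call per init() rather than one per test.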
MySQL level changes (not obvious from the above):
- DBUG_ASSERT() added to check that column usage matches what is set
in the column usage bit maps. (This found a LOT of bugs in current
column marking code).
- Previously in 5.1, all used columns were marked in read_set and only updated
columns were marked in write_set. Now we only mark columns for which we
need a value in read_set.
- Column bitmaps are created in open_binary_frm() and open_table_from_share().
(Before this was in table.cc)
- handler::table_flags() calls are replaced with handler::ha_table_flags()
- For calling field->val() you must have the corresponding bit set in
table->read_set. For calling field->store() you must have the
corresponding bit set in table->write_set. (There are asserts in
all store()/val() functions to catch wrong usage)
- thd->set_query_id is renamed to thd->mark_used_columns and instead
of setting this to an integer value, this has now the values:
MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE
Changed also all variables named 'set_query_id' to mark_used_columns.
- In filesort() we now inform the handler of exactly which columns are needed
doing the sort and choosing the rows.
- The TABLE_SHARE object has a 'all_set' column bitmap one can use
when one needs a column bitmap with all columns set.
(This is used for table->use_all_columns() and other places)
- The TABLE object has 3 column bitmaps:
- def_read_set Default bitmap for columns to be read
- def_write_set Default bitmap for columns to be written
- tmp_set Can be used as a temporary bitmap when needed.
The table object also has two pointers to bitmaps, read_set and write_set,
that the handler should use to find out which columns are used in which way.
- count() optimization now calls handler::records() instead of using
handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true).
- Added extra argument to Item::walk() to indicate if we should also
traverse sub queries.
- Added TABLE parameter to cp_buffer_from_ref()
- Don't close tables created with CREATE ... SELECT but keep them in
the table cache. (Faster usage of newly created tables).
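The COUNT(*) rule can be sketched as a small decision function. This is a sketch under stated assumptions: the flag bits, `SketchHandler` and `exact_row_count()` are hypothetical, and checking the "stats are exact" flag before calling records() is an inference from the remark that HA_STATS_RECORDS_IS_EXACT saves extra records() calls.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag bits standing in for HA_HAS_RECORDS and
// HA_STATS_RECORDS_IS_EXACT.
constexpr std::uint64_t SKETCH_HAS_RECORDS            = 1ULL << 0;
constexpr std::uint64_t SKETCH_STATS_RECORDS_IS_EXACT = 1ULL << 1;

struct SketchStats { std::uint64_t records = 0; };

class SketchHandler {
public:
  virtual ~SketchHandler() = default;
  virtual std::uint64_t table_flags() const = 0;
  // Exact count; only consulted when SKETCH_HAS_RECORDS is set.
  virtual std::uint64_t records() { return stats.records; }
  SketchStats stats;  // estimate used by the optimizer
};

// Stores an exact row count in *count and returns true when the engine
// can supply one without a scan; returns false otherwise.
inline bool exact_row_count(SketchHandler &h, std::uint64_t *count) {
  const std::uint64_t flags = h.table_flags();
  if (flags & SKETCH_STATS_RECORDS_IS_EXACT) {
    *count = h.stats.records;  // already exact, no extra call needed
    return true;
  }
  if (flags & SKETCH_HAS_RECORDS) {
    *count = h.records();
    return true;
  }
  return false;
}

// Engine with estimated statistics that can still count exactly on request.
class CountingEngine : public SketchHandler {
public:
  std::uint64_t table_flags() const override { return SKETCH_HAS_RECORDS; }
  std::uint64_t records() override { return 42; }
};

// Engine that only offers an estimate.
class EstimateOnlyEngine : public SketchHandler {
public:
  std::uint64_t table_flags() const override { return 0; }
};
```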
New interfaces:
- table->clear_column_bitmaps() to initialize the bitmaps for tables
at start of new statements.
- table->column_bitmaps_set() to set up new column bitmaps and signal
the handler about this.
- table->column_bitmaps_set_no_signal() for some few cases where we need
to setup new column bitmaps but don't signal the handler (as the handler
has already been signaled about these before). Currently used
only in opt_range.cc when doing ROR scans.
- table->use_all_columns() to install a bitmap where all columns are marked
as used in the read and the write set.
- table->default_column_bitmaps() to install the normal read and write
column bitmaps, but not signaling the handler about this.
This is mainly used when creating TABLE instances.
- table->mark_columns_needed_for_delete(),
table->mark_columns_needed_for_update() and
table->mark_columns_needed_for_insert() to allow us to put additional
columns in column usage maps if handler so requires.
(The handler indicates what it needs in handler->table_flags())
- table->prepare_for_position() to allow us to tell handler that it
needs to read primary key parts to be able to store them in
future table->position() calls.
(This replaces the table->file->ha_retrieve_all_pk function)
- table->mark_auto_increment_column() to tell the handler that we are going
to update columns that are part of any auto_increment key.
- table->mark_columns_used_by_index() to mark all columns that are part of
an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow
it to quickly know that it only needs to read columns that are part
of the key. (The handler can also use the column map for detecting this,
but simpler/faster handler can just monitor the extra() call).
- table->mark_columns_used_by_index_no_reset() to in addition to other columns,
also mark all columns that are used by the given key.
- table->restore_column_maps_after_mark_index() to restore to default
column maps after a call to table->mark_columns_used_by_index().
- New item function register_field_in_read_map(), for marking used columns
in table->read_set. Used by filesort() to mark all used columns
- Maintain in TABLE->merge_keys the set of all keys that are used in the query.
(Simplifies some optimization loops)
- Maintain Field->part_of_key_not_clustered which is like Field->part_of_key
but the field in the clustered key is not assumed to be part of all indexes.
(used in opt_range.cc for faster loops)
- dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map()
tmp_use_all_columns() and tmp_restore_column_map() functions to temporarily
mark all columns as usable. The 'dbug_' version is primarily intended
inside a handler when it wants to just call Field::store() & Field::val()
functions, but doesn't need the column maps set for any other usage.
(i.e., bitmap_is_set() is never called)
- We can't use compare_records() to skip updates for handlers that return
a partial column set and the read_set doesn't cover all columns in the
write set. The reason for this is that if we have a column marked only for
write we can't in the MySQL level know if the value changed or not.
The reason this worked before was that MySQL marked all to be written
columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden
bug'.
- open_table_from_share() no longer sets up a temporary MEM_ROOT
object as a thread specific variable for the handler. Instead we
send the to-be-used MEM_ROOT to get_new_handler().
(Simpler, faster code)
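The compare_records() caveat above boils down to a subset test on the two column maps. The sketch below is an illustration under assumptions: `ColumnMap`, `is_subset()` and `can_compare_records()` are hypothetical names, and the `partial_column_read` parameter stands for an engine that sets HA_PARTIAL_COLUMN_READ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for MY_BITMAP: one bool per column.
using ColumnMap = std::vector<bool>;

// True when every bit set in `subset` is also set in `superset`.
inline bool is_subset(const ColumnMap &subset, const ColumnMap &superset) {
  assert(subset.size() == superset.size());
  for (std::size_t i = 0; i < subset.size(); ++i)
    if (subset[i] && !superset[i])
      return false;
  return true;
}

// Decides whether comparing the old and new row images is a safe way to
// skip an update. With a partial-read engine, a column that is written
// but never read has no trustworthy old value to compare against.
inline bool can_compare_records(bool partial_column_read,
                                const ColumnMap &read_set,
                                const ColumnMap &write_set) {
  return !partial_column_read || is_subset(write_set, read_set);
}
```

With read_set = {1,0,1} and write_set = {1,1,0}, column 1 is written without being read, so a partial-read engine cannot use the comparison shortcut.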
Bugs fixed:
- Column marking was not done correctly in a lot of cases.
(ALTER TABLE, when using triggers, auto_increment fields etc)
(Could potentially result in wrong values inserted in table handlers
relying on the old column maps or field->set_query_id being correct)
Especially when it comes to triggers, there may be cases where the
old code would cause lost/wrong values for NDB and/or InnoDB tables.
- Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags:
OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG.
This allowed me to remove some wrong warnings about:
"Some non-transactional changed tables couldn't be rolled back"
- Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset
(thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to lose
some warnings about
"Some non-transactional changed tables couldn't be rolled back"
- Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table()
which could cause delete_table to report random failures.
- Fixed core dumps for some tests when running with --debug
- Added missing FN_LIBCHAR in mysql_rm_tmp_tables()
(This has probably caused us to not properly remove temporary files after
crash)
- slow_logs was not properly initialized, which could maybe cause
extra/lost entries in slow log.
- If we get a duplicate row on insert, change column map to read and
write all columns while retrying the operation. This is required by
the definition of REPLACE and also ensures that fields that are only
part of UPDATE are properly handled. This fixed a bug in NDB and
REPLACE where REPLACE wrongly copied some column values from the replaced
row.
- For table handlers that don't support NULL in keys, we would give an error
when creating a primary key with NULL fields, even after the fields had been
automatically converted to NOT NULL.
- Creating a primary key on a SPATIAL key, would fail if field was not
declared as NOT NULL.
Cleanups:
- Removed not used condition argument to setup_tables
- Removed not needed item function reset_query_id_processor().
- Field->add_index is removed. Now this is instead maintained in
(field->flags & FIELD_IN_ADD_INDEX)
- Field->fieldnr is removed (use field->field_index instead)
- New argument to filesort() to indicate that it should return a set of
row pointers (not used columns). This allowed me to remove some references
to sql_command in filesort and should also enable us to return column
results in some cases where we couldn't before.
- Changed column bitmap handling in opt_range.cc to be aligned with TABLE
bitmap, which allowed me to use bitmap functions instead of looping over
all fields to create some needed bitmaps. (Faster and smaller code)
- Broke up overly long lines that were found
- Moved some variable declarations to the start of functions for better code
readability.
- Removed some not used arguments from functions.
(setup_fields(), mysql_prepare_insert_check_table())
- setup_fields() now takes an enum instead of an int for marking columns
usage.
- For internal temporary tables, use handler::write_row(),
handler::delete_row() and handler::update_row() instead of
handler::ha_xxxx() for faster execution.
- Changed some constants to enum's and define's.
- Using separate column read and write sets allows for easier checking
of whether the timestamp field was set by the statement.
- Remove calls to free_io_cache() as this is now done automatically in ha_reset()
- Don't build table->normalized_path as this is now identical to table->path
(after bar's fixes to convert filenames)
- Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to
do comparison with the 'convert-dbug-for-diff' tool.
Things left to do in 5.1:
- We wrongly log failed CREATE TABLE ... SELECT in some cases when using
row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result)
Mats has promised to look into this.
- Test that my fix for CREATE TABLE ... SELECT is indeed correct.
(I added several test cases for this, but in this case it's better that
someone else also tests this thoroughly).
Lars has promised to do this.
2006-06-04 17:52:22 +02:00
  my_bitmap_map *bitmap_init_value;
  MY_BITMAP def_read_set, def_write_set, tmp_set; /* containers */
  MY_BITMAP *read_set, *write_set; /* Active column sets */
2007-11-01 21:52:56 +01:00
  /*
    The ID of the query that opened and is using this table. Has different
    meanings depending on the table type.

    Temporary tables:

    table->query_id is set to thd->query_id for the duration of a statement
    and is reset to 0 once it is closed by the same statement. A non-zero
    table->query_id means that a statement is using the table even if it's
    not the current statement (table is in use by some outer statement).

    Non-temporary tables:

    Under pre-locked or LOCK TABLES mode: query_id is set to thd->query_id
    for the duration of a statement and is reset to 0 once it is closed by
    the same statement. A non-zero query_id is used to control which tables
    in the list of pre-opened and locked tables are actually being used.
  */
2005-03-19 01:12:25 +01:00
  query_id_t query_id;
2005-01-06 12:00:13 +01:00

2006-07-28 19:27:01 +02:00
  /*
    For each key that has quick_keys.is_set(key) == TRUE: estimate of #records
    and max #key parts that range access would use.
  */
2005-01-06 12:00:13 +01:00
  ha_rows quick_rows[MAX_KEY];
2006-07-28 19:27:01 +02:00

  /* Bitmaps of key parts that =const for the entire join. */
2005-01-06 12:00:13 +01:00
  key_part_map const_key_parts[MAX_KEY];
2006-07-28 19:27:01 +02:00

2005-01-06 12:00:13 +01:00
  uint quick_key_parts[MAX_KEY];
2006-05-10 15:40:20 +02:00
  uint quick_n_ranges[MAX_KEY];
2004-06-25 15:52:01 +02:00

2006-07-28 19:27:01 +02:00
  /*
    Estimate of number of records that satisfy SARGable part of the table
    condition, or table->file->records if no SARGable condition could be
    constructed.
    This value is used by join optimizer as an estimate of number of records
    that will pass the table condition (condition that depends on fields of
    this table and constants)
  */
  ha_rows quick_condition_rows;

2004-10-01 16:54:06 +02:00
  /*
    If this table has TIMESTAMP field with auto-set property (pointed by
    timestamp_field member) then this variable indicates during which
    operations (insert only/on update/in both cases) we should set this
    field to current timestamp. If there are no such field in this table
    or we should not automatically set its value during execution of current
    statement then the variable contains TIMESTAMP_NO_AUTO_SET (i.e. 0).

    Value of this variable is set for each statement in open_table() and
    if needed cleared later in statement processing code (see mysql_update()
    as example).
2004-06-25 15:52:01 +02:00
  */
2004-10-01 16:54:06 +02:00
  timestamp_auto_set_type timestamp_field_type;
2005-01-06 12:00:13 +01:00
  table_map map; /* ID bit of table (1,2,4,8,16...) */
2006-02-20 15:23:57 +01:00

  uint lock_position; /* Position in MYSQL_LOCK.table */
  uint lock_data_start; /* Start pos. in MYSQL_LOCK.locks */
  uint lock_count; /* Number of locks */
2005-01-06 12:00:13 +01:00
  uint tablenr,used_fields;
  uint temp_pool_slot; /* Used by intern temp tables */
  uint status; /* What's in record[0] */
  uint db_stat; /* mode of file as in handler.h */
  /* number of select if it is derived table */
  uint derived_select_number;
  int current_lock; /* Type of lock on table */
2000-07-31 21:29:14 +02:00
  my_bool copy_blobs; /* copy_blobs when storing */
2006-02-20 15:23:57 +01:00

  /*
2005-02-05 16:16:29 +01:00
    0 or JOIN_TYPE_{LEFT|RIGHT}. Currently this is only compared to 0.
    If maybe_null !=0, this table is inner w.r.t. some outer join operation,
    and null_row may be true.
  */
  uint maybe_null;
2004-12-11 13:51:52 +01:00
  /*
2005-02-05 16:16:29 +01:00
    If true, the current table row is considered to have all columns set to
    NULL, including columns declared as "not null" (see maybe_null).
WL#3817: Simplify string / memory area types and make things more consistent (first part)
The following type conversions were done:
- Changed byte to uchar
- Changed gptr to uchar*
- Change my_string to char *
- Change my_size_t to size_t
- Change size_s to size_t
Removed declaration of byte, gptr, my_string, my_size_t and size_s.
The following function parameter changes were done:
- All string functions in mysys/strings were changed to use size_t
instead of uint for string lengths.
- All read()/write() functions changed to use size_t (including vio).
- All protocol functions changed to use size_t instead of uint
- Functions that used a pointer to a string length were changed to use size_t*
- Changed malloc(), free() and related functions from using gptr to use void *
as this requires fewer casts in the code and is more in line with how the
standard functions work.
- Added extra length argument to dirname_part() to return the length of the
created string.
- Changed (at least) following functions to take uchar* as argument:
- db_dump()
- my_net_write()
- net_write_command()
- net_store_data()
- DBUG_DUMP()
- decimal2bin() & bin2decimal()
- Changed my_compress() and my_uncompress() to use size_t. Changed one
argument to my_uncompress() from a pointer to a value as we only return
one value (makes function easier to use).
- Changed type of 'pack_data' argument to packfrm() to avoid casts.
- Changed in readfrm() and writefrm(), ha_discover and handler::discover()
the type for argument 'frmdata' to uchar** to avoid casts.
- Changed most Field functions to use uchar* instead of char* (reduced a lot of
casts).
- Changed field->val_xxx(xxx, new_ptr) to take const pointers.
Other changes:
- Removed a lot of unneeded casts
- Added a few new casts required by other changes
- Added some casts to my_multi_malloc() arguments for safety (as string lengths
need to be uint, not size_t).
- Fixed all calls to hash-get-key functions to use size_t*. (Needed to be done
explicitly as this conflict was often hidden by casting the function to
hash_get_key).
- Changed some buffers and memory regions to uchar* to avoid casts.
- Changed some string lengths from uint to size_t.
- Changed field->ptr to be uchar* instead of char*. This allowed us to
get rid of a lot of casts.
- Some changes from true -> TRUE, false -> FALSE, unsigned char -> uchar
- Include zlib.h in some files as we needed declaration of crc32()
- Changed MY_FILE_ERROR to be (size_t) -1.
- Changed many variables to hold the result of my_read() / my_write() to be
size_t. This was needed to properly detect errors (which are
returned as (size_t) -1).
- Removed some very old VMS code
- Changed packfrm()/unpackfrm() to not be depending on uint size
(portability fix)
- Removed windows specific code to restore cursor position as this
causes slowdown on windows and we should not mix read() and pread()
calls anyway as this is not thread safe. Updated function comment to
reflect this. Changed function that depended on original behavior of
my_pwrite() to itself restore the cursor position (one such case).
- Added some missing checking of return value of malloc().
- Changed definition of MOD_PAD_CHAR_TO_FULL_LENGTH to avoid 'long' overflow.
- Changed type of table_def::m_size from my_size_t to ulong to reflect that
m_size is the number of elements in the array, not a string/memory
length.
- Moved THD::max_row_length() to table.cc (as it's not depending on THD).
Inlined max_row_length_blob() into this function.
- More function comments
- Fixed some compiler warnings when compiled without partitions.
- Removed setting of LEX_STRING() arguments in declaration (portability fix).
- Some trivial indentation/variable name changes.
- Some trivial code simplifications:
- Replaced some calls to alloc_root + memcpy to use
strmake_root()/strdup_root().
- Changed some calls from memdup() to strmake() (Safety fix)
- Simpler loops in client-simple.c
2007-05-10 11:59:39 +02:00
  */
  my_bool null_row;
BUG#25091 (A DELETE statement to mysql database is not logged in ROW format):
With this patch, statements that change metadata (in the mysql database)
are logged as statements, while normal changes (e.g., using INSERT, DELETE,
and/or UPDATE) are logged according to the format in effect.
The log tables (i.e., general_log and slow_log) are not replicated at all.
With this patch, the following statements are replicated as statements:
GRANT, REVOKE (ALL), CREATE USER, DROP USER, and RENAME USER.
2007-02-26 10:19:08 +01:00

2007-05-10 11:59:39 +02:00
  /*
    TODO: Each of the following flags take up 8 bits. They can just as easily
2007-02-26 10:19:08 +01:00
    be put into one single unsigned long and instead of taking up 18
    bytes, it would take up 4.
2004-12-11 13:51:52 +01:00
  */
2003-01-09 01:19:14 +01:00
  my_bool force_index;
2009-10-07 17:03:42 +02:00

  /**
    Flag set when the statement contains FORCE INDEX FOR ORDER BY
    See TABLE_LIST::process_index_hints().
  */
  my_bool force_index_order;

  /**
    Flag set when the statement contains FORCE INDEX FOR GROUP BY
    See TABLE_LIST::process_index_hints().
  */
  my_bool force_index_group;
2001-05-23 01:40:46 +02:00
  my_bool distinct,const_table,no_rows;
2009-08-07 13:51:40 +02:00

  /**
    If set, the optimizer has found that row retrieval should access index
    tree only.
  */
  my_bool key_read;
  my_bool no_keyread;
2006-01-19 03:56:06 +01:00
  my_bool locked_by_logger;
2007-02-26 10:19:08 +01:00
  my_bool no_replicate;
2000-08-29 11:31:01 +02:00
  my_bool locked_by_name;
2002-03-01 17:57:08 +01:00
  my_bool fulltext_searched;
2005-01-06 12:00:13 +01:00
  my_bool no_cache;
2007-11-01 21:52:56 +01:00
  /* To signal that the table is associated with a HANDLER statement */
  my_bool open_by_handler;
2007-03-30 16:13:33 +02:00
  /*
    To indicate that a non-null value of the auto_increment field
    was provided by the user or retrieved from the current record.
    Used only in the MODE_NO_AUTO_VALUE_ON_ZERO mode.
  */
2004-06-25 15:52:01 +02:00
  my_bool auto_increment_field_not_null;
2004-12-06 01:00:37 +01:00
  my_bool insert_or_update; /* Can be used by the handler */
2004-12-06 18:18:35 +01:00
  my_bool alias_name_used; /* true if table_name is alias */
2005-07-18 13:31:02 +02:00
  my_bool get_fields_in_item_tree; /* Signal to fix_field */
2005-01-06 12:00:13 +01:00

2000-07-31 21:29:14 +02:00
  REGINFO reginfo; /* field connections */
  MEM_ROOT mem_root;
  GRANT_INFO grant;
2003-04-24 13:33:33 +02:00
  FILESORT_INFO sort;
2005-11-23 21:45:02 +01:00
#ifdef WITH_PARTITION_STORAGE_ENGINE
  partition_info *part_info; /* Partition related information */
2005-12-26 06:40:09 +01:00
  bool no_partitions_used; /* If true, all partitions have been pruned away */
2005-11-23 21:45:02 +01:00
#endif
2009-12-04 00:52:05 +01:00
  MDL_ticket *mdl_ticket;
2005-09-22 00:11:21 +02:00

  bool fill_item_list(List<Item> *item_list) const;
  void reset_item_list(List<Item> *item_list) const;
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that were found necessary while testing the handler changes
Changes that require code changes in other storage engines.
(Note that all changes are very straightforward and one should find all issues
by compiling a --debug build and fixing all compiler errors and all
asserts in field.cc while running the test suite),
- New optional handler function introduced: reset()
This is called after every DML statement to make it easy for a handler to
do statement-specific cleanups.
(The only case where it's not called is if we force the file to be closed)
- handler::extra(HA_EXTRA_RESET) is removed. Code that was there before
should be moved to handler::reset()
- table->read_set contains a bitmap over all columns that are needed
in the query. read_row() and similar functions only need to read these
columns.
- table->write_set contains a bitmap over all columns that will be updated
in the query. write_row() and update_row() only need to update these
columns.
The above bitmaps should now be up to date in all contexts
(including ALTER TABLE, filesort()).
The handler is informed of any changes to the bitmap after
fix_fields() by calling the virtual function
handler::column_bitmaps_signal(). If the handler does caching of
these bitmaps (instead of using table->read_set, table->write_set),
it should redo the caching in this function. As the signal may be sent
several times, it's probably best to set a variable in the signal
and redo the caching on read_row() / write_row() if the variable was
set.
- Removed the read_set and write_set bitmap objects from the handler class
- Removed all column bit handling functions from the handler class.
(Now one instead uses the normal bitmap functions in my_bitmap.c instead
of handler dedicated bitmap functions)
- field->query_id is removed. One should instead check
table->read_set and table->write_set if a field is used in the query.
- handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and
handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now
instead use table->read_set to check for which columns to retrieve.
- If a handler needs to call Field->val() or Field->store() on columns
that are not used in the query, one should install a temporary
all-columns-used map while doing so. For this, we provide the following
functions:
my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set);
field->val();
dbug_tmp_restore_column_map(table->read_set, old_map);
and similar for the write map:
my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set);
field->val();
dbug_tmp_restore_column_map(table->write_set, old_map);
If this is not done, you will sooner or later hit a DBUG_ASSERT
in the field store() / val() functions.
(For non-DBUG binaries, the dbug_tmp_use_all_columns() and
dbug_tmp_restore_column_map() are inline dummy functions and should
be optimized away by the compiler).
- If one needs to temporarily set the column map for all binaries (and not
just to avoid the DBUG_ASSERT() in the Field::store() / Field::val()
methods) one should use the functions tmp_use_all_columns() and
tmp_restore_column_map() instead of the above dbug_ variants.
- All 'status' fields in the handler base class (like records,
data_file_length etc) are now stored in a 'stats' struct. This makes
it easier to know what status variables are provided by the base
handler. This requires some trivial variable name changes in the extra()
function.
- New virtual function handler::records(). This is called to optimize
COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS) is true.
(stats.records is not supposed to be an exact value. It only has to
be 'reasonable enough' for the optimizer to be able to choose a good
optimization path).
- Non virtual handler::init() function added for caching of virtual
constants from engine.
- Removed has_transactions() virtual method. Now one should instead return
HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support
transactions.
- The 'xxxx_create_handler()' function now has a MEM_ROOT *mem_root argument
that is to be used with 'new handler_name()' to allocate the handler
in the right area. The xxxx_create_handler() function is also
responsible for any initialization of the object before returning.
For example, one should change:
static handler *myisam_create_handler(TABLE_SHARE *table)
{
return new ha_myisam(table);
}
->
static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root)
{
return new (mem_root) ha_myisam(table);
}
- New optional virtual function: use_hidden_primary_key().
This is called in case of an update/delete when
(table_flags() & HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is set
but we don't have a primary key. This allows the handler to take precautions
in remembering any hidden primary key to be able to update/delete any
found row. The default handler marks all columns to be read.
- handler::table_flags() now returns a ulonglong (to allow for more flags).
- New/changed table_flags()
- HA_HAS_RECORDS Set if ::records() is supported
- HA_NO_TRANSACTIONS Set if engine doesn't support transactions
- HA_PRIMARY_KEY_REQUIRED_FOR_DELETE
Set if we should mark all primary key columns for
read when reading rows as part of a DELETE
statement. If there is no primary key,
all columns are marked for read.
- HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some
cases (based on table->read_set)
- HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS
Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION.
- HA_DUPP_POS Renamed to HA_DUPLICATE_POS
- HA_REQUIRES_KEY_COLUMNS_FOR_DELETE
Set this if we should mark ALL key columns for
read when reading rows as part of a DELETE
statement. In case of an update we will mark
all keys for read for which key part changed
value.
- HA_STATS_RECORDS_IS_EXACT
Set this if stats.records is exact.
(This saves us some extra records() calls
when optimizing COUNT(*))
- Removed table_flags()
- HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if
handler::records() gives an exact count() and
HA_STATS_RECORDS_IS_EXACT if stats.records is exact.
- HA_READ_RND_SAME Removed (no one supported this one)
- Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk()
- Renamed handler::dupp_pos to handler::dup_pos
- Removed not used variable handler::sortkey
Upper level handler changes:
- ha_reset() now does some overall checks and calls ::reset()
- ha_table_flags() added. This is a cached version of table_flags(). The
cache is updated on engine creation time and updated on open.
MySQL level changes (not obvious from the above):
- DBUG_ASSERT() added to check that column usage matches what is set
in the column usage bit maps. (This found a LOT of bugs in current
column marking code).
- In 5.1 before this change, all used columns were marked in read_set and only updated
columns were marked in write_set. Now we only mark columns for which we
need a value in read_set.
- Column bitmaps are created in open_binary_frm() and open_table_from_share().
(Before this was in table.cc)
- handler::table_flags() calls are replaced with handler::ha_table_flags()
- For calling field->val() you must have the corresponding bit set in
table->read_set. For calling field->store() you must have the
corresponding bit set in table->write_set. (There are asserts in
all store()/val() functions to catch wrong usage)
- thd->set_query_id is renamed to thd->mark_used_columns and instead
of setting this to an integer value, this has now the values:
MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE
Changed also all variables named 'set_query_id' to mark_used_columns.
- In filesort() we now inform the handler of exactly which columns are needed
doing the sort and choosing the rows.
- The TABLE_SHARE object has a 'all_set' column bitmap one can use
when one needs a column bitmap with all columns set.
(This is used for table->use_all_columns() and other places)
- The TABLE object has 3 column bitmaps:
- def_read_set Default bitmap for columns to be read
- def_write_set Default bitmap for columns to be written
- tmp_set Can be used as a temporary bitmap when needed.
The table object also has two pointers to bitmaps, read_set and write_set,
that the handler should use to find out which columns are used in which way.
- count() optimization now calls handler::records() instead of using
handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true).
- Added extra argument to Item::walk() to indicate if we should also
traverse sub queries.
- Added TABLE parameter to cp_buffer_from_ref()
- Don't close tables created with CREATE ... SELECT but keep them in
the table cache. (Faster usage of newly created tables).
New interfaces:
- table->clear_column_bitmaps() to initialize the bitmaps for tables
at start of new statements.
- table->column_bitmaps_set() to set up new column bitmaps and signal
the handler about this.
- table->column_bitmaps_set_no_signal() for some few cases where we need
to setup new column bitmaps but don't signal the handler (as the handler
has already been signaled about these before). Used for the moment
only in opt_range.cc when doing ROR scans.
- table->use_all_columns() to install a bitmap where all columns are marked
as used in the read and the write set.
- table->default_column_bitmaps() to install the normal read and write
column bitmaps, but not signaling the handler about this.
This is mainly used when creating TABLE instances.
- table->mark_columns_needed_for_delete(),
table->mark_columns_needed_for_update() and
table->mark_columns_needed_for_insert() to allow us to put additional
columns in column usage maps if handler so requires.
(The handler indicates what it needs in handler->table_flags())
- table->prepare_for_position() to allow us to tell handler that it
needs to read primary key parts to be able to store them in
future table->position() calls.
(This replaces the table->file->ha_retrieve_all_pk function)
- table->mark_auto_increment_column() to tell the handler that we are going to
update columns that are part of any auto_increment key.
- table->mark_columns_used_by_index() to mark all columns that are part of
an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow
it to quickly know that it only needs to read columns that are part
of the key. (The handler can also use the column map for detecting this,
but a simpler/faster handler can just monitor the extra() call).
- table->mark_columns_used_by_index_no_reset() to mark, in addition to other
columns, all columns that are used by the given key.
- table->restore_column_maps_after_mark_index() to restore to default
column maps after a call to table->mark_columns_used_by_index().
- New item function register_field_in_read_map(), for marking used columns
in table->read_map. Used by filesort() to mark all used columns
- Maintain in TABLE->merge_keys set of all keys that are used in query.
(Simplifies some optimization loops)
- Maintain Field->part_of_key_not_clustered which is like Field->part_of_key
but the field in the clustered key is not assumed to be part of all indexes.
(used in opt_range.cc for faster loops)
- dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map()
tmp_use_all_columns() and tmp_restore_column_map() functions to temporarily
mark all columns as usable. The 'dbug_' version is primarily intended for use
inside a handler when it just wants to call the Field::store() & Field::val()
functions, but doesn't need the column maps set for any other usage
(i.e. bitmap_is_set() is never called).
- We can't use compare_records() to skip updates for handlers that return
a partial column set and the read_set doesn't cover all columns in the
write set. The reason for this is that if we have a column marked only for
write we can't in the MySQL level know if the value changed or not.
The reason this worked before was that MySQL marked all to be written
columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden
bug'.
- open_table_from_share() no longer sets up a temporary MEM_ROOT
object as a thread specific variable for the handler. Instead we
send the to-be-used MEM_ROOT to get_new_handler().
(Simpler, faster code)
Bugs fixed:
- Column marking was not done correctly in a lot of cases.
(ALTER TABLE, when using triggers, auto_increment fields etc)
(Could potentially result in wrong values inserted in table handlers
relying on that the old column maps or field->set_query_id was correct)
Especially when it comes to triggers, there may be cases where the
old code would cause lost/wrong values for NDB and/or InnoDB tables.
- Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags:
OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG.
This allowed me to remove some wrong warnings about:
"Some non-transactional changed tables couldn't be rolled back"
- Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset
(thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to lose
some warnings about
"Some non-transactional changed tables couldn't be rolled back"
- Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table()
which could cause delete_table to report random failures.
- Fixed core dumps for some tests when running with --debug
- Added missing FN_LIBCHAR in mysql_rm_tmp_tables()
(This has probably caused us to not properly remove temporary files after
crash)
- slow_logs was not properly initialized, which could maybe cause
extra/lost entries in slow log.
- If we get a duplicate row on insert, change column map to read and
write all columns while retrying the operation. This is required by
the definition of REPLACE and also ensures that fields that are only
part of UPDATE are properly handled. This fixed a bug in NDB and
REPLACE where REPLACE wrongly copied some column values from the replaced
row.
- For table handlers that don't support NULL in keys, we would give an error
when creating a primary key with NULL fields, even after the fields had been
automatically converted to NOT NULL.
- Creating a primary key on a SPATIAL key, would fail if field was not
declared as NOT NULL.
Cleanups:
- Removed not used condition argument to setup_tables
- Removed not needed item function reset_query_id_processor().
- Field->add_index is removed. Now this is instead maintained in
(field->flags & FIELD_IN_ADD_INDEX)
- Field->fieldnr is removed (use field->field_index instead)
- New argument to filesort() to indicate that it should return a set of
row pointers (not used columns). This allowed me to remove some references
to sql_command in filesort and should also enable us to return column
results in some cases where we couldn't before.
- Changed column bitmap handling in opt_range.cc to be aligned with TABLE
bitmap, which allowed me to use bitmap functions instead of looping over
all fields to create some needed bitmaps. (Faster and smaller code)
- Broke up lines that were too long
- Moved some variable declarations to the start of functions for better code
readability.
- Removed some not used arguments from functions.
(setup_fields(), mysql_prepare_insert_check_table())
- setup_fields() now takes an enum instead of an int for marking columns
usage.
- For internal temporary tables, use handler::write_row(),
handler::delete_row() and handler::update_row() instead of
handler::ha_xxxx() for faster execution.
- Changed some constants to enum's and define's.
- Using separate column read and write sets allows for easier checking
of whether the timestamp field was set by the statement.
- Remove calls to free_io_cache() as this is now done automatically in ha_reset()
- Don't build table->normalized_path as this is now identical to table->path
(after bar's fixes to convert filenames)
- Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to
do comparison with the 'convert-dbug-for-diff' tool.
Things left to do in 5.1:
- We wrongly log failed CREATE TABLE ... SELECT in some cases when using
row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result)
Mats has promised to look into this.
- Test that my fix for CREATE TABLE ... SELECT is indeed correct.
(I added several test cases for this, but in this case it's better that
someone else also tests this thoroughly).
Lars has promised to do this.
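
The save/restore pattern described above for tmp_use_all_columns() and tmp_restore_column_map() can be sketched in isolation. This is a minimal illustration, not the server code: MockTable, ColumnMap, use_all_columns(), restore_column_map() and can_read() are hypothetical stand-ins, and std::bitset replaces MY_BITMAP.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins; the real TABLE and MY_BITMAP types are far richer.
constexpr std::size_t kMaxColumns = 8;
using ColumnMap = std::bitset<kMaxColumns>;

struct MockTable {
  ColumnMap all_set;                          // every column marked
  ColumnMap def_read_set;                     // columns the statement reads
  const ColumnMap *read_set = &def_read_set;  // currently installed read map
  MockTable() { all_set.set(); }
};

// Installs the all-columns map and returns the previously installed one.
inline const ColumnMap *use_all_columns(MockTable &table,
                                        const ColumnMap *&map) {
  const ColumnMap *old_map = map;
  map = &table.all_set;
  return old_map;
}

// Puts back the map that use_all_columns() returned.
inline void restore_column_map(const ColumnMap *&map,
                               const ColumnMap *old_map) {
  map = old_map;
}

// Stand-in for the check inside Field::val(): a column may only be read
// if its bit is set in the installed read map.
inline bool can_read(const MockTable &table, std::size_t column) {
  return table.read_set->test(column);
}

// Reads a column the statement did not request by temporarily widening
// the read map, then restores the original map.
inline bool read_unmarked_column(MockTable &table, std::size_t column) {
  const ColumnMap *old_map = use_all_columns(table, table.read_set);
  bool readable = can_read(table, column);
  restore_column_map(table.read_set, old_map);
  return readable;
}
```

The point of returning the old map is that the caller does not need to know which map was installed before; nesting works as long as each restore uses the value returned by its matching install.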
2006-06-04 17:52:22 +02:00
|
|
|
void clear_column_bitmaps(void);
|
|
|
|
void prepare_for_position(void);
|
|
|
|
void mark_columns_used_by_index_no_reset(uint index, MY_BITMAP *map);
|
|
|
|
void mark_columns_used_by_index(uint index);
|
|
|
|
void restore_column_maps_after_mark_index();
|
|
|
|
void mark_auto_increment_column(void);
|
|
|
|
void mark_columns_needed_for_update(void);
|
|
|
|
void mark_columns_needed_for_delete(void);
|
|
|
|
void mark_columns_needed_for_insert(void);
|
|
|
|
inline void column_bitmaps_set(MY_BITMAP *read_set_arg,
|
|
|
|
MY_BITMAP *write_set_arg)
|
|
|
|
{
|
|
|
|
read_set= read_set_arg;
|
|
|
|
write_set= write_set_arg;
|
|
|
|
if (file)
|
|
|
|
file->column_bitmaps_signal();
|
|
|
|
}
|
|
|
|
inline void column_bitmaps_set_no_signal(MY_BITMAP *read_set_arg,
|
|
|
|
MY_BITMAP *write_set_arg)
|
|
|
|
{
|
|
|
|
read_set= read_set_arg;
|
|
|
|
write_set= write_set_arg;
|
|
|
|
}
|
|
|
|
inline void use_all_columns()
|
|
|
|
{
|
|
|
|
column_bitmaps_set(&s->all_set, &s->all_set);
|
|
|
|
}
|
|
|
|
inline void default_column_bitmaps()
|
|
|
|
{
|
|
|
|
read_set= &def_read_set;
|
|
|
|
write_set= &def_write_set;
|
|
|
|
}
|
2007-05-11 19:51:03 +02:00
|
|
|
/*
|
2009-11-30 23:01:27 +01:00
|
|
|
Should this instance of the table be reopened?
|
2007-05-11 19:51:03 +02:00
|
|
|
*/
|
2009-11-30 23:01:27 +01:00
|
|
|
inline bool needs_reopen()
|
2007-05-11 19:51:03 +02:00
|
|
|
{ return s->version != refresh_version; }
|
2000-07-31 21:29:14 +02:00
|
|
|
};
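
The difference between column_bitmaps_set() and column_bitmaps_set_no_signal() matters mostly to engines that cache the bitmaps. Below is a minimal sketch of the deferred-caching approach suggested in the changeset notes (set a flag in the signal, redo the caching on the next row read). CachingHandler, MockTable and refresh_count() are hypothetical names for illustration only.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

using ColumnMap = std::bitset<8>;  // stand-in for MY_BITMAP

// Hypothetical engine that keeps its own copy of the read set.
class CachingHandler {
public:
  explicit CachingHandler(const ColumnMap *const *read_set)
      : read_set_(read_set) {}

  // Analogue of column_bitmaps_signal(): cheap, only records the change.
  void column_bitmaps_signal() { bitmaps_changed_ = true; }

  // Analogue of a row read: refreshes the cache at most once per batch
  // of signals, then reports how many columns would be fetched.
  std::size_t read_row() {
    if (bitmaps_changed_) {
      cached_read_set_ = **read_set_;
      bitmaps_changed_ = false;
      ++refresh_count_;
    }
    return cached_read_set_.count();
  }

  int refresh_count() const { return refresh_count_; }

private:
  const ColumnMap *const *read_set_;  // address of the table's read_set
  ColumnMap cached_read_set_;
  bool bitmaps_changed_ = true;
  int refresh_count_ = 0;
};

struct MockTable {
  ColumnMap def_read_set;
  const ColumnMap *read_set = &def_read_set;
  CachingHandler file{&read_set};

  // Same shape as the signalling setter: install the map, then notify.
  void column_bitmaps_set(const ColumnMap *read_set_arg) {
    read_set = read_set_arg;
    file.column_bitmaps_signal();
  }
};
```

Because the signal only flips a flag, calling the setter several times before a read costs one cache refresh rather than one per call.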
|
|
|
|
|
2009-11-30 16:55:03 +01:00
|
|
|
|
|
|
|
/**
|
|
|
|
Helper class which specifies which members of TABLE are used for
|
|
|
|
participation in the list of used/unused TABLE objects for the share.
|
|
|
|
*/
|
|
|
|
|
|
|
|
struct TABLE_share
|
|
|
|
{
|
|
|
|
static inline TABLE **next_ptr(TABLE *l)
|
|
|
|
{
|
|
|
|
return &l->share_next;
|
|
|
|
}
|
|
|
|
static inline TABLE ***prev_ptr(TABLE *l)
|
|
|
|
{
|
|
|
|
return &l->share_prev;
|
|
|
|
}
|
|
|
|
};
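
An adapter such as TABLE_share lets a generic intrusive list find the link members inside the element. The sketch below shows the idea with a hypothetical Node type and a hypothetical IntrusiveList template; it is not the list class from sql_plist.h, only an illustration of why the previous link is stored as a pointer to a pointer.

```cpp
#include <cassert>

struct Node {
  int id;
  Node *share_next = nullptr;   // next element in the list
  Node **share_prev = nullptr;  // address of the pointer that points here
  explicit Node(int id_arg) : id(id_arg) {}
};

// Adapter in the style of TABLE_share: names the members used for linking.
struct Node_share {
  static Node **next_ptr(Node *n) { return &n->share_next; }
  static Node ***prev_ptr(Node *n) { return &n->share_prev; }
};

// Hypothetical minimal intrusive list driven by the adapter.
template <typename T, typename Adapter>
class IntrusiveList {
public:
  void push_front(T *e) {
    *Adapter::next_ptr(e) = first_;
    if (first_) *Adapter::prev_ptr(first_) = Adapter::next_ptr(e);
    *Adapter::prev_ptr(e) = &first_;
    first_ = e;
  }

  // O(1) unlink: the element itself knows which pointer refers to it,
  // so no list head and no special case for the first element is needed.
  static void remove(T *e) {
    T *next = *Adapter::next_ptr(e);
    if (next) *Adapter::prev_ptr(next) = *Adapter::prev_ptr(e);
    **Adapter::prev_ptr(e) = next;
  }

  T *front() const { return first_; }

private:
  T *first_ = nullptr;
};
```

The same element type can take part in several lists at once by giving each list its own pair of link members and its own adapter.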
|
|
|
|
|
|
|
|
|
2007-02-12 13:06:14 +01:00
|
|
|
enum enum_schema_table_state
|
|
|
|
{
|
|
|
|
NOT_PROCESSED= 0,
|
|
|
|
PROCESSED_BY_CREATE_SORT_INDEX,
|
|
|
|
PROCESSED_BY_JOIN_EXEC
|
|
|
|
};
|
2000-07-31 21:29:14 +02:00
|
|
|
|
2004-11-13 11:56:39 +01:00
|
|
|
typedef struct st_foreign_key_info
|
|
|
|
{
|
|
|
|
LEX_STRING *forein_id;
|
|
|
|
LEX_STRING *referenced_db;
|
|
|
|
LEX_STRING *referenced_table;
|
2006-05-02 13:31:39 +02:00
|
|
|
LEX_STRING *update_method;
|
|
|
|
LEX_STRING *delete_method;
|
2007-01-15 10:39:28 +01:00
|
|
|
LEX_STRING *referenced_key_name;
|
2004-11-13 11:56:39 +01:00
|
|
|
List<LEX_STRING> foreign_fields;
|
|
|
|
List<LEX_STRING> referenced_fields;
|
|
|
|
} FOREIGN_KEY_INFO;
|
|
|
|
|
2005-12-22 10:07:47 +01:00
|
|
|
/*
|
|
|
|
Make sure that the order of schema_tables and enum_schema_tables is the same.
|
|
|
|
*/
|
2004-11-13 11:56:39 +01:00
|
|
|
|
|
|
|
enum enum_schema_tables
|
|
|
|
{
|
2005-08-05 11:01:29 +02:00
|
|
|
SCH_CHARSETS= 0,
|
|
|
|
SCH_COLLATIONS,
|
|
|
|
SCH_COLLATION_CHARACTER_SET_APPLICABILITY,
|
|
|
|
SCH_COLUMNS,
|
|
|
|
SCH_COLUMN_PRIVILEGES,
|
2005-12-22 10:07:47 +01:00
|
|
|
SCH_ENGINES,
|
2006-01-30 13:15:23 +01:00
|
|
|
SCH_EVENTS,
|
2006-02-01 14:47:08 +01:00
|
|
|
SCH_FILES,
|
2006-09-14 01:37:40 +02:00
|
|
|
SCH_GLOBAL_STATUS,
|
|
|
|
SCH_GLOBAL_VARIABLES,
|
2005-08-05 11:01:29 +02:00
|
|
|
SCH_KEY_COLUMN_USAGE,
|
|
|
|
SCH_OPEN_TABLES,
|
2006-01-10 16:44:04 +01:00
|
|
|
SCH_PARTITIONS,
|
2005-12-21 19:18:40 +01:00
|
|
|
SCH_PLUGINS,
|
2006-02-16 14:45:05 +01:00
|
|
|
SCH_PROCESSLIST,
|
2007-07-02 13:27:39 +02:00
|
|
|
SCH_PROFILES,
|
2006-05-02 13:31:39 +02:00
|
|
|
SCH_REFERENTIAL_CONSTRAINTS,
|
2005-08-05 11:01:29 +02:00
|
|
|
SCH_PROCEDURES,
|
|
|
|
SCH_SCHEMATA,
|
|
|
|
SCH_SCHEMA_PRIVILEGES,
|
2006-09-14 01:37:40 +02:00
|
|
|
SCH_SESSION_STATUS,
|
|
|
|
SCH_SESSION_VARIABLES,
|
2005-08-05 11:01:29 +02:00
|
|
|
SCH_STATISTICS,
|
|
|
|
SCH_STATUS,
|
|
|
|
SCH_TABLES,
|
|
|
|
SCH_TABLE_CONSTRAINTS,
|
|
|
|
SCH_TABLE_NAMES,
|
|
|
|
SCH_TABLE_PRIVILEGES,
|
|
|
|
SCH_TRIGGERS,
|
2006-01-29 02:44:51 +01:00
|
|
|
SCH_USER_PRIVILEGES,
|
2005-08-05 11:01:29 +02:00
|
|
|
SCH_VARIABLES,
|
2006-01-29 02:44:51 +01:00
|
|
|
SCH_VIEWS
|
2004-11-13 11:56:39 +01:00
|
|
|
};
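
The ordering requirement exists because the enum value is used as an index into a parallel array. A minimal sketch of that relationship follows, with a compile-time size check that catches a missing entry (it cannot catch two entries swapped). All names here (demo_schema_tables, DemoSchemaTable, demo_schema_tables_array, get_demo_schema_table) are hypothetical and the table list is shortened.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

enum demo_schema_tables {
  DEMO_CHARSETS = 0,
  DEMO_COLLATIONS,
  DEMO_COLUMNS,
  DEMO_TABLE_COUNT  // sentinel: number of entries, must stay last
};

struct DemoSchemaTable {
  const char *table_name;
};

// Must list the tables in exactly the same order as the enum above.
static const DemoSchemaTable demo_schema_tables_array[] = {
    {"CHARACTER_SETS"},
    {"COLLATIONS"},
    {"COLUMNS"},
};

static_assert(sizeof(demo_schema_tables_array) /
                      sizeof(demo_schema_tables_array[0]) ==
                  DEMO_TABLE_COUNT,
              "array and enum must have the same number of entries");

// Lookup by enum value is a plain array index.
inline const DemoSchemaTable *get_demo_schema_table(demo_schema_tables idx) {
  return &demo_schema_tables_array[idx];
}
```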
|
|
|
|
|
|
|
|
|
2007-04-25 14:15:05 +02:00
|
|
|
#define MY_I_S_MAYBE_NULL 1
|
|
|
|
#define MY_I_S_UNSIGNED 2
|
|
|
|
|
|
|
|
|
2007-08-03 00:14:05 +02:00
|
|
|
#define SKIP_OPEN_TABLE 0 // do not open table
|
|
|
|
#define OPEN_FRM_ONLY 1 // open FRM file only
|
|
|
|
#define OPEN_FULL_TABLE 2 // open FRM,MYD, MYI files
|
|
|
|
|
2004-11-13 11:56:39 +01:00
|
|
|
typedef struct st_field_info
|
|
|
|
{
|
2008-03-07 13:56:15 +01:00
|
|
|
/**
|
|
|
|
This is used as column name.
|
|
|
|
*/
|
2004-11-13 11:56:39 +01:00
|
|
|
const char* field_name;
|
2008-03-07 13:56:15 +01:00
|
|
|
/**
|
|
|
|
For string-type columns, this is the maximum number of
|
|
|
|
characters. Otherwise, it is the 'display-length' for the column.
|
|
|
|
*/
|
2004-11-13 11:56:39 +01:00
|
|
|
uint field_length;
|
2008-03-07 13:56:15 +01:00
|
|
|
/**
|
|
|
|
This denotes data type for the column. For the most part, there seems to
|
|
|
|
be one entry in the enum for each SQL data type, although there seem to
|
|
|
|
be a number of additional entries in the enum.
|
|
|
|
*/
|
2004-11-13 11:56:39 +01:00
|
|
|
enum enum_field_types field_type;
|
|
|
|
int value;
|
2008-03-07 13:56:15 +01:00
|
|
|
/**
|
|
|
|
This is used to set column attributes. By default, columns are @c NOT
|
|
|
|
@c NULL and @c SIGNED, and you can deviate from the default
|
|
|
|
by setting the appropriate flags. You can use either one of the flags
|
|
|
|
@c MY_I_S_MAYBE_NULL and @c MY_I_S_UNSIGNED or
|
|
|
|
combine them using the bitwise or operator @c |. Both flags are
|
|
|
|
defined in table.h.
|
|
|
|
*/
|
2007-04-25 14:15:05 +02:00
|
|
|
uint field_flags; // Field attributes (maybe_null, signed, unsigned etc.)
|
2004-11-13 11:56:39 +01:00
|
|
|
const char* old_name;
|
2008-03-07 13:56:15 +01:00
|
|
|
/**
|
|
|
|
This should be one of @c SKIP_OPEN_TABLE,
|
|
|
|
@c OPEN_FRM_ONLY or @c OPEN_FULL_TABLE.
|
|
|
|
*/
|
2007-08-03 00:14:05 +02:00
|
|
|
uint open_method;
|
2004-11-13 11:56:39 +01:00
|
|
|
} ST_FIELD_INFO;
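
The field_flags comment describes a small bit-flag scheme: no flags means NOT NULL and SIGNED, and the two flags can be combined with bitwise or. The sketch below illustrates that scheme on a reduced, hypothetical DemoFieldInfo struct; kMaybeNull and kUnsigned mirror the values of MY_I_S_MAYBE_NULL and MY_I_S_UNSIGNED, and the column names are made up.

```cpp
#include <cassert>

constexpr unsigned kMaybeNull = 1;  // same value as MY_I_S_MAYBE_NULL
constexpr unsigned kUnsigned = 2;   // same value as MY_I_S_UNSIGNED

// Reduced, hypothetical version of a field description.
struct DemoFieldInfo {
  const char *field_name;
  unsigned field_length;
  unsigned field_flags;
};

inline bool is_nullable(const DemoFieldInfo &f) {
  return (f.field_flags & kMaybeNull) != 0;
}

inline bool is_unsigned(const DemoFieldInfo &f) {
  return (f.field_flags & kUnsigned) != 0;
}

// Illustrative column definitions only.
static const DemoFieldInfo demo_fields[] = {
    {"ID", 21, kUnsigned},                      // NOT NULL, UNSIGNED
    {"COMMENT", 64, kMaybeNull},                // NULL allowed, SIGNED
    {"ROWS_SEEN", 21, kMaybeNull | kUnsigned},  // both flags combined
    {"NAME", 64, 0},                            // default: NOT NULL, SIGNED
};
```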
|
|
|
|
|
2005-01-06 12:00:13 +01:00
|
|
|
|
2007-07-06 14:18:49 +02:00
|
|
|
struct TABLE_LIST;
|
2004-11-13 11:56:39 +01:00
|
|
|
typedef class Item COND;
|
|
|
|
|
|
|
|
typedef struct st_schema_table
|
|
|
|
{
|
|
|
|
const char* table_name;
|
|
|
|
ST_FIELD_INFO *fields_info;
|
|
|
|
/* Create information_schema table */
|
2007-07-06 14:18:49 +02:00
|
|
|
TABLE *(*create_table) (THD *thd, TABLE_LIST *table_list);
|
2004-11-13 11:56:39 +01:00
|
|
|
/* Fill table with data */
|
2007-07-06 14:18:49 +02:00
|
|
|
int (*fill_table) (THD *thd, TABLE_LIST *tables, COND *cond);
|
2004-11-13 11:56:39 +01:00
|
|
|
/* Handle fields for old SHOW */
|
|
|
|
int (*old_format) (THD *thd, struct st_schema_table *schema_table);
|
2007-08-03 00:14:05 +02:00
|
|
|
int (*process_table) (THD *thd, TABLE_LIST *tables, TABLE *table,
|
|
|
|
bool res, LEX_STRING *db_name, LEX_STRING *table_name);
|
2004-11-13 11:56:39 +01:00
|
|
|
int idx_field1, idx_field2;
|
2004-12-18 11:49:13 +01:00
|
|
|
bool hidden;
|
2007-08-03 00:14:05 +02:00
|
|
|
uint i_s_requested_object; /* the object we need to open(TABLE | VIEW) */
|
2004-11-13 11:56:39 +01:00
|
|
|
} ST_SCHEMA_TABLE;
|
|
|
|
|
|
|
|
|
2000-09-25 23:33:25 +02:00
|
|
|
#define JOIN_TYPE_LEFT 1
|
|
|
|
#define JOIN_TYPE_RIGHT 2
|
|
|
|
|
2005-07-01 06:05:42 +02:00
|
|
|
#define VIEW_ALGORITHM_UNDEFINED 0
|
|
|
|
#define VIEW_ALGORITHM_TMPTABLE 1
|
|
|
|
#define VIEW_ALGORITHM_MERGE 2
|
2004-07-16 00:15:55 +02:00
|
|
|
|
2006-07-31 16:33:37 +02:00
|
|
|
#define VIEW_SUID_INVOKER 0
|
|
|
|
#define VIEW_SUID_DEFINER 1
|
|
|
|
#define VIEW_SUID_DEFAULT 2
|
|
|
|
|
2004-09-29 15:35:01 +02:00
|
|
|
/* view WITH CHECK OPTION parameter options */
|
2004-09-03 14:18:40 +02:00
|
|
|
#define VIEW_CHECK_NONE 0
|
|
|
|
#define VIEW_CHECK_LOCAL 1
|
|
|
|
#define VIEW_CHECK_CASCADED 2
|
|
|
|
|
2004-09-29 15:35:01 +02:00
|
|
|
/* result of view WITH CHECK OPTION parameter check */
|
|
|
|
#define VIEW_CHECK_OK 0
|
|
|
|
#define VIEW_CHECK_ERROR 1
|
|
|
|
#define VIEW_CHECK_SKIP 2
|
|
|
|
|
2008-07-24 22:38:44 +02:00
|
|
|
/** The threshold size of a blob field buffer before it is freed */
|
|
|
|
#define MAX_TDC_BLOB_SIZE 65536
|
|
|
|
|
2004-11-17 11:45:05 +01:00
|
|
|
class select_union;
|
2005-03-24 14:32:11 +01:00
|
|
|
class TMP_TABLE_PARAM;
|
2005-01-06 12:00:13 +01:00
|
|
|
|
2007-07-06 14:18:49 +02:00
|
|
|
Item *create_view_field(THD *thd, TABLE_LIST *view, Item **field_ref,
|
2005-07-01 06:05:42 +02:00
|
|
|
const char *name);
|
|
|
|
|
2004-09-14 18:28:29 +02:00
|
|
|
struct Field_translator
|
|
|
|
{
|
|
|
|
Item *item;
|
|
|
|
const char *name;
|
|
|
|
};
|
2004-07-16 00:15:55 +02:00
|
|
|
|
2005-01-06 12:00:13 +01:00
|
|
|
|
2005-08-12 16:57:19 +02:00
|
|
|
/*
|
|
|
|
Column reference of a NATURAL/USING join. Since column references in
|
|
|
|
joins can be both from views and stored tables, may point to either a
|
|
|
|
Field (for tables), or a Field_translator (for views).
|
|
|
|
*/
|
|
|
|
|
2005-08-17 16:19:31 +02:00
|
|
|
class Natural_join_column: public Sql_alloc
|
2005-08-12 16:57:19 +02:00
|
|
|
{
|
|
|
|
public:
|
|
|
|
Field_translator *view_field; /* Column reference of merge view. */
|
2008-10-07 23:34:00 +02:00
|
|
|
Item_field *table_field; /* Column reference of table or temp view. */
|
2007-07-06 14:18:49 +02:00
|
|
|
TABLE_LIST *table_ref; /* Original base table/view reference. */
|
2005-08-12 16:57:19 +02:00
|
|
|
/*
|
|
|
|
True if a common join column of two NATURAL/USING join operands. Notice
|
|
|
|
that when we have a hierarchy of nested NATURAL/USING joins, a column can
|
|
|
|
be common at some level of nesting but it may not be common at higher
|
|
|
|
levels of nesting. Thus this flag may change depending on at which level
|
|
|
|
we are looking at some column.
|
|
|
|
*/
|
|
|
|
bool is_common;
|
|
|
|
public:
|
2007-07-06 14:18:49 +02:00
|
|
|
Natural_join_column(Field_translator *field_param, TABLE_LIST *tab);
|
2008-10-07 23:34:00 +02:00
|
|
|
Natural_join_column(Item_field *field_param, TABLE_LIST *tab);
|
2005-08-12 16:57:19 +02:00
|
|
|
const char *name();
|
|
|
|
Item *create_item(THD *thd);
|
|
|
|
Field *field();
|
|
|
|
const char *table_name();
|
|
|
|
const char *db_name();
|
|
|
|
GRANT_INFO *grant();
|
|
|
|
};
|
|
|
|
|
|
|
|
|
|
|
|
/*
|
|
|
|
Table reference in the FROM clause.
|
|
|
|
|
|
|
|
These table references can be of several types that correspond to
|
|
|
|
different SQL elements. Below we list all types of TABLE_LISTs with
|
|
|
|
the necessary conditions to determine when a TABLE_LIST instance
|
|
|
|
belongs to a certain type.
|
|
|
|
|
|
|
|
1) table (TABLE_LIST::view == NULL)
|
|
|
|
- base table
|
|
|
|
(TABLE_LIST::derived == NULL)
|
|
|
|
- subquery - TABLE_LIST::table is a temp table
|
|
|
|
(TABLE_LIST::derived != NULL)
|
|
|
|
- information schema table
|
|
|
|
(TABLE_LIST::schema_table != NULL)
|
|
|
|
NOTICE: for schema tables TABLE_LIST::field_translation may be != NULL
|
|
|
|
2) view (TABLE_LIST::view != NULL)
|
|
|
|
- merge (TABLE_LIST::effective_algorithm == VIEW_ALGORITHM_MERGE)
|
|
|
|
also (TABLE_LIST::field_translation != NULL)
|
|
|
|
- tmptable (TABLE_LIST::effective_algorithm == VIEW_ALGORITHM_TMPTABLE)
|
|
|
|
also (TABLE_LIST::field_translation == NULL)
|
|
|
|
3) nested table reference (TABLE_LIST::nested_join != NULL)
|
|
|
|
- table sequence - e.g. (t1, t2, t3)
|
|
|
|
TODO: how to distinguish from a JOIN?
|
|
|
|
- general JOIN
|
|
|
|
TODO: how to distinguish from a table sequence?
|
|
|
|
- NATURAL JOIN
|
|
|
|
(TABLE_LIST::natural_join != NULL)
|
|
|
|
- JOIN ... USING
|
|
|
|
(TABLE_LIST::join_using_fields != NULL)
|
|
|
|
*/
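
The classification in the comment above can be read as a decision procedure over a few members. The sketch below is one possible reading, under assumptions: DemoTableRef is a hypothetical reduction of TABLE_LIST to boolean "is non-NULL" fields, and the order in which the checks are applied is an assumption, since the comment does not state a precedence. The view check is placed before the derived check because the documentation of the derived member says it is also set for TEMPTABLE views.

```cpp
#include <cassert>

constexpr int kAlgorithmUndefined = 0;  // same value as VIEW_ALGORITHM_UNDEFINED
constexpr int kAlgorithmTmptable = 1;   // same value as VIEW_ALGORITHM_TMPTABLE
constexpr int kAlgorithmMerge = 2;      // same value as VIEW_ALGORITHM_MERGE

enum class RefKind {
  BaseTable,
  DerivedTable,
  SchemaTable,
  MergeView,
  TmptableView,
  NestedJoin
};

// Hypothetical reduction: each bool stands for "pointer member != NULL".
struct DemoTableRef {
  bool has_view = false;
  bool has_derived = false;
  bool has_schema_table = false;
  bool has_nested_join = false;
  int effective_algorithm = kAlgorithmUndefined;
};

// Check order is an assumption made for this sketch.
inline RefKind classify(const DemoTableRef &ref) {
  if (ref.has_nested_join) return RefKind::NestedJoin;
  if (ref.has_view)
    return ref.effective_algorithm == kAlgorithmMerge ? RefKind::MergeView
                                                      : RefKind::TmptableView;
  if (ref.has_schema_table) return RefKind::SchemaTable;
  if (ref.has_derived) return RefKind::DerivedTable;
  return RefKind::BaseTable;
}
```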
|
|
|
|
|
2009-10-14 13:14:58 +02:00
|
|
|
struct LEX;
|
2007-07-23 18:09:48 +02:00
|
|
|
class Index_hint;
|
2007-07-06 14:18:49 +02:00
|
|
|
struct TABLE_LIST
|
2002-06-11 10:20:31 +02:00
|
|
|
{
|
2007-07-06 14:18:49 +02:00
|
|
|
TABLE_LIST() {} /* Remove gcc warning */
|
2007-04-05 13:24:34 +02:00
|
|
|
|
|
|
|
/**
|
|
|
|
Prepare TABLE_LIST that consists of one table instance to use in
|
|
|
|
simple_open_and_lock_tables
|
|
|
|
*/
|
|
|
|
inline void init_one_table(const char *db_name_arg,
|
2009-12-02 16:37:10 +01:00
|
|
|
size_t db_length_arg,
|
2007-04-05 13:24:34 +02:00
|
|
|
const char *table_name_arg,
|
2009-12-02 16:37:10 +01:00
|
|
|
size_t table_name_length_arg,
|
2009-12-01 14:27:03 +01:00
|
|
|
const char *alias_arg,
|
2007-04-05 13:24:34 +02:00
|
|
|
enum thr_lock_type lock_type_arg)
|
|
|
|
{
|
|
|
|
bzero((char*) this, sizeof(*this));
|
|
|
|
db= (char*) db_name_arg;
|
2009-12-02 16:37:10 +01:00
|
|
|
db_length= db_length_arg;
|
2009-12-01 14:27:03 +01:00
|
|
|
table_name= (char*) table_name_arg;
|
2009-12-02 16:37:10 +01:00
|
|
|
table_name_length= table_name_length_arg;
|
2009-12-01 14:27:03 +01:00
|
|
|
alias= (char*) alias_arg;
|
2007-04-05 13:24:34 +02:00
|
|
|
lock_type= lock_type_arg;
|
2009-12-08 10:57:07 +01:00
|
|
|
mdl_request.init(0, db, table_name, MDL_SHARED);
|
2007-04-05 13:24:34 +02:00
|
|
|
}
|
|
|
|
|
2005-08-12 16:57:19 +02:00
|
|
|
/*
|
|
|
|
List of tables local to a subquery (used by SQL_LIST). Considers
|
|
|
|
views as leaves (unlike 'next_leaf' below). Created at parse time
|
|
|
|
in st_select_lex::add_table_to_list() -> table_list.link_in_list().
|
|
|
|
*/
|
2007-07-06 14:18:49 +02:00
|
|
|
TABLE_LIST *next_local;
|
2004-07-16 00:15:55 +02:00
|
|
|
/* link in a global list of all tables of the query */
|
2007-07-06 14:18:49 +02:00
|
|
|
TABLE_LIST *next_global, **prev_global;
|
2005-01-06 12:00:13 +01:00
|
|
|
char *db, *alias, *table_name, *schema_table_name;
|
2004-07-01 22:46:43 +02:00
|
|
|
char *option; /* Used by cache index */
|
2002-09-20 13:05:18 +02:00
|
|
|
Item *on_expr; /* Used with outer join */
|
2005-03-03 15:38:59 +01:00
|
|
|
/*
|
2005-08-12 16:57:19 +02:00
|
|
|
The structure of ON expression presented in the member above
|
2005-03-03 15:38:59 +01:00
|
|
|
can be changed during certain optimizations. This member
|
|
|
|
contains a snapshot of AND-OR structure of the ON expression
|
|
|
|
made after permanent transformations of the parse tree, and is
|
|
|
|
used to restore ON clause before every reexecution of a prepared
|
|
|
|
statement or stored procedure.
|
|
|
|
*/
|
|
|
|
Item *prep_on_expr;
|
2004-10-19 23:12:55 +02:00
|
|
|
COND_EQUAL *cond_equal; /* Used with outer join */
|
2005-08-12 16:57:19 +02:00
|
|
|
/*
|
|
|
|
During parsing - left operand of NATURAL/USING join where 'this' is
|
|
|
|
the right operand. After parsing (this->natural_join == this) iff
|
|
|
|
'this' represents a NATURAL or USING join operation. Thus after
|
|
|
|
parsing 'this' is a NATURAL/USING join iff (natural_join != NULL).
|
|
|
|
*/
|
2007-07-06 14:18:49 +02:00
|
|
|
TABLE_LIST *natural_join;
|
2005-08-12 16:57:19 +02:00
|
|
|
/*
|
|
|
|
True if 'this' represents a nested join that is a NATURAL JOIN.
|
|
|
|
For one of the operands of 'this', the member 'natural_join' points
|
|
|
|
to the other operand of 'this'.
|
|
|
|
*/
|
|
|
|
bool is_natural_join;
|
|
|
|
/* Field names in a USING clause for JOIN ... USING. */
|
|
|
|
List<String> *join_using_fields;
|
|
|
|
/*
|
|
|
|
Explicitly store the result columns of either a NATURAL/USING join or
|
|
|
|
an operand of such a join.
|
|
|
|
*/
|
|
|
|
List<Natural_join_column> *join_columns;
|
|
|
|
/* TRUE if join_columns contains all columns of this table reference. */
|
|
|
|
bool is_join_columns_complete;
|
|
|
|
|
|
|
|
/*
|
|
|
|
List of nodes in a nested join tree, that should be considered as
|
|
|
|
leaves with respect to name resolution. The leaves are: views,
|
|
|
|
top-most nodes representing NATURAL/USING joins, subqueries, and
|
|
|
|
base tables. All of these TABLE_LIST instances contain a
|
|
|
|
materialized list of columns. The list is local to a subquery.
|
|
|
|
*/
|
2007-07-06 14:18:49 +02:00
|
|
|
TABLE_LIST *next_name_resolution_table;
|
2005-08-12 16:57:19 +02:00
|
|
|
/* Index names in a "... JOIN ... USE/IGNORE INDEX ..." clause. */
|
2007-07-23 18:09:48 +02:00
|
|
|
List<Index_hint> *index_hints;
|
2006-02-16 08:30:53 +01:00
|
|
|
TABLE *table; /* opened table */
|
|
|
|
uint table_id; /* table id (from binlog) for opened table */
|
2004-11-05 16:29:47 +01:00
|
|
|
/*
|
|
|
|
select_result for derived table to pass it from table creation to table
|
|
|
|
filling procedure
|
|
|
|
*/
|
|
|
|
select_union *derived_result;
|
2004-07-16 00:15:55 +02:00
|
|
|
/*
|
|
|
|
Reference from aux_tables to local list entry of main select of
|
|
|
|
multi-delete statement:
|
|
|
|
delete t1 from t2,t1 where t1.a<'B' and t2.b=t1.b;
|
|
|
|
here it will be reference of first occurrence of t1 to second (as you
|
|
|
|
can see these lists can't be merged)
|
|
|
|
*/
|
2007-07-06 14:18:49 +02:00
|
|
|
TABLE_LIST *correspondent_table;
|
2008-09-03 16:45:40 +02:00
|
|
|
/**
|
|
|
|
@brief Normally, this field is non-null for anonymous derived tables only.
|
|
|
|
|
|
|
|
@details This field is set to non-null for
|
|
|
|
|
|
|
|
- Anonymous derived tables. In this case it points to the SELECT_LEX_UNIT
|
|
|
|
representing the derived table. E.g. for a query
|
|
|
|
|
|
|
|
@verbatim SELECT * FROM (SELECT a FROM t1) b @endverbatim
|
|
|
|
|
|
|
|
For the @c TABLE_LIST representing the derived table @c b, @c derived
|
|
|
|
points to the SELECT_LEX_UNIT representing the result of the query within
|
|
|
|
parentheses.
|
|
|
|
|
|
|
|
- Views. This is set for views with @verbatim ALGORITHM = TEMPTABLE
|
|
|
|
@endverbatim by mysql_make_view().
|
|
|
|
|
|
|
|
@note Inside views, a subquery in the @c FROM clause is not allowed.
|
|
|
|
@note Do not use this field to separate views/base tables/anonymous
|
|
|
|
derived tables. Use TABLE_LIST::is_anonymous_derived_table().
|
|
|
|
*/
|
2004-07-16 00:15:55 +02:00
|
|
|
st_select_lex_unit *derived; /* SELECT_LEX_UNIT of derived table */
|
2004-11-13 11:56:39 +01:00
|
|
|
ST_SCHEMA_TABLE *schema_table; /* Information_schema table */
|
|
|
|
st_select_lex *schema_select_lex;
|
2005-08-12 16:57:19 +02:00
|
|
|
/*
|
|
|
|
True when the view field translation table is used to convert
|
|
|
|
schema table fields for backwards compatibility with SHOW command.
|
|
|
|
*/
|
2005-01-24 16:44:54 +01:00
|
|
|
bool schema_table_reformed;
|
2005-03-24 14:32:11 +01:00
|
|
|
TMP_TABLE_PARAM *schema_table_param;
|
2004-07-16 00:15:55 +02:00
|
|
|
/* link to select_lex where this table was used */
|
|
|
|
st_select_lex *select_lex;
|
2009-10-14 13:14:58 +02:00
|
|
|
LEX *view; /* link on VIEW lex for merging */
|
2004-09-14 18:28:29 +02:00
|
|
|
Field_translator *field_translation; /* array of VIEW fields */
|
2005-07-01 06:05:42 +02:00
|
|
|
/* pointer to element after last one in translation table above */
|
|
|
|
Field_translator *field_translation_end;
|
2005-10-27 23:18:23 +02:00
|
|
|
/*
|
|
|
|
List (based on next_local) of underlying tables of this view. I.e. it
|
|
|
|
does not include the tables of subqueries used in the view. Is set only
|
|
|
|
for merged views.
|
|
|
|
*/
|
2007-07-06 14:18:49 +02:00
|
|
|
TABLE_LIST *merge_underlying_list;
|
2005-10-27 23:18:23 +02:00
|
|
|
/*
|
|
|
|
- 0 for base tables
|
|
|
|
- for a view, the list of all tables of the view (not only the
|
|
|
|
underlying tables but also those used in subqueries).
|
|
|
|
*/
|
2007-07-06 14:18:49 +02:00
|
|
|
List<TABLE_LIST> *view_tables;
|
2004-07-21 03:26:20 +02:00
|
|
|
/* most upper view this table belongs to */
|
2007-07-06 14:18:49 +02:00
|
|
|
TABLE_LIST *belong_to_view;
|
2005-10-27 23:18:23 +02:00
|
|
|
/*
|
|
|
|
The view directly referencing this table
|
|
|
|
(non-zero only for merged underlying tables of a view).
|
|
|
|
*/
|
2007-07-06 14:18:49 +02:00
|
|
|
TABLE_LIST *referencing_view;
|
Bug#26379 - Combination of FLUSH TABLE and REPAIR TABLE
corrupts a MERGE table
Bug 26867 - LOCK TABLES + REPAIR + merge table result in
memory/cpu hogging
Bug 26377 - Deadlock with MERGE and FLUSH TABLE
Bug 25038 - Waiting TRUNCATE
Bug 25700 - merge base tables get corrupted by
optimize/analyze/repair table
Bug 30275 - Merge tables: flush tables or unlock tables
causes server to crash
Bug 19627 - temporary merge table locking
Bug 27660 - Falcon: merge table possible
Bug 30273 - merge tables: Can't lock file (errno: 155)
The problems were:
Bug 26379 - Combination of FLUSH TABLE and REPAIR TABLE
corrupts a MERGE table
1. A thread trying to lock a MERGE table performs busy waiting while
REPAIR TABLE or a similar table administration task is ongoing on
one or more of its MyISAM tables.
2. A thread trying to lock a MERGE table performs busy waiting until all
threads that did REPAIR TABLE or similar table administration tasks
on one or more of its MyISAM tables in LOCK TABLES segments do UNLOCK
TABLES. The difference against problem #1 is that the busy waiting
takes place *after* the administration task. It is terminated by
UNLOCK TABLES only.
3. Two FLUSH TABLES within a LOCK TABLES segment can invalidate the
lock. This does *not* require a MERGE table. The first FLUSH TABLES
can be replaced by any statement that requires other threads to
reopen the table. In 5.0 and 5.1 a single FLUSH TABLES can provoke
the problem.
Bug 26867 - LOCK TABLES + REPAIR + merge table result in
memory/cpu hogging
Trying DML on a MERGE table, which has a child locked and
repaired by another thread, made an infinite loop in the server.
Bug 26377 - Deadlock with MERGE and FLUSH TABLE
Locking a MERGE table and its children in parent-child order
and flushing the child deadlocked the server.
Bug 25038 - Waiting TRUNCATE
Truncating a MERGE child, while the MERGE table was in use,
let the truncate fail instead of waiting for the table to
become free.
Bug 25700 - merge base tables get corrupted by
optimize/analyze/repair table
Repairing a child of an open MERGE table corrupted the child.
It was necessary to FLUSH the child first.
Bug 30275 - Merge tables: flush tables or unlock tables
causes server to crash
Flushing and optimizing locked MERGE children crashed the server.
Bug 19627 - temporary merge table locking
Use of a temporary MERGE table with non-temporary children
could corrupt the children.
Temporary tables are never locked. So we do now prohibit
non-temporary children of a temporary MERGE table.
Bug 27660 - Falcon: merge table possible
It was possible to create a MERGE table with non-MyISAM children.
Bug 30273 - merge tables: Can't lock file (errno: 155)
This was a Windows-only bug. Table administration statements
sometimes failed with "Can't lock file (errno: 155)".
These bugs are fixed by a new implementation of MERGE table open.
When opening a MERGE table in open_tables() we do now add the
child tables to the list of tables to be opened by open_tables()
(the "query_list"). The children are not opened in the handler at
this stage.
After opening the parent, open_tables() opens each child from the
now extended query_list. When the last child is opened, we remove
the children from the query_list again and attach the children to
the parent. This behaves similar to the old open. However it does
not open the MyISAM tables directly, but grabs them from the already
open children.
When closing a MERGE table in close_thread_table() we detach the
children only. Closing of the children is done implicitly because
they are in thd->open_tables.
For more detail see the comment at the top of ha_myisammrg.cc.
Changed from open_ltable() to open_and_lock_tables() in all places
that can be relevant for MERGE tables. The latter can handle tables
added to the list on the fly. When open_ltable() was used in a loop
over a list of tables, the list must be temporarily terminated
after every table for open_and_lock_tables().
table_list->required_type is set to FRMTYPE_TABLE to avoid open of
special tables. Handling of derived tables is suppressed.
These details are handled by the new function
open_n_lock_single_table(), which has nearly the same signature as
open_ltable() and can replace it in most cases.
In reopen_tables() some of the tables open by a thread can be
closed and reopened. When a MERGE child is affected, the parent
must be closed and reopened too. Closing of the parent is forced
before the first child is closed. Reopen happens in the order of
thd->open_tables. MERGE parents do not attach their children
automatically at open. This is done after all tables are reopened.
So all children are open when attaching them.
Special lock handling like mysql_lock_abort() or mysql_lock_remove()
needs to be suppressed for MERGE children or forwarded to the parent.
This depends on the situation. In loops over all open tables one
suppresses child lock handling. When a single table is touched,
forwarding is done.
Behavioral changes:
===================
This patch changes the behavior of temporary MERGE tables.
Temporary MERGE must have temporary children.
The old behavior was wrong. A temporary table is not locked. Hence
even non-temporary children were not locked. See
Bug 19627 - temporary merge table locking.
You cannot change the union list of a non-temporary MERGE table
when LOCK TABLES is in effect. The following does *not* work:
CREATE TABLE m1 ... ENGINE=MRG_MYISAM ...;
LOCK TABLES t1 WRITE, t2 WRITE, m1 WRITE;
ALTER TABLE m1 ... UNION=(t1,t2) ...;
However, you can do this with a temporary MERGE table.
You cannot create a MERGE table with CREATE ... SELECT, neither
as a temporary MERGE table, nor as a non-temporary MERGE table.
CREATE TABLE m1 ... ENGINE=MRG_MYISAM ... SELECT ...;
Gives error message: table is not BASE TABLE.
2007-11-15 20:25:43 +01:00
|
|
|
/* Ptr to parent MERGE table list item. See top comment in ha_myisammrg.cc */
|
|
|
|
TABLE_LIST *parent_l;
|
2005-10-27 23:18:23 +02:00
|
|
|
/*
|
2005-10-29 11:11:34 +02:00
|
|
|
Security context (non-zero only for tables which belong
|
|
|
|
to view with SQL SECURITY DEFINER)
|
2005-10-27 23:18:23 +02:00
|
|
|
*/
|
|
|
|
Security_context *security_ctx;
|
|
|
|
/*
|
2005-10-29 11:11:34 +02:00
|
|
|
This view security context (non-zero only for views with
|
|
|
|
SQL SECURITY DEFINER)
|
2005-10-27 23:18:23 +02:00
|
|
|
*/
|
|
|
|
Security_context *view_sctx;
|
2005-08-12 16:57:19 +02:00
|
|
|
/*
|
|
|
|
List of all base tables local to a subquery including all view
|
|
|
|
tables. Unlike 'next_local', in this list views are *not*
|
|
|
|
leaves. Created in setup_tables() -> make_leaves_list().
|
|
|
|
*/
|
2006-07-25 14:23:25 +02:00
|
|
|
bool allowed_show;
|
2007-07-06 14:18:49 +02:00
|
|
|
TABLE_LIST *next_leaf;
|
2004-07-16 00:15:55 +02:00
|
|
|
Item *where; /* VIEW WHERE clause condition */
|
2004-09-03 14:18:40 +02:00
|
|
|
Item *check_option; /* WITH CHECK OPTION condition */
|
Patch for the following bugs:
- BUG#11986: Stored routines and triggers can fail if the code
has a non-ascii symbol
- BUG#16291: mysqldump corrupts string-constants with non-ascii-chars
- BUG#19443: INFORMATION_SCHEMA does not support charsets properly
- BUG#21249: Character set of SP-var can be ignored
- BUG#25212: Character set of string constant is ignored (stored routines)
- BUG#25221: Character set of string constant is ignored (triggers)
There were a few general problems that caused these bugs:
1. Character set information of the original (definition) query for views,
triggers, stored routines and events was lost.
2. mysqldump output the query in the client character set, which can be
inappropriate for encoding the definition query.
3. INFORMATION_SCHEMA used strings with mixed encodings to display object
definition;
1. No query-definition-character set.
In order to compile query into execution code, some extra data (such as
environment variables or the database character set) is used. The problem
here was that this context was not preserved. So, on the next load it can
differ from the original one, thus the result will be different.
The context contains the following data:
- client character set;
- connection collation (character set and collation);
- collation of the owner database;
The fix is to store this context and use it each time we parse (compile)
and execute the object (stored routine, trigger, ...).
2. Wrong mysqldump-output.
The original query can contain several encodings (by means of character set
introducers). The problem here was that we tried to convert original query
to the mysqldump-client character set.
Moreover, we stored queries in different character sets for different
objects (views, for one, used UTF8, triggers used original character set).
The solution is
- to store definition queries in the original character set;
- to change SHOW CREATE statement to output definition query in the
binary character set (i.e. without any conversion);
- introduce SHOW CREATE TRIGGER statement;
- to dump special statements to switch the context to the original one
before dumping and restore it afterwards.
Note, in order to preserve the database collation at the creation time,
additional ALTER DATABASE might be used (to temporarily switch the database
collation back to the original value). In this case, ALTER DATABASE
privilege will be required. This is a backward-incompatible change.
3. INFORMATION_SCHEMA showed non-UTF8 strings
The fix is to generate UTF8-query during the parsing, store it in the object
and show it in the INFORMATION_SCHEMA.
Basically, the idea is to create a copy of the original query and convert it to
UTF8. Character set introducers are removed and all text literals are
converted to UTF8.
This UTF8 query is intended to provide user-readable output. It must not be
used to recreate the object. Specialized SHOW CREATE statements should be
used for this.
The reason for this limitation is the following: the original query can
contain symbols from several character sets (by means of character set
introducers).
Example:
- original query:
CREATE VIEW v1 AS SELECT _cp1251 'Hello' AS c1;
- UTF8 query (for INFORMATION_SCHEMA):
CREATE VIEW v1 AS SELECT 'Hello' AS c1;
2007-06-28 19:34:54 +02:00
|
|
|
LEX_STRING select_stmt; /* text of (CREATE/SELECT) statement */
|
2004-10-07 00:45:06 +02:00
|
|
|
LEX_STRING md5; /* md5 of query text */
|
2004-07-16 00:15:55 +02:00
|
|
|
LEX_STRING source; /* source of CREATE VIEW */
|
2004-10-07 00:45:06 +02:00
|
|
|
LEX_STRING view_db; /* saved view database */
|
|
|
|
LEX_STRING view_name; /* saved view name */
|
2004-07-16 00:15:55 +02:00
|
|
|
LEX_STRING timestamp; /* GMT time stamp of last operation */
|
2005-09-14 09:53:09 +02:00
|
|
|
st_lex_user definer; /* definer of view */
|
2004-07-16 00:15:55 +02:00
|
|
|
ulonglong file_version; /* version of file's field set */
|
2004-09-03 20:38:01 +02:00
|
|
|
ulonglong updatable_view; /* VIEW can be updated */
|
2008-09-03 16:45:40 +02:00
|
|
|
/**
|
|
|
|
@brief The declared algorithm, if this is a view.
|
|
|
|
@details One of
|
|
|
|
- VIEW_ALGORITHM_UNDEFINED
|
|
|
|
- VIEW_ALGORITHM_TMPTABLE
|
|
|
|
- VIEW_ALGORITHM_MERGE
|
|
|
|
@todo Replace with an enum
|
|
|
|
*/
|
|
|
|
ulonglong algorithm;
|
2005-09-14 09:53:09 +02:00
|
|
|
ulonglong view_suid; /* view is suid (TRUE by default) */
|
2004-09-03 14:18:40 +02:00
|
|
|
ulonglong with_check; /* WITH CHECK OPTION */
|
2004-10-07 14:43:04 +02:00
|
|
|
/*
|
|
|
|
effective value of WITH CHECK OPTION (differs for temporary table
|
|
|
|
algorithm)
|
|
|
|
*/
|
|
|
|
uint8 effective_with_check;
|
2008-09-03 16:45:40 +02:00
|
|
|
/**
|
|
|
|
@brief The view algorithm that is actually used, if this is a view.
|
|
|
|
@details One of
|
|
|
|
- VIEW_ALGORITHM_UNDEFINED
|
|
|
|
- VIEW_ALGORITHM_TMPTABLE
|
|
|
|
- VIEW_ALGORITHM_MERGE
|
|
|
|
@todo Replace with an enum
|
|
|
|
*/
|
|
|
|
uint8 effective_algorithm;
|
2004-07-16 00:15:55 +02:00
|
|
|
GRANT_INFO grant;
|
2004-11-24 12:56:51 +01:00
|
|
|
/* data needed by some engines in query cache */
|
|
|
|
ulonglong engine_data;
|
|
|
|
/* call back function for asking handler about caching in query cache */
|
|
|
|
qc_engine_callback callback_func;
|
2000-09-25 23:33:25 +02:00
|
|
|
thr_lock_type lock_type;
|
2002-09-20 13:05:18 +02:00
|
|
|
uint outer_join; /* Which join type */
|
2004-06-25 15:52:01 +02:00
|
|
|
uint shared; /* Used in multi-upd */
|
WL#3817: Simplify string / memory area types and make things more consistent (first part)
The following type conversions were done:
- Changed byte to uchar
- Changed gptr to uchar*
- Change my_string to char *
- Change my_size_t to size_t
- Change size_s to size_t
Removed declaration of byte, gptr, my_string, my_size_t and size_s.
The following function parameter changes were done:
- All string functions in mysys/strings were changed to use size_t
instead of uint for string lengths.
- All read()/write() functions changed to use size_t (including vio).
- All protocol functions changed to use size_t instead of uint
- Functions that used a pointer to a string length were changed to use size_t*
- Changed malloc(), free() and related functions from using gptr to use void *
as this requires fewer casts in the code and is more in line with how the
standard functions work.
- Added extra length argument to dirname_part() to return the length of the
created string.
- Changed (at least) following functions to take uchar* as argument:
- db_dump()
- my_net_write()
- net_write_command()
- net_store_data()
- DBUG_DUMP()
- decimal2bin() & bin2decimal()
- Changed my_compress() and my_uncompress() to use size_t. Changed one
argument to my_uncompress() from a pointer to a value as we only return
one value (makes function easier to use).
- Changed type of 'pack_data' argument to packfrm() to avoid casts.
- Changed in readfrm() and writefrom(), ha_discover and handler::discover()
the type for argument 'frmdata' to uchar** to avoid casts.
- Changed most Field functions to use uchar* instead of char* (reduced a lot of
casts).
- Changed field->val_xxx(xxx, new_ptr) to take const pointers.
Other changes:
- Removed a lot of unneeded casts
- Added a few new casts required by other changes
- Added some casts to my_multi_malloc() arguments for safety (as string lengths
need to be uint, not size_t).
- Fixed all calls to hash-get-key functions to use size_t*. (Needed to be done
explicitly as this conflict was often hidden by casting the function to
hash_get_key).
- Changed some buffers to memory regions to uchar* to avoid casts.
- Changed some string lengths from uint to size_t.
- Changed field->ptr to be uchar* instead of char*. This allowed us to
get rid of a lot of casts.
- Some changes from true -> TRUE, false -> FALSE, unsigned char -> uchar
- Include zlib.h in some files as we needed declaration of crc32()
- Changed MY_FILE_ERROR to be (size_t) -1.
- Changed many variables to hold the result of my_read() / my_write() to be
size_t. This was needed to properly detect errors (which are
returned as (size_t) -1).
- Removed some very old VMS code
- Changed packfrm()/unpackfrm() to not be depending on uint size
(portability fix)
- Removed windows specific code to restore cursor position as this
causes slowdown on windows and we should not mix read() and pread()
calls anyway as this is not thread safe. Updated function comment to
reflect this. Changed function that depended on original behavior of
my_pwrite() to itself restore the cursor position (one such case).
- Added some missing checking of return value of malloc().
- Changed definition of MOD_PAD_CHAR_TO_FULL_LENGTH to avoid 'long' overflow.
- Changed type of table_def::m_size from my_size_t to ulong to reflect that
m_size is the number of elements in the array, not a string/memory
length.
- Moved THD::max_row_length() to table.cc (as it's not depending on THD).
Inlined max_row_length_blob() into this function.
- More function comments
- Fixed some compiler warnings when compiled without partitions.
- Removed setting of LEX_STRING() arguments in declaration (portability fix).
- Some trivial indentation/variable name changes.
- Some trivial code simplifications:
- Replaced some calls to alloc_root + memcpy to use
strmake_root()/strdup_root().
- Changed some calls from memdup() to strmake() (Safety fix)
- Simpler loops in client-simple.c
2007-05-10 11:59:39 +02:00
|
|
|
size_t db_length;
|
|
|
|
size_t table_name_length;
|
2004-07-22 16:52:04 +02:00
|
|
|
bool updatable; /* VIEW/TABLE can be updated now */
|
2002-09-20 13:05:18 +02:00
|
|
|
bool straight; /* optimize with prev table */
|
|
|
|
bool updating; /* for replicate-do/ignore table */
|
2004-06-11 07:27:21 +02:00
|
|
|
bool force_index; /* prefer index over table scan */
|
|
|
|
bool ignore_leaves; /* preload only non-leaf nodes */
|
|
|
|
table_map dep_tables; /* tables the table depends on */
|
2004-07-01 22:46:43 +02:00
|
|
|
table_map on_expr_dep_tables; /* tables the ON expression depends on */
|
2004-06-11 07:27:21 +02:00
|
|
|
struct st_nested_join *nested_join; /* if the element is a nested join */
|
2007-07-06 14:18:49 +02:00
|
|
|
TABLE_LIST *embedding; /* nested join containing the table */
|
|
|
|
List<TABLE_LIST> *join_list;/* join list the table belongs to */
|
2004-06-25 15:52:01 +02:00
|
|
|
bool cacheable_table; /* stop PS caching */
|
2004-10-07 00:45:06 +02:00
|
|
|
/* used in multi-upd/views privilege check */
|
2004-07-16 00:15:55 +02:00
|
|
|
bool table_in_first_from_clause;
|
|
|
|
bool skip_temporary; /* this table shouldn't be temporary */
|
2004-10-07 00:45:06 +02:00
|
|
|
/* TRUE if this merged view contains an auto_increment field */
|
2004-07-16 00:15:55 +02:00
|
|
|
bool contain_auto_increment;
|
2005-05-11 01:31:13 +02:00
|
|
|
bool multitable_view; /* TRUE iff this is multitable view */
|
2005-09-01 11:36:42 +02:00
|
|
|
bool compact_view_format; /* Use compact format for SHOW CREATE VIEW */
|
2005-07-01 06:05:42 +02:00
|
|
|
/* view where processed */
|
|
|
|
bool where_processed;
|
2007-05-31 23:15:40 +02:00
|
|
|
/* TRUE <=> VIEW CHECK OPTION expression has been processed */
|
|
|
|
bool check_option_processed;
|
2004-08-24 14:37:51 +02:00
|
|
|
/* FRMTYPE_ERROR if any type is acceptable */
|
|
|
|
enum frm_type_enum required_type;
|
2005-12-21 19:18:40 +01:00
|
|
|
handlerton *db_type; /* table_type for handler */
|
2004-07-16 00:15:55 +02:00
|
|
|
char timestamp_buffer[20]; /* buffer for timestamp (19+1) */
|
2005-03-04 14:35:28 +01:00
|
|
|
/*
|
|
|
|
This TABLE_LIST object is just a placeholder for prelocking; it will be
|
|
|
|
used for implicit LOCK TABLES only and won't be used in the real statement.
|
|
|
|
*/
|
|
|
|
bool prelocking_placeholder;
|
2009-11-30 23:13:06 +01:00
|
|
|
/**
|
|
|
|
Indicates that if TABLE_LIST object corresponds to the table/view
|
Backport of revno ## 2617.31.1, 2617.31.3, 2617.31.4, 2617.31.5,
2617.31.12, 2617.31.15, 2617.31.15, 2617.31.16, 2617.43.1
- initial changeset that introduced the fix for
Bug#989 and follow up fixes for all test suite failures
introduced in the initial changeset.
------------------------------------------------------------
revno: 2617.31.1
committer: Davi Arnaut <Davi.Arnaut@Sun.COM>
branch nick: 4284-6.0
timestamp: Fri 2009-03-06 19:17:00 -0300
message:
Bug#989: If DROP TABLE while there's an active transaction, wrong binlog order
WL#4284: Transactional DDL locking
Currently the MySQL server does not keep metadata locks on
schema objects for the duration of a transaction, thus failing
to guarantee the integrity of the schema objects being used
during the transaction and to protect them from concurrent
DDL operations. This also poses a problem for replication as
a DDL operation might be replicated even though there are
active transactions using the object being modified.
The solution is to defer the release of metadata locks until
an active transaction is either committed or rolled back. This
prevents other statements from modifying the table for the
entire duration of the transaction. This provides commitment
ordering for guaranteeing serializability across multiple
transactions.
- Incompatible change:
If MySQL's metadata locking system encounters a lock conflict,
the usual scheme is to use the try and back-off technique to
avoid deadlocks -- this scheme consists of releasing all locks
and trying to acquire them all in one go.
But in a transactional context this algorithm can't be utilized
as it's not possible to release locks acquired during the course
of the transaction without breaking the transaction commitments.
To avoid deadlocks in this case, the ER_LOCK_DEADLOCK will be
returned if a lock conflict is encountered during a transaction.
Let's consider an example:
A transaction has two statements that modify table t1, then table
t2, and then commits. The first statement of the transaction will
acquire a shared metadata lock on table t1, and it will be kept
until COMMIT to ensure serializability.
At the moment when the second statement attempts to acquire a
shared metadata lock on t2, a concurrent ALTER or DROP statement
might have locked t2 exclusively. The prescription of the current
locking protocol is that the acquirer of the shared lock backs off
-- gives up all his current locks and retries. This implies that
the entire multi-statement transaction has to be rolled back.
- Incompatible change:
FLUSH commands such as FLUSH PRIVILEGES and FLUSH TABLES WITH READ
LOCK won't cause locked tables to be implicitly unlocked anymore.
2009-12-05 00:02:48 +01:00
|
|
|
which requires special handling.
|
2009-11-30 16:55:03 +01:00
|
|
|
*/
|
2009-11-30 23:13:06 +01:00
|
|
|
enum
|
|
|
|
{
|
2009-12-05 00:02:48 +01:00
|
|
|
/* Normal open. */
|
|
|
|
OPEN_NORMAL= 0,
|
|
|
|
/* Associate a table share only if the table exists. */
|
|
|
|
OPEN_IF_EXISTS,
|
|
|
|
/* Don't associate a table share. */
|
|
|
|
OPEN_STUB
|
|
|
|
} open_strategy;
|
|
|
|
/**
|
|
|
|
Indicates the locking strategy for the object being opened:
|
|
|
|
whether the associated metadata lock is shared or exclusive.
|
|
|
|
*/
|
|
|
|
enum
|
|
|
|
{
|
|
|
|
/* Take a shared metadata lock before the object is opened. */
|
|
|
|
SHARED_MDL= 0,
|
2009-11-30 23:13:06 +01:00
|
|
|
/*
|
2009-12-05 00:02:48 +01:00
|
|
|
Take an exclusive metadata lock before the object is opened.
|
|
|
|
If opening is successful, downgrade to a shared lock.
|
2009-11-30 23:13:06 +01:00
|
|
|
*/
|
Backport of revno ## 2617.31.1, 2617.31.3, 2617.31.4, 2617.31.5,
2617.31.12, 2617.31.15, 2617.31.15, 2617.31.16, 2617.43.1
- initial changeset that introduced the fix for
Bug#989 and follow up fixes for all test suite failures
introduced in the initial changeset.
------------------------------------------------------------
revno: 2617.31.1
committer: Davi Arnaut <Davi.Arnaut@Sun.COM>
branch nick: 4284-6.0
timestamp: Fri 2009-03-06 19:17:00 -0300
message:
Bug#989: If DROP TABLE while there's an active transaction, wrong binlog order
WL#4284: Transactional DDL locking
Currently the MySQL server does not keep metadata locks on
schema objects for the duration of a transaction, thus failing
to guarantee the integrity of the schema objects being used
during the transaction and to protect then from concurrent
DDL operations. This also poses a problem for replication as
a DDL operation might be replicated even thought there are
active transactions using the object being modified.
The solution is to defer the release of metadata locks until
a active transaction is either committed or rolled back. This
prevents other statements from modifying the table for the
entire duration of the transaction. This provides commitment
ordering for guaranteeing serializability across multiple
transactions.
- Incompatible change:
If MySQL's metadata locking system encounters a lock conflict,
the usual schema is to use the try and back-off technique to
avoid deadlocks -- this schema consists in releasing all locks
and trying to acquire them all in one go.
But in a transactional context this algorithm can't be utilized
as its not possible to release locks acquired during the course
of the transaction without breaking the transaction commitments.
To avoid deadlocks in this case, the ER_LOCK_DEADLOCK will be
returned if a lock conflict is encountered during a transaction.
Let's consider an example:
A transaction has two statements that modify table t1, then table
t2, and then commits. The first statement of the transaction will
acquire a shared metadata lock on table t1, and it will be kept
until COMMIT to ensure serializability.
At the moment when the second statement attempts to acquire a
shared metadata lock on t2, a concurrent ALTER or DROP statement
might have locked t2 exclusively. The prescription of the current
locking protocol is that the acquirer of the shared lock backs off
-- gives up all his current locks and retries. This implies that
the entire multi-statement transaction has to be rolled back.
- Incompatible change:
FLUSH commands such as FLUSH PRIVILEGES and FLUSH TABLES WITH READ
LOCK won't cause locked tables to be implicitly unlocked anymore.
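The conflict handling described in the first incompatible change can be sketched as follows. This is a minimal illustration, not the server's code: `LockContext`, `LockResult` and `acquire_shared()` are hypothetical names, and only `ER_LOCK_DEADLOCK` comes from the message above.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical outcome codes; only ER_LOCK_DEADLOCK is named in the text.
enum class LockResult { GRANTED, BACKED_OFF_RETRY, ER_LOCK_DEADLOCK };

// Hypothetical per-connection lock state (not the server's MDL context).
struct LockContext {
  bool in_transaction = false;
  std::set<std::string> held;  // tables on which a shared lock is held
};

// Try to take a shared metadata lock on `table`.
// `exclusive_elsewhere` models tables that a concurrent DDL statement
// (ALTER, DROP) has locked exclusively.
LockResult acquire_shared(LockContext &ctx, const std::string &table,
                          const std::set<std::string> &exclusive_elsewhere) {
  if (exclusive_elsewhere.count(table) == 0) {
    ctx.held.insert(table);
    return LockResult::GRANTED;
  }
  if (ctx.in_transaction) {
    // Locks taken by earlier statements must be kept until COMMIT or
    // ROLLBACK, so backing off is not possible: report a deadlock.
    return LockResult::ER_LOCK_DEADLOCK;
  }
  // Outside a transaction: give up all locks and let the caller retry.
  ctx.held.clear();
  return LockResult::BACKED_OFF_RETRY;
}
```

With a transaction that already holds t1 and then conflicts on t2, the sketch returns `ER_LOCK_DEADLOCK` and leaves the lock on t1 in place; the same conflict outside a transaction releases everything and asks for a retry.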
2009-12-05 00:02:48 +01:00
|
|
|
EXCLUSIVE_DOWNGRADABLE_MDL,
|
|
|
|
/* Take an exclusive metadata lock before the object is opened. */
|
|
|
|
EXCLUSIVE_MDL
|
|
|
|
} lock_strategy;
|
|
|
|
/* For transactional locking. */
|
|
|
|
int lock_timeout; /* NOWAIT or WAIT [X] */
|
|
|
|
bool lock_transactional; /* If transactional lock requested. */
|
2007-11-23 15:21:24 +01:00
|
|
|
bool internal_tmp_table;
|
2009-11-10 19:48:46 +01:00
|
|
|
/** TRUE if an alias for this table was specified in the SQL. */
|
|
|
|
bool is_alias;
|
|
|
|
/** TRUE if the table is referred to in the statement using a fully
|
|
|
|
qualified name (<db_name>.<table_name>).
|
|
|
|
*/
|
|
|
|
bool is_fqtn;
|
2004-07-01 22:46:43 +02:00
|
|
|
|
|
|
|
|
Patch for the following bugs:
- BUG#11986: Stored routines and triggers can fail if the code
has a non-ascii symbol
- BUG#16291: mysqldump corrupts string-constants with non-ascii-chars
- BUG#19443: INFORMATION_SCHEMA does not support charsets properly
- BUG#21249: Character set of SP-var can be ignored
- BUG#25212: Character set of string constant is ignored (stored routines)
- BUG#25221: Character set of string constant is ignored (triggers)
There were a few general problems that caused these bugs:
1. Character set information of the original (definition) query for views,
triggers, stored routines and events was lost.
2. mysqldump output query in client character set, which can be
inappropriate to encode definition-query.
3. INFORMATION_SCHEMA used strings with mixed encodings to display object
definition;
1. No query-definition-character set.
In order to compile query into execution code, some extra data (such as
environment variables or the database character set) is used. The problem
here was that this context was not preserved. So, on the next load it can
differ from the original one, thus the result will be different.
The context contains the following data:
- client character set;
- connection collation (character set and collation);
- collation of the owner database;
The fix is to store this context and use it each time we parse (compile)
and execute the object (stored routine, trigger, ...).
2. Wrong mysqldump-output.
The original query can contain several encodings (by means of character set
introducers). The problem here was that we tried to convert original query
to the mysqldump-client character set.
Moreover, we stored queries in different character sets for different
objects (views, for one, used UTF8, triggers used original character set).
The solution is
- to store definition queries in the original character set;
- to change SHOW CREATE statement to output definition query in the
binary character set (i.e. without any conversion);
- introduce SHOW CREATE TRIGGER statement;
- to dump special statements to switch the context to the original one
before dumping and restore it afterwards.
Note, in order to preserve the database collation at the creation time,
additional ALTER DATABASE might be used (to temporarily switch the database
collation back to the original value). In this case, ALTER DATABASE
privilege will be required. This is a backward-incompatible change.
3. INFORMATION_SCHEMA showed non-UTF8 strings
The fix is to generate UTF8-query during the parsing, store it in the object
and show it in the INFORMATION_SCHEMA.
Basically, the idea is to create a copy of the original query and convert it to
UTF8. Character set introducers are removed and all text literals are
converted to UTF8.
This UTF8 query is intended to provide user-readable output. It must not be
used to recreate the object. Specialized SHOW CREATE statements should be
used for this.
The reason for this limitation is the following: the original query can
contain symbols from several character sets (by means of character set
introducers).
Example:
- original query:
CREATE VIEW v1 AS SELECT _cp1251 'Hello' AS c1;
- UTF8 query (for INFORMATION_SCHEMA):
CREATE VIEW v1 AS SELECT 'Hello' AS c1;
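The fix for problem 1 (store the creation context with the object and reuse it for every parse or execution) might look roughly like the sketch below. All type and function names here are hypothetical stand-ins; the message only states which three pieces of data make up the context.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the three items listed in the message.
struct CreationContext {
  std::string client_cs;      // client character set
  std::string connection_cl;  // connection collation
  std::string db_cl;          // collation of the owner database
};

// Hypothetical session state; the real server keeps this per connection.
struct Session {
  CreationContext env;
};

// RAII guard: switch the session to an object's stored context while the
// object is parsed or executed, then restore the caller's context.
class ContextSwitch {
  Session &session_;
  CreationContext saved_;

public:
  ContextSwitch(Session &s, const CreationContext &stored)
      : session_(s), saved_(s.env) {
    session_.env = stored;
  }
  ~ContextSwitch() { session_.env = saved_; }
  ContextSwitch(const ContextSwitch &) = delete;
  ContextSwitch &operator=(const ContextSwitch &) = delete;
};

// Returns the client character set in effect while the object is parsed.
std::string parse_with_stored_context(Session &s,
                                      const CreationContext &stored) {
  ContextSwitch guard(s, stored);
  return s.env.client_cs;
}
```

The guard makes the restore step hard to forget: whatever the session's current settings are, the object is always compiled under the settings it was created with.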
2007-06-28 19:34:54 +02:00
|
|
|
/* View creation context. */
|
|
|
|
|
|
|
|
View_creation_ctx *view_creation_ctx;
|
|
|
|
|
|
|
|
/*
|
|
|
|
Attributes to save/load view creation context in/from frm-file.
|
|
|
|
|
|
|
|
They are required only to be able to use existing parser to load
|
|
|
|
view-definition file. As soon as the parser parsed the file, view
|
|
|
|
creation context is initialized and the attributes become redundant.
|
|
|
|
|
|
|
|
These attributes MUST NOT be used for any purposes but the parsing.
|
|
|
|
*/
|
|
|
|
|
|
|
|
LEX_STRING view_client_cs_name;
|
|
|
|
LEX_STRING view_connection_cl_name;
|
|
|
|
|
|
|
|
/*
|
|
|
|
View definition (SELECT-statement) in the UTF-form.
|
|
|
|
*/
|
|
|
|
|
|
|
|
LEX_STRING view_body_utf8;
|
|
|
|
|
|
|
|
/* End of view definition context. */
|
|
|
|
|
A fix and a test case for Bug#26141 mixing table types in trigger
causes full table lock on innodb table.
Also fixes Bug#28502 Triggers that update another innodb table
will block on X lock unnecessarily (duplicate).
Code review fixes.
Both bugs' synopses are misleading: InnoDB table is
not X locked. The statements, however, cannot proceed concurrently,
but this happens due to lock conflicts for tables used in triggers,
not for the InnoDB table.
If a user had an InnoDB table, and two triggers, AFTER UPDATE and
AFTER INSERT, competing for different resources (e.g. two distinct
MyISAM tables), then these two triggers would not be able to execute
concurrently. Moreover, INSERTS/UPDATES of the InnoDB table would
not be able to run concurrently.
The problem had other side-effects (see respective bug reports).
This behavior was a consequence of a shortcoming of the pre-locking
algorithm, which would not distinguish between different DML operations
(e.g. INSERT and DELETE) and pre-lock all the tables
that are used by any trigger defined on the subject table.
The idea of the fix is to extend the pre-locking algorithm to keep track,
for each table, what DML operation it is used for and not
load triggers that are known to never be fired.
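A minimal sketch of the idea, assuming one bit per DML event: the bit layout, the `Trigger` struct and `tables_to_prelock()` are illustrative assumptions, not the server's actual definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical event numbering; the real bit layout is an assumption.
enum TrgEvent { TRG_INSERT = 0, TRG_UPDATE = 1, TRG_DELETE = 2 };

inline std::uint8_t event_bit(TrgEvent e) {
  return static_cast<std::uint8_t>(1u << e);
}

// Hypothetical trigger description: the event that fires it and the
// table its body touches.
struct Trigger {
  TrgEvent event;
  std::string uses_table;
};

// Collect the tables to pre-lock for a statement: only tables used by
// triggers whose event is set in the per-table event map.
std::vector<std::string> tables_to_prelock(
    std::uint8_t trg_event_map, const std::vector<Trigger> &triggers) {
  std::vector<std::string> result;
  for (std::size_t i = 0; i < triggers.size(); ++i) {
    if (trg_event_map & event_bit(triggers[i].event))
      result.push_back(triggers[i].uses_table);
  }
  return result;
}
```

For an UPDATE statement the map holds only the UPDATE bit, so the table used by an AFTER INSERT trigger is never pre-locked and the two statements no longer contend for it.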
2007-07-12 20:26:41 +02:00
|
|
|
/**
|
|
|
|
Indicates what triggers we need to pre-load for this TABLE_LIST
|
|
|
|
when opening an associated TABLE. This is filled after
|
|
|
|
the parsed tree is created.
|
|
|
|
*/
|
|
|
|
uint8 trg_event_map;
|
2009-10-19 13:13:26 +02:00
|
|
|
/* TRUE <=> this table is a const one and was optimized away. */
|
|
|
|
bool optimized_away;
|
2007-07-12 20:26:41 +02:00
|
|
|
|
2007-08-03 00:14:05 +02:00
|
|
|
uint i_s_requested_object;
|
|
|
|
bool has_db_lookup_value;
|
|
|
|
bool has_table_lookup_value;
|
|
|
|
uint table_open_method;
|
2007-02-12 13:06:14 +01:00
|
|
|
enum enum_schema_table_state schema_table_state;
|
2009-11-30 16:55:03 +01:00
|
|
|
|
2009-12-08 10:57:07 +01:00
|
|
|
MDL_request mdl_request;
|
2009-11-30 16:55:03 +01:00
|
|
|
|
2004-07-16 00:15:55 +02:00
|
|
|
void calc_md5(char *buffer);
|
2005-10-27 23:18:23 +02:00
|
|
|
void set_underlying_merge();
|
2004-09-29 15:35:01 +02:00
|
|
|
int view_check_option(THD *thd, bool ignore_failure);
|
2005-10-27 23:18:23 +02:00
|
|
|
bool setup_underlying(THD *thd);
|
2004-11-08 00:54:23 +01:00
|
|
|
void cleanup_items();
|
2007-05-11 19:51:03 +02:00
|
|
|
bool placeholder()
|
|
|
|
{
|
2009-11-30 16:55:03 +01:00
|
|
|
return derived || view || schema_table || !table;
|
2007-05-11 19:51:03 +02:00
|
|
|
}
|
2008-02-22 11:30:33 +01:00
|
|
|
void print(THD *thd, String *str, enum_query_type query_type);
|
2007-07-06 14:18:49 +02:00
|
|
|
bool check_single_table(TABLE_LIST **table, table_map map,
|
|
|
|
TABLE_LIST *view);
|
2004-09-15 22:42:56 +02:00
|
|
|
bool set_insert_values(MEM_ROOT *mem_root);
|
2005-07-01 06:05:42 +02:00
|
|
|
void hide_view_error(THD *thd);
|
2007-07-06 14:18:49 +02:00
|
|
|
TABLE_LIST *find_underlying_table(TABLE *table);
|
|
|
|
TABLE_LIST *first_leaf_for_name_resolution();
|
|
|
|
TABLE_LIST *last_leaf_for_name_resolution();
|
2005-08-12 16:57:19 +02:00
|
|
|
bool is_leaf_for_name_resolution();
|
2007-07-06 14:18:49 +02:00
|
|
|
inline TABLE_LIST *top_table()
|
2005-08-02 21:54:49 +02:00
|
|
|
{ return belong_to_view ? belong_to_view : this; }
|
2005-07-01 06:05:42 +02:00
|
|
|
inline bool prepare_check_option(THD *thd)
|
|
|
|
{
|
|
|
|
bool res= FALSE;
|
|
|
|
if (effective_with_check)
|
|
|
|
res= prep_check_option(thd, effective_with_check);
|
|
|
|
return res;
|
|
|
|
}
|
|
|
|
inline bool prepare_where(THD *thd, Item **conds,
|
|
|
|
bool no_where_clause)
|
|
|
|
{
|
|
|
|
if (effective_algorithm == VIEW_ALGORITHM_MERGE)
|
|
|
|
return prep_where(thd, conds, no_where_clause);
|
|
|
|
return FALSE;
|
|
|
|
}
|
2005-10-27 23:18:23 +02:00
|
|
|
|
|
|
|
void register_want_access(ulong want_access);
|
|
|
|
bool prepare_security(THD *thd);
|
|
|
|
#ifndef NO_EMBEDDED_ACCESS_CHECKS
|
|
|
|
Security_context *find_view_security_context(THD *thd);
|
|
|
|
bool prepare_view_securety_context(THD *thd);
|
|
|
|
#endif
|
2006-07-06 21:59:04 +02:00
|
|
|
/*
|
|
|
|
Cleanup for re-execution in a prepared statement or a stored
|
|
|
|
procedure.
|
|
|
|
*/
|
|
|
|
void reinit_before_use(THD *thd);
|
2006-11-01 02:31:56 +01:00
|
|
|
Item_subselect *containing_subselect();
|
2005-10-27 23:18:23 +02:00
|
|
|
|
2007-03-05 18:08:41 +01:00
|
|
|
/*
|
2009-10-14 13:14:58 +02:00
|
|
|
Compiles the tagged hints list and fills up TABLE::keys_in_use_for_query,
|
|
|
|
TABLE::keys_in_use_for_group_by, TABLE::keys_in_use_for_order_by,
|
|
|
|
TABLE::force_index and TABLE::covering_keys.
|
2007-03-05 18:08:41 +01:00
|
|
|
*/
|
|
|
|
bool process_index_hints(TABLE *table);
|
|
|
|
|
Bug#26379 - Combination of FLUSH TABLE and REPAIR TABLE
corrupts a MERGE table
Bug 26867 - LOCK TABLES + REPAIR + merge table result in
memory/cpu hogging
Bug 26377 - Deadlock with MERGE and FLUSH TABLE
Bug 25038 - Waiting TRUNCATE
Bug 25700 - merge base tables get corrupted by
optimize/analyze/repair table
Bug 30275 - Merge tables: flush tables or unlock tables
causes server to crash
Bug 19627 - temporary merge table locking
Bug 27660 - Falcon: merge table possible
Bug 30273 - merge tables: Can't lock file (errno: 155)
The problems were:
Bug 26379 - Combination of FLUSH TABLE and REPAIR TABLE
corrupts a MERGE table
1. A thread trying to lock a MERGE table performs busy waiting while
REPAIR TABLE or a similar table administration task is ongoing on
one or more of its MyISAM tables.
2. A thread trying to lock a MERGE table performs busy waiting until all
threads that did REPAIR TABLE or similar table administration tasks
on one or more of its MyISAM tables in LOCK TABLES segments do UNLOCK
TABLES. The difference against problem #1 is that the busy waiting
takes place *after* the administration task. It is terminated by
UNLOCK TABLES only.
3. Two FLUSH TABLES within a LOCK TABLES segment can invalidate the
lock. This does *not* require a MERGE table. The first FLUSH TABLES
can be replaced by any statement that requires other threads to
reopen the table. In 5.0 and 5.1 a single FLUSH TABLES can provoke
the problem.
Bug 26867 - LOCK TABLES + REPAIR + merge table result in
memory/cpu hogging
Trying DML on a MERGE table, which has a child locked and
repaired by another thread, made an infinite loop in the server.
Bug 26377 - Deadlock with MERGE and FLUSH TABLE
Locking a MERGE table and its children in parent-child order
and flushing the child deadlocked the server.
Bug 25038 - Waiting TRUNCATE
Truncating a MERGE child, while the MERGE table was in use,
let the truncate fail instead of waiting for the table to
become free.
Bug 25700 - merge base tables get corrupted by
optimize/analyze/repair table
Repairing a child of an open MERGE table corrupted the child.
It was necessary to FLUSH the child first.
Bug 30275 - Merge tables: flush tables or unlock tables
causes server to crash
Flushing and optimizing locked MERGE children crashed the server.
Bug 19627 - temporary merge table locking
Use of a temporary MERGE table with non-temporary children
could corrupt the children.
Temporary tables are never locked. So we do now prohibit
non-temporary children of a temporary MERGE table.
Bug 27660 - Falcon: merge table possible
It was possible to create a MERGE table with non-MyISAM children.
Bug 30273 - merge tables: Can't lock file (errno: 155)
This was a Windows-only bug. Table administration statements
sometimes failed with "Can't lock file (errno: 155)".
These bugs are fixed by a new implementation of MERGE table open.
When opening a MERGE table in open_tables() we do now add the
child tables to the list of tables to be opened by open_tables()
(the "query_list"). The children are not opened in the handler at
this stage.
After opening the parent, open_tables() opens each child from the
now extended query_list. When the last child is opened, we remove
the children from the query_list again and attach the children to
the parent. This behaves similarly to the old open. However it does
not open the MyISAM tables directly, but grabs them from the already
open children.
When closing a MERGE table in close_thread_table() we detach the
children only. Closing of the children is done implicitly because
they are in thd->open_tables.
For more detail see the comment at the top of ha_myisammrg.cc.
Changed from open_ltable() to open_and_lock_tables() in all places
that can be relevant for MERGE tables. The latter can handle tables
added to the list on the fly. When open_ltable() was used in a loop
over a list of tables, the list must be temporarily terminated
after every table for open_and_lock_tables().
table_list->required_type is set to FRMTYPE_TABLE to avoid open of
special tables. Handling of derived tables is suppressed.
These details are handled by the new function
open_n_lock_single_table(), which has nearly the same signature as
open_ltable() and can replace it in most cases.
In reopen_tables() some of the tables open by a thread can be
closed and reopened. When a MERGE child is affected, the parent
must be closed and reopened too. Closing of the parent is forced
before the first child is closed. Reopen happens in the order of
thd->open_tables. MERGE parents do not attach their children
automatically at open. This is done after all tables are reopened.
So all children are open when attaching them.
Special lock handling like mysql_lock_abort() or mysql_lock_remove()
needs to be suppressed for MERGE children or forwarded to the parent.
This depends on the situation. In loops over all open tables one
suppresses child lock handling. When a single table is touched,
forwarding is done.
Behavioral changes:
===================
This patch changes the behavior of temporary MERGE tables.
Temporary MERGE must have temporary children.
The old behavior was wrong. A temporary table is not locked. Hence
even non-temporary children were not locked. See
Bug 19627 - temporary merge table locking.
You cannot change the union list of a non-temporary MERGE table
when LOCK TABLES is in effect. The following does *not* work:
CREATE TABLE m1 ... ENGINE=MRG_MYISAM ...;
LOCK TABLES t1 WRITE, t2 WRITE, m1 WRITE;
ALTER TABLE m1 ... UNION=(t1,t2) ...;
However, you can do this with a temporary MERGE table.
You cannot create a MERGE table with CREATE ... SELECT, neither
as a temporary MERGE table, nor as a non-temporary MERGE table.
CREATE TABLE m1 ... ENGINE=MRG_MYISAM ... SELECT ...;
Gives error message: table is not BASE TABLE.
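The new MERGE open sequence (extend the list with the children, open them from the list, then remove them again and attach them to the parent) can be illustrated with a simplified sketch. `TableRef` and this `open_tables()` are hypothetical and only model the ordering; they do not reflect the real function's signature or locking.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical table reference; `children` is non-empty only for a
// MERGE parent.
struct TableRef {
  std::string name;
  std::vector<std::string> children;
  bool is_open = false;
  std::vector<std::string> attached;  // children attached after open
};

// Opens every table in the list and returns the names in open order.
std::vector<std::string> open_tables(std::vector<TableRef> &query_list) {
  std::vector<std::string> opened;
  for (std::size_t i = 0; i < query_list.size(); ++i) {
    query_list[i].is_open = true;
    opened.push_back(query_list[i].name);

    // Copy the child names: the list is modified below.
    const std::vector<std::string> kids = query_list[i].children;
    if (kids.empty()) continue;

    // 1. Extend the list with the children, right after the parent.
    const std::size_t first = i + 1;
    for (std::size_t j = 0; j < kids.size(); ++j) {
      TableRef child;
      child.name = kids[j];
      query_list.insert(query_list.begin() + first + j, child);
    }
    // 2. Open each child from the extended list.
    for (std::size_t j = 0; j < kids.size(); ++j) {
      query_list[first + j].is_open = true;
      opened.push_back(query_list[first + j].name);
    }
    // 3. Remove the children from the list and attach them to the parent.
    query_list.erase(query_list.begin() + first,
                     query_list.begin() + first + kids.size());
    query_list[i].attached = kids;
  }
  return opened;
}
```

After the call the list has its original length again, and the parent refers to children that were opened through the ordinary path rather than directly by the handler.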
2007-11-15 20:25:43 +01:00
|
|
|
/* Access MERGE child def version. See top comment in ha_myisammrg.cc */
|
|
|
|
inline ulong get_child_def_version()
|
|
|
|
{
|
|
|
|
return child_def_version;
|
|
|
|
}
|
|
|
|
inline void set_child_def_version(ulong version)
|
|
|
|
{
|
|
|
|
child_def_version= version;
|
|
|
|
}
|
|
|
|
inline void init_child_def_version()
|
|
|
|
{
|
|
|
|
child_def_version= ~0UL;
|
|
|
|
}
|
|
|
|
|
2008-04-08 18:01:20 +02:00
|
|
|
/**
|
|
|
|
Compare the version of metadata from the previous execution
|
|
|
|
(if any) with values obtained from the current table
|
|
|
|
definition cache element.
|
|
|
|
|
2008-04-16 23:04:49 +02:00
|
|
|
@sa check_and_update_table_version()
|
2008-04-08 18:01:20 +02:00
|
|
|
*/
|
|
|
|
inline
|
2008-05-20 09:29:16 +02:00
|
|
|
bool is_table_ref_id_equal(TABLE_SHARE *s) const
|
2008-04-08 18:01:20 +02:00
|
|
|
{
|
2008-05-20 09:29:16 +02:00
|
|
|
return (m_table_ref_type == s->get_table_ref_type() &&
|
|
|
|
m_table_ref_version == s->get_table_ref_version());
|
2008-04-08 18:01:20 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
|
|
|
Record the value of metadata version of the corresponding
|
|
|
|
table definition cache element in this parse tree node.
|
|
|
|
|
2008-04-16 23:04:49 +02:00
|
|
|
@sa check_and_update_table_version()
|
2008-04-08 18:01:20 +02:00
|
|
|
*/
|
|
|
|
inline
|
2008-05-20 09:29:16 +02:00
|
|
|
void set_table_ref_id(TABLE_SHARE *s)
|
2008-04-08 18:01:20 +02:00
|
|
|
{
|
2008-05-20 09:29:16 +02:00
|
|
|
m_table_ref_type= s->get_table_ref_type();
|
|
|
|
m_table_ref_version= s->get_table_ref_version();
|
2008-04-08 18:01:20 +02:00
|
|
|
}
|
|
|
|
|
2008-09-03 16:45:40 +02:00
|
|
|
/**
|
|
|
|
@brief True if this TABLE_LIST represents an anonymous derived table,
|
|
|
|
i.e. the result of a subquery.
|
|
|
|
*/
|
|
|
|
bool is_anonymous_derived_table() const { return derived && !view; }
|
|
|
|
|
|
|
|
/**
|
|
|
|
@brief Returns the name of the database that the referenced table belongs
|
|
|
|
to.
|
|
|
|
*/
|
|
|
|
char *get_db_name() { return view != NULL ? view_db.str : db; }
|
|
|
|
|
|
|
|
/**
|
|
|
|
@brief Returns the name of the table that this TABLE_LIST represents.
|
|
|
|
|
|
|
|
@details The unqualified table name or view name for a table or view,
|
|
|
|
respectively.
|
|
|
|
*/
|
|
|
|
char *get_table_name() { return view != NULL ? view_name.str : table_name; }
|
|
|
|
|
2005-07-01 06:05:42 +02:00
|
|
|
private:
|
|
|
|
bool prep_check_option(THD *thd, uint8 check_opt_type);
|
|
|
|
bool prep_where(THD *thd, Item **conds, bool no_where_clause);
|
2006-07-06 21:59:04 +02:00
|
|
|
/*
|
|
|
|
Cleanup for re-execution in a prepared statement or a stored
|
|
|
|
procedure.
|
|
|
|
*/
|
2007-11-15 20:25:43 +01:00
|
|
|
|
|
|
|
/* Remembered MERGE child def version. See top comment in ha_myisammrg.cc */
|
|
|
|
ulong child_def_version;
|
2008-05-17 23:51:18 +02:00
|
|
|
/** See comments for set_table_ref_id() */
|
2008-05-20 09:29:16 +02:00
|
|
|
enum enum_table_ref_type m_table_ref_type;
|
2008-05-17 23:51:18 +02:00
|
|
|
/** See comments for set_table_ref_id() */
|
2008-05-20 09:29:16 +02:00
|
|
|
ulong m_table_ref_version;
|
2007-07-06 14:18:49 +02:00
|
|
|
};
|
2001-04-11 13:04:03 +02:00
|
|
|
|
2004-07-16 00:15:55 +02:00
|
|
|
class Item;
|
|
|
|
|
2005-08-12 16:57:19 +02:00
|
|
|
/*
|
|
|
|
Iterator over the fields of a generic table reference.
|
|
|
|
*/
|
|
|
|
|
2004-07-16 00:15:55 +02:00
|
|
|
class Field_iterator: public Sql_alloc
|
|
|
|
{
|
|
|
|
public:
|
2006-02-25 16:46:30 +01:00
|
|
|
Field_iterator() {} /* Remove gcc warning */
|
2004-07-16 00:15:55 +02:00
|
|
|
virtual ~Field_iterator() {}
|
|
|
|
virtual void set(TABLE_LIST *)= 0;
|
|
|
|
virtual void next()= 0;
|
2004-09-03 20:43:04 +02:00
|
|
|
virtual bool end_of_fields()= 0; /* Return 1 at end of list */
|
2004-07-16 00:15:55 +02:00
|
|
|
virtual const char *name()= 0;
|
2005-07-01 06:05:42 +02:00
|
|
|
virtual Item *create_item(THD *)= 0;
|
2004-07-16 00:15:55 +02:00
|
|
|
virtual Field *field()= 0;
|
|
|
|
};
|
|
|
|
|
|
|
|
|
2005-08-12 16:57:19 +02:00
|
|
|
/*
|
|
|
|
Iterator over the fields of a base table, view with temporary
|
|
|
|
table, or subquery.
|
|
|
|
*/
|
|
|
|
|
2004-07-16 00:15:55 +02:00
|
|
|
class Field_iterator_table: public Field_iterator
|
|
|
|
{
|
|
|
|
Field **ptr;
|
|
|
|
public:
|
|
|
|
Field_iterator_table() :ptr(0) {}
|
|
|
|
void set(TABLE_LIST *table) { ptr= table->table->field; }
|
|
|
|
void set_table(TABLE *table) { ptr= table->field; }
|
|
|
|
void next() { ptr++; }
|
2004-09-03 20:43:04 +02:00
|
|
|
bool end_of_fields() { return *ptr == 0; }
|
2004-07-16 00:15:55 +02:00
|
|
|
const char *name();
|
2005-07-01 06:05:42 +02:00
|
|
|
Item *create_item(THD *thd);
|
2004-07-16 00:15:55 +02:00
|
|
|
Field *field() { return *ptr; }
|
|
|
|
};
|
|
|
|
|
|
|
|
|
2005-08-12 16:57:19 +02:00
|
|
|
/* Iterator over the fields of a merge view. */
|
|
|
|
|
2004-07-16 00:15:55 +02:00
|
|
|
class Field_iterator_view: public Field_iterator
|
|
|
|
{
|
2004-09-14 18:28:29 +02:00
|
|
|
Field_translator *ptr, *array_end;
|
2005-07-01 06:05:42 +02:00
|
|
|
TABLE_LIST *view;
|
2004-07-16 00:15:55 +02:00
|
|
|
public:
|
|
|
|
Field_iterator_view() :ptr(0), array_end(0) {}
|
|
|
|
void set(TABLE_LIST *table);
|
|
|
|
void next() { ptr++; }
|
2004-09-03 20:43:04 +02:00
|
|
|
bool end_of_fields() { return ptr == array_end; }
|
2004-07-16 00:15:55 +02:00
|
|
|
const char *name();
|
2005-07-01 06:05:42 +02:00
|
|
|
Item *create_item(THD *thd);
|
2005-01-05 15:48:23 +01:00
|
|
|
Item **item_ptr() {return &ptr->item; }
|
2004-07-16 00:15:55 +02:00
|
|
|
Field *field() { return 0; }
|
2005-07-01 06:05:42 +02:00
|
|
|
inline Item *item() { return ptr->item; }
|
2005-08-12 16:57:19 +02:00
|
|
|
Field_translator *field_translator() { return ptr; }
|
|
|
|
};
|
|
|
|
|
|
|
|
|
|
|
|
/*
|
|
|
|
Field_iterator interface to the list of materialized fields of a
|
|
|
|
NATURAL/USING join.
|
|
|
|
*/
|
|
|
|
|
|
|
|
class Field_iterator_natural_join: public Field_iterator
|
|
|
|
{
|
2005-11-28 20:57:50 +01:00
|
|
|
List_iterator_fast<Natural_join_column> column_ref_it;
|
2005-08-12 16:57:19 +02:00
|
|
|
Natural_join_column *cur_column_ref;
|
|
|
|
public:
|
2005-11-28 20:57:50 +01:00
|
|
|
Field_iterator_natural_join() :cur_column_ref(NULL) {}
|
|
|
|
~Field_iterator_natural_join() {}
|
2005-08-12 16:57:19 +02:00
|
|
|
void set(TABLE_LIST *table);
|
2005-08-19 14:22:30 +02:00
|
|
|
void next();
|
2005-08-12 16:57:19 +02:00
|
|
|
bool end_of_fields() { return !cur_column_ref; }
|
|
|
|
const char *name() { return cur_column_ref->name(); }
|
|
|
|
Item *create_item(THD *thd) { return cur_column_ref->create_item(thd); }
|
|
|
|
Field *field() { return cur_column_ref->field(); }
|
|
|
|
Natural_join_column *column_ref() { return cur_column_ref; }
|
|
|
|
};
|
|
|
|
|
|
|
|
|
|
|
|
/*
|
|
|
|
Generic iterator over the fields of an arbitrary table reference.
|
|
|
|
|
|
|
|
DESCRIPTION
|
|
|
|
This class unifies the various ways of iterating over the columns
|
|
|
|
of a table reference depending on the type of SQL entity it
|
|
|
|
represents. If such an entity represents a nested table reference,
|
|
|
|
this iterator encapsulates the iteration over the columns of the
|
|
|
|
members of the table reference.
|
|
|
|
|
|
|
|
IMPLEMENTATION
|
|
|
|
The implementation assumes that all underlying NATURAL/USING table
|
|
|
|
references already contain their result columns and are linked into
|
|
|
|
the list TABLE_LIST::next_name_resolution_table.
|
|
|
|
*/
|
|
|
|
|
|
|
|
class Field_iterator_table_ref: public Field_iterator
|
|
|
|
{
|
|
|
|
TABLE_LIST *table_ref, *first_leaf, *last_leaf;
|
|
|
|
Field_iterator_table table_field_it;
|
|
|
|
Field_iterator_view view_field_it;
|
|
|
|
Field_iterator_natural_join natural_join_it;
|
|
|
|
Field_iterator *field_it;
|
|
|
|
void set_field_iterator();
|
|
|
|
public:
|
|
|
|
Field_iterator_table_ref() :field_it(NULL) {}
|
|
|
|
void set(TABLE_LIST *table);
|
|
|
|
void next();
|
|
|
|
bool end_of_fields()
|
|
|
|
{ return (table_ref == last_leaf && field_it->end_of_fields()); }
|
|
|
|
const char *name() { return field_it->name(); }
|
2008-09-03 16:45:40 +02:00
|
|
|
const char *get_table_name();
|
|
|
|
const char *get_db_name();
|
2005-08-12 16:57:19 +02:00
|
|
|
GRANT_INFO *grant();
|
|
|
|
Item *create_item(THD *thd) { return field_it->create_item(thd); }
|
|
|
|
Field *field() { return field_it->field(); }
|
2008-10-07 23:34:00 +02:00
|
|
|
Natural_join_column *get_or_create_column_ref(THD *thd, TABLE_LIST *parent_table_ref);
|
2005-11-28 20:57:50 +01:00
|
|
|
Natural_join_column *get_natural_column_ref();
|
2004-07-16 00:15:55 +02:00
|
|
|
};
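The iterator classes above share one protocol: position the iterator, test `end_of_fields()`, read, call `next()`. The sketch below shows why that lets a caller walk different kinds of column sources through a single base-class interface. `FieldIt`, `ArrayIt`, `RangeIt` and `collect_names()` are simplified, hypothetical analogues and do not use the server's `Field`, `Item` or `TABLE_LIST` types.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified analogue of the Field_iterator interface.
struct FieldIt {
  virtual ~FieldIt() {}
  virtual void next() = 0;
  virtual bool end_of_fields() const = 0;
  virtual const char *name() const = 0;
};

// Walks a NULL-terminated array, in the style of Field_iterator_table.
class ArrayIt : public FieldIt {
  const char *const *ptr_;

public:
  explicit ArrayIt(const char *const *fields) : ptr_(fields) {}
  void next() override { ++ptr_; }
  bool end_of_fields() const override { return *ptr_ == nullptr; }
  const char *name() const override { return *ptr_; }
};

// Walks a begin/end range, in the style of Field_iterator_view.
class RangeIt : public FieldIt {
  const std::string *ptr_;
  const std::string *end_;

public:
  RangeIt(const std::string *begin, std::size_t count)
      : ptr_(begin), end_(begin + count) {}
  void next() override { ++ptr_; }
  bool end_of_fields() const override { return ptr_ == end_; }
  const char *name() const override { return ptr_->c_str(); }
};

// The caller does not need to know which kind of source it iterates.
std::vector<std::string> collect_names(FieldIt &it) {
  std::vector<std::string> names;
  for (; !it.end_of_fields(); it.next()) names.push_back(it.name());
  return names;
}
```

A wrapper in the spirit of `Field_iterator_table_ref` would hold one iterator of each kind and point a base-class pointer at whichever one matches the current table reference.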
|
|
|
|
|
2005-01-06 12:00:13 +01:00
|
|
|
|
2004-06-11 07:27:21 +02:00
|
|
|
typedef struct st_nested_join
|
|
|
|
{
|
|
|
|
List<TABLE_LIST> join_list; /* list of elements in the nested join */
|
|
|
|
table_map used_tables; /* bitmap of tables in the nested join */
|
2004-07-01 22:46:43 +02:00
|
|
|
table_map not_null_tables; /* tables that reject nulls */
|
2004-06-11 07:27:21 +02:00
|
|
|
struct st_join_table *first_nested;/* the first nested table in the plan */
|
2005-10-25 17:28:27 +02:00
|
|
|
/*
|
|
|
|
Used to count tables in the nested join in 2 isolated places:
|
|
|
|
1. In make_outerjoin_info().
|
|
|
|
2. check_interleaving_with_nj/restore_prev_nj_state (these are called
|
|
|
|
by the join optimizer).
|
|
|
|
Before each use the counters are zeroed by reset_nj_counters.
|
|
|
|
*/
|
|
|
|
uint counter;
|
|
|
|
nested_join_map nj_map; /* Bit used to identify this nested join */
|
2004-06-11 07:27:21 +02:00
|
|
|
} NESTED_JOIN;
|
2004-07-01 22:46:43 +02:00
|
|
|
|
2005-01-06 12:00:13 +01:00
|
|
|
|
2002-06-11 10:20:31 +02:00
|
|
|
typedef struct st_changed_table_list
|
|
|
|
{
|
2002-03-15 22:57:31 +01:00
|
|
|
struct st_changed_table_list *next;
|
2002-06-08 23:58:05 +02:00
|
|
|
char *key;
|
2002-03-15 22:57:31 +01:00
|
|
|
uint32 key_length;
|
|
|
|
} CHANGED_TABLE_LIST;
|
|
|
|
|
2005-01-06 12:00:13 +01:00
|
|
|
|
2004-06-11 07:27:21 +02:00
|
|
|
typedef struct st_open_table_list{
|
2001-04-11 13:04:03 +02:00
|
|
|
struct st_open_table_list *next;
|
|
|
|
char *db,*table;
|
|
|
|
uint32 in_use,locked;
|
|
|
|
} OPEN_TABLE_LIST;
|
2003-08-02 11:43:18 +02:00
|
|
|
|
2006-02-14 16:20:48 +01:00
|
|
|
typedef struct st_table_field_w_type
|
|
|
|
{
|
|
|
|
LEX_STRING name;
|
|
|
|
LEX_STRING type;
|
|
|
|
LEX_STRING cset;
|
|
|
|
} TABLE_FIELD_W_TYPE;
|
|
|
|
|
2003-08-02 11:43:18 +02:00
|
|
|
|
2006-02-14 16:20:48 +01:00
|
|
|
my_bool
|
2006-08-17 14:22:59 +02:00
|
|
|
table_check_intact(TABLE *table, const uint table_f_count,
|
2007-04-05 13:24:34 +02:00
|
|
|
const TABLE_FIELD_W_TYPE *table_def);
|
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that were found necessary while testing the handler changes.
Changes that require code changes in other storage engines:
(Note that all changes are very straightforward and one should find all issues
by compiling a --debug build and fixing all compiler errors and all
asserts in field.cc while running the test suite.)
- New optional handler function introduced: reset()
This is called after every DML statement to make it easy for a handler to
do statement-specific cleanups.
(The only case where it's not called is if we force the file to be closed.)
- handler::extra(HA_EXTRA_RESET) is removed. Code that was there before
should be moved to handler::reset()
- table->read_set contains a bitmap over all columns that are needed
in the query. read_row() and similar functions only needs to read these
columns
- table->write_set contains a bitmap over all columns that will be updated
in the query. write_row() and update_row() only needs to update these
columns.
The above bitmaps should now be up to date in all context
(including ALTER TABLE, filesort()).
The handler is informed of any changes to the bitmap after
fix_fields() by calling the virtual function
handler::column_bitmaps_signal(). If the handler does caching of
these bitmaps (instead of using table->read_set, table->write_set),
it should redo the caching in this code. As the signal() may be sent
several times, it's probably best to set a variable in the signal
and redo the caching on read_row() / write_row() if the variable was
set.
- Removed the read_set and write_set bitmap objects from the handler class
- Removed all column bit handling functions from the handler class.
(Now one instead uses the normal bitmap functions in my_bitmap.c instead
of handler dedicated bitmap functions)
- field->query_id is removed. One should instead check
table->read_set and table->write_set if a field is used in the query.
- handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and
handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now
instead use table->read_set to check for which columns to retrieve.
- If a handler needs to call Field->val() or Field->store() on columns
that are not used in the query, one should install a temporary
all-columns-used map while doing so. For this, we provide the following
functions:
my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set);
field->val();
dbug_tmp_restore_column_map(table->read_set, old_map);
and similar for the write map:
my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set);
field->val();
dbug_tmp_restore_column_map(table->write_set, old_map);
If this is not done, you will sooner or later hit a DBUG_ASSERT
in the field store() / val() functions.
(For non-DBUG binaries, dbug_tmp_use_all_columns() and
dbug_tmp_restore_column_map() are inline dummy functions and should
be optimized away by the compiler).
- If one needs to temporarily set the column map for all binaries (and not
just to avoid the DBUG_ASSERT() in the Field::store() / Field::val()
methods) one should use the functions tmp_use_all_columns() and
tmp_restore_column_map() instead of the above dbug_ variants.
- All 'status' fields in the handler base class (like records,
data_file_length etc) are now stored in a 'stats' struct. This makes
it easier to know what status variables are provided by the base
handler. This requires some trivial variable name changes in the extra()
function.
- New virtual function handler::records(). This is called to optimize
COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true.
(stats.records is not supposed to be an exact value. It only has to
be 'reasonable enough' for the optimizer to be able to choose a good
optimization path).
- Non virtual handler::init() function added for caching of virtual
constants from engine.
- Removed has_transactions() virtual method. Now one should instead return
HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support
transactions.
- The 'xxxx_create_handler()' function now has a MEM_ROOT *mem_root argument
that is to be used with 'new handler_name()' to allocate the handler
in the right area. The xxxx_create_handler() function is also
responsible for any initialization of the object before returning.
For example, one should change:
static handler *myisam_create_handler(TABLE_SHARE *table)
{
return new ha_myisam(table);
}
->
static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root)
{
return new (mem_root) ha_myisam(table);
}
- New optional virtual function: use_hidden_primary_key().
This is called in case of an update/delete when
(table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined
but we don't have a primary key. This allows the handler to take precautions
in remembering any hidden primary key to be able to update/delete any
found row. The default handler marks all columns to be read.
- handler::table_flags() now returns a ulonglong (to allow for more flags).
- New/changed table_flags()
- HA_HAS_RECORDS Set if ::records() is supported
- HA_NO_TRANSACTIONS Set if engine doesn't support transactions
- HA_PRIMARY_KEY_REQUIRED_FOR_DELETE
Set if we should mark all primary key columns for
read when reading rows as part of a DELETE
statement. If there is no primary key,
all columns are marked for read.
- HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some
cases (based on table->read_set)
- HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS
Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION.
- HA_DUPP_POS Renamed to HA_DUPLICATE_POS
- HA_REQUIRES_KEY_COLUMNS_FOR_DELETE
Set this if we should mark ALL key columns for
read when reading rows as part of a DELETE
statement. In case of an update we will mark
all keys for read for which key part changed
value.
- HA_STATS_RECORDS_IS_EXACT
Set this if stats.records is exact.
(This saves us some extra records() calls
when optimizing COUNT(*))
- Removed table_flags()
- HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if
handler::records() gives an exact count() and
HA_STATS_RECORDS_IS_EXACT if stats.records is exact.
- HA_READ_RND_SAME Removed (no one supported this one)
- Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk()
- Renamed handler::dupp_pos to handler::dup_pos
- Removed not used variable handler::sortkey
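The advice above about handler::column_bitmaps_signal() (set a variable in the signal, redo the caching on the next read_row() / write_row()) can be sketched roughly as follows. This is only an illustration: Caching_handler, its members and the std::vector<bool> column map are hypothetical stand-ins, not the server's handler API or MY_BITMAP.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace signal_sketch {

// Hypothetical engine-side handler (an assumption, not the real handler class).
class Caching_handler
{
  const std::vector<bool> *read_set;     // stand-in for table->read_set
  std::vector<unsigned> cached_columns;  // engine-private cache of needed columns
  bool bitmaps_changed;                  // set by the signal, cleared on rebuild
  unsigned rebuild_count;

public:
  explicit Caching_handler(const std::vector<bool> *set)
    : read_set(set), bitmaps_changed(true), rebuild_count(0) {}

  // May be called several times per statement, so it only sets a flag.
  void column_bitmaps_signal() { bitmaps_changed= true; }

  // Rebuilds the cache lazily, at most once per batch of signals.
  const std::vector<unsigned> &read_row()
  {
    if (bitmaps_changed)
    {
      cached_columns.clear();
      for (std::size_t i= 0; i < read_set->size(); ++i)
        if ((*read_set)[i])
          cached_columns.push_back(static_cast<unsigned>(i));
      bitmaps_changed= false;
      ++rebuild_count;
    }
    return cached_columns;
  }

  unsigned rebuilds() const { return rebuild_count; }
};

// Three signals followed by two reads cost one rebuild; a later signal costs one more.
unsigned demo_rebuild_count()
{
  std::vector<bool> read_set(4, false);
  read_set[1]= true;
  Caching_handler h(&read_set);
  h.column_bitmaps_signal();
  h.column_bitmaps_signal();
  h.column_bitmaps_signal();
  h.read_row();
  h.read_row();
  read_set[3]= true;
  h.column_bitmaps_signal();
  h.read_row();
  return h.rebuilds();
}

} // namespace signal_sketch
```

Deferring the work to read_row() keeps the signal itself cheap, which matters because the signal may arrive more than once before any row is read.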
Upper level handler changes:
- ha_reset() now does some overall checks and calls ::reset()
- ha_table_flags() added. This is a cached version of table_flags(). The
cache is updated on engine creation time and updated on open.
MySQL level changes (not obvious from the above):
- DBUG_ASSERT() added to check that column usage matches what is set
in the column usage bit maps. (This found a LOT of bugs in current
column marking code).
- Before, in 5.1, all used columns were marked in read_set and only updated
columns were marked in write_set. Now we only mark columns for which we
need a value in read_set.
- Column bitmaps are created in open_binary_frm() and open_table_from_share().
(Before this was in table.cc)
- handler::table_flags() calls are replaced with handler::ha_table_flags()
- For calling field->val() you must have the corresponding bit set in
table->read_set. For calling field->store() you must have the
corresponding bit set in table->write_set. (There are asserts in
all store()/val() functions to catch wrong usage)
- thd->set_query_id is renamed to thd->mark_used_columns and instead
of setting this to an integer value, this has now the values:
MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE
Changed also all variables named 'set_query_id' to mark_used_columns.
- In filesort() we now inform the handler of exactly which columns are needed
doing the sort and choosing the rows.
- The TABLE_SHARE object has a 'all_set' column bitmap one can use
when one needs a column bitmap with all columns set.
(This is used for table->use_all_columns() and other places)
- The TABLE object has 3 column bitmaps:
- def_read_set Default bitmap for columns to be read
- def_write_set Default bitmap for columns to be written
- tmp_set Can be used as a temporary bitmap when needed.
The table object also has two pointers to bitmaps, read_set and write_set,
that the handler should use to find out which columns are used in which way.
- count() optimization now calls handler::records() instead of using
handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true).
- Added extra argument to Item::walk() to indicate if we should also
traverse sub queries.
- Added TABLE parameter to cp_buffer_from_ref()
- Don't close tables created with CREATE ... SELECT but keep them in
the table cache. (Faster usage of newly created tables).
New interfaces:
- table->clear_column_bitmaps() to initialize the bitmaps for tables
at start of new statements.
- table->column_bitmaps_set() to set up new column bitmaps and signal
the handler about this.
- table->column_bitmaps_set_no_signal() for some few cases where we need
to setup new column bitmaps but don't signal the handler (as the handler
has already been signaled about these before). Used for the moment
only in opt_range.cc when doing ROR scans.
- table->use_all_columns() to install a bitmap where all columns are marked
as used in the read and the write set.
- table->default_column_bitmaps() to install the normal read and write
column bitmaps, but not signaling the handler about this.
This is mainly used when creating TABLE instances.
- table->mark_columns_needed_for_delete(),
table->mark_columns_needed_for_update() and
table->mark_columns_needed_for_insert() to allow us to put additional
columns in column usage maps if handler so requires.
(The handler indicates what it needs in handler->table_flags())
- table->prepare_for_position() to allow us to tell handler that it
needs to read primary key parts to be able to store them in
future table->position() calls.
(This replaces the table->file->ha_retrieve_all_pk function)
- table->mark_auto_increment_column() to tell the handler that we are going to
update columns that are part of any auto_increment key.
- table->mark_columns_used_by_index() to mark all columns that are part of
an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow
it to quickly know that it only needs to read columns that are part
of the key. (The handler can also use the column map for detecting this,
but simpler/faster handler can just monitor the extra() call).
- table->mark_columns_used_by_index_no_reset() to in addition to other columns,
also mark all columns that are used by the given key.
- table->restore_column_maps_after_mark_index() to restore to default
column maps after a call to table->mark_columns_used_by_index().
- New item function register_field_in_read_map(), for marking used columns
in table->read_map. Used by filesort() to mark all used columns
- Maintain in TABLE->merge_keys set of all keys that are used in query.
(Simplifies some optimization loops)
- Maintain Field->part_of_key_not_clustered which is like Field->part_of_key
but the field in the clustered key is not assumed to be part of all indexes.
(used in opt_range.cc for faster loops)
- dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map()
tmp_use_all_columns() and tmp_restore_column_map() functions to temporarily
mark all columns as usable. The 'dbug_' version is primarily intended for use
inside a handler when it just wants to call the Field::store() & Field::val()
functions, but doesn't need the column maps set for any other usage.
(i.e. bitmap_is_set() is never called)
- We can't use compare_records() to skip updates for handlers that return
a partial column set and the read_set doesn't cover all columns in the
write set. The reason for this is that if we have a column marked only for
write we can't in the MySQL level know if the value changed or not.
The reason this worked before was that MySQL marked all to be written
columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden
bug'.
- open_table_from_share() no longer sets up a temporary MEM_ROOT
object as a thread specific variable for the handler. Instead we
send the to-be-used MEM_ROOT to get_new_handler().
(Simpler, faster code)
Bugs fixed:
- Column marking was not done correctly in a lot of cases.
(ALTER TABLE, when using triggers, auto_increment fields etc)
(Could potentially result in wrong values inserted in table handlers
relying on that the old column maps or field->set_query_id was correct)
Especially when it comes to triggers, there may be cases where the
old code would cause lost/wrong values for NDB and/or InnoDB tables.
- Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags:
OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG.
This allowed me to remove some wrong warnings about:
"Some non-transactional changed tables couldn't be rolled back"
- Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset
(thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to lose
some warnings about
"Some non-transactional changed tables couldn't be rolled back"
- Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table()
which could cause delete_table to report random failures.
- Fixed core dumps for some tests when running with --debug
- Added missing FN_LIBCHAR in mysql_rm_tmp_tables()
(This has probably caused us to not properly remove temporary files after
crash)
- slow_logs was not properly initialized, which could maybe cause
extra/lost entries in slow log.
- If we get a duplicate row on insert, change column map to read and
write all columns while retrying the operation. This is required by
the definition of REPLACE and also ensures that fields that are only
part of UPDATE are properly handled. This fixed a bug in NDB and
REPLACE where REPLACE wrongly copied some column values from the replaced
row.
- For table handlers that don't support NULL in keys, we would give an error
when creating a primary key with NULL fields, even after the fields had been
automatically converted to NOT NULL.
- Creating a primary key on a SPATIAL key, would fail if field was not
declared as NOT NULL.
Cleanups:
- Removed not used condition argument to setup_tables
- Removed not needed item function reset_query_id_processor().
- Field->add_index is removed. Now this is instead maintained in
(field->flags & FIELD_IN_ADD_INDEX)
- Field->fieldnr is removed (use field->field_index instead)
- New argument to filesort() to indicate that it should return a set of
row pointers (not used columns). This allowed me to remove some references
to sql_command in filesort and should also enable us to return column
results in some cases where we couldn't before.
- Changed column bitmap handling in opt_range.cc to be aligned with TABLE
bitmap, which allowed me to use bitmap functions instead of looping over
all fields to create some needed bitmaps. (Faster and smaller code)
- Broke up lines that were found to be too long
- Moved some variable declarations to the start of functions for better code
readability.
- Removed some not used arguments from functions.
(setup_fields(), mysql_prepare_insert_check_table())
- setup_fields() now takes an enum instead of an int for marking columns
usage.
- For internal temporary tables, use handler::write_row(),
handler::delete_row() and handler::update_row() instead of
handler::ha_xxxx() for faster execution.
- Changed some constants to enum's and define's.
- Using separate column read and write sets allows for easier checking
of whether the timestamp field was set by the statement.
- Remove calls to free_io_cache() as this is now done automatically in ha_reset()
- Don't build table->normalized_path as this is now identical to table->path
(after bar's fixes to convert filenames)
- Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to
do comparison with the 'convert-dbug-for-diff' tool.
Things left to do in 5.1:
- We wrongly log failed CREATE TABLE ... SELECT in some cases when using
row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result)
Mats has promised to look into this.
- Test that my fix for CREATE TABLE ... SELECT is indeed correct.
(I added several test cases for this, but in this case it's better that
someone else also tests this thoroughly).
Lars has promised to do this.
2006-06-04 17:52:22 +02:00
|
|
|
|
|
|
|
static inline my_bitmap_map *tmp_use_all_columns(TABLE *table,
|
|
|
|
MY_BITMAP *bitmap)
|
|
|
|
{
|
|
|
|
my_bitmap_map *old= bitmap->bitmap;
|
|
|
|
bitmap->bitmap= table->s->all_set.bitmap;
|
|
|
|
return old;
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
static inline void tmp_restore_column_map(MY_BITMAP *bitmap,
|
|
|
|
my_bitmap_map *old)
|
|
|
|
{
|
|
|
|
bitmap->bitmap= old;
|
|
|
|
}
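A minimal usage sketch of the save/restore pair above, assuming heavily simplified stand-ins for MY_BITMAP, TABLE_SHARE and TABLE (only the members the helpers touch are modelled, and column_is_set() is a hypothetical helper). It is meant to show that the helpers swap the bitmap's buffer pointer to the share's all_set buffer and back, without copying or modifying any bits.

```cpp
#include <cassert>

namespace sketch {

// Simplified stand-ins for the server types (assumptions for illustration).
typedef unsigned int my_bitmap_map;
struct MY_BITMAP { my_bitmap_map *bitmap; };
struct TABLE_SHARE { MY_BITMAP all_set; };
struct TABLE { TABLE_SHARE *s; MY_BITMAP *read_set; };

// Same logic as the helpers in this header, repeated so the sketch compiles alone.
static inline my_bitmap_map *tmp_use_all_columns(TABLE *table, MY_BITMAP *bitmap)
{
  my_bitmap_map *old= bitmap->bitmap;
  bitmap->bitmap= table->s->all_set.bitmap;
  return old;
}

static inline void tmp_restore_column_map(MY_BITMAP *bitmap, my_bitmap_map *old)
{
  bitmap->bitmap= old;
}

// Hypothetical helper: is column `idx` marked in `map`?
static bool column_is_set(const MY_BITMAP *map, unsigned idx)
{
  return ((map->bitmap[0] >> idx) & 1U) != 0;
}

// Returns true if column 2 is visible only while the all-columns map is
// installed and the original bits are untouched afterwards.
bool demo_tmp_use_all_columns()
{
  my_bitmap_map all_bits= 0x7;   // three columns, all marked
  my_bitmap_map read_bits= 0x1;  // the query reads only column 0
  TABLE_SHARE share= { { &all_bits } };
  MY_BITMAP read_set= { &read_bits };
  TABLE table= { &share, &read_set };

  bool before= column_is_set(table.read_set, 2);
  my_bitmap_map *old= tmp_use_all_columns(&table, table.read_set);
  bool during= column_is_set(table.read_set, 2);
  tmp_restore_column_map(table.read_set, old);
  bool after= column_is_set(table.read_set, 2);

  return !before && during && !after && read_bits == 0x1;
}

} // namespace sketch
```

Because only a pointer is exchanged, the pair is cheap enough to wrap around individual Field calls.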
|
|
|
|
|
|
|
|
/* The following is only needed for debugging */
|
|
|
|
|
|
|
|
static inline my_bitmap_map *dbug_tmp_use_all_columns(TABLE *table,
|
|
|
|
MY_BITMAP *bitmap)
|
|
|
|
{
|
|
|
|
#ifndef DBUG_OFF
|
|
|
|
return tmp_use_all_columns(table, bitmap);
|
|
|
|
#else
|
|
|
|
return 0;
|
|
|
|
#endif
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline void dbug_tmp_restore_column_map(MY_BITMAP *bitmap,
|
|
|
|
my_bitmap_map *old)
|
|
|
|
{
|
|
|
|
#ifndef DBUG_OFF
|
|
|
|
tmp_restore_column_map(bitmap, old);
|
|
|
|
#endif
|
|
|
|
}
|
WL#3817: Simplify string / memory area types and make things more consistent (first part)
The following type conversions were done:
- Changed byte to uchar
- Changed gptr to uchar*
- Change my_string to char *
- Change my_size_t to size_t
- Change size_s to size_t
Removed declaration of byte, gptr, my_string, my_size_t and size_s.
The following function parameter changes were done:
- All string functions in mysys/strings were changed to use size_t
instead of uint for string lengths.
- All read()/write() functions changed to use size_t (including vio).
- All protocol functions changed to use size_t instead of uint
- Functions that used a pointer to a string length were changed to use size_t*
- Changed malloc(), free() and related functions from using gptr to use void *
as this requires fewer casts in the code and is more in line with how the
standard functions work.
- Added extra length argument to dirname_part() to return the length of the
created string.
- Changed (at least) following functions to take uchar* as argument:
- db_dump()
- my_net_write()
- net_write_command()
- net_store_data()
- DBUG_DUMP()
- decimal2bin() & bin2decimal()
- Changed my_compress() and my_uncompress() to use size_t. Changed one
argument to my_uncompress() from a pointer to a value as we only return
one value (makes function easier to use).
- Changed type of 'pack_data' argument to packfrm() to avoid casts.
- Changed in readfrm() and writefrm(), ha_discover and handler::discover()
the type for argument 'frmdata' to uchar** to avoid casts.
- Changed most Field functions to use uchar* instead of char* (reduced a lot of
casts).
- Changed field->val_xxx(xxx, new_ptr) to take const pointers.
Other changes:
- Removed a lot of not needed casts
- Added a few new casts required by other changes
- Added some casts to my_multi_malloc() arguments for safety (as string lengths
need to be uint, not size_t).
- Fixed all calls to hash-get-key functions to use size_t*. (Needed to be done
explicitly as this conflict was often hidden by casting the function to
hash_get_key).
- Changed some buffers to memory regions to uchar* to avoid casts.
- Changed some string lengths from uint to size_t.
- Changed field->ptr to be uchar* instead of char*. This allowed us to
get rid of a lot of casts.
- Some changes from true -> TRUE, false -> FALSE, unsigned char -> uchar
- Include zlib.h in some files as we needed declaration of crc32()
- Changed MY_FILE_ERROR to be (size_t) -1.
- Changed many variables to hold the result of my_read() / my_write() to be
size_t. This was needed to properly detect errors (which are
returned as (size_t) -1).
- Removed some very old VMS code
- Changed packfrm()/unpackfrm() to not be depending on uint size
(portability fix)
- Removed windows specific code to restore cursor position as this
causes slowdown on windows and we should not mix read() and pread()
calls anyway as this is not thread safe. Updated function comment to
reflect this. Changed function that depended on original behavior of
my_pwrite() to itself restore the cursor position (one such case).
- Added some missing checking of return value of malloc().
- Changed definition of MOD_PAD_CHAR_TO_FULL_LENGTH to avoid 'long' overflow.
- Changed type of table_def::m_size from my_size_t to ulong to reflect that
m_size is the number of elements in the array, not a string/memory
length.
- Moved THD::max_row_length() to table.cc (as it's not depending on THD).
Inlined max_row_length_blob() into this function.
- More function comments
- Fixed some compiler warnings when compiled without partitions.
- Removed setting of LEX_STRING() arguments in declaration (portability fix).
- Some trivial indentation/variable name changes.
- Some trivial code simplifications:
- Replaced some calls to alloc_root + memcpy to use
strmake_root()/strdup_root().
- Changed some calls from memdup() to strmake() (Safety fix)
- Simpler loops in client-simple.c
2007-05-10 11:59:39 +02:00
|
|
|
|
2008-12-09 18:46:03 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
Variant of the above: handle both read and write sets.
|
|
|
|
Provide for the possibility of the read set being the same as the write set.
|
|
|
|
*/
|
|
|
|
static inline void dbug_tmp_use_all_columns(TABLE *table,
|
|
|
|
my_bitmap_map **save,
|
|
|
|
MY_BITMAP *read_set,
|
|
|
|
MY_BITMAP *write_set)
|
|
|
|
{
|
|
|
|
#ifndef DBUG_OFF
|
|
|
|
save[0]= read_set->bitmap;
|
|
|
|
save[1]= write_set->bitmap;
|
|
|
|
(void) tmp_use_all_columns(table, read_set);
|
|
|
|
(void) tmp_use_all_columns(table, write_set);
|
|
|
|
#endif
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
static inline void dbug_tmp_restore_column_maps(MY_BITMAP *read_set,
|
|
|
|
MY_BITMAP *write_set,
|
|
|
|
my_bitmap_map **old)
|
|
|
|
{
|
|
|
|
#ifndef DBUG_OFF
|
|
|
|
tmp_restore_column_map(read_set, old[0]);
|
|
|
|
tmp_restore_column_map(write_set, old[1]);
|
|
|
|
#endif
|
|
|
|
}
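The remark about the read set possibly being the same object as the write set can be illustrated with a small sketch. It assumes simplified stand-in types, drops the DBUG_OFF guards, and uses hypothetical function names. Chaining two single-map calls leaves an aliased bitmap pointing at the all-columns buffer after restore, whereas saving both pointers before changing either one restores it correctly.

```cpp
#include <cassert>

namespace alias_sketch {

// Simplified stand-ins for the server types (assumptions for illustration).
typedef unsigned int my_bitmap_map;
struct MY_BITMAP { my_bitmap_map *bitmap; };
struct TABLE_SHARE { MY_BITMAP all_set; };
struct TABLE { TABLE_SHARE *s; };

static my_bitmap_map *tmp_use_all_columns(TABLE *table, MY_BITMAP *bitmap)
{
  my_bitmap_map *old= bitmap->bitmap;
  bitmap->bitmap= table->s->all_set.bitmap;
  return old;
}

static void tmp_restore_column_map(MY_BITMAP *bitmap, my_bitmap_map *old)
{
  bitmap->bitmap= old;
}

// Chains the single-map helpers. With aliasing, the second call saves the
// all-columns buffer, so the final restore puts the wrong pointer back.
bool chained_calls_restore(bool aliased)
{
  my_bitmap_map all_bits= 0x7, r_bits= 0x1, w_bits= 0x2;
  TABLE_SHARE share= { { &all_bits } };
  TABLE table= { &share };
  MY_BITMAP r= { &r_bits }, w= { &w_bits };
  MY_BITMAP *read_set= &r;
  MY_BITMAP *write_set= aliased ? &r : &w;
  my_bitmap_map *orig_read= read_set->bitmap;
  my_bitmap_map *orig_write= write_set->bitmap;

  my_bitmap_map *old_r= tmp_use_all_columns(&table, read_set);
  my_bitmap_map *old_w= tmp_use_all_columns(&table, write_set);
  tmp_restore_column_map(read_set, old_r);
  tmp_restore_column_map(write_set, old_w);
  return read_set->bitmap == orig_read && write_set->bitmap == orig_write;
}

// Mirrors the two-map variant: both pointers are saved before either changes.
bool two_map_variant_restores(bool aliased)
{
  my_bitmap_map all_bits= 0x7, r_bits= 0x1, w_bits= 0x2;
  TABLE_SHARE share= { { &all_bits } };
  TABLE table= { &share };
  MY_BITMAP r= { &r_bits }, w= { &w_bits };
  MY_BITMAP *read_set= &r;
  MY_BITMAP *write_set= aliased ? &r : &w;
  my_bitmap_map *orig_read= read_set->bitmap;
  my_bitmap_map *orig_write= write_set->bitmap;

  my_bitmap_map *save[2];
  save[0]= read_set->bitmap;
  save[1]= write_set->bitmap;
  (void) tmp_use_all_columns(&table, read_set);
  (void) tmp_use_all_columns(&table, write_set);
  tmp_restore_column_map(read_set, save[0]);
  tmp_restore_column_map(write_set, save[1]);
  return read_set->bitmap == orig_read && write_set->bitmap == orig_write;
}

} // namespace alias_sketch
```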
|
|
|
|
|
|
|
|
|
2007-05-10 11:59:39 +02:00
|
|
|
size_t max_row_length(TABLE *table, const uchar *data);
|
|
|
|
|
2009-11-30 16:55:03 +01:00
|
|
|
|
2009-12-08 10:57:07 +01:00
|
|
|
void init_mdl_requests(TABLE_LIST *table_list);
|
2009-11-30 16:55:03 +01:00
|
|
|
|
2009-09-23 23:32:31 +02:00
|
|
|
#endif /* TABLE_INCLUDED */
|