/* Copyright (C) 2000-2006 MySQL AB

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; version 2 of the License.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software
   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA */

/* Structs that define the TABLE */

class Item;                             /* Needed by ORDER */
class Item_subselect;
class GRANT_TABLE;
class st_select_lex_unit;
class st_select_lex;
class partition_info;
class COND_EQUAL;
class Security_context;

Patch for the following bugs:
  - BUG#11986: Stored routines and triggers can fail if the code
    has a non-ascii symbol
  - BUG#16291: mysqldump corrupts string-constants with non-ascii-chars
  - BUG#19443: INFORMATION_SCHEMA does not support charsets properly
  - BUG#21249: Character set of SP-var can be ignored
  - BUG#25212: Character set of string constant is ignored (stored routines)
  - BUG#25221: Character set of string constant is ignored (triggers)

There were a few general problems that caused these bugs:
1. Character set information of the original (definition) query for views,
   triggers, stored routines and events was lost.
2. mysqldump output the query in the client character set, which can be
   inappropriate for encoding the definition query.
3. INFORMATION_SCHEMA used strings with mixed encodings to display the
   object definition.

1. No query-definition-character set.

   In order to compile a query into execution code, some extra data (such as
   environment variables or the database character set) is used. The problem
   here was that this context was not preserved. So, on the next load it can
   differ from the original one, and thus the result will be different.

   The context contains the following data:
     - client character set;
     - connection collation (character set and collation);
     - collation of the owner database.

   The fix is to store this context and use it each time we parse (compile)
   and execute the object (stored routine, trigger, ...).

2. Wrong mysqldump-output.

   The original query can contain several encodings (by means of character set
   introducers). The problem here was that we tried to convert the original
   query to the mysqldump-client character set.

   Moreover, we stored queries in different character sets for different
   objects (views, for one, used UTF8, triggers used the original character
   set).

   The solution is
     - to store definition queries in the original character set;
     - to change the SHOW CREATE statement to output the definition query in
       the binary character set (i.e. without any conversion);
     - to introduce the SHOW CREATE TRIGGER statement;
     - to dump special statements to switch the context to the original one
       before dumping and restore it afterwards.

   Note, in order to preserve the database collation at the creation time,
   an additional ALTER DATABASE might be used (to temporarily switch the
   database collation back to the original value). In this case, the ALTER
   DATABASE privilege will be required. This is a backward-incompatible change.

3. INFORMATION_SCHEMA showed non-UTF8 strings

   The fix is to generate a UTF8 query during the parsing, store it in the
   object and show it in the INFORMATION_SCHEMA.

   Basically, the idea is to create a copy of the original query and convert
   it to UTF8. Character set introducers are removed and all text literals are
   converted to UTF8.

   This UTF8 query is intended to provide user-readable output. It must not be
   used to recreate the object. Specialized SHOW CREATE statements should be
   used for this.

   The reason for this limitation is the following: the original query can
   contain symbols from several character sets (by means of character set
   introducers).

   Example:
     - original query:
       CREATE VIEW v1 AS SELECT _cp1251 'Hello' AS c1;
     - UTF8 query (for INFORMATION_SCHEMA):
       CREATE VIEW v1 AS SELECT 'Hello' AS c1;
2007-06-28 19:34:54 +02:00
/*************************************************************************/

/**
  View_creation_ctx -- creation context of view objects.
*/

class View_creation_ctx : public Default_object_creation_ctx,
                          public Sql_alloc
{
public:
  static View_creation_ctx *create(THD *thd);

  static View_creation_ctx *create(THD *thd,
                                   TABLE_LIST *view);

private:
  View_creation_ctx(THD *thd)
    : Default_object_creation_ctx(thd)
  { }
};
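
The commit note for this class describes the fix as storing the creation context (client character set, connection collation, collation of the owner database) and using it again each time the object is parsed. A minimal sketch of that save-and-restore idea follows. Every name in it (SessionCharsets, CreationContext, ContextGuard, parse_with_context) is hypothetical and chosen for illustration only; it is not the server's implementation.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the session settings that influence parsing.
struct SessionCharsets
{
  std::string client_cs;      // client character set
  std::string connection_cl;  // connection collation
  std::string db_cl;          // collation of the owner database
};

// Hypothetical creation context: a snapshot taken when the object is created.
struct CreationContext
{
  SessionCharsets saved;
  explicit CreationContext(const SessionCharsets &at_creation)
    : saved(at_creation)
  { }
};

// RAII guard: switch the session to the saved context and put the previous
// settings back when the guard goes out of scope.
class ContextGuard
{
public:
  ContextGuard(SessionCharsets &session, const CreationContext &ctx)
    : m_session(session), m_backup(session)
  { m_session= ctx.saved; }
  ~ContextGuard() { m_session= m_backup; }
  ContextGuard(const ContextGuard &)= delete;
  ContextGuard &operator=(const ContextGuard &)= delete;
private:
  SessionCharsets &m_session;
  SessionCharsets m_backup;
};

// Stand-in for "parse the definition": reports which client character set
// was in effect while the guard was active.
inline std::string parse_with_context(SessionCharsets &session,
                                      const CreationContext &ctx)
{
  ContextGuard guard(session, ctx);
  return session.client_cs;
}
```

With a guard like this, the result of parsing depends only on the stored snapshot and not on whatever the current session happens to use, which is the property the note asks for.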

/*************************************************************************/

/* Order clause list element */

typedef struct st_order {
  struct st_order *next;
  Item   **item;                        /* Point at item in select fields */
  Item   *item_ptr;                     /* Storage for initial item */
  Item   **item_copy;                   /* For SPs; the original item ptr */
  int    counter;                       /* position in SELECT list, correct
                                           only if counter_used is true*/
  bool   asc;                           /* true if ascending */
  bool   free_me;                       /* true if item isn't shared */
  bool   in_field_list;                 /* true if in select field list */
  bool   counter_used;                  /* parameter was counter of columns */
  Field  *field;                        /* If tmp-table group */
  char   *buff;                         /* If tmp-table group */
  table_map used, depend_map;
} ORDER;
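
ORDER is a singly linked list chained through its next member. The sketch below walks such a list. OrderElem is a cut-down, hypothetical stand-in that keeps only a few of the fields above so that the example compiles on its own; the helper names are likewise assumptions.

```cpp
#include <cassert>
#include <cstddef>

// Simplified, hypothetical stand-in for ORDER: only the fields used here.
struct OrderElem
{
  OrderElem *next;      // next element, or nullptr at the end of the list
  int counter;          // position in SELECT list, valid only if counter_used
  bool asc;             // true if ascending
  bool counter_used;    // element was given as a column position
};

// Number of elements in the list.
inline std::size_t order_length(const OrderElem *order)
{
  std::size_t n= 0;
  for (; order; order= order->next)
    n++;
  return n;
}

// True if every element sorts ascending (also true for an empty list).
inline bool all_ascending(const OrderElem *order)
{
  for (; order; order= order->next)
    if (!order->asc)
      return false;
  return true;
}
```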

typedef struct st_grant_info
{
  GRANT_TABLE *grant_table;
  uint version;
  ulong privilege;
  ulong want_privilege;
  /*
    Stores the requested access acl of top level tables list. Is used to
    check access rights to the underlying tables of a view.
  */
  ulong orig_want_privilege;
} GRANT_INFO;

enum tmp_table_type
{
  NO_TMP_TABLE, NON_TRANSACTIONAL_TMP_TABLE, TRANSACTIONAL_TMP_TABLE,
  INTERNAL_TMP_TABLE, SYSTEM_TMP_TABLE
};

A fix and a test case for Bug#26141 mixing table types in trigger
causes full table lock on innodb table.
Also fixes Bug#28502 Triggers that update another innodb table
will block on X lock unnecessarily (duplicate).
Code review fixes.

Both bugs' synopses are misleading: the InnoDB table is
not X locked. The statements, however, cannot proceed concurrently,
but this happens due to lock conflicts for tables used in triggers,
not for the InnoDB table.

If a user had an InnoDB table, and two triggers, AFTER UPDATE and
AFTER INSERT, competing for different resources (e.g. two distinct
MyISAM tables), then these two triggers would not be able to execute
concurrently. Moreover, INSERTS/UPDATES of the InnoDB table would
not be able to run concurrently.
The problem had other side-effects (see respective bug reports).

This behavior was a consequence of a shortcoming of the pre-locking
algorithm, which would not distinguish between different DML operations
(e.g. INSERT and DELETE) and would pre-lock all the tables
that are used by any trigger defined on the subject table.

The idea of the fix is to extend the pre-locking algorithm to keep track,
for each table, of which DML operation it is used for, and not to
load triggers that are known never to be fired.
2007-07-12 20:26:41 +02:00
/** Event on which trigger is invoked. */
enum trg_event_type
{
  TRG_EVENT_INSERT= 0,
  TRG_EVENT_UPDATE= 1,
  TRG_EVENT_DELETE= 2,
  TRG_EVENT_MAX
};
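
The commit note for this enum says the pre-locking algorithm keeps track, for each table, of which DML operation it is used for. One way to picture that bookkeeping is a small bitmap with one bit per trg_event_type value. The sketch below assumes that representation; TableUse, event_bit, mark_event and may_fire are hypothetical names, not taken from the source.

```cpp
#include <cassert>
#include <cstdint>

// Repeated here so that the sketch compiles on its own.
enum trg_event_type
{
  TRG_EVENT_INSERT= 0,
  TRG_EVENT_UPDATE= 1,
  TRG_EVENT_DELETE= 2,
  TRG_EVENT_MAX
};

// Hypothetical helper: the bit that represents one event.
inline std::uint8_t event_bit(trg_event_type ev)
{
  return static_cast<std::uint8_t>(1u << ev);
}

// Hypothetical per-table record of the DML operations a statement performs.
struct TableUse
{
  std::uint8_t event_map= 0;
};

// Record that the statement performs this operation on the table.
inline void mark_event(TableUse &t, trg_event_type ev)
{
  t.event_map= static_cast<std::uint8_t>(t.event_map | event_bit(ev));
}

// Triggers for an event need loading only if the event is marked.
inline bool may_fire(const TableUse &t, trg_event_type ev)
{
  return (t.event_map & event_bit(ev)) != 0;
}
```

A statement that only inserts into a table would mark TRG_EVENT_INSERT, so a DELETE trigger on that table, and the tables it touches, could be left out of pre-locking.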

enum frm_type_enum
{
  FRMTYPE_ERROR= 0,
  FRMTYPE_TABLE,
  FRMTYPE_VIEW
};

enum release_type { RELEASE_NORMAL, RELEASE_WAIT_FOR_DROP };

typedef struct st_filesort_info
{
WL#3817: Simplify string / memory area types and make things more consistent (first part)

The following type conversions were done:
- Changed byte to uchar
- Changed gptr to uchar*
- Changed my_string to char *
- Changed my_size_t to size_t
- Changed size_s to size_t

Removed declaration of byte, gptr, my_string, my_size_t and size_s.

The following function parameter changes were done:
- All string functions in mysys/strings were changed to use size_t
  instead of uint for string lengths.
- All read()/write() functions changed to use size_t (including vio).
- All protocol functions changed to use size_t instead of uint
- Functions that used a pointer to a string length were changed to use size_t*
- Changed malloc(), free() and related functions from using gptr to use void *
  as this requires fewer casts in the code and is more in line with how the
  standard functions work.
- Added extra length argument to dirname_part() to return the length of the
  created string.
- Changed (at least) the following functions to take uchar* as argument:
  - db_dump()
  - my_net_write()
  - net_write_command()
  - net_store_data()
  - DBUG_DUMP()
  - decimal2bin() & bin2decimal()
- Changed my_compress() and my_uncompress() to use size_t. Changed one
  argument to my_uncompress() from a pointer to a value as we only return
  one value (makes the function easier to use).
- Changed type of 'pack_data' argument to packfrm() to avoid casts.
- Changed in readfrm() and writefrm(), ha_discover and handler::discover()
  the type for argument 'frmdata' to uchar** to avoid casts.
- Changed most Field functions to use uchar* instead of char* (removed a lot
  of casts).
- Changed field->val_xxx(xxx, new_ptr) to take const pointers.

Other changes:
- Removed a lot of unneeded casts
- Added a few new casts required by other changes
- Added some casts to my_multi_malloc() arguments for safety (as string lengths
  need to be uint, not size_t).
- Fixed all calls to hash-get-key functions to use size_t*. (Needed to be done
  explicitly as this conflict was often hidden by casting the function to
  hash_get_key).
- Changed some buffers to memory regions to uchar* to avoid casts.
- Changed some string lengths from uint to size_t.
- Changed field->ptr to be uchar* instead of char*. This allowed us to
  get rid of a lot of casts.
- Some changes from true -> TRUE, false -> FALSE, unsigned char -> uchar
- Included zlib.h in some files as we needed the declaration of crc32()
- Changed MY_FILE_ERROR to be (size_t) -1.
- Changed many variables that hold the result of my_read() / my_write() to be
  size_t. This was needed to properly detect errors (which are
  returned as (size_t) -1).
- Removed some very old VMS code
- Changed packfrm()/unpackfrm() to not depend on uint size
  (portability fix)
- Removed Windows-specific code to restore cursor position as this
  causes slowdown on Windows and we should not mix read() and pread()
  calls anyway as this is not thread safe. Updated function comment to
  reflect this. Changed function that depended on original behavior of
  my_pwrite() to itself restore the cursor position (one such case).
- Added some missing checks of the return value of malloc().
- Changed definition of MOD_PAD_CHAR_TO_FULL_LENGTH to avoid 'long' overflow.
- Changed type of table_def::m_size from my_size_t to ulong to reflect that
  m_size is the number of elements in the array, not a string/memory
  length.
- Moved THD::max_row_length() to table.cc (as it does not depend on THD).
  Inlined max_row_length_blob() into this function.
- More function comments
- Fixed some compiler warnings when compiled without partitions.
- Removed setting of LEX_STRING() arguments in declaration (portability fix).
- Some trivial indentation/variable name changes.
- Some trivial code simplifications:
  - Replaced some calls to alloc_root + memcpy to use
    strmake_root()/strdup_root().
  - Changed some calls from memdup() to strmake() (Safety fix)
  - Simpler loops in client-simple.c
2007-05-10 11:59:39 +02:00
  IO_CACHE *io_cache;           /* If sorted through filesort */
  uchar     **sort_keys;        /* Buffer for sorting keys */
  uchar     *buffpek;           /* Buffer for buffpek structures */
  uint      buffpek_len;        /* Max number of buffpeks in the buffer */
  uchar     *addon_buf;         /* Pointer to a buffer if sorted with fields */
  size_t    addon_length;       /* Length of the buffer */
  struct st_sort_addon_field *addon_field;     /* Pointer to the fields info */
  void    (*unpack)(struct st_sort_addon_field *, uchar *); /* To unpack back */
  uchar     *record_pointers;   /* If sorted in memory */
  ha_rows   found_records;      /* How many records in sort */
} FILESORT_INFO;
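
The WL#3817 note attached to this struct says that variables holding the result of my_read() / my_write() had to become size_t so that errors, returned as (size_t) -1, are detected properly. The sketch below illustrates why, under the assumption that size_t is wider than unsigned int (as on typical 64-bit platforms). FILE_ERROR_MARK and failing_read() are hypothetical stand-ins; the real MY_FILE_ERROR and my_read() are not used here.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for an error marker defined as (size_t) -1.
static const std::size_t FILE_ERROR_MARK= static_cast<std::size_t>(-1);

// Hypothetical read call that always reports failure.
inline std::size_t failing_read()
{
  return FILE_ERROR_MARK;
}

// Result kept in size_t: the comparison with the marker works.
inline bool detects_error_with_size_t()
{
  std::size_t got= failing_read();
  return got == FILE_ERROR_MARK;
}

// Result narrowed to unsigned int: the high bits are lost, so the value no
// longer equals the marker whenever size_t is wider than unsigned int.
inline bool detects_error_with_uint()
{
  unsigned int got= static_cast<unsigned int>(failing_read());
  return got == FILE_ERROR_MARK;
}
```

On a platform where both types have the same width the two functions agree. The narrowing only hides the error when size_t is the wider type.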

/*
  Values in this enum are used to indicate how a table's TIMESTAMP field
  should be treated. It can be set to the current timestamp on insert or
  update or both.
  WARNING: The values are used for bit operations. If you change the
  enum, you must keep the bitwise relation of the values. For example:
  (int) TIMESTAMP_AUTO_SET_ON_BOTH must be equal to
  (int) TIMESTAMP_AUTO_SET_ON_INSERT | (int) TIMESTAMP_AUTO_SET_ON_UPDATE.
  We use an enum here so that the debugger can display the value names.
*/
enum timestamp_auto_set_type
{
  TIMESTAMP_NO_AUTO_SET= 0, TIMESTAMP_AUTO_SET_ON_INSERT= 1,
  TIMESTAMP_AUTO_SET_ON_UPDATE= 2, TIMESTAMP_AUTO_SET_ON_BOTH= 3
};
#define clear_timestamp_auto_bits(_target_, _bits_) \
  (_target_)= (enum timestamp_auto_set_type)((int)(_target_) & ~(int)(_bits_))
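
The bitwise relation that the comment warns about can be checked at compile time, and the macro's effect is easy to show on a value. The sketch below repeats the enum and macro so that it compiles on its own. The static_assert and the without_update() helper are additions for illustration and are not part of the original header.

```cpp
#include <cassert>

enum timestamp_auto_set_type
{
  TIMESTAMP_NO_AUTO_SET= 0, TIMESTAMP_AUTO_SET_ON_INSERT= 1,
  TIMESTAMP_AUTO_SET_ON_UPDATE= 2, TIMESTAMP_AUTO_SET_ON_BOTH= 3
};

#define clear_timestamp_auto_bits(_target_, _bits_) \
  (_target_)= (enum timestamp_auto_set_type)((int)(_target_) & ~(int)(_bits_))

// Compile-time check of the relation stated in the comment.
static_assert((int) TIMESTAMP_AUTO_SET_ON_BOTH ==
              ((int) TIMESTAMP_AUTO_SET_ON_INSERT |
               (int) TIMESTAMP_AUTO_SET_ON_UPDATE),
              "ON_BOTH must equal ON_INSERT | ON_UPDATE");

// Hypothetical helper: drop the "on update" bit from a value.
inline timestamp_auto_set_type without_update(timestamp_auto_set_type t)
{
  clear_timestamp_auto_bits(t, TIMESTAMP_AUTO_SET_ON_UPDATE);
  return t;
}
```

Clearing the update bit from TIMESTAMP_AUTO_SET_ON_BOTH leaves TIMESTAMP_AUTO_SET_ON_INSERT, which only works because the enum values keep the bit layout described above.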

class Field_timestamp;
class Field_blob;
class Table_triggers_list;

WL#3984 (Revise locking of mysql.general_log and mysql.slow_log)
Bug#25422 (Hang with log tables)
Bug 17876 (Truncating mysql.slow_log in a SP after using cursor locks the
  thread)
Bug 23044 (Warnings on flush of a log table)
Bug 29129 (Resetting general_log while the GLOBAL READ LOCK is set causes
  a deadlock)

Prior to this fix, the server would hang when performing concurrent
ALTER TABLE or TRUNCATE TABLE statements against the LOG TABLES,
which are mysql.general_log and mysql.slow_log.

The root cause traces to the following code, in sql_base.cc, open_table():

  if (table->in_use != thd)
  {
    /* wait_for_condition will unlock LOCK_open for us */
    wait_for_condition(thd, &LOCK_open, &COND_refresh);
  }

The problem with this code is that the current implementation of the
LOGGER creates 'fake' THD objects, like
- Log_to_csv_event_handler::general_log_thd
- Log_to_csv_event_handler::slow_log_thd
which are not associated with a real thread running in the server,
so that waiting for these non-existing threads to release table locks
causes the deadlock.

In general, the design of Log_to_csv_event_handler does not fit into the
general architecture of the server, so the concept of general_log_thd
and slow_log_thd has to be abandoned:
- this implementation does not work with table locking
- it will not work with commands like SHOW PROCESSLIST
- having the log tables always opened does not integrate well with DDL
  operations / FLUSH TABLES / SET GLOBAL READ_ONLY

With this patch, the fundamental design of the LOGGER has been changed to:
- always open and close a log table when writing a log
- remove the usage of fake THD objects entirely
- clarify how locking of log tables is implemented in general.

See WL#3984 for details related to the new locking design.

Additional changes (misc bugs exposed and fixed):

1)
mysqldump would ignore some tables in dump_all_tables_in_db(),
but forget to ignore the same tables in dump_all_views_in_db().

2)
mysqldump would also issue an empty "LOCK TABLE" command when all the tables
to lock are to be ignored (numrows == 0), instead of not issuing the query.

3)
Internal error handlers could intercept errors but not warnings
(see sql_error.cc).

4)
Implementing a nested call to open tables, for the performance schema tables,
exposed an existing bug in remove_table_from_cache(), which would perform:
  in_use->some_tables_deleted=1;
against another thread, without any consideration of thread locking.
This call inside remove_table_from_cache() was not required anyway,
since calling mysql_lock_abort() takes care of aborting -- cleanly -- threads
that might hold a lock on a table.
This line (in_use->some_tables_deleted=1) has been removed.
2007-07-27 08:31:06 +02:00
/**
  Category of table found in the table share.
*/
enum enum_table_category
{
  /**
    Unknown value.
  */
  TABLE_UNKNOWN_CATEGORY=0,

  /**
    Temporary table.
    The table is visible only in the session.
    Therefore,
    - FLUSH TABLES WITH READ LOCK
    - SET GLOBAL READ_ONLY = ON
    do not apply to this table.
    Note that LOCK TABLE t FOR READ/WRITE
    can be used on temporary tables.
    Temporary tables are not part of the table cache.
  */
  TABLE_CATEGORY_TEMPORARY=1,

  /**
    User table.
    These tables do honor:
    - LOCK TABLE t FOR READ/WRITE
    - FLUSH TABLES WITH READ LOCK
    - SET GLOBAL READ_ONLY = ON
    User tables are cached in the table cache.
  */
  TABLE_CATEGORY_USER=2,

  /**
    System table, maintained by the server.
    These tables do honor:
    - LOCK TABLE t FOR READ/WRITE
WL#3984 (Revise locking of mysql.general_log and mysql.slow_log)
Bug#25422 (Hang with log tables)
Bug 17876 (Truncating mysql.slow_log in a SP after using cursor locks the
thread)
Bug 23044 (Warnings on flush of a log table)
Bug 29129 (Resetting general_log while the GLOBAL READ LOCK is set causes
a deadlock)
Prior to this fix, the server would hang when performing concurrent
ALTER TABLE or TRUNCATE TABLE statements against the LOG TABLES,
which are mysql.general_log and mysql.slow_log.
The root cause traces to the following code:
in sql_base.cc, open_table()
if (table->in_use != thd)
{
/* wait_for_condition will unlock LOCK_open for us */
wait_for_condition(thd, &LOCK_open, &COND_refresh);
}
The problem with this code is that the current implementation of the
LOGGER creates 'fake' THD objects, like
- Log_to_csv_event_handler::general_log_thd
- Log_to_csv_event_handler::slow_log_thd
which are not associated with a real thread running in the server,
so waiting for these non-existent threads to release table locks
causes the deadlock.
In general, the design of Log_to_csv_event_handler does not fit into the
general architecture of the server, so that the concept of general_log_thd
and slow_log_thd has to be abandoned:
- this implementation does not work with table locking
- it will not work with commands like SHOW PROCESSLIST
- having the log tables always opened does not integrate well with DDL
operations / FLUSH TABLES / SET GLOBAL READ_ONLY
With this patch, the fundamental design of the LOGGER has been changed to:
- always open and close a log table when writing a log
- completely remove the use of fake THD objects
- clarify how locking of log tables is implemented in general.
See WL#3984 for details related to the new locking design.
Additional changes (misc bugs exposed and fixed):
1)
mysqldump would ignore some tables in dump_all_tables_in_db(),
but forgot to ignore the same tables in dump_all_views_in_db().
2)
mysqldump would also issue an empty "LOCK TABLE" command when all the tables
to lock are to be ignored (numrows == 0), instead of not issuing the query.
3)
Internal error handlers could intercept errors but not warnings
(see sql_error.cc).
4)
Implementing a nested call to open tables, for the performance schema tables,
exposed an existing bug in remove_table_from_cache(), which would perform:
in_use->some_tables_deleted=1;
against another thread, without any consideration about thread locking.
This call inside remove_table_from_cache() was not required anyway,
since calling mysql_lock_abort() takes care of aborting -- cleanly -- threads
that might hold a lock on a table.
This line (in_use->some_tables_deleted=1) has been removed.
2007-07-27 08:31:06 +02:00
    - FLUSH TABLES WITH READ LOCK
    - SET GLOBAL READ_ONLY = ON
    Typically, writes to system tables are performed by
    the server implementation, not explicitly by a user.
    System tables are cached in the table cache.
  */
  TABLE_CATEGORY_SYSTEM=3,

  /**
    Information schema tables.
    These tables are an interface provided by the system
    to inspect the system metadata.
    These tables do *not* honor:
2007-08-15 15:43:08 +02:00
    - LOCK TABLE t FOR READ/WRITE
2007-07-27 08:31:06 +02:00
    - FLUSH TABLES WITH READ LOCK
    - SET GLOBAL READ_ONLY = ON
    as there is no point in explicitly locking
    an INFORMATION_SCHEMA table.
    Nothing is directly written to information schema tables.
    Note that this value is not used currently,
    since information schema tables are not shared,
    but implemented as session specific temporary tables.
  */
  /*
    TODO: Fixing the performance issues of I_S will lead
    to I_S tables in the table cache, which should use
    this table type.
  */
  TABLE_CATEGORY_INFORMATION=4,

  /**
    Performance schema tables.
    These tables are an interface provided by the system
    to inspect the system performance data.
    These tables do *not* honor:
2007-08-15 15:43:08 +02:00
    - LOCK TABLE t FOR READ/WRITE
2007-07-27 08:31:06 +02:00
    - FLUSH TABLES WITH READ LOCK
    - SET GLOBAL READ_ONLY = ON
    as there is no point in explicitly locking
    a PERFORMANCE_SCHEMA table.
    Examples of PERFORMANCE_SCHEMA tables are:
    - mysql.slow_log
    - mysql.general_log,
    which *are* updated even when there is either
    a GLOBAL READ LOCK or a GLOBAL READ_ONLY in effect.
    User queries do not write directly to these tables
    (there are exceptions for log tables).
    The server implementation performs writes.
    Performance tables are cached in the table cache.
  */
  TABLE_CATEGORY_PERFORMANCE=5
};
typedef enum enum_table_category TABLE_CATEGORY;

TABLE_CATEGORY get_table_category(const LEX_STRING *db,
                                  const LEX_STRING *name);
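The category comments above suggest that a table can be classified from its database and table name alone. The following is only a minimal sketch of that idea, not the server's implementation of get_table_category(): it uses std::string instead of LEX_STRING, and the names TableCategorySketch, equals_ignore_case() and classify_table_sketch() are hypothetical. It also assumes that every table in the "mysql" database other than the two log tables is a system table, which the header does not state.

```cpp
#include <cassert>
#include <algorithm>
#include <cctype>
#include <string>

// Enumerator values 2..5 mirror the header; the type name is hypothetical.
enum TableCategorySketch {
  SKETCH_CATEGORY_USER = 2,
  SKETCH_CATEGORY_SYSTEM = 3,
  SKETCH_CATEGORY_INFORMATION = 4,
  SKETCH_CATEGORY_PERFORMANCE = 5
};

// Hypothetical helper: ASCII-only, case-insensitive string comparison.
static bool equals_ignore_case(const std::string &a, const std::string &b) {
  return a.size() == b.size() &&
         std::equal(a.begin(), a.end(), b.begin(),
                    [](unsigned char x, unsigned char y) {
                      return std::tolower(x) == std::tolower(y);
                    });
}

// Hypothetical classifier following the header comments:
// - information_schema tables  -> INFORMATION
// - mysql.general_log/slow_log -> PERFORMANCE (named in the comments)
// - other mysql.* tables       -> SYSTEM (an assumption of this sketch)
// - everything else            -> USER
TableCategorySketch classify_table_sketch(const std::string &db,
                                          const std::string &name) {
  if (equals_ignore_case(db, "information_schema"))
    return SKETCH_CATEGORY_INFORMATION;
  if (equals_ignore_case(db, "mysql")) {
    if (equals_ignore_case(name, "general_log") ||
        equals_ignore_case(name, "slow_log"))
      return SKETCH_CATEGORY_PERFORMANCE;
    return SKETCH_CATEGORY_SYSTEM;
  }
  return SKETCH_CATEGORY_USER;
}
```

With this sketch, classify_table_sketch("mysql", "slow_log") yields the performance category, while a table in any other database falls through to the user category.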
2005-08-12 16:57:19 +02:00
/*
  This structure is shared between different table objects. There is one
  instance of table share per one table in the database.
*/
2005-01-06 12:00:13 +01:00
typedef struct st_table_share
{
2006-11-29 21:51:09 +01:00
  st_table_share() {}                    /* Remove gcc warning */
2007-07-27 08:31:06 +02:00
  /** Category of this table. */
  TABLE_CATEGORY table_category;
2004-06-25 15:52:01 +02:00
  /* hash of field names (contains pointers to elements of field array) */
2005-01-06 12:00:13 +01:00
  HASH name_hash;                        /* hash of field names */
  MEM_ROOT mem_root;
2000-07-31 21:29:14 +02:00
  TYPELIB keynames;                      /* Pointers to keynames */
  TYPELIB fieldnames;                    /* Pointer to fieldnames */
  TYPELIB *intervals;                    /* pointer to interval info */
2005-01-06 12:00:13 +01:00
  pthread_mutex_t mutex;                 /* For locking the share */
  pthread_cond_t cond;                   /* To signal that share is ready */
2005-11-23 21:45:02 +01:00
  struct st_table_share *next,           /* Link to unused shares */
    **prev;
#ifdef NOT_YET
2005-01-06 12:00:13 +01:00
  struct st_table *open_tables;          /* link to open tables */
2005-11-23 21:45:02 +01:00
#endif
2005-01-06 12:00:13 +01:00
  /* The following is copied to each TABLE on OPEN */
  Field **field;
2005-11-23 21:45:02 +01:00
  Field **found_next_number_field;
  Field *timestamp_field;                /* Used only during open */
2005-01-06 12:00:13 +01:00
  KEY *key_info;                         /* data of keys in database */
  uint *blob_field;                      /* Index to blobs in Field array */
2005-11-23 21:45:02 +01:00
WL#3817: Simplify string / memory area types and make things more consistent (first part)
The following type conversions were done:
- Changed byte to uchar
- Changed gptr to uchar*
- Change my_string to char *
- Change my_size_t to size_t
- Change size_s to size_t
Removed declaration of byte, gptr, my_string, my_size_t and size_s.
The following function parameter changes were done:
- All string functions in mysys/strings were changed to use size_t
instead of uint for string lengths.
- All read()/write() functions changed to use size_t (including vio).
- All protocol functions changed to use size_t instead of uint
- Functions that used a pointer to a string length were changed to use size_t*
- Changed malloc(), free() and related functions from using gptr to use void *
as this requires fewer casts in the code and is more in line with how the
standard functions work.
- Added extra length argument to dirname_part() to return the length of the
created string.
- Changed (at least) following functions to take uchar* as argument:
- db_dump()
- my_net_write()
- net_write_command()
- net_store_data()
- DBUG_DUMP()
- decimal2bin() & bin2decimal()
- Changed my_compress() and my_uncompress() to use size_t. Changed one
argument to my_uncompress() from a pointer to a value as we only return
one value (makes function easier to use).
- Changed type of 'pack_data' argument to packfrm() to avoid casts.
- Changed in readfrm() and writefrm(), ha_discover and handler::discover()
the type for argument 'frmdata' to uchar** to avoid casts.
- Changed most Field functions to use uchar* instead of char* (reduced a lot of
casts).
- Changed field->val_xxx(xxx, new_ptr) to take const pointers.
Other changes:
- Removed a lot of unneeded casts
- Added a few new casts required by other changes
- Added some casts to my_multi_malloc() arguments for safety (as string lengths
need to be uint, not size_t).
- Fixed all calls to hash-get-key functions to use size_t*. (Needed to be done
explicitly as this conflict was often hidden by casting the function to
hash_get_key).
- Changed some buffers to memory regions to uchar* to avoid casts.
- Changed some string lengths from uint to size_t.
- Changed field->ptr to be uchar* instead of char*. This allowed us to
get rid of a lot of casts.
- Some changes from true -> TRUE, false -> FALSE, unsigned char -> uchar
- Include zlib.h in some files as we needed declaration of crc32()
- Changed MY_FILE_ERROR to be (size_t) -1.
- Changed many variables to hold the result of my_read() / my_write() to be
size_t. This was needed to properly detect errors (which are
returned as (size_t) -1).
- Removed some very old VMS code
- Changed packfrm()/unpackfrm() to not be depending on uint size
(portability fix)
- Removed windows specific code to restore cursor position as this
causes slowdown on windows and we should not mix read() and pread()
calls anyway as this is not thread safe. Updated function comment to
reflect this. Changed function that depended on original behavior of
my_pwrite() to itself restore the cursor position (one such case).
- Added some missing checking of return value of malloc().
- Changed definition of MOD_PAD_CHAR_TO_FULL_LENGTH to avoid 'long' overflow.
- Changed type of table_def::m_size from my_size_t to ulong to reflect that
m_size is the number of elements in the array, not a string/memory
length.
- Moved THD::max_row_length() to table.cc (as it's not depending on THD).
Inlined max_row_length_blob() into this function.
- More function comments
- Fixed some compiler warnings when compiled without partitions.
- Removed setting of LEX_STRING() arguments in declaration (portability fix).
- Some trivial indentation/variable name changes.
- Some trivial code simplifications:
- Replaced some calls to alloc_root + memcpy to use
strmake_root()/strdup_root().
- Changed some calls from memdup() to strmake() (Safety fix)
- Simpler loops in client-simple.c
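The point above about keeping my_read() / my_write() results in size_t so that the (size_t) -1 error value is detected can be illustrated with a small sketch. This is not MySQL code: SKETCH_FILE_ERROR and failing_read_sketch() are hypothetical stand-ins for MY_FILE_ERROR and a failing read call. The sketch only shows why narrowing the result to a 32-bit unsigned type can hide the error on platforms where size_t is wider.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an error sentinel defined as (size_t) -1.
static const std::size_t SKETCH_FILE_ERROR = static_cast<std::size_t>(-1);

// Hypothetical read call that always fails and reports it via the sentinel.
std::size_t failing_read_sketch() { return SKETCH_FILE_ERROR; }

// Result kept in size_t: the comparison with the sentinel is exact.
bool failure_seen_with_size_t() {
  std::size_t n = failing_read_sketch();
  return n == SKETCH_FILE_ERROR;
}

// Result narrowed to unsigned int: where size_t is wider than unsigned int,
// the sentinel is truncated, and after being widened again for the
// comparison it no longer equals (size_t) -1, so the error goes unnoticed.
bool failure_seen_with_uint() {
  unsigned int n = static_cast<unsigned int>(failing_read_sketch());
  return n == SKETCH_FILE_ERROR;
}
```

On a platform where unsigned int and size_t have the same width both functions report the failure; where size_t is wider, only the size_t version does.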
2007-05-10 11:59:39 +02:00
  uchar *default_values;                 /* row with default values */
2006-06-29 15:39:34 +02:00
  LEX_STRING comment;                    /* Comment about table */
2005-01-06 12:00:13 +01:00
  CHARSET_INFO *table_charset;           /* Default charset of string fields */
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that were found necessary while testing the handler changes
Changes that require code changes in the code of other storage engines.
(Note that all changes are very straightforward and one should find all issues
by compiling a --debug build and fixing all compiler errors and all
asserts in field.cc while running the test suite),
- New optional handler function introduced: reset()
This is called after every DML statement to make it easy for a handler to
do statement specific cleanups.
(The only case it's not called is if we force the file to be closed)
- handler::extra(HA_EXTRA_RESET) is removed. Code that was there before
should be moved to handler::reset()
- table->read_set contains a bitmap over all columns that are needed
in the query. read_row() and similar functions only needs to read these
columns
- table->write_set contains a bitmap over all columns that will be updated
in the query. write_row() and update_row() only needs to update these
columns.
The above bitmaps should now be up to date in all context
(including ALTER TABLE, filesort()).
The handler is informed of any changes to the bitmap after
fix_fields() by calling the virtual function
handler::column_bitmaps_signal(). If the handler does caching of
these bitmaps (instead of using table->read_set, table->write_set),
it should redo the caching in this code. as the signal() may be sent
several times, it's probably best to set a variable in the signal
and redo the caching on read_row() / write_row() if the variable was
set.
- Removed the read_set and write_set bitmap objects from the handler class
- Removed all column bit handling functions from the handler class.
(Now one instead uses the normal bitmap functions in my_bitmap.c instead
of handler dedicated bitmap functions)
- field->query_id is removed. One should instead check
table->read_set and table->write_set if a field is used in the query.
- handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and
handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now
instead use table->read_set to check for which columns to retrieve.
- If a handler needs to call Field->val() or Field->store() on columns
that are not used in the query, one should install a temporary
all-columns-used map while doing so. For this, we provide the following
functions:
my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set);
field->val();
dbug_tmp_restore_column_map(table->read_set, old_map);
and similar for the write map:
my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set);
field->val();
dbug_tmp_restore_column_map(table->write_set, old_map);
If this is not done, you will sooner or later hit a DBUG_ASSERT
in the field store() / val() functions.
(For non-DBUG binaries, dbug_tmp_use_all_columns() and
dbug_tmp_restore_column_map() are inline dummy functions and should
be optimized away by the compiler).
- If one needs to temporarily set the column map for all binaries (and not
just to avoid the DBUG_ASSERT() in the Field::store() / Field::val()
methods) one should use the functions tmp_use_all_columns() and
tmp_restore_column_map() instead of the above dbug_ variants.
- All 'status' fields in the handler base class (like records,
data_file_length etc) are now stored in a 'stats' struct. This makes
it easier to know what status variables are provided by the base
handler. This requires some trivial variable names in the extra()
function.
- New virtual function handler::records(). This is called to optimize
COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true.
(stats.records is not supposed to be an exact value. It only has to
be 'reasonable enough' for the optimizer to be able to choose a good
optimization path).
- Non virtual handler::init() function added for caching of virtual
constants from engine.
- Removed has_transactions() virtual method. Now one should instead return
HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support
transactions.
- The 'xxxx_create_handler()' function now has a MEM_ROOT *mem_root argument
that is to be used with 'new handler_name()' to allocate the handler
in the right area. The xxxx_create_handler() function is also
responsible for any initialization of the object before returning.
For example, one should change:
static handler *myisam_create_handler(TABLE_SHARE *table)
{
return new ha_myisam(table);
}
->
static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root)
{
return new (mem_root) ha_myisam(table);
}
- New optional virtual function: use_hidden_primary_key().
This is called in case of an update/delete when
(table_flags() & HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is set
but we don't have a primary key. This allows the handler to take precautions
in remembering any hidden primary key to be able to update/delete any
found row. The default handler marks all columns to be read.
- handler::table_flags() now returns a ulonglong (to allow for more flags).
- New/changed table_flags()
- HA_HAS_RECORDS Set if ::records() is supported
- HA_NO_TRANSACTIONS Set if engine doesn't support transactions
- HA_PRIMARY_KEY_REQUIRED_FOR_DELETE
Set if we should mark all primary key columns for
read when reading rows as part of a DELETE
statement. If there is no primary key,
all columns are marked for read.
- HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some
cases (based on table->read_set)
- HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS
Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION.
- HA_DUPP_POS Renamed to HA_DUPLICATE_POS
- HA_REQUIRES_KEY_COLUMNS_FOR_DELETE
Set this if we should mark ALL key columns for
read when reading rows as part of a DELETE
statement. In case of an update we will mark
all keys for read for which key part changed
value.
- HA_STATS_RECORDS_IS_EXACT
Set this if stats.records is exact.
(This saves us some extra records() calls
when optimizing COUNT(*))
- Removed table_flags()
- HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if
handler::records() gives an exact count() and
HA_STATS_RECORDS_IS_EXACT if stats.records is exact.
- HA_READ_RND_SAME Removed (no one supported this one)
- Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk()
- Renamed handler::dupp_pos to handler::dup_pos
- Removed not used variable handler::sortkey
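The xxxx_create_handler() change above relies on a placement form of operator new that takes the memory root as an argument, so that `new (mem_root) ha_myisam(table)` allocates the handler from that root. The following is a minimal sketch of that mechanism under stated assumptions: ArenaSketch is a hypothetical bump allocator standing in for MEM_ROOT, HandlerSketch stands in for a handler class, and none of these names exist in the server.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical stand-in for MEM_ROOT: a fixed-capacity bump allocator.
struct ArenaSketch {
  explicit ArenaSketch(std::size_t capacity) : buffer(capacity), used(0) {}

  // Returns suitably aligned memory from the buffer, or nullptr when full.
  void *alloc(std::size_t size) {
    const std::size_t align = alignof(std::max_align_t);
    const std::size_t start = (used + align - 1) / align * align;
    if (start + size > buffer.size()) return nullptr;
    used = start + size;
    return buffer.data() + start;
  }

  std::vector<unsigned char> buffer;
  std::size_t used;
};

// Hypothetical handler class with a placement operator new taking the arena.
struct HandlerSketch {
  explicit HandlerSketch(int id) : table_id(id) {}

  // noexcept: a nullptr result makes the new-expression yield nullptr
  // without running the constructor.
  static void *operator new(std::size_t size, ArenaSketch *arena) noexcept {
    return arena->alloc(size);
  }
  // Matching placement delete; only used if the constructor throws.
  // The arena releases all of its memory at once, so nothing to do here.
  static void operator delete(void *, ArenaSketch *) noexcept {}

  int table_id;
};

// Same shape as the create-handler function in the message above.
HandlerSketch *create_handler_sketch(int table_id, ArenaSketch *arena) {
  return new (arena) HandlerSketch(table_id);
}
```

The object is never deleted individually; its storage goes away with the arena, which is presumably the point of allocating the handler "in the right area".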
Upper level handler changes:
- ha_reset() now does some overall checks and calls ::reset()
- ha_table_flags() added. This is a cached version of table_flags(). The
cache is updated on engine creation time and updated on open.
MySQL level changes (not obvious from the above):
- DBUG_ASSERT() added to check that column usage matches what is set
in the column usage bit maps. (This found a LOT of bugs in current
column marking code).
- In 5.1 before, all used columns were marked in read_set and only updated
columns were marked in write_set. Now we only mark columns for which we
need a value in read_set.
- Column bitmaps are created in open_binary_frm() and open_table_from_share().
(Before this was in table.cc)
- handler::table_flags() calls are replaced with handler::ha_table_flags()
- For calling field->val() you must have the corresponding bit set in
table->read_set. For calling field->store() you must have the
corresponding bit set in table->write_set. (There are asserts in
all store()/val() functions to catch wrong usage)
- thd->set_query_id is renamed to thd->mark_used_columns and instead
of setting this to an integer value, this has now the values:
MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE
Changed also all variables named 'set_query_id' to mark_used_columns.
- In filesort() we now inform the handler of exactly which columns are needed
doing the sort and choosing the rows.
- The TABLE_SHARE object has a 'all_set' column bitmap one can use
when one needs a column bitmap with all columns set.
(This is used for table->use_all_columns() and other places)
- The TABLE object has 3 column bitmaps:
- def_read_set Default bitmap for columns to be read
- def_write_set Default bitmap for columns to be written
- tmp_set Can be used as a temporary bitmap when needed.
The table object also has two pointers to bitmaps, read_set and write_set,
that the handler should use to find out which columns are used in which way.
- count() optimization now calls handler::records() instead of using
handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true).
- Added extra argument to Item::walk() to indicate if we should also
traverse sub queries.
- Added TABLE parameter to cp_buffer_from_ref()
- Don't close tables created with CREATE ... SELECT but keep them in
the table cache. (Faster usage of newly created tables).
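The read_set contract described above (an engine only needs to read the columns marked in table->read_set, and code that must touch other columns temporarily installs an all-columns map and restores the old one afterwards) can be sketched as follows. This is an illustration only: ColumnMapSketch uses std::vector<bool> in place of MY_BITMAP, and read_row_sketch(), use_all_columns_sketch() and restore_column_map_sketch() are hypothetical names modelled on the functions mentioned in the message.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a column bitmap such as table->read_set.
typedef std::vector<bool> ColumnMapSketch;

// Hypothetical engine read: copies only the columns flagged in read_set
// and leaves every other column empty.
std::vector<std::string> read_row_sketch(
    const std::vector<std::string> &stored_row,
    const ColumnMapSketch &read_set) {
  std::vector<std::string> out(stored_row.size());
  for (std::size_t i = 0; i < stored_row.size(); ++i) {
    if (i < read_set.size() && read_set[i]) out[i] = stored_row[i];
  }
  return out;
}

// Marks all columns as used and returns the previous map so the caller can
// restore it (the save/restore shape of the tmp_* helpers named above).
ColumnMapSketch use_all_columns_sketch(ColumnMapSketch &map) {
  ColumnMapSketch old_map = map;
  map.assign(map.size(), true);
  return old_map;
}

// Restores a map previously saved by use_all_columns_sketch().
void restore_column_map_sketch(ColumnMapSketch &map,
                               const ColumnMapSketch &old_map) {
  map = old_map;
}
```

A caller would save the map, read the row with all columns enabled, and restore the map immediately afterwards, mirroring the dbug_tmp_use_all_columns() / dbug_tmp_restore_column_map() pairing shown earlier in the message.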
New interfaces:
- table->clear_column_bitmaps() to initialize the bitmaps for tables
at start of new statements.
- table->column_bitmaps_set() to set up new column bitmaps and signal
the handler about this.
- table->column_bitmaps_set_no_signal() for some few cases where we need
to setup new column bitmaps but don't signal the handler (as the handler
has already been signaled about these before). Used for the moment
only in opt_range.cc when doing ROR scans.
- table->use_all_columns() to install a bitmap where all columns are marked
as used in the read and the write set.
- table->default_column_bitmaps() to install the normal read and write
column bitmaps, but not signaling the handler about this.
This is mainly used when creating TABLE instances.
- table->mark_columns_needed_for_delete(),
table->mark_columns_needed_for_update() and
table->mark_columns_needed_for_insert() to allow us to put additional
columns in column usage maps if handler so requires.
(The handler indicates what it needs in handler->table_flags())
- table->prepare_for_position() to allow us to tell handler that it
needs to read primary key parts to be able to store them in
future table->position() calls.
(This replaces the table->file->ha_retrieve_all_pk function)
- table->mark_auto_increment_column() to tell the handler we are going to update
columns part of any auto_increment key.
- table->mark_columns_used_by_index() to mark all columns that are part of
an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow
it to quickly know that it only needs to read columns that are part
of the key. (The handler can also use the column map for detecting this,
but simpler/faster handler can just monitor the extra() call).
- table->mark_columns_used_by_index_no_reset() to in addition to other columns,
also mark all columns that are used by the given key.
- table->restore_column_maps_after_mark_index() to restore to default
column maps after a call to table->mark_columns_used_by_index().
- New item function register_field_in_read_map(), for marking used columns
in table->read_map. Used by filesort() to mark all used columns
- Maintain in TABLE->merge_keys set of all keys that are used in query.
(Simplifies some optimization loops)
- Maintain Field->part_of_key_not_clustered which is like Field->part_of_key
but the field in the clustered key is not assumed to be part of all index.
(used in opt_range.cc for faster loops)
- dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map()
tmp_use_all_columns() and tmp_restore_column_map() functions to temporarily
mark all columns as usable. The 'dbug_' version is primarily intended
inside a handler when it wants to just call Field::store() & Field::val()
functions, but don't need the column maps set for any other usage.
(i.e. bitmap_is_set() is never called)
- We can't use compare_records() to skip updates for handlers that returns
a partial column set and the read_set doesn't cover all columns in the
write set. The reason for this is that if we have a column marked only for
write we can't in the MySQL level know if the value changed or not.
The reason this worked before was that MySQL marked all to be written
columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden
bug'.
- open_table_from_share() does not anymore setup temporary MEM_ROOT
object as a thread specific variable for the handler. Instead we
send the to-be-used MEM_ROOT to get_new_handler().
(Simpler, faster code)
Bugs fixed:
- Column marking was not done correctly in a lot of cases.
(ALTER TABLE, when using triggers, auto_increment fields etc)
(Could potentially result in wrong values inserted in table handlers
relying on that the old column maps or field->set_query_id was correct)
Especially when it comes to triggers, there may be cases where the
old code would cause lost/wrong values for NDB and/or InnoDB tables.
- Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags:
OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG.
This allowed me to remove some wrong warnings about:
"Some non-transactional changed tables couldn't be rolled back"
- Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset
(thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to lose
some warnings about
"Some non-transactional changed tables couldn't be rolled back")
- Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table()
which could cause delete_table to report random failures.
- Fixed core dumps for some tests when running with --debug
- Added missing FN_LIBCHAR in mysql_rm_tmp_tables()
(This has probably caused us to not properly remove temporary files after
crash)
- slow_logs was not properly initialized, which could cause
extra/lost entries in slow log.
- If we get a duplicate row on insert, change column map to read and
write all columns while retrying the operation. This is required by
the definition of REPLACE and also ensures that fields that are only
part of UPDATE are properly handled. This fixed a bug in NDB and
REPLACE where REPLACE wrongly copied some column values from the replaced
row.
- For table handlers that don't support NULL in keys, we would give an error
when creating a primary key with NULL fields, even after the fields had been
automatically converted to NOT NULL.
- Creating a primary key on a SPATIAL key would fail if the field was not
declared as NOT NULL.
Cleanups:
- Removed unused condition argument to setup_tables()
- Removed unneeded item function reset_query_id_processor().
- Field->add_index is removed. Now this is instead maintained in
(field->flags & FIELD_IN_ADD_INDEX)
- Field->fieldnr is removed (use field->field_index instead)
- New argument to filesort() to indicate that it should return a set of
row pointers (not used columns). This allowed me to remove some references
to sql_command in filesort and should also enable us to return column
results in some cases where we couldn't before.
- Changed column bitmap handling in opt_range.cc to be aligned with TABLE
bitmap, which allowed me to use bitmap functions instead of looping over
all fields to create some needed bitmaps. (Faster and smaller code)
- Broke up overly long lines
- Moved some variable declarations to the start of functions for better code
readability.
- Removed some unused arguments from functions.
(setup_fields(), mysql_prepare_insert_check_table())
- setup_fields() now takes an enum instead of an int for marking columns
usage.
- For internal temporary tables, use handler::write_row(),
handler::delete_row() and handler::update_row() instead of
handler::ha_xxxx() for faster execution.
- Changed some constants to enums and defines.
- Using separate column read and write sets allows for easier checking
of whether the timestamp field was set by the statement.
- Remove calls to free_io_cache() as this is now done automatically in ha_reset()
- Don't build table->normalized_path as this is now identical to table->path
(after bar's fixes to convert filenames)
- Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to
do comparison with the 'convert-dbug-for-diff' tool.
Things left to do in 5.1:
- We wrongly log failed CREATE TABLE ... SELECT in some cases when using
row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result)
Mats has promised to look into this.
- Test that my fix for CREATE TABLE ... SELECT is indeed correct.
(I added several test cases for this, but in this case it's better that
someone else also tests this thoroughly).
Lars has promised to do this.
2006-06-04 17:52:22 +02:00
|
|
|
MY_BITMAP all_set;
|
2006-08-21 17:02:11 +02:00
|
|
|
/*
|
|
|
|
Key which is used for looking-up table in table cache and in the list
|
|
|
|
of thread's temporary tables. Has the form of:
|
|
|
|
"database_name\0table_name\0" + optional part for temporary tables.
|
|
|
|
|
|
|
|
Note that all three 'table_cache_key', 'db' and 'table_name' members
|
|
|
|
must be set (and be non-zero) for tables in table cache. They also
|
|
|
|
should correspond to each other.
|
|
|
|
To ensure this one can use set_table_cache_key() methods.
|
|
|
|
*/
|
2005-11-23 21:45:02 +01:00
|
|
|
LEX_STRING table_cache_key;
|
|
|
|
LEX_STRING db; /* Pointer to db */
|
|
|
|
LEX_STRING table_name; /* Table name (for open) */
|
|
|
|
LEX_STRING path; /* Path to .frm file (from datadir) */
|
|
|
|
LEX_STRING normalized_path; /* unpack_filename(path) */
|
2005-09-13 03:02:17 +02:00
|
|
|
LEX_STRING connect_string;
|
2007-01-29 15:07:11 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
Set of keys in use, implemented as a Bitmap.
|
|
|
|
Excludes keys disabled by ALTER TABLE ... DISABLE KEYS.
|
|
|
|
*/
|
|
|
|
key_map keys_in_use;
|
2005-01-06 12:00:13 +01:00
|
|
|
key_map keys_for_keyread;
|
2005-11-23 21:45:02 +01:00
|
|
|
ha_rows min_rows, max_rows; /* create information */
|
2005-01-06 12:00:13 +01:00
|
|
|
ulong avg_row_length; /* create information */
|
|
|
|
ulong raid_chunksize;
|
2007-02-28 22:25:50 +01:00
|
|
|
ulong version, mysql_version;
|
2005-01-06 12:00:13 +01:00
|
|
|
ulong timestamp_offset; /* Set to offset+1 of record */
|
|
|
|
ulong reclength; /* Recordlength */
|
|
|
|
|
2007-03-02 17:43:45 +01:00
|
|
|
plugin_ref db_plugin; /* storage engine plugin */
|
|
|
|
inline handlerton *db_type() const /* table_type for handler */
|
|
|
|
{
|
|
|
|
// DBUG_ASSERT(db_plugin);
|
|
|
|
return db_plugin ? plugin_data(db_plugin, handlerton*) : NULL;
|
|
|
|
}
|
2000-07-31 21:29:14 +02:00
|
|
|
enum row_type row_type; /* How rows are stored */
|
2005-01-06 12:00:13 +01:00
|
|
|
enum tmp_table_type tmp_table;
|
2007-08-13 15:11:25 +02:00
|
|
|
enum ha_choice transactional;
|
2005-01-06 12:00:13 +01:00
|
|
|
|
2005-11-23 21:45:02 +01:00
|
|
|
uint ref_count; /* How many TABLE objects use this */
|
|
|
|
uint open_count; /* Number of tables in open list */
|
2005-01-06 12:00:13 +01:00
|
|
|
uint blob_ptr_size; /* 4 or 8 */
|
2006-05-03 14:59:17 +02:00
|
|
|
uint key_block_size; /* create key_block_size, if used */
|
2005-10-04 17:04:20 +02:00
|
|
|
uint null_bytes, last_null_bit_pos;
|
2005-01-06 12:00:13 +01:00
|
|
|
uint fields; /* Number of fields */
|
|
|
|
uint rec_buff_length; /* Size of table->record[] buffer */
|
|
|
|
uint keys, key_parts;
|
|
|
|
uint max_key_length, max_unique_length, total_key_length;
|
|
|
|
uint uniques; /* Number of UNIQUE indexes */
|
|
|
|
uint null_fields; /* number of null fields */
|
|
|
|
uint blob_fields; /* number of blob fields */
|
2005-11-23 21:45:02 +01:00
|
|
|
uint timestamp_field_offset; /* Field number for timestamp field */
|
2005-01-12 02:38:53 +01:00
|
|
|
uint varchar_fields; /* number of varchar fields */
|
2000-07-31 21:29:14 +02:00
|
|
|
uint db_create_options; /* Create options from database */
|
|
|
|
uint db_options_in_use; /* Options in use */
|
|
|
|
uint db_record_offset; /* if HA_REC_IN_SEQ */
|
2005-01-06 12:00:13 +01:00
|
|
|
uint raid_type, raid_chunks;
|
2005-11-23 21:45:02 +01:00
|
|
|
uint rowid_field_offset; /* Field_nr +1 to rowid field */
|
2005-01-06 12:00:13 +01:00
|
|
|
/* Index of auto-updated TIMESTAMP field in field array */
|
|
|
|
uint primary_key;
|
2007-03-17 00:13:25 +01:00
|
|
|
uint next_number_index; /* autoincrement key number */
|
|
|
|
uint next_number_key_offset; /* autoinc keypart offset in a key */
|
|
|
|
uint next_number_keypart; /* autoinc keypart number in a key */
|
2005-11-23 21:45:02 +01:00
|
|
|
uint error, open_errno, errarg; /* error from open_table_def() */
|
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that were found necessary while testing the handler changes.
Changes that require code changes in other storage engines.
(Note that all changes are very straightforward and one should find all issues
by compiling a --debug build and fixing all compiler errors and all
asserts in field.cc while running the test suite.)
- New optional handler function introduced: reset()
This is called after every DML statement to make it easy for a handler to
do statement-specific cleanups.
(The only case it's not called is if we force the file to be closed)
- handler::extra(HA_EXTRA_RESET) is removed. Code that was there before
should be moved to handler::reset()
- table->read_set contains a bitmap over all columns that are needed
in the query. read_row() and similar functions only need to read these
columns.
- table->write_set contains a bitmap over all columns that will be updated
in the query. write_row() and update_row() only need to update these
columns.
The above bitmaps should now be up to date in all contexts
(including ALTER TABLE, filesort()).
The handler is informed of any changes to the bitmap after
fix_fields() by calling the virtual function
handler::column_bitmaps_signal(). If the handler does caching of
these bitmaps (instead of using table->read_set, table->write_set),
it should redo the caching in this function. As the signal() may be sent
several times, it's probably best to set a variable in the signal
and redo the caching on read_row() / write_row() if the variable was
set.
- Removed the read_set and write_set bitmap objects from the handler class
- Removed all column bit handling functions from the handler class.
(Now one instead uses the normal bitmap functions in my_bitmap.c instead
of handler dedicated bitmap functions)
- field->query_id is removed. One should instead check
table->read_set and table->write_set if a field is used in the query.
- handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and
handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now
instead use table->read_set to check for which columns to retrieve.
- If a handler needs to call Field->val() or Field->store() on columns
that are not used in the query, one should install a temporary
all-columns-used map while doing so. For this, we provide the following
functions:
my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set);
field->val();
dbug_tmp_restore_column_map(table->read_set, old_map);
and similar for the write map:
my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set);
field->val();
dbug_tmp_restore_column_map(table->write_set, old_map);
If this is not done, you will sooner or later hit a DBUG_ASSERT
in the field store() / val() functions.
(For non-DBUG binaries, dbug_tmp_use_all_columns() and
dbug_tmp_restore_column_map() are inline dummy functions and should
be optimized away by the compiler).
- If one needs to temporarily set the column map for all binaries (and not
just to avoid the DBUG_ASSERT() in the Field::store() / Field::val()
methods) one should use the functions tmp_use_all_columns() and
tmp_restore_column_map() instead of the above dbug_ variants.
- All 'status' fields in the handler base class (like records,
data_file_length etc) are now stored in a 'stats' struct. This makes
it easier to know what status variables are provided by the base
handler. This requires some trivial variable name changes in the extra()
function.
- New virtual function handler::records(). This is called to optimize
COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS) is true.
(stats.records is not supposed to be an exact value. It only has to
be 'reasonable enough' for the optimizer to be able to choose a good
optimization path).
- Non virtual handler::init() function added for caching of virtual
constants from engine.
- Removed has_transactions() virtual method. Now one should instead return
HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support
transactions.
- The 'xxxx_create_handler()' function now has a MEM_ROOT *mem_root argument
that is to be used with 'new handler_name()' to allocate the handler
in the right area. The xxxx_create_handler() function is also
responsible for any initialization of the object before returning.
For example, one should change:
static handler *myisam_create_handler(TABLE_SHARE *table)
{
return new ha_myisam(table);
}
->
static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root)
{
return new (mem_root) ha_myisam(table);
}
- New optional virtual function: use_hidden_primary_key().
This is called in case of an update/delete when
(table_flags() & HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is set
but we don't have a primary key. This allows the handler to take precautions
in remembering any hidden primary key to be able to update/delete any
found row. The default handler marks all columns to be read.
- handler::table_flags() now returns a ulonglong (to allow for more flags).
- New/changed table_flags()
- HA_HAS_RECORDS Set if ::records() is supported
- HA_NO_TRANSACTIONS Set if engine doesn't support transactions
- HA_PRIMARY_KEY_REQUIRED_FOR_DELETE
Set if we should mark all primary key columns for
read when reading rows as part of a DELETE
statement. If there is no primary key,
all columns are marked for read.
- HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some
cases (based on table->read_set)
- HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS
Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION.
- HA_DUPP_POS Renamed to HA_DUPLICATE_POS
- HA_REQUIRES_KEY_COLUMNS_FOR_DELETE
Set this if we should mark ALL key columns for
read when reading rows as part of a DELETE
statement. In case of an update we will mark
all keys for read for which key part changed
value.
- HA_STATS_RECORDS_IS_EXACT
Set this if stats.records is exact.
(This saves us some extra records() calls
when optimizing COUNT(*))
- Removed table_flags()
- HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if
handler::records() gives an exact count() and
HA_STATS_RECORDS_IS_EXACT if stats.records is exact.
- HA_READ_RND_SAME Removed (no one supported this one)
- Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk()
- Renamed handler::dupp_pos to handler::dup_pos
- Removed not used variable handler::sortkey
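The temporary all-columns-used map described in the list above can be sketched as a save/restore pair. This is a minimal stand-alone model and not the server's code: `ColumnMap`, `MiniTable`, `use_all_columns_tmp()` and `restore_column_map()` are hypothetical stand-ins for MY_BITMAP, TABLE, tmp_use_all_columns() and tmp_restore_column_map().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for MY_BITMAP: one flag per column.
using ColumnMap = std::vector<bool>;

// Hypothetical stand-in for TABLE. The active read set is a pointer, so it
// can be redirected to a map that has every bit set.
struct MiniTable {
  ColumnMap def_read_set;     // columns the statement actually reads
  ColumnMap all_set;          // every column marked
  const ColumnMap *read_set;  // currently installed map

  explicit MiniTable(std::size_t columns)
      : def_read_set(columns, false), all_set(columns, true),
        read_set(&def_read_set) {}
  // A copy would keep pointing into the source object, so forbid copying.
  MiniTable(const MiniTable &) = delete;
  MiniTable &operator=(const MiniTable &) = delete;
};

// Models the check guarding Field::val(): the column's bit must be set.
inline bool can_read(const MiniTable &t, std::size_t col) {
  return (*t.read_set)[col];
}

// Install the all-columns map and hand back the previously installed one.
inline const ColumnMap *use_all_columns_tmp(MiniTable &t) {
  const ColumnMap *old_map = t.read_set;
  t.read_set = &t.all_set;
  return old_map;
}

// Put the saved map back once the out-of-query access is done.
inline void restore_column_map(MiniTable &t, const ColumnMap *old_map) {
  t.read_set = old_map;
}
```

A caller would save the pointer returned by `use_all_columns_tmp()`, read the otherwise unmarked column, and then pass the saved pointer to `restore_column_map()`. The sketch only covers the read map; a write map would follow the same pattern.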
Upper level handler changes:
- ha_reset() now does some overall checks and calls ::reset()
- ha_table_flags() added. This is a cached version of table_flags(). The
cache is updated on engine creation time and updated on open.
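The idea of a cached, non-virtual ha_table_flags() in front of a virtual table_flags() can be illustrated roughly as follows. This is only a sketch under assumptions: `MiniHandler`, `MiniEngine`, the `init()` refresh point and the flag value `HA_NO_TRANSACTIONS_BIT` are hypothetical and do not reproduce the real handler class.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag value, chosen only for this illustration.
constexpr std::uint64_t HA_NO_TRANSACTIONS_BIT = 1ULL << 0;

// Hypothetical model of a handler base class that caches engine flags.
class MiniHandler {
 public:
  virtual ~MiniHandler() = default;

  // Non-virtual: ask the engine once and remember the answer.
  void init() { cached_table_flags_ = table_flags(); }

  // Non-virtual accessor used by the upper layer; no virtual call here.
  std::uint64_t ha_table_flags() const { return cached_table_flags_; }

 protected:
  // Implemented by each engine; potentially expensive to call repeatedly.
  virtual std::uint64_t table_flags() const = 0;

 private:
  std::uint64_t cached_table_flags_ = 0;
};

// Hypothetical engine that counts how often the virtual method is hit.
class MiniEngine : public MiniHandler {
 public:
  mutable int virtual_calls = 0;

 protected:
  std::uint64_t table_flags() const override {
    ++virtual_calls;
    return HA_NO_TRANSACTIONS_BIT;
  }
};
```

After one `init()` call, any number of `ha_table_flags()` calls leaves `virtual_calls` at 1, which is the point of the cache.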
MySQL level changes (not obvious from the above):
- DBUG_ASSERT() added to check that column usage matches what is set
in the column usage bit maps. (This found a LOT of bugs in current
column marking code).
- In 5.1 before, all used columns were marked in read_set and only updated
columns were marked in write_set. Now we only mark columns for which we
need a value in read_set.
- Column bitmaps are created in open_binary_frm() and open_table_from_share().
(Before this was in table.cc)
- handler::table_flags() calls are replaced with handler::ha_table_flags()
- For calling field->val() you must have the corresponding bit set in
table->read_set. For calling field->store() you must have the
corresponding bit set in table->write_set. (There are asserts in
all store()/val() functions to catch wrong usage)
- thd->set_query_id is renamed to thd->mark_used_columns and instead
of setting this to an integer value, this has now the values:
MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE
Changed also all variables named 'set_query_id' to mark_used_columns.
- In filesort() we now inform the handler of exactly which columns are needed
doing the sort and choosing the rows.
- The TABLE_SHARE object has an 'all_set' column bitmap one can use
when one needs a column bitmap with all columns set.
(This is used for table->use_all_columns() and other places)
- The TABLE object has 3 column bitmaps:
- def_read_set Default bitmap for columns to be read
- def_write_set Default bitmap for columns to be written
- tmp_set Can be used as a temporary bitmap when needed.
The TABLE object also has two pointers to bitmaps, read_set and write_set,
that the handler should use to find out which columns are used in which way.
- count() optimization now calls handler::records() instead of using
handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true).
- Added extra argument to Item::walk() to indicate if we should also
traverse sub queries.
- Added TABLE parameter to cp_buffer_from_ref()
- Don't close tables created with CREATE ... SELECT but keep them in
the table cache. (Faster usage of newly created tables).
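How the three marking modes feed the read and write sets, and how store()/val() could check them, might look like the following sketch. The enumerator names come from the text above; the enum type name `MarkColumns`, the `ColumnSets` struct and the helper functions are hypothetical simplifications.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Enumerator names as listed above; the type name is an assumption.
enum MarkColumns { MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE };

// Hypothetical pair of per-table column maps.
struct ColumnSets {
  std::vector<bool> read_set;
  std::vector<bool> write_set;
  explicit ColumnSets(std::size_t columns)
      : read_set(columns, false), write_set(columns, false) {}
};

// Mark one column according to the mode in effect while resolving it.
inline void mark_column(ColumnSets &sets, std::size_t col, MarkColumns mode) {
  switch (mode) {
    case MARK_COLUMNS_READ:  sets.read_set[col] = true;  break;
    case MARK_COLUMNS_WRITE: sets.write_set[col] = true; break;
    case MARK_COLUMNS_NONE:  break;  // resolve the name only, mark nothing
  }
}

// Conditions that the asserts in val() and store() would verify.
inline bool may_call_val(const ColumnSets &sets, std::size_t col) {
  return sets.read_set[col];
}
inline bool may_call_store(const ColumnSets &sets, std::size_t col) {
  return sets.write_set[col];
}
```

For a statement like `UPDATE t SET b = a + 1`, column `a` would be marked with MARK_COLUMNS_READ and column `b` with MARK_COLUMNS_WRITE, so reading `b` or storing into `a` would trip the corresponding check.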
New interfaces:
- table->clear_column_bitmaps() to initialize the bitmaps for tables
at start of new statements.
- table->column_bitmaps_set() to set up new column bitmaps and signal
the handler about this.
- table->column_bitmaps_set_no_signal() for the few cases where we need
to set up new column bitmaps but don't signal the handler (as the handler
has already been signaled about these before). Used for the moment
only in opt_range.cc when doing ROR scans.
- table->use_all_columns() to install a bitmap where all columns are marked
as used in the read and the write set.
- table->default_column_bitmaps() to install the normal read and write
column bitmaps, but not signaling the handler about this.
This is mainly used when creating TABLE instances.
- table->mark_columns_needed_for_delete(),
table->mark_columns_needed_for_update() and
table->mark_columns_needed_for_insert() to allow us to put additional
columns in column usage maps if handler so requires.
(The handler indicates what it needs in handler->table_flags())
- table->prepare_for_position() to allow us to tell handler that it
needs to read primary key parts to be able to store them in
future table->position() calls.
(This replaces the table->file->ha_retrieve_all_pk function)
- table->mark_auto_increment_column() to tell the handler that we are going
to update columns that are part of any auto_increment key.
- table->mark_columns_used_by_index() to mark all columns that are part of
an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow
it to quickly know that it only needs to read columns that are part
of the key. (The handler can also use the column map for detecting this,
but simpler/faster handler can just monitor the extra() call).
- table->mark_columns_used_by_index_no_reset() to in addition to other columns,
also mark all columns that are used by the given key.
- table->restore_column_maps_after_mark_index() to restore to default
column maps after a call to table->mark_columns_used_by_index().
- New item function register_field_in_read_map(), for marking used columns
in table->read_set. Used by filesort() to mark all used columns
- Maintain in TABLE->merge_keys set of all keys that are used in query.
(Simplifies some optimization loops)
- Maintain Field->part_of_key_not_clustered which is like Field->part_of_key
but the field in the clustered key is not assumed to be part of all indexes.
(used in opt_range.cc for faster loops)
- dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map()
tmp_use_all_columns() and tmp_restore_column_map() functions to temporarily
mark all columns as usable. The 'dbug_' version is primarily intended
inside a handler when it wants to just call Field::store() & Field::val()
functions, but don't need the column maps set for any other usage.
(i.e. bitmap_is_set() is never called)
- We can't use compare_records() to skip updates for handlers that return
a partial column set when the read_set doesn't cover all columns in the
write set. The reason for this is that if we have a column marked only for
write, we can't know at the MySQL level whether the value changed or not.
The reason this worked before was that MySQL marked all to-be-written
columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden
bug'.
- open_table_from_share() no longer sets up a temporary MEM_ROOT
object as a thread-specific variable for the handler. Instead we
send the to-be-used MEM_ROOT to get_new_handler().
(Simpler, faster code)
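The compare_records() restriction in the list above boils down to a subset test on the two column maps. The following is a small sketch of that condition; `ColumnMap`, `is_subset()` and `can_skip_unchanged_update()` are hypothetical names, and the boolean parameter stands in for a check of the HA_PARTIAL_COLUMN_READ flag.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for MY_BITMAP: one flag per column.
using ColumnMap = std::vector<bool>;

// True when every column marked in `subset` is also marked in `superset`.
inline bool is_subset(const ColumnMap &subset, const ColumnMap &superset) {
  assert(subset.size() == superset.size());
  for (std::size_t i = 0; i < subset.size(); ++i)
    if (subset[i] && !superset[i]) return false;
  return true;
}

// Comparing the old and new record is only meaningful if the old value of
// every written column was actually read. An engine that always returns
// full rows satisfies this trivially; a partial-read engine does so only
// when write_set is covered by read_set.
inline bool can_skip_unchanged_update(bool engine_reads_partial_columns,
                                      const ColumnMap &read_set,
                                      const ColumnMap &write_set) {
  return !engine_reads_partial_columns || is_subset(write_set, read_set);
}
```

With a column marked only for write, a partial-read engine never fetched its old value, so the function returns false and the update cannot be skipped on the basis of a record comparison.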
Bugs fixed:
- Column marking was not done correctly in a lot of cases.
(ALTER TABLE, when using triggers, auto_increment fields etc)
(Could potentially result in wrong values inserted in table handlers
relying on the old column maps or field->set_query_id being correct)
Especially when it comes to triggers, there may be cases where the
old code would cause lost/wrong values for NDB and/or InnoDB tables.
- Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE into two flags:
OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG.
This allowed me to remove some wrong warnings about:
"Some non-transactional changed tables couldn't be rolled back"
- Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset
(thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to lose
some warnings about
"Some non-transactional changed tables couldn't be rolled back"
- Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table()
which could cause delete_table to report random failures.
- Fixed core dumps for some tests when running with --debug
- Added missing FN_LIBCHAR in mysql_rm_tmp_tables()
(This has probably caused us to not properly remove temporary files after
crash)
- slow_logs was not properly initialized, which could cause
extra/lost entries in slow log.
- If we get a duplicate row on insert, change column map to read and
write all columns while retrying the operation. This is required by
the definition of REPLACE and also ensures that fields that are only
part of UPDATE are properly handled. This fixed a bug in NDB and
REPLACE where REPLACE wrongly copied some column values from the replaced
row.
- For table handlers that don't support NULL in keys, we would give an error
when creating a primary key with NULL fields, even after the fields had been
automatically converted to NOT NULL.
- Creating a primary key on a SPATIAL key would fail if the field was not
declared as NOT NULL.
Cleanups:
- Removed unused condition argument to setup_tables()
- Removed unneeded item function reset_query_id_processor().
- Field->add_index is removed. Now this is instead maintained in
(field->flags & FIELD_IN_ADD_INDEX)
- Field->fieldnr is removed (use field->field_index instead)
- New argument to filesort() to indicate that it should return a set of
row pointers (not used columns). This allowed me to remove some references
to sql_command in filesort and should also enable us to return column
results in some cases where we couldn't before.
- Changed column bitmap handling in opt_range.cc to be aligned with TABLE
bitmap, which allowed me to use bitmap functions instead of looping over
all fields to create some needed bitmaps. (Faster and smaller code)
- Broke up overly long lines
- Moved some variable declarations to the start of functions for better code
readability.
- Removed some unused arguments from functions.
(setup_fields(), mysql_prepare_insert_check_table())
- setup_fields() now takes an enum instead of an int for marking columns
usage.
- For internal temporary tables, use handler::write_row(),
handler::delete_row() and handler::update_row() instead of
handler::ha_xxxx() for faster execution.
- Changed some constants to enums and defines.
- Using separate column read and write sets allows for easier checking
of whether the timestamp field was set by the statement.
- Remove calls to free_io_cache() as this is now done automatically in ha_reset()
- Don't build table->normalized_path as this is now identical to table->path
(after bar's fixes to convert filenames)
- Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to
do comparison with the 'convert-dbug-for-diff' tool.
Things left to do in 5.1:
- We wrongly log failed CREATE TABLE ... SELECT in some cases when using
row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result)
Mats has promised to look into this.
- Test that my fix for CREATE TABLE ... SELECT is indeed correct.
(I added several test cases for this, but in this case it's better that
someone else also tests this thoroughly).
Lars has promised to do this.
2006-06-04 17:52:22 +02:00
|
|
|
uint column_bitmap_size;
|
2005-11-23 21:45:02 +01:00
|
|
|
uchar frm_version;
|
|
|
|
bool null_field_first;
|
|
|
|
bool system; /* Set if system table (one record) */
|
|
|
|
bool crypted; /* If .frm file is crypted */
|
|
|
|
bool db_low_byte_first; /* Portable row format */
|
|
|
|
bool crashed;
|
|
|
|
bool is_view;
|
|
|
|
bool name_lock, replace_with_name_lock;
|
|
|
|
bool waiting_on_cond; /* Protection against free */
|
2005-12-22 06:39:02 +01:00
|
|
|
ulong table_map_id; /* for row-based replication */
|
|
|
|
ulonglong table_map_version;
|
2006-03-17 18:11:07 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
Cache for row-based replication table share checks that do not
|
|
|
|
need to be repeated. Possible values are: -1 when cache value is
|
|
|
|
not calculated yet, 0 when table *shall not* be replicated, 1 when
|
|
|
|
table *may* be replicated.
|
|
|
|
*/
|
|
|
|
int cached_row_logging_check;
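The three-valued cache described in the comment above (-1 not calculated, 0 shall not, 1 may) can be sketched as a lazily filled field. This is an illustrative model only: `MiniShare` and `row_logging_allowed()` are hypothetical, and the real check that decides replicability is replaced by a caller-supplied callable.

```cpp
#include <cassert>

// Hypothetical stand-in for the share; only the cache field is modelled.
struct MiniShare {
  int cached_row_logging_check = -1;  // -1: not calculated yet
};

// Run the (assumed expensive) check at most once per share and reuse the
// stored answer afterwards. `check` is any callable returning bool.
template <typename Check>
bool row_logging_allowed(MiniShare &share, Check check) {
  if (share.cached_row_logging_check == -1)
    share.cached_row_logging_check = check() ? 1 : 0;
  return share.cached_row_logging_check == 1;
}
```

Calling `row_logging_allowed()` twice on the same share invokes the callable only the first time; the second call is answered from `cached_row_logging_check`.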
|
|
|
|
|
2005-11-23 21:45:02 +01:00
|
|
|
#ifdef WITH_PARTITION_STORAGE_ENGINE
|
2006-05-10 18:53:40 +02:00
|
|
|
bool auto_partitioned;
|
2007-03-27 19:09:56 +02:00
|
|
|
const char *partition_info;
|
2005-11-23 21:45:02 +01:00
|
|
|
uint partition_info_len;
|
2007-06-01 16:44:09 +02:00
|
|
|
uint partition_info_buffer_size;
|
2007-03-27 19:09:56 +02:00
|
|
|
const char *part_state;
|
2006-01-17 08:40:00 +01:00
|
|
|
uint part_state_len;
|
2005-12-21 19:18:40 +01:00
|
|
|
handlerton *default_part_db_type;
|
2005-11-23 21:45:02 +01:00
|
|
|
#endif
|
2006-08-21 17:02:11 +02:00
|
|
|
|
|
|
|
|
|
|
|
/*
|
|
|
|
Set share's table cache key and update its db and table name appropriately.
|
|
|
|
|
|
|
|
SYNOPSIS
|
|
|
|
set_table_cache_key()
|
|
|
|
key_buff Buffer with already built table cache key to be
|
|
|
|
referenced from share.
|
|
|
|
key_length Key length.
|
|
|
|
|
|
|
|
NOTES
|
|
|
|
Since 'key_buff' buffer will be referenced from share it should have the same
|
|
|
|
life-time as share itself.
|
|
|
|
This method automatically ensures that TABLE_SHARE::table_name/db have
|
|
|
|
appropriate values by using table cache key as their source.
|
|
|
|
*/
|
|
|
|
|
|
|
|
void set_table_cache_key(char *key_buff, uint key_length)
|
|
|
|
{
|
|
|
|
table_cache_key.str= key_buff;
|
|
|
|
table_cache_key.length= key_length;
|
|
|
|
/*
|
|
|
|
Let us use the fact that the key is "db\0table_name\0" + optional
|
|
|
|
part for temporary tables.
|
|
|
|
*/
|
|
|
|
db.str= table_cache_key.str;
|
|
|
|
db.length= strlen(db.str);
|
|
|
|
table_name.str= db.str + db.length + 1;
|
|
|
|
table_name.length= strlen(table_name.str);
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
/*
|
|
|
|
Set share's table cache key and update its db and table name appropriately.
|
|
|
|
|
|
|
|
SYNOPSIS
|
|
|
|
set_table_cache_key()
|
|
|
|
key_buff Buffer to be used as storage for table cache key
|
|
|
|
(should be at least key_length bytes).
|
|
|
|
key Value for table cache key.
|
|
|
|
key_length Key length.
|
|
|
|
|
|
|
|
NOTE
|
|
|
|
Since 'key_buff' buffer will be used as storage for table cache key
|
|
|
|
it should have the same life-time as the share itself.
|
|
|
|
*/
|
|
|
|
|
|
|
|
void set_table_cache_key(char *key_buff, const char *key, uint key_length)
|
|
|
|
{
|
|
|
|
memcpy(key_buff, key, key_length);
|
|
|
|
set_table_cache_key(key_buff, key_length);
|
|
|
|
}
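The key layout that both set_table_cache_key() overloads rely on can be exercised in isolation. The sketch below builds a "db\0table_name\0" key and splits it with the same strlen() arithmetic; `KeyParts`, `make_cache_key()` and `split_cache_key()` are hypothetical helpers, and the optional suffix for temporary tables is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical view of the two names stored inside one key buffer.
struct KeyParts {
  const char *db;
  std::size_t db_length;
  const char *table_name;
  std::size_t table_name_length;
};

// Build "db\0table_name\0" (no temporary-table suffix in this sketch).
inline std::string make_cache_key(const std::string &db,
                                  const std::string &table) {
  std::string key;
  key.append(db);
  key.push_back('\0');
  key.append(table);
  key.push_back('\0');
  return key;
}

// Split the key in place: both pointers refer into the key buffer, so the
// buffer must outlive the returned KeyParts.
inline KeyParts split_cache_key(const char *key) {
  KeyParts parts;
  parts.db = key;
  parts.db_length = std::strlen(parts.db);
  parts.table_name = parts.db + parts.db_length + 1;  // skip the first '\0'
  parts.table_name_length = std::strlen(parts.table_name);
  return parts;
}
```

Because the parts point into the key instead of owning copies, the lifetime note in the comments above applies to this sketch as well: the key buffer has to live as long as anything that uses the split names.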
|
|
|
|
|
WL#3984 (Revise locking of mysql.general_log and mysql.slow_log)
Bug#25422 (Hang with log tables)
Bug 17876 (Truncating mysql.slow_log in a SP after using cursor locks the
thread)
Bug 23044 (Warnings on flush of a log table)
Bug 29129 (Resetting general_log while the GLOBAL READ LOCK is set causes
a deadlock)
Prior to this fix, the server would hang when performing concurrent
ALTER TABLE or TRUNCATE TABLE statements against the LOG TABLES,
which are mysql.general_log and mysql.slow_log.
The root cause traces to the following code:
in sql_base.cc, open_table()
if (table->in_use != thd)
{
/* wait_for_condition will unlock LOCK_open for us */
wait_for_condition(thd, &LOCK_open, &COND_refresh);
}
The problem with this code is that the current implementation of the
LOGGER creates 'fake' THD objects, like
- Log_to_csv_event_handler::general_log_thd
- Log_to_csv_event_handler::slow_log_thd
which are not associated with a real thread running in the server,
so that waiting for these non-existing threads to release table locks
causes the deadlock.
In general, the design of Log_to_csv_event_handler does not fit into the
general architecture of the server, so that the concept of general_log_thd
and slow_log_thd has to be abandoned:
- this implementation does not work with table locking
- it will not work with commands like SHOW PROCESSLIST
- having the log tables always opened does not integrate well with DDL
operations / FLUSH TABLES / SET GLOBAL READ_ONLY
With this patch, the fundamental design of the LOGGER has been changed to:
- always open and close a log table when writing a log
- completely remove the usage of fake THD objects
- clarify how locking of log tables is implemented in general.
See WL#3984 for details related to the new locking design.
Additional changes (misc bugs exposed and fixed):
1)
mysqldump would ignore some tables in dump_all_tables_in_db(),
but forget to ignore the same tables in dump_all_views_in_db().
2)
mysqldump would also issue an empty "LOCK TABLE" command when all the tables
to lock are to be ignored (numrows == 0), instead of not issuing the query.
3)
Internal error handlers could intercept errors but not warnings
(see sql_error.cc).
4)
Implementing a nested call to open tables, for the performance schema tables,
exposed an existing bug in remove_table_from_cache(), which would perform:
in_use->some_tables_deleted=1;
against another thread, without any consideration about thread locking.
This call inside remove_table_from_cache() was not required anyway,
since calling mysql_lock_abort() takes care of aborting -- cleanly -- threads
that might hold a lock on a table.
This line (in_use->some_tables_deleted=1) has been removed.
2007-07-27 08:31:06 +02:00
|
|
|
inline bool honor_global_locks()
|
|
|
|
{
|
|
|
|
return ((table_category == TABLE_CATEGORY_USER)
|
|
|
|
|| (table_category == TABLE_CATEGORY_SYSTEM));
|
|
|
|
}
|
|
|
|
|
|
|
|
inline bool require_write_privileges()
|
|
|
|
{
|
|
|
|
return (table_category == TABLE_CATEGORY_PERFORMANCE);
|
|
|
|
}
|
Bug#26379 - Combination of FLUSH TABLE and REPAIR TABLE
corrupts a MERGE table
Bug 26867 - LOCK TABLES + REPAIR + merge table result in
memory/cpu hogging
Bug 26377 - Deadlock with MERGE and FLUSH TABLE
Bug 25038 - Waiting TRUNCATE
Bug 25700 - merge base tables get corrupted by
optimize/analyze/repair table
Bug 30275 - Merge tables: flush tables or unlock tables
causes server to crash
Bug 19627 - temporary merge table locking
Bug 27660 - Falcon: merge table possible
Bug 30273 - merge tables: Can't lock file (errno: 155)
The problems were:
Bug 26379 - Combination of FLUSH TABLE and REPAIR TABLE
corrupts a MERGE table
1. A thread trying to lock a MERGE table performs busy waiting while
REPAIR TABLE or a similar table administration task is ongoing on
one or more of its MyISAM tables.
2. A thread trying to lock a MERGE table performs busy waiting until all
threads that did REPAIR TABLE or similar table administration tasks
on one or more of its MyISAM tables in LOCK TABLES segments do UNLOCK
TABLES. The difference against problem #1 is that the busy waiting
takes place *after* the administration task. It is terminated by
UNLOCK TABLES only.
3. Two FLUSH TABLES within a LOCK TABLES segment can invalidate the
lock. This does *not* require a MERGE table. The first FLUSH TABLES
can be replaced by any statement that requires other threads to
reopen the table. In 5.0 and 5.1 a single FLUSH TABLES can provoke
the problem.
Bug 26867 - LOCK TABLES + REPAIR + merge table result in
memory/cpu hogging
Trying DML on a MERGE table, which has a child locked and
repaired by another thread, made an infinite loop in the server.
Bug 26377 - Deadlock with MERGE and FLUSH TABLE
Locking a MERGE table and its children in parent-child order
and flushing the child deadlocked the server.
Bug 25038 - Waiting TRUNCATE
Truncating a MERGE child, while the MERGE table was in use,
let the truncate fail instead of waiting for the table to
become free.
Bug 25700 - merge base tables get corrupted by
optimize/analyze/repair table
Repairing a child of an open MERGE table corrupted the child.
It was necessary to FLUSH the child first.
Bug 30275 - Merge tables: flush tables or unlock tables
causes server to crash
Flushing and optimizing locked MERGE children crashed the server.
Bug 19627 - temporary merge table locking
Use of a temporary MERGE table with non-temporary children
could corrupt the children.
Temporary tables are never locked. So we do now prohibit
non-temporary children of a temporary MERGE table.
Bug 27660 - Falcon: merge table possible
It was possible to create a MERGE table with non-MyISAM children.
Bug 30273 - merge tables: Can't lock file (errno: 155)
This was a Windows-only bug. Table administration statements
sometimes failed with "Can't lock file (errno: 155)".
These bugs are fixed by a new implementation of MERGE table open.
When opening a MERGE table in open_tables() we do now add the
child tables to the list of tables to be opened by open_tables()
(the "query_list"). The children are not opened in the handler at
this stage.
After opening the parent, open_tables() opens each child from the
now extended query_list. When the last child is opened, we remove
the children from the query_list again and attach the children to
the parent. This behaves similarly to the old open. However, it does
not open the MyISAM tables directly, but grabs them from the already
open children.
When closing a MERGE table in close_thread_table() we detach the
children only. Closing of the children is done implicitly because
they are in thd->open_tables.
For more detail see the comment at the top of ha_myisammrg.cc.
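The open sequence described above can be illustrated with a small model. This is only a sketch under stated assumptions: SketchTable, find_table() and sketch_open_tables() are hypothetical names, the "catalog" vector stands in for the table cache, and none of this is the server's actual open_tables() code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in; this is not the server's TABLE or TABLE_LIST type.
struct SketchTable {
  std::string name;
  std::vector<std::string> child_names;  // non-empty only for a MERGE parent
  std::vector<SketchTable*> attached;    // filled when children are attached
  SketchTable *parent = nullptr;         // set in a MERGE child
};

// Looks a table up by name; returns nullptr when it is unknown.
inline SketchTable *find_table(std::vector<SketchTable> &catalog,
                               const std::string &name) {
  for (SketchTable &t : catalog)
    if (t.name == name) return &t;
  return nullptr;
}

// Opens every table named in query_list. When a MERGE parent is opened, its
// children are spliced into the list right behind it, opened from the
// extended list, then removed from the list again and attached to the parent.
inline std::vector<SketchTable*> sketch_open_tables(
    std::vector<std::string> &query_list, std::vector<SketchTable> &catalog) {
  std::vector<SketchTable*> open_tables;
  for (std::size_t i = 0; i < query_list.size(); ++i) {
    SketchTable *table = find_table(catalog, query_list[i]);
    assert(table != nullptr);
    open_tables.push_back(table);
    if (table->child_names.empty()) continue;  // ordinary table

    // MERGE parent: extend the list with the children.
    const std::size_t n = table->child_names.size();
    query_list.insert(query_list.begin() + i + 1,
                      table->child_names.begin(), table->child_names.end());
    table->attached.clear();
    // Open each child from the extended list and attach it to the parent.
    for (std::size_t c = 1; c <= n; ++c) {
      SketchTable *child = find_table(catalog, query_list[i + c]);
      assert(child != nullptr);
      open_tables.push_back(child);
      child->parent = table;
      table->attached.push_back(child);
    }
    // Last child opened: remove the children from the list again.
    query_list.erase(query_list.begin() + i + 1,
                     query_list.begin() + i + 1 + n);
  }
  return open_tables;
}
```

In this model the children end up in the returned open-table list like any other table, which is why closing the parent only needs to detach them.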
Changed from open_ltable() to open_and_lock_tables() in all places
that can be relevant for MERGE tables. The latter can handle tables
added to the list on the fly. Where open_ltable() was used in a loop
over a list of tables, the list must now be temporarily terminated
after every table before calling open_and_lock_tables().
table_list->required_type is set to FRMTYPE_TABLE to avoid open of
special tables. Handling of derived tables is suppressed.
These details are handled by the new function
open_n_lock_single_table(), which has nearly the same signature as
open_ltable() and can replace it in most cases.
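The idea of temporarily terminating the list can be sketched as follows. The types and function names here are hypothetical and much simpler than TABLE_LIST and open_n_lock_single_table(); the sketch only shows the cut-and-restore step, not locking or the required_type check.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical list element, far simpler than the server's TABLE_LIST.
struct SketchTableList {
  std::string name;
  SketchTableList *next_global = nullptr;
};

// Stand-in for a list-walking opener: it processes every element that is
// reachable from head.
inline void sketch_open_list(SketchTableList *head,
                             std::vector<std::string> &opened) {
  for (SketchTableList *t = head; t != nullptr; t = t->next_global)
    opened.push_back(t->name);
}

// Opens exactly one element by cutting the list behind it for the duration
// of the call, then restoring the original link.
inline void sketch_open_single(SketchTableList *element,
                               std::vector<std::string> &opened) {
  assert(element != nullptr);
  SketchTableList *saved_next = element->next_global;
  element->next_global = nullptr;      // temporarily terminate the list
  sketch_open_list(element, opened);
  element->next_global = saved_next;   // restore the caller's list
}
```

A caller that loops over a list can then call sketch_open_single() per element without the opener running ahead into the rest of the list.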
In reopen_tables() some of the tables opened by a thread can be
closed and reopened. When a MERGE child is affected, the parent
must be closed and reopened too. Closing of the parent is forced
before the first child is closed. Reopen happens in the order of
thd->open_tables. MERGE parents do not attach their children
automatically at open. This is done after all tables are reopened.
So all children are open when attaching them.
Special lock handling like mysql_lock_abort() or mysql_lock_remove()
needs to be suppressed for MERGE children or forwarded to the parent.
This depends on the situation. In loops over all open tables one
suppresses child lock handling. When a single table is touched,
forwarding is done.
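The two cases (suppress in loops, forward for a single table) might look like this in a toy model. SketchTable and both functions are hypothetical illustrations; they do not reproduce mysql_lock_abort() or mysql_lock_remove().

```cpp
#include <cassert>
#include <vector>

// Hypothetical open-table record.
struct SketchTable {
  SketchTable *parent;   // set in a MERGE child, nullptr otherwise
  bool lock_aborted;
};

// Single-table case: a request on a MERGE child is forwarded to its parent,
// because the parent is the table that holds the lock.
inline SketchTable *sketch_lock_owner(SketchTable *table) {
  assert(table != nullptr);
  return table->parent != nullptr ? table->parent : table;
}

inline void sketch_abort_lock(SketchTable *table) {
  sketch_lock_owner(table)->lock_aborted = true;
}

// Loop case: children are skipped, since their parent is visited by the
// same loop. Returns the number of tables that were actually handled.
inline int sketch_abort_all_locks(const std::vector<SketchTable*> &open_tables) {
  int handled = 0;
  for (SketchTable *t : open_tables) {
    if (t->parent != nullptr) continue;  // suppress child lock handling
    t->lock_aborted = true;
    ++handled;
  }
  return handled;
}
```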
Behavioral changes:
===================
This patch changes the behavior of temporary MERGE tables.
Temporary MERGE must have temporary children.
The old behavior was wrong. A temporary table is not locked. Hence
even non-temporary children were not locked. See
Bug 19627 - temporary merge table locking.
You cannot change the union list of a non-temporary MERGE table
when LOCK TABLES is in effect. The following does *not* work:
CREATE TABLE m1 ... ENGINE=MRG_MYISAM ...;
LOCK TABLES t1 WRITE, t2 WRITE, m1 WRITE;
ALTER TABLE m1 ... UNION=(t1,t2) ...;
However, you can do this with a temporary MERGE table.
You cannot create a MERGE table with CREATE ... SELECT, neither
as a temporary MERGE table, nor as a non-temporary MERGE table.
CREATE TABLE m1 ... ENGINE=MRG_MYISAM ... SELECT ...;
Gives error message: table is not BASE TABLE.
2007-11-15 20:25:43 +01:00
|
|
|
|
|
|
|
inline ulong get_table_def_version()
|
|
|
|
{
|
|
|
|
return table_map_id;
|
|
|
|
}
|
|
|
|
|
2005-01-06 12:00:13 +01:00
|
|
|
} TABLE_SHARE;
|
|
|
|
|
|
|
|
|
2007-05-11 19:51:03 +02:00
|
|
|
extern ulong refresh_version;
|
|
|
|
|
2005-01-06 12:00:13 +01:00
|
|
|
/* Information for one open table */
|
2007-03-05 18:08:41 +01:00
|
|
|
enum index_hint_type
|
|
|
|
{
|
|
|
|
INDEX_HINT_IGNORE,
|
|
|
|
INDEX_HINT_USE,
|
|
|
|
INDEX_HINT_FORCE
|
|
|
|
};
|
2005-01-06 12:00:13 +01:00
|
|
|
|
|
|
|
struct st_table {
|
2006-02-25 16:46:30 +01:00
|
|
|
st_table() {} /* Remove gcc warning */
|
|
|
|
|
2005-01-06 12:00:13 +01:00
|
|
|
TABLE_SHARE *s;
|
|
|
|
handler *file;
|
|
|
|
#ifdef NOT_YET
|
|
|
|
struct st_table *used_next, **used_prev; /* Link to used tables */
|
2005-11-23 21:45:02 +01:00
|
|
|
struct st_table *open_next, **open_prev; /* Link to open tables */
|
2007-05-11 19:51:03 +02:00
|
|
|
#endif
|
2005-01-06 12:00:13 +01:00
|
|
|
struct st_table *next, *prev;
|
|
|
|
|
2007-11-15 20:25:43 +01:00
|
|
|
/* For the below MERGE related members see top comment in ha_myisammrg.cc */
|
|
|
|
struct st_table *parent; /* Set in MERGE child. Ptr to parent */
|
|
|
|
TABLE_LIST *child_l; /* Set in MERGE parent. List of children */
|
|
|
|
TABLE_LIST **child_last_l; /* Set in MERGE parent. End of list */
|
|
|
|
|
2005-01-06 12:00:13 +01:00
|
|
|
THD *in_use; /* Which thread uses this */
|
|
|
|
Field **field; /* Pointer to fields */
|
|
|
|
|
WL#3817: Simplify string / memory area types and make things more consistent (first part)
The following type conversions were done:
- Changed byte to uchar
- Changed gptr to uchar*
- Changed my_string to char *
- Changed my_size_t to size_t
- Changed size_s to size_t
Removed declaration of byte, gptr, my_string, my_size_t and size_s.
The following function parameter changes were done:
- All string functions in mysys/strings were changed to use size_t
instead of uint for string lengths.
- All read()/write() functions changed to use size_t (including vio).
- All protocol functions changed to use size_t instead of uint
- Functions that used a pointer to a string length were changed to use size_t*
- Changed malloc(), free() and related functions from using gptr to use void *
as this requires fewer casts in the code and is more in line with how the
standard functions work.
- Added extra length argument to dirname_part() to return the length of the
created string.
- Changed (at least) the following functions to take uchar* as argument:
- db_dump()
- my_net_write()
- net_write_command()
- net_store_data()
- DBUG_DUMP()
- decimal2bin() & bin2decimal()
- Changed my_compress() and my_uncompress() to use size_t. Changed one
argument to my_uncompress() from a pointer to a value as we only return
one value (makes function easier to use).
- Changed type of 'pack_data' argument to packfrm() to avoid casts.
- Changed in readfrm() and writefrm(), ha_discover and handler::discover()
the type for argument 'frmdata' to uchar** to avoid casts.
- Changed most Field functions to use uchar* instead of char* (reduced a lot of
casts).
- Changed field->val_xxx(xxx, new_ptr) to take const pointers.
Other changes:
- Removed a lot of unneeded casts
- Added a few new casts required by other changes
- Added some casts to my_multi_malloc() arguments for safety (as string lengths
need to be uint, not size_t).
- Fixed all calls to hash-get-key functions to use size_t*. (Needed to be done
explicitly as this conflict was often hidden by casting the function to
hash_get_key).
- Changed some buffers and memory regions to uchar* to avoid casts.
- Changed some string lengths from uint to size_t.
- Changed field->ptr to be uchar* instead of char*. This allowed us to
get rid of a lot of casts.
- Some changes from true -> TRUE, false -> FALSE, unsigned char -> uchar
- Include zlib.h in some files as we needed declaration of crc32()
- Changed MY_FILE_ERROR to be (size_t) -1.
- Changed many variables to hold the result of my_read() / my_write() to be
size_t. This was needed to properly detect errors (which are
returned as (size_t) -1).
- Removed some very old VMS code
- Changed packfrm()/unpackfrm() to not depend on uint size
(portability fix)
- Removed Windows-specific code to restore cursor position as this
causes slowdown on Windows and we should not mix read() and pread()
calls anyway as this is not thread safe. Updated function comment to
reflect this. Changed function that depended on original behavior of
my_pwrite() to itself restore the cursor position (one such case).
- Added some missing checking of return value of malloc().
- Changed definition of MOD_PAD_CHAR_TO_FULL_LENGTH to avoid 'long' overflow.
- Changed type of table_def::m_size from my_size_t to ulong to reflect that
m_size is the number of elements in the array, not a string/memory
length.
- Moved THD::max_row_length() to table.cc (as it's not depending on THD).
Inlined max_row_length_blob() into this function.
- More function comments
- Fixed some compiler warnings when compiled without partitions.
- Removed setting of LEX_STRING() arguments in declaration (portability fix).
- Some trivial indentation/variable name changes.
- Some trivial code simplifications:
- Replaced some calls to alloc_root + memcpy to use
strmake_root()/strdup_root().
- Changed some calls from memdup() to strmake() (Safety fix)
- Simpler loops in client-simple.c
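The point about holding my_read() / my_write() results in size_t can be shown with a small sketch. The constant and functions below are hypothetical stand-ins that only mirror the "(size_t) -1" convention described above; they are not the mysys implementation.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: mirrors the "(size_t) -1" error convention from the text.
// The name is local to this sketch.
constexpr std::size_t SKETCH_FILE_ERROR = static_cast<std::size_t>(-1);

// Hypothetical reader: returns the number of bytes read, or
// SKETCH_FILE_ERROR on failure.
inline std::size_t sketch_read(bool fail, std::size_t available) {
  assert(available != SKETCH_FILE_ERROR);  // a real length never collides
  return fail ? SKETCH_FILE_ERROR : available;
}

// With the result kept in size_t, the comparison with the sentinel is exact.
inline bool sketch_read_failed(std::size_t result) {
  return result == SKETCH_FILE_ERROR;
}

// If the result was stored in an unsigned int first, the sentinel is
// truncated wherever unsigned int is narrower than size_t, and the error
// is then no longer detected.
inline bool sketch_read_failed_narrow(unsigned int result) {
  return result == SKETCH_FILE_ERROR;
}
```

On a platform where size_t is wider than unsigned int, sketch_read_failed_narrow() never reports the error, which is the kind of miss the type change is meant to avoid.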
2007-05-10 11:59:39 +02:00
|
|
|
uchar *record[2]; /* Pointer to records */
|
|
|
|
uchar *write_row_record; /* Used as optimisation in
|
2005-12-22 06:39:02 +01:00
|
|
|
THD::write_row */
|
2007-05-10 11:59:39 +02:00
|
|
|
uchar *insert_values; /* used by INSERT ... UPDATE */
|
2007-03-05 18:08:41 +01:00
|
|
|
/*
|
|
|
|
Map of keys that can be used to retrieve all data from this table
|
|
|
|
needed by the query without reading the row.
|
|
|
|
*/
|
|
|
|
key_map covering_keys;
|
|
|
|
key_map quick_keys, merge_keys;
|
2007-01-29 15:07:11 +01:00
|
|
|
/*
|
|
|
|
A set of keys that can be used in the query that references this
|
2007-01-30 18:07:41 +01:00
|
|
|
table.
|
2007-01-29 15:07:11 +01:00
|
|
|
|
|
|
|
All indexes disabled on the table's TABLE_SHARE (see TABLE::s) will be
|
|
|
|
subtracted from this set upon instantiation. Thus for any TABLE t it holds
|
|
|
|
that t.keys_in_use_for_query is a subset of t.s.keys_in_use. Generally we
|
|
|
|
must not introduce any new keys here (see setup_tables).
|
|
|
|
|
|
|
|
The set is implemented as a bitmap.
|
|
|
|
*/
|
|
|
|
key_map keys_in_use_for_query;
|
2007-03-05 18:08:41 +01:00
|
|
|
/* Map of keys that can be used to calculate GROUP BY without sorting */
|
|
|
|
key_map keys_in_use_for_group_by;
|
|
|
|
/* Map of keys that can be used to calculate ORDER BY without sorting */
|
|
|
|
key_map keys_in_use_for_order_by;
|
2005-01-06 12:00:13 +01:00
|
|
|
KEY *key_info; /* data of keys in database */
|
|
|
|
|
2005-11-23 21:45:02 +01:00
|
|
|
Field *next_number_field; /* Set if next_number is activated */
|
|
|
|
Field *found_next_number_field; /* Set on open */
|
2005-01-06 12:00:13 +01:00
|
|
|
Field_timestamp *timestamp_field;
|
|
|
|
|
|
|
|
/* Table's triggers, 0 if there are no of them */
|
|
|
|
Table_triggers_list *triggers;
|
2007-07-06 14:18:49 +02:00
|
|
|
TABLE_LIST *pos_in_table_list;/* Element referring to this table */
|
2005-01-06 12:00:13 +01:00
|
|
|
ORDER *group;
|
|
|
|
const char *alias; /* alias or table name */
|
|
|
|
uchar *null_flags;
|
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that were found necessary while testing the handler changes
Changes that require code changes in other storage engines.
(Note that all changes are very straightforward and one should find all issues
by compiling a --debug build and fixing all compiler errors and all
asserts in field.cc while running the test suite):
- New optional handler function introduced: reset()
This is called after every DML statement to make it easy for a handler to
do statement-specific cleanups.
(The only case where it's not called is if we force the file to be closed)
- handler::extra(HA_EXTRA_RESET) is removed. Code that was there before
should be moved to handler::reset()
- table->read_set contains a bitmap over all columns that are needed
in the query. read_row() and similar functions only need to read these
columns
- table->write_set contains a bitmap over all columns that will be updated
in the query. write_row() and update_row() only need to update these
columns.
The above bitmaps should now be up to date in all contexts
(including ALTER TABLE, filesort()).
The handler is informed of any changes to the bitmap after
fix_fields() by calling the virtual function
handler::column_bitmaps_signal(). If the handler does caching of
these bitmaps (instead of using table->read_set, table->write_set),
it should redo the caching in this code. As the signal() may be sent
several times, it's probably best to set a variable in the signal
and redo the caching on read_row() / write_row() if the variable was
set.
- Removed the read_set and write_set bitmap objects from the handler class
- Removed all column bit handling functions from the handler class.
(Now one instead uses the normal bitmap functions in my_bitmap.c instead
of handler dedicated bitmap functions)
- field->query_id is removed. One should instead check
table->read_set and table->write_set if a field is used in the query.
- handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and
handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now
instead use table->read_set to check for which columns to retrieve.
- If a handler needs to call Field->val() or Field->store() on columns
that are not used in the query, one should install a temporary
all-columns-used map while doing so. For this, we provide the following
functions:
my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set);
field->val();
dbug_tmp_restore_column_map(table->read_set, old_map);
and similar for the write map:
my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set);
field->val();
dbug_tmp_restore_column_map(table->write_set, old_map);
If this is not done, you will sooner or later hit a DBUG_ASSERT
in the field store() / val() functions.
(For non-DBUG binaries, dbug_tmp_use_all_columns() and
dbug_tmp_restore_column_map() are inline dummy functions and should
be optimized away by the compiler).
- If one needs to temporarily set the column map for all binaries (and not
just to avoid the DBUG_ASSERT() in the Field::store() / Field::val()
methods) one should use the functions tmp_use_all_columns() and
tmp_restore_column_map() instead of the above dbug_ variants.
- All 'status' fields in the handler base class (like records,
data_file_length etc) are now stored in a 'stats' struct. This makes
it easier to know what status variables are provided by the base
handler. This requires some trivial variable name changes in the extra()
function.
- New virtual function handler::records(). This is called to optimize
COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS) is true.
(stats.records is not supposed to be an exact value. It only has to
be 'reasonable enough' for the optimizer to be able to choose a good
optimization path).
- Non virtual handler::init() function added for caching of virtual
constants from engine.
- Removed has_transactions() virtual method. Now one should instead return
HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support
transactions.
- The 'xxxx_create_handler()' function now has a MEM_ROOT *mem_root argument
that is to be used with 'new handler_name()' to allocate the handler
in the right area. The xxxx_create_handler() function is also
responsible for any initialization of the object before returning.
For example, one should change:
static handler *myisam_create_handler(TABLE_SHARE *table)
{
return new ha_myisam(table);
}
->
static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root)
{
return new (mem_root) ha_myisam(table);
}
- New optional virtual function: use_hidden_primary_key().
This is called in case of an update/delete when
(table_flags() & HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is set
but we don't have a primary key. This allows the handler to take precautions
in remembering any hidden primary key to be able to update/delete any
found row. The default handler marks all columns to be read.
- handler::table_flags() now returns a ulonglong (to allow for more flags).
- New/changed table_flags()
- HA_HAS_RECORDS Set if ::records() is supported
- HA_NO_TRANSACTIONS Set if engine doesn't support transactions
- HA_PRIMARY_KEY_REQUIRED_FOR_DELETE
Set if we should mark all primary key columns for
read when reading rows as part of a DELETE
statement. If there is no primary key,
all columns are marked for read.
- HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some
cases (based on table->read_set)
- HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS
Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION.
- HA_DUPP_POS Renamed to HA_DUPLICATE_POS
- HA_REQUIRES_KEY_COLUMNS_FOR_DELETE
Set this if we should mark ALL key columns for
read when reading rows as part of a DELETE
statement. In case of an update we will mark
all keys for read for which key part changed
value.
- HA_STATS_RECORDS_IS_EXACT
Set this if stats.records is exact.
(This saves us some extra records() calls
when optimizing COUNT(*))
- Removed table_flags()
- HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if
handler::records() gives an exact count() and
HA_STATS_RECORDS_IS_EXACT if stats.records is exact.
- HA_READ_RND_SAME Removed (no one supported this one)
- Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk()
- Renamed handler::dupp_pos to handler::dup_pos
- Removed not used variable handler::sortkey
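The read_set / write_set rules and the "temporary all-columns map" pattern from the list above can be modelled in a few lines. This is a sketch under assumptions: the types are hypothetical, std::bitset replaces MY_BITMAP, and an RAII guard is used here in place of the dbug_tmp_use_all_columns() / dbug_tmp_restore_column_map() function pair that the text describes.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

constexpr std::size_t kSketchMaxColumns = 64;  // hypothetical limit
using SketchColumnMap = std::bitset<kSketchMaxColumns>;

// Hypothetical table: one bit per column in each usage map.
struct SketchTable {
  SketchColumnMap read_set;    // columns whose values the query needs
  SketchColumnMap write_set;   // columns the query will update
};

// Hypothetical field: val() and store() assert on the usage maps, in the
// spirit of the DBUG_ASSERT checks mentioned in the text.
struct SketchField {
  SketchTable *table;
  std::size_t index;
  long value;

  long val() const {
    assert(table->read_set.test(index));
    return value;
  }
  void store(long v) {
    assert(table->write_set.test(index));
    value = v;
  }
};

// Installs an all-columns-used map for the lifetime of the guard and puts
// the previous map back on destruction.
class SketchUseAllColumns {
 public:
  explicit SketchUseAllColumns(SketchColumnMap &map) : map_(map), saved_(map) {
    map_.set();
  }
  ~SketchUseAllColumns() { map_ = saved_; }
  SketchUseAllColumns(const SketchUseAllColumns &) = delete;
  SketchUseAllColumns &operator=(const SketchUseAllColumns &) = delete;

 private:
  SketchColumnMap &map_;
  SketchColumnMap saved_;
};
```

A handler that needs to read a column outside the query's read_set would create the guard on table.read_set, call val(), and let the guard restore the map at scope exit.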
Upper level handler changes:
- ha_reset() now does some overall checks and calls ::reset()
- ha_table_flags() added. This is a cached version of table_flags(). The
cache is updated on engine creation time and updated on open.
MySQL level changes (not obvious from the above):
- DBUG_ASSERT() added to check that column usage matches what is set
in the column usage bit maps. (This found a LOT of bugs in current
column marking code).
- Previously in 5.1, all used columns were marked in read_set and only updated
columns were marked in write_set. Now we only mark columns for which we
need a value in read_set.
- Column bitmaps are created in open_binary_frm() and open_table_from_share().
(Before this was in table.cc)
- handler::table_flags() calls are replaced with handler::ha_table_flags()
- For calling field->val() you must have the corresponding bit set in
table->read_set. For calling field->store() you must have the
corresponding bit set in table->write_set. (There are asserts in
all store()/val() functions to catch wrong usage)
- thd->set_query_id is renamed to thd->mark_used_columns and instead
of setting this to an integer value, this has now the values:
MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE
Changed also all variables named 'set_query_id' to mark_used_columns.
- In filesort() we now inform the handler of exactly which columns are needed
doing the sort and choosing the rows.
- The TABLE_SHARE object has an 'all_set' column bitmap one can use
when one needs a column bitmap with all columns set.
(This is used for table->use_all_columns() and other places)
- The TABLE object has 3 column bitmaps:
- def_read_set Default bitmap for columns to be read
- def_write_set Default bitmap for columns to be written
- tmp_set Can be used as a temporary bitmap when needed.
The table object also has two pointers to bitmaps, read_set and write_set,
that the handler should use to find out which columns are used in which way.
- count() optimization now calls handler::records() instead of using
handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true).
- Added extra argument to Item::walk() to indicate if we should also
traverse sub queries.
- Added TABLE parameter to cp_buffer_from_ref()
- Don't close tables created with CREATE ... SELECT but keep them in
the table cache. (Faster usage of newly created tables).
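How the COUNT(*) optimization could choose between stats.records and records() is sketched below. The flag values, the SketchHandler type and the decision order are assumptions made for illustration; the text only states that records() is used when HA_HAS_RECORDS is set and that HA_STATS_RECORDS_IS_EXACT saves some records() calls.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: flag values are made up; only the names echo the text.
constexpr std::uint64_t SKETCH_HA_HAS_RECORDS = 1ULL << 0;
constexpr std::uint64_t SKETCH_HA_STATS_RECORDS_IS_EXACT = 1ULL << 1;

// Hypothetical handler model.
struct SketchHandler {
  std::uint64_t flags;
  std::uint64_t stats_records;  // an estimate unless flagged as exact
  std::uint64_t exact_records;  // what records() would count
  int records_calls;            // how often records() was invoked

  std::uint64_t records() {
    ++records_calls;
    return exact_records;
  }
};

// Returns true and sets *count when COUNT(*) can be answered without
// scanning rows; returns false when a scan is still required.
inline bool sketch_exact_count(SketchHandler &h, std::uint64_t *count) {
  assert(count != nullptr);
  if (h.flags & SKETCH_HA_STATS_RECORDS_IS_EXACT) {
    *count = h.stats_records;   // no records() call needed
    return true;
  }
  if (h.flags & SKETCH_HA_HAS_RECORDS) {
    *count = h.records();
    return true;
  }
  return false;
}
```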
New interfaces:
- table->clear_column_bitmaps() to initialize the bitmaps for tables
at start of new statements.
- table->column_bitmaps_set() to set up new column bitmaps and signal
the handler about this.
- table->column_bitmaps_set_no_signal() for the few cases where we need
to set up new column bitmaps but don't signal the handler (as the handler
has already been signaled about these before). Used for the moment
only in opt_range.cc when doing ROR scans.
- table->use_all_columns() to install a bitmap where all columns are marked
as used in the read and the write set.
- table->default_column_bitmaps() to install the normal read and write
column bitmaps, but not signaling the handler about this.
This is mainly used when creating TABLE instances.
- table->mark_columns_needed_for_delete(),
table->mark_columns_needed_for_update() and
table->mark_columns_needed_for_insert() to allow us to put additional
columns in column usage maps if the handler so requires.
(The handler indicates what it needs in handler->table_flags())
- table->prepare_for_position() to allow us to tell handler that it
needs to read primary key parts to be able to store them in
future table->position() calls.
(This replaces the table->file->ha_retrieve_all_pk function)
- table->mark_auto_increment_column() to tell the handler that we are going to
update columns that are part of any auto_increment key.
- table->mark_columns_used_by_index() to mark all columns that are part of
an index. It will also send extra(HA_EXTRA_KEYREAD) to the handler to allow
it to quickly know that it only needs to read columns that are part
of the key. (The handler can also use the column map for detecting this,
but a simpler/faster handler can just monitor the extra() call).
- table->mark_columns_used_by_index_no_reset() to mark, in addition to other
columns, all columns that are used by the given key.
- table->restore_column_maps_after_mark_index() to restore to default
column maps after a call to table->mark_columns_used_by_index().
- New item function register_field_in_read_map(), for marking used columns
in table->read_set. Used by filesort() to mark all used columns
- Maintain in TABLE->merge_keys the set of all keys that are used in the query.
(Simplifies some optimization loops)
- Maintain Field->part_of_key_not_clustered which is like Field->part_of_key
but the field in the clustered key is not assumed to be part of all indexes.
(used in opt_range.cc for faster loops)
- dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map()
tmp_use_all_columns() and tmp_restore_column_map() functions to temporarily
mark all columns as usable. The 'dbug_' version is primarily intended for use
inside a handler when it wants to just call Field::store() & Field::val()
functions, but doesn't need the column maps set for any other usage.
(i.e., bitmap_is_set() is never called)
- We can't use compare_records() to skip updates for handlers that return
a partial column set when the read_set doesn't cover all columns in the
write set. The reason for this is that if we have a column marked only for
write, we can't know at the MySQL level whether the value changed or not.
The reason this worked before was that MySQL marked all to-be-written
columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden
bug'.
- open_table_from_share() no longer sets up a temporary MEM_ROOT
object as a thread-specific variable for the handler. Instead we
send the to-be-used MEM_ROOT to get_new_handler().
(Simpler, faster code)
Bugs fixed:
- Column marking was not done correctly in a lot of cases.
(ALTER TABLE, when using triggers, auto_increment fields etc)
(Could potentially result in wrong values inserted in table handlers
that relied on the old column maps or field->set_query_id being correct)
Especially when it comes to triggers, there may be cases where the
old code would cause lost/wrong values for NDB and/or InnoDB tables.
- Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE into two flags:
OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG.
This allowed me to remove some wrong warnings about:
"Some non-transactional changed tables couldn't be rolled back"
- Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset
(thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to lose
some warnings about
"Some non-transactional changed tables couldn't be rolled back"
- Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table()
which could cause delete_table to report random failures.
- Fixed core dumps for some tests when running with --debug
- Added missing FN_LIBCHAR in mysql_rm_tmp_tables()
(This has probably caused us to not properly remove temporary files after
crash)
- slow_logs was not properly initialized, which could maybe cause
extra/lost entries in slow log.
- If we get a duplicate row on insert, change column map to read and
write all columns while retrying the operation. This is required by
the definition of REPLACE and also ensures that fields that are only
part of UPDATE are properly handled. This fixed a bug in NDB and
REPLACE where REPLACE wrongly copied some column values from the replaced
row.
- For table handlers that don't support NULL in keys, we would give an error
when creating a primary key with NULL fields, even after the fields had been
automatically converted to NOT NULL.
- Creating a primary key on a SPATIAL key, would fail if field was not
declared as NOT NULL.
Cleanups:
- Removed not used condition argument to setup_tables
- Removed not needed item function reset_query_id_processor().
- Field->add_index is removed. Now this is instead maintained in
(field->flags & FIELD_IN_ADD_INDEX)
- Field->fieldnr is removed (use field->field_index instead)
- New argument to filesort() to indicate that it should return a set of
row pointers (not used columns). This allowed me to remove some references
to sql_command in filesort and should also enable us to return column
results in some cases where we couldn't before.
- Changed column bitmap handling in opt_range.cc to be aligned with TABLE
bitmap, which allowed me to use bitmap functions instead of looping over
all fields to create some needed bitmaps. (Faster and smaller code)
- Broke up lines that were too long
- Moved some variable declarations to the start of functions for better code
readability.
- Removed some unused arguments from functions.
(setup_fields(), mysql_prepare_insert_check_table())
- setup_fields() now takes an enum instead of an int for marking columns
usage.
- For internal temporary tables, use handler::write_row(),
handler::delete_row() and handler::update_row() instead of
handler::ha_xxxx() for faster execution.
- Changed some constants to enum's and define's.
- Using separate column read and write sets allows for easier checking
of whether the timestamp field was set by the statement.
- Remove calls to free_io_cache() as this is now done automatically in ha_reset()
- Don't build table->normalized_path as this is now identical to table->path
(after bar's fixes to convert filenames)
- Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to
do comparison with the 'convert-dbug-for-diff' tool.
Things left to do in 5.1:
- We wrongly log failed CREATE TABLE ... SELECT in some cases when using
row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result)
Mats has promised to look into this.
- Test that my fix for CREATE TABLE ... SELECT is indeed correct.
(I added several test cases for this, but in this case it's better that
someone else also tests this thoroughly).
Lars has promised to do this.
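The temporary all-columns map described in this changeset (tmp_use_all_columns() / tmp_restore_column_map()) can be pictured with the minimal sketch below. It is only an illustration under stated assumptions: std::bitset stands in for MY_BITMAP, and SketchTable, sketch_use_all_columns(), sketch_restore_column_map() and sketch_can_read() are hypothetical names, not the server's API.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins: the real code uses MY_BITMAP and TABLE.
constexpr std::size_t kSketchMaxColumns = 64;
using ColumnSet = std::bitset<kSketchMaxColumns>;

struct SketchTable {
  ColumnSet def_read_set;     // container owned by the table
  ColumnSet all_set;          // every column marked
  const ColumnSet *read_set;  // active set, in the spirit of TABLE::read_set

  SketchTable() : read_set(&def_read_set) { all_set.set(); }
};

// Make every column readable and hand back the previously active set.
inline const ColumnSet *sketch_use_all_columns(SketchTable &t) {
  const ColumnSet *old_set = t.read_set;
  t.read_set = &t.all_set;
  return old_set;
}

// Reinstall the set saved by sketch_use_all_columns().
inline void sketch_restore_column_map(SketchTable &t, const ColumnSet *old_set) {
  t.read_set = old_set;
}

// True if column `idx` may be read under the currently active set.
inline bool sketch_can_read(const SketchTable &t, std::size_t idx) {
  return t.read_set->test(idx);
}
```

The point of the save/restore pair is that the caller never modifies the statement's own column set; it only swaps which set is active for a short window.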
2006-06-04 17:52:22 +02:00
my_bitmap_map *bitmap_init_value;
MY_BITMAP def_read_set, def_write_set, tmp_set; /* containers */
MY_BITMAP *read_set, *write_set; /* Active column sets */
2007-11-01 21:52:56 +01:00
/*
  The ID of the query that opened and is using this table. Has different
  meanings depending on the table type.

  Temporary tables:

  table->query_id is set to thd->query_id for the duration of a statement
  and is reset to 0 once it is closed by the same statement. A non-zero
  table->query_id means that a statement is using the table even if it's
  not the current statement (table is in use by some outer statement).

  Non-temporary tables:

  Under pre-locked or LOCK TABLES mode: query_id is set to thd->query_id
  for the duration of a statement and is reset to 0 once it is closed by
  the same statement. A non-zero query_id is used to control which tables
  in the list of pre-opened and locked tables are actually being used.
*/
2005-03-19 01:12:25 +01:00
query_id_t query_id;
2005-01-06 12:00:13 +01:00

2006-07-28 19:27:01 +02:00
/*
  For each key that has quick_keys.is_set(key) == TRUE: estimate of #records
  and max #key parts that range access would use.
*/
2005-01-06 12:00:13 +01:00
ha_rows quick_rows[MAX_KEY];
2006-07-28 19:27:01 +02:00

/* Bitmaps of key parts that =const for the entire join. */
2005-01-06 12:00:13 +01:00
key_part_map const_key_parts[MAX_KEY];
2006-07-28 19:27:01 +02:00

2005-01-06 12:00:13 +01:00
uint quick_key_parts[MAX_KEY];
2006-05-10 15:40:20 +02:00
uint quick_n_ranges[MAX_KEY];
2004-06-25 15:52:01 +02:00

2006-07-28 19:27:01 +02:00
/*
  Estimate of number of records that satisfy SARGable part of the table
  condition, or table->file->records if no SARGable condition could be
  constructed.
  This value is used by join optimizer as an estimate of number of records
  that will pass the table condition (condition that depends on fields of
  this table and constants)
*/
ha_rows quick_condition_rows;

2004-10-01 16:54:06 +02:00
/*
  If this table has TIMESTAMP field with auto-set property (pointed by
  timestamp_field member) then this variable indicates during which
  operations (insert only/on update/in both cases) we should set this
  field to current timestamp. If there is no such field in this table
  or we should not automatically set its value during execution of current
  statement then the variable contains TIMESTAMP_NO_AUTO_SET (i.e. 0).

  Value of this variable is set for each statement in open_table() and
  if needed cleared later in statement processing code (see mysql_update()
  as example).
2004-06-25 15:52:01 +02:00
*/
2004-10-01 16:54:06 +02:00
timestamp_auto_set_type timestamp_field_type;
2005-01-06 12:00:13 +01:00
table_map map; /* ID bit of table (1,2,4,8,16...) */
2006-02-20 15:23:57 +01:00

uint lock_position; /* Position in MYSQL_LOCK.table */
uint lock_data_start; /* Start pos. in MYSQL_LOCK.locks */
uint lock_count; /* Number of locks */
2005-01-06 12:00:13 +01:00
uint tablenr,used_fields;
uint temp_pool_slot; /* Used by intern temp tables */
uint status; /* What's in record[0] */
uint db_stat; /* mode of file as in handler.h */
/* number of select if it is derived table */
uint derived_select_number;
int current_lock; /* Type of lock on table */
2000-07-31 21:29:14 +02:00
my_bool copy_blobs; /* copy_blobs when storing */
2006-02-20 15:23:57 +01:00

/*
2005-02-05 16:16:29 +01:00
  0 or JOIN_TYPE_{LEFT|RIGHT}. Currently this is only compared to 0.
  If maybe_null !=0, this table is inner w.r.t. some outer join operation,
  and null_row may be true.
*/
uint maybe_null;
2004-12-11 13:51:52 +01:00
/*
2005-02-05 16:16:29 +01:00
  If true, the current table row is considered to have all columns set to
  NULL, including columns declared as "not null" (see maybe_null).
WL#3817: Simplify string / memory area types and make things more consistent (first part)
The following type conversions were done:
- Changed byte to uchar
- Changed gptr to uchar*
- Changed my_string to char *
- Changed my_size_t to size_t
- Changed size_s to size_t
Removed declaration of byte, gptr, my_string, my_size_t and size_s.
The following function parameter changes were done:
- All string functions in mysys/strings were changed to use size_t
instead of uint for string lengths.
- All read()/write() functions changed to use size_t (including vio).
- All protocol functions changed to use size_t instead of uint
- Functions that used a pointer to a string length were changed to use size_t*
- Changed malloc(), free() and related functions from using gptr to use void *
as this requires fewer casts in the code and is more in line with how the
standard functions work.
- Added extra length argument to dirname_part() to return the length of the
created string.
- Changed (at least) following functions to take uchar* as argument:
- db_dump()
- my_net_write()
- net_write_command()
- net_store_data()
- DBUG_DUMP()
- decimal2bin() & bin2decimal()
- Changed my_compress() and my_uncompress() to use size_t. Changed one
argument to my_uncompress() from a pointer to a value as we only return
one value (makes function easier to use).
- Changed type of 'pack_data' argument to packfrm() to avoid casts.
- Changed in readfrm() and writefrm(), ha_discover and handler::discover()
the type for argument 'frmdata' to uchar** to avoid casts.
- Changed most Field functions to use uchar* instead of char* (reduced a lot of
casts).
- Changed field->val_xxx(xxx, new_ptr) to take const pointers.
Other changes:
- Removed a lot of not needed casts
- Added a few new casts required by other changes
- Added some casts to my_multi_malloc() arguments for safety (as string lengths
need to be uint, not size_t).
- Fixed all calls to hash-get-key functions to use size_t*. (Needed to be done
explicitly as this conflict was often hidden by casting the function to
hash_get_key).
- Changed some buffers to memory regions to uchar* to avoid casts.
- Changed some string lengths from uint to size_t.
- Changed field->ptr to be uchar* instead of char*. This allowed us to
get rid of a lot of casts.
- Some changes from true -> TRUE, false -> FALSE, unsigned char -> uchar
- Include zlib.h in some files as we needed declaration of crc32()
- Changed MY_FILE_ERROR to be (size_t) -1.
- Changed many variables to hold the result of my_read() / my_write() to be
size_t. This was needed to properly detect errors (which are
returned as (size_t) -1).
- Removed some very old VMS code
- Changed packfrm()/unpackfrm() to not be depending on uint size
(portability fix)
- Removed windows specific code to restore cursor position as this
causes slowdown on windows and we should not mix read() and pread()
calls anyway as this is not thread safe. Updated function comment to
reflect this. Changed function that depended on original behavior of
my_pwrite() to itself restore the cursor position (one such case).
- Added some missing checking of return value of malloc().
- Changed definition of MOD_PAD_CHAR_TO_FULL_LENGTH to avoid 'long' overflow.
- Changed type of table_def::m_size from my_size_t to ulong to reflect that
m_size is the number of elements in the array, not a string/memory
length.
- Moved THD::max_row_length() to table.cc (as it's not depending on THD).
Inlined max_row_length_blob() into this function.
- More function comments
- Fixed some compiler warnings when compiled without partitions.
- Removed setting of LEX_STRING() arguments in declaration (portability fix).
- Some trivial indentation/variable name changes.
- Some trivial code simplifications:
- Replaced some calls to alloc_root + memcpy to use
strmake_root()/strdup_root().
- Changed some calls from memdup() to strmake() (Safety fix)
- Simpler loops in client-simple.c
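The note above that results of my_read() / my_write() must be held in a size_t to detect errors returned as (size_t) -1 can be illustrated with the sketch below. It is an assumption-laden example, not the mysys code: SKETCH_FILE_ERROR, sketch_read() and the two checker functions are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sentinel modelled on MY_FILE_ERROR == (size_t) -1.
constexpr std::size_t SKETCH_FILE_ERROR = static_cast<std::size_t>(-1);

// Hypothetical I/O call: returns the byte count, or the sentinel on failure.
inline std::size_t sketch_read(bool fail, std::size_t count) {
  return fail ? SKETCH_FILE_ERROR : count;
}

// Result kept as size_t: the comparison against the sentinel is exact.
inline bool sketch_error_seen_as_size_t(bool fail) {
  std::size_t result = sketch_read(fail, 16);
  return result == SKETCH_FILE_ERROR;
}

// Result narrowed to unsigned int: where size_t is wider than unsigned int,
// the truncated value no longer equals the sentinel and the error is missed.
inline bool sketch_error_seen_as_uint(bool fail) {
  unsigned int result = static_cast<unsigned int>(sketch_read(fail, 16));
  return result == SKETCH_FILE_ERROR;
}
```

On platforms where size_t and unsigned int have the same width both checks agree; the difference only shows where size_t is wider, which is the portability concern the change addresses.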
2007-05-10 11:59:39 +02:00
*/
my_bool null_row;
BUG#25091 (A DELETE statement to mysql database is not logged in ROW format):
With this patch, statements that change metadata (in the mysql database)
are logged as statements, while normal changes (e.g., using INSERT, DELETE,
and/or UPDATE) are logged according to the format in effect.
The log tables (i.e., general_log and slow_log) are not replicated at all.
With this patch, the following statements are replicated as statements:
GRANT, REVOKE (ALL), CREATE USER, DROP USER, and RENAME USER.
2007-02-26 10:19:08 +01:00
2007-05-10 11:59:39 +02:00
/*
  TODO: Each of the following flags takes up 8 bits. They can just as easily
2007-02-26 10:19:08 +01:00
  be put into one single unsigned long and instead of taking up 18
  bytes, it would take up 4.
2004-12-11 13:51:52 +01:00
*/
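The packing suggested by the TODO above could look roughly like the following sketch. It is not code from this header: the flag names and SketchPackedFlags are hypothetical, only a handful of flags are shown, and a fixed 32-bit word is used instead of unsigned long so that the 4-byte size holds regardless of platform.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag names; each takes one bit instead of one my_bool byte.
enum SketchTableFlag : std::uint32_t {
  SKETCH_FORCE_INDEX = 1u << 0,
  SKETCH_DISTINCT    = 1u << 1,
  SKETCH_CONST_TABLE = 1u << 2,
  SKETCH_NO_ROWS     = 1u << 3,
  SKETCH_KEY_READ    = 1u << 4
};

struct SketchPackedFlags {
  std::uint32_t bits = 0;

  // Turn one flag on or off without touching the others.
  void set(SketchTableFlag f, bool on) {
    bits = on ? (bits | f) : (bits & ~static_cast<std::uint32_t>(f));
  }

  // Report whether a flag is currently on.
  bool test(SketchTableFlag f) const { return (bits & f) != 0; }
};

static_assert(sizeof(SketchPackedFlags) == 4, "all flags share one 32-bit word");
```

The trade-off is that every access becomes a mask operation rather than a plain byte load or store.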
2003-01-09 01:19:14 +01:00
my_bool force_index;
2001-05-23 01:40:46 +02:00
my_bool distinct,const_table,no_rows;
2005-01-06 12:00:13 +01:00
my_bool key_read, no_keyread;
2007-05-11 19:51:03 +02:00
/*
  Placeholder for an open table which prevents other connections
  from taking name-locks on this table. Typically used with
  TABLE_SHARE::version member to take an exclusive name-lock on
  this table name -- a name lock that not only prevents other
  threads from opening the table, but also blocks other name
  locks. This is achieved by:
  - setting open_placeholder to 1 - this will block other name
    locks, as wait_for_locked_table_name will be forced to wait,
    see table_is_used for details.
  - setting version to 0 - this will force other threads to close
    the instance of this table and wait (this is the same approach
    as used for usual name locks).
  An exclusively name-locked table currently can have no handler
  object associated with it (db_stat is always 0), but please do
  not rely on that.
*/
my_bool open_placeholder;
2006-01-19 03:56:06 +01:00
my_bool locked_by_logger;
2007-02-26 10:19:08 +01:00
my_bool no_replicate;
2000-08-29 11:31:01 +02:00
my_bool locked_by_name;
2002-03-01 17:57:08 +01:00
my_bool fulltext_searched;
2005-01-06 12:00:13 +01:00
my_bool no_cache;
2007-11-01 21:52:56 +01:00
/* To signal that the table is associated with a HANDLER statement */
my_bool open_by_handler;
2007-03-30 16:13:33 +02:00
/*
  To indicate that a non-null value of the auto_increment field
  was provided by the user or retrieved from the current record.
  Used only in the MODE_NO_AUTO_VALUE_ON_ZERO mode.
*/
2004-06-25 15:52:01 +02:00
my_bool auto_increment_field_not_null;
2004-12-06 01:00:37 +01:00
my_bool insert_or_update; /* Can be used by the handler */
2004-12-06 18:18:35 +01:00
my_bool alias_name_used; /* true if table_name is alias */
2005-07-18 13:31:02 +02:00
my_bool get_fields_in_item_tree; /* Signal to fix_field */
Bug#26379 - Combination of FLUSH TABLE and REPAIR TABLE
corrupts a MERGE table
Bug 26867 - LOCK TABLES + REPAIR + merge table result in
memory/cpu hogging
Bug 26377 - Deadlock with MERGE and FLUSH TABLE
Bug 25038 - Waiting TRUNCATE
Bug 25700 - merge base tables get corrupted by
optimize/analyze/repair table
Bug 30275 - Merge tables: flush tables or unlock tables
causes server to crash
Bug 19627 - temporary merge table locking
Bug 27660 - Falcon: merge table possible
Bug 30273 - merge tables: Can't lock file (errno: 155)
The problems were:
Bug 26379 - Combination of FLUSH TABLE and REPAIR TABLE
corrupts a MERGE table
1. A thread trying to lock a MERGE table performs busy waiting while
REPAIR TABLE or a similar table administration task is ongoing on
one or more of its MyISAM tables.
2. A thread trying to lock a MERGE table performs busy waiting until all
threads that did REPAIR TABLE or similar table administration tasks
on one or more of its MyISAM tables in LOCK TABLES segments do UNLOCK
TABLES. The difference against problem #1 is that the busy waiting
takes place *after* the administration task. It is terminated by
UNLOCK TABLES only.
3. Two FLUSH TABLES within a LOCK TABLES segment can invalidate the
lock. This does *not* require a MERGE table. The first FLUSH TABLES
can be replaced by any statement that requires other threads to
reopen the table. In 5.0 and 5.1 a single FLUSH TABLES can provoke
the problem.
Bug 26867 - LOCK TABLES + REPAIR + merge table result in
memory/cpu hogging
Trying DML on a MERGE table, which has a child locked and
repaired by another thread, made an infinite loop in the server.
Bug 26377 - Deadlock with MERGE and FLUSH TABLE
Locking a MERGE table and its children in parent-child order
and flushing the child deadlocked the server.
Bug 25038 - Waiting TRUNCATE
Truncating a MERGE child, while the MERGE table was in use,
let the truncate fail instead of waiting for the table to
become free.
Bug 25700 - merge base tables get corrupted by
optimize/analyze/repair table
Repairing a child of an open MERGE table corrupted the child.
It was necessary to FLUSH the child first.
Bug 30275 - Merge tables: flush tables or unlock tables
causes server to crash
Flushing and optimizing locked MERGE children crashed the server.
Bug 19627 - temporary merge table locking
Use of a temporary MERGE table with non-temporary children
could corrupt the children.
Temporary tables are never locked. So we do now prohibit
non-temporary children of a temporary MERGE table.
Bug 27660 - Falcon: merge table possible
It was possible to create a MERGE table with non-MyISAM children.
Bug 30273 - merge tables: Can't lock file (errno: 155)
This was a Windows-only bug. Table administration statements
sometimes failed with "Can't lock file (errno: 155)".
These bugs are fixed by a new implementation of MERGE table open.
When opening a MERGE table in open_tables() we do now add the
child tables to the list of tables to be opened by open_tables()
(the "query_list"). The children are not opened in the handler at
this stage.
After opening the parent, open_tables() opens each child from the
now extended query_list. When the last child is opened, we remove
the children from the query_list again and attach the children to
the parent. This behaves similarly to the old open. However it does
not open the MyISAM tables directly, but grabs them from the already
open children.
When closing a MERGE table in close_thread_table() we detach the
children only. Closing of the children is done implicitly because
they are in thd->open_tables.
For more detail see the comment at the top of ha_myisammrg.cc.
Changed from open_ltable() to open_and_lock_tables() in all places
that can be relevant for MERGE tables. The latter can handle tables
added to the list on the fly. When open_ltable() was used in a loop
over a list of tables, the list must be temporarily terminated
after every table for open_and_lock_tables().
table_list->required_type is set to FRMTYPE_TABLE to avoid open of
special tables. Handling of derived tables is suppressed.
These details are handled by the new function
open_n_lock_single_table(), which has nearly the same signature as
open_ltable() and can replace it in most cases.
In reopen_tables() some of the tables open by a thread can be
closed and reopened. When a MERGE child is affected, the parent
must be closed and reopened too. Closing of the parent is forced
before the first child is closed. Reopen happens in the order of
thd->open_tables. MERGE parents do not attach their children
automatically at open. This is done after all tables are reopened.
So all children are open when attaching them.
Special lock handling like mysql_lock_abort() or mysql_lock_remove()
needs to be suppressed for MERGE children or forwarded to the parent.
This depends on the situation. In loops over all open tables one
suppresses child lock handling. When a single table is touched,
forwarding is done.
Behavioral changes:
===================
This patch changes the behavior of temporary MERGE tables.
Temporary MERGE must have temporary children.
The old behavior was wrong. A temporary table is not locked. Hence
even non-temporary children were not locked. See
Bug 19627 - temporary merge table locking.
You cannot change the union list of a non-temporary MERGE table
when LOCK TABLES is in effect. The following does *not* work:
CREATE TABLE m1 ... ENGINE=MRG_MYISAM ...;
LOCK TABLES t1 WRITE, t2 WRITE, m1 WRITE;
ALTER TABLE m1 ... UNION=(t1,t2) ...;
However, you can do this with a temporary MERGE table.
You cannot create a MERGE table with CREATE ... SELECT, neither
as a temporary MERGE table, nor as a non-temporary MERGE table.
CREATE TABLE m1 ... ENGINE=MRG_MYISAM ... SELECT ...;
Gives error message: table is not BASE TABLE.
2007-11-15 20:25:43 +01:00
/* If MERGE children attached to parent. See top comment in ha_myisammrg.cc */
my_bool children_attached;
2005-01-06 12:00:13 +01:00

2000-07-31 21:29:14 +02:00
REGINFO reginfo; /* field connections */
MEM_ROOT mem_root;
GRANT_INFO grant;
2003-04-24 13:33:33 +02:00
FILESORT_INFO sort;
2005-11-23 21:45:02 +01:00
#ifdef WITH_PARTITION_STORAGE_ENGINE
partition_info *part_info; /* Partition related information */
2005-12-26 06:40:09 +01:00
bool no_partitions_used; /* If true, all partitions have been pruned away */
2005-11-23 21:45:02 +01:00
#endif
2005-09-22 00:11:21 +02:00

bool fill_item_list(List<Item> *item_list) const;
void reset_item_list(List<Item> *item_list) const;
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that were found necessary while testing the handler changes
Changes that require code changes in the code of other storage engines.
(Note that all changes are very straightforward and one should find all issues
by compiling a --debug build and fixing all compiler errors and all
asserts in field.cc while running the test suite):
- New optional handler function introduced: reset()
This is called after every DML statement to make it easy for a handler to
do statement specific cleanups.
(The only case it's not called is if we force the file to be closed)
- handler::extra(HA_EXTRA_RESET) is removed. Code that was there before
should be moved to handler::reset()
- table->read_set contains a bitmap over all columns that are needed
in the query. read_row() and similar functions only need to read these
columns
- table->write_set contains a bitmap over all columns that will be updated
in the query. write_row() and update_row() only needs to update these
columns.
The above bitmaps should now be up to date in all contexts
(including ALTER TABLE, filesort()).
The handler is informed of any changes to the bitmap after
fix_fields() by calling the virtual function
handler::column_bitmaps_signal(). If the handler does caching of
these bitmaps (instead of using table->read_set, table->write_set),
it should redo the caching in this code. As the signal() may be sent
several times, it's probably best to set a variable in the signal
and redo the caching on read_row() / write_row() if the variable was
set.
- Removed the read_set and write_set bitmap objects from the handler class
- Removed all column bit handling functions from the handler class.
(Now one instead uses the normal bitmap functions in my_bitmap.c instead
of handler dedicated bitmap functions)
- field->query_id is removed. One should instead check
table->read_set and table->write_set if a field is used in the query.
- handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and
handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now
instead use table->read_set to check for which columns to retrieve.
- If a handler needs to call Field->val() or Field->store() on columns
that are not used in the query, one should install a temporary
all-columns-used map while doing so. For this, we provide the following
functions:
my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set);
field->val();
dbug_tmp_restore_column_map(table->read_set, old_map);
and similar for the write map:
my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set);
field->val();
dbug_tmp_restore_column_map(table->write_set, old_map);
If this is not done, you will sooner or later hit a DBUG_ASSERT
in the field store() / val() functions.
(For non-DBUG binaries, dbug_tmp_use_all_columns() and
dbug_tmp_restore_column_map() are inline dummy functions and should
be optimized away by the compiler).
- If one needs to temporarily set the column map for all binaries (and not
just to avoid the DBUG_ASSERT() in the Field::store() / Field::val()
methods) one should use the functions tmp_use_all_columns() and
tmp_restore_column_map() instead of the above dbug_ variants.
- All 'status' fields in the handler base class (like records,
data_file_length etc) are now stored in a 'stats' struct. This makes
it easier to know what status variables are provided by the base
handler. This requires some trivial variable name changes in the extra()
function.
- New virtual function handler::records(). This is called to optimize
COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true.
(stats.records is not supposed to be an exact value. It only has to
be 'reasonable enough' for the optimizer to be able to choose a good
optimization path).
- Non virtual handler::init() function added for caching of virtual
constants from engine.
- Removed has_transactions() virtual method. Now one should instead return
HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support
transactions.
- The 'xxxx_create_handler()' function now has a MEM_ROOT *mem_root argument
that is to be used with 'new handler_name()' to allocate the handler
in the right area. The xxxx_create_handler() function is also
responsible for any initialization of the object before returning.
For example, one should change:
static handler *myisam_create_handler(TABLE_SHARE *table)
{
return new ha_myisam(table);
}
->
static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root)
{
return new (mem_root) ha_myisam(table);
}
- New optional virtual function: use_hidden_primary_key().
This is called in case of an update/delete when
(table_flags() & HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is set
but we don't have a primary key. This allows the handler to take precautions
in remembering any hidden primary key to be able to update/delete any
found row. The default handler marks all columns to be read.
- handler::table_flags() now returns a ulonglong (to allow for more flags).
- New/changed table_flags()
- HA_HAS_RECORDS Set if ::records() is supported
- HA_NO_TRANSACTIONS Set if engine doesn't support transactions
- HA_PRIMARY_KEY_REQUIRED_FOR_DELETE
Set if we should mark all primary key columns for
read when reading rows as part of a DELETE
statement. If there is no primary key,
all columns are marked for read.
- HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some
cases (based on table->read_set)
- HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS
Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION.
- HA_DUPP_POS Renamed to HA_DUPLICATE_POS
- HA_REQUIRES_KEY_COLUMNS_FOR_DELETE
Set this if we should mark ALL key columns for
read when reading rows as part of a DELETE
statement. In case of an update we will mark
all keys for read for which key part changed
value.
- HA_STATS_RECORDS_IS_EXACT
Set this if stats.records is exact.
(This saves us some extra records() calls
when optimizing COUNT(*))
- Removed table_flags()
- HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if
handler::records() gives an exact count() and
HA_STATS_RECORDS_IS_EXACT if stats.records is exact.
- HA_READ_RND_SAME Removed (no one supported this one)
- Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk()
- Renamed handler::dupp_pos to handler::dup_pos
- Removed not used variable handler::sortkey
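The interplay between HA_HAS_RECORDS and HA_STATS_RECORDS_IS_EXACT described above can be sketched roughly as follows. This is a toy model and not server code: ToyHandler, toy_exact_row_count() and the TOY_ flag values are made up for illustration, and the real flags are defined in handler.h.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag values for this sketch only.
const std::uint64_t TOY_HA_HAS_RECORDS = 1ULL << 0;
const std::uint64_t TOY_HA_STATS_RECORDS_IS_EXACT = 1ULL << 1;

struct ToyHandler {
  std::uint64_t flags;          // what table_flags() would return
  std::uint64_t stats_records;  // stands in for stats.records (an estimate)
  std::uint64_t exact_rows;     // what an exact records() call would return
  int records_calls;            // counts how often records() was called

  std::uint64_t table_flags() const { return flags; }
  std::uint64_t records() { ++records_calls; return exact_rows; }
};

// Stores an exact row count in *out and returns true if the engine can
// supply one without a scan; returns false if COUNT(*) must scan the table.
bool toy_exact_row_count(ToyHandler *h, std::uint64_t *out) {
  assert(h != nullptr && out != nullptr);
  if (h->table_flags() & TOY_HA_STATS_RECORDS_IS_EXACT) {
    *out = h->stats_records;    // exact already, no records() call needed
    return true;
  }
  if (h->table_flags() & TOY_HA_HAS_RECORDS) {
    *out = h->records();        // ask the engine for an exact count
    return true;
  }
  return false;
}
```

An engine that sets neither flag falls through to the scan path, which matches the note that stats.records only has to be 'reasonable enough' for the optimizer.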
Upper level handler changes:
- ha_reset() now does some overall checks and calls ::reset()
- ha_table_flags() added. This is a cached version of table_flags(). The
cache is updated on engine creation time and updated on open.
MySQL level changes (not obvious from the above):
- DBUG_ASSERT() added to check that column usage matches what is set
in the column usage bit maps. (This found a LOT of bugs in current
column marking code).
- Previously in 5.1, all used columns were marked in read_set and only updated
columns were marked in write_set. Now we only mark columns for which we
need a value in read_set.
- Column bitmaps are created in open_binary_frm() and open_table_from_share().
(Before this was in table.cc)
- handler::table_flags() calls are replaced with handler::ha_table_flags()
- For calling field->val() you must have the corresponding bit set in
table->read_set. For calling field->store() you must have the
corresponding bit set in table->write_set. (There are asserts in
all store()/val() functions to catch wrong usage)
- thd->set_query_id is renamed to thd->mark_used_columns and instead
of setting this to an integer value, this has now the values:
MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE
Changed also all variables named 'set_query_id' to mark_used_columns.
- In filesort() we now inform the handler of exactly which columns are needed
doing the sort and choosing the rows.
- The TABLE_SHARE object has a 'all_set' column bitmap one can use
when one needs a column bitmap with all columns set.
(This is used for table->use_all_columns() and other places)
- The TABLE object has 3 column bitmaps:
- def_read_set Default bitmap for columns to be read
- def_write_set Default bitmap for columns to be written
- tmp_set Can be used as a temporary bitmap when needed.
The TABLE object also has two pointers to bitmaps, read_set and write_set,
that the handler should use to find out which columns are used in which way.
- count() optimization now calls handler::records() instead of using
handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true).
- Added extra argument to Item::walk() to indicate if we should also
traverse sub queries.
- Added TABLE parameter to cp_buffer_from_ref()
- Don't close tables created with CREATE ... SELECT but keep them in
the table cache. (Faster usage of newly created tables).
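The read_set/write_set rule above (val() needs the read bit, store() needs the write bit) can be illustrated with a small stand-alone model. Everything here is hypothetical: ToyTable and ToyField are not the server classes, std::bitset stands in for MY_BITMAP, and where the server would hit a DBUG_ASSERT the sketch returns false instead.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

const std::size_t kToyMaxColumns = 64;  // arbitrary limit for the sketch
typedef std::bitset<kToyMaxColumns> ToyColumnMap;

struct ToyTable {
  ToyColumnMap def_read_set, def_write_set;  // default bitmaps
  ToyColumnMap *read_set, *write_set;        // bitmaps currently in effect

  ToyTable() : read_set(&def_read_set), write_set(&def_write_set) {}
  // The pointers refer to our own members, so copying would leave them dangling.
  ToyTable(const ToyTable &) = delete;
  ToyTable &operator=(const ToyTable &) = delete;
};

struct ToyField {
  ToyTable *table;
  std::size_t field_index;
  long value;

  // Writing requires the column's bit in write_set.
  bool store(long v) {
    if (!table->write_set->test(field_index)) return false;
    value = v;
    return true;
  }
  // Reading requires the column's bit in read_set.
  bool val(long *out) const {
    assert(out != nullptr);
    if (!table->read_set->test(field_index)) return false;
    *out = value;
    return true;
  }
};
```

A column marked only for write can be stored but not read back, which is the situation the compare_records() note further down refers to.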
New interfaces:
- table->clear_column_bitmaps() to initialize the bitmaps for tables
at start of new statements.
- table->column_bitmaps_set() to set up new column bitmaps and signal
the handler about this.
- table->column_bitmaps_set_no_signal() for some few cases where we need
to setup new column bitmaps but don't signal the handler (as the handler
has already been signaled about these before). Used for the moment
only in opt_range.cc when doing ROR scans.
- table->use_all_columns() to install a bitmap where all columns are marked
as used in the read and the write set.
- table->default_column_bitmaps() to install the normal read and write
column bitmaps, but not signaling the handler about this.
This is mainly used when creating TABLE instances.
- table->mark_columns_needed_for_delete(),
table->mark_columns_needed_for_update() and
table->mark_columns_needed_for_insert() to allow us to put additional
columns in column usage maps if handler so requires.
(The handler indicates what it needs in handler->table_flags())
- table->prepare_for_position() to allow us to tell handler that it
needs to read primary key parts to be able to store them in
future table->position() calls.
(This replaces the table->file->ha_retrieve_all_pk function)
- table->mark_auto_increment_column() to tell handler we are going to update
columns part of any auto_increment key.
- table->mark_columns_used_by_index() to mark all columns that are part of
an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow
it to quickly know that it only needs to read columns that are part
of the key. (The handler can also use the column map for detecting this,
but simpler/faster handler can just monitor the extra() call).
- table->mark_columns_used_by_index_no_reset() to in addition to other columns,
also mark all columns that are used by the given key.
- table->restore_column_maps_after_mark_index() to restore to default
column maps after a call to table->mark_columns_used_by_index().
- New item function register_field_in_read_map(), for marking used columns
in table->read_set. Used by filesort() to mark all used columns
- Maintain in TABLE->merge_keys set of all keys that are used in query.
(Simplifies some optimization loops)
- Maintain Field->part_of_key_not_clustered which is like Field->part_of_key
but the field in the clustered key is not assumed to be part of all indexes.
(used in opt_range.cc for faster loops)
- dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map()
tmp_use_all_columns() and tmp_restore_column_map() functions to temporarily
mark all columns as usable. The 'dbug_' version is primarily intended
inside a handler when it wants to just call Field::store() & Field::val()
functions, but doesn't need the column maps set for any other usage.
(i.e. bitmap_is_set() is never called)
- We can't use compare_records() to skip updates for handlers that return
a partial column set and the read_set doesn't cover all columns in the
write set. The reason for this is that if we have a column marked only for
write we can't at the MySQL level know if the value changed or not.
The reason this worked before was that MySQL marked all to be written
columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden
bug'.
- open_table_from_share() no longer sets up a temporary MEM_ROOT
object as a thread specific variable for the handler. Instead we
send the to-be-used MEM_ROOT to get_new_handler().
(Simpler, faster code)
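The save / use-all / restore pattern behind tmp_use_all_columns() and tmp_restore_column_map() can be sketched like this. It is only a model under stated assumptions: the toy_ functions and types are invented here, std::bitset stands in for MY_BITMAP, and the sketch copies the bitmap where the real functions exchange the bitmap's underlying buffer.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

const std::size_t kToyColumns = 8;  // arbitrary column count for the sketch
typedef std::bitset<kToyColumns> ToyColumnMap;

struct ToyShare {
  ToyColumnMap all_set;  // stands in for TABLE_SHARE::all_set
};

// Make every column usable through *map and hand back the previous
// contents so the caller can restore them afterwards.
inline ToyColumnMap toy_tmp_use_all_columns(const ToyShare *share,
                                            ToyColumnMap *map) {
  assert(share != nullptr && map != nullptr);
  ToyColumnMap old_map = *map;
  *map = share->all_set;
  return old_map;
}

// Put the saved column map back.
inline void toy_tmp_restore_column_map(ToyColumnMap *map,
                                       const ToyColumnMap &old_map) {
  assert(map != nullptr);
  *map = old_map;
}
```

The caller keeps the returned old map on its own stack, so nested use works as long as restores happen in reverse order.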
Bugs fixed:
- Column marking was not done correctly in a lot of cases.
(ALTER TABLE, when using triggers, auto_increment fields etc)
(Could potentially result in wrong values inserted in table handlers
relying on that the old column maps or field->set_query_id was correct)
Especially when it comes to triggers, there may be cases where the
old code would cause lost/wrong values for NDB and/or InnoDB tables.
- Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags:
OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG.
This allowed me to remove some wrong warnings about:
"Some non-transactional changed tables couldn't be rolled back"
- Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset
(thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to lose
some warnings about
"Some non-transactional changed tables couldn't be rolled back"
- Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table()
which could cause delete_table to report random failures.
- Fixed core dumps for some tests when running with --debug
- Added missing FN_LIBCHAR in mysql_rm_tmp_tables()
(This has probably caused us to not properly remove temporary files after
crash)
- slow_logs was not properly initialized, which could maybe cause
extra/lost entries in slow log.
- If we get a duplicate row on insert, change column map to read and
write all columns while retrying the operation. This is required by
the definition of REPLACE and also ensures that fields that are only
part of UPDATE are properly handled. This fixed a bug in NDB and
REPLACE where REPLACE wrongly copied some column values from the replaced
row.
- For table handlers that don't support NULL in keys, we would give an error
when creating a primary key with NULL fields, even after the fields had been
automatically converted to NOT NULL.
- Creating a primary key on a SPATIAL key, would fail if field was not
declared as NOT NULL.
Cleanups:
- Removed not used condition argument to setup_tables
- Removed not needed item function reset_query_id_processor().
- Field->add_index is removed. Now this is instead maintained in
(field->flags & FIELD_IN_ADD_INDEX)
- Field->fieldnr is removed (use field->field_index instead)
- New argument to filesort() to indicate that it should return a set of
row pointers (not used columns). This allowed me to remove some references
to sql_command in filesort and should also enable us to return column
results in some cases where we couldn't before.
- Changed column bitmap handling in opt_range.cc to be aligned with TABLE
bitmap, which allowed me to use bitmap functions instead of looping over
all fields to create some needed bitmaps. (Faster and smaller code)
- Broke up overly long lines
- Moved some variable declarations to the start of functions for better code
readability.
- Removed some not used arguments from functions.
(setup_fields(), mysql_prepare_insert_check_table())
- setup_fields() now takes an enum instead of an int for marking columns
usage.
- For internal temporary tables, use handler::write_row(),
handler::delete_row() and handler::update_row() instead of
handler::ha_xxxx() for faster execution.
- Changed some constants to enum's and define's.
- Using separate column read and write sets allows for easier checking
of whether a timestamp field was set by the statement.
- Remove calls to free_io_cache() as this is now done automatically in ha_reset()
- Don't build table->normalized_path as this is now identical to table->path
(after bar's fixes to convert filenames)
- Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to
do comparisons with the 'convert-dbug-for-diff' tool.
Things left to do in 5.1:
- We wrongly log failed CREATE TABLE ... SELECT in some cases when using
row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result)
Mats has promised to look into this.
- Test that my fix for CREATE TABLE ... SELECT is indeed correct.
(I added several test cases for this, but in this case it's better that
someone else also tests this thoroughly).
Lars has promised to do this.
2006-06-04 17:52:22 +02:00
  void clear_column_bitmaps(void);
  void prepare_for_position(void);
  void mark_columns_used_by_index_no_reset(uint index, MY_BITMAP *map);
  void mark_columns_used_by_index(uint index);
  void restore_column_maps_after_mark_index();
  void mark_auto_increment_column(void);
  void mark_columns_needed_for_update(void);
  void mark_columns_needed_for_delete(void);
  void mark_columns_needed_for_insert(void);
  inline void column_bitmaps_set(MY_BITMAP *read_set_arg,
                                 MY_BITMAP *write_set_arg)
  {
    read_set= read_set_arg;
    write_set= write_set_arg;
    if (file)
      file->column_bitmaps_signal();
  }
  inline void column_bitmaps_set_no_signal(MY_BITMAP *read_set_arg,
                                           MY_BITMAP *write_set_arg)
  {
    read_set= read_set_arg;
    write_set= write_set_arg;
  }
  inline void use_all_columns()
  {
    column_bitmaps_set(&s->all_set, &s->all_set);
  }
  inline void default_column_bitmaps()
  {
    read_set= &def_read_set;
    write_set= &def_write_set;
  }
2007-05-11 19:51:03 +02:00
  /* Is table open or should be treated as such by name-locking? */
  inline bool is_name_opened() { return db_stat || open_placeholder; }
  /*
    Should this instance of the table be reopened, or does it represent a
    name-lock?
  */
  inline bool needs_reopen_or_name_lock()
  { return s->version != refresh_version; }
Bug#26379 - Combination of FLUSH TABLE and REPAIR TABLE
corrupts a MERGE table
Bug 26867 - LOCK TABLES + REPAIR + merge table result in
memory/cpu hogging
Bug 26377 - Deadlock with MERGE and FLUSH TABLE
Bug 25038 - Waiting TRUNCATE
Bug 25700 - merge base tables get corrupted by
optimize/analyze/repair table
Bug 30275 - Merge tables: flush tables or unlock tables
causes server to crash
Bug 19627 - temporary merge table locking
Bug 27660 - Falcon: merge table possible
Bug 30273 - merge tables: Can't lock file (errno: 155)
The problems were:
Bug 26379 - Combination of FLUSH TABLE and REPAIR TABLE
corrupts a MERGE table
1. A thread trying to lock a MERGE table performs busy waiting while
REPAIR TABLE or a similar table administration task is ongoing on
one or more of its MyISAM tables.
2. A thread trying to lock a MERGE table performs busy waiting until all
threads that did REPAIR TABLE or similar table administration tasks
on one or more of its MyISAM tables in LOCK TABLES segments do UNLOCK
TABLES. The difference against problem #1 is that the busy waiting
takes place *after* the administration task. It is terminated by
UNLOCK TABLES only.
3. Two FLUSH TABLES within a LOCK TABLES segment can invalidate the
lock. This does *not* require a MERGE table. The first FLUSH TABLES
can be replaced by any statement that requires other threads to
reopen the table. In 5.0 and 5.1 a single FLUSH TABLES can provoke
the problem.
Bug 26867 - LOCK TABLES + REPAIR + merge table result in
memory/cpu hogging
Trying DML on a MERGE table, which has a child locked and
repaired by another thread, made an infinite loop in the server.
Bug 26377 - Deadlock with MERGE and FLUSH TABLE
Locking a MERGE table and its children in parent-child order
and flushing the child deadlocked the server.
Bug 25038 - Waiting TRUNCATE
Truncating a MERGE child, while the MERGE table was in use,
let the truncate fail instead of waiting for the table to
become free.
Bug 25700 - merge base tables get corrupted by
optimize/analyze/repair table
Repairing a child of an open MERGE table corrupted the child.
It was necessary to FLUSH the child first.
Bug 30275 - Merge tables: flush tables or unlock tables
causes server to crash
Flushing and optimizing locked MERGE children crashed the server.
Bug 19627 - temporary merge table locking
Use of a temporary MERGE table with non-temporary children
could corrupt the children.
Temporary tables are never locked. So we do now prohibit
non-temporary children of a temporary MERGE table.
Bug 27660 - Falcon: merge table possible
It was possible to create a MERGE table with non-MyISAM children.
Bug 30273 - merge tables: Can't lock file (errno: 155)
This was a Windows-only bug. Table administration statements
sometimes failed with "Can't lock file (errno: 155)".
These bugs are fixed by a new implementation of MERGE table open.
When opening a MERGE table in open_tables() we do now add the
child tables to the list of tables to be opened by open_tables()
(the "query_list"). The children are not opened in the handler at
this stage.
After opening the parent, open_tables() opens each child from the
now extended query_list. When the last child is opened, we remove
the children from the query_list again and attach the children to
the parent. This behaves similarly to the old open. However it does
not open the MyISAM tables directly, but grabs them from the already
open children.
When closing a MERGE table in close_thread_table() we detach the
children only. Closing of the children is done implicitly because
they are in thd->open_tables.
For more detail see the comment at the top of ha_myisammrg.cc.
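The new open sequence can be pictured with the following stand-alone model. It is a sketch under assumptions and not the server code: ToyTableRef and toy_open_tables() are hypothetical, and inserting the children directly behind the parent in the list is a choice made here, since the text above only says that the list is extended.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct ToyTableRef {
  std::string name;
  std::vector<std::string> children;  // non-empty only for a MERGE parent
  std::vector<std::string> attached;  // filled when children are attached
  bool opened;

  explicit ToyTableRef(const std::string &n,
                       const std::vector<std::string> &c =
                           std::vector<std::string>())
    : name(n), children(c), opened(false) {}
};

// Opens every table in the list and returns the order in which they
// were opened. Children are only temporary members of the list.
std::vector<std::string> toy_open_tables(std::vector<ToyTableRef> *query_list) {
  assert(query_list != nullptr);
  std::vector<std::string> open_order;
  for (std::size_t i = 0; i < query_list->size(); ++i) {
    (*query_list)[i].opened = true;
    open_order.push_back((*query_list)[i].name);
    // Copy the child names: insert() below may reallocate the list.
    const std::vector<std::string> kids = (*query_list)[i].children;
    if (kids.empty()) continue;
    const std::size_t size_before = query_list->size();
    // 1. Extend the list with the children of the parent just opened.
    for (std::size_t k = 0; k < kids.size(); ++k)
      query_list->insert(query_list->begin() + i + 1 + k, ToyTableRef(kids[k]));
    // 2. Open each child from the extended list.
    for (std::size_t k = 0; k < kids.size(); ++k) {
      ToyTableRef &child = (*query_list)[i + 1 + k];
      child.opened = true;
      open_order.push_back(child.name);
    }
    // 3. After the last child: attach to the parent, shrink the list again.
    (*query_list)[i].attached = kids;
    query_list->erase(query_list->begin() + i + 1,
                      query_list->begin() + i + 1 + kids.size());
    assert(query_list->size() == size_before);
  }
  return open_order;
}
```

For a list holding a MERGE table m1 with children t1 and t2, followed by a plain table t3, the model opens m1, t1, t2, t3 in that order and leaves the list with its original two entries.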
Changed from open_ltable() to open_and_lock_tables() in all places
that can be relevant for MERGE tables. The latter can handle tables
added to the list on the fly. When open_ltable() was used in a loop
over a list of tables, the list must be temporarily terminated
after every table for open_and_lock_tables().
table_list->required_type is set to FRMTYPE_TABLE to avoid open of
special tables. Handling of derived tables is suppressed.
These details are handled by the new function
open_n_lock_single_table(), which has nearly the same signature as
open_ltable() and can replace it in most cases.
In reopen_tables() some of the tables open by a thread can be
closed and reopened. When a MERGE child is affected, the parent
must be closed and reopened too. Closing of the parent is forced
before the first child is closed. Reopen happens in the order of
thd->open_tables. MERGE parents do not attach their children
automatically at open. This is done after all tables are reopened.
So all children are open when attaching them.
Special lock handling like mysql_lock_abort() or mysql_lock_remove()
needs to be suppressed for MERGE children or forwarded to the parent.
This depends on the situation. In loops over all open tables one
suppresses child lock handling. When a single table is touched,
forwarding is done.
Behavioral changes:
===================
This patch changes the behavior of temporary MERGE tables.
Temporary MERGE must have temporary children.
The old behavior was wrong. A temporary table is not locked. Hence
even non-temporary children were not locked. See
Bug 19627 - temporary merge table locking.
You cannot change the union list of a non-temporary MERGE table
when LOCK TABLES is in effect. The following does *not* work:
CREATE TABLE m1 ... ENGINE=MRG_MYISAM ...;
LOCK TABLES t1 WRITE, t2 WRITE, m1 WRITE;
ALTER TABLE m1 ... UNION=(t1,t2) ...;
However, you can do this with a temporary MERGE table.
You cannot create a MERGE table with CREATE ... SELECT, neither
as a temporary MERGE table, nor as a non-temporary MERGE table.
CREATE TABLE m1 ... ENGINE=MRG_MYISAM ... SELECT ...;
Gives error message: table is not BASE TABLE.
2007-11-15 20:25:43 +01:00
|
|
|
bool is_children_attached(void);
|
2000-07-31 21:29:14 +02:00
|
|
|
};
|
|
|
|
|
2007-02-12 13:06:14 +01:00
|
|
|
enum enum_schema_table_state
|
|
|
|
{
|
|
|
|
NOT_PROCESSED= 0,
|
|
|
|
PROCESSED_BY_CREATE_SORT_INDEX,
|
|
|
|
PROCESSED_BY_JOIN_EXEC
|
|
|
|
};
|
2000-07-31 21:29:14 +02:00
|
|
|
|
2004-11-13 11:56:39 +01:00
|
|
|
typedef struct st_foreign_key_info
|
|
|
|
{
|
|
|
|
LEX_STRING *forein_id;
|
|
|
|
LEX_STRING *referenced_db;
|
|
|
|
LEX_STRING *referenced_table;
|
2006-05-02 13:31:39 +02:00
|
|
|
LEX_STRING *update_method;
|
|
|
|
LEX_STRING *delete_method;
|
2007-01-15 10:39:28 +01:00
|
|
|
LEX_STRING *referenced_key_name;
|
2004-11-13 11:56:39 +01:00
|
|
|
List<LEX_STRING> foreign_fields;
|
|
|
|
List<LEX_STRING> referenced_fields;
|
|
|
|
} FOREIGN_KEY_INFO;
|
|
|
|
|
2005-12-22 10:07:47 +01:00
|
|
|
/*
|
|
|
|
Make sure that the order of schema_tables and enum_schema_tables is the same.
|
|
|
|
*/
|
2004-11-13 11:56:39 +01:00
|
|
|
|
|
|
|
enum enum_schema_tables
|
|
|
|
{
|
2005-08-05 11:01:29 +02:00
|
|
|
SCH_CHARSETS= 0,
|
|
|
|
SCH_COLLATIONS,
|
|
|
|
SCH_COLLATION_CHARACTER_SET_APPLICABILITY,
|
|
|
|
SCH_COLUMNS,
|
|
|
|
SCH_COLUMN_PRIVILEGES,
|
2005-12-22 10:07:47 +01:00
|
|
|
SCH_ENGINES,
|
2006-01-30 13:15:23 +01:00
|
|
|
SCH_EVENTS,
|
2006-02-01 14:47:08 +01:00
|
|
|
SCH_FILES,
|
2006-09-14 01:37:40 +02:00
|
|
|
SCH_GLOBAL_STATUS,
|
|
|
|
SCH_GLOBAL_VARIABLES,
|
2005-08-05 11:01:29 +02:00
|
|
|
SCH_KEY_COLUMN_USAGE,
|
|
|
|
SCH_OPEN_TABLES,
|
2006-01-10 16:44:04 +01:00
|
|
|
SCH_PARTITIONS,
|
2005-12-21 19:18:40 +01:00
|
|
|
SCH_PLUGINS,
|
2006-02-16 14:45:05 +01:00
|
|
|
SCH_PROCESSLIST,
|
2007-07-02 13:27:39 +02:00
|
|
|
SCH_PROFILES,
|
2006-05-02 13:31:39 +02:00
|
|
|
SCH_REFERENTIAL_CONSTRAINTS,
|
2005-08-05 11:01:29 +02:00
|
|
|
SCH_PROCEDURES,
|
|
|
|
SCH_SCHEMATA,
|
|
|
|
SCH_SCHEMA_PRIVILEGES,
|
2006-09-14 01:37:40 +02:00
|
|
|
SCH_SESSION_STATUS,
|
|
|
|
SCH_SESSION_VARIABLES,
|
2005-08-05 11:01:29 +02:00
|
|
|
SCH_STATISTICS,
|
|
|
|
SCH_STATUS,
|
|
|
|
SCH_TABLES,
|
|
|
|
SCH_TABLE_CONSTRAINTS,
|
|
|
|
SCH_TABLE_NAMES,
|
|
|
|
SCH_TABLE_PRIVILEGES,
|
|
|
|
SCH_TRIGGERS,
|
2006-01-29 02:44:51 +01:00
|
|
|
SCH_USER_PRIVILEGES,
|
2005-08-05 11:01:29 +02:00
|
|
|
SCH_VARIABLES,
|
2006-01-29 02:44:51 +01:00
|
|
|
SCH_VIEWS
|
2004-11-13 11:56:39 +01:00
|
|
|
};
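The ordering requirement in the comment above exists because the enum value is used as an index into the schema_tables array. A minimal sketch of one way to guard that pairing is shown below. All names are hypothetical: the toy enum has only three entries, and the TOY_SCH_COUNT sentinel is an addition of this sketch that the real enum does not have. The static_assert catches a length mismatch only, not a wrong order.

```cpp
#include <cassert>
#include <cstddef>

// Toy counterpart of enum_schema_tables, shortened to three entries.
enum toy_enum_schema_tables {
  TOY_SCH_CHARSETS = 0,
  TOY_SCH_COLLATIONS,
  TOY_SCH_COLUMNS,
  TOY_SCH_COUNT  // sentinel added for the sketch only
};

struct ToySchemaTable {
  const char *table_name;
};

// Must be kept in the same order as the enum above.
static const ToySchemaTable toy_schema_tables[] = {
  {"CHARACTER_SETS"},
  {"COLLATIONS"},
  {"COLUMNS"}
};

// Compile-time check that enum and array at least have the same length.
static_assert(sizeof(toy_schema_tables) / sizeof(toy_schema_tables[0]) ==
                  static_cast<std::size_t>(TOY_SCH_COUNT),
              "toy_schema_tables and toy_enum_schema_tables differ in length");

// Look up a table description by enum value.
inline const ToySchemaTable *toy_get_schema_table(toy_enum_schema_tables idx) {
  assert(idx >= 0 && idx < TOY_SCH_COUNT);
  return &toy_schema_tables[idx];
}
```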
|
|
|
|
|
|
|
|
|
2007-04-25 14:15:05 +02:00
|
|
|
#define MY_I_S_MAYBE_NULL 1
|
|
|
|
#define MY_I_S_UNSIGNED 2
|
|
|
|
|
|
|
|
|
2007-08-03 00:14:05 +02:00
|
|
|
#define SKIP_OPEN_TABLE 0 // do not open table
|
|
|
|
#define OPEN_FRM_ONLY 1 // open FRM file only
|
|
|
|
#define OPEN_FULL_TABLE 2 // open FRM,MYD, MYI files
|
|
|
|
|
2004-11-13 11:56:39 +01:00
|
|
|
typedef struct st_field_info
|
|
|
|
{
|
|
|
|
const char* field_name;
|
|
|
|
uint field_length;
|
|
|
|
enum enum_field_types field_type;
|
|
|
|
int value;
|
2007-04-25 14:15:05 +02:00
|
|
|
uint field_flags; // Field attributes (maybe_null, signed, unsigned etc.)
|
2004-11-13 11:56:39 +01:00
|
|
|
const char* old_name;
|
2007-08-03 00:14:05 +02:00
|
|
|
uint open_method;
|
2004-11-13 11:56:39 +01:00
|
|
|
} ST_FIELD_INFO;
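The field_flags member is a bit mask built from the MY_I_S_ defines above. A small sketch of how such a mask can be tested, assuming a simplified ToyFieldInfo struct and hypothetical helper functions (the two constants reuse the values 1 and 2 from the defines):

```cpp
#include <cassert>

const unsigned TOY_I_S_MAYBE_NULL = 1;  // same value as MY_I_S_MAYBE_NULL
const unsigned TOY_I_S_UNSIGNED = 2;    // same value as MY_I_S_UNSIGNED

// Cut-down stand-in for ST_FIELD_INFO with only the members used here.
struct ToyFieldInfo {
  const char *field_name;
  unsigned field_length;
  unsigned field_flags;
};

// True if the column may hold NULL.
inline bool toy_maybe_null(const ToyFieldInfo *f) {
  assert(f != nullptr);
  return (f->field_flags & TOY_I_S_MAYBE_NULL) != 0;
}

// True if the column is unsigned.
inline bool toy_is_unsigned(const ToyFieldInfo *f) {
  assert(f != nullptr);
  return (f->field_flags & TOY_I_S_UNSIGNED) != 0;
}
```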
|
|
|
|
|
2005-01-06 12:00:13 +01:00
|
|
|
|
2007-07-06 14:18:49 +02:00
|
|
|
struct TABLE_LIST;
|
2004-11-13 11:56:39 +01:00
|
|
|
typedef class Item COND;
|
|
|
|
|
|
|
|
typedef struct st_schema_table
|
|
|
|
{
|
|
|
|
const char* table_name;
|
|
|
|
ST_FIELD_INFO *fields_info;
|
|
|
|
/* Create information_schema table */
|
2007-07-06 14:18:49 +02:00
|
|
|
TABLE *(*create_table) (THD *thd, TABLE_LIST *table_list);
|
2004-11-13 11:56:39 +01:00
|
|
|
/* Fill table with data */
|
2007-07-06 14:18:49 +02:00
|
|
|
int (*fill_table) (THD *thd, TABLE_LIST *tables, COND *cond);
|
2004-11-13 11:56:39 +01:00
|
|
|
/* Handle fields for old SHOW */
|
|
|
|
int (*old_format) (THD *thd, struct st_schema_table *schema_table);
|
2007-08-03 00:14:05 +02:00
|
|
|
int (*process_table) (THD *thd, TABLE_LIST *tables, TABLE *table,
|
|
|
|
bool res, LEX_STRING *db_name, LEX_STRING *table_name);
|
2004-11-13 11:56:39 +01:00
|
|
|
int idx_field1, idx_field2;
|
2004-12-18 11:49:13 +01:00
|
|
|
bool hidden;
|
2007-08-03 00:14:05 +02:00
|
|
|
uint i_s_requested_object; /* the object we need to open(TABLE | VIEW) */
|
2004-11-13 11:56:39 +01:00
|
|
|
} ST_SCHEMA_TABLE;
|
|
|
|
|
|
|
|
|
2000-09-25 23:33:25 +02:00
|
|
|
#define JOIN_TYPE_LEFT 1
|
|
|
|
#define JOIN_TYPE_RIGHT 2
|
|
|
|
|
2005-07-01 06:05:42 +02:00
|
|
|
#define VIEW_ALGORITHM_UNDEFINED 0
|
|
|
|
#define VIEW_ALGORITHM_TMPTABLE 1
|
|
|
|
#define VIEW_ALGORITHM_MERGE 2
|
2004-07-16 00:15:55 +02:00
|
|
|
|
2006-07-31 16:33:37 +02:00
|
|
|
#define VIEW_SUID_INVOKER 0
|
|
|
|
#define VIEW_SUID_DEFINER 1
|
|
|
|
#define VIEW_SUID_DEFAULT 2
|
|
|
|
|
2004-09-29 15:35:01 +02:00
|
|
|
/* view WITH CHECK OPTION parameter options */
|
2004-09-03 14:18:40 +02:00
|
|
|
#define VIEW_CHECK_NONE 0
|
|
|
|
#define VIEW_CHECK_LOCAL 1
|
|
|
|
#define VIEW_CHECK_CASCADED 2
|
|
|
|
|
2004-09-29 15:35:01 +02:00
|
|
|
/* result of view WITH CHECK OPTION parameter check */
|
|
|
|
#define VIEW_CHECK_OK 0
|
|
|
|
#define VIEW_CHECK_ERROR 1
|
|
|
|
#define VIEW_CHECK_SKIP 2
|
|
|
|
|
2004-07-16 00:15:55 +02:00
|
|
|
struct st_lex;
|
2004-11-17 11:45:05 +01:00
|
|
|
class select_union;
|
2005-03-24 14:32:11 +01:00
|
|
|
class TMP_TABLE_PARAM;
|
2005-01-06 12:00:13 +01:00
|
|
|
|
2007-07-06 14:18:49 +02:00
|
|
|
Item *create_view_field(THD *thd, TABLE_LIST *view, Item **field_ref,
|
2005-07-01 06:05:42 +02:00
|
|
|
const char *name);
|
|
|
|
|
2004-09-14 18:28:29 +02:00
|
|
|
struct Field_translator
|
|
|
|
{
|
|
|
|
Item *item;
|
|
|
|
const char *name;
|
|
|
|
};
|
2004-07-16 00:15:55 +02:00
|
|
|
|
2005-01-06 12:00:13 +01:00
|
|
|
|
2005-08-12 16:57:19 +02:00
|
|
|
/*
|
|
|
|
Column reference of a NATURAL/USING join. Since column references in
|
|
|
|
joins can be both from views and stored tables, may point to either a
|
|
|
|
Field (for tables), or a Field_translator (for views).
|
|
|
|
*/
|
|
|
|
|
2005-08-17 16:19:31 +02:00
|
|
|
class Natural_join_column: public Sql_alloc
|
2005-08-12 16:57:19 +02:00
|
|
|
{
|
|
|
|
public:
|
|
|
|
Field_translator *view_field; /* Column reference of merge view. */
|
|
|
|
Field *table_field; /* Column reference of table or temp view. */
|
2007-07-06 14:18:49 +02:00
|
|
|
TABLE_LIST *table_ref; /* Original base table/view reference. */
|
2005-08-12 16:57:19 +02:00
|
|
|
/*
|
|
|
|
True if a common join column of two NATURAL/USING join operands. Notice
|
|
|
|
that when we have a hierarchy of nested NATURAL/USING joins, a column can
|
|
|
|
be common at some level of nesting but it may not be common at higher
|
|
|
|
levels of nesting. Thus this flag may change depending on at which level
|
|
|
|
we are looking at some column.
|
|
|
|
*/
|
|
|
|
bool is_common;
|
|
|
|
public:
|
2007-07-06 14:18:49 +02:00
|
|
|
Natural_join_column(Field_translator *field_param, TABLE_LIST *tab);
|
|
|
|
Natural_join_column(Field *field_param, TABLE_LIST *tab);
|
2005-08-12 16:57:19 +02:00
  const char *name();
  Item *create_item(THD *thd);
  Field *field();
  const char *table_name();
  const char *db_name();
  GRANT_INFO *grant();
};


/*
  Table reference in the FROM clause.

  These table references can be of several types that correspond to
  different SQL elements. Below we list all types of TABLE_LISTs with
  the necessary conditions to determine when a TABLE_LIST instance
  belongs to a certain type.

  1) table (TABLE_LIST::view == NULL)
     - base table
       (TABLE_LIST::derived == NULL)
     - subquery - TABLE_LIST::table is a temp table
       (TABLE_LIST::derived != NULL)
     - information schema table
       (TABLE_LIST::schema_table != NULL)
       NOTICE: for schema tables TABLE_LIST::field_translation may be != NULL
  2) view (TABLE_LIST::view != NULL)
     - merge    (TABLE_LIST::effective_algorithm == VIEW_ALGORITHM_MERGE)
           also (TABLE_LIST::field_translation != NULL)
     - tmptable (TABLE_LIST::effective_algorithm == VIEW_ALGORITHM_TMPTABLE)
           also (TABLE_LIST::field_translation == NULL)
  3) nested table reference (TABLE_LIST::nested_join != NULL)
     - table sequence - e.g. (t1, t2, t3)
       TODO: how to distinguish from a JOIN?
     - general JOIN
       TODO: how to distinguish from a table sequence?
     - NATURAL JOIN
       (TABLE_LIST::natural_join != NULL)
       - JOIN ... USING
         (TABLE_LIST::join_using_fields != NULL)
*/
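The conditions listed in the comment above can be read as a small decision procedure. The sketch below is one possible reading and not code from the server: ToyTableList keeps only the members the comment mentions, the TOY_ names are hypothetical, and the order in which the conditions are tested (nested join first, then view, then schema table, then derived table) is an assumption of this sketch.

```cpp
#include <cassert>

enum ToyRefKind {
  TOY_BASE_TABLE,
  TOY_DERIVED_TABLE,
  TOY_SCHEMA_TABLE,
  TOY_MERGE_VIEW,
  TOY_TMPTABLE_VIEW,
  TOY_NESTED_JOIN
};

// Same values as VIEW_ALGORITHM_TMPTABLE / VIEW_ALGORITHM_MERGE in this header.
const int TOY_VIEW_ALGORITHM_TMPTABLE = 1;
const int TOY_VIEW_ALGORITHM_MERGE = 2;

// Only the members the comment refers to; the pointers are opaque here.
struct ToyTableList {
  const void *view;
  const void *derived;
  const void *schema_table;
  const void *nested_join;
  int effective_algorithm;
};

// Classify a table reference following the conditions in the comment.
inline ToyRefKind toy_classify(const ToyTableList *t) {
  assert(t != nullptr);
  if (t->nested_join) return TOY_NESTED_JOIN;
  if (t->view)
    return t->effective_algorithm == TOY_VIEW_ALGORITHM_MERGE
               ? TOY_MERGE_VIEW
               : TOY_TMPTABLE_VIEW;
  if (t->schema_table) return TOY_SCHEMA_TABLE;
  if (t->derived) return TOY_DERIVED_TABLE;
  return TOY_BASE_TABLE;
}
```

The two TODO items in the comment apply here as well: the sketch makes no attempt to tell a table sequence from a general JOIN.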
|
|
|
|
|
2007-07-23 18:09:48 +02:00
|
|
|
class Index_hint;
|
2007-07-06 14:18:49 +02:00
|
|
|
struct TABLE_LIST
|
2002-06-11 10:20:31 +02:00
|
|
|
{
|
2007-07-06 14:18:49 +02:00
|
|
|
TABLE_LIST() {} /* Remove gcc warning */
|
2007-04-05 13:24:34 +02:00

  /**
    Prepare TABLE_LIST that consists of one table instance to use in
    simple_open_and_lock_tables
  */
  inline void init_one_table(const char *db_name_arg,
                             const char *table_name_arg,
                             enum thr_lock_type lock_type_arg)
  {
    bzero((char*) this, sizeof(*this));
    db= (char*) db_name_arg;
    table_name= alias= (char*) table_name_arg;
    lock_type= lock_type_arg;
  }

2005-08-12 16:57:19 +02:00
|
|
|
/*
|
|
|
|
List of tables local to a subquery (used by SQL_LIST). Considers
|
|
|
|
views as leaves (unlike 'next_leaf' below). Created at parse time
|
|
|
|
in st_select_lex::add_table_to_list() -> table_list.link_in_list().
|
|
|
|
*/
|
2007-07-06 14:18:49 +02:00
|
|
|
TABLE_LIST *next_local;
|
2004-07-16 00:15:55 +02:00
|
|
|
/* link in a global list of all tables of the query */
|
2007-07-06 14:18:49 +02:00
|
|
|
TABLE_LIST *next_global, **prev_global;
|
2005-01-06 12:00:13 +01:00
|
|
|
char *db, *alias, *table_name, *schema_table_name;
|
2004-07-01 22:46:43 +02:00
|
|
|
char *option; /* Used by cache index */
|
2002-09-20 13:05:18 +02:00
|
|
|
Item *on_expr; /* Used with outer join */
|
2005-03-03 15:38:59 +01:00
|
|
|
/*
|
2005-08-12 16:57:19 +02:00
|
|
|
The structure of ON expression presented in the member above
|
2005-03-03 15:38:59 +01:00
|
|
|
can be changed during certain optimizations. This member
|
|
|
|
contains a snapshot of AND-OR structure of the ON expression
|
|
|
|
made after permanent transformations of the parse tree, and is
|
|
|
|
used to restore ON clause before every reexecution of a prepared
|
|
|
|
statement or stored procedure.
|
|
|
|
*/
|
|
|
|
Item *prep_on_expr;
|
2004-10-19 23:12:55 +02:00
|
|
|
COND_EQUAL *cond_equal; /* Used with outer join */
|
2005-08-12 16:57:19 +02:00
|
|
|
/*
|
|
|
|
During parsing - left operand of NATURAL/USING join where 'this' is
|
|
|
|
the right operand. After parsing (this->natural_join == this) iff
|
|
|
|
'this' represents a NATURAL or USING join operation. Thus after
|
|
|
|
parsing 'this' is a NATURAL/USING join iff (natural_join != NULL).
|
|
|
|
*/
|
2007-07-06 14:18:49 +02:00
|
|
|
TABLE_LIST *natural_join;
|
2005-08-12 16:57:19 +02:00
  /*
    True if 'this' represents a nested join that is a NATURAL JOIN.
    For one of the operands of 'this', the member 'natural_join' points
    to the other operand of 'this'.
  */
  bool is_natural_join;
  /* Field names in a USING clause for JOIN ... USING. */
  List<String> *join_using_fields;
  /*
    Explicitly store the result columns of either a NATURAL/USING join or
    an operand of such a join.
  */
  List<Natural_join_column> *join_columns;
  /* TRUE if join_columns contains all columns of this table reference. */
  bool is_join_columns_complete;

  /*
    List of nodes in a nested join tree, that should be considered as
    leaves with respect to name resolution. The leaves are: views,
    top-most nodes representing NATURAL/USING joins, subqueries, and
    base tables. All of these TABLE_LIST instances contain a
    materialized list of columns. The list is local to a subquery.
  */
2007-07-06 14:18:49 +02:00
|
|
|
TABLE_LIST *next_name_resolution_table;
|
2005-08-12 16:57:19 +02:00
|
|
|
/* Index names in a "... JOIN ... USE/IGNORE INDEX ..." clause. */
|
2007-07-23 18:09:48 +02:00
|
|
|
List<Index_hint> *index_hints;
|
2006-02-16 08:30:53 +01:00
|
|
|
TABLE *table; /* opened table */
|
|
|
|
uint table_id; /* table id (from binlog) for opened table */
|
2004-11-05 16:29:47 +01:00
|
|
|
/*
|
|
|
|
select_result for derived table to pass it from table creation to table
|
|
|
|
filling procedure
|
|
|
|
*/
|
|
|
|
select_union *derived_result;
|
2004-07-16 00:15:55 +02:00
|
|
|
/*
|
|
|
|
Reference from aux_tables to local list entry of main select of
|
|
|
|
multi-delete statement:
|
|
|
|
delete t1 from t2,t1 where t1.a<'B' and t2.b=t1.b;
|
|
|
|
here it will be reference of first occurrence of t1 to second (as you
|
|
|
|
can see, these lists can't be merged)
|
|
|
|
*/
|
2007-07-06 14:18:49 +02:00
|
|
|
TABLE_LIST *correspondent_table;
|
2004-07-16 00:15:55 +02:00
|
|
|
st_select_lex_unit *derived; /* SELECT_LEX_UNIT of derived table */
|
2004-11-13 11:56:39 +01:00
|
|
|
ST_SCHEMA_TABLE *schema_table; /* Information_schema table */
|
|
|
|
st_select_lex *schema_select_lex;
|
2005-08-12 16:57:19 +02:00
|
|
|
/*
|
|
|
|
True when the view field translation table is used to convert
|
|
|
|
schema table fields for backwards compatibility with SHOW command.
|
|
|
|
*/
|
2005-01-24 16:44:54 +01:00
|
|
|
bool schema_table_reformed;
|
2005-03-24 14:32:11 +01:00
|
|
|
TMP_TABLE_PARAM *schema_table_param;
|
2004-07-16 00:15:55 +02:00
|
|
|
/* link to select_lex where this table was used */
|
|
|
|
st_select_lex *select_lex;
|
|
|
|
st_lex *view; /* link on VIEW lex for merging */
|
2004-09-14 18:28:29 +02:00
|
|
|
Field_translator *field_translation; /* array of VIEW fields */
|
2005-07-01 06:05:42 +02:00
|
|
|
/* pointer to element after last one in translation table above */
|
|
|
|
Field_translator *field_translation_end;
|
2005-10-27 23:18:23 +02:00
|
|
|
/*
|
|
|
|
List (based on next_local) of underlying tables of this view. I.e. it
|
|
|
|
does not include the tables of subqueries used in the view. Is set only
|
|
|
|
for merged views.
|
|
|
|
*/
|
2007-07-06 14:18:49 +02:00
|
|
|
TABLE_LIST *merge_underlying_list;
|
2005-10-27 23:18:23 +02:00
|
|
|
/*
|
|
|
|
- 0 for base tables
|
|
|
|
- for a view, the list of all tables of the view (both the underlying
|
|
|
|
tables and the tables used in its subqueries).
|
|
|
|
*/
|
2007-07-06 14:18:49 +02:00
|
|
|
List<TABLE_LIST> *view_tables;
|
2004-07-21 03:26:20 +02:00
|
|
|
/* topmost view this table belongs to */
|
2007-07-06 14:18:49 +02:00
|
|
|
TABLE_LIST *belong_to_view;
|
2005-10-27 23:18:23 +02:00
|
|
|
/*
|
|
|
|
The view directly referencing this table
|
|
|
|
(non-zero only for merged underlying tables of a view).
|
|
|
|
*/
|
2007-07-06 14:18:49 +02:00
|
|
|
TABLE_LIST *referencing_view;
|
Bug#26379 - Combination of FLUSH TABLE and REPAIR TABLE
corrupts a MERGE table
Bug 26867 - LOCK TABLES + REPAIR + merge table result in
memory/cpu hogging
Bug 26377 - Deadlock with MERGE and FLUSH TABLE
Bug 25038 - Waiting TRUNCATE
Bug 25700 - merge base tables get corrupted by
optimize/analyze/repair table
Bug 30275 - Merge tables: flush tables or unlock tables
causes server to crash
Bug 19627 - temporary merge table locking
Bug 27660 - Falcon: merge table possible
Bug 30273 - merge tables: Can't lock file (errno: 155)
The problems were:
Bug 26379 - Combination of FLUSH TABLE and REPAIR TABLE
corrupts a MERGE table
1. A thread trying to lock a MERGE table performs busy waiting while
REPAIR TABLE or a similar table administration task is ongoing on
one or more of its MyISAM tables.
2. A thread trying to lock a MERGE table performs busy waiting until all
threads that did REPAIR TABLE or similar table administration tasks
on one or more of its MyISAM tables in LOCK TABLES segments do UNLOCK
TABLES. The difference against problem #1 is that the busy waiting
takes place *after* the administration task. It is terminated by
UNLOCK TABLES only.
3. Two FLUSH TABLES within a LOCK TABLES segment can invalidate the
lock. This does *not* require a MERGE table. The first FLUSH TABLES
can be replaced by any statement that requires other threads to
reopen the table. In 5.0 and 5.1 a single FLUSH TABLES can provoke
the problem.
Bug 26867 - LOCK TABLES + REPAIR + merge table result in
memory/cpu hogging
Trying DML on a MERGE table, which has a child locked and
repaired by another thread, made an infinite loop in the server.
Bug 26377 - Deadlock with MERGE and FLUSH TABLE
Locking a MERGE table and its children in parent-child order
and flushing the child deadlocked the server.
Bug 25038 - Waiting TRUNCATE
Truncating a MERGE child, while the MERGE table was in use,
let the truncate fail instead of waiting for the table to
become free.
Bug 25700 - merge base tables get corrupted by
optimize/analyze/repair table
Repairing a child of an open MERGE table corrupted the child.
It was necessary to FLUSH the child first.
Bug 30275 - Merge tables: flush tables or unlock tables
causes server to crash
Flushing and optimizing locked MERGE children crashed the server.
Bug 19627 - temporary merge table locking
Use of a temporary MERGE table with non-temporary children
could corrupt the children.
Temporary tables are never locked, so we now prohibit
non-temporary children of a temporary MERGE table.
Bug 27660 - Falcon: merge table possible
It was possible to create a MERGE table with non-MyISAM children.
Bug 30273 - merge tables: Can't lock file (errno: 155)
This was a Windows-only bug. Table administration statements
sometimes failed with "Can't lock file (errno: 155)".
These bugs are fixed by a new implementation of MERGE table open.
When opening a MERGE table in open_tables() we do now add the
child tables to the list of tables to be opened by open_tables()
(the "query_list"). The children are not opened in the handler at
this stage.
After opening the parent, open_tables() opens each child from the
now extended query_list. When the last child is opened, we remove
the children from the query_list again and attach the children to
the parent. This behaves similarly to the old open. However, it does
not open the MyISAM tables directly, but grabs them from the already
open children.
When closing a MERGE table in close_thread_table() we detach the
children only. Closing of the children is done implicitly because
they are in thd->open_tables.
For more detail see the comment at the top of ha_myisammrg.cc.
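The open sequence described above can be modelled in a few lines. This is only a sketch under stated assumptions, not the code in ha_myisammrg.cc: Entry, open_tables_sketch() and merge_open_demo() are hypothetical names, the query list is reduced to a std::vector, and "opening" a table just records its name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one entry of the query list.
struct Entry {
  std::string name;
  std::vector<std::string> children;  // union list; non-empty only for a MERGE parent
  std::ptrdiff_t parent = -1;         // index of the MERGE parent, -1 for ordinary entries
  std::vector<std::string> attached;  // children attached to this parent after open
};

// Walks the list as the text describes and returns the order of opens.
std::vector<std::string> open_tables_sketch(std::vector<Entry> &list) {
  std::vector<std::string> opened;
  for (std::ptrdiff_t i = 0; i < static_cast<std::ptrdiff_t>(list.size()); ++i) {
    opened.push_back(list[i].name);
    if (!list[i].children.empty()) {
      // Parent opened: extend the list with its children, right after it.
      std::vector<Entry> kids;
      for (const std::string &child : list[i].children) {
        Entry kid;
        kid.name = child;
        kid.parent = i;
        kids.push_back(kid);
      }
      list.insert(list.begin() + i + 1, kids.begin(), kids.end());
      continue;
    }
    const std::ptrdiff_t p = list[i].parent;
    if (p >= 0 && i == p + static_cast<std::ptrdiff_t>(list[p].children.size())) {
      // Last child opened: attach all children, then take them off the list.
      list[p].attached = list[p].children;
      list.erase(list.begin() + p + 1, list.begin() + i + 1);
      i = p;  // continue with the entry that follows the parent
    }
  }
  return opened;
}

// Usage: one MERGE parent m1 over t1 and t2, followed by an ordinary table t3.
void merge_open_demo() {
  std::vector<Entry> list(2);
  list[0].name = "m1";
  list[0].children = {"t1", "t2"};
  list[1].name = "t3";
  const std::vector<std::string> order = open_tables_sketch(list);
  assert((order == std::vector<std::string>{"m1", "t1", "t2", "t3"}));
  assert(list.size() == 2);  // the children were removed from the list again
}
```

In this model the list has its original length again after the walk, and the parent only holds the names of children that the walk had already opened.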
Changed from open_ltable() to open_and_lock_tables() in all places
that can be relevant for MERGE tables. The latter can handle tables
added to the list on the fly. When open_ltable() was used in a loop
over a list of tables, the list must be temporarily terminated
after every table for open_and_lock_tables().
table_list->required_type is set to FRMTYPE_TABLE to avoid open of
special tables. Handling of derived tables is suppressed.
These details are handled by the new function
open_n_lock_single_table(), which has nearly the same signature as
open_ltable() and can replace it in most cases.
In reopen_tables() some of the tables opened by a thread can be
closed and reopened. When a MERGE child is affected, the parent
must be closed and reopened too. Closing of the parent is forced
before the first child is closed. Reopen happens in the order of
thd->open_tables. MERGE parents do not attach their children
automatically at open. This is done after all tables are reopened.
So all children are open when attaching them.
Special lock handling like mysql_lock_abort() or mysql_lock_remove()
needs to be suppressed for MERGE children or forwarded to the parent.
This depends on the situation. In loops over all open tables one
suppresses child lock handling. When a single table is touched,
forwarding is done.
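The two cases (suppress in loops, forward for a single table) might look roughly like this. It is a sketch with hypothetical names: OpenTable, abort_lock_single() and abort_lock_all() are not the server's functions, and a counter stands in for the real lock operation.

```cpp
#include <cassert>
#include <vector>

// Hypothetical open-table record; 'parent' is non-null only for a MERGE child.
struct OpenTable {
  const char *name;
  OpenTable *parent;
  int aborts;  // counts how often the lock operation reached this table
};

// Single table touched: forward the operation to the MERGE parent, if any.
void abort_lock_single(OpenTable &table) {
  OpenTable *target = table.parent ? table.parent : &table;
  ++target->aborts;
}

// Loop over all open tables: children are skipped, the parent is visited anyway.
void abort_lock_all(std::vector<OpenTable *> &open_tables) {
  for (OpenTable *table : open_tables)
    if (!table->parent)
      ++table->aborts;
}

// Usage: the child never receives the operation itself in either case.
void lock_forwarding_demo() {
  OpenTable m1{"m1", nullptr, 0};
  OpenTable t1{"t1", &m1, 0};
  abort_lock_single(t1);
  assert(m1.aborts == 1 && t1.aborts == 0);
  std::vector<OpenTable *> all{&m1, &t1};
  abort_lock_all(all);
  assert(m1.aborts == 2 && t1.aborts == 0);
}
```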
Behavioral changes:
===================
This patch changes the behavior of temporary MERGE tables.
Temporary MERGE must have temporary children.
The old behavior was wrong. A temporary table is not locked. Hence
even non-temporary children were not locked. See
Bug 19627 - temporary merge table locking.
You cannot change the union list of a non-temporary MERGE table
when LOCK TABLES is in effect. The following does *not* work:
CREATE TABLE m1 ... ENGINE=MRG_MYISAM ...;
LOCK TABLES t1 WRITE, t2 WRITE, m1 WRITE;
ALTER TABLE m1 ... UNION=(t1,t2) ...;
However, you can do this with a temporary MERGE table.
You cannot create a MERGE table with CREATE ... SELECT, neither
as a temporary MERGE table, nor as a non-temporary MERGE table.
CREATE TABLE m1 ... ENGINE=MRG_MYISAM ... SELECT ...;
Gives error message: table is not BASE TABLE.
2007-11-15 20:25:43 +01:00
|
|
|
/* Ptr to parent MERGE table list item. See top comment in ha_myisammrg.cc */
|
|
|
|
TABLE_LIST *parent_l;
|
2005-10-27 23:18:23 +02:00
|
|
|
/*
|
2005-10-29 11:11:34 +02:00
|
|
|
Security context (non-zero only for tables which belong
|
|
|
|
to view with SQL SECURITY DEFINER)
|
2005-10-27 23:18:23 +02:00
|
|
|
*/
|
|
|
|
Security_context *security_ctx;
|
|
|
|
/*
|
2005-10-29 11:11:34 +02:00
|
|
|
This view security context (non-zero only for views with
|
|
|
|
SQL SECURITY DEFINER)
|
2005-10-27 23:18:23 +02:00
|
|
|
*/
|
|
|
|
Security_context *view_sctx;
|
2005-08-12 16:57:19 +02:00
|
|
|
/*
|
|
|
|
List of all base tables local to a subquery including all view
|
|
|
|
tables. Unlike 'next_local', in this list views are *not*
|
|
|
|
leaves. Created in setup_tables() -> make_leaves_list().
|
|
|
|
*/
|
2006-07-25 14:23:25 +02:00
|
|
|
bool allowed_show;
|
2007-07-06 14:18:49 +02:00
|
|
|
TABLE_LIST *next_leaf;
|
2004-07-16 00:15:55 +02:00
|
|
|
Item *where; /* VIEW WHERE clause condition */
|
2004-09-03 14:18:40 +02:00
|
|
|
Item *check_option; /* WITH CHECK OPTION condition */
|
Patch for the following bugs:
- BUG#11986: Stored routines and triggers can fail if the code
has a non-ascii symbol
- BUG#16291: mysqldump corrupts string-constants with non-ascii-chars
- BUG#19443: INFORMATION_SCHEMA does not support charsets properly
- BUG#21249: Character set of SP-var can be ignored
- BUG#25212: Character set of string constant is ignored (stored routines)
- BUG#25221: Character set of string constant is ignored (triggers)
There were a few general problems that caused these bugs:
1. Character set information of the original (definition) query for views,
triggers, stored routines and events was lost.
2. mysqldump output the query in the client character set, which can be
inappropriate for encoding the definition query.
3. INFORMATION_SCHEMA used strings with mixed encodings to display the
object definition.
1. No query-definition-character set.
In order to compile query into execution code, some extra data (such as
environment variables or the database character set) is used. The problem
here was that this context was not preserved. So, on the next load it can
differ from the original one, thus the result will be different.
The context contains the following data:
- client character set;
- connection collation (character set and collation);
- collation of the owner database;
The fix is to store this context and use it each time we parse (compile)
and execute the object (stored routine, trigger, ...).
2. Wrong mysqldump-output.
The original query can contain several encodings (by means of character set
introducers). The problem here was that we tried to convert original query
to the mysqldump-client character set.
Moreover, we stored queries in different character sets for different
objects (views, for one, used UTF8, triggers used original character set).
The solution is
- to store definition queries in the original character set;
- to change SHOW CREATE statement to output definition query in the
binary character set (i.e. without any conversion);
- introduce SHOW CREATE TRIGGER statement;
- to dump special statements to switch the context to the original one
before dumping and restore it afterwards.
Note, in order to preserve the database collation at the creation time,
an additional ALTER DATABASE might be used (to temporarily switch the database
collation back to the original value). In this case, ALTER DATABASE
privilege will be required. This is a backward-incompatible change.
3. INFORMATION_SCHEMA showed non-UTF8 strings
The fix is to generate UTF8-query during the parsing, store it in the object
and show it in the INFORMATION_SCHEMA.
Basically, the idea is to create a copy of the original query and convert it to
UTF8. Character set introducers are removed and all text literals are
converted to UTF8.
This UTF8 query is intended to provide user-readable output. It must not be
used to recreate the object. Specialized SHOW CREATE statements should be
used for this.
The reason for this limitation is the following: the original query can
contain symbols from several character sets (by means of character set
introducers).
Example:
- original query:
CREATE VIEW v1 AS SELECT _cp1251 'Hello' AS c1;
- UTF8 query (for INFORMATION_SCHEMA):
CREATE VIEW v1 AS SELECT 'Hello' AS c1;
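The "store the context, switch to it for parsing, restore it afterwards" idea from point 1 can be sketched with a scope guard. Everything here is an assumption made for illustration: CreationCtxSketch, SessionSketch, CtxGuard and parse_with_ctx() are hypothetical names, and the three string fields only mirror the context items listed above.

```cpp
#include <cassert>
#include <string>

// Hypothetical creation context: the three items the text lists.
struct CreationCtxSketch {
  std::string client_cs;      // client character set
  std::string connection_cl;  // connection collation
  std::string db_cl;          // collation of the owner database
};

// Hypothetical session holding the currently active context.
struct SessionSketch {
  CreationCtxSketch ctx;
};

// Switches the session to the stored context and restores the old one on exit.
class CtxGuard {
public:
  CtxGuard(SessionSketch &session, const CreationCtxSketch &stored)
      : session_(session), saved_(session.ctx) {
    session_.ctx = stored;
  }
  ~CtxGuard() { session_.ctx = saved_; }
  CtxGuard(const CtxGuard &) = delete;
  CtxGuard &operator=(const CtxGuard &) = delete;

private:
  SessionSketch &session_;
  CreationCtxSketch saved_;
};

// Stands in for parsing: reports which client character set was active.
std::string parse_with_ctx(SessionSketch &session, const CreationCtxSketch &stored) {
  CtxGuard guard(session, stored);
  return session.ctx.client_cs;
}

// Usage: the object is parsed under its own context, the session keeps its own.
void creation_ctx_demo() {
  SessionSketch session;
  session.ctx = CreationCtxSketch{"utf8", "utf8_general_ci", "utf8_general_ci"};
  const CreationCtxSketch stored{"cp1251", "cp1251_general_ci", "latin1_swedish_ci"};
  assert(parse_with_ctx(session, stored) == "cp1251");
  assert(session.ctx.client_cs == "utf8");
}
```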
2007-06-28 19:34:54 +02:00
|
|
|
LEX_STRING select_stmt; /* text of (CREATE/SELECT) statement */
|
2004-10-07 00:45:06 +02:00
|
|
|
LEX_STRING md5; /* md5 of query text */
|
2004-07-16 00:15:55 +02:00
|
|
|
LEX_STRING source; /* source of CREATE VIEW */
|
2004-10-07 00:45:06 +02:00
|
|
|
LEX_STRING view_db; /* saved view database */
|
|
|
|
LEX_STRING view_name; /* saved view name */
|
2004-07-16 00:15:55 +02:00
|
|
|
LEX_STRING timestamp; /* GMT time stamp of last operation */
|
2005-09-14 09:53:09 +02:00
|
|
|
st_lex_user definer; /* definer of view */
|
2004-07-16 00:15:55 +02:00
|
|
|
ulonglong file_version; /* version of file's field set */
|
2004-09-03 20:38:01 +02:00
|
|
|
ulonglong updatable_view; /* VIEW can be updated */
|
2004-07-16 00:15:55 +02:00
|
|
|
ulonglong revision; /* revision control number */
|
|
|
|
ulonglong algorithm; /* 0 any, 1 tmp tables, 2 merging */
|
2005-09-14 09:53:09 +02:00
|
|
|
ulonglong view_suid; /* view is suid (TRUE by default) */
|
2004-09-03 14:18:40 +02:00
|
|
|
ulonglong with_check; /* WITH CHECK OPTION */
|
2004-10-07 14:43:04 +02:00
|
|
|
/*
|
|
|
|
effective value of WITH CHECK OPTION (differs for temporary table
|
|
|
|
algorithm)
|
|
|
|
*/
|
|
|
|
uint8 effective_with_check;
|
2005-07-01 06:05:42 +02:00
|
|
|
uint8 effective_algorithm; /* which algorithm was really used */
|
2004-07-16 00:15:55 +02:00
|
|
|
GRANT_INFO grant;
|
2004-11-24 12:56:51 +01:00
|
|
|
/* data needed by some engines in query cache */
|
|
|
|
ulonglong engine_data;
|
|
|
|
/* callback function for asking handler about caching in query cache */
|
|
|
|
qc_engine_callback callback_func;
|
2000-09-25 23:33:25 +02:00
|
|
|
thr_lock_type lock_type;
|
2002-09-20 13:05:18 +02:00
|
|
|
uint outer_join; /* Which join type */
|
2004-06-25 15:52:01 +02:00
|
|
|
uint shared; /* Used in multi-upd */
|
WL#3817: Simplify string / memory area types and make things more consistent (first part)
The following type conversions were done:
- Changed byte to uchar
- Changed gptr to uchar*
- Change my_string to char *
- Change my_size_t to size_t
- Change size_s to size_t
Removed declaration of byte, gptr, my_string, my_size_t and size_s.
The following function parameter changes were made:
- All string functions in mysys/strings were changed to use size_t
instead of uint for string lengths.
- All read()/write() functions changed to use size_t (including vio).
- All protocol functions changed to use size_t instead of uint
- Functions that used a pointer to a string length were changed to use size_t*
- Changed malloc(), free() and related functions from using gptr to use void *
as this requires fewer casts in the code and is more in line with how the
standard functions work.
- Added extra length argument to dirname_part() to return the length of the
created string.
- Changed (at least) following functions to take uchar* as argument:
- db_dump()
- my_net_write()
- net_write_command()
- net_store_data()
- DBUG_DUMP()
- decimal2bin() & bin2decimal()
- Changed my_compress() and my_uncompress() to use size_t. Changed one
argument to my_uncompress() from a pointer to a value as we only return
one value (makes function easier to use).
- Changed type of 'pack_data' argument to packfrm() to avoid casts.
- Changed in readfrm() and writefrm(), ha_discover and handler::discover()
the type for argument 'frmdata' to uchar** to avoid casts.
- Changed most Field functions to use uchar* instead of char* (reduced a lot of
casts).
- Changed field->val_xxx(xxx, new_ptr) to take const pointers.
Other changes:
- Removed a lot of unneeded casts
- Added a few new casts required by other changes
- Added some casts to my_multi_malloc() arguments for safety (as string lengths
need to be uint, not size_t).
- Fixed all calls to hash-get-key functions to use size_t*. (Needed to be done
explicitly as this conflict was often hidden by casting the function to
hash_get_key).
- Changed some buffers to memory regions to uchar* to avoid casts.
- Changed some string lengths from uint to size_t.
- Changed field->ptr to be uchar* instead of char*. This allowed us to
get rid of a lot of casts.
- Some changes from true -> TRUE, false -> FALSE, unsigned char -> uchar
- Include zlib.h in some files as we needed declaration of crc32()
- Changed MY_FILE_ERROR to be (size_t) -1.
- Changed many variables to hold the result of my_read() / my_write() to be
size_t. This was needed to properly detect errors (which are
returned as (size_t) -1).
- Removed some very old VMS code
- Changed packfrm()/unpackfrm() to not be depending on uint size
(portability fix)
- Removed windows specific code to restore cursor position as this
causes slowdown on windows and we should not mix read() and pread()
calls anyway as this is not thread safe. Updated function comment to
reflect this. Changed function that depended on original behavior of
my_pwrite() to itself restore the cursor position (one such case).
- Added some missing checking of return value of malloc().
- Changed definition of MOD_PAD_CHAR_TO_FULL_LENGTH to avoid 'long' overflow.
- Changed type of table_def::m_size from my_size_t to ulong to reflect that
m_size is the number of elements in the array, not a string/memory
length.
- Moved THD::max_row_length() to table.cc (as it's not depending on THD).
Inlined max_row_length_blob() into this function.
- More function comments
- Fixed some compiler warnings when compiled without partitions.
- Removed setting of LEX_STRING() arguments in declaration (portability fix).
- Some trivial indentation/variable name changes.
- Some trivial code simplifications:
- Replaced some calls to alloc_root + memcpy to use
strmake_root()/strdup_root().
- Changed some calls from memdup() to strmake() (Safety fix)
- Simpler loops in client-simple.c
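Why a (size_t) -1 error value should be kept in a size_t variable can be shown with a small sketch. The names FILE_ERROR_SKETCH, read_sketch(), failed() and failed_after_narrowing() are hypothetical, and the narrowing scenario is an assumption about the kind of mistake the change guards against, not a quote from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for an error value defined as (size_t) -1.
constexpr std::size_t FILE_ERROR_SKETCH = static_cast<std::size_t>(-1);

// Hypothetical read function: returns the byte count, or the error value.
std::size_t read_sketch(bool fail, std::size_t count) {
  return fail ? FILE_ERROR_SKETCH : count;
}

// Result kept in size_t: the comparison against the error value is exact.
bool failed(std::size_t result) { return result == FILE_ERROR_SKETCH; }

// Result squeezed through a 32-bit variable first. Where size_t is wider than
// 32 bits, the truncated value no longer equals the error value.
bool failed_after_narrowing(std::size_t result) {
  const std::uint32_t narrow = static_cast<std::uint32_t>(result);
  return narrow == FILE_ERROR_SKETCH;
}

void size_t_error_demo() {
  assert(failed(read_sketch(true, 0)));
  assert(!failed(read_sketch(false, 10)));
  // The narrowed check only works where size_t happens to be 32 bits wide.
  const bool same_width = sizeof(std::size_t) == sizeof(std::uint32_t);
  assert(failed_after_narrowing(read_sketch(true, 0)) == same_width);
}
```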
2007-05-10 11:59:39 +02:00
|
|
|
size_t db_length;
|
|
|
|
size_t table_name_length;
|
2004-07-22 16:52:04 +02:00
|
|
|
bool updatable; /* VIEW/TABLE can be updated now */
|
2002-09-20 13:05:18 +02:00
|
|
|
bool straight; /* optimize with prev table */
|
|
|
|
bool updating; /* for replicate-do/ignore table */
|
2004-06-11 07:27:21 +02:00
|
|
|
bool force_index; /* prefer index over table scan */
|
|
|
|
bool ignore_leaves; /* preload only non-leaf nodes */
|
|
|
|
table_map dep_tables; /* tables the table depends on */
|
2004-07-01 22:46:43 +02:00
|
|
|
table_map on_expr_dep_tables; /* tables on expression depends on */
|
2004-06-11 07:27:21 +02:00
|
|
|
struct st_nested_join *nested_join; /* if the element is a nested join */
|
2007-07-06 14:18:49 +02:00
|
|
|
TABLE_LIST *embedding; /* nested join containing the table */
|
|
|
|
List<TABLE_LIST> *join_list;/* join list the table belongs to */
|
2004-06-25 15:52:01 +02:00
|
|
|
bool cacheable_table; /* stop PS caching */
|
2004-10-07 00:45:06 +02:00
|
|
|
/* used in multi-upd/views privilege check */
|
2004-07-16 00:15:55 +02:00
|
|
|
bool table_in_first_from_clause;
|
|
|
|
bool skip_temporary; /* this table shouldn't be temporary */
|
2004-10-07 00:45:06 +02:00
|
|
|
/* TRUE if this merged view contains an auto_increment field */
|
2004-07-16 00:15:55 +02:00
|
|
|
bool contain_auto_increment;
|
2005-05-11 01:31:13 +02:00
|
|
|
bool multitable_view; /* TRUE iff this is a multitable view */
|
2005-09-01 11:36:42 +02:00
|
|
|
bool compact_view_format; /* Use compact format for SHOW CREATE VIEW */
|
2005-07-01 06:05:42 +02:00
|
|
|
/* TRUE <=> view WHERE clause has been processed */
|
|
|
|
bool where_processed;
|
2007-05-31 23:15:40 +02:00
|
|
|
/* TRUE <=> VIEW CHECK OPTION expression has been processed */
|
|
|
|
bool check_option_processed;
|
2004-08-24 14:37:51 +02:00
|
|
|
/* FRMTYPE_ERROR if any type is acceptable */
|
|
|
|
enum frm_type_enum required_type;
|
2005-12-21 19:18:40 +01:00
|
|
|
handlerton *db_type; /* table_type for handler */
|
2004-07-16 00:15:55 +02:00
|
|
|
char timestamp_buffer[20]; /* buffer for timestamp (19+1) */
|
2005-03-04 14:35:28 +01:00
|
|
|
/*
|
|
|
|
This TABLE_LIST object is just a placeholder for prelocking; it will be
|
|
|
|
used for implicit LOCK TABLES only and won't be used in a real statement.
|
|
|
|
*/
|
|
|
|
bool prelocking_placeholder;
|
2007-05-11 19:51:03 +02:00
|
|
|
/*
|
|
|
|
This TABLE_LIST object corresponds to the table to be created
|
|
|
|
so it is possible that it does not exist (used in CREATE TABLE
|
|
|
|
... SELECT implementation).
|
|
|
|
*/
|
|
|
|
bool create;
|
2007-11-23 15:21:24 +01:00
|
|
|
bool internal_tmp_table;
|
2004-07-01 22:46:43 +02:00
|
|
|
|
|
|
|
|
2007-06-28 19:34:54 +02:00
|
|
|
/* View creation context. */
|
|
|
|
|
|
|
|
View_creation_ctx *view_creation_ctx;
|
|
|
|
|
|
|
|
/*
|
|
|
|
Attributes to save/load view creation context in/from frm-file.
|
|
|
|
|
|
|
|
They are required only to be able to use the existing parser to load
|
|
|
|
view-definition file. As soon as the parser parsed the file, view
|
|
|
|
creation context is initialized and the attributes become redundant.
|
|
|
|
|
|
|
|
These attributes MUST NOT be used for any purposes but the parsing.
|
|
|
|
*/
|
|
|
|
|
|
|
|
LEX_STRING view_client_cs_name;
|
|
|
|
LEX_STRING view_connection_cl_name;
|
|
|
|
|
|
|
|
/*
|
|
|
|
View definition (SELECT-statement) in UTF8 form.
|
|
|
|
*/
|
|
|
|
|
|
|
|
LEX_STRING view_body_utf8;
|
|
|
|
|
|
|
|
/* End of view definition context. */
|
|
|
|
|
A fix and a test case for Bug#26141 mixing table types in trigger
causes full table lock on innodb table.
Also fixes Bug#28502 Triggers that update another innodb table
will block on X lock unnecessarily (duplicate).
Code review fixes.
Both bugs' synopses are misleading: InnoDB table is
not X locked. The statements, however, cannot proceed concurrently,
but this happens due to lock conflicts for tables used in triggers,
not for the InnoDB table.
If a user had an InnoDB table, and two triggers, AFTER UPDATE and
AFTER INSERT, competing for different resources (e.g. two distinct
MyISAM tables), then these two triggers would not be able to execute
concurrently. Moreover, INSERTS/UPDATES of the InnoDB table would
not be able to run concurrently.
The problem had other side-effects (see respective bug reports).
This behavior was a consequence of a shortcoming of the pre-locking
algorithm, which would not distinguish between different DML operations
(e.g. INSERT and DELETE) and pre-lock all the tables
that are used by any trigger defined on the subject table.
The idea of the fix is to extend the pre-locking algorithm to keep track,
for each table, what DML operation it is used for and not
load triggers that are known to never be fired.
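One way to picture the per-table tracking is a small bitmap with one bit per DML event. This is a sketch under assumptions: SketchEvent, event_bit(), TableRefSketch, mark_event() and must_preload() are hypothetical names, and the bit layout is chosen for the example rather than taken from the server.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical event numbering; the real order is not stated in the text.
enum SketchEvent : std::uint8_t { EV_INSERT = 0, EV_UPDATE = 1, EV_DELETE = 2 };

// One bit per event.
constexpr std::uint8_t event_bit(SketchEvent e) {
  return static_cast<std::uint8_t>(1u << static_cast<unsigned>(e));
}

// Hypothetical table reference carrying the event bitmap.
struct TableRefSketch {
  std::uint8_t trg_event_map = 0;
};

// Called once per DML operation the statement performs on the table.
void mark_event(TableRefSketch &table, SketchEvent e) {
  table.trg_event_map |= event_bit(e);
}

// A trigger for event 'e' needs pre-loading only if the statement can fire it.
bool must_preload(const TableRefSketch &table, SketchEvent e) {
  return (table.trg_event_map & event_bit(e)) != 0;
}

// Usage: an INSERT-only statement leaves UPDATE and DELETE triggers alone.
void trg_event_map_demo() {
  TableRefSketch table;
  mark_event(table, EV_INSERT);
  assert(must_preload(table, EV_INSERT));
  assert(!must_preload(table, EV_UPDATE));
  assert(!must_preload(table, EV_DELETE));
}
```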
2007-07-12 20:26:41 +02:00
|
|
|
/**
|
|
|
|
Indicates what triggers we need to pre-load for this TABLE_LIST
|
|
|
|
when opening an associated TABLE. This is filled after
|
|
|
|
the parsed tree is created.
|
|
|
|
*/
|
|
|
|
uint8 trg_event_map;
|
|
|
|
|
2007-08-03 00:14:05 +02:00
|
|
|
uint i_s_requested_object;
|
|
|
|
bool has_db_lookup_value;
|
|
|
|
bool has_table_lookup_value;
|
|
|
|
uint table_open_method;
|
2007-02-12 13:06:14 +01:00
|
|
|
enum enum_schema_table_state schema_table_state;
|
2004-07-16 00:15:55 +02:00
|
|
|
void calc_md5(char *buffer);
|
2005-10-27 23:18:23 +02:00
|
|
|
void set_underlying_merge();
|
2004-09-29 15:35:01 +02:00
|
|
|
int view_check_option(THD *thd, bool ignore_failure);
|
2005-10-27 23:18:23 +02:00
|
|
|
bool setup_underlying(THD *thd);
|
2004-11-08 00:54:23 +01:00
|
|
|
void cleanup_items();
|
2007-05-11 19:51:03 +02:00
|
|
|
bool placeholder()
|
|
|
|
{
|
|
|
|
return derived || view || schema_table || (create && !table->db_stat) ||
|
|
|
|
!table;
|
|
|
|
}
|
2004-07-01 22:46:43 +02:00
|
|
|
void print(THD *thd, String *str);
|
2007-07-06 14:18:49 +02:00
|
|
|
bool check_single_table(TABLE_LIST **table, table_map map,
|
|
|
|
TABLE_LIST *view);
|
2004-09-15 22:42:56 +02:00
|
|
|
bool set_insert_values(MEM_ROOT *mem_root);
|
2005-07-01 06:05:42 +02:00
|
|
|
void hide_view_error(THD *thd);
|
2007-07-06 14:18:49 +02:00
|
|
|
TABLE_LIST *find_underlying_table(TABLE *table);
|
|
|
|
TABLE_LIST *first_leaf_for_name_resolution();
|
|
|
|
TABLE_LIST *last_leaf_for_name_resolution();
|
2005-08-12 16:57:19 +02:00
|
|
|
bool is_leaf_for_name_resolution();
|
2007-07-06 14:18:49 +02:00
|
|
|
inline TABLE_LIST *top_table()
|
2005-08-02 21:54:49 +02:00
|
|
|
{ return belong_to_view ? belong_to_view : this; }
|
2005-07-01 06:05:42 +02:00
|
|
|
inline bool prepare_check_option(THD *thd)
|
|
|
|
{
|
|
|
|
bool res= FALSE;
|
|
|
|
if (effective_with_check)
|
|
|
|
res= prep_check_option(thd, effective_with_check);
|
|
|
|
return res;
|
|
|
|
}
|
|
|
|
inline bool prepare_where(THD *thd, Item **conds,
|
|
|
|
bool no_where_clause)
|
|
|
|
{
|
|
|
|
if (effective_algorithm == VIEW_ALGORITHM_MERGE)
|
|
|
|
return prep_where(thd, conds, no_where_clause);
|
|
|
|
return FALSE;
|
|
|
|
}
|
2005-10-27 23:18:23 +02:00
|
|
|
|
|
|
|
void register_want_access(ulong want_access);
|
|
|
|
bool prepare_security(THD *thd);
|
|
|
|
#ifndef NO_EMBEDDED_ACCESS_CHECKS
|
|
|
|
Security_context *find_view_security_context(THD *thd);
|
|
|
|
bool prepare_view_securety_context(THD *thd);
|
|
|
|
#endif
|
2006-07-06 21:59:04 +02:00
|
|
|
/*
|
|
|
|
Cleanup for re-execution in a prepared statement or a stored
|
|
|
|
procedure.
|
|
|
|
*/
|
|
|
|
void reinit_before_use(THD *thd);
|
2006-11-01 02:31:56 +01:00
|
|
|
Item_subselect *containing_subselect();
|
2005-10-27 23:18:23 +02:00
|
|
|
|
2007-03-05 18:08:41 +01:00
|
|
|
/*
|
|
|
|
Compiles the tagged hints list and fills up st_table::keys_in_use_for_query,
|
|
|
|
st_table::keys_in_use_for_group_by, st_table::keys_in_use_for_order_by,
|
|
|
|
st_table::force_index and st_table::covering_keys.
|
|
|
|
*/
|
|
|
|
bool process_index_hints(TABLE *table);
|
|
|
|
|
Bug#26379 - Combination of FLUSH TABLE and REPAIR TABLE
corrupts a MERGE table
Bug 26867 - LOCK TABLES + REPAIR + merge table result in
memory/cpu hogging
Bug 26377 - Deadlock with MERGE and FLUSH TABLE
Bug 25038 - Waiting TRUNCATE
Bug 25700 - merge base tables get corrupted by
optimize/analyze/repair table
Bug 30275 - Merge tables: flush tables or unlock tables
causes server to crash
Bug 19627 - temporary merge table locking
Bug 27660 - Falcon: merge table possible
Bug 30273 - merge tables: Can't lock file (errno: 155)
The problems were:
Bug 26379 - Combination of FLUSH TABLE and REPAIR TABLE
corrupts a MERGE table
1. A thread trying to lock a MERGE table performs busy waiting while
REPAIR TABLE or a similar table administration task is ongoing on
one or more of its MyISAM tables.
2. A thread trying to lock a MERGE table performs busy waiting until all
threads that did REPAIR TABLE or similar table administration tasks
on one or more of its MyISAM tables in LOCK TABLES segments do UNLOCK
TABLES. The difference against problem #1 is that the busy waiting
takes place *after* the administration task. It is terminated by
UNLOCK TABLES only.
3. Two FLUSH TABLES within a LOCK TABLES segment can invalidate the
lock. This does *not* require a MERGE table. The first FLUSH TABLES
can be replaced by any statement that requires other threads to
reopen the table. In 5.0 and 5.1 a single FLUSH TABLES can provoke
the problem.
Bug 26867 - LOCK TABLES + REPAIR + merge table result in
memory/cpu hogging
Trying DML on a MERGE table, which has a child locked and
repaired by another thread, made an infinite loop in the server.
Bug 26377 - Deadlock with MERGE and FLUSH TABLE
Locking a MERGE table and its children in parent-child order
and flushing the child deadlocked the server.
Bug 25038 - Waiting TRUNCATE
Truncating a MERGE child, while the MERGE table was in use,
let the truncate fail instead of waiting for the table to
become free.
Bug 25700 - merge base tables get corrupted by
optimize/analyze/repair table
Repairing a child of an open MERGE table corrupted the child.
It was necessary to FLUSH the child first.
Bug 30275 - Merge tables: flush tables or unlock tables
causes server to crash
Flushing and optimizing locked MERGE children crashed the server.
Bug 19627 - temporary merge table locking
Use of a temporary MERGE table with non-temporary children
could corrupt the children.
Temporary tables are never locked, so we now prohibit
non-temporary children of a temporary MERGE table.
Bug 27660 - Falcon: merge table possible
It was possible to create a MERGE table with non-MyISAM children.
Bug 30273 - merge tables: Can't lock file (errno: 155)
This was a Windows-only bug. Table administration statements
sometimes failed with "Can't lock file (errno: 155)".
These bugs are fixed by a new implementation of MERGE table open.
When opening a MERGE table in open_tables() we do now add the
child tables to the list of tables to be opened by open_tables()
(the "query_list"). The children are not opened in the handler at
this stage.
After opening the parent, open_tables() opens each child from the
now extended query_list. When the last child is opened, we remove
the children from the query_list again and attach the children to
the parent. This behaves similarly to the old open. However, it does
not open the MyISAM tables directly, but grabs them from the already
open children.
When closing a MERGE table in close_thread_table() we detach the
children only. Closing of the children is done implicitly because
they are in thd->open_tables.
For more detail see the comment at the top of ha_myisammrg.cc.
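The open sequence described above can be pictured with a small toy model. This is only a sketch under assumptions: ToyTable, toy_open_tables() and the name-keyed catalog are hypothetical stand-ins and do not reproduce the server's TABLE_LIST handling or open_tables(). It shows the parent extending the list while the list is being walked, and the children being attached only after all of them are open.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an open table; not the server's TABLE struct.
struct ToyTable {
  std::string name;
  std::vector<std::string> child_names;  // non-empty only for a MERGE parent
  std::vector<ToyTable*> attached;       // children attached after open
  bool is_open = false;
};

// Walks query_list and opens each entry. A MERGE parent appends its
// children to the list, so the same loop opens them as ordinary tables.
// When everything is open, the appended entries are removed again and
// the children are attached to their parent.
inline std::vector<ToyTable*> toy_open_tables(
    std::map<std::string, ToyTable>& catalog,
    std::vector<std::string>& query_list)
{
  const std::size_t original_size = query_list.size();
  std::vector<ToyTable*> open_tables;
  std::vector<ToyTable*> parents;

  for (std::size_t i = 0; i < query_list.size(); ++i) {  // list may grow
    const std::string name = query_list[i];  // copy: push_back may reallocate
    ToyTable& table = catalog.at(name);
    if (table.is_open)
      continue;                              // already opened via another entry
    table.is_open = true;
    open_tables.push_back(&table);
    if (!table.child_names.empty()) {
      parents.push_back(&table);
      for (const std::string& child : table.child_names)
        query_list.push_back(child);         // extend the list on the fly
    }
  }

  query_list.resize(original_size);          // remove the children again
  for (ToyTable* parent : parents) {
    for (const std::string& child : parent->child_names) {
      ToyTable& child_table = catalog.at(child);
      assert(child_table.is_open);           // all children are open by now
      parent->attached.push_back(&child_table);
    }
  }
  return open_tables;
}
```

In this model the children end up in the returned list of open tables in their own right, which is why closing the parent only needs to detach them.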
Changed from open_ltable() to open_and_lock_tables() in all places
that can be relevant for MERGE tables. The latter can handle tables
added to the list on the fly. When open_ltable() was used in a loop
over a list of tables, the list must be temporarily terminated
after every table for open_and_lock_tables().
table_list->required_type is set to FRMTYPE_TABLE to avoid open of
special tables. Handling of derived tables is suppressed.
These details are handled by the new function
open_n_lock_single_table(), which has nearly the same signature as
open_ltable() and can replace it in most cases.
In reopen_tables() some of the tables open by a thread can be
closed and reopened. When a MERGE child is affected, the parent
must be closed and reopened too. Closing of the parent is forced
before the first child is closed. Reopen happens in the order of
thd->open_tables. MERGE parents do not attach their children
automatically at open. This is done after all tables are reopened.
So all children are open when attaching them.
Special lock handling like mysql_lock_abort() or mysql_lock_remove()
needs to be suppressed for MERGE children or forwarded to the parent.
This depends on the situation. In loops over all open tables one
suppresses child lock handling. When a single table is touched,
forwarding is done.
Behavioral changes:
===================
This patch changes the behavior of temporary MERGE tables.
Temporary MERGE must have temporary children.
The old behavior was wrong. A temporary table is not locked. Hence
even non-temporary children were not locked. See
Bug 19627 - temporary merge table locking.
You cannot change the union list of a non-temporary MERGE table
when LOCK TABLES is in effect. The following does *not* work:
CREATE TABLE m1 ... ENGINE=MRG_MYISAM ...;
LOCK TABLES t1 WRITE, t2 WRITE, m1 WRITE;
ALTER TABLE m1 ... UNION=(t1,t2) ...;
However, you can do this with a temporary MERGE table.
You cannot create a MERGE table with CREATE ... SELECT, neither
as a temporary MERGE table, nor as a non-temporary MERGE table.
CREATE TABLE m1 ... ENGINE=MRG_MYISAM ... SELECT ...;
Gives error message: table is not BASE TABLE.
2007-11-15 20:25:43 +01:00
|
|
|
/* Access MERGE child def version. See top comment in ha_myisammrg.cc */
|
|
|
|
inline ulong get_child_def_version()
|
|
|
|
{
|
|
|
|
return child_def_version;
|
|
|
|
}
|
|
|
|
inline void set_child_def_version(ulong version)
|
|
|
|
{
|
|
|
|
child_def_version= version;
|
|
|
|
}
|
|
|
|
inline void init_child_def_version()
|
|
|
|
{
|
|
|
|
child_def_version= ~0UL;
|
|
|
|
}
|
|
|
|
|
2005-07-01 06:05:42 +02:00
|
|
|
private:
|
|
|
|
bool prep_check_option(THD *thd, uint8 check_opt_type);
|
|
|
|
bool prep_where(THD *thd, Item **conds, bool no_where_clause);
|
2006-07-06 21:59:04 +02:00
|
|
|
/*
|
|
|
|
Cleanup for re-execution in a prepared statement or a stored
|
|
|
|
procedure.
|
|
|
|
*/
|
2007-11-15 20:25:43 +01:00
|
|
|
|
|
|
|
/* Remembered MERGE child def version. See top comment in ha_myisammrg.cc */
|
|
|
|
ulong child_def_version;
|
2007-07-06 14:18:49 +02:00
|
|
|
};
|
2001-04-11 13:04:03 +02:00
|
|
|
|
2004-07-16 00:15:55 +02:00
|
|
|
class Item;
|
|
|
|
|
2005-08-12 16:57:19 +02:00
|
|
|
/*
|
|
|
|
Iterator over the fields of a generic table reference.
|
|
|
|
*/
|
|
|
|
|
2004-07-16 00:15:55 +02:00
|
|
|
class Field_iterator: public Sql_alloc
|
|
|
|
{
|
|
|
|
public:
|
2006-02-25 16:46:30 +01:00
|
|
|
Field_iterator() {} /* Remove gcc warning */
|
2004-07-16 00:15:55 +02:00
|
|
|
virtual ~Field_iterator() {}
|
|
|
|
virtual void set(TABLE_LIST *)= 0;
|
|
|
|
virtual void next()= 0;
|
2004-09-03 20:43:04 +02:00
|
|
|
virtual bool end_of_fields()= 0; /* Return 1 at end of list */
|
2004-07-16 00:15:55 +02:00
|
|
|
virtual const char *name()= 0;
|
2005-07-01 06:05:42 +02:00
|
|
|
virtual Item *create_item(THD *)= 0;
|
2004-07-16 00:15:55 +02:00
|
|
|
virtual Field *field()= 0;
|
|
|
|
};
|
|
|
|
|
|
|
|
|
2005-08-12 16:57:19 +02:00
|
|
|
/*
|
|
|
|
Iterator over the fields of a base table, view with temporary
|
|
|
|
table, or subquery.
|
|
|
|
*/
|
|
|
|
|
2004-07-16 00:15:55 +02:00
|
|
|
class Field_iterator_table: public Field_iterator
|
|
|
|
{
|
|
|
|
Field **ptr;
|
|
|
|
public:
|
|
|
|
Field_iterator_table() :ptr(0) {}
|
|
|
|
void set(TABLE_LIST *table) { ptr= table->table->field; }
|
|
|
|
void set_table(TABLE *table) { ptr= table->field; }
|
|
|
|
void next() { ptr++; }
|
2004-09-03 20:43:04 +02:00
|
|
|
bool end_of_fields() { return *ptr == 0; }
|
2004-07-16 00:15:55 +02:00
|
|
|
const char *name();
|
2005-07-01 06:05:42 +02:00
|
|
|
Item *create_item(THD *thd);
|
2004-07-16 00:15:55 +02:00
|
|
|
Field *field() { return *ptr; }
|
|
|
|
};
|
|
|
|
|
|
|
|
|
2005-08-12 16:57:19 +02:00
|
|
|
/* Iterator over the fields of a merge view. */
|
|
|
|
|
2004-07-16 00:15:55 +02:00
|
|
|
class Field_iterator_view: public Field_iterator
|
|
|
|
{
|
2004-09-14 18:28:29 +02:00
|
|
|
Field_translator *ptr, *array_end;
|
2005-07-01 06:05:42 +02:00
|
|
|
TABLE_LIST *view;
|
2004-07-16 00:15:55 +02:00
|
|
|
public:
|
|
|
|
Field_iterator_view() :ptr(0), array_end(0) {}
|
|
|
|
void set(TABLE_LIST *table);
|
|
|
|
void next() { ptr++; }
|
2004-09-03 20:43:04 +02:00
|
|
|
bool end_of_fields() { return ptr == array_end; }
|
2004-07-16 00:15:55 +02:00
|
|
|
const char *name();
|
2005-07-01 06:05:42 +02:00
|
|
|
Item *create_item(THD *thd);
|
2005-01-05 15:48:23 +01:00
|
|
|
Item **item_ptr() {return &ptr->item; }
|
2004-07-16 00:15:55 +02:00
|
|
|
Field *field() { return 0; }
|
2005-07-01 06:05:42 +02:00
|
|
|
inline Item *item() { return ptr->item; }
|
2005-08-12 16:57:19 +02:00
|
|
|
Field_translator *field_translator() { return ptr; }
|
|
|
|
};
|
|
|
|
|
|
|
|
|
|
|
|
/*
|
|
|
|
Field_iterator interface to the list of materialized fields of a
|
|
|
|
NATURAL/USING join.
|
|
|
|
*/
|
|
|
|
|
|
|
|
class Field_iterator_natural_join: public Field_iterator
|
|
|
|
{
|
2005-11-28 20:57:50 +01:00
|
|
|
List_iterator_fast<Natural_join_column> column_ref_it;
|
2005-08-12 16:57:19 +02:00
|
|
|
Natural_join_column *cur_column_ref;
|
|
|
|
public:
|
2005-11-28 20:57:50 +01:00
|
|
|
Field_iterator_natural_join() :cur_column_ref(NULL) {}
|
|
|
|
~Field_iterator_natural_join() {}
|
2005-08-12 16:57:19 +02:00
|
|
|
void set(TABLE_LIST *table);
|
2005-08-19 14:22:30 +02:00
|
|
|
void next();
|
2005-08-12 16:57:19 +02:00
|
|
|
bool end_of_fields() { return !cur_column_ref; }
|
|
|
|
const char *name() { return cur_column_ref->name(); }
|
|
|
|
Item *create_item(THD *thd) { return cur_column_ref->create_item(thd); }
|
|
|
|
Field *field() { return cur_column_ref->field(); }
|
|
|
|
Natural_join_column *column_ref() { return cur_column_ref; }
|
|
|
|
};
|
|
|
|
|
|
|
|
|
|
|
|
/*
|
|
|
|
Generic iterator over the fields of an arbitrary table reference.
|
|
|
|
|
|
|
|
DESCRIPTION
|
|
|
|
This class unifies the various ways of iterating over the columns
|
|
|
|
of a table reference depending on the type of SQL entity it
|
|
|
|
represents. If such an entity represents a nested table reference,
|
|
|
|
this iterator encapsulates the iteration over the columns of the
|
|
|
|
members of the table reference.
|
|
|
|
|
|
|
|
IMPLEMENTATION
|
|
|
|
The implementation assumes that all underlying NATURAL/USING table
|
|
|
|
references already contain their result columns and are linked into
|
|
|
|
the list TABLE_LIST::next_name_resolution_table.
|
|
|
|
*/
|
|
|
|
|
|
|
|
class Field_iterator_table_ref: public Field_iterator
|
|
|
|
{
|
|
|
|
TABLE_LIST *table_ref, *first_leaf, *last_leaf;
|
|
|
|
Field_iterator_table table_field_it;
|
|
|
|
Field_iterator_view view_field_it;
|
|
|
|
Field_iterator_natural_join natural_join_it;
|
|
|
|
Field_iterator *field_it;
|
|
|
|
void set_field_iterator();
|
|
|
|
public:
|
|
|
|
Field_iterator_table_ref() :field_it(NULL) {}
|
|
|
|
void set(TABLE_LIST *table);
|
|
|
|
void next();
|
|
|
|
bool end_of_fields()
|
|
|
|
{ return (table_ref == last_leaf && field_it->end_of_fields()); }
|
|
|
|
const char *name() { return field_it->name(); }
|
|
|
|
const char *table_name();
|
|
|
|
const char *db_name();
|
|
|
|
GRANT_INFO *grant();
|
|
|
|
Item *create_item(THD *thd) { return field_it->create_item(thd); }
|
|
|
|
Field *field() { return field_it->field(); }
|
2006-03-02 10:50:15 +01:00
|
|
|
Natural_join_column *get_or_create_column_ref(TABLE_LIST *parent_table_ref);
|
2005-11-28 20:57:50 +01:00
|
|
|
Natural_join_column *get_natural_column_ref();
|
2004-07-16 00:15:55 +02:00
|
|
|
};
|
|
|
|
|
2005-01-06 12:00:13 +01:00
|
|
|
|
2004-06-11 07:27:21 +02:00
|
|
|
typedef struct st_nested_join
|
|
|
|
{
|
|
|
|
List<TABLE_LIST> join_list; /* list of elements in the nested join */
|
|
|
|
table_map used_tables; /* bitmap of tables in the nested join */
|
2004-07-01 22:46:43 +02:00
|
|
|
table_map not_null_tables; /* tables that reject nulls */
|
2004-06-11 07:27:21 +02:00
|
|
|
struct st_join_table *first_nested;/* the first nested table in the plan */
|
2005-10-25 17:28:27 +02:00
|
|
|
/*
|
|
|
|
Used to count tables in the nested join in 2 isolated places:
|
|
|
|
1. In make_outerjoin_info().
|
|
|
|
2. check_interleaving_with_nj/restore_prev_nj_state (these are called
|
|
|
|
by the join optimizer).
|
|
|
|
Before each use the counters are zeroed by reset_nj_counters.
|
|
|
|
*/
|
|
|
|
uint counter;
|
|
|
|
nested_join_map nj_map; /* Bit used to identify this nested join*/
|
2004-06-11 07:27:21 +02:00
|
|
|
} NESTED_JOIN;
|
2004-07-01 22:46:43 +02:00
|
|
|
|
2005-01-06 12:00:13 +01:00
|
|
|
|
2002-06-11 10:20:31 +02:00
|
|
|
typedef struct st_changed_table_list
|
|
|
|
{
|
2002-03-15 22:57:31 +01:00
|
|
|
struct st_changed_table_list *next;
|
2002-06-08 23:58:05 +02:00
|
|
|
char *key;
|
2002-03-15 22:57:31 +01:00
|
|
|
uint32 key_length;
|
|
|
|
} CHANGED_TABLE_LIST;
|
|
|
|
|
2005-01-06 12:00:13 +01:00
|
|
|
|
2004-06-11 07:27:21 +02:00
|
|
|
typedef struct st_open_table_list{
|
2001-04-11 13:04:03 +02:00
|
|
|
struct st_open_table_list *next;
|
|
|
|
char *db,*table;
|
|
|
|
uint32 in_use,locked;
|
|
|
|
} OPEN_TABLE_LIST;
|
2003-08-02 11:43:18 +02:00
|
|
|
|
2006-02-14 16:20:48 +01:00
|
|
|
typedef struct st_table_field_w_type
|
|
|
|
{
|
|
|
|
LEX_STRING name;
|
|
|
|
LEX_STRING type;
|
|
|
|
LEX_STRING cset;
|
|
|
|
} TABLE_FIELD_W_TYPE;
|
|
|
|
|
2003-08-02 11:43:18 +02:00
|
|
|
|
2006-02-14 16:20:48 +01:00
|
|
|
my_bool
|
2006-08-17 14:22:59 +02:00
|
|
|
table_check_intact(TABLE *table, const uint table_f_count,
|
2007-04-05 13:24:34 +02:00
|
|
|
const TABLE_FIELD_W_TYPE *table_def);
|
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that were found necessary while testing the handler changes.
Changes that require code changes in the code of other storage engines.
(Note that all changes are very straightforward and one should find all issues
by compiling a --debug build and fixing all compiler errors and all
asserts in field.cc while running the test suite.)
- New optional handler function introduced: reset()
This is called after every DML statement to make it easy for a handler to
do statement-specific cleanups.
(The only case where it's not called is if we force the file to be closed)
- handler::extra(HA_EXTRA_RESET) is removed. Code that was there before
should be moved to handler::reset()
- table->read_set contains a bitmap over all columns that are needed
in the query. read_row() and similar functions only needs to read these
columns
- table->write_set contains a bitmap over all columns that will be updated
in the query. write_row() and update_row() only needs to update these
columns.
The above bitmaps should now be up to date in all context
(including ALTER TABLE, filesort()).
The handler is informed of any changes to the bitmap after
fix_fields() by calling the virtual function
handler::column_bitmaps_signal(). If the handler does caching of
these bitmaps (instead of using table->read_set, table->write_set),
it should redo the caching in this code. As the signal() may be sent
several times, it's probably best to set a variable in the signal
and redo the caching on read_row() / write_row() if the variable was
set.
- Removed the read_set and write_set bitmap objects from the handler class
- Removed all column bit handling functions from the handler class.
(Now one instead uses the normal bitmap functions in my_bitmap.c instead
of handler dedicated bitmap functions)
- field->query_id is removed. One should instead check
table->read_set and table->write_set if a field is used in the query.
- handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and
handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now
instead use table->read_set to check for which columns to retrieve.
- If a handler needs to call Field->val() or Field->store() on columns
that are not used in the query, one should install a temporary
all-columns-used map while doing so. For this, we provide the following
functions:
my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set);
field->val();
dbug_tmp_restore_column_map(table->read_set, old_map);
and similar for the write map:
my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set);
field->val();
dbug_tmp_restore_column_map(table->write_set, old_map);
If this is not done, you will sooner or later hit a DBUG_ASSERT
in the field store() / val() functions.
(For non-DBUG binaries, the dbug_tmp_use_all_columns() and
dbug_tmp_restore_column_map() are inline dummy functions and should
be optimized away by the compiler).
- If one needs to temporarily set the column map for all binaries (and not
just to avoid the DBUG_ASSERT() in the Field::store() / Field::val()
methods) one should use the functions tmp_use_all_columns() and
tmp_restore_column_map() instead of the above dbug_ variants.
- All 'status' fields in the handler base class (like records,
data_file_length etc) are now stored in a 'stats' struct. This makes
it easier to know what status variables are provided by the base
handler. This requires some trivial variable name changes in the extra()
function.
- New virtual function handler::records(). This is called to optimize
COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS) is true.
(stats.records is not supposed to be an exact value. It only has to
be 'reasonable enough' for the optimizer to be able to choose a good
optimization path).
- Non virtual handler::init() function added for caching of virtual
constants from engine.
- Removed has_transactions() virtual method. Now one should instead return
HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support
transactions.
- The 'xxxx_create_handler()' function now has a MEM_ROOT *mem_root argument
that is to be used with 'new handler_name()' to allocate the handler
in the right area. The xxxx_create_handler() function is also
responsible for any initialization of the object before returning.
For example, one should change:
static handler *myisam_create_handler(TABLE_SHARE *table)
{
return new ha_myisam(table);
}
->
static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root)
{
return new (mem_root) ha_myisam(table);
}
- New optional virtual function: use_hidden_primary_key().
This is called in case of an update/delete when
(table_flags() & HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is set
but we don't have a primary key. This allows the handler to take precautions
in remembering any hidden primary key to be able to update/delete any
found row. The default handler marks all columns to be read.
- handler::table_flags() now returns a ulonglong (to allow for more flags).
- New/changed table_flags()
- HA_HAS_RECORDS Set if ::records() is supported
- HA_NO_TRANSACTIONS Set if engine doesn't support transactions
- HA_PRIMARY_KEY_REQUIRED_FOR_DELETE
Set if we should mark all primary key columns for
read when reading rows as part of a DELETE
statement. If there is no primary key,
all columns are marked for read.
- HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some
cases (based on table->read_set)
- HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS
Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION.
- HA_DUPP_POS Renamed to HA_DUPLICATE_POS
- HA_REQUIRES_KEY_COLUMNS_FOR_DELETE
Set this if we should mark ALL key columns for
read when reading rows as part of a DELETE
statement. In case of an update we will mark
all keys for read for which key part changed
value.
- HA_STATS_RECORDS_IS_EXACT
Set this if stats.records is exact.
(This saves us some extra records() calls
when optimizing COUNT(*))
- Removed table_flags()
- HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if
handler::records() gives an exact count() and
HA_STATS_RECORDS_IS_EXACT if stats.records is exact.
- HA_READ_RND_SAME Removed (no one supported this one)
- Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk()
- Renamed handler::dupp_pos to handler::dup_pos
- Removed not used variable handler::sortkey
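The advice above about column_bitmaps_signal() (set a variable in the signal and redo the caching on the next row operation) can be sketched as follows. This is a toy model under assumptions: CachingHandler, ColumnSet and kMaxColumns are hypothetical, and std::bitset merely stands in for MY_BITMAP. Only the deferred-recache idea is taken from the text.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

const std::size_t kMaxColumns = 64;            // hypothetical limit
typedef std::bitset<kMaxColumns> ColumnSet;    // stand-in for MY_BITMAP

// Toy engine that keeps its own cached list of columns to read.
class CachingHandler {
 public:
  explicit CachingHandler(const ColumnSet* read_set)
      : read_set_(read_set), bitmaps_changed_(true), rebuilds_(0) {
    assert(read_set_ != nullptr);
  }

  // Cheap on purpose: the signal may arrive several times per statement.
  void column_bitmaps_signal() { bitmaps_changed_ = true; }

  // Rebuilds the cache only if a signal arrived since the last call,
  // then returns the column numbers this read would fetch.
  const std::vector<std::size_t>& read_row() {
    if (bitmaps_changed_) {
      cached_columns_.clear();
      for (std::size_t i = 0; i < kMaxColumns; ++i)
        if (read_set_->test(i))
          cached_columns_.push_back(i);
      bitmaps_changed_ = false;
      ++rebuilds_;
    }
    return cached_columns_;
  }

  int rebuilds() const { return rebuilds_; }

 private:
  const ColumnSet* read_set_;
  bool bitmaps_changed_;
  int rebuilds_;
  std::vector<std::size_t> cached_columns_;
};
```

Two signals in a row cost one rebuild in this model, which is the point of deferring the work to the row operation.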
Upper level handler changes:
- ha_reset() now does some overall checks and calls ::reset()
- ha_table_flags() added. This is a cached version of table_flags(). The
cache is updated on engine creation time and updated on open.
MySQL level changes (not obvious from the above):
- DBUG_ASSERT() added to check that column usage matches what is set
in the column usage bit maps. (This found a LOT of bugs in current
column marking code).
- Previously in 5.1, all used columns were marked in read_set and only updated
columns were marked in write_set. Now we only mark columns for which we
need a value in read_set.
- Column bitmaps are created in open_binary_frm() and open_table_from_share().
(Before this was in table.cc)
- handler::table_flags() calls are replaced with handler::ha_table_flags()
- For calling field->val() you must have the corresponding bit set in
table->read_set. For calling field->store() you must have the
corresponding bit set in table->write_set. (There are asserts in
all store()/val() functions to catch wrong usage)
- thd->set_query_id is renamed to thd->mark_used_columns and instead
of setting this to an integer value, this has now the values:
MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE
Changed also all variables named 'set_query_id' to mark_used_columns.
- In filesort() we now inform the handler of exactly which columns are needed
doing the sort and choosing the rows.
- The TABLE_SHARE object has a 'all_set' column bitmap one can use
when one needs a column bitmap with all columns set.
(This is used for table->use_all_columns() and other places)
- The TABLE object has 3 column bitmaps:
- def_read_set Default bitmap for columns to be read
- def_write_set Default bitmap for columns to be written
- tmp_set Can be used as a temporary bitmap when needed.
The TABLE object also has two pointers to bitmaps, read_set and write_set,
that the handler should use to find out which columns are used in which way.
- count() optimization now calls handler::records() instead of using
handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true).
- Added extra argument to Item::walk() to indicate if we should also
traverse sub queries.
- Added TABLE parameter to cp_buffer_from_ref()
- Don't close tables created with CREATE ... SELECT but keep them in
the table cache. (Faster usage of newly created tables).
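The COUNT(*) rule described above (use stats.records when it is exact, otherwise call records() if the engine supports it) might look roughly like this. It is a sketch under assumptions: CountingHandler, toy_count_star() and the TOY_ flag values are hypothetical, and the order of the two checks is one reading of the flag descriptions, not the server's actual optimizer code.

```cpp
#include <cassert>

typedef unsigned long long toy_flags_t;

// Hypothetical bit values; only the flag names come from the changeset.
const toy_flags_t TOY_HA_HAS_RECORDS = 1ULL << 0;
const toy_flags_t TOY_HA_STATS_RECORDS_IS_EXACT = 1ULL << 1;

// Toy engine exposing an estimate and, optionally, an exact row count.
struct CountingHandler {
  toy_flags_t flags;
  unsigned long long estimated_rows;  // plays the role of stats.records
  unsigned long long exact_rows;      // what records() would report
  int records_calls;                  // counts the extra records() calls

  toy_flags_t table_flags() const { return flags; }
  unsigned long long records() { ++records_calls; return exact_rows; }
};

// Returns true and fills *count when COUNT(*) can be answered without
// scanning the table; returns false when a scan is still needed.
inline bool toy_count_star(CountingHandler& h, unsigned long long* count) {
  assert(count != nullptr);
  if (h.table_flags() & TOY_HA_STATS_RECORDS_IS_EXACT) {
    *count = h.estimated_rows;        // already exact, no extra call
    return true;
  }
  if (h.table_flags() & TOY_HA_HAS_RECORDS) {
    *count = h.records();             // ask the engine for an exact count
    return true;
  }
  return false;                       // estimate only
}
```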
New interfaces:
- table->clear_column_bitmaps() to initialize the bitmaps for tables
at start of new statements.
- table->column_bitmaps_set() to set up new column bitmaps and signal
the handler about this.
- table->column_bitmaps_set_no_signal() for the few cases where we need
to set up new column bitmaps but don't signal the handler (as the handler
has already been signaled about these before). Used for the moment
only in opt_range.cc when doing ROR scans.
- table->use_all_columns() to install a bitmap where all columns are marked
as used in the read and the write set.
- table->default_column_bitmaps() to install the normal read and write
column bitmaps, but not signaling the handler about this.
This is mainly used when creating TABLE instances.
- table->mark_columns_needed_for_delete(),
table->mark_columns_needed_for_update() and
table->mark_columns_needed_for_insert() to allow us to put additional
columns in column usage maps if handler so requires.
(The handler indicates what it needs in handler->table_flags())
- table->prepare_for_position() to allow us to tell handler that it
needs to read primary key parts to be able to store them in
future table->position() calls.
(This replaces the table->file->ha_retrieve_all_pk function)
- table->mark_auto_increment_column() to tell the handler that we are going to
update columns that are part of any auto_increment key.
- table->mark_columns_used_by_index() to mark all columns that are part of
an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow
it to quickly know that it only needs to read columns that are part
of the key. (The handler can also use the column map for detecting this,
but simpler/faster handler can just monitor the extra() call).
- table->mark_columns_used_by_index_no_reset() to in addition to other columns,
also mark all columns that are used by the given key.
- table->restore_column_maps_after_mark_index() to restore to default
column maps after a call to table->mark_columns_used_by_index().
- New item function register_field_in_read_map(), for marking used columns
in table->read_map. Used by filesort() to mark all used columns
- Maintain in TABLE->merge_keys set of all keys that are used in query.
(Simplifies some optimization loops)
- Maintain Field->part_of_key_not_clustered which is like Field->part_of_key
but the field in the clustered key is not assumed to be part of all indexes.
(used in opt_range.cc for faster loops)
- dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map()
tmp_use_all_columns() and tmp_restore_column_map() functions to temporarily
mark all columns as usable. The 'dbug_' version is primarily intended
inside a handler when it wants to just call Field::store() & Field::val()
functions, but doesn't need the column maps set for any other usage.
(ie:: bitmap_is_set() is never called)
- We can't use compare_records() to skip updates for handlers that return
a partial column set and the read_set doesn't cover all columns in the
write set. The reason for this is that if we have a column marked only for
write we can't in the MySQL level know if the value changed or not.
The reason this worked before was that MySQL marked all to be written
columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden
bug'.
- open_table_from_share() no longer sets up a temporary MEM_ROOT
object as a thread specific variable for the handler. Instead we
send the to-be-used MEM_ROOT to get_new_handler().
(Simpler, faster code)
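The save-and-restore pattern behind tmp_use_all_columns() and tmp_restore_column_map() can be wrapped in a scope guard so the restore cannot be forgotten. This is a sketch under assumptions: ToyBitmap, AllColumnsGuard and toy_is_set() are hypothetical and model only the pointer swap; the guard itself is an illustration, not something this changeset provides.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for MY_BITMAP: only the word pointer is modelled.
struct ToyBitmap {
  std::uint32_t* bitmap;
};

// Installs the all-columns bitmap on construction and puts the previous
// bitmap pointer back on destruction.
class AllColumnsGuard {
 public:
  AllColumnsGuard(const ToyBitmap& all_set, ToyBitmap* target)
      : target_(target), old_(target->bitmap) {
    target_->bitmap = all_set.bitmap;
  }
  ~AllColumnsGuard() { target_->bitmap = old_; }

  AllColumnsGuard(const AllColumnsGuard&) = delete;
  AllColumnsGuard& operator=(const AllColumnsGuard&) = delete;

 private:
  ToyBitmap* target_;
  std::uint32_t* old_;
};

// Tests one bit, assuming 32-bit words as in this toy model.
inline bool toy_is_set(const ToyBitmap& b, unsigned bit) {
  assert(b.bitmap != nullptr);
  return ((b.bitmap[bit / 32] >> (bit % 32)) & 1u) != 0;
}
```

Inside the guard's scope every column looks usable through the target bitmap; once the scope ends the original read or write set is back in place.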
Bugs fixed:
- Column marking was not done correctly in a lot of cases.
(ALTER TABLE, when using triggers, auto_increment fields etc)
(Could potentially result in wrong values inserted in table handlers
relying on that the old column maps or field->set_query_id was correct)
Especially when it comes to triggers, there may be cases where the
old code would cause lost/wrong values for NDB and/or InnoDB tables.
- Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags:
OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG.
This allowed me to remove some wrong warnings about:
"Some non-transactional changed tables couldn't be rolled back"
- Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset
(thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to lose
some warnings about
"Some non-transactional changed tables couldn't be rolled back"
- Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table()
which could cause delete_table to report random failures.
- Fixed core dumps for some tests when running with --debug
- Added missing FN_LIBCHAR in mysql_rm_tmp_tables()
(This has probably caused us to not properly remove temporary files after
crash)
- slow_logs was not properly initialized, which could maybe cause
extra/lost entries in slow log.
- If we get a duplicate row on insert, change column map to read and
write all columns while retrying the operation. This is required by
the definition of REPLACE and also ensures that fields that are only
part of UPDATE are properly handled. This fixed a bug in NDB and
REPLACE where REPLACE wrongly copied some column values from the replaced
row.
- For table handlers that don't support NULL in keys, we would give an error
when creating a primary key with NULL fields, even after the fields have been
automatically converted to NOT NULL.
- Creating a primary key on a SPATIAL key, would fail if field was not
declared as NOT NULL.
Cleanups:
- Removed not used condition argument to setup_tables
- Removed not needed item function reset_query_id_processor().
- Field->add_index is removed. Now this is instead maintained in
(field->flags & FIELD_IN_ADD_INDEX)
- Field->fieldnr is removed (use field->field_index instead)
- New argument to filesort() to indicate that it should return a set of
row pointers (not used columns). This allowed me to remove some references
to sql_command in filesort and should also enable us to return column
results in some cases where we couldn't before.
- Changed column bitmap handling in opt_range.cc to be aligned with TABLE
bitmap, which allowed me to use bitmap functions instead of looping over
all fields to create some needed bitmaps. (Faster and smaller code)
- Broke up overly long lines where found
- Moved some variable declarations to the start of functions for better code
readability.
- Removed some unused arguments from functions.
(setup_fields(), mysql_prepare_insert_check_table())
- setup_fields() now takes an enum instead of an int for marking column
usage.
- For internal temporary tables, use handler::write_row(),
handler::delete_row() and handler::update_row() instead of
handler::ha_xxxx() for faster execution.
- Changed some constants to enums and defines.
- Using separate column read and write sets allows for easier checking
of whether the timestamp field was set by the statement.
- Removed calls to free_io_cache() as this is now done automatically in ha_reset()
- Don't build table->normalized_path as this is now identical to table->path
(after bar's fixes to convert filenames)
- Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to
do comparisons with the 'convert-dbug-for-diff' tool.
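The replacement of Field->add_index by a bit in field->flags can be sketched as below. This is only an illustration: the bit value given to FIELD_IN_ADD_INDEX and the one-member Field struct are hypothetical stand-ins, and the three helper functions are not part of the server.

```cpp
#include <cassert>

// Hypothetical bit value chosen for illustration; the server defines its own.
#define FIELD_IN_ADD_INDEX (1U << 20)

// Hypothetical, minimal stand-in for the server's Field class.
struct Field
{
  unsigned int flags;
};

// Mark the field as taking part in an ADD INDEX operation.
static inline void mark_in_add_index(Field *field)
{
  field->flags|= FIELD_IN_ADD_INDEX;
}

// Clear the mark again without touching any other flag bits.
static inline void clear_in_add_index(Field *field)
{
  field->flags&= ~FIELD_IN_ADD_INDEX;
}

// Test the mark; this is the (field->flags & FIELD_IN_ADD_INDEX) check.
static inline bool is_in_add_index(const Field *field)
{
  return (field->flags & FIELD_IN_ADD_INDEX) != 0;
}
```

Keeping the state in the existing flags word means no extra member has to be initialized or copied separately.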
Things left to do in 5.1:
- We wrongly log failed CREATE TABLE ... SELECT in some cases when using
row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result)
Mats has promised to look into this.
- Test that my fix for CREATE TABLE ... SELECT is indeed correct.
(I added several test cases for this, but in this case it's better that
someone else also tests this thoroughly).
Lars has promised to do this.
2006-06-04 17:52:22 +02:00
|
|
|
|
|
|
|
static inline my_bitmap_map *tmp_use_all_columns(TABLE *table,
|
|
|
|
MY_BITMAP *bitmap)
|
|
|
|
{
|
|
|
|
my_bitmap_map *old= bitmap->bitmap;
|
|
|
|
bitmap->bitmap= table->s->all_set.bitmap;
|
|
|
|
return old;
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
static inline void tmp_restore_column_map(MY_BITMAP *bitmap,
|
|
|
|
my_bitmap_map *old)
|
|
|
|
{
|
|
|
|
bitmap->bitmap= old;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* The following is only needed for debugging */
|
|
|
|
|
|
|
|
static inline my_bitmap_map *dbug_tmp_use_all_columns(TABLE *table,
|
|
|
|
MY_BITMAP *bitmap)
|
|
|
|
{
|
|
|
|
#ifndef DBUG_OFF
|
|
|
|
return tmp_use_all_columns(table, bitmap);
|
|
|
|
#else
|
|
|
|
return 0;
|
|
|
|
#endif
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline void dbug_tmp_restore_column_map(MY_BITMAP *bitmap,
|
|
|
|
my_bitmap_map *old)
|
|
|
|
{
|
|
|
|
#ifndef DBUG_OFF
|
|
|
|
tmp_restore_column_map(bitmap, old);
|
|
|
|
#endif
|
|
|
|
}
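The helpers above follow a save/swap/restore pattern: the caller temporarily points a column bitmap at the share's all-columns map and later puts the saved pointer back. A minimal sketch of a caller, assuming heavily simplified stand-ins for MY_BITMAP, TABLE_SHARE and TABLE (the real structures have many more members); column_is_set() and read_unmarked_column_demo() are hypothetical and exist only for this illustration:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-ins for the server types.
typedef uint32_t my_bitmap_map;
struct MY_BITMAP   { my_bitmap_map *bitmap; };
struct TABLE_SHARE { MY_BITMAP all_set; };
struct TABLE       { TABLE_SHARE *s; MY_BITMAP *read_set; };

// Same logic as the helpers shown above.
static inline my_bitmap_map *tmp_use_all_columns(TABLE *table,
                                                 MY_BITMAP *bitmap)
{
  my_bitmap_map *old= bitmap->bitmap;
  bitmap->bitmap= table->s->all_set.bitmap;
  return old;
}

static inline void tmp_restore_column_map(MY_BITMAP *bitmap,
                                          my_bitmap_map *old)
{
  bitmap->bitmap= old;
}

// Hypothetical helper: is column number idx marked in the map?
static inline bool column_is_set(const MY_BITMAP *map, unsigned idx)
{
  return (map->bitmap[idx / 32] >> (idx % 32)) & 1U;
}

// Hypothetical caller: column 3 is not in the read set, but becomes
// visible while the all-columns map is swapped in.
static bool read_unmarked_column_demo()
{
  my_bitmap_map all_bits[1]=  { 0xFFFFFFFFu };
  my_bitmap_map read_bits[1]= { 0x1u };         // only column 0 marked
  TABLE_SHARE share= { { all_bits } };
  MY_BITMAP read_set= { read_bits };
  TABLE table= { &share, &read_set };

  bool before= column_is_set(table.read_set, 3);
  my_bitmap_map *old= tmp_use_all_columns(&table, table.read_set);
  bool during= column_is_set(table.read_set, 3);
  tmp_restore_column_map(table.read_set, old);
  bool after= column_is_set(table.read_set, 3);

  // The caller's own bits are never modified; only the pointer is swapped.
  return !before && during && !after && read_bits[0] == 0x1u;
}
```

Because only a pointer is exchanged, the swap is cheap, but every call to tmp_use_all_columns() must be paired with a restore on all exit paths.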
|
WL#3817: Simplify string / memory area types and make things more consistent (first part)
The following type conversions were done:
- Changed byte to uchar
- Changed gptr to uchar*
- Changed my_string to char *
- Changed my_size_t to size_t
- Changed size_s to size_t
Removed declarations of byte, gptr, my_string, my_size_t and size_s.
The following function parameter changes were done:
- All string functions in mysys/strings were changed to use size_t
instead of uint for string lengths.
- All read()/write() functions changed to use size_t (including vio).
- All protocol functions changed to use size_t instead of uint
- Functions that used a pointer to a string length were changed to use size_t*
- Changed malloc(), free() and related functions from using gptr to use void *
as this requires fewer casts in the code and is more in line with how the
standard functions work.
- Added extra length argument to dirname_part() to return the length of the
created string.
- Changed (at least) following functions to take uchar* as argument:
- db_dump()
- my_net_write()
- net_write_command()
- net_store_data()
- DBUG_DUMP()
- decimal2bin() & bin2decimal()
- Changed my_compress() and my_uncompress() to use size_t. Changed one
argument to my_uncompress() from a pointer to a value as we only return
one value (makes function easier to use).
- Changed type of 'pack_data' argument to packfrm() to avoid casts.
- Changed the type of argument 'frmdata' in readfrm(), writefrm(),
ha_discover and handler::discover() to uchar** to avoid casts.
- Changed most Field functions to use uchar* instead of char* (reduced a lot of
casts).
- Changed field->val_xxx(xxx, new_ptr) to take const pointers.
Other changes:
- Removed a lot of unneeded casts
- Added a few new casts required by other changes
- Added some casts to my_multi_malloc() arguments for safety (as string lengths
need to be uint, not size_t).
- Fixed all calls to hash-get-key functions to use size_t*. (Needed to be done
explicitly as this conflict was often hidden by casting the function to
hash_get_key).
- Changed some buffers and memory regions to uchar* to avoid casts.
- Changed some string lengths from uint to size_t.
- Changed field->ptr to be uchar* instead of char*. This allowed us to
get rid of a lot of casts.
- Some changes from true -> TRUE, false -> FALSE, unsigned char -> uchar
- Include zlib.h in some files as we needed declaration of crc32()
- Changed MY_FILE_ERROR to be (size_t) -1.
- Changed many variables to hold the result of my_read() / my_write() to be
size_t. This was needed to properly detect errors (which are
returned as (size_t) -1).
- Removed some very old VMS code
- Changed packfrm()/unpackfrm() to not depend on the size of uint
(portability fix)
- Removed Windows-specific code to restore cursor position as this
causes slowdown on Windows and we should not mix read() and pread()
calls anyway as this is not thread safe. Updated function comment to
reflect this. Changed function that depended on original behavior of
my_pwrite() to itself restore the cursor position (one such case).
- Added some missing checks of the return value of malloc().
- Changed definition of MOD_PAD_CHAR_TO_FULL_LENGTH to avoid 'long' overflow.
- Changed type of table_def::m_size from my_size_t to ulong to reflect that
m_size is the number of elements in the array, not a string/memory
length.
- Moved THD::max_row_length() to table.cc (as it does not depend on THD).
Inlined max_row_length_blob() into this function.
- More function comments
- Fixed some compiler warnings when compiled without partitions.
- Removed setting of LEX_STRING() arguments in declaration (portability fix).
- Some trivial indentation/variable name changes.
- Some trivial code simplifications:
- Replaced some calls to alloc_root + memcpy to use
strmake_root()/strdup_root().
- Changed some calls from memdup() to strmake() (Safety fix)
- Simpler loops in client-simple.c
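The reason result variables for my_read() / my_write() had to become size_t can be illustrated with a small sketch. MY_FILE_ERROR is defined here as described in the message above; fake_read() and the two detected_with_*() functions are hypothetical stand-ins, not server code.

```cpp
#include <cassert>
#include <cstddef>

// Defined as described above: the error value is (size_t) -1.
#define MY_FILE_ERROR ((size_t) -1)

// Hypothetical stand-in for a read function that returns either the
// number of bytes read or MY_FILE_ERROR.
static size_t fake_read(bool fail)
{
  return fail ? MY_FILE_ERROR : 16;
}

// Correct: the result is kept in a size_t, so the comparison is exact.
static bool detected_with_size_t(bool fail)
{
  size_t res= fake_read(fail);
  return res == MY_FILE_ERROR;
}

// Fragile: where size_t is wider than unsigned int, the stored value is
// truncated and no longer compares equal to MY_FILE_ERROR.
static bool detected_with_uint(bool fail)
{
  unsigned int res= (unsigned int) fake_read(fail);
  return res == MY_FILE_ERROR;
}
```

On platforms where size_t and unsigned int have the same width both variants behave identically, which is why this kind of bug can go unnoticed until the code is built for a 64-bit target.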
2007-05-10 11:59:39 +02:00
|
|
|
|
|
|
|
size_t max_row_length(TABLE *table, const uchar *data);
|
|
|
|
|