mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-30 18:41:56 +01:00

Author	SHA1	Message	Date
Mattias Jonsson	544f413df9	merge	2012-12-27 02:43:20 +01:00
Mattias Jonsson	36ac232d6d	manual merge of bug#14845133 mysql-5.1 -> mysql-5.5	2012-11-13 14:47:49 +01:00
Mattias Jonsson	b5ff983ab5	Bug#14845133: The problem is related to the changes made in bug#13025132. get_partition_set can do dynamic pruning which limits the partitions to scan even further. This is not accounted for when setting the correct start of the preallocated record buffer used in the priority queue, thus leading to the wrong buffer being used (including the wrong preset partitioning id connected to that buffer). Solution is to fast forward the buffer pointer to point to the correct partition record buffer.	2012-11-13 09:21:59 +01:00
Mattias Jonsson	2f3baa743d	Bug#14845133: The problem is related to the changes made in bug#13025132. get_partition_set can do dynamic pruning which limits the partitions to scan even further. This is not accounted for when setting the correct start of the preallocated record buffer used in the priority queue, thus leading to the wrong buffer being used (including the wrong preset partitioning id connected to that buffer). Solution is to fast forward the buffer pointer to point to the correct partition record buffer.	2012-11-13 09:21:59 +01:00
Jon Olav Hauglid	2943c8131a	Bug#14495351: CRASH IN HA_PARTITION::HANDLE_UNORDERED_NEXT Follow-up patch - Fix broken build: error: format ‘%u’ expects argument of type ‘unsigned int’, but argument 2 has type ‘key_part_map {aka long unsigned int}’ [-Werror=format]	2012-10-03 15:00:43 +02:00
Mattias Jonsson	8ce6582c37	Bug#14495351: CRASH IN HA_PARTITION::HANDLE_UNORDERED_NEXT The partitioning engine does not implement index_next for partitions which return HA_ERR_KEY_NOT_FOUND in index_read_map. If HA_ERR_KEY_NOT_FOUND was returned by a partition during index_read_map, that partition would not be included in following calls to index_next. If no partition returned a row in index_read_map, then the subsequent call to index_next would try to use a non existing handler (index out of bound). Even after fixing the index out of bound if at least one partition returned. So it is really two connected bugs 1) crash due to index out of bound (-1 unsigned). 2) not including partitions that returned HA_ERR_KEY_NOT_FOUND. Fixed by recording the partitions that returned HA_ERR_KEY_NOT_FOUND, and include them too when doing handle_ordered_next the first time.	2012-09-10 13:32:50 +02:00
Mattias Jonsson	5d83889791	Bug#13025132 - PARTITIONS USE TOO MUCH MEMORY pre-push fix, removed unused variable.	2012-08-20 12:39:36 +02:00
Mattias Jonsson	6c6d9b46a9	merge	2012-08-20 12:44:40 +02:00
Mattias Jonsson	091e4b192e	merge	2012-08-20 09:55:54 +02:00
Mattias Jonsson	1ffecedfc3	Bug#13025132 - PARTITIONS USE TOO MUCH MEMORY Additional patch to remove the part_id -> ref_buffer offset. The partitioning id and the associate record buffer can be found without having to calculate it. By initializing it for each used partition, and then reuse the key-buffer from the queue, it is not needed to have such map.	2012-08-17 14:25:32 +02:00
Mattias Jonsson	404cce0ff8	manual merge 5.1->5.5	2012-08-15 14:56:55 +02:00
Mattias Jonsson	bcee9f1896	Bug#13025132 - PARTITIONS USE TOO MUCH MEMORY The buffer for the current read row from each partition (m_ordered_rec_buffer) used for sorted reads was allocated on open and freed when the ha_partition handler was closed or destroyed. For tables with many partitions and big records this could take up too much valuable memory. Solution is to only allocate the memory when it is needed and free it when no longer needed. I.e. allocate it in index_init and free it in index_end (and to handle failures also free it on reset, close etc.) Also only allocating needed memory, according to partitioning pruning. Manually tested that it does not use as much memory and releases it after queries.	2012-08-15 14:31:26 +02:00
Mattias Jonsson	f436b188e2	bug#13949735: crash regression from bug#13694811. There can be cases when the optimizer calls ha_partition::records_in_range when there are no matching partitions. So the DBUG_ASSERT of !tot_used_partitions does assert. Fixed by returning 0 instead when no matching partitions are found. This will avoid the crash. records_in_range will then try to find the biggest used partition, which will not find any partition and records_in_range will then return 0, meaning no rows can be found. Patch contributed by Davi Arnaut at twitter.	2012-05-15 12:45:52 +02:00
Joerg Bruehe	747fbf8f8b	Merge the 5.5.22 release build into main 5.5, conflict in "sql/filesort.cc" solved manually.	2012-03-20 22:27:49 +01:00
Mattias Jonsson	5584e61f35	merge of bug#1364811 into mysql-5.5	2012-03-14 21:57:15 +01:00
Mattias Jonsson	58b2147833	bug#13694811 Updated code comments according to reviewers requests.	2012-03-14 20:36:42 +01:00
Mattias Jonsson	645bddecaf	merge from mysql-5.1	2012-02-29 21:18:50 +01:00
Mattias Jonsson	937ee6b7a0	merge into mysql-5.1	2012-02-29 20:51:38 +01:00
Mattias Jonsson	8325fe02b3	Bug#13694811: THE OPTIMIZER WRONGLY USES THE FIRST INNODB PARTITION STATISTICS Problem was the fix for bug#11756867; It always used the first partitions, and stopped after it checked 10 [sub]partitions. (or until it found a partition which would contain a match). This results in bad statistics for tables where the first 10 partitions don't represent the majority of the data (like when the first 10 partitions only contained a few rows in total). The solution was to take statistics from the partitions containing the most rows instead: Added an array of partition ids which is sorted by number of records in descending order. This array is used in records_in_range to cover as many records as possible in as few calls as possible. Also changed the limit of how many partitions to use for the statistics from a static max of 10 partitions, into a dynamic model: Maximum number of partitions is now log2(total number of partitions) taken from the ordered array. It will continue calling partitions records_in_range until it has checked: (total rows in matching partitions) * (maximum number of partitions) / (number of used partitions) Also reverted the changes for ha_partition::scan_time() and ha_partition::estimate_rows_upper_bound() to before the fix of bug#11756867. Since they are not as slow as records_in_range.	2012-02-22 23:13:36 +01:00
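The sampling rule described in the Bug#13694811 message can be sketched roughly as follows. This is a minimal illustration, not the code in ha_partition.cc: the `Partition` struct and the functions `floor_log2` and `partitions_to_sample` are hypothetical names, and rounding log2 down (with a minimum of 1 and a cap at the number of used partitions) is an assumption made here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one partition that survived pruning.
struct Partition {
  std::size_t id;
  std::uint64_t rows;
};

// floor(log2(n)) for n >= 1; returns 0 for n == 0 or n == 1.
std::size_t floor_log2(std::size_t n) {
  std::size_t r = 0;
  while (n >>= 1) ++r;
  return r;
}

// Returns the ids of the partitions to sample, biggest first.
// Sampling stops once the rows covered reach
//   total_rows * max_parts / used.size()
// as described in the commit message.
std::vector<std::size_t> partitions_to_sample(std::vector<Partition> used,
                                              std::size_t total_partitions) {
  if (used.empty()) return {};  // no matching partitions: nothing to sample

  // Order the used partitions by row count, descending.
  std::sort(used.begin(), used.end(),
            [](const Partition& a, const Partition& b) { return a.rows > b.rows; });

  // Assumption: at least 1, at most the number of used partitions.
  std::size_t max_parts = std::max<std::size_t>(1, floor_log2(total_partitions));
  max_parts = std::min(max_parts, used.size());

  std::uint64_t total_rows = 0;
  for (const Partition& p : used) total_rows += p.rows;
  const std::uint64_t min_rows = total_rows * max_parts / used.size();

  std::vector<std::size_t> ids;
  std::uint64_t checked = 0;
  for (const Partition& p : used) {
    ids.push_back(p.id);
    checked += p.rows;
    if (checked >= min_rows) break;  // enough rows covered
  }
  return ids;
}
```

With a skewed table, one large partition may already cover the threshold, so only that partition is consulted; with evenly sized partitions the loop visits about log2(total partitions) of them.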
Mattias Jonsson	74374933c8	Bug#11761296: 53775: QUERY ON PARTITIONED TABLE RETURNS CACHED RESULT FROM PREVIOUS TRANSACTION The current Query Cache API is not fully compatible with the partitioning engine. There is no good way to implement support for QC due to: 1) a static callback for ha_partition would need to have access to all partition names and call the underlying callback for each [sub]partition with the correct name. 2) pruning would be impossible, even if one used the ulonglong engine_data due to if engine_data is changed, the table is invalidated by the QC. So the only viable solution to avoid incorrect data is to not allow caching of queries using partitioned tables. (There are some extra changes, due to removal of \r as line break)	2012-02-20 22:59:11 +01:00
Mattias Jonsson	7ebeb1433e	Bug#13593865 - 64037: CRASH IN HA_PARTITION::CREATE_HANDLERS ON ALTER TABLE AFTER DROP PARTITION Bug#13608188 - 64038: CRASH IN HANDLER::HA_THD ON ALTER TABLE AFTER REPAIR NON-EXISTING PARTITION Backport of bug#13357766 from -trunk to -5.5. The state of some partitions was not reset on failure, leading to invalid states of partitions in consequent statements. Fixed by reverting back to original state for all partitions if not all partition names was resolved. Also adding extra security by forcing tables to be reopened in case of error in mysql_alter_table. (There is also removal of \r at the end of some lines.)	2012-02-02 12:47:17 +01:00
Georgi Kodinov	e8313e13aa	merge mysql-5.5->mysql-5.5-security	2011-10-12 15:07:15 +03:00
Mattias Jonsson	ab761db8d5	Bug#12696518: MEMORY LEAKS IN HA_PARTITION (VALGRIND TESTS ON TRUNK) (also 5.5+ solution for bug#11766879/bug#60106) The valgrind warning was due to an unused 'new handler_add_index(...)' which was never freed. The error handling did not work (fails as in bug#11766879) and the implementation was not as transparent as it could be, therefore I made it a bit simpler and more transparent to the underlying handlers. This way it follows the api better and the error handling works and is also now tested. Also added a debug test to verify the error handling. Improved according to Jon Olav's review: Added class ha_partition_add_index. Also added base class Sql_alloc to handler_add_index. Update 3.	2011-09-15 20:49:39 +02:00
Mattias Jonsson	0fca226942	Bug#11766879/Bug#60106: DIFF BETWEEN # OF INDEXES IN MYSQL VS INNODB, PARTITONING, ON INDEX CREATE If the first partition succeeded in adding an index, but a successive partition failed, then the first partition still had the new index. The fix reverts the added indexes from previous partitions on failure.	2011-08-23 15:13:17 +02:00
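The revert-on-failure behavior described for Bug#11766879 can be illustrated with a small sketch. `FakePartition` and `add_index_to_all` are hypothetical names invented for this example; they do not correspond to the handler API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical partition that can be told to fail when adding an index.
struct FakePartition {
  bool has_index = false;
  bool fail_on_add = false;

  bool add_index() {
    if (fail_on_add) return false;
    has_index = true;
    return true;
  }
  void drop_index() { has_index = false; }
};

// Adds the index to every partition. If any partition fails, the index is
// removed again from all partitions that had already succeeded, so the
// partitions never disagree about which indexes exist.
bool add_index_to_all(std::vector<FakePartition>& parts) {
  for (std::size_t i = 0; i < parts.size(); ++i) {
    if (!parts[i].add_index()) {
      while (i > 0) parts[--i].drop_index();  // undo earlier successes
      return false;
    }
  }
  return true;
}
```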
Mats Kindahl	cf5e5f837a	Merging into mysql-5.5.16-release.	2011-08-15 20:12:11 +02:00
Kent Boortz	027b5f1ed4	Updated/added copyright headers	2011-07-03 17:47:37 +02:00
Kent Boortz	68f00a5686	Updated/added copyright headers	2011-06-30 17:37:13 +02:00
Mattias Jonsson	44aa582bb3	merge	2011-06-13 11:09:56 +02:00
Guilhem Bichot	12c42b980a	Fix for BUG#11755168 '46895: test "outfile_loaddata" fails (reproducible)'. In sql_class.cc, 'row_count', of type 'ha_rows', was used as last argument for ER_TRUNCATED_WRONG_VALUE_FOR_FIELD which is "Incorrect %-.32s value: '%-.128s' for column '%.192s' at row %ld". So 'ha_rows' was used as 'long'. On SPARC32 Solaris builds, 'long' is 4 bytes and 'ha_rows' is 'longlong' i.e. 8 bytes. So the printf-like code was reading only the first 4 bytes. Because the CPU is big-endian, 1LL is 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x01 so the first four bytes yield 0. So the warning message had "row 0" instead of "row 1" in test outfile_loaddata.test: -Warning 1366 Incorrect string value: '\xE1\xE2\xF7' for column 'b' at row 1 +Warning 1366 Incorrect string value: '\xE1\xE2\xF7' for column 'b' at row 0 All error-messaging functions which internally invoke some printf-like function are potential candidates for such mistakes. One apparently easy way to catch such mistakes is to use ATTRIBUTE_FORMAT (from my_attribute.h). But this works only when call site has both: a) the format as a string literal b) the types of arguments. So: func(ER(ER_BLAH), 10); will silently not be checked, because ER(ER_BLAH) is not known at compile time (it is known at run-time, and depends on the chosen language). And func("%s", a va_list argument); has the same problem, as the real type of arguments is not known at this site at compile time (it's known in some caller). Moreover, func(ER(ER_BLAH)); though possibly correct (if ER(ER_BLAH) has no '%' markers), will not compile (gcc says "error: format not a string literal and no format arguments"). Consequences: 1) ATTRIBUTE_FORMAT is here added only to functions which in practice take "string literal" formats: "my_error_reporter" and "print_admin_msg". 2) it cannot be added to the other functions: my_error(), push_warning_printf(), Table_check_intact::report_error(), general_log_print(). 
To do a one-time check of functions listed in (2), the following "static code analysis" has been done: 1) replace my_error(ER_xxx, arguments for substitution in format) with the equivalent my_printf_error(ER_xxx,ER(ER_xxx), arguments for substitution in format), so that we have ER(ER_xxx) and the arguments in the same call site 2) add ATTRIBUTE_FORMAT to push_warning_printf(), Table_check_intact::report_error(), general_log_print() 3) replace ER(xxx) with the hard-coded English text found in errmsg.txt (like: ER(ER_UNKNOWN_ERROR) is replaced with "Unknown error"), so that a call site has the format as string literal 4) this way, ATTRIBUTE_FORMAT can effectively do its job 5) compile, fix errors detected by ATTRIBUTE_FORMAT 6) revert steps 1-2-3. The present patch has no compiler error when submitted again to the static code analysis above. It cannot catch all problems though: see Field::set_warning(), in which a call to push_warning_printf() has a variable error (thus, not replaceable by a string literal); I checked set_warning() calls by hand though. See also WL 5883 for one proposal to avoid such bugs from appearing again in the future. The issues fixed in the patch are: a) mismatch in types (like 'int' passed to '%ld') b) more arguments passed than specified in the format. This patch resolves mismatches by changing the type/number of arguments, not by changing error messages of sql/share/errmsg.txt. The latter would be wrong, per the following old rule: errmsg.txt must be as stable as possible; no insertions or deletions of messages, no changes of type or number of printf-like format specifiers, are allowed, as long as the change impacts a message already released in a GA version. 
If this rule is not followed: - Connectors, which use error message numbers, will be confused (by insertions/deletions of messages) - using errmsg.sys of MySQL 5.1.n with mysqld of MySQL 5.1.(n+1) could produce wrong messages or crash; such usage can easily happen if installing 5.1.(n+1) while /etc/my.cnf still has --language=/path/to/5.1.n/xxx; or if copying mysqld from 5.1.(n+1) into a 5.1.n installation. When fixing b), I have verified that the superfluous arguments were not used in the format in the first 5.1 GA (5.1.30 'bteam@astra04-20081114162938-z8mctjp6st27uobm'). Had they been used, then passing them today, even if the message doesn't use them anymore, would have been necessary, as explained above. include/my_getopt.h: this function pointer is used only with "string literal" formats, so we can add ATTRIBUTE_FORMAT. mysql-test/collections/default.experimental: test should pass now sql/derror.cc: by having a format as string literal, ATTRIBUTE_FORMAT check becomes effective. sql/events.cc: Change justified by the following excerpt from sql/share/errmsg.txt: ER_EVENT_SAME_NAME eng "Same old and new event name" ER_EVENT_SET_VAR_ERROR eng "Error during starting/stopping of the scheduler. Error code %u" sql/field.cc: ER_TOO_BIG_SCALE 42000 S1009 eng "Too big scale %d specified for column '%-.192s'. Maximum is %lu." ER_TOO_BIG_PRECISION 42000 S1009 eng "Too big precision %d specified for column '%-.192s'. Maximum is %lu." ER_TOO_BIG_DISPLAYWIDTH 42000 S1009 eng "Display width out of range for column '%-.192s' (max = %lu)" sql/ha_ndbcluster.cc: ER_OUTOFMEMORY HY001 S1001 eng "Out of memory; restart server and try again (needed %d bytes)" (sizeof() returns size_t) sql/ha_ndbcluster_binlog.cc: Too many arguments for: ER_GET_ERRMSG eng "Got error %d '%-.100s' from %s" Patch by Jonas Oreland. sql/ha_partition.cc: print_admin_msg() is used only with a literal as format, so ATTRIBUTE_FORMAT works. 
sql/handler.cc: ER_OUTOFMEMORY HY001 S1001 eng "Out of memory; restart server and try again (needed %d bytes)" (sizeof() returns size_t) sql/item_create.cc: ER_TOO_BIG_SCALE 42000 S1009 eng "Too big scale %d specified for column '%-.192s'. Maximum is %lu." ER_TOO_BIG_PRECISION 42000 S1009 eng "Too big precision %d specified for column '%-.192s'. Maximum is %lu." 'c_len' and 'c_dec' are char, passed as %d !! We don't know their value (as strtoul() failed), but they are likely big, so we use INT_MAX. 'len' is ulong. sql/item_func.cc: ER_WARN_DATA_OUT_OF_RANGE 22003 eng "Out of range value for column '%s' at row %ld" ER_CANT_FIND_UDF eng "Can't load function '%-.192s'" sql/item_strfunc.cc: ER_TOO_BIG_FOR_UNCOMPRESS eng "Uncompressed data size too large; the maximum size is %d (probably, length of uncompressed data was corrupted)" max_allowed_packet is ulong. sql/mysql_priv.h: sql_print_message_func is a function _pointer_. sql/sp_head.cc: ER_SP_RECURSION_LIMIT eng "Recursive limit %d (as set by the max_sp_recursion_depth variable) was exceeded for routine %.192s" max_sp_recursion_depth is ulong sql/sql_acl.cc: ER_PASSWORD_NO_MATCH 42000 eng "Can't find any matching row in the user table" ER_CANT_CREATE_USER_WITH_GRANT 42000 eng "You are not allowed to create a user with GRANT" sql/sql_base.cc: ER_NOT_KEYFILE eng "Incorrect key file for table '%-.200s'; try to repair it" ER_TOO_MANY_TABLES eng "Too many tables; MySQL can only use %d tables in a join" MAX_TABLES is size_t. 
sql/sql_binlog.cc: ER_UNKNOWN_ERROR eng "Unknown error" sql/sql_class.cc: ER_TRUNCATED_WRONG_VALUE_FOR_FIELD eng "Incorrect %-.32s value: '%-.128s' for column '%.192s' at row %ld" WARN_DATA_TRUNCATED 01000 eng "Data truncated for column '%s' at row %ld" sql/sql_connect.cc: ER_HANDSHAKE_ERROR 08S01 eng "Bad handshake" ER_BAD_HOST_ERROR 08S01 eng "Can't get hostname for your address" sql/sql_insert.cc: ER_WRONG_VALUE_COUNT_ON_ROW 21S01 eng "Column count doesn't match value count at row %ld" sql/sql_parse.cc: ER_WARN_HOSTNAME_WONT_WORK eng "MySQL is started in --skip-name-resolve mode; you must restart it without this switch for this grant to work" ER_TOO_HIGH_LEVEL_OF_NESTING_FOR_SELECT eng "Too high level of nesting for select" ER_UNKNOWN_ERROR eng "Unknown error" sql/sql_partition.cc: ER_OUTOFMEMORY HY001 S1001 eng "Out of memory; restart server and try again (needed %d bytes)" sql/sql_plugin.cc: ER_OUTOFMEMORY HY001 S1001 eng "Out of memory; restart server and try again (needed %d bytes)" sql/sql_prepare.cc: ER_OUTOFMEMORY HY001 S1001 eng "Out of memory; restart server and try again (needed %d bytes)" ER_UNKNOWN_STMT_HANDLER eng "Unknown prepared statement handler (%.*s) given to %s" length value (for '%.*s') must be 'int', per the doc of printf() and the code of my_vsnprintf(). sql/sql_show.cc: ER_OUTOFMEMORY HY001 S1001 eng "Out of memory; restart server and try again (needed %d bytes)" sql/sql_table.cc: ER_TOO_BIG_FIELDLENGTH 42000 S1009 eng "Column length too big for column '%-.192s' (max = %lu); use BLOB or TEXT instead" sql/table.cc: ER_NOT_FORM_FILE eng "Incorrect information in file: '%-.200s'" ER_COL_COUNT_DOESNT_MATCH_PLEASE_UPDATE eng "Column count of mysql.%s is wrong. Expected %d, found %d. Created with MySQL %d, now running %d. Please use mysql_upgrade to fix this error." table->s->mysql_version is ulong. 
sql/unireg.cc: ER_TOO_LONG_TABLE_COMMENT eng "Comment for table '%-.64s' is too long (max = %lu)" ER_TOO_LONG_FIELD_COMMENT eng "Comment for field '%-.64s' is too long (max = %lu)" ER_TOO_BIG_ROWSIZE 42000 eng "Row size too large. The maximum row size for the used table type, not counting BLOBs, is %ld. You have to change some columns to TEXT or BLOBs"	2011-05-16 22:04:01 +02:00
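The "row 0" symptom from BUG#11755168 can be reproduced on any host by simulating the byte layout rather than relying on an actual format mismatch (which would be undefined behavior). The helper names `to_big_endian` and `first_four_bytes` below are hypothetical and exist only for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Lay out a 64-bit value the way a big-endian CPU stores it:
// most significant byte first.
std::array<unsigned char, 8> to_big_endian(std::uint64_t v) {
  std::array<unsigned char, 8> out{};
  for (int i = 0; i < 8; ++i)
    out[i] = static_cast<unsigned char>(v >> (8 * (7 - i)));
  return out;
}

// Read only the first four bytes as a big-endian 32-bit value. This mimics
// a printf-like function that consumes a 4-byte 'long' for %ld when the
// caller actually passed an 8-byte value.
std::uint32_t first_four_bytes(const std::array<unsigned char, 8>& bytes) {
  std::uint32_t r = 0;
  for (int i = 0; i < 4; ++i) r = (r << 8) | bytes[i];
  return r;
}
```

For the value 1, the only non-zero byte is the last of the eight, so the four bytes that get read are all zero, which matches the "at row 0" warning quoted in the message.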
Kent Boortz	789aa8c485	Updated/added copyright headers	2011-07-04 01:25:49 +02:00
Mattias Jonsson	e827b51fa0	merge	2011-06-13 11:21:54 +02:00
Jon Olav Hauglid	9b076952ec	Bug#11853126 RE-ENABLE CONCURRENT READS WHILE CREATING SECONDARY INDEX IN INNODB The patches for Bug#11751388 and Bug#11784056 enabled concurrent reads while creating secondary indexes in InnoDB. However, they introduced a regression. This regression occurred if ALTER TABLE failed after the index had been added, for example during the lock upgrade needed to update .FRM. If this happened, InnoDB and the server got out of sync with regards to which indexes actually existed. Therefore the patch for Bug#11815600 again disabled concurrent reads. This patch re-enables concurrent reads. The original regression is fixed by splitting the ADD INDEX operation into two parts. First the new index is created but not made active. This is done while concurrent reads are allowed. The second part of the operation makes the index active (or reverts the change). This is done after lock upgrade, which prevents the original regression. In order to implement this change, the patch changes the storage API for in-place index creation. handler::add_index() is split into two functions, handler_add_index() and handler::final_add_index(). The former for creating indexes without making them visible and the latter for committing (i.e. making visible) new indexes or reverting the changes. Large parts of this patch were written by Marko Mäkelä. Test case added to innodb_mysql_lock.test.	2011-06-01 10:06:55 +02:00
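The two-part ADD INDEX operation described for Bug#11853126 can be pictured as a prepare step followed by a commit-or-revert step. The class below is a toy model under that reading; `TwoPhaseIndexTable`, `prepare_add_index`, `finish_add_index` and `is_visible` are hypothetical names and do not reproduce the signatures of the handler API.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy model of a table whose new indexes are built first and only
// become visible in a separate final step.
class TwoPhaseIndexTable {
 public:
  // Part 1: create the index without making it active.
  // In the commit message this happens while concurrent reads are allowed.
  void prepare_add_index(const std::string& name) { pending_.insert(name); }

  // Part 2: after the lock upgrade, either make the pending indexes
  // active (commit == true) or discard them (commit == false).
  void finish_add_index(bool commit) {
    if (commit) active_.insert(pending_.begin(), pending_.end());
    pending_.clear();
  }

  bool is_visible(const std::string& name) const {
    return active_.count(name) != 0;
  }

 private:
  std::set<std::string> pending_;
  std::set<std::string> active_;
};
```

Because nothing becomes visible before the final step, a failure between the two steps leaves the set of active indexes unchanged.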
Guilhem Bichot	3ceec2f19c	Merge from 5.1.	2011-05-21 10:21:08 +02:00
Karen Langford	83f19ef457	Merge from mysql-5.1.57-release	2011-05-06 10:03:02 +02:00
Mattias Jonsson	044bf3b6b3	bug#11765667: bug#58655: ASSERTION FAILED, SERVER CRASHES WITH MYSQLD GOT SIGNAL 6 The partitioning engine checked the auto_increment column even if it was not to be written, triggering a DBUG_ASSERT. Fixed by checking if table->write_set for that column was set.	2011-04-29 13:00:16 +02:00
Mattias Jonsson	54c1da00ee	removed dead obsolete code	2011-04-29 09:48:26 +02:00
Mattias Jonsson	440fba13d5	fix of partitioning tests that fails on windows + merge of minor cleanup	2011-04-29 09:56:36 +02:00
Mattias Jonsson	163807cfca	merge	2011-04-27 18:42:05 +02:00
Mattias Jonsson	401941c258	Post push fix for bug#11766249 bug#59316 Partitions can have different ref_length (position data length). Removed DBUG_ASSERT which crashed debug builds when using MAX_ROWS on some partitions.	2011-04-27 17:51:06 +02:00
Guilhem Bichot	3c894df56b	merge from latest 5.5	2011-04-26 13:14:42 +02:00
Mattias Jonsson	bdaaee5d04	post fix for werror build for bug#11766249.	2011-04-26 10:21:09 +02:00
Mattias Jonsson	f7b98c25f4	Manual merge from 5.1	2011-04-20 19:53:08 +02:00
Mattias Jonsson	046b57450d	merge	2011-04-20 18:00:50 +02:00
Mattias Jonsson	bd92ea4311	Bug#11766249 bug#59316: PARTITIONING AND INDEX_MERGE MEMORY LEAK Update for previous patch according to reviewers comments. Updated the constructors for ha_partitions to use the common init_handler_variables functions Added use of defines for size and offset to get better readability for the code that reads and writes the .par file. Also refactored the get_from_handler_file function.	2011-04-20 17:52:33 +02:00
unknown	914873674b	Bug#11867664: Fix server crashes on update with join on partitioned table.	2011-04-12 01:36:38 +02:00
Mattias Jonsson	e0887df8e1	Bug#11766249 bug#59316: PARTITIONING AND INDEX_MERGE MEMORY LEAK When executing row-ordered-retrieval index merge, the handler was cloned, but it used the wrong memory root, so instead of allocating memory on the thread/query's mem_root, it used the table's mem_root, resulting in non released memory in the table object, and was not freed until the table was closed. Solution was to ensure that memory used during cloning of a handler was allocated from the correct memory root. This was implemented by fixing handler::clone() to also take a name argument, so it can be used with partitioning. And in ha_partition only allocate the ha_partition's ref, and call the original ha_partition partitions clone() and set at cloned partitions. Fix of .bzrignore on Windows with VS 2010	2011-03-25 12:36:02 +01:00
Mattias Jonsson	8dbbeedf41	Bug#11867664: SERVER CRASHES ON UPDATE WITH JOIN ON PARTITIONED TABLE Regression from bug#11766232. m_last_part could be set beyond the last partition. Fixed by only setting it if within the limit. Also added check in print_error.	2011-03-18 11:03:54 +01:00
Mattias Jonsson	d3ca484f46	merge	2011-03-09 18:41:16 +01:00
Mattias Jonsson	e24398ee42	Bug#59297: Can't find record in 'tablename' on update inner join Regression introduced in bug#52455. Problem was that the fixed function did not set the last used partition variable, resulting in wrong partition used when storing the position of the newly retrieved row. Fixed by setting the last used partition in ha_partition::index_read_idx_map.	2011-01-24 13:41:44 +01:00
Jon Olav Hauglid	e99f2b1c4e	Bug #42230 during add index, cannot do queries on storage engines that implement add_index The problem was that ALTER TABLE blocked reads on an InnoDB table while adding a secondary index, even if this was not needed. It is only needed for the final step where the .frm file is updated. The reason queries were blocked, was that ALTER TABLE upgraded the metadata lock from MDL_SHARED_NO_WRITE (which blocks writes) to MDL_EXCLUSIVE (which blocks all accesses) before index creation. The way the server handles index creation, is that storage engines publish their capabilities to the server and the server determines which of the following three ways this can be handled: 1) build a new version of the table; 2) change the existing table but with exclusive metadata lock; 3) change the existing table but without metadata lock upgrade. For InnoDB and secondary index creation, option 3) should have been selected. However this failed for two reasons. First, InnoDB did not publish this capability properly. Second, the ALTER TABLE code failed to make proper use of the information supplied by the storage engine. A variable need_lock_for_indexes was set accordingly, but was not later used. This patch fixes this problem by only doing metadata lock upgrade before index creation/deletion if this variable has been set. This patch also changes some of the related terminology used in the code. Specifically the use of "fast" and "online" with respect to ALTER TABLE. "Fast" was used to indicate that an ALTER TABLE operation could be done without involving a temporary table. "Fast" has been renamed "in-place" to more accurately describe the behavior. "Online" meant that the operation could be done without taking a table lock. However, in the current implementation writes are always prohibited during ALTER TABLE and an exclusive metadata lock is held while updating the .frm, so ALTER TABLE is not completely online. 
This patch replaces "online" with "in-place", with additional comments indicating if concurrent reads are allowed during index creation/deletion or not. An important part of this update of terminology is renaming of the handler flags used by handlers to indicate if index creation/deletion can be done in-place and if concurrent reads are allowed. For example, the HA_ONLINE_ADD_INDEX_NO_WRITES flag has been renamed to HA_INPLACE_ADD_INDEX_NO_READ_WRITE, while HA_ONLINE_ADD_INDEX is now HA_INPLACE_ADD_INDEX_NO_WRITE. Note that this is a rename to clarify current behavior, the flag values have not changed and no flags have been removed or added. Test case added to innodb_mysql_sync.test.	2011-01-26 14:23:29 +01:00

434 commits