row_build(), row_upd_index_replace_new_col_vals_index_pos(), and
row_upd_index_replace_new_col_vals() are safe.
btr_cur_optimistic_update(), btr_cur_pessimistic_update(): Note that
the B-tree page of the clustered index record is latched in mtr.
trx_undo_prev_version_build(): Add const qualifiers to index_rec
and rec. Note that the page of index_rec is latched in index_mtr.
row_vers_impl_x_locked_off_kernel(), row_vers_old_has_index_entry():
Note that the stack of versions is locked by mtr and thus it is
safe to call row_build().
Change the format of TRX_IDs in INFORMATION_SCHEMA tables from DEC to
HEX.
The current TRX_IDs are hard to remember and track down: 426355, 428466,
428566, etc.
In HEX:
* there are less "digits", the strings are shorter;
* since there are 16 instead of 10 "digits", the chance of having
repeating ones are smaller.
The above look like 68173, 689B2, 68A16 in HEX.
Discussed with: Ken
Approved by: Heikki (via IM)
btr_copy_blob_prefix(), btr_copy_externally_stored_field_prefix_low():
Document the return value as "number of bytes written", not "bytes written".
trx_undo_page_fetch_ext(): Explain the assertion ut_a(ext_len).
row_build_index_entry(): Explain the assertion ut_a(!ext).
Bugfix: Lock the MySQL mutex LOCK_thread_count before accessing
trx->mysql_query_str to avoid race conditions where MySQL sets it to
NULL after we have checked that it is not NULL and before we access it.
Approved by: Marko
enough prefixes of externally stored columns, so that purge will not have
to dereference any BLOB pointers, which may be invalid. This will not be
necessary for logging inserts, because inserts are no-ops in purge, and
the record will remain locked during transaction rollback.
TODO: in dict_build_table_def_step() or dict_build_index_def_step(),
prevent the creation of tables with too many columns for which a
prefix index is defined. This is because there is a size limit of undo
log records, and for each prefix-indexed column, the log must store
REC_MAX_INDEX_COL_LEN + BTR_EXTERN_FIELD_REF_SIZE bytes.
trx_undo_page_report_insert(): Assert that the index is clustered.
trx_undo_page_fetch_ext(): New function, for fetching the BLOB prefix
in trx_undo_page_report_modify().
trx_undo_page_report_modify(): Write long enough prefixes of the externally
stored columns to the undo log.
trx_undo_rec_get_partial_row(): Remove the parameter "ext". Assert that
the undo log contains long enough prefixes of the externally stored columns.
purge_node_t: Remove the field "ext".
* Change terminology:
wait lock -> requested lock
waited lock -> blocking lock
new: requesting transaction (the trx what owns the requested lock)
new: blocking transaction (the trx that owns the blocking lock)
* Add transaction ids to INFORMATION_SCHEMA.INNODB_LOCK_WAITS. This is
somewhat redundant because transaction ids can be found in INNODB_LOCKS
(which can be joined with INNODB_LOCK_WAITS) but would help users to
write shorter joins (one table less) in some cases where they want to
find which transaction is blocking which.
Suggested by: Ken
Approved by: Heikki
Only add indexed BLOBs to row_ext.
trx_undo_rec_get_partial_row(): Move the BLOB fetching to row_ext_create().
row_build(): Pass only those BLOBs to row_ext_create() that are referenced by
ordering columns of some indexes, similar to trx_undo_rec_get_partial_row().
row_ext_create(): Add the parameter "tuple". Move the implementation
from row0ext.ic to row0ext.c.
row_ext_lookup_ith(), row_ext_lookup(): Return a const pointer. Remove
the parameters "field" and "f_len". Make the row_ext_t* parameter const.
row_ext_t: Remove the field zip_size.
field_ref_zero[]: Declare in btr0types.h instead of btr0cur.h.
row_ext_lookup_low(): Rename to row_ext_cache_fill() and change the
signature.
only for those externally stored columns that occur in the ordering columns
of indexes. Prefetch the prefixes of those columns, because the clustered
index record and the BLOBs may have been deleted by the time when the
purge thread needs to read the BLOB prefixes.
row_ext_create(): Add the debug assertion ut_ad(ut_is_2pow(zip_size)).
column prefix of an externally stored column.
row_upd_ext_fetch(): New function.
row_upd_index_replace_new_col_vals(),
row_upd_index_replace_new_col_vals_index_pos(): Fetch prefixes of
externally stored columns when they are needed for column prefix
indexes. For memory allocation, add the parameter ext_heap. Avoid
repeating the inner loop after finding a matching upd_field->field_no.
set the "external storage" flag. When parsing the undo log, do not
misinterpret a SQL NULL column for externally stored.
These bugs were spotted by Heikki and Sunny.
trx_undo_page_report_modify(): Set the UNIV_EXTERN_STORAGE_FIELD flag
when needed.
trx_undo_rec_get_partial_row(): Check for len == UNIV_SQL_NULL.
Implement a limit on the memory used by the INNODB_TRX, INNODB_LOCKS and
INNODB_LOCK_WAITS tables. The maximum allowed memory is defined with the
macro TRX_I_S_MEM_LIMIT.
Approved by: Marko (via IM)
Add the query in information_schema.innodb_trx.trx_query. Add it even
though it is available in information_schema.processlist.info to make
inconsistencies between those two tables obvious.
It is rather confusting to see a transaction shown in innodb_trx and
innodb_locks that holds a lock on one table and the corresponding query
in processlist executing INSERT on another table. We do not want users
to contact us asking to explain that. It is caused by the fact that the
data for innodb_* tables and processlist is fetched at different time.
Approved by: Marko
Introduce a generic soultion to the common problem that MySQL do not add
functions needed by us in a reasonable time.
Start with a function that retrieves THD::thread_id, this is needed for
the information_schema.innodb_trx.mysql_thread_id column.
Approved by: Marko
recovered transactions from new ones. Until r1594, they were distinguished
by trx->sess == NULL.
trx_t: Add the bitfield is_recovered.
trx_lists_init_at_db_start(): Set trx->is_recovered.
trx_create(): Initialize trx->is_recovered = 0.
trx_print(): Display information about trx->is_recovered.
trx_rollback_or_clean_all_without_sess(): Skip new transactions.
Protect all accesses of trx_sys->trx_list with kernel_mutex.
trx_roll_crash_recv_trx, trx_roll_max_undo_no, trx_roll_progress_printed_pct:
Made these variables static.
Add innodb_locks.lock_data column and some relevant tests.
For record locks this column represents the ordering fields of the
locked row in a human readable, SQL-valid, format.
Approved by: Marko
enum trx_dict_op: dictionary operation modes
trx_get_dict_operation(), trx_set_dict_operation(): Accessors for
trx->dict_operation.
lock_table_enqueue_waiting(), lock_rec_enqueue_waiting(): Do not complain
about lock waits if the dictionary mode is TRX_DICT_OP_INDEX_MAY_WAIT.
row_merge_lock_table(): Remove the work-around for avoiding the warning
in lock_table_enqueue_waiting().
trx_undo_mark_as_dict_operation(): Do not write trx->table_id to the
undo log unless the dict_operation is TRX_DICT_OP_TABLE.
ha_innobase::add_index(): Set the dict_operation mode initially to
TRX_DICT_OP_INDEX_MAY_WAIT, then lock the table exclusively, and set the
mode to TRX_DICT_OP_INDEX, and optionally to TRX_DICT_OP_TABLE when
creating a temporary table.
fix the bugs introduced in r1591.
row_rec_to_index_entry_low(): Clear "n_ext". Do not allow it to be NULL.
Add const qualifier to dict_index_t*.
row_rec_to_index_entry(): Add the parameters "offsets" and "n_ext".
btr_cur_optimistic_update(): Add an assertion that there are no externally
stored columns. Remove the unreachable call to btr_cur_unmark_extern_fields()
and the preceding unnecessary call to rec_get_offsets().
btr_push_update_extern_fields(): Remove the parameters index, offsets.
Only report the additional externally stored columns of the update vector.
row_build(), trx_undo_rec_get_partial_row(): Flag externally stored columns
also with dfield_set_ext().
rec_copy_prefix_to_dtuple(): Assert that there are no externally stored
columns in the prefix.
row_build_row_ref(): Note and assert that the index is a secondary index,
and assert that there are no externally stored columns.
row_build_row_ref_fast(): Assert that there are no externally stored columns.
rec_offs_get_n_alloc(): Expose the function.
row_build_row_ref_in_tuple(): Assert that there are no externally stored
columns in a record of a secondary index.
row_build_row_ref_from_row(): Assert that there are no externally stored
columns.
row_upd_check_references_constraints(): Add the parameter offsets, to
avoid a redundant call to rec_get_offsets().
row_upd_del_mark_clust_rec(): Add the parameter offsets. Remove
duplicated code.
row_ins_index_entry_set_vals(): Copy the external storage flag.
sel_pop_prefetched_row(): Assert that there are no externally stored
columns.
row_scan_and_check_index(): Copy offsets to a temporary heap across
the invocation of row_rec_to_index_entry().
and use it for flagging externally stored columns in the data tuple.
The data tuple contains the same columns as the clustered index record,
but in a different order. This error was introduced in r1591.
TODO: the assertion ut_ad(!dfield_is_ext()) may fail in
btr_cur_pessimistic_update().
offsets_[] arrays, as suggested by Vasil.
rec_offs_set_n_alloc(): Declare as a public function. Assert that
n_alloc > REC_OFFS_HEADER_SIZE.
rec_offs_get_n_alloc(): Assert that n_alloc > REC_OFFS_HEADER_SIZE.
Copy any data (currently table name and table index) that may be
destroyed after releasing the kernel mutex into internal cache's
storage.
This is done in efficient manner using ha_storage type and a given
string is copied only once into the cache's storage. Later additions of
the same string use the already stored string, thus allocating memory
only once per unique string.
Approved by: Marko
Use the newly introduced mem_alloc2() to use the memory that has been
allocated in addition to the requested memory. This is done in order to
avoid wasting memory.
Do not calculate the sizes and offsets of the chunks in advance in
table_cache_init() because it is unknown how much bytes will actually
be allocated by mem_alloc2(). Rather calculate these on the run: after
each chunk is allocated set its size and the offset of the next chunk.
Similar patch approved by: Marko
innodb_lock_waits. See
https://svn.innodb.com/innobase/InformationSchema/TransactionsAndLocks
for design notes.
Things that need to be resolved before this goes live:
* MySQL must add thd_get_thread_id() function to their code
http://bugs.mysql.com/30930
* Allocate memory from mem_heap instead of using mem_alloc()
* Copy table name and index name into the cache because they may be
freed later which will result in referencing freed memory
Approved by: Marko
trx_t: Remove dict_undo_list and dict_redo_list.
innobase_create_temporary_tablename(): Replace TEMP_TABLE_PREFIX with
a table name suffix "#1" or "#2". In this way, the user can restore
precious data, should anything go wrong. It is possible to reach an
inconsistent state, because the creation, deletion and renaming of
single-table tablespaces are not transactional.
ut_print_namel(), fil_make_ibd_name(), innobase_rename_table(): Remove
the special treatment of TEMP_TABLE_PREFIX.
Introduce TEMP_INDEX_PREFIX == 0xff for temporary indexes. This byte
cannot occur in index names since MySQL 4.1. However, it might have
been possible to use this byte in MySQL 4.0.
recv_recovery_from_checkpoint_finish(): Call the new function
row_merge_drop_temp_indexes(), to drop all indexes whose name starts
with the byte 0xff.
row_merge_rename_indexes(): Renamed from row_merge_rename_index().
Remove the parameter "index".
row_drop_table_for_mysql(): Unconditionally call trx_commit_for_mysql().
row_drop_table_for_mysql_no_commit(): Correct the function commit,
based on the corrected comment of row_drop_table_for_mysql(). Rely on
table->to_be_dropped instead of TEMP_TABLE_PREFIX.
ha_innobase::add_index(): Simplify the control flow.
trx_t: Change the type of error_info from void* to const dict_index_t*.
trx_get_error_info(): Add const qualifier to trx_t*. Make this an
inline function.
be set always.
trx_rollback_active(): Split from trx_rollback_or_clean_all_without_sess().
row_undo_dictionary(): Do not return a value. Assert that all operations
succeed.
row_merge_drop_index(): Remove bogus comment about void return value.
trx_dummy_sess: Move the declaration from trx0roll.h to trx0trx.h,
because the variable is defined in trx0trx.c.
Some things still fail in innodb-index.test, and there seems to be
a race condition (data dictionary lock wait) when running with --valgrind.
dfield_t: Add an "external storage" flag, dfield->ext.
dfield_is_null(), dfield_is_ext(), dfield_set_ext(), dfield_set_null():
New functions.
dfield_copy(), dfield_copy_data(): Add const qualifiers, fix in/out comments.
data_write_sql_null(): Use memset().
big_rec_field_t: Replace byte* data with const void* data.
ut_ulint_sort(): Remove.
upd_field_t: Remove extern_storage.
upd_node_t: Replace ext_vec, n_ext_vec with n_ext.
row_merge_copy_blobs(): New function.
row_ins_index_entry(): Add the parameter "ibool foreign" for suppressing
foreign key checks during fast index creation or when inserting into
secondary indexes.
btr_page_insert_fits(): Add const qualifiers.
btr_cur_add_ext(), upd_ext_vec_contains(): Remove.
dfield_print_also_hex(), dfield_print(): Replace if...else if with switch.
Observe dfield_is_ext().
now that all of InnoDB code is built from a single Makefile and it should
not be possible to build the modules with mutually incompatible options.
#define INSIDE_HA_INNOBASE_CC: Remove.
srv_sizeof_trx_t_in_ha_innodb_cc: Remove.
dict_table_get_low_noninlined(): Remove. This function was unused.
Remove all _noninline functions. Remove the _noninline suffix from
all function calls in ha_innodb.cc.