mariadb/storage/innobase/include/rem0rec.ic
unknown e2513bf07f Apply snapshot innodb-5.1-ss1989
Fixes the following bugs:

Bug #30706: SQL thread on slave is allowed to block client queries when slave load is high
  Add (innodb|innobase|srv)_replication_delay MySQL config parameter.

Bug #30888: Innodb table + stored procedure + row deletion = server crash
  While adding code for the low level read of the AUTOINC value from the index,
  the case for MEDIUM ints which are 3 bytes was missed triggering an
  assertion.

Bug #30907: Regression: "--innodb_autoinc_lock_mode=0" (off) not same as older releases
  We don't rely on *first_value to be 0 when checking whether
  get_auto_increment() has been invoked for the first time in a multi-row
  INSERT. We instead use trx_t::n_autoinc_rows. Initialize trx::n_autoinc_rows
  inside ha_innobase::start_stmt() too.

Bug #31444: "InnoDB: Error: MySQL is freeing a thd" in innodb_mysql.test
  ha_innobase::external_lock(): Update prebuilt->mysql_has_locked and
  trx->n_mysql_tables_in_use only after row_lock_table_for_mysql() returns
  DB_SUCCESS.  A timeout on LOCK TABLES would lead to an inconsistent state,
  which would cause trx_free() to print a warning.

Bug #31494: innodb + 5.1 + read committed crash, assertion
  Set an error code when a deadlock occurs in semi-consistent read.


mysql-test/r/innodb.result:
  Apply snapshot innodb-5.1-ss1989
  
  Also, a test is moved into the new innodb_autoinc_lock_mode_zero
  test, because it depends on a non-default setting for a read-only
  variable.
  
  Revision r1821:
  Merge a change from MySQL AB:
  
  ChangeSet@1.2536.50.1  2007-08-02 12:45:56-07:00  igor@mysql.com
  
  Fixed bug#28404.
  This patch adds cost estimation for the queries with ORDER BY / GROUP BY
  and LIMIT.
  If there was a ref/range access to the table whose rows were required
  to be ordered in the result set the optimizer always employed this access
  though a scan by a different index that was compatible with the required
  order could be cheaper to produce the first L rows of the result set.
  Now for such queries the optimizer makes a choice between the cheapest
  ref/range accesses not compatible with the given order and index scans
  compatible with it.
  
  innodb.result: Adjusted results for test cases affected by the fix for
  bug #28404.
  
  
  Revision r1781:
  Fix a test case that was broken after Bug#16979 fix. See r1645 and r1735.
  The variable used in the tests below was introduced in r1735.
  
  
  Revision r1792:
  innodb.result: Revert r1655, which should have been reverted as part of r1781.
  
  
  Revision r1843:
  Add test for Bug# 21409, the actual bug was fixed in r1334.
mysql-test/t/innodb.test:
  Apply snapshot innodb-5.1-ss1989
  
  Also, a test is moved into the new innodb_autoinc_lock_mode_zero
  test, because it depends on a non-default setting for a read-only
  variable.
  
  Revision r1781:
  Fix a test case that was broken after Bug#16979 fix. See r1645 and r1735.
  The variable used in the tests below was introduced in r1735.
  
  
  Revision r1843:
  Add test for Bug# 21409, the actual bug was fixed in r1334.
storage/innobase/buf/buf0lru.c:
  Apply snapshot innodb-5.1-ss1989
  
  Revision r1819:
  Merge r1815:1817 from branches/zip: Improve Valgrind instrumentation.
  
  UNIV_MEM_ASSERT_RW(): New macro, to check that the contents of a memory
  area is defined.
  
  UNIV_MEM_ASSERT_W(): New macro, to check that a memory area is writable.
  
  UNIV_MEM_ASSERT_AND_FREE(): New macro, to check that the memory is
  writable before declaring it free (unwritable).  This replaces UNIV_MEM_FREE()
  in many places.
  
  mem_init_buf(): Check that the memory is writable, and declare it undefined.
  
  mem_erase_buf(): Check that the memory is writable, and declare it freed.
storage/innobase/dict/dict0dict.c:
  Apply snapshot innodb-5.1-ss1989
  
  Revision r1894:
  Add debug lock checks to autoinc functions. Add lock guards around an
  invocation of dict_table_autoinc_initialize().
storage/innobase/dict/dict0load.c:
  Apply snapshot innodb-5.1-ss1989
  
  Revision r1974:
  Prevent loading of tables that have unsupported features, most notably
  FTS indexes.
storage/innobase/handler/ha_innodb.cc:
  Apply snapshot innodb-5.1-ss1989
  
  Revision r1850:
  Implement this feature request:
  http://bugs.mysql.com/30706
  
  * Add a function that returns the number of microseconds since
    epoch - ut_time_us().
  
  * Add (innodb|innobase|srv)_replication_delay MySQL config parameter.
  
  * Add UT_WAIT_FOR() macro that waits for a specified condition to occur
    until a timeout elapses.
  
  * Using all of the above, handle the replication thread specially in
    srv_conc_enter_innodb().
  
  Approved by:	Heikki
  
  
  
  Revision r1887:
  Merge changes from MySQL AB:
  
  ChangeSet@1.2528.115.25  2007-08-27 18:18:14-06:00  tsmith@hindu.god
  
  Fix some Windows compiler warnings.
  
  dict0mem.c: Fix compiler warning with a cast.
  
  ha_innodb.cc: Change type to fix a compiler warning.
  
  
  Revision r1809:
  ha_innobase::external_lock(): Update prebuilt->mysql_has_locked and
  trx->n_mysql_tables_in_use only after row_lock_table_for_mysql()
  returns DB_SUCCESS.  A timeout on LOCK TABLES would lead to an
  inconsistent state, which would cause trx_free() to print a warning.
  
  This was later reported as Bug #31444.
  
  
  Revision r1833:
  Add /*== ... === */ decoration that was missing around some auto-inc functions.
  Add a missing comment, fix the length of a decoration.  Initialize the *value
  out parameter in ha_innobase::innobase_get_auto_increment().
  
  
  Revision r1866:
  Revert r1850 as MySQL did not approve the addition.
  
  log for r1850:
  
  Implement this feature request:
  http://bugs.mysql.com/30706
  
  * Add a function that returns the number of microseconds since
    epoch - ut_time_us().
  
  * Add (innodb|innobase|srv)_replication_delay MySQL config parameter.
  
  * Add UT_WAIT_FOR() macro that waits for a specified condition to occur
    until a timeout elapses.
  
  * Using all of the above, handle the replication thread specially in
    srv_conc_enter_innodb().
  
  
  Revision r1846:
  Add config option innodb_use_adaptive_hash_indexes to enable/disable
  adaptive hash indexes. It is enabled by default (no change in default
  behavior).
  
  Approved by:	Marko
  
  
  Revision r1974:
  Prevent loading of tables that have unsupported features, most notably
  FTS indexes.
  
  
  Revision r1829:
  Add assertion to enforce check of an implicit invariant and add comment about
  retry of autoinc read semantics. We always reread the table's autoinc counter
  after attempting to initialize it i.e., we want to guarantee that a read of
  autoinc value that is returned to the caller is always covered by the
  AUTOINC locking mechanism.
  
  
  Revision r1787:
  Move the prototype of innobase_print_identifier() from ut0ut.c to
  ha_prototypes.h.  Enclose the definitions in ha_prototypes.h in
  #ifndef UNIV_HOTBACKUP.
  
  
  Revision r1888:
  Merge a change from MySQL AB:
  
  ChangeSet@1.2528.115.30  2007-08-28 10:17:15-06:00  tsmith@hindu.god
  
  Fix another compiler warning on Windows in InnoDB.
  
  ha_innodb.cc:
  
  Fix compiler warning: ::get_auto_increment takes a ulonglong
  for nb_desired_values, but InnoDB's trx struct stores it as
  a ulint (unsigned long).  Probably harmless, as a single
  statement won't be asking for more than 2^32 rows.
  
  
  Revision r1987:
  Bug fix: When write_row() attempted to update the max autoinc value and
  was rolled back because of a deadlock, the deadlock error (transaction
  rollback) was not propagated back to MySQL.
  
  
  Revision r1889:
  Merge a change from MySQL AB:
  
  ChangeSet@1.2560  2007-09-21 10:15:16+02:00  gkodinov@local
  
  ha_innodb.cc: fixed type conversion warnings revealed by bug 30639 
  
  
  Revision r1989:
  Suppress printing of deadlock errors while reading the autoinc value.
  DB_DEADLOCK errors are part of normal processing and excessive printing
  of these error messages could be disconcerting for users. 
  
  
  Revision r1828:
  Fix two bugs:
  
  Bug# 30907: We don't rely on *first_value to be 0 when checking whether
  get_auto_increment() has been invoked for the first time in a multi-row
  INSERT. We instead use trx_t::n_autoinc_rows. Initialize trx::n_autoinc_rows
  inside ha_innobase::start_stmt() too.
  
  Bug# 30888: While adding code for the low level read of the AUTOINC value
  from the index, the case for MEDIUM ints which are 3 bytes was missed
  triggering an assertion.
storage/innobase/handler/ha_innodb.h:
  Apply snapshot innodb-5.1-ss1989
  
  Revision r1844:
  Remove the prototypes of some functions inside #if 0.
  The function definitions were removed in r1746.
storage/innobase/ibuf/ibuf0ibuf.c:
  Apply snapshot innodb-5.1-ss1989
  
  Revision r1965:
  ibuf_insert_to_index_page(): Fix typos in diagnostic output.
storage/innobase/include/db0err.h:
  Apply snapshot innodb-5.1-ss1989
  
  Revision r1974:
  Prevent loading of tables that have unsupported features, most notably
  FTS indexes.
storage/innobase/include/ha_prototypes.h:
  Apply snapshot innodb-5.1-ss1989
  
  Revision r1787:
  Move the prototype of innobase_print_identifier() from ut0ut.c to
  ha_prototypes.h.  Enclose the definitions in ha_prototypes.h in
  #ifndef UNIV_HOTBACKUP.
storage/innobase/include/mach0data.h:
  Apply snapshot innodb-5.1-ss1989
  
  Revision r1779:
  Fix a bug in the code that handles the case where the host-specific byte
  order matches the InnoDB storage byte order, which is big-endian.
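
To illustrate the storage byte order mentioned above, here is a minimal sketch of reading and writing big-endian integers one byte at a time, which gives the same result whatever the host byte order is. The helpers `read_be()` and `write_be()` are hypothetical names for this illustration and are not InnoDB's `mach_*` functions. The 3-byte width is included because it is the MEDIUM int case named under Bug #30888.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an InnoDB function): decode an unsigned
big-endian integer of len bytes (1..8), most significant byte first. */
static uint64_t
read_be(const unsigned char* buf, size_t len)
{
	uint64_t	value = 0;
	size_t		i;

	assert(len >= 1 && len <= 8);

	for (i = 0; i < len; i++) {
		/* Shift in one byte at a time; no host-order dependence. */
		value = (value << 8) | buf[i];
	}

	return(value);
}

/* Hypothetical helper: encode the low len bytes of value in
big-endian order, writing the least significant byte last. */
static void
write_be(unsigned char* buf, uint64_t value, size_t len)
{
	size_t	i;

	assert(len >= 1 && len <= 8);

	for (i = len; i > 0; i--) {
		buf[i - 1] = (unsigned char) (value & 0xFF);
		value >>= 8;
	}
}
```

Because the loops never reinterpret memory as a native integer, a host whose byte order happens to match the storage order needs no special case in this sketch.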
storage/innobase/include/mach0data.ic:
  Apply snapshot innodb-5.1-ss1989
  
  Revision r1779:
  Fix a bug in handling the case where the host-specific byte order matches
  the InnoDB storage byte order, which is big-endian.
storage/innobase/include/mem0dbg.h:
  Apply snapshot innodb-5.1-ss1989
  
  Revision r1830:
  Improve memory debugging.  This is follow-up to r1819.
  
  mem_heap_validate(): Compile this function also if UNIV_MEM_DEBUG is
  defined.  Previously, this function was only compiled with UNIV_DEBUG.
  
  mem_heap_free_heap_top(): Flag the memory allocated, not freed, for
  Valgrind.  Otherwise, Valgrind would complain on the second call of
  mem_heap_empty().
  
  UNIV_MEM_ASSERT_RW(), UNIV_MEM_ASSERT_W(): Display additional diagnostics
  for failed Valgrind checks.
storage/innobase/include/mem0mem.ic:
  Apply snapshot innodb-5.1-ss1989
  
  Revision r1830:
  Improve memory debugging.  This is follow-up to r1819.
  
  mem_heap_validate(): Compile this function also if UNIV_MEM_DEBUG is
  defined.  Previously, this function was only compiled with UNIV_DEBUG.
  
  mem_heap_free_heap_top(): Flag the memory allocated, not freed, for
  Valgrind.  Otherwise, Valgrind would complain on the second call of
  mem_heap_empty().
  
  UNIV_MEM_ASSERT_RW(), UNIV_MEM_ASSERT_W(): Display additional diagnostics
  for failed Valgrind checks.
  
  
  Revision r1937:
  mem_heap_free_top(): Remove a bogus Valgrind warning.
  
  
  Revision r1819:
  Merge r1815:1817 from branches/zip: Improve Valgrind instrumentation.
  
  UNIV_MEM_ASSERT_RW(): New macro, to check that the contents of a memory
  area is defined.
  
  UNIV_MEM_ASSERT_W(): New macro, to check that a memory area is writable.
  
  UNIV_MEM_ASSERT_AND_FREE(): New macro, to check that the memory is
  writable before declaring it free (unwritable).  This replaces UNIV_MEM_FREE()
  in many places.
  
  mem_init_buf(): Check that the memory is writable, and declare it undefined.
  
  mem_erase_buf(): Check that the memory is writable, and declare it freed.
storage/innobase/include/rem0rec.ic:
  Apply snapshot innodb-5.1-ss1989
  
  Revision r1918:
  Improve Valgrind instrumentation.
  
  rec_offs_set_n_alloc(): Use UNIV_MEM_ASSERT_AND_ALLOC().
  
  UNIV_MEM_ASSERT_AND_ALLOC(): New directive, similar to
  UNIV_MEM_ASSERT_AND_FREE().
storage/innobase/include/row0mysql.h:
  Apply snapshot innodb-5.1-ss1989
  
  Revision r1783:
  Correct the function comments of row_create_table_for_mysql() and
  row_drop_table_for_mysql().
storage/innobase/include/sync0rw.h:
  Apply snapshot innodb-5.1-ss1989
  
  Revision r1757:
  Enclose rw_lock_validate() in #ifdef UNIV_DEBUG.  It is only called by
  debug assertions.
storage/innobase/include/univ.i:
  Apply snapshot innodb-5.1-ss1989
  
  Revision r1827:
  Merge r1826 from branches/zip: UNIV_MEM_ASSERT_AND_FREE():
  Use UNIV_MEM_ASSERT_W() instead of UNIV_MEM_ASSERT_RW().
  The memory area need not be initialized.
  This mistake was made in r1815.
  
  
  Revision r1918:
  Improve Valgrind instrumentation.
  
  rec_offs_set_n_alloc(): Use UNIV_MEM_ASSERT_AND_ALLOC().
  
  UNIV_MEM_ASSERT_AND_ALLOC(): New directive, similar to
  UNIV_MEM_ASSERT_AND_FREE().
  
  
  Revision r1830:
  Improve memory debugging.  This is follow-up to r1819.
  
  mem_heap_validate(): Compile this function also if UNIV_MEM_DEBUG is
  defined.  Previously, this function was only compiled with UNIV_DEBUG.
  
  mem_heap_free_heap_top(): Flag the memory allocated, not freed, for
  Valgrind.  Otherwise, Valgrind would complain on the second call of
  mem_heap_empty().
  
  UNIV_MEM_ASSERT_RW(), UNIV_MEM_ASSERT_W(): Display additional diagnostics
  for failed Valgrind checks.
  
  
  Revision r1819:
  Merge r1815:1817 from branches/zip: Improve Valgrind instrumentation.
  
  UNIV_MEM_ASSERT_RW(): New macro, to check that the contents of a memory
  area is defined.
  
  UNIV_MEM_ASSERT_W(): New macro, to check that a memory area is writable.
  
  UNIV_MEM_ASSERT_AND_FREE(): New macro, to check that the memory is
  writable before declaring it free (unwritable).  This replaces UNIV_MEM_FREE()
  in many places.
  
  mem_init_buf(): Check that the memory is writable, and declare it undefined.
  
  mem_erase_buf(): Check that the memory is writable, and declare it freed.
  
  
  Revision r1948:
  UNIV_MEM_ASSERT_RW(), UNIV_MEM_ASSERT_W(): Display also __FILE__ and __LINE__
  when these Valgrind checks fail.
storage/innobase/include/ut0ut.h:
  Apply snapshot innodb-5.1-ss1989
  
  Revision r1850:
  Implement this feature request:
  http://bugs.mysql.com/30706
  
  * Add a function that returns the number of microseconds since
    epoch - ut_time_us().
  
  * Add (innodb|innobase|srv)_replication_delay MySQL config parameter.
  
  * Add UT_WAIT_FOR() macro that waits for a specified condition to occur
    until a timeout elapses.
  
  * Using all of the above, handle the replication thread specially in
    srv_conc_enter_innodb().
  
  Approved by:	Heikki
  
  
  
  Revision r1862:
  Add ut_snprintf() function. On Windows this needs to be implemented
  using auxiliary functions because there is no snprintf-variant on
  Windows that behaves exactly as specified in the standard:
  
  * Always return the number of characters that would have been printed
    if the size were unlimited (not including the final `\0').
  * Always '\0'-terminate the result
  * Do not touch the buffer if size=0, only return the number of characters
    that would have been printed. Can be used to estimate the size needed
    and to allocate it dynamically.
  
  See http://www.freebsd.org/cgi/query-pr.cgi?pr=87260 for the reason why
  2 ap variables are used.
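
The three rules listed above are the behavior of the standard C99 `snprintf()`. As a minimal sketch of the "estimate the size, then allocate" use, the following calls the standard `snprintf()` rather than `ut_snprintf()`; the helper `format_pair_alloc()` is a hypothetical name introduced only for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: format "<name>=<value>" into a freshly
allocated buffer of exactly the needed size. Returns NULL on
failure; the caller frees the result. */
static char*
format_pair_alloc(const char* name, long value)
{
	int	needed;
	char*	buf;

	/* size = 0: the buffer is not touched, only the length that
	would have been printed (without the final '\0') is returned. */
	needed = snprintf(NULL, 0, "%s=%ld", name, value);

	if (needed < 0) {
		return(NULL);
	}

	buf = (char*) malloc((size_t) needed + 1);

	if (buf == NULL) {
		return(NULL);
	}

	/* Second pass: the buffer is large enough, so the result is
	complete and '\0'-terminated. */
	snprintf(buf, (size_t) needed + 1, "%s=%ld", name, value);

	return(buf);
}
```

A caller would use it as `char* s = format_pair_alloc("rows", 42);` and release the string with `free(s)`.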
  
  Approved by:	Heikki
  
  
  Revision r1866:
  Revert r1850 as MySQL did not approve the addition.
  
  log for r1850:
  
  Implement this feature request:
  http://bugs.mysql.com/30706
  
  * Add a function that returns the number of microseconds since
    epoch - ut_time_us().
  
  * Add (innodb|innobase|srv)_replication_delay MySQL config parameter.
  
  * Add UT_WAIT_FOR() macro that waits for a specified condition to occur
    until a timeout elapses.
  
  * Using all of the above, handle the replication thread specially in
    srv_conc_enter_innodb().
storage/innobase/mem/mem0dbg.c:
  Apply snapshot innodb-5.1-ss1989
  
  Revision r1830:
  Improve memory debugging.  This is follow-up to r1819.
  
  mem_heap_validate(): Compile this function also if UNIV_MEM_DEBUG is
  defined.  Previously, this function was only compiled with UNIV_DEBUG.
  
  mem_heap_free_heap_top(): Flag the memory allocated, not freed, for
  Valgrind.  Otherwise, Valgrind would complain on the second call of
  mem_heap_empty().
  
  UNIV_MEM_ASSERT_RW(), UNIV_MEM_ASSERT_W(): Display additional diagnostics
  for failed Valgrind checks.
  
  
  Revision r1819:
  Merge r1815:1817 from branches/zip: Improve Valgrind instrumentation.
  
  UNIV_MEM_ASSERT_RW(): New macro, to check that the contents of a memory
  area is defined.
  
  UNIV_MEM_ASSERT_W(): New macro, to check that a memory area is writable.
  
  UNIV_MEM_ASSERT_AND_FREE(): New macro, to check that the memory is
  writable before declaring it free (unwritable).  This replaces UNIV_MEM_FREE()
  in many places.
  
  mem_init_buf(): Check that the memory is writable, and declare it undefined.
  
  mem_erase_buf(): Check that the memory is writable, and declare it freed.
storage/innobase/mem/mem0mem.c:
  Apply snapshot innodb-5.1-ss1989
  
  Revision r1819:
  Merge r1815:1817 from branches/zip: Improve Valgrind instrumentation.
  
  UNIV_MEM_ASSERT_RW(): New macro, to check that the contents of a memory
  area is defined.
  
  UNIV_MEM_ASSERT_W(): New macro, to check that a memory area is writable.
  
  UNIV_MEM_ASSERT_AND_FREE(): New macro, to check that the memory is
  writable before declaring it free (unwritable).  This replaces UNIV_MEM_FREE()
  in many places.
  
  mem_init_buf(): Check that the memory is writable, and declare it undefined.
  
  mem_erase_buf(): Check that the memory is writable, and declare it freed.
storage/innobase/row/row0mysql.c:
  Apply snapshot innodb-5.1-ss1989
  
  Revision r1786:
  row_create_table_for_mysql(), row_truncate_table_for_mysql(),
  row_drop_table_for_mysql(): Do not mention innodb_force_recovery
  when newraw is set.
  
  
  Revision r1790:
  row_drop_table_for_mysql(): Before calling
  dict_table_remove_from_cache(table) and thus freeing the memory
  allocated for the table, copy the table name.  This avoids reading
  freed memory when name == table->name.
  
  Approved by Sunny.
  
  
  Revision r1783:
  Correct the function comments of row_create_table_for_mysql() and
  row_drop_table_for_mysql().
  
  
  Revision r1894:
  Add debug lock checks to autoinc functions. Add lock guards around an
  invocation of dict_table_autoinc_initialize().
storage/innobase/row/row0sel.c:
  Apply snapshot innodb-5.1-ss1989
  
  Revision r1782:
  Add comment that the variable dest should be word aligned. After discussion
  on IM with Heikki.
  
  
  Revision r1988:
  Set an error code when a deadlock occurs in semi-consistent read.  (Bug #31494)
  
  innodb-semi-consistent: New tests for InnoDB semi-consistent reads.
  Unfortunately, these will not trigger Bug #31494, because they merely
  cause lock wait timeouts, not deadlocks.
  
  
  Revision r1820:
  Use the clustered index and not the one selected by the optimizer in the plan,
  when building a previous version of the row. This bug is triggered when
  running queries via InnoDB's internal SQL parser; when InnoDB's optimizer
  selects a secondary index for the plan.
  
  
  Revision r1828:
  Fix two bugs:
  
  Bug# 30907: We don't rely on *first_value to be 0 when checking whether
  get_auto_increment() has been invoked for the first time in a multi-row
  INSERT. We instead use trx_t::n_autoinc_rows. Initialize trx::n_autoinc_rows
  inside ha_innobase::start_stmt() too.
  
  Bug# 30888: While adding code for the low level read of the AUTOINC value
  from the index, the case for MEDIUM ints which are 3 bytes was missed
  triggering an assertion.
  
  
  Revision r1779:
  Fix a bug in handling the case where the host-specific byte order matches
  the InnoDB storage byte order, which is big-endian.
storage/innobase/sync/sync0rw.c:
  Apply snapshot innodb-5.1-ss1989
  
  Revision r1757:
  Enclose rw_lock_validate() in #ifdef UNIV_DEBUG.  It is only called by
  debug assertions.
storage/innobase/ut/ut0ut.c:
  Apply snapshot innodb-5.1-ss1989
  
  Revision r1850:
  Implement this feature request:
  http://bugs.mysql.com/30706
  
  * Add a function that returns the number of microseconds since
    epoch - ut_time_us().
  
  * Add (innodb|innobase|srv)_replication_delay MySQL config parameter.
  
  * Add UT_WAIT_FOR() macro that waits for a specified condition to occur
    until a timeout elapses.
  
  * Using all of the above, handle the replication thread specially in
    srv_conc_enter_innodb().
  
  Approved by:	Heikki
  
  
  
  Revision r1873:
  snprintf() should always return non-negative result. According to
  Microsoft documentation about _vscprintf():
  
    If format is a null pointer, the invalid parameter handler is invoked,
    as described in Parameter Validation. If execution is allowed to
    continue, the functions return -1 and set errno to EINVAL.
  
  The UNIX variant of snprintf() segfaults if format is a NULL pointer
  (similar to strlen(NULL) for example), so it is better to conform to
  this behavior and crash our custom Windows version instead of
  returning -1. No one would expect -1 to be returned from snprintf().
  
  Cosmetic: Add a space after typecast.
  
  Approved by:	Marko
  
  
  Revision r1862:
  Add ut_snprintf() function. On Windows this needs to be implemented
  using auxiliary functions because there is no snprintf-variant on
  Windows that behaves exactly as specified in the standard:
  
  * Always return the number of characters that would have been printed
    if the size were unlimited (not including the final `\0').
  * Always '\0'-terminate the result
  * Do not touch the buffer if size=0, only return the number of characters
    that would have been printed. Can be used to estimate the size needed
    and to allocate it dynamically.
  
  See http://www.freebsd.org/cgi/query-pr.cgi?pr=87260 for the reason why
  2 ap variables are used.
  
  Approved by:	Heikki
  
  
  Revision r1866:
  Revert r1850 as MySQL did not approve the addition.
  
  log for r1850:
  
  Implement this feature request:
  http://bugs.mysql.com/30706
  
  * Add a function that returns the number of microseconds since
    epoch - ut_time_us().
  
  * Add (innodb|innobase|srv)_replication_delay MySQL config parameter.
  
  * Add UT_WAIT_FOR() macro that waits for a specified condition to occur
    until a timeout elapses.
  
  * Using all of the above, handle the replication thread specially in
    srv_conc_enter_innodb().
  
  
  Revision r1787:
  Move the prototype of innobase_print_identifier() from ut0ut.c to
  ha_prototypes.h.  Enclose the definitions in ha_prototypes.h in
  #ifndef UNIV_HOTBACKUP.
  
  
  Revision r1789:
  ut_print_namel(): Do not assume that all '/' are separators between
  database and table names.
  
  Approved by Heikki.
  
  
  Revision r1936:
  ut_print_buf(): Add a Valgrind check that the buffer is wholly defined.
mysql-test/r/innodb-semi-consistent.result:
  Apply snapshot innodb-5.1-ss1989
  
  
  Revision r1988:
  Set an error code when a deadlock occurs in semi-consistent read.  (Bug #31494)
  
  innodb-semi-consistent: New tests for InnoDB semi-consistent reads.
  Unfortunately, these will not trigger Bug #31494, because they merely
  cause lock wait timeouts, not deadlocks.
mysql-test/r/innodb_autoinc_lock_mode_zero.result:
  New test, using read-only setting --innodb-autoinc-lock-mode=0
mysql-test/t/innodb-semi-consistent-master.opt:
  Apply snapshot innodb-5.1-ss1989
  
  
  Revision r1988:
  Set an error code when a deadlock occurs in semi-consistent read.  (Bug #31494)
  
  innodb-semi-consistent: New tests for InnoDB semi-consistent reads.
  Unfortunately, these will not trigger Bug #31494, because they merely
  cause lock wait timeouts, not deadlocks.
mysql-test/t/innodb-semi-consistent.test:
  Apply snapshot innodb-5.1-ss1989
  
  
  Revision r1988:
  Set an error code when a deadlock occurs in semi-consistent read.  (Bug #31494)
  
  innodb-semi-consistent: New tests for InnoDB semi-consistent reads.
  Unfortunately, these will not trigger Bug #31494, because they merely
  cause lock wait timeouts, not deadlocks.
mysql-test/t/innodb_autoinc_lock_mode_zero-master.opt:
  New test, using read-only setting --innodb-autoinc-lock-mode=0
mysql-test/t/innodb_autoinc_lock_mode_zero.test:
  New test, using read-only setting --innodb-autoinc-lock-mode=0
2007-11-06 15:42:58 -07:00


/************************************************************************
Record manager
(c) 1994-1996 Innobase Oy
Created 5/30/1994 Heikki Tuuri
*************************************************************************/
#include "mach0data.h"
#include "ut0byte.h"
#include "dict0dict.h"
/* Compact flag ORed to the extra size returned by rec_get_offsets() */
#define REC_OFFS_COMPACT ((ulint) 1 << 31)
/* SQL NULL flag in offsets returned by rec_get_offsets() */
#define REC_OFFS_SQL_NULL ((ulint) 1 << 31)
/* External flag in offsets returned by rec_get_offsets() */
#define REC_OFFS_EXTERNAL ((ulint) 1 << 30)
/* Mask for offsets returned by rec_get_offsets() */
#define REC_OFFS_MASK (REC_OFFS_EXTERNAL - 1)
/* Offsets of the bit-fields in an old-style record. NOTE! In the table the
most significant bytes and bits are written below less significant.

	(1) byte offset		(2) bit usage within byte
	downward from
	origin ->	1	8 bits pointer to next record
			2	8 bits pointer to next record
			3	1 bit short flag
				7 bits number of fields
			4	3 bits number of fields
				5 bits heap number
			5	8 bits heap number
			6	4 bits n_owned
				4 bits info bits
*/
/* Offsets of the bit-fields in a new-style record. NOTE! In the table the
most significant bytes and bits are written below less significant.

	(1) byte offset		(2) bit usage within byte
	downward from
	origin ->	1	8 bits relative offset of next record
			2	8 bits relative offset of next record
				  the relative offset is an unsigned 16-bit
				  integer:
				  (offset_of_next_record
				   - offset_of_this_record) mod 64Ki,
				  where mod is the modulo as a non-negative
				  number;
				  we can calculate the offset of the next
				  record with the formula:
				  relative_offset + offset_of_this_record
				  mod UNIV_PAGE_SIZE
			3	3 bits status:
					000=conventional record
					001=node pointer record (inside B-tree)
					010=infimum record
					011=supremum record
					1xx=reserved
				5 bits heap number
			4	8 bits heap number
			5	4 bits n_owned
				4 bits info bits
*/
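
The relative-offset formula in the comment above can be worked through with a small sketch. It assumes a 16 KiB page (the constant `PAGE_SIZE_ASSUMED` stands in for UNIV_PAGE_SIZE and is an assumption of this example), and `encode_next()` and `decode_next()` are hypothetical helper names, not functions from this file. The sketch leaves out the "stored value 0 means no next record" case that rec_get_next_offs() handles.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: 16 KiB pages. */
#define PAGE_SIZE_ASSUMED	16384UL

/* Hypothetical helper: the stored 16-bit value is
(offset_of_next_record - offset_of_this_record) mod 64Ki,
taken as a non-negative number. */
static uint16_t
encode_next(unsigned long this_offs, unsigned long next_offs)
{
	/* Unsigned subtraction wraps; masking keeps the low 16 bits. */
	return((uint16_t) ((next_offs - this_offs) & 0xFFFFUL));
}

/* Hypothetical helper: recover the page offset of the next record as
(relative_offset + offset_of_this_record) mod page size. */
static unsigned long
decode_next(unsigned long this_offs, uint16_t relative_offs)
{
	return((this_offs + relative_offs) % PAGE_SIZE_ASSUMED);
}
```

For a record at page offset 200 whose successor sits at offset 120, the stored value is 65536 - 80 = 65456, and (200 + 65456) mod 16384 = 120. The backward link decodes correctly because 64Ki is a multiple of the assumed page size.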
/* We list the byte offsets from the origin of the record, the mask,
and the shift needed to obtain each bit-field of the record. */
#define REC_NEXT 2
#define REC_NEXT_MASK 0xFFFFUL
#define REC_NEXT_SHIFT 0
#define REC_OLD_SHORT 3 /* This is single byte bit-field */
#define REC_OLD_SHORT_MASK 0x1UL
#define REC_OLD_SHORT_SHIFT 0
#define REC_OLD_N_FIELDS 4
#define REC_OLD_N_FIELDS_MASK 0x7FEUL
#define REC_OLD_N_FIELDS_SHIFT 1
#define REC_NEW_STATUS 3 /* This is single byte bit-field */
#define REC_NEW_STATUS_MASK 0x7UL
#define REC_NEW_STATUS_SHIFT 0
#define REC_OLD_HEAP_NO 5
#define REC_NEW_HEAP_NO 4
#define REC_HEAP_NO_MASK 0xFFF8UL
#define REC_HEAP_NO_SHIFT 3
#define REC_OLD_N_OWNED 6 /* This is single byte bit-field */
#define REC_NEW_N_OWNED 5 /* This is single byte bit-field */
#define REC_N_OWNED_MASK 0xFUL
#define REC_N_OWNED_SHIFT 0
#define REC_OLD_INFO_BITS 6 /* This is single byte bit-field */
#define REC_NEW_INFO_BITS 5 /* This is single byte bit-field */
#define REC_INFO_BITS_MASK 0xF0UL
#define REC_INFO_BITS_SHIFT 0
/* The deleted flag in info bits */
#define REC_INFO_DELETED_FLAG 0x20UL /* when bit is set to 1, it means the
record has been delete marked */
/* The following masks are used to filter the SQL null bit from
one-byte and two-byte offsets */
#define REC_1BYTE_SQL_NULL_MASK 0x80UL
#define REC_2BYTE_SQL_NULL_MASK 0x8000UL
/* In a 2-byte offset the second most significant bit denotes
a field stored to another page: */
#define REC_2BYTE_EXTERN_MASK 0x4000UL
#if REC_OLD_SHORT_MASK << (8 * (REC_OLD_SHORT - 3)) \
^ REC_OLD_N_FIELDS_MASK << (8 * (REC_OLD_N_FIELDS - 4)) \
^ REC_HEAP_NO_MASK << (8 * (REC_OLD_HEAP_NO - 4)) \
^ REC_N_OWNED_MASK << (8 * (REC_OLD_N_OWNED - 3)) \
^ REC_INFO_BITS_MASK << (8 * (REC_OLD_INFO_BITS - 3)) \
^ 0xFFFFFFFFUL
# error "sum of old-style masks != 0xFFFFFFFFUL"
#endif
#if REC_NEW_STATUS_MASK << (8 * (REC_NEW_STATUS - 3)) \
^ REC_HEAP_NO_MASK << (8 * (REC_NEW_HEAP_NO - 4)) \
^ REC_N_OWNED_MASK << (8 * (REC_NEW_N_OWNED - 3)) \
^ REC_INFO_BITS_MASK << (8 * (REC_NEW_INFO_BITS - 3)) \
^ 0xFFFFFFUL
# error "sum of new-style masks != 0xFFFFFFUL"
#endif
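
The two preprocessor checks above move each mask to its byte position and XOR the results; if every header bit is claimed by exactly one mask, the XOR equals an all-ones constant. The following sketch evaluates the same expressions at run time. The mask and offset values are copied from the definitions above, while the function names `old_style_mask_xor()` and `new_style_mask_xor()` are hypothetical and exist only in this illustration.

```c
#include <assert.h>

/* Values copied from the definitions above. */
#define REC_OLD_SHORT		3
#define REC_OLD_SHORT_MASK	0x1UL
#define REC_OLD_N_FIELDS	4
#define REC_OLD_N_FIELDS_MASK	0x7FEUL
#define REC_NEW_STATUS		3
#define REC_NEW_STATUS_MASK	0x7UL
#define REC_OLD_HEAP_NO		5
#define REC_NEW_HEAP_NO		4
#define REC_HEAP_NO_MASK	0xFFF8UL
#define REC_OLD_N_OWNED		6
#define REC_NEW_N_OWNED		5
#define REC_N_OWNED_MASK	0xFUL
#define REC_OLD_INFO_BITS	6
#define REC_NEW_INFO_BITS	5
#define REC_INFO_BITS_MASK	0xF0UL

/* Hypothetical helper: XOR of the old-style masks over bytes 3..6,
with byte 3 as the least significant byte of the result. */
static unsigned long
old_style_mask_xor(void)
{
	return((REC_OLD_SHORT_MASK << (8 * (REC_OLD_SHORT - 3)))
	       ^ (REC_OLD_N_FIELDS_MASK << (8 * (REC_OLD_N_FIELDS - 4)))
	       ^ (REC_HEAP_NO_MASK << (8 * (REC_OLD_HEAP_NO - 4)))
	       ^ (REC_N_OWNED_MASK << (8 * (REC_OLD_N_OWNED - 3)))
	       ^ (REC_INFO_BITS_MASK << (8 * (REC_OLD_INFO_BITS - 3))));
}

/* Hypothetical helper: the same for the new-style masks, bytes 3..5. */
static unsigned long
new_style_mask_xor(void)
{
	return((REC_NEW_STATUS_MASK << (8 * (REC_NEW_STATUS - 3)))
	       ^ (REC_HEAP_NO_MASK << (8 * (REC_NEW_HEAP_NO - 4)))
	       ^ (REC_N_OWNED_MASK << (8 * (REC_NEW_N_OWNED - 3)))
	       ^ (REC_INFO_BITS_MASK << (8 * (REC_NEW_INFO_BITS - 3))));
}
```

A bit left uncovered, or a bit claimed by two masks, would show up as a 0 in the XOR, so the comparison against 0xFFFFFFFFUL (old style) or 0xFFFFFFUL (new style) would fail.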
/***************************************************************
Sets the value of the ith field SQL null bit of an old-style record. */

void
rec_set_nth_field_null_bit(
/*=======================*/
	rec_t*	rec,	/* in: record */
	ulint	i,	/* in: ith field */
	ibool	val);	/* in: value to set */
/***************************************************************
Sets an old-style record field to SQL null.
The physical size of the field is not changed. */

void
rec_set_nth_field_sql_null(
/*=======================*/
	rec_t*	rec,	/* in: record */
	ulint	n);	/* in: index of the field */
/***************************************************************
Sets the value of the ith field extern storage bit of an old-style record. */

void
rec_set_nth_field_extern_bit_old(
/*=============================*/
	rec_t*	rec,	/* in: old-style record */
	ulint	i,	/* in: ith field */
	ibool	val,	/* in: value to set */
	mtr_t*	mtr);	/* in: mtr holding an X-latch to the page where
			rec is, or NULL; in the NULL case we do not
			write to log about the change */
/***************************************************************
Sets the value of the ith field extern storage bit of a new-style record. */

void
rec_set_nth_field_extern_bit_new(
/*=============================*/
	rec_t*		rec,	/* in: record */
	dict_index_t*	index,	/* in: record descriptor */
	ulint		ith,	/* in: ith field */
	ibool		val,	/* in: value to set */
	mtr_t*		mtr);	/* in: mtr holding an X-latch to the page
				where rec is, or NULL; in the NULL case
				we do not write to log about the change */
/**********************************************************
Gets a bit field from within 1 byte. */
UNIV_INLINE
ulint
rec_get_bit_field_1(
/*================*/
	rec_t*	rec,	/* in: pointer to record origin */
	ulint	offs,	/* in: offset from the origin down */
	ulint	mask,	/* in: mask used to filter bits */
	ulint	shift)	/* in: shift right applied after masking */
{
	ut_ad(rec);

	return((mach_read_from_1(rec - offs) & mask) >> shift);
}
/**********************************************************
Sets a bit field within 1 byte. */
UNIV_INLINE
void
rec_set_bit_field_1(
/*================*/
	rec_t*	rec,	/* in: pointer to record origin */
	ulint	val,	/* in: value to set */
	ulint	offs,	/* in: offset from the origin down */
	ulint	mask,	/* in: mask used to filter bits */
	ulint	shift)	/* in: shift right applied after masking */
{
	ut_ad(rec);
	ut_ad(offs <= REC_N_OLD_EXTRA_BYTES);
	ut_ad(mask);
	ut_ad(mask <= 0xFFUL);
	ut_ad(((mask >> shift) << shift) == mask);
	ut_ad(((val << shift) & mask) == (val << shift));

	mach_write_to_1(rec - offs,
			(mach_read_from_1(rec - offs) & ~mask)
			| (val << shift));
}
/**********************************************************
Gets a bit field from within 2 bytes. */
UNIV_INLINE
ulint
rec_get_bit_field_2(
/*================*/
	rec_t*	rec,	/* in: pointer to record origin */
	ulint	offs,	/* in: offset from the origin down */
	ulint	mask,	/* in: mask used to filter bits */
	ulint	shift)	/* in: shift right applied after masking */
{
	ut_ad(rec);

	return((mach_read_from_2(rec - offs) & mask) >> shift);
}
/**********************************************************
Sets a bit field within 2 bytes. */
UNIV_INLINE
void
rec_set_bit_field_2(
/*================*/
rec_t* rec, /* in: pointer to record origin */
ulint val, /* in: value to set */
ulint offs, /* in: offset from the origin down */
ulint mask, /* in: mask used to filter bits */
ulint shift) /* in: shift left applied to val before it is stored */
{
ut_ad(rec);
ut_ad(offs <= REC_N_OLD_EXTRA_BYTES);
ut_ad(mask > 0xFFUL);
ut_ad(mask <= 0xFFFFUL);
ut_ad((mask >> shift) & 1);
ut_ad(0 == ((mask >> shift) & ((mask >> shift) + 1)));
ut_ad(((mask >> shift) << shift) == mask);
ut_ad(((val << shift) & mask) == (val << shift));
mach_write_to_2(rec - offs,
(mach_read_from_2(rec - offs) & ~mask)
| (val << shift));
}
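/**********************************************************
Illustrative sketch only, not part of InnoDB. The bit-field accessors
above read or rewrite a few bits of a header byte located offs bytes
below the record origin. The standalone version below shows the same
mask-and-shift arithmetic for the 1-byte case; the sketch_ names and
the plain unsigned char buffer are assumptions made for illustration.

```c
#include <assert.h>

// Hypothetical stand-in for rec_get_bit_field_1(): read the byte that
// lies `offs` bytes below `origin`, keep the masked bits, shift them down.
static unsigned long
sketch_get_bit_field_1(const unsigned char* origin, unsigned long offs,
		       unsigned long mask, unsigned long shift)
{
	return((*(origin - offs) & mask) >> shift);
}

// Hypothetical stand-in for rec_set_bit_field_1(): clear the masked
// bits of the same byte, then OR in `val` shifted up into position.
static void
sketch_set_bit_field_1(unsigned char* origin, unsigned long val,
		       unsigned long offs, unsigned long mask,
		       unsigned long shift)
{
	assert(mask != 0 && mask <= 0xFFUL);
	// The mask must not have bits below the shift position.
	assert(((mask >> shift) << shift) == mask);
	// The shifted value must fit entirely inside the mask.
	assert(((val << shift) & mask) == (val << shift));

	*(origin - offs) = (unsigned char)
		((*(origin - offs) & ~mask) | (val << shift));
}
```

Bits outside the mask are preserved by the setter, which is why two
fields can share one header byte. */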
/**********************************************************
The following function is used to get the offset of the next chained record
on the same page. */
UNIV_INLINE
ulint
rec_get_next_offs(
/*==============*/
/* out: the page offset of the next chained record, or
0 if none */
rec_t* rec, /* in: physical record */
ulint comp) /* in: nonzero=compact page format */
{
ulint field_value;
#if REC_NEXT_MASK != 0xFFFFUL
# error "REC_NEXT_MASK != 0xFFFFUL"
#endif
#if REC_NEXT_SHIFT
# error "REC_NEXT_SHIFT != 0"
#endif
field_value = mach_read_from_2(rec - REC_NEXT);
if (comp) {
#if UNIV_PAGE_SIZE <= 32768
/* Note that for 64 KiB pages, field_value can 'wrap around'
and the debug assertion is not valid */
/* In the following assertion, field_value is interpreted
as a signed 16-bit integer in two's complement arithmetic.
If all platforms defined int16_t in the standard headers,
the expression could be written more simply as
(int16_t) field_value + ut_align_offset(...) < UNIV_PAGE_SIZE
*/
ut_ad((field_value >= 32768
? field_value - 65536
: field_value)
+ ut_align_offset(rec, UNIV_PAGE_SIZE)
< UNIV_PAGE_SIZE);
#endif
if (field_value == 0) {
return(0);
}
return(ut_align_offset(rec + field_value, UNIV_PAGE_SIZE));
} else {
ut_ad(field_value < UNIV_PAGE_SIZE);
return(field_value);
}
}
/**********************************************************
The following function is used to set the next record offset field of the
record. */
UNIV_INLINE
void
rec_set_next_offs(
/*==============*/
rec_t* rec, /* in: physical record */
ulint comp, /* in: nonzero=compact page format */
ulint next) /* in: offset of the next record, or 0 if none */
{
ut_ad(rec);
ut_ad(UNIV_PAGE_SIZE > next);
#if REC_NEXT_MASK != 0xFFFFUL
# error "REC_NEXT_MASK != 0xFFFFUL"
#endif
#if REC_NEXT_SHIFT
# error "REC_NEXT_SHIFT != 0"
#endif
if (comp) {
ulint field_value;
if (next) {
/* The following two statements calculate
next - offset_of_rec mod 64Ki, where mod is the modulo
as a non-negative number */
field_value = (ulint)((lint)next
- (lint)ut_align_offset(
rec, UNIV_PAGE_SIZE));
field_value &= REC_NEXT_MASK;
} else {
field_value = 0;
}
mach_write_to_2(rec - REC_NEXT, field_value);
} else {
mach_write_to_2(rec - REC_NEXT, next);
}
}
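/**********************************************************
Illustrative sketch only, not part of InnoDB. In the compact format the
next-record field holds the distance to the next record modulo 64Ki
rather than an absolute page offset, with 0 reserved for "no next
record". The sketch below reproduces that encoding on plain page
offsets; the sketch_ names and the 16 KiB page size are assumptions.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 16384UL	// assumed page size (power of two)

// Encode: (next - rec) mod 64Ki as a non-negative number, or 0 if none.
static unsigned long
sketch_encode_next(unsigned long rec_offs, unsigned long next_offs)
{
	assert(rec_offs < SKETCH_PAGE_SIZE);
	assert(next_offs < SKETCH_PAGE_SIZE);

	if (next_offs == 0) {
		return(0);
	}
	// Unsigned subtraction wraps around, so masking with 0xFFFF
	// yields the two's complement 16-bit difference.
	return((next_offs - rec_offs) & 0xFFFFUL);
}

// Decode: add the stored difference and reduce to an offset in the page.
static unsigned long
sketch_decode_next(unsigned long rec_offs, unsigned long field_value)
{
	if (field_value == 0) {
		return(0);
	}
	return((rec_offs + field_value) & (SKETCH_PAGE_SIZE - 1));
}
```

A next record that lies before the current one is stored as a large
value (for example 65456 for a distance of -80) and still decodes to
the correct page offset. */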
/**********************************************************
The following function is used to get the number of fields
in an old-style record. */
UNIV_INLINE
ulint
rec_get_n_fields_old(
/*=================*/
/* out: number of data fields */
rec_t* rec) /* in: physical record */
{
ulint ret;
ut_ad(rec);
ret = rec_get_bit_field_2(rec, REC_OLD_N_FIELDS,
REC_OLD_N_FIELDS_MASK,
REC_OLD_N_FIELDS_SHIFT);
ut_ad(ret <= REC_MAX_N_FIELDS);
ut_ad(ret > 0);
return(ret);
}
/**********************************************************
The following function is used to set the number of fields
in an old-style record. */
UNIV_INLINE
void
rec_set_n_fields_old(
/*=================*/
rec_t* rec, /* in: physical record */
ulint n_fields) /* in: the number of fields */
{
ut_ad(rec);
ut_ad(n_fields <= REC_MAX_N_FIELDS);
ut_ad(n_fields > 0);
rec_set_bit_field_2(rec, n_fields, REC_OLD_N_FIELDS,
REC_OLD_N_FIELDS_MASK, REC_OLD_N_FIELDS_SHIFT);
}
/**********************************************************
The following function retrieves the status bits of a new-style record. */
UNIV_INLINE
ulint
rec_get_status(
/*===========*/
/* out: status bits */
rec_t* rec) /* in: physical record */
{
ulint ret;
ut_ad(rec);
ret = rec_get_bit_field_1(rec, REC_NEW_STATUS,
REC_NEW_STATUS_MASK, REC_NEW_STATUS_SHIFT);
ut_ad((ret & ~REC_NEW_STATUS_MASK) == 0);
return(ret);
}
/**********************************************************
The following function is used to get the number of fields
in a record. */
UNIV_INLINE
ulint
rec_get_n_fields(
/*=============*/
/* out: number of data fields */
rec_t* rec, /* in: physical record */
dict_index_t* index) /* in: record descriptor */
{
ut_ad(rec);
ut_ad(index);
if (!dict_table_is_comp(index->table)) {
return(rec_get_n_fields_old(rec));
}
switch (rec_get_status(rec)) {
case REC_STATUS_ORDINARY:
return(dict_index_get_n_fields(index));
case REC_STATUS_NODE_PTR:
return(dict_index_get_n_unique_in_tree(index) + 1);
case REC_STATUS_INFIMUM:
case REC_STATUS_SUPREMUM:
return(1);
default:
ut_error;
return(ULINT_UNDEFINED);
}
}
/**********************************************************
The following function is used to get the number of records owned by the
previous directory record. */
UNIV_INLINE
ulint
rec_get_n_owned(
/*============*/
/* out: number of owned records */
rec_t* rec, /* in: physical record */
ulint comp) /* in: nonzero=compact page format */
{
ulint ret;
ut_ad(rec);
ret = rec_get_bit_field_1(rec,
comp ? REC_NEW_N_OWNED : REC_OLD_N_OWNED,
REC_N_OWNED_MASK, REC_N_OWNED_SHIFT);
ut_ad(ret <= REC_MAX_N_OWNED);
return(ret);
}
/**********************************************************
The following function is used to set the number of owned records. */
UNIV_INLINE
void
rec_set_n_owned(
/*============*/
rec_t* rec, /* in: physical record */
ulint comp, /* in: nonzero=compact page format */
ulint n_owned) /* in: the number of owned records */
{
ut_ad(rec);
ut_ad(n_owned <= REC_MAX_N_OWNED);
rec_set_bit_field_1(rec, n_owned,
comp ? REC_NEW_N_OWNED : REC_OLD_N_OWNED,
REC_N_OWNED_MASK, REC_N_OWNED_SHIFT);
}
/**********************************************************
The following function is used to retrieve the info bits of a record. */
UNIV_INLINE
ulint
rec_get_info_bits(
/*==============*/
/* out: info bits */
rec_t* rec, /* in: physical record */
ulint comp) /* in: nonzero=compact page format */
{
ulint ret;
ut_ad(rec);
ret = rec_get_bit_field_1(rec,
comp ? REC_NEW_INFO_BITS : REC_OLD_INFO_BITS,
REC_INFO_BITS_MASK, REC_INFO_BITS_SHIFT);
ut_ad((ret & ~REC_INFO_BITS_MASK) == 0);
return(ret);
}
/**********************************************************
The following function is used to set the info bits of a record. */
UNIV_INLINE
void
rec_set_info_bits(
/*==============*/
rec_t* rec, /* in: physical record */
ulint comp, /* in: nonzero=compact page format */
ulint bits) /* in: info bits */
{
ut_ad(rec);
ut_ad((bits & ~REC_INFO_BITS_MASK) == 0);
rec_set_bit_field_1(rec, bits,
comp ? REC_NEW_INFO_BITS : REC_OLD_INFO_BITS,
REC_INFO_BITS_MASK, REC_INFO_BITS_SHIFT);
}
/**********************************************************
The following function is used to set the status bits of a new-style record. */
UNIV_INLINE
void
rec_set_status(
/*===========*/
rec_t* rec, /* in: physical record */
ulint bits) /* in: status bits */
{
ut_ad(rec);
ut_ad((bits & ~REC_NEW_STATUS_MASK) == 0);
rec_set_bit_field_1(rec, bits, REC_NEW_STATUS,
REC_NEW_STATUS_MASK, REC_NEW_STATUS_SHIFT);
}
/**********************************************************
The following function is used to retrieve the info and status
bits of a record. (Only compact records have status bits.) */
UNIV_INLINE
ulint
rec_get_info_and_status_bits(
/*=========================*/
/* out: info bits, ORed with the status bits if compact */
rec_t* rec, /* in: physical record */
ulint comp) /* in: nonzero=compact page format */
{
ulint bits;
#if (REC_NEW_STATUS_MASK >> REC_NEW_STATUS_SHIFT) \
& (REC_INFO_BITS_MASK >> REC_INFO_BITS_SHIFT)
# error "REC_NEW_STATUS_MASK and REC_INFO_BITS_MASK overlap"
#endif
if (UNIV_EXPECT(comp, REC_OFFS_COMPACT)) {
bits = rec_get_info_bits(rec, TRUE) | rec_get_status(rec);
} else {
bits = rec_get_info_bits(rec, FALSE);
ut_ad(!(bits & ~(REC_INFO_BITS_MASK >> REC_INFO_BITS_SHIFT)));
}
return(bits);
}
/**********************************************************
The following function is used to set the info and status
bits of a record. (Only compact records have status bits.) */
UNIV_INLINE
void
rec_set_info_and_status_bits(
/*=========================*/
rec_t* rec, /* in: physical record */
ulint comp, /* in: nonzero=compact page format */
ulint bits) /* in: info bits, ORed with the status bits if compact */
{
#if (REC_NEW_STATUS_MASK >> REC_NEW_STATUS_SHIFT) \
& (REC_INFO_BITS_MASK >> REC_INFO_BITS_SHIFT)
# error "REC_NEW_STATUS_MASK and REC_INFO_BITS_MASK overlap"
#endif
if (comp) {
rec_set_status(rec, bits & REC_NEW_STATUS_MASK);
} else {
ut_ad(!(bits & ~(REC_INFO_BITS_MASK >> REC_INFO_BITS_SHIFT)));
}
rec_set_info_bits(rec, comp, bits & ~REC_NEW_STATUS_MASK);
}
/**********************************************************
The following function tells if record is delete marked. */
UNIV_INLINE
ulint
rec_get_deleted_flag(
/*=================*/
/* out: nonzero if delete marked */
rec_t* rec, /* in: physical record */
ulint comp) /* in: nonzero=compact page format */
{
if (UNIV_EXPECT(comp, REC_OFFS_COMPACT)) {
return(UNIV_UNLIKELY(
rec_get_bit_field_1(rec, REC_NEW_INFO_BITS,
REC_INFO_DELETED_FLAG,
REC_INFO_BITS_SHIFT)));
} else {
return(UNIV_UNLIKELY(
rec_get_bit_field_1(rec, REC_OLD_INFO_BITS,
REC_INFO_DELETED_FLAG,
REC_INFO_BITS_SHIFT)));
}
}
/**********************************************************
The following function is used to set the deleted bit. */
UNIV_INLINE
void
rec_set_deleted_flag(
/*=================*/
rec_t* rec, /* in: physical record */
ulint comp, /* in: nonzero=compact page format */
ulint flag) /* in: nonzero if delete marked */
{
ulint val;
val = rec_get_info_bits(rec, comp);
if (flag) {
val |= REC_INFO_DELETED_FLAG;
} else {
val &= ~REC_INFO_DELETED_FLAG;
}
rec_set_info_bits(rec, comp, val);
}
/**********************************************************
The following function tells if a new-style record is a node pointer. */
UNIV_INLINE
ibool
rec_get_node_ptr_flag(
/*==================*/
/* out: TRUE if node pointer */
rec_t* rec) /* in: physical record */
{
return(REC_STATUS_NODE_PTR == rec_get_status(rec));
}
/**********************************************************
The following function is used to get the order number of the record in the
heap of the index page. */
UNIV_INLINE
ulint
rec_get_heap_no(
/*============*/
/* out: heap order number */
rec_t* rec, /* in: physical record */
ulint comp) /* in: nonzero=compact page format */
{
ulint ret;
ut_ad(rec);
ret = rec_get_bit_field_2(rec,
comp ? REC_NEW_HEAP_NO : REC_OLD_HEAP_NO,
REC_HEAP_NO_MASK, REC_HEAP_NO_SHIFT);
ut_ad(ret <= REC_MAX_HEAP_NO);
return(ret);
}
/**********************************************************
The following function is used to set the heap number field in the record. */
UNIV_INLINE
void
rec_set_heap_no(
/*============*/
rec_t* rec, /* in: physical record */
ulint comp, /* in: nonzero=compact page format */
ulint heap_no)/* in: the heap number */
{
ut_ad(heap_no <= REC_MAX_HEAP_NO);
rec_set_bit_field_2(rec, heap_no,
comp ? REC_NEW_HEAP_NO : REC_OLD_HEAP_NO,
REC_HEAP_NO_MASK, REC_HEAP_NO_SHIFT);
}
/**********************************************************
The following function is used to test whether the data offsets in the record
are stored in one-byte or two-byte format. */
UNIV_INLINE
ibool
rec_get_1byte_offs_flag(
/*====================*/
/* out: TRUE if 1-byte form */
rec_t* rec) /* in: physical record */
{
#if TRUE != 1
#error "TRUE != 1"
#endif
return(rec_get_bit_field_1(rec, REC_OLD_SHORT, REC_OLD_SHORT_MASK,
REC_OLD_SHORT_SHIFT));
}
/**********************************************************
The following function is used to set the 1-byte offsets flag. */
UNIV_INLINE
void
rec_set_1byte_offs_flag(
/*====================*/
rec_t* rec, /* in: physical record */
ibool flag) /* in: TRUE if 1-byte form */
{
#if TRUE != 1
#error "TRUE != 1"
#endif
ut_ad(flag <= TRUE);
rec_set_bit_field_1(rec, flag, REC_OLD_SHORT, REC_OLD_SHORT_MASK,
REC_OLD_SHORT_SHIFT);
}
/**********************************************************
Returns the offset of nth field end if the record is stored in the 1-byte
offsets form. If the field is SQL null, the flag is ORed in the returned
value. */
UNIV_INLINE
ulint
rec_1_get_field_end_info(
/*=====================*/
/* out: offset of the end of the field, SQL null
flag ORed */
rec_t* rec, /* in: record */
ulint n) /* in: field index */
{
ut_ad(rec_get_1byte_offs_flag(rec));
ut_ad(n < rec_get_n_fields_old(rec));
return(mach_read_from_1(rec - (REC_N_OLD_EXTRA_BYTES + n + 1)));
}
/**********************************************************
Returns the offset of nth field end if the record is stored in the 2-byte
offsets form. If the field is SQL null, the flag is ORed in the returned
value. */
UNIV_INLINE
ulint
rec_2_get_field_end_info(
/*=====================*/
/* out: offset of the end of the field, SQL null
flag and extern storage flag ORed */
rec_t* rec, /* in: record */
ulint n) /* in: field index */
{
ut_ad(!rec_get_1byte_offs_flag(rec));
ut_ad(n < rec_get_n_fields_old(rec));
return(mach_read_from_2(rec - (REC_N_OLD_EXTRA_BYTES + 2 * n + 2)));
}
#ifdef UNIV_DEBUG
/* Length of the rec_get_offsets() header */
# define REC_OFFS_HEADER_SIZE 4
#else /* UNIV_DEBUG */
/* Length of the rec_get_offsets() header */
# define REC_OFFS_HEADER_SIZE 2
#endif /* UNIV_DEBUG */
/* Get the base address of offsets. The extra_size is stored at
this position, and following positions hold the end offsets of
the fields. */
#define rec_offs_base(offsets) (offsets + REC_OFFS_HEADER_SIZE)
/**************************************************************
The following function returns the number of allocated elements
for an array of offsets. */
UNIV_INLINE
ulint
rec_offs_get_n_alloc(
/*=================*/
/* out: number of elements */
const ulint* offsets)/* in: array for rec_get_offsets() */
{
ulint n_alloc;
ut_ad(offsets);
n_alloc = offsets[0];
ut_ad(n_alloc > 0);
return(n_alloc);
}
/**************************************************************
The following function sets the number of allocated elements
for an array of offsets. */
UNIV_INLINE
void
rec_offs_set_n_alloc(
/*=================*/
ulint* offsets, /* out: array for rec_get_offsets(),
must be allocated */
ulint n_alloc) /* in: number of elements */
{
ut_ad(offsets);
ut_ad(n_alloc > 0);
UNIV_MEM_ASSERT_AND_ALLOC(offsets, n_alloc * sizeof *offsets);
offsets[0] = n_alloc;
}
/**************************************************************
The following function returns the number of fields in a record. */
UNIV_INLINE
ulint
rec_offs_n_fields(
/*==============*/
/* out: number of fields */
const ulint* offsets)/* in: array returned by rec_get_offsets() */
{
ulint n_fields;
ut_ad(offsets);
n_fields = offsets[1];
ut_ad(n_fields > 0);
ut_ad(n_fields <= REC_MAX_N_FIELDS);
ut_ad(n_fields + REC_OFFS_HEADER_SIZE
<= rec_offs_get_n_alloc(offsets));
return(n_fields);
}
/****************************************************************
Validates offsets returned by rec_get_offsets(). */
UNIV_INLINE
ibool
rec_offs_validate(
/*==============*/
/* out: TRUE if valid */
rec_t* rec, /* in: record or NULL */
dict_index_t* index, /* in: record descriptor or NULL */
const ulint* offsets)/* in: array returned by rec_get_offsets() */
{
ulint i = rec_offs_n_fields(offsets);
ulint last = ULINT_MAX;
ulint comp = *rec_offs_base(offsets) & REC_OFFS_COMPACT;
if (rec) {
ut_ad((ulint) rec == offsets[2]);
if (!comp) {
ut_a(rec_get_n_fields_old(rec) >= i);
}
}
if (index) {
ulint max_n_fields;
ut_ad((ulint) index == offsets[3]);
max_n_fields = ut_max(
dict_index_get_n_fields(index),
dict_index_get_n_unique_in_tree(index) + 1);
if (comp && rec) {
switch (rec_get_status(rec)) {
case REC_STATUS_ORDINARY:
break;
case REC_STATUS_NODE_PTR:
max_n_fields = dict_index_get_n_unique_in_tree(
index) + 1;
break;
case REC_STATUS_INFIMUM:
case REC_STATUS_SUPREMUM:
max_n_fields = 1;
break;
default:
ut_error;
}
}
/* index->n_def == 0 for dummy indexes if !comp */
ut_a(!comp || index->n_def);
ut_a(!index->n_def || i <= max_n_fields);
}
while (i--) {
ulint curr = rec_offs_base(offsets)[1 + i] & REC_OFFS_MASK;
ut_a(curr <= last);
last = curr;
}
return(TRUE);
}
/****************************************************************
Updates debug data in offsets, in order to avoid bogus
rec_offs_validate() failures. */
UNIV_INLINE
void
rec_offs_make_valid(
/*================*/
rec_t* rec __attribute__((unused)),
/* in: record */
dict_index_t* index __attribute__((unused)),
/* in: record descriptor */
ulint* offsets __attribute__((unused)))
/* in/out: array returned by rec_get_offsets() */
{
#ifdef UNIV_DEBUG
ut_ad(rec_get_n_fields(rec, index) >= rec_offs_n_fields(offsets));
offsets[2] = (ulint) rec;
offsets[3] = (ulint) index;
#endif /* UNIV_DEBUG */
}
/****************************************************************
The following function is used to get a pointer to the nth
data field in a record. */
UNIV_INLINE
byte*
rec_get_nth_field(
/*==============*/
/* out: pointer to the field */
rec_t* rec, /* in: record */
const ulint* offsets,/* in: array returned by rec_get_offsets() */
ulint n, /* in: index of the field */
ulint* len) /* out: length of the field; UNIV_SQL_NULL
if SQL null */
{
byte* field;
ulint length;
ut_ad(rec);
ut_ad(rec_offs_validate(rec, NULL, offsets));
ut_ad(n < rec_offs_n_fields(offsets));
ut_ad(len);
if (UNIV_UNLIKELY(n == 0)) {
field = rec;
} else {
field = rec + (rec_offs_base(offsets)[n] & REC_OFFS_MASK);
}
length = rec_offs_base(offsets)[1 + n];
if (length & REC_OFFS_SQL_NULL) {
length = UNIV_SQL_NULL;
} else {
length &= REC_OFFS_MASK;
length -= field - rec;
}
*len = length;
return(field);
}
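/**********************************************************
Illustrative sketch only, not part of InnoDB. rec_get_nth_field() relies
on the layout behind rec_offs_base(): element 0 holds the extra size and
element 1 + n holds the end offset of field n, with flag bits in the
high bits. The sketch below walks such an array; the sketch_ names, the
flag positions and the NULL-length marker are assumptions chosen for
illustration and may differ from the real REC_OFFS_ constants.

```c
#include <assert.h>

#define SKETCH_SQL_NULL_FLAG	(1UL << 31)		// assumed flag bit
#define SKETCH_OFFS_MASK	((1UL << 30) - 1)	// assumed offset bits
#define SKETCH_LEN_NULL		0xFFFFFFFFUL		// assumed NULL marker

// Return the start offset of field n relative to the record origin and
// store its length (or SKETCH_LEN_NULL) in *len.
static unsigned long
sketch_nth_field(const unsigned long* base, unsigned long n,
		 unsigned long* len)
{
	unsigned long	start;
	unsigned long	end;

	assert(base);
	assert(len);

	// Field 0 starts at the origin; field n starts where n - 1 ends.
	start = (n == 0) ? 0 : (base[n] & SKETCH_OFFS_MASK);
	end = base[1 + n];

	if (end & SKETCH_SQL_NULL_FLAG) {
		*len = SKETCH_LEN_NULL;
	} else {
		*len = (end & SKETCH_OFFS_MASK) - start;
	}
	return(start);
}
```

An SQL NULL field repeats the end offset of the preceding field, so the
field after it still starts at the right place. */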
/**********************************************************
Determine if the offsets are for a record in the new
compact format. */
UNIV_INLINE
ulint
rec_offs_comp(
/*==========*/
/* out: nonzero if compact format */
const ulint* offsets)/* in: array returned by rec_get_offsets() */
{
ut_ad(rec_offs_validate(NULL, NULL, offsets));
return(*rec_offs_base(offsets) & REC_OFFS_COMPACT);
}
/**********************************************************
Returns nonzero if the extern bit is set in nth field of rec. */
UNIV_INLINE
ulint
rec_offs_nth_extern(
/*================*/
/* out: nonzero if externally stored */
const ulint* offsets,/* in: array returned by rec_get_offsets() */
ulint n) /* in: nth field */
{
ut_ad(rec_offs_validate(NULL, NULL, offsets));
ut_ad(n < rec_offs_n_fields(offsets));
return(UNIV_UNLIKELY(rec_offs_base(offsets)[1 + n]
& REC_OFFS_EXTERNAL));
}
/**********************************************************
Returns nonzero if the SQL NULL bit is set in nth field of rec. */
UNIV_INLINE
ulint
rec_offs_nth_sql_null(
/*==================*/
/* out: nonzero if SQL NULL */
const ulint* offsets,/* in: array returned by rec_get_offsets() */
ulint n) /* in: nth field */
{
ut_ad(rec_offs_validate(NULL, NULL, offsets));
ut_ad(n < rec_offs_n_fields(offsets));
return(UNIV_UNLIKELY(rec_offs_base(offsets)[1 + n]
& REC_OFFS_SQL_NULL));
}
/**********************************************************
Gets the physical size of a field. */
UNIV_INLINE
ulint
rec_offs_nth_size(
/*==============*/
/* out: length of field */
const ulint* offsets,/* in: array returned by rec_get_offsets() */
ulint n) /* in: nth field */
{
ut_ad(rec_offs_validate(NULL, NULL, offsets));
ut_ad(n < rec_offs_n_fields(offsets));
if (!n) {
return(rec_offs_base(offsets)[1 + n] & REC_OFFS_MASK);
}
return((rec_offs_base(offsets)[1 + n] - rec_offs_base(offsets)[n])
& REC_OFFS_MASK);
}
/**********************************************************
Returns TRUE if the extern bit is set in any of the fields
described by offsets. */
UNIV_INLINE
ibool
rec_offs_any_extern(
/*================*/
/* out: TRUE if a field is stored externally */
const ulint* offsets)/* in: array returned by rec_get_offsets() */
{
ulint i;
for (i = rec_offs_n_fields(offsets); i--; ) {
if (rec_offs_nth_extern(offsets, i)) {
return(TRUE);
}
}
return(FALSE);
}
/***************************************************************
Sets the value of the ith field extern storage bit. */
UNIV_INLINE
void
rec_set_nth_field_extern_bit(
/*=========================*/
rec_t* rec, /* in: record */
dict_index_t* index, /* in: record descriptor */
ulint i, /* in: ith field */
ibool val, /* in: value to set */
mtr_t* mtr) /* in: mtr holding an X-latch to the page
where rec is, or NULL; in the NULL case
we do not write to log about the change */
{
if (dict_table_is_comp(index->table)) {
rec_set_nth_field_extern_bit_new(rec, index, i, val, mtr);
} else {
rec_set_nth_field_extern_bit_old(rec, i, val, mtr);
}
}
/**********************************************************
Returns the offset of the (n - 1)th field end if the record is stored in the 1-byte
offsets form. If the field is SQL null, the flag is ORed in the returned
value. This function and the 2-byte counterpart are defined here because the
C-compiler was not able to sum negative and positive constant offsets, and
warned of constant arithmetic overflow within the compiler. */
UNIV_INLINE
ulint
rec_1_get_prev_field_end_info(
/*==========================*/
/* out: offset of the end of the PREVIOUS field, SQL
null flag ORed */
rec_t* rec, /* in: record */
ulint n) /* in: field index */
{
ut_ad(rec_get_1byte_offs_flag(rec));
ut_ad(n <= rec_get_n_fields_old(rec));
return(mach_read_from_1(rec - (REC_N_OLD_EXTRA_BYTES + n)));
}
/**********************************************************
Returns the offset of the (n - 1)th field end if the record is stored in the 2-byte
offsets form. If the field is SQL null, the flag is ORed in the returned
value. */
UNIV_INLINE
ulint
rec_2_get_prev_field_end_info(
/*==========================*/
/* out: offset of the end of the PREVIOUS field, SQL
null flag and extern storage flag ORed */
rec_t* rec, /* in: record */
ulint n) /* in: field index */
{
ut_ad(!rec_get_1byte_offs_flag(rec));
ut_ad(n <= rec_get_n_fields_old(rec));
return(mach_read_from_2(rec - (REC_N_OLD_EXTRA_BYTES + 2 * n)));
}
/**********************************************************
Sets the field end info for the nth field if the record is stored in the
1-byte format. */
UNIV_INLINE
void
rec_1_set_field_end_info(
/*=====================*/
rec_t* rec, /* in: record */
ulint n, /* in: field index */
ulint info) /* in: value to set */
{
ut_ad(rec_get_1byte_offs_flag(rec));
ut_ad(n < rec_get_n_fields_old(rec));
mach_write_to_1(rec - (REC_N_OLD_EXTRA_BYTES + n + 1), info);
}
/**********************************************************
Sets the field end info for the nth field if the record is stored in the
2-byte format. */
UNIV_INLINE
void
rec_2_set_field_end_info(
/*=====================*/
rec_t* rec, /* in: record */
ulint n, /* in: field index */
ulint info) /* in: value to set */
{
ut_ad(!rec_get_1byte_offs_flag(rec));
ut_ad(n < rec_get_n_fields_old(rec));
mach_write_to_2(rec - (REC_N_OLD_EXTRA_BYTES + 2 * n + 2), info);
}
/**********************************************************
Returns the offset of nth field start if the record is stored in the 1-byte
offsets form. */
UNIV_INLINE
ulint
rec_1_get_field_start_offs(
/*=======================*/
/* out: offset of the start of the field */
rec_t* rec, /* in: record */
ulint n) /* in: field index */
{
ut_ad(rec_get_1byte_offs_flag(rec));
ut_ad(n <= rec_get_n_fields_old(rec));
if (n == 0) {
return(0);
}
return(rec_1_get_prev_field_end_info(rec, n)
& ~REC_1BYTE_SQL_NULL_MASK);
}
/**********************************************************
Returns the offset of nth field start if the record is stored in the 2-byte
offsets form. */
UNIV_INLINE
ulint
rec_2_get_field_start_offs(
/*=======================*/
/* out: offset of the start of the field */
rec_t* rec, /* in: record */
ulint n) /* in: field index */
{
ut_ad(!rec_get_1byte_offs_flag(rec));
ut_ad(n <= rec_get_n_fields_old(rec));
if (n == 0) {
return(0);
}
return(rec_2_get_prev_field_end_info(rec, n)
& ~(REC_2BYTE_SQL_NULL_MASK | REC_2BYTE_EXTERN_MASK));
}
/**********************************************************
The following function is used to read the offset of the start of a data field
in the record. The start of an SQL null field is the end offset of the
previous non-null field, or 0, if none exists. If n is the number of the last
field + 1, then the end offset of the last field is returned. */
UNIV_INLINE
ulint
rec_get_field_start_offs(
/*=====================*/
/* out: offset of the start of the field */
rec_t* rec, /* in: record */
ulint n) /* in: field index */
{
ut_ad(rec);
ut_ad(n <= rec_get_n_fields_old(rec));
if (n == 0) {
return(0);
}
if (rec_get_1byte_offs_flag(rec)) {
return(rec_1_get_field_start_offs(rec, n));
}
return(rec_2_get_field_start_offs(rec, n));
}
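/**********************************************************
Illustrative sketch only, not part of InnoDB. In an old-style record
the field end offsets are stored below the fixed header, growing
downwards from the record origin, and the start of field n is the end
of field n - 1 with the flag bits masked off. The sketch below shows
this for the 1-byte form; the sketch_ names and the constant values
(6 fixed extra bytes, 0x80 as the SQL null bit) are assumptions.

```c
#include <assert.h>

#define SKETCH_N_OLD_EXTRA_BYTES	6UL	// assumed fixed header size
#define SKETCH_1BYTE_SQL_NULL_MASK	0x80UL	// assumed SQL null bit

// End info of field n: one byte, stored below the fixed header.
static unsigned long
sketch_1_field_end_info(const unsigned char* origin, unsigned long n)
{
	return(*(origin - (SKETCH_N_OLD_EXTRA_BYTES + n + 1)));
}

// Start offset of field n; n may equal the number of fields, in which
// case the end offset of the last field is returned.
static unsigned long
sketch_1_field_start_offs(const unsigned char* origin, unsigned long n)
{
	if (n == 0) {
		return(0);
	}
	return(sketch_1_field_end_info(origin, n - 1)
	       & ~SKETCH_1BYTE_SQL_NULL_MASK);
}
```

With three fields ending at 4, 4 (SQL NULL) and 10, the start offsets
are 0, 4 and 4, and calling the function with n = 3 gives the data
size 10. */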
/****************************************************************
Gets the physical size of an old-style field.
An SQL null field may also have a size > 0
if the data type is of a fixed size. */
UNIV_INLINE
ulint
rec_get_nth_field_size(
/*===================*/
/* out: field size in bytes */
rec_t* rec, /* in: record */
ulint n) /* in: index of the field */
{
ulint os;
ulint next_os;
os = rec_get_field_start_offs(rec, n);
next_os = rec_get_field_start_offs(rec, n + 1);
ut_ad(next_os - os < UNIV_PAGE_SIZE);
return(next_os - os);
}
/***************************************************************
This is used to modify the value of an already existing field in a record.
The previous value must have exactly the same size as the new value. If len
is UNIV_SQL_NULL then the field is treated as an SQL null for old-style
records. For new-style records, len must not be UNIV_SQL_NULL. */
UNIV_INLINE
void
rec_set_nth_field(
/*==============*/
rec_t* rec, /* in: record */
const ulint* offsets,/* in: array returned by rec_get_offsets() */
ulint n, /* in: index number of the field */
const void* data, /* in: pointer to the data
if not SQL null */
ulint len) /* in: length of the data or UNIV_SQL_NULL.
If not SQL null, must have the same
length as the previous value.
If SQL null, previous value must be
SQL null. */
{
byte* data2;
ulint len2;
ut_ad(rec);
ut_ad(rec_offs_validate(rec, NULL, offsets));
if (len == UNIV_SQL_NULL) {
ut_ad(!rec_offs_comp(offsets));
rec_set_nth_field_sql_null(rec, n);
return;
}
data2 = rec_get_nth_field(rec, offsets, n, &len2);
if (len2 == UNIV_SQL_NULL) {
ut_ad(!rec_offs_comp(offsets));
rec_set_nth_field_null_bit(rec, n, FALSE);
ut_ad(len == rec_get_nth_field_size(rec, n));
} else {
ut_ad(len2 == len);
}
ut_memcpy(data2, data, len);
}
/**************************************************************
The following function returns the data size of an old-style physical
record, that is the sum of field lengths. SQL null fields
are counted as length 0 fields. The value returned by the function
is the distance from record origin to record end in bytes. */
UNIV_INLINE
ulint
rec_get_data_size_old(
/*==================*/
/* out: size */
rec_t* rec) /* in: physical record */
{
ut_ad(rec);
return(rec_get_field_start_offs(rec, rec_get_n_fields_old(rec)));
}
/**************************************************************
The following function sets the number of fields in offsets. */
UNIV_INLINE
void
rec_offs_set_n_fields(
/*==================*/
ulint* offsets, /* in/out: array returned by
rec_get_offsets() */
ulint n_fields) /* in: number of fields */
{
ut_ad(offsets);
ut_ad(n_fields > 0);
ut_ad(n_fields <= REC_MAX_N_FIELDS);
ut_ad(n_fields + REC_OFFS_HEADER_SIZE
<= rec_offs_get_n_alloc(offsets));
offsets[1] = n_fields;
}
/**************************************************************
The following function returns the data size of a physical
record, that is the sum of field lengths. SQL null fields
are counted as length 0 fields. The value returned by the function
is the distance from record origin to record end in bytes. */
UNIV_INLINE
ulint
rec_offs_data_size(
/*===============*/
/* out: size */
const ulint* offsets)/* in: array returned by rec_get_offsets() */
{
ulint size;
ut_ad(rec_offs_validate(NULL, NULL, offsets));
size = rec_offs_base(offsets)[rec_offs_n_fields(offsets)]
& REC_OFFS_MASK;
ut_ad(size < UNIV_PAGE_SIZE);
return(size);
}
/**************************************************************
Returns the total size of record minus data size of record. The value
returned by the function is the distance from record start to record origin
in bytes. */
UNIV_INLINE
ulint
rec_offs_extra_size(
/*================*/
/* out: size */
const ulint* offsets)/* in: array returned by rec_get_offsets() */
{
ulint size;
ut_ad(rec_offs_validate(NULL, NULL, offsets));
size = *rec_offs_base(offsets) & ~REC_OFFS_COMPACT;
ut_ad(size < UNIV_PAGE_SIZE);
return(size);
}
/**************************************************************
Returns the total size of a physical record. */
UNIV_INLINE
ulint
rec_offs_size(
/*==========*/
/* out: size */
const ulint* offsets)/* in: array returned by rec_get_offsets() */
{
return(rec_offs_data_size(offsets) + rec_offs_extra_size(offsets));
}
/**************************************************************
Returns a pointer to the end of the record. */
UNIV_INLINE
byte*
rec_get_end(
/*========*/
/* out: pointer to end */
rec_t* rec, /* in: pointer to record */
const ulint* offsets)/* in: array returned by rec_get_offsets() */
{
return(rec + rec_offs_data_size(offsets));
}
/**************************************************************
Returns a pointer to the start of the record. */
UNIV_INLINE
byte*
rec_get_start(
/*==========*/
/* out: pointer to start */
rec_t* rec, /* in: pointer to record */
const ulint* offsets)/* in: array returned by rec_get_offsets() */
{
return(rec - rec_offs_extra_size(offsets));
}
/*******************************************************************
Copies a physical record to a buffer. */
UNIV_INLINE
rec_t*
rec_copy(
/*=====*/
/* out: pointer to the origin of the copy */
void* buf, /* in: buffer */
const rec_t* rec, /* in: physical record */
const ulint* offsets)/* in: array returned by rec_get_offsets() */
{
ulint extra_len;
ulint data_len;
ut_ad(rec && buf);
ut_ad(rec_offs_validate((rec_t*) rec, NULL, offsets));
ut_ad(rec_validate((rec_t*) rec, offsets));
extra_len = rec_offs_extra_size(offsets);
data_len = rec_offs_data_size(offsets);
ut_memcpy(buf, rec - extra_len, extra_len + data_len);
return((byte*)buf + extra_len);
}
/**************************************************************
Returns the extra size of an old-style physical record if we know its
data size and number of fields. */
UNIV_INLINE
ulint
rec_get_converted_extra_size(
/*=========================*/
/* out: extra size */
ulint data_size, /* in: data size */
ulint n_fields) /* in: number of fields */
{
if (data_size <= REC_1BYTE_OFFS_LIMIT) {
return(REC_N_OLD_EXTRA_BYTES + n_fields);
}
return(REC_N_OLD_EXTRA_BYTES + 2 * n_fields);
}
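/**********************************************************
Illustrative sketch only, not part of InnoDB. The extra size of an
old-style record is the fixed header plus one offset entry per field;
the entries take one byte each while the data fits the 1-byte limit and
two bytes each otherwise. The sketch_ names and the constant values
(6 and 0x7F) are assumptions made for illustration.

```c
#include <assert.h>

#define SKETCH_N_OLD_EXTRA_BYTES	6UL	// assumed fixed header size
#define SKETCH_1BYTE_OFFS_LIMIT		0x7FUL	// assumed 1-byte limit

// Header size for a record with the given data size and field count.
static unsigned long
sketch_converted_extra_size(unsigned long data_size,
			    unsigned long n_fields)
{
	assert(n_fields > 0);

	if (data_size <= SKETCH_1BYTE_OFFS_LIMIT) {
		// One offset byte per field.
		return(SKETCH_N_OLD_EXTRA_BYTES + n_fields);
	}
	// Two offset bytes per field.
	return(SKETCH_N_OLD_EXTRA_BYTES + 2 * n_fields);
}
```

Under these assumed constants a 3-field record needs 9 extra bytes for
up to 127 bytes of data and 12 extra bytes beyond that. */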
/**************************************************************
The following function returns the size of a data tuple when converted to
a new-style physical record. */
ulint
rec_get_converted_size_new(
/*=======================*/
/* out: size */
dict_index_t* index, /* in: record descriptor */
dtuple_t* dtuple);/* in: data tuple */
/**************************************************************
The following function returns the size of a data tuple when converted to
a physical record. */
UNIV_INLINE
ulint
rec_get_converted_size(
/*===================*/
/* out: size */
dict_index_t* index, /* in: record descriptor */
dtuple_t* dtuple) /* in: data tuple */
{
ulint data_size;
ulint extra_size;
ut_ad(index);
ut_ad(dtuple);
ut_ad(dtuple_check_typed(dtuple));
ut_ad(index->type & DICT_UNIVERSAL
|| dtuple_get_n_fields(dtuple)
== (((dtuple_get_info_bits(dtuple) & REC_NEW_STATUS_MASK)
== REC_STATUS_NODE_PTR)
? dict_index_get_n_unique_in_tree(index) + 1
: dict_index_get_n_fields(index)));
if (dict_table_is_comp(index->table)) {
return(rec_get_converted_size_new(index, dtuple));
}
data_size = dtuple_get_data_size(dtuple);
extra_size = rec_get_converted_extra_size(
data_size, dtuple_get_n_fields(dtuple));
return(data_size + extra_size);
}
/****************************************************************
Folds a prefix of a physical record to a ulint. Folds only existing fields,
that is, checks that we do not run out of the record. */
UNIV_INLINE
ulint
rec_fold(
/*=====*/
/* out: the folded value */
rec_t* rec, /* in: the physical record */
const ulint* offsets, /* in: array returned by
rec_get_offsets() */
ulint n_fields, /* in: number of complete
fields to fold */
ulint n_bytes, /* in: number of bytes to fold
in an incomplete last field */
dulint tree_id) /* in: index tree id */
{
ulint i;
byte* data;
ulint len;
ulint fold;
ulint n_fields_rec;
ut_ad(rec_offs_validate(rec, NULL, offsets));
ut_ad(rec_validate((rec_t*) rec, offsets));
ut_ad(n_fields + n_bytes > 0);
n_fields_rec = rec_offs_n_fields(offsets);
ut_ad(n_fields <= n_fields_rec);
ut_ad(n_fields < n_fields_rec || n_bytes == 0);
if (n_fields > n_fields_rec) {
n_fields = n_fields_rec;
}
if (n_fields == n_fields_rec) {
n_bytes = 0;
}
fold = ut_fold_dulint(tree_id);
for (i = 0; i < n_fields; i++) {
data = rec_get_nth_field(rec, offsets, i, &len);
if (len != UNIV_SQL_NULL) {
fold = ut_fold_ulint_pair(fold,
ut_fold_binary(data, len));
}
}
if (n_bytes > 0) {
data = rec_get_nth_field(rec, offsets, i, &len);
if (len != UNIV_SQL_NULL) {
if (len > n_bytes) {
len = n_bytes;
}
fold = ut_fold_ulint_pair(fold,
ut_fold_binary(data, len));
}
}
return(fold);
}