Added Innobase to source distribution

monty@donna.mysql.com 2001-02-17 14:19:19 +02:00
parent 024e2f39c9
commit c533308a15
421 changed files with 170195 additions and 969 deletions

@ -486,6 +486,7 @@ MySQL Table Types
* ISAM:: ISAM tables
* HEAP:: HEAP tables
* BDB:: BDB or Berkeley_db tables
* INNOBASE::
MyISAM Tables
@ -573,7 +574,7 @@ Replication in MySQL
* Replication Options:: Replication Options in my.cnf
* Replication SQL:: SQL Commands related to replication
* Replication FAQ:: Frequently Asked Questions about replication
* Troubleshooting Replication:: Troubleshooting Replication. Troubleshooting Replication. Troubleshooting Replication. Troubleshooting Replication. Troubleshooting Replication. Troubleshooting Replication.
* Troubleshooting Replication:: Troubleshooting Replication. Troubleshooting Replication. Troubleshooting Replication. Troubleshooting Replication. Troubleshooting Replication. Troubleshooting Replication. Troubleshooting Replication.
Getting Maximum Performance from MySQL
@ -17875,7 +17876,7 @@ reference_option:
RESTRICT | CASCADE | SET NULL | NO ACTION | SET DEFAULT
table_options:
TYPE = @{ISAM | MYISAM | HEAP | MERGE@}
TYPE = @{BDB | HEAP | ISAM | INNOBASE | MERGE | MYISAM @}
or AUTO_INCREMENT = #
or AVG_ROW_LENGTH = #
or CHECKSUM = @{0 | 1@}
@ -18111,11 +18112,12 @@ implemented in @strong{MySQL} Version 3.23 and above.
The different table types are:
@multitable @columnfractions .20 .80
@item BDB or Berkeley_db @tab Transaction-safe tables @xref{BDB}.
@item BDB or Berkeley_db @tab Transaction-safe tables with page locking. @xref{BDB}.
@item HEAP @tab The data for this table is only stored in memory. @xref{HEAP}.
@item ISAM @tab The original table handler. @xref{ISAM}.
@item INNOBASE @tab Transaction-safe tables with row locking. @xref{INNOBASE}.
@item MERGE @tab A collection of MyISAM tables used as one table. @xref{MERGE}.
@item MyISAM @tab The new binary portable table handler. @xref{MyISAM}.
@item MyISAM @tab The new binary portable table handler that is replacing ISAM. @xref{MyISAM}.
@end multitable
@xref{Table types}.
@ -20370,7 +20372,7 @@ The following columns are returned:
@multitable @columnfractions .30 .70
@item @strong{Column} @tab @strong{Meaning}
@item @code{Name} @tab Name of the table.
@item @code{Type} @tab Type of table (BDB, ISAM, MERGE, MyISAM, or HEAP).
@item @code{Type} @tab Type of table. @xref{Table types}.
@item @code{Row_format} @tab The row storage format (Fixed, Dynamic, or Compressed).
@item @code{Rows} @tab Number of rows.
@item @code{Avg_row_length} @tab Average row length.
@ -20386,6 +20388,9 @@ The following columns are returned:
@item @code{Comment} @tab The comment used when creating the table (or some information why @strong{MySQL} couldn't access the table information).
@end multitable
@code{INNOBASE} tables will report the free space in the tablespace
in the table comment.
@node SHOW STATUS, SHOW VARIABLES, SHOW TABLE STATUS, SHOW
@subsection SHOW Status Information
@ -21506,7 +21511,8 @@ By default, @strong{MySQL} runs in @code{autocommit} mode. This means that
as soon as you execute an update, @strong{MySQL} will store the update on
disk.
If you are using @code{BDB} tables, you can put @strong{MySQL} into
If you are using transaction-safe tables (like @code{BDB},
@code{INNOBASE} or @code{GEMINI}), you can put @strong{MySQL} into
non-@code{autocommit} mode with the following command:
@example
@ -22303,21 +22309,25 @@ used them.
@cindex table types, choosing
@cindex @code{BDB} table type
@cindex @code{Berkeley_db} table type
@cindex ISAM table type
@cindex @code{HEAP} table type
@cindex @code{ISAM} table type
@cindex @code{INNOBASE} table type
@cindex @code{MERGE} table type
@cindex MySQL table types
@cindex MyISAM table type
@cindex @code{MyISAM} table type
@cindex types, of tables
@node Table types, Tutorial, Reference, Top
@chapter MySQL Table Types
As of @strong{MySQL} Version 3.23.6, you can choose between three basic
table formats. When you create a new table, you can tell @strong{MySQL}
which table type it should use for the table. @strong{MySQL} will
always create a @code{.frm} file to hold the table and column
definitions. Depending on the table type, the index and data will be
stored in other files.
table formats (@code{ISAM}, @code{HEAP} and @code{MyISAM}). Newer
@strong{MySQL} versions may support additional table types, depending on
how you compile them.
When you create a new table, you can tell @strong{MySQL} which table
type it should use for the table. @strong{MySQL} will always create a
@code{.frm} file to hold the table and column definitions. Depending on
the table type, the index and data will be stored in other files.
The default table type in @strong{MySQL} is @code{MyISAM}. If you are
trying to use a table type that is not compiled in or activated,
@ -22327,8 +22337,9 @@ You can convert tables between different types with the @code{ALTER
TABLE} statement. @xref{ALTER TABLE, , @code{ALTER TABLE}}.
Note that @strong{MySQL} supports two different kinds of
tables. Transaction-safe tables (@code{BDB}) and not transaction-safe
tables (@code{ISAM}, @code{MERGE}, @code{MyISAM}, and @code{HEAP}).
tables. Transaction-safe tables (@code{BDB}, @code{INNOBASE} or
@code{GEMINI}) and not transaction-safe tables (@code{HEAP}, @code{ISAM},
@code{MERGE}, and @code{MyISAM}).
Advantages of transaction-safe tables (TST):
@ -22368,6 +22379,7 @@ of both worlds.
* ISAM:: ISAM tables
* HEAP:: HEAP tables
* BDB:: BDB or Berkeley_db tables
* INNOBASE::
@end menu
@node MyISAM, MERGE, Table types, Table types
@ -22978,7 +22990,7 @@ SUM_OVER_ALL_KEYS(max_length_of_key + sizeof(char*) * 2)
@cindex tables, @code{BDB}
@cindex tables, @code{Berkeley DB}
@node BDB, , HEAP, Table types
@node BDB, INNOBASE, HEAP, Table types
@section BDB or Berkeley_db Tables
@menu
@ -22993,6 +23005,9 @@ SUM_OVER_ALL_KEYS(max_length_of_key + sizeof(char*) * 2)
@node BDB overview, BDB install, BDB, BDB
@subsection Overview over BDB tables
Innobase is included in the @code{MySQL} source distribution starting
from 3.23.34 and will be activated in the @code{MySQL}-max binary.
Berkeley DB (@uref{http://www.sleepycat.com}) has provided
@strong{MySQL} with a transaction-safe table handler. This will survive
crashes and also provides @code{COMMIT} and @code{ROLLBACK} on
@ -23205,6 +23220,134 @@ This is not fatal but we don't recommend that you delete tables if you are
not in @code{auto_commit} mode, until this problem is fixed (the fix is
not trivial).
@node INNOBASE, , BDB, Table types
@section INNOBASE Tables
Innobase is included in the @code{MySQL} source distribution starting
from 3.23.34 and will be activated in the @code{MySQL}-max binary.
Innobase provides MySQL with a transaction-safe table handler with
commit, rollback, and crash recovery capabilities. Innobase does
locking on row level, and also provides an Oracle-style consistent
non-locking read in @code{SELECT} statements, which increases transaction
concurrency. There is also no need for lock escalation in Innobase,
because row level locks in Innobase fit in very small space.
Innobase is a table handler that is under the GNU GPL License Version 2
(of June 1991). In the source distribution of MySQL, Innobase appears as
a subdirectory.
Technically, Innobase is a database backend placed under MySQL. Innobase
has its own buffer pool for caching data and indexes in main
memory. Innobase stores its tables and indexes in a tablespace, which
may consist of several files. This is different from, for example,
@code{MyISAM} tables where each table is stored as a separate file.
To create a table in the Innobase format you must specify
@code{TYPE = INNOBASE} in the table creation SQL command:
@example
CREATE TABLE CUSTOMERS (A INT, B CHAR (20), INDEX (A)) TYPE = INNOBASE;
@end example
A consistent non-locking read is the default locking behavior when you
do a @code{SELECT} from an Innobase table. For a searched update and for
an insert, row-level exclusive locking is performed.
To use Innobase tables you must specify configuration parameters
in the @code{[mysqld]} section of the MySQL configuration file.
Below is an example of possible configuration
parameters in my.cnf for Innobase:
@example
innobase_data_home_dir = c:\ibdata\
innobase_data_file_path = ibdata1:25M;ibdata2:37M;ibdata3:100M;ibdata4:300M
set-variable = innobase_mirrored_log_groups=1
innobase_log_group_home_dir = c:\iblogs\
set-variable = innobase_log_files_in_group=3
set-variable = innobase_log_file_size=5M
set-variable = innobase_log_buffer_size=8M
innobase_flush_log_at_trx_commit=1
innobase_log_arch_dir = c:\iblogs\
innobase_log_archive=0
set-variable = innobase_buffer_pool_size=16M
set-variable = innobase_additional_mem_pool_size=2M
set-variable = innobase_file_io_threads=4
set-variable = innobase_lock_wait_timeout=50
@end example
The meanings of the configuration parameters are the following:
@multitable @columnfractions .30 .70
@item @code{innobase_data_home_dir} @tab
The common part of the directory path for all innobase data files.
@item @code{innobase_data_file_path} @tab
Paths to individual data files and their sizes. The full directory path
to each data file is acquired by concatenating innobase_data_home_dir to
the paths specified here. The file sizes are specified in megabytes,
hence the 'M' after the size specification above. Do not set a file size
bigger than 4000M, and on most operating systems not bigger than 2000M.
@item @code{innobase_mirrored_log_groups} @tab
Number of identical copies of log groups we
keep for the database. Currently this should be set to 1.
@item @code{innobase_log_group_home_dir} @tab
Directory path to Innobase log files.
@item @code{innobase_log_files_in_group} @tab
Number of log files in the log group. Innobase writes to the files in a
circular fashion. Value 3 is recommended here.
@item @code{innobase_log_file_size} @tab
Size of each log file in a log group in megabytes. Sensible values range
from 1M to the size of the buffer pool specified below. The bigger the
value, the less checkpoint flush activity is needed in the buffer pool,
saving disk i/o. But bigger log files also mean that recovery will be
slower in case of a crash. The same file size restriction applies as for
a data file.
@item @code{innobase_log_buffer_size} @tab
The size of the buffer which Innobase uses to write log to the log files
on disk. Sensible values range from 1M to half the combined size of log
files. A big log buffer allows large transactions to run without a need
to write the log to disk until the transaction commit. Thus, if you have
big transactions, making the log buffer big will save disk i/o.
@item @code{innobase_flush_log_at_trx_commit} @tab
Normally this is set to 1, meaning that at a transaction commit the log
is flushed to disk, and the modifications made by the transaction become
permanent, and survive a database crash. If you are willing to
compromise this safety, and you are running small transactions, you may
set this to 0 to reduce disk i/o to the logs.
@item @code{innobase_log_arch_dir} @tab
The directory where fully written log files would be archived if we used
log archiving. The value of this parameter should currently be set the
same as @code{innobase_log_group_home_dir}.
@item @code{innobase_log_archive} @tab
This value should currently be set to 0. As recovery from a backup is
done by MySQL using its own log files, there is currently no need to
archive Innobase log files.
@item @code{innobase_buffer_pool_size} @tab
The size of the memory buffer Innobase uses to cache data and indexes of
its tables. The bigger you set this the less disk i/o is needed to
access data in tables. On a dedicated database server you may set this
parameter up to 90 % of the machine's physical memory size. Do not set it
too large, though, because competition for physical memory may cause
paging in the operating system.
@item @code{innobase_additional_mem_pool_size} @tab
Size of a memory pool Innobase uses to store data dictionary information
and other internal data structures. A sensible value for this might be
2M, but the more tables you have in your application the more you will
need to allocate here. If Innobase runs out of memory in this pool, it
will start to allocate memory from the operating system, and write
warning messages to the MySQL error log.
@item @code{innobase_file_io_threads} @tab
Number of file i/o threads in Innobase. Normally, this should be 4, but
on Windows NT disk i/o may benefit from a larger number.
@item @code{innobase_lock_wait_timeout} @tab
Timeout in seconds an Innobase transaction may wait for a lock before
being rolled back. Innobase automatically detects transaction deadlocks
in its own lock table and rolls back the transaction. If you use the
@code{LOCK TABLES} command, or transaction-safe table handlers other
than Innobase, in the same transaction, then a deadlock may arise which
Innobase cannot notice. In cases like this the timeout is useful to
resolve the situation.
@end multitable
@cindex tutorial
@cindex terminal monitor, defined
@cindex monitor, terminal
@ -25939,7 +26082,7 @@ tables}.
* Replication Options:: Replication Options in my.cnf
* Replication SQL:: SQL Commands related to replication
* Replication FAQ:: Frequently Asked Questions about replication
* Troubleshooting Replication:: Troubleshooting Replication. Troubleshooting Replication. Troubleshooting Replication. Troubleshooting Replication. Troubleshooting Replication. Troubleshooting Replication.
* Troubleshooting Replication:: Troubleshooting Replication. Troubleshooting Replication. Troubleshooting Replication. Troubleshooting Replication. Troubleshooting Replication. Troubleshooting Replication. Troubleshooting Replication.
@end menu
@node Replication Intro, Replication Implementation, Replication, Replication
@ -41284,6 +41427,9 @@ not yet 100 % confident in this code.
@appendixsubsec Changes in release 3.23.34
@itemize @bullet
@item
Added the @code{INNOBASE} table handler and the @code{BDB} table handler
to the @strong{MySQL} source distribution.
@item
Fixed bug in @code{BDB} tables when using index on multi-part key where a
key part may be @code{NULL}.
@item
@ -41292,8 +41438,8 @@ This ensures that on gets same values for date functions like @code{NOW()}
when using @code{mysqlbinlog} to pipe the queries to another server.
@item
Allow one to use @code{--skip-gemini}, @code{--skip-bdb} and
@code{--skip-innobase} to mysqld even if these databases are not compiled
in @code{mysqld}.
@code{--skip-innobase} to @code{mysqld} even if these databases are not
compiled in @code{mysqld}.
@item
One can now do @code{GROUP BY ... DESC}.
@end itemize
@ -46368,8 +46514,6 @@ if they haven't been used in a while.
@item
Allow join on key parts (optimization issue).
@item
Entry for @code{DECRYPT()}.
@item
@code{INSERT SQL_CONCURRENT} and @code{mysqld --concurrent-insert} to do
a concurrent insert at the end of the file if the file is read-locked.
@item
@ -46452,8 +46596,6 @@ Currently, you can only use this syntax with @code{LEFT JOIN}.
@item
Add full support for @code{unsigned long long} type.
@item
Function @code{CASE}.
@item
Many more variables for @code{show status}. Counts for:
@code{INSERT}/@code{DELETE}/@code{UPDATE} statements. Records reads and
updated. Selects on 1 table and selects with joins. Mean number of

@ -4,7 +4,7 @@ dnl Process this file with autoconf to produce a configure script.
AC_INIT(sql/mysqld.cc)
AC_CANONICAL_SYSTEM
# The Docs Makefile.am parses this line!
AM_INIT_AUTOMAKE(mysql, 3.23.33)
AM_INIT_AUTOMAKE(mysql, 3.23.34)
AM_CONFIG_HEADER(config.h)
PROTOCOL_VERSION=10

@ -90,7 +90,8 @@ enum ha_extra_function {
HA_EXTRA_NO_ROWS, /* Don't write rows */
HA_EXTRA_RESET_STATE, /* Reset positions */
HA_EXTRA_IGNORE_DUP_KEY, /* Dup keys don't rollback everything*/
HA_EXTRA_NO_IGNORE_DUP_KEY
HA_EXTRA_NO_IGNORE_DUP_KEY,
HA_EXTRA_DONT_USE_CURSOR_TO_UPDATE /* Cursor will not be used for update */
};
/* The following is parameter to ha_panic() */

26
innobase/Makefile.am Normal file
@ -0,0 +1,26 @@
# Copyright (C) 2000 MySQL AB & MySQL Finland AB & TCX DataKonsult AB
# & Innobase Oy
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
# Process this file with automake to create Makefile.in
AUTOMAKE_OPTIONS = foreign
TAR = gtar
SUBDIRS = os ut btr buf com data dict dyn eval fil fsp fut \
ha ibuf lock log mach mem mtr odbc page pars que \
read rem row srv sync thr trx usr

25
innobase/btr/Makefile.am Normal file
@ -0,0 +1,25 @@
# Copyright (C) 2000 MySQL AB & MySQL Finland AB & TCX DataKonsult AB
# & Innobase Oy
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
include ../include/Makefile.i
libs_LIBRARIES = libbtr.a
libbtr_a_SOURCES = btr0btr.c btr0cur.c btr0pcur.c btr0sea.c
EXTRA_PROGRAMS =

2404
innobase/btr/btr0btr.c Normal file

File diff suppressed because it is too large

2288
innobase/btr/btr0cur.c Normal file

File diff suppressed because it is too large

474
innobase/btr/btr0pcur.c Normal file
@ -0,0 +1,474 @@
/******************************************************
The index tree persistent cursor
(c) 1996 Innobase Oy
Created 2/23/1996 Heikki Tuuri
*******************************************************/
#include "btr0pcur.h"
#ifdef UNIV_NONINL
#include "btr0pcur.ic"
#endif
#include "ut0byte.h"
#include "rem0cmp.h"
/******************************************************************
Allocates memory for a persistent cursor object and initializes the cursor. */
btr_pcur_t*
btr_pcur_create_for_mysql(void)
/*============================*/
/* out, own: persistent cursor */
{
btr_pcur_t* pcur;
pcur = mem_alloc(sizeof(btr_pcur_t));
pcur->btr_cur.index = NULL;
btr_pcur_init(pcur);
return(pcur);
}
/******************************************************************
Frees the memory for a persistent cursor object. */
void
btr_pcur_free_for_mysql(
/*====================*/
btr_pcur_t* cursor) /* in, own: persistent cursor */
{
if (cursor->old_rec_buf != NULL) {
mem_free(cursor->old_rec_buf);
cursor->old_rec = NULL;
cursor->old_rec_buf = NULL;
}
cursor->btr_cur.page_cur.rec = NULL;
cursor->old_rec = NULL;
cursor->old_stored = BTR_PCUR_OLD_NOT_STORED;
cursor->latch_mode = BTR_NO_LATCHES;
cursor->pos_state = BTR_PCUR_NOT_POSITIONED;
mem_free(cursor);
}
/******************************************************************
The position of the cursor is stored by taking an initial segment of the
record the cursor is positioned on, before, or after, and copying it to the
cursor data structure. NOTE that the page where the cursor is positioned
must not be empty! */
void
btr_pcur_store_position(
/*====================*/
btr_pcur_t* cursor, /* in: persistent cursor */
mtr_t* mtr) /* in: mtr */
{
page_cur_t* page_cursor;
rec_t* rec;
dict_tree_t* tree;
page_t* page;
ut_a(cursor->pos_state == BTR_PCUR_IS_POSITIONED);
ut_ad(cursor->latch_mode != BTR_NO_LATCHES);
tree = btr_cur_get_tree(btr_pcur_get_btr_cur(cursor));
page_cursor = btr_pcur_get_page_cur(cursor);
rec = page_cur_get_rec(page_cursor);
page = buf_frame_align(rec);
ut_ad(mtr_memo_contains(mtr, buf_block_align(page),
MTR_MEMO_PAGE_S_FIX)
|| mtr_memo_contains(mtr, buf_block_align(page),
MTR_MEMO_PAGE_X_FIX));
ut_a(cursor->latch_mode != BTR_NO_LATCHES);
if (page_get_n_recs(page) == 0) {
/* Cannot store position! */
btr_pcur_close(cursor);
return;
}
if (rec == page_get_supremum_rec(page)) {
rec = page_rec_get_prev(rec);
cursor->rel_pos = BTR_PCUR_AFTER;
} else if (rec == page_get_infimum_rec(page)) {
rec = page_rec_get_next(rec);
cursor->rel_pos = BTR_PCUR_BEFORE;
} else {
cursor->rel_pos = BTR_PCUR_ON;
}
cursor->old_stored = BTR_PCUR_OLD_STORED;
cursor->old_rec = dict_tree_copy_rec_order_prefix(tree, rec,
&(cursor->old_rec_buf),
&(cursor->buf_size));
cursor->modify_clock = buf_frame_get_modify_clock(page);
}
/******************************************************************
Copies the stored position of a pcur to another pcur. */
void
btr_pcur_copy_stored_position(
/*==========================*/
btr_pcur_t* pcur_receive, /* in: pcur which will receive the
position info */
btr_pcur_t* pcur_donate) /* in: pcur from which the info is
copied */
{
if (pcur_receive->old_rec_buf) {
mem_free(pcur_receive->old_rec_buf);
}
ut_memcpy((byte*)pcur_receive, (byte*)pcur_donate, sizeof(btr_pcur_t));
pcur_receive->old_rec_buf = mem_alloc(pcur_donate->buf_size);
ut_memcpy(pcur_receive->old_rec_buf, pcur_donate->old_rec_buf,
pcur_donate->buf_size);
pcur_receive->old_rec = pcur_receive->old_rec_buf
+ (pcur_donate->old_rec - pcur_donate->old_rec_buf);
}
/******************************************************************
Restores the stored position of a persistent cursor bufferfixing the page and
obtaining the specified latches. If the cursor position was saved when the
(1) cursor was positioned on a user record: this function restores the position
to the last record LESS OR EQUAL to the stored record;
(2) cursor was positioned on a page infimum record: restores the position to
the last record LESS than the user record which was the successor of the page
infimum;
(3) cursor was positioned on the page supremum: restores to the first record
GREATER than the user record which was the predecessor of the supremum. */
ibool
btr_pcur_restore_position(
/*======================*/
/* out: TRUE if the cursor position
was stored when it was on a user record
and it can be restored on a user record
whose ordering fields are identical to
the ones of the original user record */
ulint latch_mode, /* in: BTR_SEARCH_LEAF, ... */
btr_pcur_t* cursor, /* in: detached persistent cursor */
mtr_t* mtr) /* in: mtr */
{
dict_tree_t* tree;
page_t* page;
dtuple_t* tuple;
ulint mode;
ulint old_mode;
mem_heap_t* heap;
ut_a((cursor->pos_state == BTR_PCUR_WAS_POSITIONED)
|| (cursor->pos_state == BTR_PCUR_IS_POSITIONED));
ut_a(cursor->old_stored == BTR_PCUR_OLD_STORED);
ut_a(cursor->old_rec);
page = btr_cur_get_page(btr_pcur_get_btr_cur(cursor));
if ((latch_mode == BTR_SEARCH_LEAF)
|| (latch_mode == BTR_MODIFY_LEAF)) {
/* Try optimistic restoration */
if (buf_page_optimistic_get(latch_mode, page,
cursor->modify_clock, mtr)) {
cursor->pos_state = BTR_PCUR_IS_POSITIONED;
buf_page_dbg_add_level(page, SYNC_TREE_NODE);
if (cursor->rel_pos == BTR_PCUR_ON) {
cursor->latch_mode = latch_mode;
ut_ad(cmp_rec_rec(cursor->old_rec,
btr_pcur_get_rec(cursor),
dict_tree_find_index(
btr_cur_get_tree(
btr_pcur_get_btr_cur(cursor)),
btr_pcur_get_rec(cursor)))
== 0);
return(TRUE);
}
return(FALSE);
}
}
/* If optimistic restoration did not succeed, open the cursor anew */
heap = mem_heap_create(256);
tree = btr_cur_get_tree(btr_pcur_get_btr_cur(cursor));
tuple = dict_tree_build_data_tuple(tree, cursor->old_rec, heap);
/* Save the old search mode of the cursor */
old_mode = cursor->search_mode;
if (cursor->rel_pos == BTR_PCUR_ON) {
mode = PAGE_CUR_LE;
} else if (cursor->rel_pos == BTR_PCUR_AFTER) {
mode = PAGE_CUR_G;
} else {
ut_ad(cursor->rel_pos == BTR_PCUR_BEFORE);
mode = PAGE_CUR_L;
}
btr_pcur_open_with_no_init(btr_pcur_get_btr_cur(cursor)->index, tuple,
mode, latch_mode, cursor, 0, mtr);
cursor->old_stored = BTR_PCUR_OLD_STORED;
/* Restore the old search mode */
cursor->search_mode = old_mode;
if ((cursor->rel_pos == BTR_PCUR_ON)
&& btr_pcur_is_on_user_rec(cursor, mtr)
&& (0 == cmp_dtuple_rec(tuple, btr_pcur_get_rec(cursor)))) {
mem_heap_free(heap);
return(TRUE);
}
mem_heap_free(heap);
return(FALSE);
}
/******************************************************************
If the latch mode of the cursor is BTR_SEARCH_LEAF or BTR_MODIFY_LEAF,
releases the page latch and bufferfix reserved by the cursor.
NOTE! In the case of BTR_MODIFY_LEAF, there should not exist changes
made by the current mini-transaction to the data protected by the
cursor latch, as then the latch must not be released until mtr_commit. */
void
btr_pcur_release_leaf(
/*==================*/
btr_pcur_t* cursor, /* in: persistent cursor */
mtr_t* mtr) /* in: mtr */
{
page_t* page;
ut_a(cursor->pos_state == BTR_PCUR_IS_POSITIONED);
ut_ad(cursor->latch_mode != BTR_NO_LATCHES);
page = btr_cur_get_page(btr_pcur_get_btr_cur(cursor));
btr_leaf_page_release(page, cursor->latch_mode, mtr);
cursor->latch_mode = BTR_NO_LATCHES;
cursor->pos_state = BTR_PCUR_WAS_POSITIONED;
}
/*************************************************************
Moves the persistent cursor to the first record on the next page. Releases the
latch on the current page, and bufferunfixes it. Note that there must not be
modifications on the current page, as then the x-latch can be released only in
mtr_commit. */
void
btr_pcur_move_to_next_page(
/*=======================*/
btr_pcur_t* cursor, /* in: persistent cursor; must be on the
last record of the current page */
mtr_t* mtr) /* in: mtr */
{
ulint next_page_no;
ulint space;
page_t* page;
page_t* next_page;
ut_a(cursor->pos_state == BTR_PCUR_IS_POSITIONED);
ut_ad(cursor->latch_mode != BTR_NO_LATCHES);
ut_ad(btr_pcur_is_after_last_on_page(cursor, mtr));
cursor->old_stored = BTR_PCUR_OLD_NOT_STORED;
page = btr_pcur_get_page(cursor);
next_page_no = btr_page_get_next(page, mtr);
space = buf_frame_get_space_id(page);
ut_ad(next_page_no != FIL_NULL);
next_page = btr_page_get(space, next_page_no, cursor->latch_mode, mtr);
btr_leaf_page_release(page, cursor->latch_mode, mtr);
page_cur_set_before_first(next_page, btr_pcur_get_page_cur(cursor));
}
/*************************************************************
Moves the persistent cursor backward if it is on the first record of the page.
Commits mtr. Note that to prevent a possible deadlock, the operation
first stores the position of the cursor, commits mtr, acquires the necessary
latches and restores the cursor position again before returning. The
alphabetical position of the cursor is guaranteed to be sensible on
return, but it may happen that the cursor is not positioned on the last
record of any page, because the structure of the tree may have changed
during the time when the cursor had no latches. */
void
btr_pcur_move_backward_from_page(
/*=============================*/
btr_pcur_t* cursor, /* in: persistent cursor, must be on the first
record of the current page */
mtr_t* mtr) /* in: mtr */
{
ulint prev_page_no;
ulint space;
page_t* page;
page_t* prev_page;
ulint latch_mode;
ulint latch_mode2;
ut_a(cursor->pos_state == BTR_PCUR_IS_POSITIONED);
ut_ad(cursor->latch_mode != BTR_NO_LATCHES);
ut_ad(btr_pcur_is_before_first_on_page(cursor, mtr));
ut_ad(!btr_pcur_is_before_first_in_tree(cursor, mtr));
latch_mode = cursor->latch_mode;
if (latch_mode == BTR_SEARCH_LEAF) {
latch_mode2 = BTR_SEARCH_PREV;
} else if (latch_mode == BTR_MODIFY_LEAF) {
latch_mode2 = BTR_MODIFY_PREV;
} else {
ut_error;
}
btr_pcur_store_position(cursor, mtr);
mtr_commit(mtr);
mtr_start(mtr);
btr_pcur_restore_position(latch_mode2, cursor, mtr);
page = btr_pcur_get_page(cursor);
prev_page_no = btr_page_get_prev(page, mtr);
space = buf_frame_get_space_id(page);
if (btr_pcur_is_before_first_on_page(cursor, mtr)
&& (prev_page_no != FIL_NULL)) {
prev_page = btr_pcur_get_btr_cur(cursor)->left_page;
btr_leaf_page_release(page, latch_mode, mtr);
page_cur_set_after_last(prev_page,
btr_pcur_get_page_cur(cursor));
} else if (prev_page_no != FIL_NULL) {
/* The repositioned cursor did not end on an infimum record on
a page. Cursor repositioning acquired a latch also on the
previous page, but we do not need the latch: release it. */
prev_page = btr_pcur_get_btr_cur(cursor)->left_page;
btr_leaf_page_release(prev_page, latch_mode, mtr);
}
cursor->latch_mode = latch_mode;
cursor->old_stored = BTR_PCUR_OLD_NOT_STORED;
}
/*************************************************************
Moves the persistent cursor to the previous record in the tree. If no records
are left, the cursor stays 'before first in tree'. */
ibool
btr_pcur_move_to_prev(
/*==================*/
/* out: TRUE if the cursor was not before first
in tree */
btr_pcur_t* cursor, /* in: persistent cursor; NOTE that the
function may release the page latch */
mtr_t* mtr) /* in: mtr */
{
ut_ad(cursor->pos_state == BTR_PCUR_IS_POSITIONED);
ut_ad(cursor->latch_mode != BTR_NO_LATCHES);
cursor->old_stored = BTR_PCUR_OLD_NOT_STORED;
if (btr_pcur_is_before_first_on_page(cursor, mtr)) {
if (btr_pcur_is_before_first_in_tree(cursor, mtr)) {
return(FALSE);
}
btr_pcur_move_backward_from_page(cursor, mtr);
return(TRUE);
}
btr_pcur_move_to_prev_on_page(cursor, mtr);
return(TRUE);
}
/******************************************************************
If mode is PAGE_CUR_G or PAGE_CUR_GE, opens a persistent cursor on the first
user record satisfying the search condition, in the case PAGE_CUR_L or
PAGE_CUR_LE, on the last user record. If no such user record exists, then
in the first case sets the cursor after last in tree, and in the latter case
before first in tree. The latching mode must be BTR_SEARCH_LEAF or
BTR_MODIFY_LEAF. */
void
btr_pcur_open_on_user_rec(
/*======================*/
dict_index_t* index, /* in: index */
dtuple_t* tuple, /* in: tuple on which search done */
ulint mode, /* in: PAGE_CUR_L, ... */
ulint latch_mode, /* in: BTR_SEARCH_LEAF or
BTR_MODIFY_LEAF */
btr_pcur_t* cursor, /* in: memory buffer for persistent
cursor */
mtr_t* mtr) /* in: mtr */
{
btr_pcur_open(index, tuple, mode, latch_mode, cursor, mtr);
if ((mode == PAGE_CUR_GE) || (mode == PAGE_CUR_G)) {
if (btr_pcur_is_after_last_on_page(cursor, mtr)) {
btr_pcur_move_to_next_user_rec(cursor, mtr);
}
} else {
ut_ad((mode == PAGE_CUR_LE) || (mode == PAGE_CUR_L));
/* Not implemented yet */
ut_error;
}
}

1436
innobase/btr/btr0sea.c Normal file

File diff suppressed because it is too large

16
innobase/btr/makefilewin Normal file
@ -0,0 +1,16 @@
include ..\include\makefile.i
btr.lib: btr0cur.obj btr0btr.obj btr0pcur.obj btr0sea.obj
lib -out:..\libs\btr.lib btr0cur.obj btr0btr.obj btr0pcur.obj btr0sea.obj
btr0cur.obj: btr0cur.c
$(CCOM) $(CFL) -c btr0cur.c
btr0btr.obj: btr0btr.c
$(CCOM) $(CFL) -c btr0btr.c
btr0sea.obj: btr0sea.c
$(CCOM) $(CFL) -c btr0sea.c
btr0pcur.obj: btr0pcur.c
$(CCOM) $(CFL) -c btr0pcur.c

312
innobase/btr/ts/isql.c Normal file
@ -0,0 +1,312 @@
/************************************************************************
Test for the client: interactive SQL
(c) 1996-1997 Innobase Oy
Created 2/16/1996 Heikki Tuuri
*************************************************************************/
#include "univ.i"
#include "ib_odbc.h"
#include "mem0mem.h"
#include "sync0sync.h"
#include "os0thread.h"
#include "os0proc.h"
#include "os0sync.h"
#include "srv0srv.h"
ulint n_exited = 0;
char cli_srv_endpoint_name[100];
char cli_user_name[100];
ulint n_warehouses = ULINT_MAX;
ulint n_customers_d = ULINT_MAX;
bool is_tpc_d = FALSE;
ulint n_rounds = ULINT_MAX;
ulint n_users = ULINT_MAX;
ulint startdate = 0;
ulint enddate = 0;
bool own_warehouse = FALSE;
ulint mem_pool_size = ULINT_MAX;
/*************************************************************************
Reads keywords and values from an initfile. In case of an error, exits
from the process. */
static
void
cli_read_initfile(
/*==============*/
FILE* initfile) /* in: file pointer */
{
char str_buf[10000];
ulint ulint_val;
srv_read_init_val(initfile, FALSE, "SRV_ENDPOINT_NAME", str_buf,
&ulint_val);
ut_a(ut_strlen(str_buf) < COM_MAX_ADDR_LEN);
ut_memcpy(cli_srv_endpoint_name, str_buf, COM_MAX_ADDR_LEN);
srv_read_init_val(initfile, FALSE, "USER_NAME", str_buf,
&ulint_val);
ut_a(ut_strlen(str_buf) < COM_MAX_ADDR_LEN);
ut_memcpy(cli_user_name, str_buf, COM_MAX_ADDR_LEN);
srv_read_init_val(initfile, TRUE, "MEM_POOL_SIZE", str_buf,
&mem_pool_size);
srv_read_init_val(initfile, TRUE, "N_WAREHOUSES", str_buf,
&n_warehouses);
srv_read_init_val(initfile, TRUE, "N_CUSTOMERS_D", str_buf,
&n_customers_d);
srv_read_init_val(initfile, TRUE, "IS_TPC_D", str_buf,
&is_tpc_d);
srv_read_init_val(initfile, TRUE, "N_ROUNDS", str_buf,
&n_rounds);
srv_read_init_val(initfile, TRUE, "N_USERS", str_buf,
&n_users);
srv_read_init_val(initfile, TRUE, "STARTDATE", str_buf,
&startdate);
srv_read_init_val(initfile, TRUE, "ENDDATE", str_buf,
&enddate);
srv_read_init_val(initfile, TRUE, "OWN_WAREHOUSE", str_buf,
&own_warehouse);
}
/*************************************************************************
Reads configuration info for the client. */
static
void
cli_boot(
/*=====*/
char* name) /* in: the initialization file name */
{
FILE* initfile;
initfile = fopen(name, "r");
if (initfile == NULL) {
printf(
"Error in client booting: could not open initfile whose name is %s!\n",
name);
os_process_exit(1);
}
cli_read_initfile(initfile);
fclose(initfile);
}
/*********************************************************************
Interactive SQL loop. */
static
void
isql(
/*=*/
FILE* inputfile) /* in: input file containing SQL strings,
or stdin */
{
HENV env;
HDBC conn;
RETCODE ret;
HSTMT sql_query;
ulint tm, oldtm;
char buf[1000];
char* str;
ulint count;
ulint n_begins;
ulint len;
ulint n;
ulint i;
ulint n_lines;
ret = SQLAllocEnv(&env);
ut_a(ret == SQL_SUCCESS);
ret = SQLAllocConnect(env, &conn);
ut_a(ret == SQL_SUCCESS);
ret = SQLConnect(conn, (UCHAR*)cli_srv_endpoint_name,
(SWORD)ut_strlen(cli_srv_endpoint_name),
cli_user_name,
(SWORD)ut_strlen(cli_user_name),
(UCHAR*)"password", 8);
ut_a(ret == SQL_SUCCESS);
printf("Connection established\n");
printf("Interactive SQL performs queries by first making a stored\n");
printf("procedure from them, and then calling the procedure.\n");
printf("Put a semicolon after each statement and\n");
printf("end your query with two <enter>s.\n\n");
printf("You can also give a single input file\n");
printf("as a command line argument to isql.\n\n");
printf("In the file separate SQL queries and procedure bodies\n");
printf("by a single empty line. Do not write the final END; into\n");
printf("a procedure body.\n\n");
count = 0;
loop:
count++;
n = 0;
n_lines = 0;
sprintf(buf, "PROCEDURE P%s%lu () IS\nBEGIN ", cli_user_name,
count);
for (;;) {
len = ut_strlen(buf + n) - 1;
n += len;
if (len == 0) {
break;
} else {
sprintf(buf + n, "\n");
n++;
n_lines++;
}
str = fgets(buf + n, 1000, inputfile);
if ((str == NULL) && (inputfile != stdin)) {
/* Reached end-of-file: switch to input from
keyboard */
inputfile = stdin;
break;
}
ut_a(str);
}
if (n_lines == 1) {
/* Empty procedure */
goto loop;
}
/* If the statement is actually the body of a procedure,
erase the first BEGIN from the string: */
n_begins = 0;
for (i = 0; i < n - 5; i++) {
if (ut_memcmp(buf + i, "BEGIN", 5) == 0) {
n_begins++;
}
}
if (n_begins > 1) {
for (i = 0; i < n - 5; i++) {
if (ut_memcmp(buf + i, "BEGIN", 5) == 0) {
/* Erase the first BEGIN: */
ut_memcpy(buf + i, "     ", 5);
break;
}
}
}
sprintf(buf + n, "END;\n");
printf("SQL procedure to execute:\n%s\n", buf);
ret = SQLAllocStmt(conn, &sql_query);
ut_a(ret == SQL_SUCCESS);
ret = SQLPrepare(sql_query, (UCHAR*)buf, ut_strlen(buf));
ut_a(ret == SQL_SUCCESS);
ret = SQLExecute(sql_query);
ut_a(ret == SQL_SUCCESS);
sprintf(buf, "{P%s%lu ()}", cli_user_name, count);
ret = SQLAllocStmt(conn, &sql_query);
ut_a(ret == SQL_SUCCESS);
ret = SQLPrepare(sql_query, (UCHAR*)buf, ut_strlen(buf));
ut_a(ret == SQL_SUCCESS);
printf("Starting to execute the query\n");
oldtm = ut_clock();
ret = SQLExecute(sql_query);
tm = ut_clock();
printf("Wall time for query %lu milliseconds\n\n", tm - oldtm);
ut_a(ret == SQL_SUCCESS);
goto loop;
}
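/*********************************************************************
Editorial sketch, not part of the original commit. The loop in isql() wraps
the typed statements into "PROCEDURE P<user><count> () IS / BEGIN ... END;"
and blanks out the wrapper's own BEGIN when the input already holds a
procedure body with its own BEGIN. The function below restates that text
handling with bounds-checked standard C calls. The name toy_wrap_procedure
and its parameters are hypothetical; unlike the loop above, it locates the
wrapper's BEGIN by offset rather than by scanning the buffer. */

```c
#include <stdio.h>
#include <string.h>

/* Writes the wrapped procedure text for body into out. Returns 0 on
success and -1 if out_size is too small. */
int
toy_wrap_procedure(char* out, size_t out_size, const char* user,
			unsigned long count, const char* body)
{
	int	n;
	int	m;

	/* Procedure header; the wrapper's BEGIN will start at offset n */
	n = snprintf(out, out_size, "PROCEDURE P%s%lu () IS\n", user, count);
	if (n < 0 || (size_t)n >= out_size) {
		return(-1);
	}

	m = snprintf(out + n, out_size - (size_t)n, "BEGIN %sEND;\n", body);
	if (m < 0 || (size_t)m >= out_size - (size_t)n) {
		return(-1);
	}

	/* If the body brings its own BEGIN, erase the wrapper's BEGIN by
	overwriting its five characters with spaces */
	if (strstr(body, "BEGIN") != NULL) {
		memcpy(out + n, "     ", 5);
	}

	return(0);
}
```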
/********************************************************************
Main test function. */
void
main(int argc, char* argv[])
/*========================*/
{
ulint tm, oldtm;
FILE* inputfile;
if (argc > 2) {
printf("Only one input file allowed\n");
os_process_exit(1);
} else if (argc == 2) {
inputfile = fopen(argv[1], "r");
if (inputfile == NULL) {
printf(
"Error: could not open the inputfile whose name is %s!\n",
argv[1]);
os_process_exit(1);
}
} else {
inputfile = stdin;
}
cli_boot("cli_init");
sync_init();
mem_init(mem_pool_size);
oldtm = ut_clock();
isql(inputfile);
tm = ut_clock();
printf("Wall time for test %lu milliseconds\n", tm - oldtm);
printf("TESTS COMPLETED SUCCESSFULLY!\n");
}

16
innobase/btr/ts/makefile Normal file
View file

@ -0,0 +1,16 @@
include ..\..\makefile.i
doall: tssrv tscli isql
tssrv: ..\btr.lib tssrv.c
$(CCOM) $(CFL) -I.. -I..\.. ..\btr.lib ..\..\eval.lib ..\..\ibuf.lib ..\..\trx.lib ..\..\pars.lib ..\..\que.lib ..\..\lock.lib ..\..\row.lib ..\..\read.lib ..\..\srv.lib ..\..\com.lib ..\..\usr.lib ..\..\thr.lib ..\..\fut.lib ..\..\fsp.lib ..\..\page.lib ..\..\dyn.lib ..\..\mtr.lib ..\..\log.lib ..\..\rem.lib ..\..\fil.lib ..\..\buf.lib ..\..\dict.lib ..\..\data.lib ..\..\mach.lib ..\..\ha.lib ..\..\ut.lib ..\..\sync.lib ..\..\mem.lib ..\..\os.lib tssrv.c $(LFL)
tscli: ..\btr.lib tscli.c
$(CCOM) $(CFL) -I.. -I..\.. ..\btr.lib ..\..\ib_odbc.lib ..\..\eval.lib ..\..\ibuf.lib ..\..\trx.lib ..\..\pars.lib ..\..\que.lib ..\..\lock.lib ..\..\row.lib ..\..\read.lib ..\..\srv.lib ..\..\com.lib ..\..\usr.lib ..\..\thr.lib ..\..\fut.lib ..\..\fsp.lib ..\..\page.lib ..\..\dyn.lib ..\..\mtr.lib ..\..\log.lib ..\..\rem.lib ..\..\fil.lib ..\..\buf.lib ..\..\dict.lib ..\..\data.lib ..\..\mach.lib ..\..\ha.lib ..\..\ut.lib ..\..\sync.lib ..\..\mem.lib ..\..\os.lib tscli.c $(LFL)
isql: ..\btr.lib isql.c
$(CCOM) $(CFL) -I.. -I..\.. ..\btr.lib ..\..\ib_odbc.lib ..\..\eval.lib ..\..\ibuf.lib ..\..\trx.lib ..\..\pars.lib ..\..\que.lib ..\..\lock.lib ..\..\row.lib ..\..\read.lib ..\..\srv.lib ..\..\com.lib ..\..\usr.lib ..\..\thr.lib ..\..\fut.lib ..\..\fsp.lib ..\..\page.lib ..\..\dyn.lib ..\..\mtr.lib ..\..\log.lib ..\..\rem.lib ..\..\fil.lib ..\..\buf.lib ..\..\dict.lib ..\..\data.lib ..\..\mach.lib ..\..\ha.lib ..\..\ut.lib ..\..\sync.lib ..\..\mem.lib ..\..\os.lib isql.c $(LFL)
tsrecv: ..\btr.lib tsrecv.c
$(CCOM) $(CFL) -I.. -I..\.. ..\btr.lib ..\..\ibuf.lib ..\..\trx.lib ..\..\pars.lib ..\..\que.lib ..\..\lock.lib ..\..\row.lib ..\..\read.lib ..\..\srv.lib ..\..\com.lib ..\..\usr.lib ..\..\thr.lib ..\..\fut.lib ..\..\fsp.lib ..\..\page.lib ..\..\dyn.lib ..\..\mtr.lib ..\..\log.lib ..\..\rem.lib ..\..\fil.lib ..\..\buf.lib ..\..\dict.lib ..\..\data.lib ..\..\mach.lib ..\..\ha.lib ..\..\ut.lib ..\..\sync.lib ..\..\mem.lib ..\..\os.lib tsrecv.c $(LFL)

View file

@ -0,0 +1,483 @@
/************************************************************************
The test module for the record manager of MVB.
(c) 1994 Heikki Tuuri
Created 1/25/1994 Heikki Tuuri
*************************************************************************/
#include "rm0phr.h"
#include "rm0lgr.h"
#include "ut0ut.h"
#include "buf0mem.h"
#include "rm0ipg.h"
#include "../it0it.h"
#include "../it0hi.h"
#include "../it0ads.h"
byte buf[100];
byte buf2[100];
lint lintbuf[2048];
byte numbuf[6000];
byte numlogrecbuf[100];
phr_record_t* qs_table[100000];
lint qs_comp = 0;
extern
void
test1(void);
#ifdef NOT_DEFINED
void
q_sort(lint low, lint up)
{
phr_record_t* temp, *pivot;
lint i, j;
pivot = qs_table[(low + up) / 2];
i = low;
j = up;
while (i < j) {
qs_comp++;
if (cmp_phr_compare(qs_table[i], pivot) <= 0) {
i++;
} else {
j--;
temp = qs_table[i];
qs_table[i] = qs_table[j];
qs_table[j] = temp;
}
}
if (j == up) {
temp = qs_table[(low + up) / 2];
qs_table[(low + up) / 2] = qs_table[up - 1];
qs_table[up - 1] = temp;
j--;
}
if (j - low <= 1) {
/* do nothing */
} else if (j - low == 2) {
qs_comp++;
if (cmp_phr_compare(qs_table[low],
qs_table[low + 1])
<= 0) {
/* do nothing */
} else {
temp = qs_table[low];
qs_table[low] = qs_table[low + 1];
qs_table[low + 1] = temp;
}
} else {
q_sort(low, j);
}
if (up - j <= 1) {
/* do nothing */
} else if (up - j == 2) {
qs_comp++;
if (cmp_phr_compare(qs_table[j],
qs_table[j + 1])
<= 0) {
/* do nothing */
} else {
temp = qs_table[j];
qs_table[j] = qs_table[j + 1];
qs_table[j + 1] = temp;
}
} else {
q_sort(j, up);
}
}
#endif
extern
void
test1(void)
{
phr_record_t* physrec;
phr_record_t* rec1;
phr_record_t* rec2;
lgr_record_t* logrec;
lgrf_field_t* logfield;
lint len;
byte* str;
lint len2;
lint tm;
lint oldtm;
lint i, j, k, l, m;
bool b;
it_cur_cursor_t cursor;
ipg_cur_cursor_t* page_cursor;
ipg_page_t* page;
byte c4, c3, c2, c1, c0;
lint rand, rnd1, rnd2;
byte* nb;
lgr_record_t* numlogrec;
byte* pgbuf;
mem_stream_t* stream;
lint tree1, tree2, tree3;
lint dummy1, dummy2;
pgbuf = (byte*)lintbuf;
stream = mem_stream_create(0);
printf("-------------------------------------------\n");
printf("TEST 1. Speed and basic tests.\n");
logrec = lgr_create_logical_record(stream, 2);
nb = numbuf;
c4 = '0';
c3 = '0';
for (c2 = '0'; c2 <= '9'; c2++) {
for (c1 = '0'; c1 <= '9'; c1++) {
for (c0 = '0'; c0 <= '9'; c0++) {
*nb = c4; nb++;
*nb = c3; nb++;
*nb = c2; nb++;
*nb = c1; nb++;
*nb = c0; nb++;
*nb = '\0'; nb++;
}
}
}
numlogrec = lgr_create_logical_record(stream, 2);
tree1 = it_create_index_tree();
oldtm = ut_clock();
rand = 99900;
rnd1 = 67;
for (j = 0; j < 1; j++) {
for (i = 0 ; i < 100000; i++) {
rand = (rand + 1) % 100000;
logfield = lgr_get_nth_field(numlogrec, 0);
lgrf_set_data(logfield, numbuf + 6 * (rand / 300));
lgrf_set_len(logfield, 6);
logfield = lgr_get_nth_field(numlogrec, 1);
lgrf_set_data(logfield, numbuf + 6 * (rand % 300));
lgrf_set_len(logfield, 6);
/*
it_insert(tree1, numlogrec);
*/
it_cur_search_tree_to_nth_level(tree1, 1, numlogrec,
IPG_SE_L_GE, &cursor, &dummy1, &dummy2);
/*
it_cur_set_to_first(tree1, &cursor);
*/
it_cur_insert_record(&cursor, numlogrec);
}
}
tm = ut_clock();
printf("Time for inserting %ld recs = %ld \n", i* j, tm - oldtm);
/* it_print_tree(tree1, 10);*/
hi_print_info();
ads_print_info();
/*
oldtm = ut_clock();
rand = 11113;
for (i = 0; i < 5000; i++) {
rand = (rand + 57123) % 100000;
logfield = lgr_get_nth_field(numlogrec, 0);
lgrf_set_data(logfield, numbuf + 6 * (rand / 300));
lgrf_set_len(logfield, 6);
logfield = lgr_get_nth_field(numlogrec, 1);
lgrf_set_data(logfield, numbuf + 6 * (rand % 300));
lgrf_set_len(logfield, 6);
it_cur_search_tree_to_nth_level(tree1, 1, numlogrec,
IPG_SE_L_GE, &cursor, &dummy1, &dummy2);
}
tm = ut_clock();
printf("Time for searching %ld recs = %ld \n", i, tm - oldtm);
*/
it_cur_set_to_first(tree1, &cursor);
rec1 = ipg_cur_get_record(it_cur_get_page_cursor(&cursor));
for (i = 0;; i++) {
it_cur_move_to_next(&cursor);
if (it_cur_end_of_level(&cursor)) {
break;
}
rec2 = ipg_cur_get_record(it_cur_get_page_cursor(&cursor));
ut_a(cmp_phr_compare(rec1, rec2) == -1);
rec1 = rec2;
}
printf("tree1 checked for right sorted order!\n");
#ifdef not_defined
oldtm = ut_clock();
for (j = 0; j < 1; j++) {
rand = 11113;
for (i = 0; i < 3000; i++) {
rand = (rand + 57123) % 100000;
logfield = lgr_get_nth_field(numlogrec, 0);
lgrf_set_data(logfield, numbuf + 6 * (rand / 300));
lgrf_set_len(logfield, 6);
logfield = lgr_get_nth_field(numlogrec, 1);
lgrf_set_data(logfield, numbuf + 6 * (rand % 300));
lgrf_set_len(logfield, 6);
physrec = hi_search(numlogrec);
ut_a(physrec);
}
}
ut_a(physrec);
tm = ut_clock();
printf("Time for hi_search %ld recs = %ld \n", i * j,
tm - oldtm);
oldtm = ut_clock();
for (i = 0; i < 100000; i++) {
/* j += lgr_fold(numlogrec, -1, -1);*/
/* b += phr_lgr_equal(physrec, numlogrec, -1);*/
k += ut_hash_lint(j, HI_TABLE_SIZE);
}
/* ut_a(b);*/
tm = ut_clock();
printf("Time for fold + equal %ld recs %s = %ld \n", i, physrec,
tm - oldtm);
printf("%ld %ld %ld\n", j, b, k);
hi_print_info();
tree2 = it_create_index_tree();
rand = 90000;
for (i = 0; i < 300; i++) {
rand = (rand + 1) % 100000;
logfield = lgr_get_nth_field(numlogrec, 0);
lgrf_set_data(logfield, numbuf + 6 * (rand / 300));
lgrf_set_len(logfield, 6);
logfield = lgr_get_nth_field(numlogrec, 1);
lgrf_set_data(logfield, numbuf + 6 * (rand % 300));
lgrf_set_len(logfield, 6);
it_cur_search_tree_to_nth_level(tree2, 1, numlogrec,
IPG_SE_L_GE, &cursor);
it_cur_insert_record(&cursor, numlogrec);
}
oldtm = ut_clock();
rand = 10000;
for (i = 0; i < 3000; i++) {
rand = (rand + 1) % 100000;
logfield = lgr_get_nth_field(numlogrec, 0);
lgrf_set_data(logfield, numbuf + 6 * (rand / 300));
lgrf_set_len(logfield, 6);
logfield = lgr_get_nth_field(numlogrec, 1);
lgrf_set_data(logfield, numbuf + 6 * (rand % 300));
lgrf_set_len(logfield, 6);
it_cur_search_tree_to_nth_level(tree2, 1, numlogrec,
IPG_SE_L_GE, &cursor);
it_cur_insert_record(&cursor, numlogrec);
}
tm = ut_clock();
printf("Time for inserting sequentially %ld recs = %ld \n",
i, tm - oldtm);
/* it_print_tree(tree2, 10); */
tree3 = it_create_index_tree();
rand = 0;
for (i = 0; i < 300; i++) {
rand = (rand + 1) % 100000;
logfield = lgr_get_nth_field(numlogrec, 0);
lgrf_set_data(logfield, numbuf + 6 * (rand / 300));
lgrf_set_len(logfield, 6);
logfield = lgr_get_nth_field(numlogrec, 1);
lgrf_set_data(logfield, numbuf + 6 * (rand % 300));
lgrf_set_len(logfield, 6);
it_cur_search_tree_to_nth_level(tree3, 1, numlogrec,
IPG_SE_L_GE, &cursor);
it_cur_insert_record(&cursor, numlogrec);
}
oldtm = ut_clock();
rand = 100000;
for (i = 0; i < 3000; i++) {
rand = (rand - 1) % 100000;
logfield = lgr_get_nth_field(numlogrec, 0);
lgrf_set_data(logfield, numbuf + 6 * (rand / 300));
lgrf_set_len(logfield, 6);
logfield = lgr_get_nth_field(numlogrec, 1);
lgrf_set_data(logfield, numbuf + 6 * (rand % 300));
lgrf_set_len(logfield, 6);
it_cur_search_tree_to_nth_level(tree3, 1, numlogrec,
IPG_SE_L_GE, &cursor);
it_cur_insert_record(&cursor, numlogrec);
}
tm = ut_clock();
printf("Time for inserting sequentially downw. %ld recs = %ld \n",
i, tm - oldtm);
/* it_print_tree(tree3, 10); */
#endif
}
#ifdef NOT_DEFINED
/* Test of quicksort */
void
test2(void)
{
mem_stream_t* stream;
byte* stbuf;
lgrf_field_t* logfield;
lint tm;
lint oldtm;
lint i, j, k, l, m;
lint rand;
lgr_record_t* numlogrec;
phr_record_t* ph_rec;
stream = mem_stream_create(1000);
numlogrec = lgr_create_logical_record(stream, 2);
oldtm = ut_clock();
rand = 11113;
for (i = 0; i < 50000; i++) {
stbuf = mem_stream_alloc(stream, 30);
rand = (rand + 57123) % 100000;
logfield = lgr_get_nth_field(numlogrec, 0);
lgrf_set_data(logfield, numbuf + 6 * (rand / 300));
lgrf_set_len(logfield, 6);
logfield = lgr_get_nth_field(numlogrec, 1);
lgrf_set_data(logfield, numbuf + 6 * (rand % 300));
lgrf_set_len(logfield, 6);
ph_rec = phr_create_physical_record(stbuf, 30, numlogrec);
qs_table[i] = ph_rec;
}
tm = ut_clock();
printf("Time for inserting %ld recs to mem stream = %ld \n",
i, tm - oldtm);
oldtm = ut_clock();
q_sort(0, 50000);
tm = ut_clock();
printf("Time for quicksort of %ld recs = %ld, comps: %ld \n",
i, tm - oldtm, qs_comp);
for (i = 1; i < 49999; i++) {
ut_a(-1 ==
cmp_phr_compare(qs_table[i], qs_table[i+1]
));
}
tm = ut_clock();
oldtm = ut_clock();
for (i = 1; i < 50000; i++) {
k += cmp_phr_compare(qs_table[i & 0xF],
qs_table[5]);
}
tm = ut_clock();
printf("%ld\n", k);
printf("Time for cmp of %ld ph_recs = %ld \n",
i, tm - oldtm);
mem_stream_free(stream);
}
#endif
void
main(void)
{
test1();
/* test2(); */
}

View file

@ -0,0 +1,798 @@
/************************************************************************
The test for the index tree
(c) 1994-1996 Innobase Oy
Created 2/16/1996 Heikki Tuuri
*************************************************************************/
#include "sync0sync.h"
#include "ut0mem.h"
#include "mem0mem.h"
#include "data0data.h"
#include "data0type.h"
#include "dict0dict.h"
#include "buf0buf.h"
#include "os0file.h"
#include "fil0fil.h"
#include "fsp0fsp.h"
#include "rem0rec.h"
#include "rem0cmp.h"
#include "mtr0mtr.h"
#include "log0log.h"
#include "page0page.h"
#include "page0cur.h"
#include "..\btr0btr.h"
#include "..\btr0cur.h"
#include "..\btr0pcur.h"
os_file_t files[1000];
mutex_t ios_mutex;
ulint ios;
ulint n[10];
mutex_t incs_mutex;
ulint incs;
byte bigbuf[1000000];
#define N_SPACES 1
#define N_FILES 2
#define FILE_SIZE 1000 /* must be > 512 */
#define POOL_SIZE 1000
#define COUNTER_OFFSET 1500
#define LOOP_SIZE 150
#define N_THREADS 5
ulint zero = 0;
buf_block_t* bl_arr[POOL_SIZE];
/************************************************************************
Io-handler thread function. */
ulint
handler_thread(
/*===========*/
void* arg)
{
ulint segment;
void* mess;
ulint i;
bool ret;
segment = *((ulint*)arg);
printf("Io handler thread %lu starts\n", segment);
for (i = 0;; i++) {
ret = fil_aio_wait(segment, &mess);
ut_a(ret);
buf_page_io_complete((buf_block_t*)mess);
mutex_enter(&ios_mutex);
ios++;
mutex_exit(&ios_mutex);
}
return(0);
}
/*************************************************************************
Creates the files for the file system test and inserts them to
the file system. */
void
create_files(void)
/*==============*/
{
bool ret;
ulint i, k;
char name[20];
os_thread_t thr[5];
os_thread_id_t id[5];
printf("--------------------------------------------------------\n");
printf("Create or open database files\n");
strcpy(name, "j:\\tsfile00");
for (k = 0; k < N_SPACES; k++) {
for (i = 0; i < N_FILES; i++) {
name[9] = (char)((ulint)'0' + k);
name[10] = (char)((ulint)'0' + i);
files[i] = os_file_create(name, OS_FILE_CREATE,
OS_FILE_TABLESPACE, &ret);
if (ret == FALSE) {
ut_a(os_file_get_last_error() ==
OS_FILE_ALREADY_EXISTS);
files[i] = os_file_create(
name, OS_FILE_OPEN,
OS_FILE_TABLESPACE, &ret);
ut_a(ret);
}
ret = os_file_close(files[i]);
ut_a(ret);
if (i == 0) {
fil_space_create(name, k, OS_FILE_TABLESPACE);
}
ut_a(fil_validate());
fil_node_create(name, FILE_SIZE, k);
}
}
ios = 0;
mutex_create(&ios_mutex);
for (i = 0; i < 5; i++) {
n[i] = i;
thr[i] = os_thread_create(handler_thread, n + i, id + i);
}
}
/************************************************************************
Inits space header of space 0. */
void
init_space(void)
/*============*/
{
mtr_t mtr;
printf("Init space header\n");
mtr_start(&mtr);
fsp_header_init(0, FILE_SIZE * N_FILES, &mtr);
mtr_commit(&mtr);
}
/*********************************************************************
Test for index page. */
void
test1(void)
/*=======*/
{
dtuple_t* tuple;
mem_heap_t* heap;
ulint rnd = 0;
dict_index_t* index;
dict_table_t* table;
dict_tree_t* tree;
mtr_t mtr;
byte buf[8];
ulint i;
ulint tm, oldtm;
btr_pcur_t cursor;
printf("-------------------------------------------------\n");
printf("TEST 1. Basic test\n");
heap = mem_heap_create(0);
table = dict_mem_table_create("TS_TABLE1", 2);
dict_mem_table_add_col(table, "COL1", DATA_VARCHAR, DATA_ENGLISH, 10, 0);
dict_mem_table_add_col(table, "COL2", DATA_VARCHAR, DATA_ENGLISH, 10, 0);
dict_table_add_to_cache(table);
index = dict_mem_index_create("TS_TABLE1", "IND1", 0, 2, 0);
dict_mem_index_add_field(index, "COL1", 0);
dict_mem_index_add_field(index, "COL2", 0);
dict_index_add_to_cache(index);
index = dict_index_get("TS_TABLE1", "IND1");
ut_a(index);
tree = dict_index_get_tree(index);
tuple = dtuple_create(heap, 3);
mtr_start(&mtr);
btr_root_create(tree, 0, &mtr);
mtr_commit(&mtr);
mtr_start(&mtr);
dtuple_gen_test_tuple3(tuple, 0, buf);
btr_insert(tree, tuple, &mtr);
mtr_commit(&mtr);
rnd = 90000;
oldtm = ut_clock();
for (i = 0; i < 1000 * UNIV_DBC * UNIV_DBC; i++) {
mtr_start(&mtr);
if (i == 77000) {
rnd = rnd % 200000;
}
rnd = (rnd + 15675751) % 200000;
dtuple_gen_test_tuple3(tuple, rnd, buf);
btr_insert(tree, tuple, &mtr);
mtr_commit(&mtr);
}
tm = ut_clock();
printf("Wall time for test %lu milliseconds\n", tm - oldtm);
rnd = 90000;
oldtm = ut_clock();
for (i = 0; i < 1000 * UNIV_DBC * UNIV_DBC; i++) {
mtr_start(&mtr);
if (i == 50000) {
rnd = rnd % 200000;
}
rnd = (rnd + 595659561) % 200000;
dtuple_gen_test_tuple3(tuple, rnd, buf);
btr_pcur_open(tree, tuple, PAGE_CUR_GE,
BTR_SEARCH_LEAF, &cursor, &mtr);
mtr_commit(&mtr);
}
tm = ut_clock();
printf("Wall time for test %lu milliseconds\n", tm - oldtm);
rnd = 0;
oldtm = ut_clock();
for (i = 0; i < 1000 * UNIV_DBC * UNIV_DBC; i++) {
mtr_start(&mtr);
rnd = (rnd + 35608971) % 200000 + 1;
dtuple_gen_test_tuple3(tuple, rnd, buf);
mtr_commit(&mtr);
}
tm = ut_clock();
printf("Wall time for test %lu milliseconds\n", tm - oldtm);
/* btr_print_tree(tree, 3); */
mem_heap_free(heap);
}
#ifdef notdefined
mtr_start(&mtr);
block = buf_page_create(0, 5, &mtr);
buf_page_x_lock(block, &mtr);
frame = buf_block_get_frame(block);
page = page_create(frame, &mtr);
for (i = 0; i < 512; i++) {
rnd = (rnd + 534671) % 512;
if (i % 27 == 0) {
ut_a(page_validate(page, index));
}
dtuple_gen_test_tuple(tuple, rnd);
/* dtuple_print(tuple);*/
page_cur_search(page, tuple, PAGE_CUR_G, &cursor);
rec = page_cur_insert_rec(&cursor, tuple, NULL, &mtr);
ut_a(rec);
rec_validate(rec);
/* page_print_list(page, 151); */
}
/* page_print_list(page, 151); */
ut_a(page_validate(page, index));
ut_a(page_get_n_recs(page) == 512);
for (i = 0; i < 512; i++) {
rnd = (rnd + 7771) % 512;
if (i % 27 == 0) {
ut_a(page_validate(page, index));
}
dtuple_gen_test_tuple(tuple, rnd);
/* dtuple_print(tuple);*/
page_cur_search(page, tuple, PAGE_CUR_G, &cursor);
page_cur_delete_rec(&cursor, &mtr);
ut_a(rec);
rec_validate(rec);
/* page_print_list(page, 151); */
}
ut_a(page_get_n_recs(page) == 0);
ut_a(page_validate(page, index));
page = page_create(frame, &mtr);
rnd = 311;
for (i = 0; i < 512; i++) {
rnd = (rnd + 1) % 512;
if (i % 27 == 0) {
ut_a(page_validate(page, index));
}
dtuple_gen_test_tuple(tuple, rnd);
/* dtuple_print(tuple);*/
page_cur_search(page, tuple, PAGE_CUR_G, &cursor);
rec = page_cur_insert_rec(&cursor, tuple, NULL, &mtr);
ut_a(rec);
rec_validate(rec);
/* page_print_list(page, 151); */
}
ut_a(page_validate(page, index));
ut_a(page_get_n_recs(page) == 512);
rnd = 217;
for (i = 0; i < 512; i++) {
rnd = (rnd + 1) % 512;
if (i % 27 == 0) {
ut_a(page_validate(page, index));
}
dtuple_gen_test_tuple(tuple, rnd);
/* dtuple_print(tuple);*/
page_cur_search(page, tuple, PAGE_CUR_G, &cursor);
page_cur_delete_rec(&cursor, &mtr);
ut_a(rec);
rec_validate(rec);
/* page_print_list(page, 151); */
}
ut_a(page_validate(page, index));
ut_a(page_get_n_recs(page) == 0);
page = page_create(frame, &mtr);
rnd = 291;
for (i = 0; i < 512; i++) {
rnd = (rnd - 1) % 512;
if (i % 27 == 0) {
ut_a(page_validate(page, index));
}
dtuple_gen_test_tuple(tuple, rnd);
/* dtuple_print(tuple);*/
page_cur_search(page, tuple, PAGE_CUR_G, &cursor);
rec = page_cur_insert_rec(&cursor, tuple, NULL, &mtr);
ut_a(rec);
rec_validate(rec);
/* page_print_list(page, 151); */
}
ut_a(page_validate(page, index));
ut_a(page_get_n_recs(page) == 512);
rnd = 277;
for (i = 0; i < 512; i++) {
rnd = (rnd - 1) % 512;
if (i % 27 == 0) {
ut_a(page_validate(page, index));
}
dtuple_gen_test_tuple(tuple, rnd);
/* dtuple_print(tuple);*/
page_cur_search(page, tuple, PAGE_CUR_G, &cursor);
page_cur_delete_rec(&cursor, &mtr);
ut_a(rec);
rec_validate(rec);
/* page_print_list(page, 151); */
}
ut_a(page_validate(page, index));
ut_a(page_get_n_recs(page) == 0);
mtr_commit(&mtr);
mem_heap_free(heap);
}
/*********************************************************************
Test for index page. */
void
test2(void)
/*=======*/
{
page_t* page;
dtuple_t* tuple;
mem_heap_t* heap;
ulint i, j;
ulint rnd = 0;
rec_t* rec;
page_cur_t cursor;
dict_index_t* index;
dict_table_t* table;
buf_block_t* block;
buf_frame_t* frame;
ulint tm, oldtm;
byte buf[8];
mtr_t mtr;
printf("-------------------------------------------------\n");
printf("TEST 2. Speed test\n");
oldtm = ut_clock();
for (i = 0; i < 1000 * UNIV_DBC * UNIV_DBC; i++) {
ut_memcpy(bigbuf, bigbuf + 800, 800);
}
tm = ut_clock();
printf("Wall time for %lu mem copies of 800 bytes %lu millisecs\n",
i, tm - oldtm);
oldtm = ut_clock();
rnd = 0;
for (i = 0; i < 1000 * UNIV_DBC * UNIV_DBC; i++) {
ut_memcpy(bigbuf + rnd, bigbuf + rnd + 800, 800);
rnd += 1600;
if (rnd > 995000) {
rnd = 0;
}
}
tm = ut_clock();
printf("Wall time for %lu mem copies of 800 bytes %lu millisecs\n",
i, tm - oldtm);
heap = mem_heap_create(0);
table = dict_table_create("TS_TABLE2", 2);
dict_table_add_col(table, "COL1", DATA_VARCHAR, DATA_ENGLISH, 10, 0);
dict_table_add_col(table, "COL2", DATA_VARCHAR, DATA_ENGLISH, 10, 0);
ut_a(0 == dict_table_publish(table));
index = dict_index_create("TS_TABLE2", "IND2", 0, 2, 0);
dict_index_add_field(index, "COL1", 0);
dict_index_add_field(index, "COL2", 0);
ut_a(0 == dict_index_publish(index));
index = dict_index_get("TS_TABLE2", "IND2");
ut_a(index);
tuple = dtuple_create(heap, 2);
oldtm = ut_clock();
rnd = 677;
for (i = 0; i < 4 * UNIV_DBC * UNIV_DBC; i++) {
mtr_start(&mtr);
block = buf_page_create(0, 5, &mtr);
buf_page_x_lock(block, &mtr);
frame = buf_block_get_frame(block);
page = page_create(frame, &mtr);
for (j = 0; j < 250; j++) {
rnd = (rnd + 54841) % 1000;
dtuple_gen_test_tuple2(tuple, rnd, buf);
page_cur_search(page, tuple, PAGE_CUR_G, &cursor);
rec = page_cur_insert_rec(&cursor, tuple, NULL, &mtr);
ut_a(rec);
}
mtr_commit(&mtr);
}
tm = ut_clock();
printf("Wall time for insertion of %lu recs %lu milliseconds\n",
i * j, tm - oldtm);
mtr_start(&mtr);
block = buf_page_get(0, 5, &mtr);
buf_page_s_lock(block, &mtr);
page = buf_block_get_frame(block);
ut_a(page_validate(page, index));
mtr_commit(&mtr);
oldtm = ut_clock();
rnd = 677;
for (i = 0; i < 4 * UNIV_DBC * UNIV_DBC; i++) {
mtr_start(&mtr);
block = buf_page_create(0, 5, &mtr);
buf_page_x_lock(block, &mtr);
frame = buf_block_get_frame(block);
page = page_create(frame, &mtr);
for (j = 0; j < 250; j++) {
rnd = (rnd + 54841) % 1000;
dtuple_gen_test_tuple2(tuple, rnd, buf);
}
mtr_commit(&mtr);
}
tm = ut_clock();
printf(
"Wall time for %lu empty loops with page create %lu milliseconds\n",
i * j, tm - oldtm);
oldtm = ut_clock();
for (i = 0; i < 4 * UNIV_DBC * UNIV_DBC; i++) {
mtr_start(&mtr);
block = buf_page_create(0, 5, &mtr);
buf_page_x_lock(block, &mtr);
frame = buf_block_get_frame(block);
page = page_create(frame, &mtr);
rnd = 100;
for (j = 0; j < 250; j++) {
rnd = (rnd + 1) % 1000;
dtuple_gen_test_tuple2(tuple, rnd, buf);
page_cur_search(page, tuple, PAGE_CUR_G, &cursor);
rec = page_cur_insert_rec(&cursor, tuple, NULL, &mtr);
ut_a(rec);
}
mtr_commit(&mtr);
}
tm = ut_clock();
printf(
"Wall time for sequential insertion of %lu recs %lu milliseconds\n",
i * j, tm - oldtm);
oldtm = ut_clock();
for (i = 0; i < 4 * UNIV_DBC * UNIV_DBC; i++) {
mtr_start(&mtr);
block = buf_page_create(0, 5, &mtr);
buf_page_x_lock(block, &mtr);
frame = buf_block_get_frame(block);
page = page_create(frame, &mtr);
rnd = 500;
for (j = 0; j < 250; j++) {
rnd = (rnd - 1) % 1000;
dtuple_gen_test_tuple2(tuple, rnd, buf);
page_cur_search(page, tuple, PAGE_CUR_G, &cursor);
rec = page_cur_insert_rec(&cursor, tuple, NULL, &mtr);
ut_a(rec);
}
mtr_commit(&mtr);
}
tm = ut_clock();
printf(
"Wall time for descend. seq. insertion of %lu recs %lu milliseconds\n",
i * j, tm - oldtm);
oldtm = ut_clock();
for (i = 0; i < 4 * UNIV_DBC * UNIV_DBC; i++) {
mtr_start(&mtr);
block = buf_page_create(0, 5, &mtr);
buf_page_x_lock(block, &mtr);
frame = buf_block_get_frame(block);
page = page_create(frame, &mtr);
rnd = 677;
for (j = 0; j < 250; j++) {
rnd = (rnd + 54841) % 1000;
dtuple_gen_test_tuple2(tuple, rnd, buf);
page_cur_search(page, tuple, PAGE_CUR_G, &cursor);
rec = page_cur_insert_rec(&cursor, tuple, NULL, &mtr);
ut_a(rec);
}
rnd = 677;
for (j = 0; j < 250; j++) {
rnd = (rnd + 54841) % 1000;
dtuple_gen_test_tuple2(tuple, rnd, buf);
page_cur_search(page, tuple, PAGE_CUR_G, &cursor);
page_cur_delete_rec(&cursor, &mtr);
}
ut_a(page_get_n_recs(page) == 0);
mtr_commit(&mtr);
}
tm = ut_clock();
printf("Wall time for insert and delete of %lu recs %lu milliseconds\n",
i * j, tm - oldtm);
mtr_start(&mtr);
block = buf_page_create(0, 5, &mtr);
buf_page_x_lock(block, &mtr);
frame = buf_block_get_frame(block);
page = page_create(frame, &mtr);
rnd = 677;
for (j = 0; j < 250; j++) {
rnd = (rnd + 54841) % 1000;
dtuple_gen_test_tuple2(tuple, rnd, buf);
page_cur_search(page, tuple, PAGE_CUR_G, &cursor);
rec = page_cur_insert_rec(&cursor, tuple, NULL, &mtr);
ut_a(rec);
}
ut_a(page_validate(page, index));
mtr_print(&mtr);
oldtm = ut_clock();
for (i = 0; i < 4 * UNIV_DBC * UNIV_DBC; i++) {
rnd = 677;
for (j = 0; j < 250; j++) {
rnd = (rnd + 54841) % 1000;
dtuple_gen_test_tuple2(tuple, rnd, buf);
page_cur_search(page, tuple, PAGE_CUR_G, &cursor);
}
}
tm = ut_clock();
printf("Wall time for search of %lu recs %lu milliseconds\n",
i * j, tm - oldtm);
oldtm = ut_clock();
for (i = 0; i < 4 * UNIV_DBC * UNIV_DBC; i++) {
rnd = 677;
for (j = 0; j < 250; j++) {
rnd = (rnd + 54841) % 1000;
dtuple_gen_test_tuple2(tuple, rnd, buf);
}
}
tm = ut_clock();
printf("Wall time for %lu empty loops %lu milliseconds\n",
i * j, tm - oldtm);
mtr_commit(&mtr);
}
#endif
/********************************************************************
Main test function. */
void
main(void)
/*======*/
{
ulint tm, oldtm;
sync_init();
mem_init();
os_aio_init(160, 5);
fil_init(25);
buf_pool_init(POOL_SIZE, POOL_SIZE);
dict_init();
fsp_init();
log_init();
create_files();
init_space();
oldtm = ut_clock();
ut_rnd_set_seed(19);
test1();
/* mem_print_info(); */
tm = ut_clock();
printf("Wall time for test %lu milliseconds\n", tm - oldtm);
printf("TESTS COMPLETED SUCCESSFULLY!\n");
}
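/********************************************************************
Editorial sketch, not part of the original commit. The tests in this file
generate their keys with steps of the form rnd = (rnd + step) % modulus,
for example (rnd + 15675751) % 200000 and (rnd + 534671) % 512. Such a
sequence returns to its starting value after modulus / gcd(step, modulus)
iterations, so a step that is coprime to the modulus visits every residue
once before repeating. The helper names below are hypothetical and only
illustrate that arithmetic. */

```c
#include <assert.h>

/* Greatest common divisor by Euclid's algorithm; toy_gcd(0, b) is b */
unsigned long
toy_gcd(unsigned long a, unsigned long b)
{
	unsigned long	t;

	while (b != 0) {
		t = a % b;
		a = b;
		b = t;
	}

	return(a);
}

/* Number of iterations of x = (x + step) % modulus before the sequence
returns to its starting value. The modulus must be nonzero. */
unsigned long
toy_additive_period(unsigned long step, unsigned long modulus)
{
	assert(modulus != 0);

	return(modulus / toy_gcd(step % modulus, modulus));
}
```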

File diff suppressed because it is too large

5080
innobase/btr/ts/tsbtr97.c Normal file

File diff suppressed because it is too large

4925
innobase/btr/ts/tsbtrfull.c Normal file

File diff suppressed because it is too large

802
innobase/btr/ts/tsbtrins.c Normal file
View file

@ -0,0 +1,802 @@
/************************************************************************
Test for the B-tree
(c) 1994-1997 Innobase Oy
Created 2/16/1996 Heikki Tuuri
*************************************************************************/
#include "os0proc.h"
#include "sync0sync.h"
#include "ut0mem.h"
#include "mem0mem.h"
#include "mem0pool.h"
#include "data0data.h"
#include "data0type.h"
#include "dict0dict.h"
#include "buf0buf.h"
#include "os0file.h"
#include "os0thread.h"
#include "fil0fil.h"
#include "fsp0fsp.h"
#include "rem0rec.h"
#include "rem0cmp.h"
#include "mtr0mtr.h"
#include "log0log.h"
#include "page0page.h"
#include "page0cur.h"
#include "trx0trx.h"
#include "dict0boot.h"
#include "trx0sys.h"
#include "dict0crea.h"
#include "btr0btr.h"
#include "btr0pcur.h"
#include "btr0cur.h"
#include "btr0sea.h"
#include "rem0rec.h"
#include "srv0srv.h"
#include "que0que.h"
#include "com0com.h"
#include "usr0sess.h"
#include "lock0lock.h"
#include "trx0roll.h"
#include "trx0purge.h"
#include "row0ins.h"
#include "row0upd.h"
#include "row0row.h"
#include "row0del.h"
#include "lock0lock.h"
#include "ibuf0ibuf.h"
os_file_t files[1000];
mutex_t ios_mutex;
ulint ios;
ulint n[10];
mutex_t incs_mutex;
ulint incs;
#define N_SPACES 2 /* must be >= 2 */
#define N_FILES 1
#define FILE_SIZE 8096 /* must be > 512 */
#define POOL_SIZE 1024
#define IBUF_SIZE 200
#define COUNTER_OFFSET 1500
#define LOOP_SIZE 150
#define N_THREADS 5
#define COUNT 1
ulint zero = 0;
buf_block_t* bl_arr[POOL_SIZE];
ulint dummy = 0;
byte test_buf[8000];
/************************************************************************
Io-handler thread function. */
ulint
handler_thread(
/*===========*/
void* arg)
{
ulint segment;
void* mess;
ulint i;
bool ret;
segment = *((ulint*)arg);
printf("Io handler thread %lu starts\n", segment);
for (i = 0;; i++) {
ret = fil_aio_wait(segment, &mess);
ut_a(ret);
buf_page_io_complete((buf_block_t*)mess);
mutex_enter(&ios_mutex);
ios++;
mutex_exit(&ios_mutex);
}
return(0);
}
/*************************************************************************
Creates the files for the file system test and inserts them to the file
system. */
void
create_files(void)
/*==============*/
{
bool ret;
ulint i, k;
char name[20];
os_thread_t thr[10];
os_thread_id_t id[10];
printf("--------------------------------------------------------\n");
printf("Create or open database files\n");
strcpy(name, "tsfile00");
for (k = 0; k < N_SPACES; k++) {
for (i = 0; i < N_FILES; i++) {
name[6] = (char)((ulint)'0' + k);
name[7] = (char)((ulint)'0' + i);
files[i] = os_file_create(name, OS_FILE_CREATE,
OS_FILE_TABLESPACE, &ret);
if (ret == FALSE) {
ut_a(os_file_get_last_error() ==
OS_FILE_ALREADY_EXISTS);
files[i] = os_file_create(
name, OS_FILE_OPEN,
OS_FILE_TABLESPACE, &ret);
ut_a(ret);
} else {
if (k == 1) {
ut_a(os_file_set_size(files[i],
8192 * IBUF_SIZE, 0));
} else {
ut_a(os_file_set_size(files[i],
8192 * FILE_SIZE, 0));
}
}
ret = os_file_close(files[i]);
ut_a(ret);
if (i == 0) {
fil_space_create(name, k, OS_FILE_TABLESPACE);
}
ut_a(fil_validate());
fil_node_create(name, FILE_SIZE, k);
}
}
ios = 0;
mutex_create(&ios_mutex);
mutex_set_level(&ios_mutex, SYNC_NO_ORDER_CHECK);
for (i = 0; i < 9; i++) {
n[i] = i;
thr[i] = os_thread_create(handler_thread, n + i, id + i);
}
}
/************************************************************************
Inits space headers of spaces 0 and 1. */
void
init_spaces(void)
/*=============*/
{
mtr_t mtr;
mtr_start(&mtr);
fsp_header_init(0, FILE_SIZE * N_FILES, &mtr);
fsp_header_init(1, IBUF_SIZE, &mtr);
mtr_commit(&mtr);
}
/*********************************************************************
Test for table creation. */
ulint
test1(
/*==*/
void* arg)
{
sess_t* sess;
com_endpoint_t* com_endpoint;
mem_heap_t* heap;
dict_index_t* index;
dict_table_t* table;
que_fork_t* fork;
que_thr_t* thr;
trx_t* trx;
UT_NOT_USED(arg);
printf("-------------------------------------------------\n");
printf("TEST 1. CREATE TABLE WITH 3 COLUMNS AND WITH 3 INDEXES\n");
heap = mem_heap_create(512);
com_endpoint = (com_endpoint_t*)heap; /* This is a dummy non-NULL
value */
mutex_enter(&kernel_mutex);
sess = sess_open(ut_dulint_zero, com_endpoint, (byte*)"user1", 6);
trx = sess->trx;
mutex_exit(&kernel_mutex);
ut_a(trx_start(trx, ULINT_UNDEFINED));
table = dict_mem_table_create("TS_TABLE1", 0, 3);
dict_mem_table_add_col(table, "COL1", DATA_VARCHAR,
DATA_ENGLISH, 10, 0);
dict_mem_table_add_col(table, "COL2", DATA_VARCHAR,
DATA_ENGLISH, 10, 0);
dict_mem_table_add_col(table, "COL3", DATA_VARCHAR,
DATA_ENGLISH, 100, 0);
/*------------------------------------*/
/* CREATE TABLE */
fork = que_fork_create(NULL, NULL, QUE_FORK_EXECUTE, heap);
fork->trx = trx;
thr = que_thr_create(fork, fork, heap);
thr->child = tab_create_graph_create(fork, thr, table, heap);
mutex_enter(&kernel_mutex);
que_graph_publish(fork, trx->sess);
trx->graph = fork;
ut_a(thr == que_fork_start_command(fork, SESS_COMM_EXECUTE, 0));
mutex_exit(&kernel_mutex);
que_run_threads(thr);
/* dict_table_print_by_name("SYS_TABLES");
dict_table_print_by_name("SYS_COLUMNS"); */
/*-------------------------------------*/
/* CREATE CLUSTERED INDEX */
index = dict_mem_index_create("TS_TABLE1", "IND1", 0,
DICT_UNIQUE | DICT_CLUSTERED, 1);
dict_mem_index_add_field(index, "COL1", 0);
ut_a(mem_heap_validate(index->heap));
fork = que_fork_create(NULL, NULL, QUE_FORK_EXECUTE, heap);
fork->trx = trx;
thr = que_thr_create(fork, fork, heap);
thr->child = ind_create_graph_create(fork, thr, index, heap);
mutex_enter(&kernel_mutex);
que_graph_publish(fork, trx->sess);
trx->graph = fork;
ut_a(thr == que_fork_start_command(fork, SESS_COMM_EXECUTE, 0));
mutex_exit(&kernel_mutex);
que_run_threads(thr);
/* dict_table_print_by_name("SYS_INDEXES");
dict_table_print_by_name("SYS_FIELDS"); */
/*-------------------------------------*/
/* CREATE SECONDARY INDEX */
index = dict_mem_index_create("TS_TABLE1", "IND2", 0, 0, 1);
dict_mem_index_add_field(index, "COL2", 0);
ut_a(mem_heap_validate(index->heap));
fork = que_fork_create(NULL, NULL, QUE_FORK_EXECUTE, heap);
fork->trx = trx;
thr = que_thr_create(fork, fork, heap);
thr->child = ind_create_graph_create(fork, thr, index, heap);
mutex_enter(&kernel_mutex);
que_graph_publish(fork, trx->sess);
trx->graph = fork;
ut_a(thr == que_fork_start_command(fork, SESS_COMM_EXECUTE, 0));
mutex_exit(&kernel_mutex);
que_run_threads(thr);
/* dict_table_print_by_name("SYS_INDEXES");
dict_table_print_by_name("SYS_FIELDS"); */
/*-------------------------------------*/
/* CREATE ANOTHER SECONDARY INDEX */
index = dict_mem_index_create("TS_TABLE1", "IND3", 0, 0, 1);
dict_mem_index_add_field(index, "COL2", 0);
ut_a(mem_heap_validate(index->heap));
fork = que_fork_create(NULL, NULL, QUE_FORK_EXECUTE, heap);
fork->trx = trx;
thr = que_thr_create(fork, fork, heap);
thr->child = ind_create_graph_create(fork, thr, index, heap);
mutex_enter(&kernel_mutex);
que_graph_publish(fork, trx->sess);
trx->graph = fork;
ut_a(thr == que_fork_start_command(fork, SESS_COMM_EXECUTE, 0));
mutex_exit(&kernel_mutex);
que_run_threads(thr);
#ifdef notdefined
/*-------------------------------------*/
/* CREATE YET ANOTHER SECONDARY INDEX */
index = dict_mem_index_create("TS_TABLE1", "IND4", 0, 0, 1);
dict_mem_index_add_field(index, "COL2", 0);
ut_a(mem_heap_validate(index->heap));
fork = que_fork_create(NULL, NULL, QUE_FORK_EXECUTE, heap);
fork->trx = trx;
thr = que_thr_create(fork, fork, heap);
thr->child = ind_create_graph_create(fork, thr, index, heap);
mutex_enter(&kernel_mutex);
que_graph_publish(fork, trx->sess);
trx->graph = fork;
ut_a(thr == que_fork_start_command(fork, SESS_COMM_EXECUTE, 0));
mutex_exit(&kernel_mutex);
que_run_threads(thr);
#endif
/* dict_table_print_by_name("SYS_INDEXES");
dict_table_print_by_name("SYS_FIELDS"); */
return(0);
}
/*********************************************************************
Another test for inserts. */
ulint
test2_1(
/*====*/
void* arg)
{
ulint tm, oldtm;
sess_t* sess;
com_endpoint_t* com_endpoint;
mem_heap_t* heap;
que_fork_t* fork;
dict_table_t* table;
que_thr_t* thr;
trx_t* trx;
ulint i;
byte buf[100];
ins_node_t* node;
ulint count = 0;
ulint rnd;
dtuple_t* row;
dict_index_t* index;
/* dict_tree_t* tree;
dtuple_t* entry;
btr_pcur_t pcur;
mtr_t mtr; */
printf("-------------------------------------------------\n");
printf("TEST 2.1. MASSIVE ASCENDING INSERT\n");
heap = mem_heap_create(512);
com_endpoint = (com_endpoint_t*)heap; /* This is a dummy non-NULL
value */
mutex_enter(&kernel_mutex);
sess = sess_open(ut_dulint_zero, com_endpoint, (byte*)"user1", 6);
trx = sess->trx;
mutex_exit(&kernel_mutex);
loop:
ut_a(trx_start(trx, ULINT_UNDEFINED));
/*-------------------------------------*/
/* MASSIVE INSERT */
fork = que_fork_create(NULL, NULL, QUE_FORK_INSERT, heap);
fork->trx = trx;
thr = que_thr_create(fork, fork, heap);
table = dict_table_get("TS_TABLE1", trx);
row = dtuple_create(heap, 3 + DATA_N_SYS_COLS);
dict_table_copy_types(row, table);
node = ins_node_create(fork, thr, row, table, heap);
thr->child = node;
row_ins_init_sys_fields_at_sql_compile(node->row, node->table, heap);
row_ins_init_sys_fields_at_sql_prepare(node->row, node->table, trx);
node->init_all_sys_fields = FALSE;
mutex_enter(&kernel_mutex);
que_graph_publish(fork, trx->sess);
trx->graph = fork;
mutex_exit(&kernel_mutex);
rnd = 0;
log_print();
oldtm = ut_clock();
for (i = 0; i < *((ulint*)arg); i++) {
dtuple_gen_test_tuple3(row, rnd, DTUPLE_TEST_FIXED30, buf);
mutex_enter(&kernel_mutex);
ut_a(
thr == que_fork_start_command(fork, SESS_COMM_EXECUTE, 0));
mutex_exit(&kernel_mutex);
que_run_threads(thr);
if (i % 5000 == 0) {
/* ibuf_print(); */
/* buf_print(); */
/* buf_print_io(); */
/*
tm = ut_clock();
printf("Wall time for %lu inserts %lu milliseconds\n",
i, tm - oldtm); */
}
rnd = rnd + 1;
}
tm = ut_clock();
printf("Wall time for %lu inserts %lu milliseconds\n", i, tm - oldtm);
log_print();
/* dict_table_print_by_name("TS_TABLE1"); */
/* ibuf_print(); */
index = index;
index = dict_table_get_first_index(table);
if (zero) {
btr_search_index_print_info(index);
}
btr_validate_tree(dict_index_get_tree(index));
#ifdef notdefined
index = dict_table_get_next_index(index);
if (zero) {
btr_search_index_print_info(index);
}
btr_validate_tree(dict_index_get_tree(index));
index = dict_table_get_next_index(index);
/* btr_search_index_print_info(index); */
btr_validate_tree(dict_index_get_tree(index));
/* dict_table_print_by_name("TS_TABLE1"); */
/* Check inserted entries */
btr_search_print_info();
entry = dtuple_create(heap, 1);
dtuple_gen_search_tuple3(entry, 0, buf);
mtr_start(&mtr);
index = dict_table_get_first_index(table);
tree = dict_index_get_tree(index);
btr_pcur_open(index, entry, PAGE_CUR_L, BTR_SEARCH_LEAF, &pcur, &mtr);
ut_a(btr_pcur_is_before_first_in_tree(&pcur, &mtr));
for (i = 0; i < *((ulint*)arg); i++) {
ut_a(btr_pcur_move_to_next(&pcur, &mtr));
dtuple_gen_search_tuple3(entry, i, buf);
ut_a(0 == cmp_dtuple_rec(entry, btr_pcur_get_rec(&pcur)));
}
ut_a(!btr_pcur_move_to_next(&pcur, &mtr));
ut_a(btr_pcur_is_after_last_in_tree(&pcur, &mtr));
btr_pcur_close(&pcur);
mtr_commit(&mtr);
printf("Validating tree\n");
btr_validate_tree(tree);
printf("Validated\n");
#endif
/*-------------------------------------*/
/* ROLLBACK */
#ifdef notdefined
/* btr_validate_tree(tree); */
fork = que_fork_create(NULL, NULL, QUE_FORK_EXECUTE, heap);
fork->trx = trx;
thr = que_thr_create(fork, fork, heap);
thr->child = roll_node_create(fork, thr, heap);
mutex_enter(&kernel_mutex);
que_graph_publish(fork, trx->sess);
trx->graph = fork;
ut_a(thr == que_fork_start_command(fork, SESS_COMM_EXECUTE, 0));
mutex_exit(&kernel_mutex);
oldtm = ut_clock();
que_run_threads(thr);
tm = ut_clock();
printf("Wall time for rollback of %lu inserts %lu milliseconds\n",
i, tm - oldtm);
os_thread_sleep(1000000);
/* dict_table_print_by_name("TS_TABLE1"); */
dtuple_gen_search_tuple3(entry, 0, buf);
mtr_start(&mtr);
btr_pcur_open(index, entry, PAGE_CUR_L, BTR_SEARCH_LEAF, &pcur, &mtr);
ut_a(btr_pcur_is_before_first_in_tree(&pcur, &mtr));
ut_a(!btr_pcur_move_to_next(&pcur, &mtr));
ut_a(btr_pcur_is_after_last_in_tree(&pcur, &mtr));
btr_pcur_close(&pcur);
mtr_commit(&mtr);
btr_search_print_info();
#endif
/*-------------------------------------*/
/* COMMIT */
fork = que_fork_create(NULL, NULL, QUE_FORK_EXECUTE, heap);
fork->trx = trx;
thr = que_thr_create(fork, fork, heap);
thr->child = commit_node_create(fork, thr, heap);
mutex_enter(&kernel_mutex);
que_graph_publish(fork, trx->sess);
trx->graph = fork;
ut_a(thr == que_fork_start_command(fork, SESS_COMM_EXECUTE, 0));
mutex_exit(&kernel_mutex);
oldtm = ut_clock();
que_run_threads(thr);
tm = ut_clock();
printf("Wall time for commit %lu milliseconds\n", tm - oldtm);
/*-------------------------------------*/
count++;
/* btr_validate_tree(tree); */
if (count < 1) {
goto loop;
}
mem_heap_free(heap);
return(0);
}
/********************************************************************
Main test function. */
void
main(void)
/*======*/
{
ulint tm, oldtm;
os_thread_id_t id[10];
ulint n1000[10];
ulint i;
ulint n5000 = 500;
ulint n2;
/* buf_debug_prints = TRUE; */
log_do_write = TRUE;
srv_boot("initfile");
os_aio_init(576, 9, 100);
fil_init(25);
buf_pool_init(POOL_SIZE, POOL_SIZE);
fsp_init();
log_init();
lock_sys_create(1024);
create_files();
init_spaces();
sess_sys_init_at_db_start();
trx_sys_create();
dict_create();
/* os_thread_sleep(500000); */
oldtm = ut_clock();
ut_rnd_set_seed(19);
test1(NULL);
/* for (i = 0; i < 2; i++) {
n1000[i] = i;
id[i] = id[i];
os_thread_create(test10mt, n1000 + i, id + i);
}
*/
i = 4;
n1000[i] = i;
id[i] = id[i];
/* os_thread_create(test10_4, n1000 + i, id + i); */
i = 5;
/* test10mt(&i);
i = 6;
test10mt(&i);
trx_purge();
printf("%lu pages purged\n", purge_sys->n_pages_handled);
dict_table_print_by_name("TS_TABLE1"); */
/* os_thread_create(test_measure_cont, &n3, id + 0); */
/* mem_print_info(); */
/* dict_table_print_by_name("TS_TABLE1"); */
log_flush_up_to(ut_dulint_zero);
os_thread_sleep(500000);
n2 = 10000;
test2_1(&n2);
/* test9A(&n2);
test9(&n2); */
/* test6(&n2); */
/* test2(&n2); */
/* test2_2(&n2); */
/* mem_print_info(); */
for (i = 0; i < 2; i++) {
n1000[i] = 1000 + 10 * i;
id[i] = id[i];
/* os_thread_create(test2mt, n1000 + i, id + i);
os_thread_create(test2_1mt, n1000 + i, id + i);
os_thread_create(test2_2mt, n1000 + i, id + i); */
}
n2 = 2000;
/* test2mt(&n2); */
/* buf_print();
ibuf_print();
rw_lock_list_print_info();
mutex_list_print_info();
dict_table_print_by_name("TS_TABLE1"); */
/* mem_print_info(); */
n2 = 1000;
/* test4_1();
test4_2();
for (i = 0; i < 2; i++) {
n1000[i] = i;
id[i] = id[i];
os_thread_create(test4mt, n1000 + i, id + i);
}
n2 = 4;
test4mt(&n2);
test4mt(&n2);
test4_2();
lock_print_info(); */
/* test7(&n2); */
/* os_thread_sleep(25000000); */
tm = ut_clock();
printf("Wall time for test %lu milliseconds\n", tm - oldtm);
printf("TESTS COMPLETED SUCCESSFULLY!\n");
}
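
The create_files() function in tsbtrins.c above first tries to create each test file and, when that fails because the file already exists, opens it instead. A minimal sketch of the same create-or-open pattern with plain POSIX calls follows. The helper names make_test_file_name and create_or_open are hypothetical and are not part of the Innobase sources; the sketch assumes single-digit space and file numbers, as the test does.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: build "tsfileKI" from a space number k and a file
   number i. Both must be single digits, because each one replaces exactly
   one character of the "00" suffix. */
static void make_test_file_name(char name[9], unsigned k, unsigned i)
{
    assert(k < 10 && i < 10);
    strcpy(name, "tsfile00");
    name[6] = (char)('0' + k);
    name[7] = (char)('0' + i);
}

/* Hypothetical helper: try to create the file exclusively; if it already
   exists, open it instead. Returns the descriptor, or -1 on any other
   error. *created is set to 1 only when the file was newly created. */
static int create_or_open(const char *name, int *created)
{
    int fd = open(name, O_RDWR | O_CREAT | O_EXCL, 0644);

    if (fd >= 0) {
        *created = 1;
        return fd;
    }
    *created = 0;
    if (errno != EEXIST) {
        return -1;
    }
    return open(name, O_RDWR);
}
```

The created flag plays the role of the else branch in create_files(): only a newly created file needs its size set before it is closed and registered.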

3380
innobase/btr/ts/tscli.c Normal file

File diff suppressed because it is too large

4909
innobase/btr/ts/tsrecv.c Normal file

File diff suppressed because it is too large

4909
innobase/btr/ts/tsrecv97.c Normal file

File diff suppressed because it is too large

397
innobase/btr/ts/tss.c Normal file

@ -0,0 +1,397 @@
/************************************************************************
Test for the server
(c) 1996-1997 Innobase Oy
Created 2/16/1996 Heikki Tuuri
*************************************************************************/
#include "os0proc.h"
#include "sync0sync.h"
#include "ut0mem.h"
#include "mem0mem.h"
#include "mem0pool.h"
#include "data0data.h"
#include "data0type.h"
#include "dict0dict.h"
#include "buf0buf.h"
#include "buf0flu.h"
#include "os0file.h"
#include "os0thread.h"
#include "fil0fil.h"
#include "fsp0fsp.h"
#include "rem0rec.h"
#include "rem0cmp.h"
#include "mtr0mtr.h"
#include "log0log.h"
#include "log0recv.h"
#include "page0page.h"
#include "page0cur.h"
#include "trx0trx.h"
#include "dict0boot.h"
#include "trx0sys.h"
#include "dict0crea.h"
#include "btr0btr.h"
#include "btr0pcur.h"
#include "btr0cur.h"
#include "btr0sea.h"
#include "rem0rec.h"
#include "srv0srv.h"
#include "que0que.h"
#include "com0com.h"
#include "usr0sess.h"
#include "lock0lock.h"
#include "trx0roll.h"
#include "trx0purge.h"
#include "row0ins.h"
#include "row0sel.h"
#include "row0upd.h"
#include "row0row.h"
#include "lock0lock.h"
#include "ibuf0ibuf.h"
#include "pars0pars.h"
#include "btr0sea.h"
bool measure_cont = FALSE;
os_file_t files[1000];
mutex_t ios_mutex;
ulint ios;
ulint n[10];
mutex_t incs_mutex;
ulint incs;
byte rnd_buf[67000];
/************************************************************************
Io-handler thread function. */
ulint
handler_thread(
/*===========*/
void* arg)
{
ulint segment;
ulint i;
segment = *((ulint*)arg);
printf("Io handler thread %lu starts\n", segment);
for (i = 0;; i++) {
fil_aio_wait(segment);
mutex_enter(&ios_mutex);
ios++;
mutex_exit(&ios_mutex);
}
return(0);
}
/*************************************************************************
Creates or opens the log files. */
void
create_log_files(void)
/*==================*/
{
bool ret;
ulint i, k;
char name[20];
printf("--------------------------------------------------------\n");
printf("Create or open log files\n");
strcpy(name, "logfile00");
for (k = 0; k < srv_n_log_groups; k++) {
for (i = 0; i < srv_n_log_files; i++) {
/* In "logfile00" the two digits are at offsets 7 and 8 */
name[7] = (char)((ulint)'0' + k);
name[8] = (char)((ulint)'0' + i);
files[i] = os_file_create(name, OS_FILE_CREATE, OS_FILE_AIO,
&ret);
if (ret == FALSE) {
ut_a(os_file_get_last_error() ==
OS_FILE_ALREADY_EXISTS);
files[i] = os_file_create(
name, OS_FILE_OPEN, OS_FILE_AIO, &ret);
ut_a(ret);
} else {
ut_a(os_file_set_size(files[i],
8192 * srv_log_file_size, 0));
}
ret = os_file_close(files[i]);
ut_a(ret);
if (i == 0) {
fil_space_create(name, k + 100, FIL_LOG);
}
ut_a(fil_validate());
fil_node_create(name, srv_log_file_size, k + 100);
}
fil_space_create(name, k + 200, FIL_LOG);
log_group_init(k, srv_n_log_files,
srv_log_file_size * UNIV_PAGE_SIZE,
k + 100, k + 200);
}
}
/*************************************************************************
Creates the files for the file system test and inserts them to the file
system. */
void
create_files(void)
/*==============*/
{
bool ret;
ulint i, k;
char name[20];
os_thread_t thr[10];
os_thread_id_t id[10];
printf("--------------------------------------------------------\n");
printf("Create or open database files\n");
strcpy(name, "tsfile00");
for (k = 0; k < 2 * srv_n_spaces; k += 2) {
for (i = 0; i < srv_n_files; i++) {
name[6] = (char)((ulint)'0' + k);
name[7] = (char)((ulint)'0' + i);
files[i] = os_file_create(name, OS_FILE_CREATE,
OS_FILE_NORMAL, &ret);
if (ret == FALSE) {
ut_a(os_file_get_last_error() ==
OS_FILE_ALREADY_EXISTS);
files[i] = os_file_create(
name, OS_FILE_OPEN, OS_FILE_NORMAL, &ret);
ut_a(ret);
} else {
ut_a(os_file_set_size(files[i],
UNIV_PAGE_SIZE * srv_file_size, 0));
/* Initialize the file contents to a random value */
/*
for (j = 0; j < srv_file_size; j++) {
for (c = 0; c < UNIV_PAGE_SIZE; c++) {
rnd_buf[c] = 0xFF;
}
os_file_write(files[i], rnd_buf,
UNIV_PAGE_SIZE * j, 0,
UNIV_PAGE_SIZE);
}
*/
}
ret = os_file_close(files[i]);
ut_a(ret);
if (i == 0) {
fil_space_create(name, k, FIL_TABLESPACE);
}
ut_a(fil_validate());
fil_node_create(name, srv_file_size, k);
}
}
ios = 0;
mutex_create(&ios_mutex);
mutex_set_level(&ios_mutex, SYNC_NO_ORDER_CHECK);
/* Create i/o-handler threads: */
for (i = 0; i < 9; i++) {
n[i] = i;
thr[i] = os_thread_create(handler_thread, n + i, id + i);
}
}
/************************************************************************
Inits space header of space. */
void
init_spaces(void)
/*=============*/
{
mtr_t mtr;
mtr_start(&mtr);
fsp_header_init(0, srv_file_size * srv_n_files, &mtr);
mtr_commit(&mtr);
}
/*********************************************************************
This thread is used to measure contention of latches. */
ulint
test_measure_cont(
/*==============*/
void* arg)
{
ulint i, j, k;
ulint count[8];
ulint lcount[8];
ulint lscount;
ulint lkcount;
ulint pcount, kcount, scount;
UT_NOT_USED(arg);
printf("Starting contention measurement\n");
for (i = 0; i < 1000; i++) {
for (k = 0; k < 8; k++) {
count[k] = 0;
lcount[k] = 0;
}
pcount = 0;
kcount = 0;
scount = 0;
lscount = 0;
lkcount = 0;
for (j = 0; j < 100; j++) {
if (srv_measure_by_spin) {
ut_delay(ut_rnd_interval(0, 20000));
} else {
os_thread_sleep(20000);
}
if (kernel_mutex.lock_word) {
kcount++;
}
if (lock_kernel_reserved) {
lkcount++;
}
if (buf_pool->mutex.lock_word) {
pcount++;
}
if (btr_search_mutex.lock_word) {
scount++;
}
for (k = 0; k < 8; k++) {
if (btr_search_sys->
hash_index->mutexes[k].lock_word) {
count[k]++;
}
}
for (k = 0; k < 2; k++) {
if (lock_sys->rec_hash->mutexes[k].lock_word) {
lcount[k]++;
}
}
if (kernel_mutex.lock_word
|| lock_sys->rec_hash->mutexes[0].lock_word
|| lock_sys->rec_hash->mutexes[1].lock_word) {
lscount++;
}
}
printf(
"Mutex res. p %lu, k %lu %lu, %lu %lu %lu s %lu, %lu %lu %lu %lu %lu %lu %lu %lu of %lu\n",
pcount, kcount, lkcount, lcount[0], lcount[1], lscount, scount,
count[0], count[1], count[2], count[3],
count[4], count[5], count[6], count[7], j);
sync_print_wait_info();
printf("N log i/os %lu, n non sea %lu, n sea succ %lu\n",
log_sys->n_log_ios, btr_cur_n_non_sea,
btr_search_n_succ);
}
return(0);
}
/********************************************************************
Main test function. */
void
main(void)
/*======*/
{
os_thread_id_t thread_id;
log_do_write = TRUE;
/* yydebug = TRUE; */
srv_boot("srv_init");
os_aio_init(576, 9, 100);
fil_init(25);
buf_pool_init(srv_pool_size, srv_pool_size);
fsp_init();
log_init();
lock_sys_create(srv_lock_table_size);
create_files();
create_log_files();
init_spaces();
sess_sys_init_at_db_start();
trx_sys_create();
dict_create();
log_make_checkpoint_at(ut_dulint_max);
if (srv_measure_contention) {
os_thread_create(&test_measure_cont, NULL, &thread_id);
}
if (!srv_log_archive_on) {
ut_a(DB_SUCCESS == log_archive_noarchivelog());
}
srv_master_thread();
printf("TESTS COMPLETED SUCCESSFULLY!\n");
os_process_exit(0);
}
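
The test_measure_cont() thread in tss.c above estimates latch contention by sampling: it reads each lock word 100 times per round and counts how often the word was non-zero. A minimal sketch of that sampling loop for a single lock word follows. The name count_busy_samples and the pause_fn callback are hypothetical; the callback stands in for the ut_delay() or os_thread_sleep() call that the original makes before each sample.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sampler: reads *lock_word n_samples times and returns how
   often it was non-zero. pause_fn, if not NULL, is called before each
   sample. The read is deliberately unsynchronized, so the result is only
   a statistical estimate of how busy the latch is. */
static unsigned long
count_busy_samples(const volatile unsigned long *lock_word,
                   unsigned long n_samples,
                   void (*pause_fn)(void))
{
    unsigned long busy = 0;
    unsigned long j;

    assert(lock_word != NULL);
    for (j = 0; j < n_samples; j++) {
        if (pause_fn != NULL) {
            pause_fn();
        }
        if (*lock_word) {
            busy++;
        }
    }
    return busy;
}
```

Dividing the returned count by n_samples gives the fraction of samples in which the latch was held, which corresponds to the "x of 100" figures the test prints.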

535
innobase/btr/ts/tssrv.c Normal file

@ -0,0 +1,535 @@
/************************************************************************
Test for the server
(c) 1996-1997 Innobase Oy
Created 2/16/1996 Heikki Tuuri
*************************************************************************/
#include "os0proc.h"
#include "sync0sync.h"
#include "ut0mem.h"
#include "mem0mem.h"
#include "mem0pool.h"
#include "data0data.h"
#include "data0type.h"
#include "dict0dict.h"
#include "buf0buf.h"
#include "buf0flu.h"
#include "os0file.h"
#include "os0thread.h"
#include "fil0fil.h"
#include "fsp0fsp.h"
#include "rem0rec.h"
#include "rem0cmp.h"
#include "mtr0mtr.h"
#include "log0log.h"
#include "log0recv.h"
#include "page0page.h"
#include "page0cur.h"
#include "trx0trx.h"
#include "dict0boot.h"
#include "trx0sys.h"
#include "dict0crea.h"
#include "btr0btr.h"
#include "btr0pcur.h"
#include "btr0cur.h"
#include "btr0sea.h"
#include "rem0rec.h"
#include "srv0srv.h"
#include "que0que.h"
#include "com0com.h"
#include "usr0sess.h"
#include "lock0lock.h"
#include "trx0roll.h"
#include "trx0purge.h"
#include "row0ins.h"
#include "row0sel.h"
#include "row0upd.h"
#include "row0row.h"
#include "lock0lock.h"
#include "ibuf0ibuf.h"
#include "pars0pars.h"
#include "btr0sea.h"
bool measure_cont = FALSE;
os_file_t files[1000];
mutex_t ios_mutex;
ulint ios;
ulint n[10];
mutex_t incs_mutex;
ulint incs;
byte rnd_buf[67000];
ulint glob_var1 = 0;
ulint glob_var2 = 0;
mutex_t mutex2;
mutex_t test_mutex1;
mutex_t test_mutex2;
mutex_t* volatile mutexes;
bool always_false = FALSE;
ulint* test_array;
/************************************************************************
Io-handler thread function. */
ulint
handler_thread(
/*===========*/
void* arg)
{
ulint segment;
ulint i;
segment = *((ulint*)arg);
printf("Io handler thread %lu starts\n", segment);
for (i = 0;; i++) {
fil_aio_wait(segment);
mutex_enter(&ios_mutex);
ios++;
mutex_exit(&ios_mutex);
}
return(0);
}
/*************************************************************************
Creates or opens the log files. */
void
create_log_files(void)
/*==================*/
{
bool ret;
ulint i, k;
char name[20];
printf("--------------------------------------------------------\n");
printf("Create or open log files\n");
strcpy(name, "logfile00");
for (k = 0; k < srv_n_log_groups; k++) {
for (i = 0; i < srv_n_log_files; i++) {
/* In "logfile00" the two digits are at offsets 7 and 8 */
name[7] = (char)((ulint)'0' + k);
name[8] = (char)((ulint)'0' + i);
files[i] = os_file_create(name, OS_FILE_CREATE, OS_FILE_AIO,
&ret);
if (ret == FALSE) {
ut_a(os_file_get_last_error() ==
OS_FILE_ALREADY_EXISTS);
files[i] = os_file_create(
name, OS_FILE_OPEN, OS_FILE_AIO, &ret);
ut_a(ret);
} else {
ut_a(os_file_set_size(files[i],
8192 * srv_log_file_size, 0));
}
ret = os_file_close(files[i]);
ut_a(ret);
if (i == 0) {
fil_space_create(name, k + 100, FIL_LOG);
}
ut_a(fil_validate());
fil_node_create(name, srv_log_file_size, k + 100);
}
fil_space_create(name, k + 200, FIL_LOG);
log_group_init(k, srv_n_log_files,
srv_log_file_size * UNIV_PAGE_SIZE,
k + 100, k + 200);
}
}
/*************************************************************************
Creates the files for the file system test and inserts them to the file
system. */
void
create_files(void)
/*==============*/
{
bool ret;
ulint i, k;
char name[20];
os_thread_t thr[10];
os_thread_id_t id[10];
printf("--------------------------------------------------------\n");
printf("Create or open database files\n");
strcpy(name, "tsfile00");
for (k = 0; k < 2 * srv_n_spaces; k += 2) {
for (i = 0; i < srv_n_files; i++) {
name[6] = (char)((ulint)'0' + k);
name[7] = (char)((ulint)'0' + i);
files[i] = os_file_create(name, OS_FILE_CREATE,
OS_FILE_NORMAL, &ret);
if (ret == FALSE) {
ut_a(os_file_get_last_error() ==
OS_FILE_ALREADY_EXISTS);
files[i] = os_file_create(
name, OS_FILE_OPEN, OS_FILE_NORMAL, &ret);
ut_a(ret);
} else {
ut_a(os_file_set_size(files[i],
UNIV_PAGE_SIZE * srv_file_size, 0));
/* Initialize the file contents to a random value */
/*
for (j = 0; j < srv_file_size; j++) {
for (c = 0; c < UNIV_PAGE_SIZE; c++) {
rnd_buf[c] = 0xFF;
}
os_file_write(files[i], rnd_buf,
UNIV_PAGE_SIZE * j, 0,
UNIV_PAGE_SIZE);
}
*/
}
ret = os_file_close(files[i]);
ut_a(ret);
if (i == 0) {
fil_space_create(name, k, FIL_TABLESPACE);
}
ut_a(fil_validate());
fil_node_create(name, srv_file_size, k);
}
}
ios = 0;
mutex_create(&ios_mutex);
mutex_set_level(&ios_mutex, SYNC_NO_ORDER_CHECK);
/* Create i/o-handler threads: */
for (i = 0; i < 9; i++) {
n[i] = i;
thr[i] = os_thread_create(handler_thread, n + i, id + i);
}
}
/************************************************************************
Inits space header of space. */
void
init_spaces(void)
/*=============*/
{
mtr_t mtr;
mtr_start(&mtr);
fsp_header_init(0, srv_file_size * srv_n_files, &mtr);
mtr_commit(&mtr);
}
/*********************************************************************
This thread is used to measure contention of latches. */
ulint
test_measure_cont(
/*==============*/
void* arg)
{
ulint i, j;
ulint pcount, kcount, s_scount, s_xcount, s_mcount, lcount;
ulint t1count;
ulint t2count;
UT_NOT_USED(arg);
printf("Starting contention measurement\n");
for (i = 0; i < 1000; i++) {
pcount = 0;
kcount = 0;
s_scount = 0;
s_xcount = 0;
s_mcount = 0;
lcount = 0;
t1count = 0;
t2count = 0;
for (j = 0; j < 100; j++) {
if (srv_measure_by_spin) {
ut_delay(ut_rnd_interval(0, 20000));
} else {
os_thread_sleep(20000);
}
if (kernel_mutex.lock_word) {
kcount++;
}
if (buf_pool->mutex.lock_word) {
pcount++;
}
if (log_sys->mutex.lock_word) {
lcount++;
}
if (btr_search_latch.reader_count) {
s_scount++;
}
if (btr_search_latch.writer != RW_LOCK_NOT_LOCKED) {
s_xcount++;
}
if (btr_search_latch.mutex.lock_word) {
s_mcount++;
}
if (test_mutex1.lock_word) {
t1count++;
}
if (test_mutex2.lock_word) {
t2count++;
}
}
printf(
"Mutex res. l %lu, p %lu, k %lu s x %lu s s %lu s mut %lu of %lu\n",
lcount, pcount, kcount, s_xcount, s_scount, s_mcount, j);
sync_print_wait_info();
printf(
"log i/o %lu n non sea %lu n succ %lu n h fail %lu\n",
log_sys->n_log_ios, btr_cur_n_non_sea,
btr_search_n_succ, btr_search_n_hash_fail);
}
return(0);
}
/*********************************************************************
This thread is used to test contention of latches. */
ulint
test_sync(
/*======*/
void* arg)
{
ulint tm, oldtm;
ulint i, j;
ulint sum;
ulint rnd = ut_rnd_gen_ulint();
ulint mut_ind;
byte* ptr;
UT_NOT_USED(arg);
printf("Starting mutex reservation test\n");
oldtm = ut_clock();
sum = 0;
rnd = 87354941;
for (i = 0; i < srv_test_n_loops; i++) {
for (j = 0; j < srv_test_n_free_rnds; j++) {
rnd += 423087123;
sum += test_array[rnd % (256 * srv_test_array_size)];
}
rnd += 43605677;
mut_ind = rnd % srv_test_n_mutexes;
mutex_enter(mutexes + mut_ind);
for (j = 0; j < srv_test_n_reserved_rnds; j++) {
rnd += 423087121;
sum += test_array[rnd % (256 * srv_test_array_size)];
}
mutex_exit(mutexes + mut_ind);
if (srv_test_cache_evict) {
ptr = (byte*)(mutexes + mut_ind);
for (j = 0; j < 4; j++) {
ptr += 256 * 1024;
sum += *((ulint*)ptr);
}
}
}
if (always_false) {
printf("%lu", sum);
}
tm = ut_clock();
printf("Wall time for res. test %lu milliseconds\n", tm - oldtm);
return(0);
}
/********************************************************************
Main test function. */
void
main(void)
/*======*/
{
os_thread_id_t thread_ids[1000];
ulint tm, oldtm;
ulint rnd;
ulint i, sum;
byte* ptr;
/* mutex_t mutex; */
log_do_write = TRUE;
/* yydebug = TRUE; */
srv_boot("srv_init");
os_aio_init(576, 9, 100);
fil_init(25);
buf_pool_init(srv_pool_size, srv_pool_size);
fsp_init();
log_init();
lock_sys_create(srv_lock_table_size);
create_files();
create_log_files();
init_spaces();
sess_sys_init_at_db_start();
trx_sys_create();
dict_create();
log_make_checkpoint_at(ut_dulint_max);
printf("Hotspot semaphore addresses k %p, p %p, l %p, s %p\n",
(void*)&kernel_mutex, (void*)&(buf_pool->mutex),
(void*)&(log_sys->mutex), (void*)&btr_search_latch);
if (srv_measure_contention) {
os_thread_create(&test_measure_cont, NULL, thread_ids + 999);
}
if (!srv_log_archive_on) {
ut_a(DB_SUCCESS == log_archive_noarchivelog());
}
/*
mutex_create(&mutex);
oldtm = ut_clock();
for (i = 0; i < 2000000; i++) {
mutex_enter(&mutex);
mutex_exit(&mutex);
}
tm = ut_clock();
printf("Wall clock time for %lu mutex enter %lu milliseconds\n",
i, tm - oldtm);
*/
if (srv_test_sync) {
if (srv_test_nocache) {
mutexes = os_mem_alloc_nocache(srv_test_n_mutexes
* sizeof(mutex_t));
} else {
mutexes = mem_alloc(srv_test_n_mutexes
* sizeof(mutex_t));
}
sum = 0;
rnd = 492314896;
oldtm = ut_clock();
for (i = 0; i < 4000000; i++) {
rnd += 85967944;
ptr = ((byte*)(mutexes)) + (rnd % (srv_test_n_mutexes
* sizeof(mutex_t)));
sum += *((ulint*)ptr);
}
tm = ut_clock();
printf(
"Wall clock time for %lu random access %lu milliseconds\n",
i, tm - oldtm);
if (always_false) {
printf("%lu", sum);
}
test_array = mem_alloc(4 * 256 * srv_test_array_size);
for (i = 0; i < srv_test_n_mutexes; i++) {
mutex_create(mutexes + i);
}
for (i = 0; i < srv_test_n_threads; i++) {
os_thread_create(&test_sync, NULL, thread_ids + i);
}
}
srv_master_thread(NULL);
printf("TESTS COMPLETED SUCCESSFULLY!\n");
os_process_exit(0);
}

24
innobase/buf/Makefile.am Normal file

@ -0,0 +1,24 @@
# Copyright (C) 2000 MySQL AB & MySQL Finland AB & TCX DataKonsult AB
# & Innobase Oy
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
include ../include/Makefile.i
libs_LIBRARIES = libbuf.a
libbuf_a_SOURCES = buf0buf.c buf0flu.c buf0lru.c buf0rea.c
EXTRA_PROGRAMS =

1568
innobase/buf/buf0buf.c Normal file

File diff suppressed because it is too large

702
innobase/buf/buf0flu.c Normal file

@ -0,0 +1,702 @@
/******************************************************
The database buffer buf_pool flush algorithm
(c) 1995 Innobase Oy
Created 11/11/1995 Heikki Tuuri
*******************************************************/
#include "buf0flu.h"
#ifdef UNIV_NONINL
#include "buf0flu.ic"
#endif
#include "ut0byte.h"
#include "ut0lst.h"
#include "fil0fil.h"
#include "buf0buf.h"
#include "buf0lru.h"
#include "buf0rea.h"
#include "ibuf0ibuf.h"
#include "log0log.h"
#include "os0file.h"
/* When a page is flushed, dirty blocks are searched for in a neighborhood of
this size and flushed along with the original page. */
#define BUF_FLUSH_AREA ut_min(BUF_READ_AHEAD_AREA,\
buf_pool->curr_size / 16)
/**********************************************************************
Validates the flush list. */
static
ibool
buf_flush_validate_low(void);
/*========================*/
/* out: TRUE if ok */
/************************************************************************
Inserts a modified block into the flush list. */
void
buf_flush_insert_into_flush_list(
/*=============================*/
buf_block_t* block) /* in: block which is modified */
{
ut_ad(mutex_own(&(buf_pool->mutex)));
ut_ad((UT_LIST_GET_FIRST(buf_pool->flush_list) == NULL)
|| (ut_dulint_cmp(
(UT_LIST_GET_FIRST(buf_pool->flush_list))
->oldest_modification,
block->oldest_modification) <= 0));
UT_LIST_ADD_FIRST(flush_list, buf_pool->flush_list, block);
ut_ad(buf_flush_validate_low());
}
/************************************************************************
Inserts a modified block into the flush list in the right sorted position.
This function is used by recovery, because there the modifications do not
necessarily come in the order of lsn's. */
void
buf_flush_insert_sorted_into_flush_list(
/*====================================*/
buf_block_t* block) /* in: block which is modified */
{
buf_block_t* prev_b;
buf_block_t* b;
ut_ad(mutex_own(&(buf_pool->mutex)));
prev_b = NULL;
b = UT_LIST_GET_FIRST(buf_pool->flush_list);
while (b && (ut_dulint_cmp(b->oldest_modification,
block->oldest_modification) > 0)) {
prev_b = b;
b = UT_LIST_GET_NEXT(flush_list, b);
}
if (prev_b == NULL) {
UT_LIST_ADD_FIRST(flush_list, buf_pool->flush_list, block);
} else {
UT_LIST_INSERT_AFTER(flush_list, buf_pool->flush_list, prev_b,
block);
}
ut_ad(buf_flush_validate_low());
}
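/* Illustration only, not part of the original file: a minimal sketch of the
sorted insertion above on a plain doubly linked list. The types demo_block_t
and demo_list_t and the function demo_insert_sorted are hypothetical
stand-ins for buf_block_t, the flush list base node and the UT_LIST macros,
and a plain unsigned long is assumed in place of a dulint lsn. */

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for buf_block_t: only the fields the sketch needs */
typedef struct demo_block {
	unsigned long		oldest_modification;	/* integer in place of a dulint lsn */
	struct demo_block*	prev;
	struct demo_block*	next;
} demo_block_t;

/* Hypothetical stand-in for the flush list base node */
typedef struct {
	demo_block_t*	first;
	demo_block_t*	last;
} demo_list_t;

/* Inserts block so that oldest_modification stays in descending order from
the first node to the last, walking from the start of the list */
static void
demo_insert_sorted(demo_list_t* list, demo_block_t* block)
{
	demo_block_t*	prev_b = NULL;
	demo_block_t*	b = list->first;

	assert(block->oldest_modification > 0);

	/* Skip every node that is strictly newer than the new block */
	while (b != NULL
	       && b->oldest_modification > block->oldest_modification) {
		prev_b = b;
		b = b->next;
	}

	block->prev = prev_b;
	block->next = b;

	if (prev_b == NULL) {
		list->first = block;
	} else {
		prev_b->next = block;
	}

	if (b == NULL) {
		list->last = block;
	} else {
		b->prev = block;
	}
}
```

/* Inserting blocks with modification values 30, 10 and 20 in that order
into an empty list should leave them linked as 30, 20, 10. */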
/************************************************************************
Returns TRUE if the file page block is immediately suitable for replacement,
i.e., the transition FILE_PAGE => NOT_USED is allowed. */
ibool
buf_flush_ready_for_replace(
/*========================*/
/* out: TRUE if can replace immediately */
buf_block_t* block) /* in: buffer control block, must be in state
BUF_BLOCK_FILE_PAGE and in the LRU list*/
{
ut_ad(mutex_own(&(buf_pool->mutex)));
ut_ad(block->state == BUF_BLOCK_FILE_PAGE);
if ((ut_dulint_cmp(block->oldest_modification, ut_dulint_zero) > 0)
|| (block->buf_fix_count != 0)
|| (block->io_fix != 0)) {
return(FALSE);
}
return(TRUE);
}
/************************************************************************
Returns TRUE if the block is modified and ready for flushing. */
UNIV_INLINE
ibool
buf_flush_ready_for_flush(
/*======================*/
/* out: TRUE if can flush immediately */
buf_block_t* block, /* in: buffer control block, must be in state
BUF_BLOCK_FILE_PAGE */
ulint flush_type)/* in: BUF_FLUSH_LRU, BUF_FLUSH_LIST, or
BUF_FLUSH_SINGLE_PAGE */
{
ut_ad(mutex_own(&(buf_pool->mutex)));
ut_ad(block->state == BUF_BLOCK_FILE_PAGE);
if ((ut_dulint_cmp(block->oldest_modification, ut_dulint_zero) > 0)
&& (block->io_fix == 0)) {
if (flush_type != BUF_FLUSH_LRU) {
return(TRUE);
} else if ((block->old || (UT_LIST_GET_LEN(buf_pool->LRU)
< BUF_LRU_OLD_MIN_LEN))
&& (block->buf_fix_count == 0)) {
/* If we are flushing the LRU list, to avoid deadlocks
we require the block not to be bufferfixed, and hence
not latched. Since LRU flushed blocks are soon moved
to the free list, it is good to flush only old blocks
from the end of the LRU list. */
return(TRUE);
}
}
return(FALSE);
}
/************************************************************************
Updates the flush system data structures when a write is completed. */
void
buf_flush_write_complete(
/*=====================*/
buf_block_t* block) /* in: pointer to the block in question */
{
ut_ad(block);
ut_ad(mutex_own(&(buf_pool->mutex)));
block->oldest_modification = ut_dulint_zero;
UT_LIST_REMOVE(flush_list, buf_pool->flush_list, block);
ut_d(UT_LIST_VALIDATE(flush_list, buf_block_t, buf_pool->flush_list));
(buf_pool->n_flush[block->flush_type])--;
if (block->flush_type == BUF_FLUSH_LRU) {
/* Put the block to the end of the LRU list to wait to be
moved to the free list */
buf_LRU_make_block_old(block);
buf_pool->LRU_flush_ended++;
}
/* printf("n pending flush %lu\n",
buf_pool->n_flush[block->flush_type]); */
if ((buf_pool->n_flush[block->flush_type] == 0)
&& (buf_pool->init_flush[block->flush_type] == FALSE)) {
/* The running flush batch has ended */
os_event_set(buf_pool->no_flush[block->flush_type]);
}
}
/************************************************************************
Does an asynchronous write of a buffer page. NOTE: in simulated aio we must
call os_aio_simulated_wake_handler_threads after we have posted a batch
of writes! */
static
void
buf_flush_write_block_low(
/*======================*/
buf_block_t* block) /* in: buffer block to write */
{
#ifdef UNIV_IBUF_DEBUG
ut_a(ibuf_count_get(block->space, block->offset) == 0);
#endif
ut_ad(!ut_dulint_is_zero(block->newest_modification));
#ifdef UNIV_LOG_DEBUG
printf(
"Warning: cannot force log to disk in the log debug version!\n");
#else
/* Force the log to the disk before writing the modified block */
log_flush_up_to(block->newest_modification, LOG_WAIT_ALL_GROUPS);
#endif
/* Write the newest modification lsn to the page */
mach_write_to_8(block->frame + FIL_PAGE_LSN,
block->newest_modification);
mach_write_to_8(block->frame + UNIV_PAGE_SIZE - FIL_PAGE_END_LSN,
block->newest_modification);
fil_io(OS_FILE_WRITE | OS_AIO_SIMULATED_WAKE_LATER,
FALSE, block->space, block->offset, 0, UNIV_PAGE_SIZE,
(void*)block->frame, (void*)block);
}
/************************************************************************
Writes a page asynchronously from the buffer buf_pool to a file, if it can be
found in the buf_pool and it is in a flushable state. NOTE: in simulated aio
we must call os_aio_simulated_wake_handler_threads after we have posted a batch
of writes! */
static
ulint
buf_flush_try_page(
/*===============*/
/* out: 1 if a page was flushed, 0 otherwise */
ulint space, /* in: space id */
ulint offset, /* in: page offset */
ulint flush_type) /* in: BUF_FLUSH_LRU, BUF_FLUSH_LIST, or
BUF_FLUSH_SINGLE_PAGE */
{
buf_block_t* block;
ibool locked;
ut_ad((flush_type == BUF_FLUSH_LRU) || (flush_type == BUF_FLUSH_LIST)
|| (flush_type == BUF_FLUSH_SINGLE_PAGE));
mutex_enter(&(buf_pool->mutex));
block = buf_page_hash_get(space, offset);
if ((flush_type == BUF_FLUSH_LIST)
&& block && buf_flush_ready_for_flush(block, flush_type)) {
block->io_fix = BUF_IO_WRITE;
block->flush_type = flush_type;
if (buf_pool->n_flush[block->flush_type] == 0) {
os_event_reset(buf_pool->no_flush[block->flush_type]);
}
(buf_pool->n_flush[flush_type])++;
locked = FALSE;
/* If the simulated aio thread is not running, we must
not wait for any latch, as we may end up in a deadlock:
if buf_fix_count == 0, then we know we need not wait */
if (block->buf_fix_count == 0) {
rw_lock_s_lock_gen(&(block->lock), BUF_IO_WRITE);
locked = TRUE;
}
mutex_exit(&(buf_pool->mutex));
if (!locked) {
os_aio_simulated_wake_handler_threads();
rw_lock_s_lock_gen(&(block->lock), BUF_IO_WRITE);
}
if (buf_debug_prints) {
printf("Flushing page space %lu, page no %lu \n",
block->space, block->offset);
}
buf_flush_write_block_low(block);
return(1);
} else if ((flush_type == BUF_FLUSH_LRU) && block
&& buf_flush_ready_for_flush(block, flush_type)) {
/* VERY IMPORTANT:
Because any thread may call the LRU flush, even when owning
locks on pages, to avoid deadlocks, we must make sure that the
s-lock is acquired on the page without waiting: this is
accomplished because in the if-condition above we require
the page not to be bufferfixed (in function
..._ready_for_flush). */
block->io_fix = BUF_IO_WRITE;
block->flush_type = flush_type;
if (buf_pool->n_flush[flush_type] == 0) {
os_event_reset(buf_pool->no_flush[flush_type]);
}
(buf_pool->n_flush[flush_type])++;
rw_lock_s_lock_gen(&(block->lock), BUF_IO_WRITE);
/* Note that the s-latch is acquired before releasing the
buf_pool mutex: this ensures that the latch is acquired
immediately. */
mutex_exit(&(buf_pool->mutex));
buf_flush_write_block_low(block);
return(1);
} else if ((flush_type == BUF_FLUSH_SINGLE_PAGE) && block
&& buf_flush_ready_for_flush(block, flush_type)) {
block->io_fix = BUF_IO_WRITE;
block->flush_type = flush_type;
if (buf_pool->n_flush[block->flush_type] == 0) {
os_event_reset(buf_pool->no_flush[block->flush_type]);
}
(buf_pool->n_flush[flush_type])++;
mutex_exit(&(buf_pool->mutex));
rw_lock_s_lock_gen(&(block->lock), BUF_IO_WRITE);
if (buf_debug_prints) {
printf("Flushing single page space %lu, page no %lu \n",
block->space, block->offset);
}
buf_flush_write_block_low(block);
return(1);
} else {
mutex_exit(&(buf_pool->mutex));
return(0);
}
}
/***************************************************************
Flushes to disk all flushable pages within the flush area. */
static
ulint
buf_flush_try_neighbors(
/*====================*/
/* out: number of pages flushed */
ulint space, /* in: space id */
ulint offset, /* in: page offset */
ulint flush_type) /* in: BUF_FLUSH_LRU or BUF_FLUSH_LIST */
{
buf_block_t* block;
ulint low, high;
ulint count = 0;
ulint i;
ut_ad(flush_type == BUF_FLUSH_LRU || flush_type == BUF_FLUSH_LIST);
low = (offset / BUF_FLUSH_AREA) * BUF_FLUSH_AREA;
high = (offset / BUF_FLUSH_AREA + 1) * BUF_FLUSH_AREA;
if (UT_LIST_GET_LEN(buf_pool->LRU) < BUF_LRU_OLD_MIN_LEN) {
/* If there is little space, it is better not to flush any
block except from the end of the LRU list */
low = offset;
high = offset + 1;
}
/* printf("Flush area: low %lu high %lu\n", low, high); */
if (high > fil_space_get_size(space)) {
high = fil_space_get_size(space);
}
mutex_enter(&(buf_pool->mutex));
for (i = low; i < high; i++) {
block = buf_page_hash_get(space, i);
if (block && buf_flush_ready_for_flush(block, flush_type)) {
mutex_exit(&(buf_pool->mutex));
/* Note: as we release the buf_pool mutex above, in
buf_flush_try_page we cannot be sure the page is still
in a flushable state: therefore we check it again
inside that function. */
count += buf_flush_try_page(space, i, flush_type);
mutex_enter(&(buf_pool->mutex));
}
}
mutex_exit(&(buf_pool->mutex));
/* In simulated aio we wake up the i/o-handler threads now that
we have posted a batch of writes: */
os_aio_simulated_wake_handler_threads();
return(count);
}
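/* Illustration only, not part of the original file: a minimal sketch of how
the flush window above is derived from a page offset. The function
demo_flush_window is hypothetical; its parameters area and space_size are
assumed to stand in for BUF_FLUSH_AREA and fil_space_get_size(space). */

```c
#include <assert.h>

/* Computes the half-open page range [*low, *high) of the aligned flush area
that contains offset, clamped to the size of the space */
static void
demo_flush_window(
	unsigned long	offset,		/* in: page offset */
	unsigned long	area,		/* in: flush area size in pages */
	unsigned long	space_size,	/* in: space size in pages */
	unsigned long*	low,		/* out: first page of the window */
	unsigned long*	high)		/* out: one past the last page */
{
	assert(area > 0);

	/* Integer division rounds the offset down to an area boundary */
	*low = (offset / area) * area;
	*high = *low + area;

	if (*high > space_size) {
		*high = space_size;
	}
}
```

/* With an area of 64 pages, offset 70 falls into the window [64, 128); in a
space of only 100 pages the upper bound is clamped to 100. */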
/***********************************************************************
This utility flushes dirty blocks from the end of the LRU list or flush_list.
NOTE 1: in the case of an LRU flush the calling thread may own latches to
pages: to avoid deadlocks, this function must be written so that it cannot
end up waiting for these latches! NOTE 2: in the case of a flush list flush,
the calling thread is not allowed to own any latches on pages! */
ulint
buf_flush_batch(
/*============*/
/* out: number of blocks for which the write
request was queued; ULINT_UNDEFINED if there
was a flush of the same type already running */
ulint flush_type, /* in: BUF_FLUSH_LRU or BUF_FLUSH_LIST; if
BUF_FLUSH_LIST, then the caller must not own
any latches on pages */
ulint min_n, /* in: wished minimum number of blocks flushed
(it is not guaranteed that the actual number
is that big, though) */
dulint lsn_limit) /* in: in the case of BUF_FLUSH_LIST, all blocks
whose oldest_modification is smaller than this
should be flushed (if their number does not
exceed min_n), otherwise ignored */
{
buf_block_t* block;
ulint page_count = 0;
ulint old_page_count;
ulint space;
ulint offset;
ibool found;
ut_ad((flush_type == BUF_FLUSH_LRU) || (flush_type == BUF_FLUSH_LIST));
ut_ad((flush_type != BUF_FLUSH_LIST) ||
sync_thread_levels_empty_gen(TRUE));
mutex_enter(&(buf_pool->mutex));
if ((buf_pool->n_flush[flush_type] > 0)
|| (buf_pool->init_flush[flush_type] == TRUE)) {
/* There is already a flush batch of the same type running */
mutex_exit(&(buf_pool->mutex));
return(ULINT_UNDEFINED);
}
(buf_pool->init_flush)[flush_type] = TRUE;
for (;;) {
/* If we have flushed enough, leave the loop */
if (page_count >= min_n) {
break;
}
/* Start from the end of the list looking for a suitable
block to be flushed. */
if (flush_type == BUF_FLUSH_LRU) {
block = UT_LIST_GET_LAST(buf_pool->LRU);
} else {
ut_ad(flush_type == BUF_FLUSH_LIST);
block = UT_LIST_GET_LAST(buf_pool->flush_list);
if (!block
|| (ut_dulint_cmp(block->oldest_modification,
lsn_limit) >= 0)) {
/* We have flushed enough */
break;
}
}
found = FALSE;
/* Note that after finding a single flushable page, we try to
flush also all its neighbors, and after that start from the
END of the LRU list or flush list again: the list may change
during the flushing and we cannot safely preserve within this
function a pointer to a block in the list! */
while ((block != NULL) && !found) {
if (buf_flush_ready_for_flush(block, flush_type)) {
found = TRUE;
space = block->space;
offset = block->offset;
mutex_exit(&(buf_pool->mutex));
old_page_count = page_count;
/* Try to flush also all the neighbors */
page_count +=
buf_flush_try_neighbors(space, offset,
flush_type);
/* printf(
"Flush type %lu, page no %lu, neighb %lu\n",
flush_type, offset,
page_count - old_page_count); */
mutex_enter(&(buf_pool->mutex));
} else if (flush_type == BUF_FLUSH_LRU) {
block = UT_LIST_GET_PREV(LRU, block);
} else {
ut_ad(flush_type == BUF_FLUSH_LIST);
block = UT_LIST_GET_PREV(flush_list, block);
}
}
/* If we could not find anything to flush, leave the loop */
if (!found) {
break;
}
}
(buf_pool->init_flush)[flush_type] = FALSE;
if ((buf_pool->n_flush[flush_type] == 0)
&& (buf_pool->init_flush[flush_type] == FALSE)) {
/* The running flush batch has ended */
os_event_set(buf_pool->no_flush[flush_type]);
}
mutex_exit(&(buf_pool->mutex));
if (buf_debug_prints && (page_count > 0)) {
if (flush_type == BUF_FLUSH_LRU) {
printf("To flush %lu pages in LRU flush\n",
page_count);
} else if (flush_type == BUF_FLUSH_LIST) {
printf("To flush %lu pages in flush list flush\n",
page_count);
} else {
ut_error;
}
}
return(page_count);
}
/**********************************************************************
Waits until a flush batch of the given type ends. */
void
buf_flush_wait_batch_end(
/*=====================*/
ulint type) /* in: BUF_FLUSH_LRU or BUF_FLUSH_LIST */
{
ut_ad((type == BUF_FLUSH_LRU) || (type == BUF_FLUSH_LIST));
os_event_wait(buf_pool->no_flush[type]);
}
/**********************************************************************
Gives a recommendation of how many blocks should be flushed to establish
a big enough margin of replaceable blocks near the end of the LRU list
and in the free list. */
static
ulint
buf_flush_LRU_recommendation(void)
/*==============================*/
/* out: number of blocks which should be flushed
from the end of the LRU list */
{
buf_block_t* block;
ulint n_replaceable;
ulint distance = 0;
mutex_enter(&(buf_pool->mutex));
n_replaceable = UT_LIST_GET_LEN(buf_pool->free);
block = UT_LIST_GET_LAST(buf_pool->LRU);
while ((block != NULL)
&& (n_replaceable < BUF_FLUSH_FREE_BLOCK_MARGIN
+ BUF_FLUSH_EXTRA_MARGIN)
&& (distance < BUF_LRU_FREE_SEARCH_LEN)) {
if (buf_flush_ready_for_replace(block)) {
n_replaceable++;
}
distance++;
block = UT_LIST_GET_PREV(LRU, block);
}
mutex_exit(&(buf_pool->mutex));
if (n_replaceable >= BUF_FLUSH_FREE_BLOCK_MARGIN) {
return(0);
}
return(BUF_FLUSH_FREE_BLOCK_MARGIN + BUF_FLUSH_EXTRA_MARGIN
- n_replaceable);
}
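/* Illustration only, not part of the original file: a minimal sketch of the
recommendation arithmetic above. The function demo_lru_recommendation is
hypothetical; the array replaceable is assumed to hold one flag per block,
starting from the end of the LRU list, and free_margin, extra_margin and
search_len are assumed to stand in for BUF_FLUSH_FREE_BLOCK_MARGIN,
BUF_FLUSH_EXTRA_MARGIN and BUF_LRU_FREE_SEARCH_LEN. */

```c
#include <assert.h>
#include <stddef.h>

static unsigned long
demo_lru_recommendation(
	unsigned long	n_free,		/* in: length of the free list */
	const int*	replaceable,	/* in: flags, LRU tail first */
	unsigned long	n_blocks,	/* in: number of flags */
	unsigned long	free_margin,
	unsigned long	extra_margin,
	unsigned long	search_len)
{
	unsigned long	n_replaceable = n_free;
	unsigned long	distance = 0;

	assert(replaceable != NULL || n_blocks == 0);

	/* Count replaceable blocks near the LRU tail until both margins
	are covered or the search length is exhausted */
	while (distance < n_blocks
	       && n_replaceable < free_margin + extra_margin
	       && distance < search_len) {
		if (replaceable[distance]) {
			n_replaceable++;
		}

		distance++;
	}

	if (n_replaceable >= free_margin) {
		return(0);
	}

	return(free_margin + extra_margin - n_replaceable);
}
```

/* The extra margin adds hysteresis: no flush is recommended while the free
margin is met, but once it is not, enough blocks are requested to restore
both margins. */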
/*************************************************************************
Flushes pages from the end of the LRU list if there is too small a margin
of replaceable pages there or in the free list. VERY IMPORTANT: this function
is called also by threads which have locks on pages. To avoid deadlocks, we
flush only pages such that the s-lock required for flushing can be acquired
immediately, without waiting. */
void
buf_flush_free_margin(void)
/*=======================*/
{
ulint n_to_flush;
n_to_flush = buf_flush_LRU_recommendation();
if (n_to_flush > 0) {
buf_flush_batch(BUF_FLUSH_LRU, n_to_flush, ut_dulint_zero);
}
}
/**********************************************************************
Validates the flush list. */
static
ibool
buf_flush_validate_low(void)
/*========================*/
/* out: TRUE if ok */
{
buf_block_t* block;
dulint om;
UT_LIST_VALIDATE(flush_list, buf_block_t, buf_pool->flush_list);
block = UT_LIST_GET_FIRST(buf_pool->flush_list);
while (block != NULL) {
om = block->oldest_modification;
ut_a(block->state == BUF_BLOCK_FILE_PAGE);
ut_a(ut_dulint_cmp(om, ut_dulint_zero) > 0);
block = UT_LIST_GET_NEXT(flush_list, block);
if (block) {
ut_a(ut_dulint_cmp(om, block->oldest_modification)
>= 0);
}
}
return(TRUE);
}
/**********************************************************************
Validates the flush list. */
ibool
buf_flush_validate(void)
/*====================*/
/* out: TRUE if ok */
{
ibool ret;
mutex_enter(&(buf_pool->mutex));
ret = buf_flush_validate_low();
mutex_exit(&(buf_pool->mutex));
return(ret);
}

734
innobase/buf/buf0lru.c Normal file

@ -0,0 +1,734 @@
/******************************************************
The database buffer replacement algorithm
(c) 1995 Innobase Oy
Created 11/5/1995 Heikki Tuuri
*******************************************************/
#include "buf0lru.h"
#ifdef UNIV_NONINL
#include "buf0lru.ic"
#endif
#include "ut0byte.h"
#include "ut0lst.h"
#include "ut0rnd.h"
#include "sync0sync.h"
#include "sync0rw.h"
#include "hash0hash.h"
#include "os0sync.h"
#include "fil0fil.h"
#include "btr0btr.h"
#include "buf0buf.h"
#include "buf0flu.h"
#include "buf0rea.h"
#include "btr0sea.h"
#include "os0file.h"
/* The number of blocks from the LRU_old pointer onward, including the block
pointed to, must be 3/8 of the whole LRU list length, except that the
tolerance defined below is allowed. Note that the tolerance must be small
enough such that for even the BUF_LRU_OLD_MIN_LEN long LRU list, the
LRU_old pointer is not allowed to point to either end of the LRU list. */
#define BUF_LRU_OLD_TOLERANCE 20
/* The whole LRU list length is divided by this number to determine an
initial segment in buf_LRU_get_recent_limit */
#define BUF_LRU_INITIAL_RATIO 8
/**********************************************************************
Takes a block out of the LRU list and page hash table and sets the block
state to BUF_BLOCK_REMOVE_HASH. */
static
void
buf_LRU_block_remove_hashed_page(
/*=============================*/
buf_block_t* block); /* in: block, must contain a file page and
be in a state where it can be freed; there
may or may not be a hash index to the page */
/**********************************************************************
Puts a file page which has no hash index to the free list. */
static
void
buf_LRU_block_free_hashed_page(
/*===========================*/
buf_block_t* block); /* in: block, must contain a file page and
be in a state where it can be freed */
/**********************************************************************
Gets the minimum LRU_position field for the blocks in an initial segment
(determined by BUF_LRU_INITIAL_RATIO) of the LRU list. The limit is not
guaranteed to be precise, because the ulint_clock may wrap around. */
ulint
buf_LRU_get_recent_limit(void)
/*==========================*/
/* out: the limit; zero if could not determine it */
{
buf_block_t* block;
ulint len;
ulint limit;
mutex_enter(&(buf_pool->mutex));
len = UT_LIST_GET_LEN(buf_pool->LRU);
if (len < BUF_LRU_OLD_MIN_LEN) {
/* The LRU list is too short to do read-ahead */
mutex_exit(&(buf_pool->mutex));
return(0);
}
block = UT_LIST_GET_FIRST(buf_pool->LRU);
limit = block->LRU_position - len / BUF_LRU_INITIAL_RATIO;
mutex_exit(&(buf_pool->mutex));
return(limit);
}
/**********************************************************************
Look for a replaceable block from the end of the LRU list and put it to
the free list if found. */
ibool
buf_LRU_search_and_free_block(
/*==========================*/
/* out: TRUE if freed */
ulint n_iterations) /* in: how many times this has been called
repeatedly without result: a high value
means that we should search farther */
{
buf_block_t* block;
ulint distance;
ibool freed;
ulint i;
mutex_enter(&(buf_pool->mutex));
freed = FALSE;
distance = BUF_LRU_FREE_SEARCH_LEN * (1 + n_iterations / 5);
i = 0;
block = UT_LIST_GET_LAST(buf_pool->LRU);
while (i < distance && block != NULL) {
if (buf_flush_ready_for_replace(block)) {
if (buf_debug_prints) {
printf(
"Putting space %lu page %lu to free list\n",
block->space, block->offset);
}
buf_LRU_block_remove_hashed_page(block);
mutex_exit(&(buf_pool->mutex));
btr_search_drop_page_hash_index(block->frame);
mutex_enter(&(buf_pool->mutex));
buf_LRU_block_free_hashed_page(block);
freed = TRUE;
break;
}
block = UT_LIST_GET_PREV(LRU, block);
/* Count the step so that the search stops after distance blocks */
i++;
}
if (buf_pool->LRU_flush_ended > 0) {
buf_pool->LRU_flush_ended--;
}
if (!freed) {
buf_pool->LRU_flush_ended = 0;
}
mutex_exit(&(buf_pool->mutex));
return(freed);
}
/**********************************************************************
Tries to remove LRU flushed blocks from the end of the LRU list and put them
to the free list. This is beneficial for the efficiency of the insert buffer
operation, as flushed pages from non-unique non-clustered indexes are here
taken out of the buffer pool, and their inserts redirected to the insert
buffer. Otherwise, the flushed blocks could get modified again before read
operations need new buffer blocks, and the i/o work done in flushing would be
wasted. */
void
buf_LRU_try_free_flushed_blocks(void)
/*=================================*/
{
mutex_enter(&(buf_pool->mutex));
while (buf_pool->LRU_flush_ended > 0) {
mutex_exit(&(buf_pool->mutex));
buf_LRU_search_and_free_block(0);
mutex_enter(&(buf_pool->mutex));
}
mutex_exit(&(buf_pool->mutex));
}
/**********************************************************************
Returns a free block from buf_pool. The block is taken off the free list.
If it is empty, blocks are moved from the end of the LRU list to the free
list. */
buf_block_t*
buf_LRU_get_free_block(void)
/*========================*/
/* out: the free control block */
{
buf_block_t* block = NULL;
ibool freed;
ulint n_iterations = 0;
loop:
mutex_enter(&(buf_pool->mutex));
if (buf_pool->LRU_flush_ended > 0) {
mutex_exit(&(buf_pool->mutex));
buf_LRU_try_free_flushed_blocks();
mutex_enter(&(buf_pool->mutex));
}
/* If there is a block in the free list, take it */
if (UT_LIST_GET_LEN(buf_pool->free) > 0) {
block = UT_LIST_GET_FIRST(buf_pool->free);
UT_LIST_REMOVE(free, buf_pool->free, block);
block->state = BUF_BLOCK_READY_FOR_USE;
mutex_exit(&(buf_pool->mutex));
return(block);
}
/* If no block was in the free list, search from the end of the LRU
list and try to free a block there */
mutex_exit(&(buf_pool->mutex));
freed = buf_LRU_search_and_free_block(n_iterations);
if (freed) {
goto loop;
}
/* No free block was found near the end of the list: try to flush
the LRU list */
buf_flush_free_margin();
os_event_wait(buf_pool->no_flush[BUF_FLUSH_LRU]);
n_iterations++;
os_aio_simulated_wake_handler_threads();
if (n_iterations > 10) {
os_thread_sleep(500000);
}
if (n_iterations > 20) {
/* buf_print();
os_aio_print();
rw_lock_list_print_info();
*/
if (n_iterations > 30) {
fprintf(stderr,
"Innobase: Warning: difficult to find free blocks from\n"
"Innobase: the buffer pool! Consider increasing the\n"
"Innobase: buffer pool size.\n");
}
}
goto loop;
}
/***********************************************************************
Moves the LRU_old pointer so that the length of the old blocks list
is inside the allowed limits. */
UNIV_INLINE
void
buf_LRU_old_adjust_len(void)
/*========================*/
{
ulint old_len;
ulint new_len;
ut_ad(buf_pool->LRU_old);
ut_ad(mutex_own(&(buf_pool->mutex)));
ut_ad(3 * (BUF_LRU_OLD_MIN_LEN / 8) > BUF_LRU_OLD_TOLERANCE + 5);
for (;;) {
old_len = buf_pool->LRU_old_len;
new_len = 3 * (UT_LIST_GET_LEN(buf_pool->LRU) / 8);
/* Update the LRU_old pointer if necessary */
if (old_len < new_len - BUF_LRU_OLD_TOLERANCE) {
buf_pool->LRU_old = UT_LIST_GET_PREV(LRU,
buf_pool->LRU_old);
(buf_pool->LRU_old)->old = TRUE;
buf_pool->LRU_old_len++;
} else if (old_len > new_len + BUF_LRU_OLD_TOLERANCE) {
(buf_pool->LRU_old)->old = FALSE;
buf_pool->LRU_old = UT_LIST_GET_NEXT(LRU,
buf_pool->LRU_old);
buf_pool->LRU_old_len--;
} else {
ut_ad(buf_pool->LRU_old); /* Check that we did not
fall out of the LRU list */
return;
}
}
}
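/* Illustration only, not part of the original file: a minimal sketch of the
adjustment loop above, reduced to the length arithmetic. The function
demo_adjust_old_len is hypothetical; tolerance is assumed to stand in for
BUF_LRU_OLD_TOLERANCE, and the pointer moves are modeled only by their effect
on the old list length. */

```c
#include <assert.h>

/* Returns the old list length after stepping it toward 3/8 of the LRU list
length until it lies within the tolerance */
static unsigned long
demo_adjust_old_len(
	unsigned long	lru_len,	/* in: whole LRU list length */
	unsigned long	old_len,	/* in: current old list length */
	unsigned long	tolerance)	/* in: allowed deviation */
{
	unsigned long	new_len = 3 * (lru_len / 8);

	/* Keeps the unsigned subtraction below from wrapping around */
	assert(new_len > tolerance);
	assert(old_len <= lru_len);

	for (;;) {
		if (old_len < new_len - tolerance) {
			/* LRU_old moves one step toward the list start */
			old_len++;
		} else if (old_len > new_len + tolerance) {
			/* LRU_old moves one step toward the list end */
			old_len--;
		} else {
			return(old_len);
		}
	}
}
```

/* For a list of 80 blocks and a tolerance of 20 the target is 30, so a
fully old list of 80 settles at 50, the upper edge of the allowed band. */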
/***********************************************************************
Initializes the old blocks pointer in the LRU list.
This function should be called when the LRU list grows to
BUF_LRU_OLD_MIN_LEN length. */
static
void
buf_LRU_old_init(void)
/*==================*/
{
buf_block_t* block;
ut_ad(UT_LIST_GET_LEN(buf_pool->LRU) == BUF_LRU_OLD_MIN_LEN);
/* We first initialize all blocks in the LRU list as old and then use
the adjust function to move the LRU_old pointer to the right
position */
block = UT_LIST_GET_FIRST(buf_pool->LRU);
while (block != NULL) {
block->old = TRUE;
block = UT_LIST_GET_NEXT(LRU, block);
}
buf_pool->LRU_old = UT_LIST_GET_FIRST(buf_pool->LRU);
buf_pool->LRU_old_len = UT_LIST_GET_LEN(buf_pool->LRU);
buf_LRU_old_adjust_len();
}
/**********************************************************************
Removes a block from the LRU list. */
UNIV_INLINE
void
buf_LRU_remove_block(
/*=================*/
buf_block_t* block) /* in: control block */
{
ut_ad(buf_pool);
ut_ad(block);
ut_ad(mutex_own(&(buf_pool->mutex)));
/* If the LRU_old pointer is defined and points to just this block,
move it backward one step */
if (block == buf_pool->LRU_old) {
/* Below: the previous block is guaranteed to exist, because
the LRU_old pointer is only allowed to differ by the
tolerance value from strict 3/8 of the LRU list length. */
buf_pool->LRU_old = UT_LIST_GET_PREV(LRU, block);
(buf_pool->LRU_old)->old = TRUE;
buf_pool->LRU_old_len++;
ut_ad(buf_pool->LRU_old);
}
/* Remove the block from the LRU list */
UT_LIST_REMOVE(LRU, buf_pool->LRU, block);
/* If the LRU list is so short that LRU_old is not defined, return */
if (UT_LIST_GET_LEN(buf_pool->LRU) < BUF_LRU_OLD_MIN_LEN) {
buf_pool->LRU_old = NULL;
return;
}
ut_ad(buf_pool->LRU_old);
/* Update the LRU_old_len field if necessary */
if (block->old) {
buf_pool->LRU_old_len--;
}
/* Adjust the length of the old block list if necessary */
buf_LRU_old_adjust_len();
}
/**********************************************************************
Adds a block to the LRU list end. */
UNIV_INLINE
void
buf_LRU_add_block_to_end_low(
/*=========================*/
buf_block_t* block) /* in: control block */
{
buf_block_t* last_block;
ut_ad(buf_pool);
ut_ad(block);
ut_ad(mutex_own(&(buf_pool->mutex)));
block->old = TRUE;
last_block = UT_LIST_GET_LAST(buf_pool->LRU);
if (last_block) {
block->LRU_position = last_block->LRU_position;
} else {
block->LRU_position = buf_pool_clock_tic();
}
UT_LIST_ADD_LAST(LRU, buf_pool->LRU, block);
if (UT_LIST_GET_LEN(buf_pool->LRU) >= BUF_LRU_OLD_MIN_LEN) {
buf_pool->LRU_old_len++;
}
if (UT_LIST_GET_LEN(buf_pool->LRU) > BUF_LRU_OLD_MIN_LEN) {
ut_ad(buf_pool->LRU_old);
/* Adjust the length of the old block list if necessary */
buf_LRU_old_adjust_len();
} else if (UT_LIST_GET_LEN(buf_pool->LRU) == BUF_LRU_OLD_MIN_LEN) {
/* The LRU list is now long enough for LRU_old to become
defined: init it */
buf_LRU_old_init();
}
}
/**********************************************************************
Adds a block to the LRU list. */
UNIV_INLINE
void
buf_LRU_add_block_low(
/*==================*/
buf_block_t* block, /* in: control block */
ibool old) /* in: TRUE if should be put to the old blocks
in the LRU list, else put to the start; if the
LRU list is very short, the block is added to
the start, regardless of this parameter */
{
ulint cl;
ut_ad(buf_pool);
ut_ad(block);
ut_ad(mutex_own(&(buf_pool->mutex)));
block->old = old;
cl = buf_pool_clock_tic();
if (!old || (UT_LIST_GET_LEN(buf_pool->LRU) < BUF_LRU_OLD_MIN_LEN)) {
UT_LIST_ADD_FIRST(LRU, buf_pool->LRU, block);
block->LRU_position = cl;
block->freed_page_clock = buf_pool->freed_page_clock;
} else {
UT_LIST_INSERT_AFTER(LRU, buf_pool->LRU, buf_pool->LRU_old,
block);
buf_pool->LRU_old_len++;
/* We copy the LRU position field of the previous block
to the new block */
block->LRU_position = (buf_pool->LRU_old)->LRU_position;
}
if (UT_LIST_GET_LEN(buf_pool->LRU) > BUF_LRU_OLD_MIN_LEN) {
ut_ad(buf_pool->LRU_old);
/* Adjust the length of the old block list if necessary */
buf_LRU_old_adjust_len();
} else if (UT_LIST_GET_LEN(buf_pool->LRU) == BUF_LRU_OLD_MIN_LEN) {
/* The LRU list is now long enough for LRU_old to become
defined: init it */
buf_LRU_old_init();
}
}
/**********************************************************************
Adds a block to the LRU list. */
void
buf_LRU_add_block(
/*==============*/
buf_block_t* block, /* in: control block */
ibool old) /* in: TRUE if should be put to the old
blocks in the LRU list, else put to the start;
if the LRU list is very short, the block is
added to the start, regardless of this
parameter */
{
buf_LRU_add_block_low(block, old);
}
/**********************************************************************
Moves a block to the start of the LRU list. */
void
buf_LRU_make_block_young(
/*=====================*/
buf_block_t* block) /* in: control block */
{
buf_LRU_remove_block(block);
buf_LRU_add_block_low(block, FALSE);
}
/**********************************************************************
Moves a block to the end of the LRU list. */
void
buf_LRU_make_block_old(
/*===================*/
buf_block_t* block) /* in: control block */
{
buf_LRU_remove_block(block);
buf_LRU_add_block_to_end_low(block);
}
/**********************************************************************
Puts a block back to the free list. */
void
buf_LRU_block_free_non_file_page(
/*=============================*/
buf_block_t* block) /* in: block, must not contain a file page */
{
ut_ad(mutex_own(&(buf_pool->mutex)));
ut_ad(block);
ut_ad((block->state == BUF_BLOCK_MEMORY)
|| (block->state == BUF_BLOCK_READY_FOR_USE));
block->state = BUF_BLOCK_NOT_USED;
UT_LIST_ADD_FIRST(free, buf_pool->free, block);
}
/**********************************************************************
Takes a block out of the LRU list and page hash table and sets the block
state to BUF_BLOCK_REMOVE_HASH. */
static
void
buf_LRU_block_remove_hashed_page(
/*=============================*/
buf_block_t* block) /* in: block, must contain a file page and
be in a state where it can be freed; there
may or may not be a hash index to the page */
{
ut_ad(mutex_own(&(buf_pool->mutex)));
ut_ad(block);
ut_ad(block->state == BUF_BLOCK_FILE_PAGE);
ut_a(block->io_fix == 0);
ut_a(block->buf_fix_count == 0);
ut_a(ut_dulint_cmp(block->oldest_modification, ut_dulint_zero) == 0);
buf_LRU_remove_block(block);
buf_pool->freed_page_clock += 1;
buf_frame_modify_clock_inc(block->frame);
HASH_DELETE(buf_block_t, hash, buf_pool->page_hash,
buf_page_address_fold(block->space, block->offset),
block);
block->state = BUF_BLOCK_REMOVE_HASH;
}
/**********************************************************************
Puts a file page which has no hash index to the free list. */
static
void
buf_LRU_block_free_hashed_page(
/*===========================*/
buf_block_t* block) /* in: block, must contain a file page and
be in a state where it can be freed */
{
ut_ad(mutex_own(&(buf_pool->mutex)));
ut_ad(block->state == BUF_BLOCK_REMOVE_HASH);
block->state = BUF_BLOCK_MEMORY;
buf_LRU_block_free_non_file_page(block);
}
/**************************************************************************
Validates the LRU list. */
ibool
buf_LRU_validate(void)
/*==================*/
{
buf_block_t* block;
ulint old_len;
ulint new_len;
ulint LRU_pos;
ut_ad(buf_pool);
mutex_enter(&(buf_pool->mutex));
if (UT_LIST_GET_LEN(buf_pool->LRU) >= BUF_LRU_OLD_MIN_LEN) {
ut_a(buf_pool->LRU_old);
old_len = buf_pool->LRU_old_len;
new_len = 3 * (UT_LIST_GET_LEN(buf_pool->LRU) / 8);
ut_a(old_len >= new_len - BUF_LRU_OLD_TOLERANCE);
ut_a(old_len <= new_len + BUF_LRU_OLD_TOLERANCE);
}
UT_LIST_VALIDATE(LRU, buf_block_t, buf_pool->LRU);
block = UT_LIST_GET_FIRST(buf_pool->LRU);
old_len = 0;
while (block != NULL) {
ut_a(block->state == BUF_BLOCK_FILE_PAGE);
if (block->old) {
old_len++;
}
if (buf_pool->LRU_old && (old_len == 1)) {
ut_a(buf_pool->LRU_old == block);
}
LRU_pos = block->LRU_position;
block = UT_LIST_GET_NEXT(LRU, block);
if (block) {
/* If the following assert fails, it may
not be an error: just the buf_pool clock
has wrapped around */
ut_a(LRU_pos >= block->LRU_position);
}
}
if (buf_pool->LRU_old) {
ut_a(buf_pool->LRU_old_len == old_len);
}
UT_LIST_VALIDATE(free, buf_block_t, buf_pool->free);
block = UT_LIST_GET_FIRST(buf_pool->free);
while (block != NULL) {
ut_a(block->state == BUF_BLOCK_NOT_USED);
block = UT_LIST_GET_NEXT(free, block);
}
mutex_exit(&(buf_pool->mutex));
return(TRUE);
}
/**************************************************************************
Prints the LRU list. */
void
buf_LRU_print(void)
/*===============*/
{
buf_block_t* block;
buf_frame_t* frame;
ulint len;
ut_ad(buf_pool);
mutex_enter(&(buf_pool->mutex));
printf("Pool ulint clock %lu\n", buf_pool->ulint_clock);
block = UT_LIST_GET_FIRST(buf_pool->LRU);
len = 0;
while (block != NULL) {
printf("BLOCK %lu ", block->offset);
if (block->old) {
printf("old ");
}
if (block->buf_fix_count) {
printf("buffix count %lu ", block->buf_fix_count);
}
if (block->io_fix) {
printf("io_fix %lu ", block->io_fix);
}
if (ut_dulint_cmp(block->oldest_modification,
ut_dulint_zero) > 0) {
printf("modif. ");
}
printf("LRU pos %lu ", block->LRU_position);
frame = buf_block_get_frame(block);
printf("type %lu ", fil_page_get_type(frame));
printf("index id %lu ", ut_dulint_get_low(
btr_page_get_index_id(frame)));
block = UT_LIST_GET_NEXT(LRU, block);
len++;
if (len % 10 == 0) {
printf("\n");
}
}
mutex_exit(&(buf_pool->mutex));
}
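
The length check at the top of buf_LRU_validate() can be sketched in isolation as follows. This is only an illustration: the two constants are assumed values (the real BUF_LRU_OLD_MIN_LEN and BUF_LRU_OLD_TOLERANCE are defined outside this section), and the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration only; the real constants are defined
   elsewhere in the buffer pool sources and are not shown in this diff. */
#define SKETCH_LRU_OLD_MIN_LEN   80
#define SKETCH_LRU_OLD_TOLERANCE 20

/* Returns 1 if old_len is an acceptable length for the "old" tail of an
   LRU list holding lru_len blocks, 0 otherwise. Lists shorter than the
   minimum length are not checked, as in buf_LRU_validate(). */
static int
sketch_lru_old_len_ok(size_t lru_len, size_t old_len)
{
	size_t target;

	/* The unsigned subtraction below must not wrap around: with the
	   assumed constants the smallest target is 30, the tolerance 20. */
	assert(SKETCH_LRU_OLD_TOLERANCE <= 3 * (SKETCH_LRU_OLD_MIN_LEN / 8));

	if (lru_len < SKETCH_LRU_OLD_MIN_LEN) {
		return 1;
	}

	/* Integer division first, then the multiplication: the target is
	   about 3/8 of the list length, rounded down to a multiple of 3. */
	target = 3 * (lru_len / 8);

	return old_len >= target - SKETCH_LRU_OLD_TOLERANCE
	    && old_len <= target + SKETCH_LRU_OLD_TOLERANCE;
}
```

Under these assumptions, a list of 1000 blocks has a target old length of 375, and any old length from 355 to 395 passes the check.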

559
innobase/buf/buf0rea.c Normal file

@ -0,0 +1,559 @@
/******************************************************
The database buffer read
(c) 1995 Innobase Oy
Created 11/5/1995 Heikki Tuuri
*******************************************************/
#include "buf0rea.h"
#include "fil0fil.h"
#include "mtr0mtr.h"
#include "buf0buf.h"
#include "buf0flu.h"
#include "buf0lru.h"
#include "ibuf0ibuf.h"
#include "log0recv.h"
#include "trx0sys.h"
#include "os0file.h"
/* The size in blocks of the area where the random read-ahead algorithm counts
the accessed pages when deciding whether to read-ahead */
#define BUF_READ_AHEAD_RANDOM_AREA BUF_READ_AHEAD_AREA
/* There must be at least this many pages in buf_pool in the area to start
a random read-ahead */
#define BUF_READ_AHEAD_RANDOM_THRESHOLD (5 + BUF_READ_AHEAD_RANDOM_AREA / 8)
/* The linear read-ahead area size */
#define BUF_READ_AHEAD_LINEAR_AREA BUF_READ_AHEAD_AREA
/* The linear read-ahead threshold */
#define BUF_READ_AHEAD_LINEAR_THRESHOLD (3 * BUF_READ_AHEAD_LINEAR_AREA / 8)
/* If the number of pending reads exceeds buf_pool->curr_size divided by the
number below, read-ahead is not done: this is to prevent flooding the buffer
pool with i/o-fixed buffer blocks */
#define BUF_READ_AHEAD_PEND_LIMIT 2
/************************************************************************
Low-level function which reads a page asynchronously from a file to the
buffer buf_pool if it is not already there, in which case does nothing.
Sets the io_fix flag and sets an exclusive lock on the buffer frame. The
flag is cleared and the x-lock released by an i/o-handler thread. */
static
ulint
buf_read_page_low(
/*==============*/
/* out: 1 if a read request was queued, 0 if the page
already resided in buf_pool */
ibool sync, /* in: TRUE if synchronous aio is desired */
ulint mode, /* in: BUF_READ_IBUF_PAGES_ONLY, ...,
ORed to OS_AIO_SIMULATED_WAKE_LATER (see below
at read-ahead functions) */
ulint space, /* in: space id */
ulint offset) /* in: page number */
{
buf_block_t* block;
ulint wake_later;
wake_later = mode & OS_AIO_SIMULATED_WAKE_LATER;
mode = mode & ~OS_AIO_SIMULATED_WAKE_LATER;
#ifdef UNIV_LOG_DEBUG
if (space % 2 == 1) {
/* We are updating a replicate space while holding the
log mutex: the read must be handled before other reads
which might incur ibuf operations and thus write to the log */
printf("Log debug: reading replicate page in sync mode\n");
sync = TRUE;
}
#endif
if (trx_sys_hdr_page(space, offset)) {
/* Trx sys header is so low in the latching order that we play
safe and do not leave the i/o-completion to an asynchronous
i/o-thread: */
sync = TRUE;
}
block = buf_page_init_for_read(mode, space, offset);
if (block != NULL) {
fil_io(OS_FILE_READ | wake_later,
sync, space, offset, 0, UNIV_PAGE_SIZE,
(void*)block->frame, (void*)block);
if (sync) {
/* The i/o is already completed when we return from
fil_io */
buf_page_io_complete(block);
}
return(1);
}
return(0);
}
/************************************************************************
Applies a random read-ahead in buf_pool if at least a threshold number of
pages in the random read-ahead area have been accessed recently. Does not read any
page, not even the one at the position (space, offset), if the read-ahead
mechanism is not activated. NOTE 1: the calling thread may own latches on
pages: to avoid deadlocks this function must be written such that it cannot
end up waiting for these latches! NOTE 2: the calling thread must want
access to the page given: this rule is set to prevent unintended read-aheads
performed by ibuf routines, a situation which could result in a deadlock if
the OS does not support asynchronous i/o. */
static
ulint
buf_read_ahead_random(
/*==================*/
/* out: number of page read requests issued; NOTE
that if we read ibuf pages, it may happen that
the page at the given page number does not get
read even if we return a value > 0! */
ulint space, /* in: space id */
ulint offset) /* in: page number of a page which the current thread
wants to access */
{
buf_block_t* block;
ulint recent_blocks = 0;
ulint count;
ulint LRU_recent_limit;
ulint ibuf_mode;
ulint low, high;
ulint i;
if (ibuf_bitmap_page(offset)) {
/* If it is an ibuf bitmap page, we do no read-ahead, as
that could break the ibuf page access order */
return(0);
}
low = (offset / BUF_READ_AHEAD_RANDOM_AREA)
* BUF_READ_AHEAD_RANDOM_AREA;
high = (offset / BUF_READ_AHEAD_RANDOM_AREA + 1)
* BUF_READ_AHEAD_RANDOM_AREA;
if (high > fil_space_get_size(space)) {
high = fil_space_get_size(space);
}
/* Get the minimum LRU_position field value for an initial segment
of the LRU list, to determine which blocks have recently been added
to the start of the list. */
LRU_recent_limit = buf_LRU_get_recent_limit();
mutex_enter(&(buf_pool->mutex));
if (buf_pool->n_pend_reads >
buf_pool->curr_size / BUF_READ_AHEAD_PEND_LIMIT) {
mutex_exit(&(buf_pool->mutex));
return(0);
}
/* Count how many blocks in the area have been recently accessed,
that is, reside near the start of the LRU list. */
for (i = low; i < high; i++) {
block = buf_page_hash_get(space, i);
if ((block)
&& (block->LRU_position > LRU_recent_limit)
&& block->accessed) {
recent_blocks++;
}
}
mutex_exit(&(buf_pool->mutex));
if (recent_blocks < BUF_READ_AHEAD_RANDOM_THRESHOLD) {
/* Do nothing */
return(0);
}
/* Read all the suitable blocks within the area */
if (ibuf_inside()) {
ibuf_mode = BUF_READ_IBUF_PAGES_ONLY;
} else {
ibuf_mode = BUF_READ_ANY_PAGE;
}
count = 0;
for (i = low; i < high; i++) {
/* It is only sensible to do read-ahead in the non-sync aio
mode: hence FALSE as the first parameter */
if (!ibuf_bitmap_page(i)) {
count += buf_read_page_low(FALSE, ibuf_mode
| OS_AIO_SIMULATED_WAKE_LATER,
space, i);
}
}
/* In simulated aio we wake the aio handler threads only after
queuing all aio requests, in native aio the following call does
nothing: */
os_aio_simulated_wake_handler_threads();
if (buf_debug_prints && (count > 0)) {
printf("Random read-ahead space %lu offset %lu pages %lu\n",
space, offset, count);
}
return(count);
}
/************************************************************************
High-level function which reads a page from a file to the buffer buf_pool if
it is not already there. The read is done in the synchronous aio mode: the
io_fix flag and the exclusive lock on the buffer frame, which are set for the
duration of the read, are cleared and released before the function returns.
Does a random read-ahead if it seems sensible. */
ulint
buf_read_page(
/*==========*/
/* out: number of page read requests issued: this can
be > 1 if read-ahead occurred */
ulint space, /* in: space id */
ulint offset) /* in: page number */
{
ulint count;
ulint count2;
count = buf_read_ahead_random(space, offset);
/* We do the i/o in the synchronous aio mode to save thread
switches: hence TRUE */
count2 = buf_read_page_low(TRUE, BUF_READ_ANY_PAGE, space, offset);
/* Flush pages from the end of the LRU list if necessary */
buf_flush_free_margin();
return(count + count2);
}
/************************************************************************
Applies linear read-ahead if in the buf_pool the page is a border page of
a linear read-ahead area and all the pages in the area have been accessed.
Does not read any page if the read-ahead mechanism is not activated. Note
that the algorithm looks at the 'natural' adjacent successor and
predecessor of the page, which on the leaf level of a B-tree are the next
and previous page in the chain of leaves. To know these, the page specified
in (space, offset) must already be present in the buf_pool. Thus, the
natural way to use this function is to call it when a page in the buf_pool
is accessed the first time, calling this function just after it has been
bufferfixed.
NOTE 1: as this function looks at the natural predecessor and successor
fields on the page, what happens if these are not initialized to any
sensible value? No problem, before applying read-ahead we check that the
area to read is within the span of the space, if not, read-ahead is not
applied. An uninitialized value may result in a useless read operation, but
only very improbably.
NOTE 2: the calling thread may own latches on pages: to avoid deadlocks this
function must be written such that it cannot end up waiting for these
latches!
NOTE 3: the calling thread must want access to the page given: this rule is
set to prevent unintended read-aheads performed by ibuf routines, a situation
which could result in a deadlock if the OS does not support asynchronous io. */
ulint
buf_read_ahead_linear(
/*==================*/
/* out: number of page read requests issued */
ulint space, /* in: space id */
ulint offset) /* in: page number of a page; NOTE: the current thread
must want access to this page (see NOTE 3 above) */
{
buf_block_t* block;
buf_frame_t* frame;
buf_block_t* pred_block = NULL;
ulint pred_offset;
ulint succ_offset;
ulint count;
int asc_or_desc;
ulint new_offset;
ulint fail_count;
ulint ibuf_mode;
ulint low, high;
ulint i;
if (ibuf_bitmap_page(offset)) {
/* If it is an ibuf bitmap page, we do no read-ahead, as
that could break the ibuf page access order */
return(0);
}
low = (offset / BUF_READ_AHEAD_LINEAR_AREA)
* BUF_READ_AHEAD_LINEAR_AREA;
high = (offset / BUF_READ_AHEAD_LINEAR_AREA + 1)
* BUF_READ_AHEAD_LINEAR_AREA;
if ((offset != low) && (offset != high - 1)) {
/* This is not a border page of the area: return */
return(0);
}
if (high > fil_space_get_size(space)) {
/* The area is not whole, return */
return(0);
}
mutex_enter(&(buf_pool->mutex));
if (buf_pool->n_pend_reads >
buf_pool->curr_size / BUF_READ_AHEAD_PEND_LIMIT) {
mutex_exit(&(buf_pool->mutex));
return(0);
}
/* Check that almost all pages in the area have been accessed; if
offset == low, the accesses must be in a descending order, otherwise,
in an ascending order. */
asc_or_desc = 1;
if (offset == low) {
asc_or_desc = -1;
}
fail_count = 0;
for (i = low; i < high; i++) {
block = buf_page_hash_get(space, i);
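if ((block == NULL) || !block->accessed) {
/* Not accessed */
fail_count++;
} else {
if (pred_block && (ut_ulint_cmp(block->LRU_position,
pred_block->LRU_position)
!= asc_or_desc)) {
/* Accesses not in the right order */
fail_count++;
}
/* Remember the latest accessed block, so that the next
accessed block is compared against it; if pred_block were
set only inside the order check, it would stay NULL and the
check would never run */
pred_block = block;
}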
}
if (fail_count > BUF_READ_AHEAD_LINEAR_AREA -
BUF_READ_AHEAD_LINEAR_THRESHOLD) {
/* Too many failures: return */
mutex_exit(&(buf_pool->mutex));
return(0);
}
/* If we got this far, we know that enough pages in the area have
been accessed in the right order: linear read-ahead can be sensible */
block = buf_page_hash_get(space, offset);
if (block == NULL) {
mutex_exit(&(buf_pool->mutex));
return(0);
}
frame = block->frame;
/* Read the natural predecessor and successor page addresses from
the page; NOTE that because the calling thread may have an x-latch
on the page, we do not acquire an s-latch on the page, this is to
prevent deadlocks. Even if we read values which are nonsense, the
algorithm will work. */
pred_offset = fil_page_get_prev(frame);
succ_offset = fil_page_get_next(frame);
mutex_exit(&(buf_pool->mutex));
if ((offset == low) && (succ_offset == offset + 1)) {
/* This is ok, we can continue */
new_offset = pred_offset;
} else if ((offset == high - 1) && (pred_offset == offset - 1)) {
/* This is ok, we can continue */
new_offset = succ_offset;
} else {
/* Successor or predecessor not in the right order */
return(0);
}
low = (new_offset / BUF_READ_AHEAD_LINEAR_AREA)
* BUF_READ_AHEAD_LINEAR_AREA;
high = (new_offset / BUF_READ_AHEAD_LINEAR_AREA + 1)
* BUF_READ_AHEAD_LINEAR_AREA;
if ((new_offset != low) && (new_offset != high - 1)) {
/* This is not a border page of the area: return */
return(0);
}
if (high > fil_space_get_size(space)) {
/* The area is not whole, return */
return(0);
}
/* If we got this far, read-ahead can be sensible: do it */
if (ibuf_inside()) {
ibuf_mode = BUF_READ_IBUF_PAGES_ONLY;
} else {
ibuf_mode = BUF_READ_ANY_PAGE;
}
count = 0;
for (i = low; i < high; i++) {
/* It is only sensible to do read-ahead in the non-sync
aio mode: hence FALSE as the first parameter */
if (!ibuf_bitmap_page(i)) {
count += buf_read_page_low(FALSE, ibuf_mode
| OS_AIO_SIMULATED_WAKE_LATER,
space, i);
}
}
/* In simulated aio we wake the aio handler threads only after
queuing all aio requests, in native aio the following call does
nothing: */
os_aio_simulated_wake_handler_threads();
/* Flush pages from the end of the LRU list if necessary */
buf_flush_free_margin();
if (buf_debug_prints && (count > 0)) {
printf(
"LINEAR read-ahead space %lu offset %lu pages %lu\n",
space, offset, count);
}
return(count);
}
/************************************************************************
Issues read requests for pages which the ibuf module wants to read in, in
order to contract insert buffer trees. Technically, this function is like
a read-ahead function. */
void
buf_read_ibuf_merge_pages(
/*======================*/
ibool sync, /* in: TRUE if the caller wants this function
to wait for the highest address page to get
read in, before this function returns */
ulint space, /* in: space id */
ulint* page_nos, /* in: array of page numbers to read, with the
highest page number the last in the array */
ulint n_stored) /* in: number of page numbers in the array */
{
ulint i;
ut_ad(!ibuf_inside());
#ifdef UNIV_IBUF_DEBUG
ut_a(n_stored < UNIV_PAGE_SIZE);
#endif
while (buf_pool->n_pend_reads >
buf_pool->curr_size / BUF_READ_AHEAD_PEND_LIMIT) {
os_thread_sleep(500000);
}
for (i = 0; i < n_stored; i++) {
if ((i + 1 == n_stored) && sync) {
buf_read_page_low(TRUE, BUF_READ_ANY_PAGE, space,
page_nos[i]);
} else {
buf_read_page_low(FALSE, BUF_READ_ANY_PAGE, space,
page_nos[i]);
}
}
/* Flush pages from the end of the LRU list if necessary */
buf_flush_free_margin();
if (buf_debug_prints) {
printf("Ibuf merge read-ahead space %lu pages %lu\n",
space, n_stored);
}
}
/************************************************************************
Issues read requests for pages which recovery wants to read in. */
void
buf_read_recv_pages(
/*================*/
ibool sync, /* in: TRUE if the caller wants this function
to wait for the highest address page to get
read in, before this function returns */
ulint space, /* in: space id */
ulint* page_nos, /* in: array of page numbers to read, with the
highest page number the last in the array */
ulint n_stored) /* in: number of page numbers in the array */
{
ulint i;
for (i = 0; i < n_stored; i++) {
while (buf_pool->n_pend_reads >= RECV_POOL_N_FREE_BLOCKS / 2) {
os_aio_simulated_wake_handler_threads();
os_thread_sleep(500000);
}
if ((i + 1 == n_stored) && sync) {
buf_read_page_low(TRUE, BUF_READ_ANY_PAGE, space,
page_nos[i]);
} else {
buf_read_page_low(FALSE, BUF_READ_ANY_PAGE
| OS_AIO_SIMULATED_WAKE_LATER,
space, page_nos[i]);
}
}
os_aio_simulated_wake_handler_threads();
/* Flush pages from the end of the LRU list if necessary */
buf_flush_free_margin();
if (buf_debug_prints) {
printf("Recovery applies read-ahead pages %lu\n", n_stored);
}
}
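
The area arithmetic shared by buf_read_ahead_random() and buf_read_ahead_linear() can be sketched on its own as below. This is an illustration only: the area size of 64 pages is an assumption (BUF_READ_AHEAD_AREA is defined in a header that is not part of this section), and all `sketch_` names are hypothetical.

```c
#include <assert.h>

/* Assumed area size in pages; BUF_READ_AHEAD_AREA itself is not shown
   in this diff section. */
#define SKETCH_READ_AHEAD_AREA 64UL

typedef struct {
	unsigned long low;	/* first page number of the area */
	unsigned long high;	/* one past the last page number */
} sketch_area_t;

/* Aligned area [low, high) which contains the page number offset. */
static sketch_area_t
sketch_area_of(unsigned long offset, unsigned long area)
{
	sketch_area_t a;

	assert(area > 0);
	a.low = (offset / area) * area;
	a.high = a.low + area;
	return a;
}

/* Linear read-ahead is only considered for the first or the last page
   of an area. */
static int
sketch_is_border_page(unsigned long offset, unsigned long area)
{
	sketch_area_t a = sketch_area_of(offset, area);

	return offset == a.low || offset == a.high - 1;
}

/* Minimum number of recently accessed pages in the area needed to
   start a random read-ahead: 5 + area / 8. */
static unsigned long
sketch_random_threshold(unsigned long area)
{
	return 5 + area / 8;
}

/* Largest fail_count which still lets linear read-ahead proceed:
   area - 3 * area / 8. */
static unsigned long
sketch_linear_max_failures(unsigned long area)
{
	return area - 3 * area / 8;
}
```

With the assumed area of 64 pages, page 130 belongs to the area [128, 192), random read-ahead needs 13 recently accessed pages, and linear read-ahead tolerates up to 40 failures. Note one difference the sketch leaves out: the random variant clamps `high` to the size of the space, while the linear variant gives up if the area is not whole.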

20
innobase/buf/makefilewin Normal file

@ -0,0 +1,20 @@
include ..\include\makefile.i
buf.lib: buf0buf.obj buf0lru.obj buf0flu.obj buf0rea.obj
lib -out:..\libs\buf.lib buf0buf.obj buf0lru.obj buf0flu.obj buf0rea.obj
buf0buf.obj: buf0buf.c
$(CCOM) $(CFL) -c buf0buf.c
buf0lru.obj: buf0lru.c
$(CCOM) $(CFL) -c buf0lru.c
buf0flu.obj: buf0flu.c
$(CCOM) $(CFL) -c buf0flu.c
buf0rea.obj: buf0rea.c
$(CCOM) $(CFL) -c buf0rea.c

20
innobase/buf/ts/makefile Normal file

@ -0,0 +1,20 @@
include ..\..\makefile.i
doall: tsbuf
tsbuf: ..\buf.lib tsbuf.c
$(CCOM) $(CFL) -I.. -I..\.. ..\buf.lib ..\..\btr.lib ..\..\trx.lib ..\..\pars.lib ..\..\que.lib ..\..\lock.lib ..\..\row.lib ..\..\read.lib ..\..\srv.lib ..\..\com.lib ..\..\usr.lib ..\..\thr.lib ..\..\fut.lib ..\..\fsp.lib ..\..\page.lib ..\..\dyn.lib ..\..\mtr.lib ..\..\log.lib ..\..\rem.lib ..\..\fil.lib ..\..\dict.lib ..\..\data.lib ..\..\mach.lib ..\..\ha.lib ..\..\ut.lib ..\..\sync.lib ..\..\mem.lib ..\..\os.lib tsbuf.c $(LFL)
tsos: ..\buf.lib tsos.c
$(CCOM) $(CFL) -I.. -I..\.. ..\buf.lib ..\..\mach.lib ..\..\fil.lib ..\..\ha.lib ..\..\ut.lib ..\..\sync.lib ..\..\mem.lib ..\..\os.lib tsos.c $(LFL)

885
innobase/buf/ts/tsbuf.c Normal file

@ -0,0 +1,885 @@
/************************************************************************
The test module for the file system and buffer manager
(c) 1995 Innobase Oy
Created 11/16/1995 Heikki Tuuri
*************************************************************************/
#include "string.h"
#include "os0thread.h"
#include "os0file.h"
#include "ut0ut.h"
#include "ut0byte.h"
#include "sync0sync.h"
#include "mem0mem.h"
#include "fil0fil.h"
#include "mtr0mtr.h"
#include "mtr0log.h"
#include "log0log.h"
#include "mach0data.h"
#include "..\buf0buf.h"
#include "..\buf0flu.h"
#include "..\buf0lru.h"
os_file_t files[1000];
mutex_t ios_mutex;
ulint ios;
ulint n[10];
mutex_t incs_mutex;
ulint incs;
#define N_SPACES 1
#define N_FILES 1
#define FILE_SIZE 4000
#define POOL_SIZE 1000
#define COUNTER_OFFSET 1500
#define LOOP_SIZE 150
#define N_THREADS 5
ulint zero = 0;
buf_frame_t* bl_arr[POOL_SIZE];
/************************************************************************
Io-handler thread function. */
ulint
handler_thread(
/*===========*/
void* arg)
{
ulint segment;
void* mess;
ulint i;
bool ret;
segment = *((ulint*)arg);
printf("Io handler thread %lu starts\n", segment);
for (i = 0;; i++) {
ret = fil_aio_wait(segment, &mess);
ut_a(ret);
buf_page_io_complete((buf_block_t*)mess);
mutex_enter(&ios_mutex);
ios++;
mutex_exit(&ios_mutex);
}
return(0);
}
/*************************************************************************
This thread reports the status of sync system. */
ulint
info_thread(
/*========*/
void* arg)
{
ulint segment;
segment = *((ulint*)arg);
for (;;) {
sync_print();
os_aio_print();
printf("Debug stop threads == %lu\n", ut_dbg_stop_threads);
os_thread_sleep(30000000);
}
return(0);
}
/*************************************************************************
Creates the files for the file system test and inserts them to
the file system. */
void
create_files(void)
/*==============*/
{
bool ret;
ulint i, k;
char name[20];
os_thread_t thr[5];
os_thread_id_t id[5];
ulint err;
printf("--------------------------------------------------------\n");
printf("Create or open database files\n");
strcpy(name, "tsfile00");
for (k = 0; k < N_SPACES; k++) {
for (i = 0; i < N_FILES; i++) {
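/* "tsfile00" is 8 characters long: the two digits are at
positions 6 and 7, before the terminating NUL at position 8 */
name[6] = (char)((ulint)'0' + k);
name[7] = (char)((ulint)'0' + i);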
files[i] = os_file_create(name, OS_FILE_CREATE,
OS_FILE_TABLESPACE, &ret);
if (ret == FALSE) {
err = os_file_get_last_error();
if (err != OS_FILE_ALREADY_EXISTS) {
printf("OS error %lu in file creation\n", err);
ut_error;
}
files[i] = os_file_create(
name, OS_FILE_OPEN,
OS_FILE_TABLESPACE, &ret);
ut_a(ret);
}
ret = os_file_close(files[i]);
ut_a(ret);
if (i == 0) {
fil_space_create(name, k, OS_FILE_TABLESPACE);
}
ut_a(fil_validate());
fil_node_create(name, FILE_SIZE, k);
}
}
ios = 0;
mutex_create(&ios_mutex);
for (i = 0; i < 5; i++) {
n[i] = i;
thr[i] = os_thread_create(handler_thread, n + i, id + i);
}
/*
n[9] = 9;
os_thread_create(info_thread, n + 9, id);
*/
}
/************************************************************************
Creates the test database files. */
void
create_db(void)
/*===========*/
{
ulint i;
byte* frame;
ulint j;
ulint tm, oldtm;
mtr_t mtr;
printf("--------------------------------------------------------\n");
printf("Write database pages\n");
oldtm = ut_clock();
for (i = 0; i < N_SPACES; i++) {
for (j = 0; j < FILE_SIZE * N_FILES; j++) {
mtr_start(&mtr);
mtr_set_log_mode(&mtr, MTR_LOG_NONE);
frame = buf_page_create(i, j, &mtr);
buf_page_get(i, j, RW_X_LATCH, &mtr);
if (j > FILE_SIZE * N_FILES - 64 * 2 - 1) {
mlog_write_ulint(frame + FIL_PAGE_PREV, j - 5,
MLOG_4BYTES, &mtr);
mlog_write_ulint(frame + FIL_PAGE_NEXT, j - 7,
MLOG_4BYTES, &mtr);
} else {
mlog_write_ulint(frame + FIL_PAGE_PREV, j - 1,
MLOG_4BYTES, &mtr);
mlog_write_ulint(frame + FIL_PAGE_NEXT, j + 1,
MLOG_4BYTES, &mtr);
}
mlog_write_ulint(frame + FIL_PAGE_OFFSET, j,
MLOG_4BYTES, &mtr);
mlog_write_ulint(frame + FIL_PAGE_SPACE, i,
MLOG_4BYTES, &mtr);
mlog_write_ulint(frame + COUNTER_OFFSET, 0,
MLOG_4BYTES, &mtr);
mtr_commit(&mtr);
}
}
tm = ut_clock();
printf("Wall clock time for test %lu milliseconds\n", tm - oldtm);
printf("--------------------------------------------------------\n");
printf("TEST 1 A. Test of page creation when page resides in buffer\n");
for (i = 0; i < N_SPACES; i++) {
for (j = FILE_SIZE * N_FILES - 200;
j < FILE_SIZE * N_FILES; j++) {
mtr_start(&mtr);
mtr_set_log_mode(&mtr, MTR_LOG_NONE);
frame = buf_page_create(i, j, &mtr);
buf_page_get(i, j, RW_X_LATCH, &mtr);
mlog_write_ulint(frame + FIL_PAGE_PREV,
j - 1, MLOG_4BYTES, &mtr);
mlog_write_ulint(frame + FIL_PAGE_NEXT,
j + 1, MLOG_4BYTES, &mtr);
mlog_write_ulint(frame + FIL_PAGE_OFFSET, j,
MLOG_4BYTES, &mtr);
mlog_write_ulint(frame + FIL_PAGE_SPACE, i,
MLOG_4BYTES, &mtr);
mtr_commit(&mtr);
}
}
printf("--------------------------------------------------------\n");
printf("TEST 1 B. Flush pages\n");
buf_flush_batch(BUF_FLUSH_LIST, POOL_SIZE / 2);
buf_validate();
printf("--------------------------------------------------------\n");
printf("TEST 1 C. Allocate POOL_SIZE blocks to flush pages\n");
buf_validate();
/* Flush the pool of dirty pages */
for (i = 0; i < POOL_SIZE; i++) {
bl_arr[i] = buf_frame_alloc();
}
buf_validate();
buf_LRU_print();
for (i = 0; i < POOL_SIZE; i++) {
buf_frame_free(bl_arr[i]);
}
buf_validate();
ut_a(buf_all_freed());
mtr_start(&mtr);
frame = buf_page_get(0, 313, RW_S_LATCH, &mtr);
#ifdef UNIV_ASYNC_IO
ut_a(buf_page_io_query(buf_block_align(frame)) == TRUE);
#endif
mtr_commit(&mtr);
}
/************************************************************************
Reads the test database files linearly, first upward and then downward. */
void
test1(void)
/*=======*/
{
ulint i, j, k, c;
byte* frame;
ulint tm, oldtm;
mtr_t mtr;
printf("--------------------------------------------------------\n");
printf("TEST 1 D. Read linearly database files\n");
oldtm = ut_clock();
for (k = 0; k < 1; k++) {
for (i = 0; i < N_SPACES; i++) {
for (j = 0; j < N_FILES * FILE_SIZE; j++) {
mtr_start(&mtr);
frame = buf_page_get(i, j, RW_S_LATCH, &mtr);
ut_a(mtr_read_ulint(frame + FIL_PAGE_OFFSET,
MLOG_4BYTES, &mtr)
== j);
ut_a(mtr_read_ulint(frame + FIL_PAGE_SPACE,
MLOG_4BYTES, &mtr)
== i);
mtr_commit(&mtr);
}
}
}
tm = ut_clock();
printf("Wall clock time for %lu pages %lu milliseconds\n",
k * i * j, tm - oldtm);
buf_validate();
printf("--------------------------------------------------------\n");
printf("TEST 1 E. Read linearly downward database files\n");
oldtm = ut_clock();
c = 0;
for (k = 0; k < 1; k++) {
for (i = 0; i < N_SPACES; i++) {
for (j = ut_min(1000, FILE_SIZE - 1); j > 0; j--) {
mtr_start(&mtr);
frame = buf_page_get(i, j, RW_S_LATCH, &mtr);
c++;
ut_a(mtr_read_ulint(frame + FIL_PAGE_OFFSET,
MLOG_4BYTES, &mtr)
== j);
ut_a(mtr_read_ulint(frame + FIL_PAGE_SPACE,
MLOG_4BYTES, &mtr)
== i);
ut_a(buf_page_io_query(buf_block_align(frame))
== FALSE);
mtr_commit(&mtr);
}
}
}
tm = ut_clock();
printf("Wall clock time for %lu pages %lu milliseconds\n",
c, tm - oldtm);
buf_validate();
}
/************************************************************************
Reads randomly chosen pages of the test database files. */
void
test2(void)
/*=======*/
{
ulint i, j, k;
byte* frame;
ulint tm, oldtm;
mtr_t mtr;
printf("--------------------------------------------------------\n");
printf("TEST 2. Read randomly database files\n");
oldtm = ut_clock();
for (k = 0; k < 100; k++) {
i = ut_rnd_gen_ulint() % N_SPACES;
j = ut_rnd_gen_ulint() % (N_FILES * FILE_SIZE);
mtr_start(&mtr);
frame = buf_page_get(i, j, RW_S_LATCH, &mtr);
ut_a(mtr_read_ulint(frame + FIL_PAGE_OFFSET,
MLOG_4BYTES, &mtr)
== j);
ut_a(mtr_read_ulint(frame + FIL_PAGE_SPACE,
MLOG_4BYTES, &mtr)
== i);
mtr_commit(&mtr);
}
tm = ut_clock();
printf("Wall clock time for random %lu read %lu milliseconds\n",
k, tm - oldtm);
}
/************************************************************************
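Reads pages of the test database files at pseudo-random offsets, first over
a wide range of pages, where no read-ahead is expected, and then over a
narrow range, where random read-ahead is expected. */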
void
test3(void)
/*=======*/
{
ulint i, j, k;
byte* frame;
ulint tm, oldtm;
ulint rnd;
mtr_t mtr;
if (FILE_SIZE < POOL_SIZE + 3050 + ut_dbg_zero) {
return;
}
printf("Flush the pool of high-offset pages\n");
/* Flush the pool of high-offset pages */
for (i = 0; i < POOL_SIZE; i++) {
mtr_start(&mtr);
frame = buf_page_get(0, i, RW_S_LATCH, &mtr);
mtr_commit(&mtr);
}
buf_validate();
printf("--------------------------------------------------------\n");
printf("TEST 3. Read randomly database pages, no read-ahead\n");
oldtm = ut_clock();
rnd = 123;
for (k = 0; k < 400; k++) {
rnd += 23477;
i = 0;
j = POOL_SIZE + 10 + rnd % 3000;
mtr_start(&mtr);
frame = buf_page_get(i, j, RW_S_LATCH, &mtr);
ut_a(mtr_read_ulint(frame + FIL_PAGE_OFFSET,
MLOG_4BYTES, &mtr)
== j);
ut_a(mtr_read_ulint(frame + FIL_PAGE_SPACE,
MLOG_4BYTES, &mtr)
== i);
mtr_commit(&mtr);
}
tm = ut_clock();
printf(
"Wall clock time for %lu random no read-ahead %lu milliseconds\n",
k, tm - oldtm);
buf_validate();
printf("Flush the pool of high-offset pages\n");
/* Flush the pool of high-offset pages */
for (i = 0; i < POOL_SIZE; i++) {
mtr_start(&mtr);
frame = buf_page_get(0, i, RW_S_LATCH, &mtr);
mtr_commit(&mtr);
}
buf_validate();
printf("--------------------------------------------------------\n");
printf("TEST 3 B. Read randomly database pages, random read-ahead\n");
oldtm = ut_clock();
rnd = 123;
for (k = 0; k < 400; k++) {
rnd += 23477;
i = 0;
j = POOL_SIZE + 10 + rnd % 400;
mtr_start(&mtr);
frame = buf_page_get(i, j, RW_S_LATCH, &mtr);
ut_a(mtr_read_ulint(frame + FIL_PAGE_OFFSET,
MLOG_4BYTES, &mtr)
== j);
ut_a(mtr_read_ulint(frame + FIL_PAGE_SPACE,
MLOG_4BYTES, &mtr)
== i);
mtr_commit(&mtr);
}
tm = ut_clock();
printf(
"Wall clock time for %lu random read-ahead %lu milliseconds\n",
k, tm - oldtm);
}
/************************************************************************
Tests speed of CPU algorithms. */
void
test4(void)
/*=======*/
{
ulint i, j;
ulint tm, oldtm;
mtr_t mtr;
buf_frame_t* frame;
os_thread_sleep(2000000);
printf("--------------------------------------------------------\n");
printf("TEST 4. Speed of CPU algorithms\n");
oldtm = ut_clock();
for (j = 0; j < 1000; j++) {
mtr_start(&mtr);
for (i = 0; i < 20; i++) {
frame = buf_page_get(0, i, RW_S_LATCH, &mtr);
}
mtr_commit(&mtr);
}
tm = ut_clock();
printf("Wall clock time for %lu page get-release %lu milliseconds\n",
i * j, tm - oldtm);
buf_validate();
oldtm = ut_clock();
for (i = 0; i < 10000; i++) {
frame = buf_frame_alloc();
buf_frame_free(frame);
}
tm = ut_clock();
printf("Wall clock time for %lu block alloc-free %lu milliseconds\n",
i, tm - oldtm);
ha_print_info(buf_pool->page_hash);
buf_print();
}
/************************************************************************
Tests various points of code. */
void
test5(void)
/*=======*/
{
buf_frame_t* frame;
fil_addr_t addr;
ulint space;
mtr_t mtr;
printf("--------------------------------------------------------\n");
printf("TEST 5. Various tests \n");
mtr_start(&mtr);
frame = buf_page_get(0, 313, RW_S_LATCH, &mtr);
ut_a(buf_frame_get_space_id(frame) == 0);
ut_a(buf_frame_get_page_no(frame) == 313);
ut_a(buf_frame_align(frame + UNIV_PAGE_SIZE - 1) == frame);
ut_a(buf_frame_align(frame) == frame);
ut_a(buf_block_align(frame + UNIV_PAGE_SIZE - 1) ==
buf_block_align(frame));
buf_ptr_get_fsp_addr(frame + UNIV_PAGE_SIZE - 1, &space, &addr);
ut_a(addr.page == 313);
ut_a(addr.boffset == UNIV_PAGE_SIZE - 1);
ut_a(space == 0);
mtr_commit(&mtr);
}
/************************************************************************
Random test thread function. */
ulint
random_thread(
/*===========*/
void* arg)
{
ulint n;
ulint i, j, r, t, p, sp, count;
ulint s;
buf_frame_t* arr[POOL_SIZE / N_THREADS];
buf_frame_t* frame;
mtr_t mtr;
mtr_t mtr2;
n = *((ulint*)arg);
printf("Random test thread %lu starts\n", os_thread_get_curr_id());
for (i = 0; i < 30; i++) {
t = ut_rnd_gen_ulint() % 10;
r = ut_rnd_gen_ulint() % 100;
s = ut_rnd_gen_ulint() % (POOL_SIZE / N_THREADS);
p = ut_rnd_gen_ulint();
sp = ut_rnd_gen_ulint() % N_SPACES;
if (i % 100 == 0) {
printf("Thr %lu tst %lu starts\n", os_thread_get_curr_id(), t);
}
ut_a(buf_validate());
mtr_start(&mtr);
if (t == 6) {
/* Allocate free blocks */
for (j = 0; j < s; j++) {
arr[j] = buf_frame_alloc();
ut_a(arr[j]);
}
for (j = 0; j < s; j++) {
buf_frame_free(arr[j]);
}
} else if (t == 9) {
/* buf_flush_batch(BUF_FLUSH_LIST, 30); */
} else if (t == 7) {
/* x-lock many blocks */
for (j = 0; j < s; j++) {
arr[j] = buf_page_get(sp, (p + j)
% (N_FILES * FILE_SIZE),
RW_X_LATCH,
&mtr);
ut_a(arr[j]);
if (j > 0) {
ut_a(arr[j] != arr[j - 1]);
}
}
ut_a(buf_validate());
} else if (t == 8) {
/* s-lock many blocks */
for (j = 0; j < s; j++) {
arr[j] = buf_page_get(sp, (p + j)
% (N_FILES * FILE_SIZE),
RW_S_LATCH,
&mtr);
ut_a(arr[j]);
if (j > 0) {
ut_a(arr[j] != arr[j - 1]);
}
}
} else if (t <= 2) {
for (j = 0; j < r; j++) {
/* Read pages */
mtr_start(&mtr2);
frame = buf_page_get(sp,
p % (N_FILES * FILE_SIZE),
RW_S_LATCH, &mtr2);
ut_a(mtr_read_ulint(frame + FIL_PAGE_OFFSET,
MLOG_4BYTES, &mtr2)
== p % (N_FILES * FILE_SIZE));
ut_a(mtr_read_ulint(frame + FIL_PAGE_SPACE,
MLOG_4BYTES, &mtr2)
== sp);
mtr_commit(&mtr2);
if (t == 0) {
p++; /* upward */
} else if (t == 1) {
p--; /* downward */
} else if (t == 2) {
p = ut_rnd_gen_ulint(); /* randomly */
}
}
} else if (t <= 5) {
for (j = 0; j < r; j++) {
/* Write pages */
mtr_start(&mtr2);
frame = buf_page_get(sp, p % (N_FILES * FILE_SIZE),
RW_X_LATCH, &mtr2);
count = 1 + mtr_read_ulint(frame + COUNTER_OFFSET,
MLOG_4BYTES, &mtr2);
mutex_enter(&incs_mutex);
incs++;
mutex_exit(&incs_mutex);
mlog_write_ulint(frame + COUNTER_OFFSET, count,
MLOG_4BYTES, &mtr2);
mtr_commit(&mtr2);
if (t == 3) {
p++; /* upward */
} else if (t == 4) {
p--; /* downward */
} else if (t == 5) {
p = ut_rnd_gen_ulint(); /* randomly */
}
}
} /* if t = */
mtr_commit(&mtr);
/* printf("Thr %lu tst %lu ends ", os_thread_get_curr_id(), t); */
ut_a(buf_validate());
} /* for i */
printf("\nRandom test thread %lu exits\n", os_thread_get_curr_id());
return(0);
}
/************************************************************************
Random test thread function which reports the rw-lock list. */
ulint
rw_list_thread(
/*===========*/
void* arg)
{
ulint n;
ulint i;
n = *((ulint*)arg);
printf("\nRw list test thread %lu starts\n", os_thread_get_curr_id());
for (i = 0; i < 10; i++) {
os_thread_sleep(3000000);
rw_lock_list_print_info();
buf_validate();
}
return(0);
}
/*************************************************************************
Performs random operations on the buffer with several threads. */
void
test6(void)
/*=======*/
{
ulint i, j;
os_thread_t thr[N_THREADS + 1];
os_thread_id_t id[N_THREADS + 1];
ulint n[N_THREADS + 1];
ulint count = 0;
buf_frame_t* frame;
mtr_t mtr;
printf("--------------------------------------------------------\n");
printf("TEST 6. Random multi-thread test on the buffer \n");
incs = 0;
mutex_create(&incs_mutex);
for (i = 0; i < N_THREADS; i++) {
n[i] = i;
thr[i] = os_thread_create(random_thread, n + i, id + i);
}
/*
n[N_THREADS] = N_THREADS;
thr[N_THREADS] = os_thread_create(rw_list_thread, n + N_THREADS,
id + N_THREADS);
*/
for (i = 0; i < N_THREADS; i++) {
os_thread_wait(thr[i]);
}
/* os_thread_wait(thr[N_THREADS]); */
for (i = 0; i < N_SPACES; i++) {
for (j = 0; j < N_FILES * FILE_SIZE; j++) {
mtr_start(&mtr);
frame = buf_page_get(i, j, RW_S_LATCH, &mtr);
ut_a(mtr_read_ulint(frame + FIL_PAGE_OFFSET,
MLOG_4BYTES, &mtr)
== j);
ut_a(mtr_read_ulint(frame + FIL_PAGE_SPACE,
MLOG_4BYTES, &mtr)
== i);
count += mtr_read_ulint(frame + COUNTER_OFFSET,
MLOG_4BYTES, &mtr);
mtr_commit(&mtr);
}
}
printf("Count %lu incs %lu\n", count, incs);
ut_a(count == incs);
}
/************************************************************************
Frees the spaces in the file system. */
void
free_system(void)
/*=============*/
{
ulint i;
for (i = 0; i < N_SPACES; i++) {
fil_space_free(i);
}
}
/************************************************************************
Main test function. */
void
main(void)
/*======*/
{
ulint tm, oldtm;
/* buf_debug_prints = TRUE; */
oldtm = ut_clock();
os_aio_init(160, 5);
sync_init();
mem_init(1500000);
fil_init(26); /* Allow 25 open files at a time */
buf_pool_init(POOL_SIZE, POOL_SIZE);
log_init();
buf_validate();
ut_a(fil_validate());
create_files();
create_db();
buf_validate();
test1();
buf_validate();
test2();
buf_validate();
test3();
buf_validate();
test4();
test5();
buf_validate();
test6();
buf_validate();
buf_print();
buf_flush_batch(BUF_FLUSH_LIST, POOL_SIZE + 1);
buf_print();
buf_validate();
os_thread_sleep(1000000);
buf_print();
buf_all_freed();
free_system();
tm = ut_clock();
printf("Wall clock time for test %lu milliseconds\n", tm - oldtm);
printf("TESTS COMPLETED SUCCESSFULLY!\n");
}

185
innobase/buf/ts/tsos.c Normal file

@ -0,0 +1,185 @@
/************************************************************************
The test module for the operating system interface
(c) 1995 Innobase Oy
Created 9/27/1995 Heikki Tuuri
*************************************************************************/
#include "../os0thread.h"
#include "../os0shm.h"
#include "../os0proc.h"
#include "../os0sync.h"
#include "../os0file.h"
#include "ut0ut.h"
#include "sync0sync.h"
#include "mem0mem.h"
ulint last_thr = 1;
byte global_buf[1000000];
os_file_t file;
os_file_t file2;
os_event_t gl_ready;
mutex_t ios_mutex;
ulint ios;
/************************************************************************
Io-handler thread function. */
ulint
handler_thread(
/*===========*/
void* arg)
{
ulint segment;
void* mess;
ulint i;
bool ret;
segment = *((ulint*)arg);
printf("Thread %lu starts\n", segment);
for (i = 0;; i++) {
ret = os_aio_wait(segment, &mess);
mutex_enter(&ios_mutex);
ios++;
mutex_exit(&ios_mutex);
ut_a(ret);
/* printf("Message for thread %lu %lu\n", segment,
(ulint)mess); */
if ((ulint)mess == 3333) {
os_event_set(gl_ready);
}
}
return(0);
}
/************************************************************************
Test of io-handler threads */
void
test4(void)
/*=======*/
{
ulint i;
bool ret;
void* buf;
ulint rnd;
ulint tm, oldtm;
os_thread_t thr[5];
os_thread_id_t id[5];
ulint n[5];
printf("-------------------------------------------\n");
printf("OS-TEST 4. Test of asynchronous file io\n");
/* Align the buffer for file io */
buf = (void*)(((ulint)global_buf + 6300) & (~0xFFF));
gl_ready = os_event_create(NULL);
ios = 0;
sync_init();
mem_init();
mutex_create(&ios_mutex);
for (i = 0; i < 5; i++) {
n[i] = i;
thr[i] = os_thread_create(handler_thread, n + i, id + i);
}
rnd = 0;
oldtm = ut_clock();
for (i = 0; i < 4096; i++) {
ret = os_aio_read(file, (byte*)buf + 8192 * (rnd % 100),
8192 * (rnd % 4096), 0,
8192, (void*)i);
ut_a(ret);
rnd += 1;
}
ret = os_aio_read(file, buf, 8192 * (rnd % 1024), 0, 8192,
(void*)3333);
ut_a(ret);
ut_a(!os_aio_all_slots_free());
tm = ut_clock();
printf("All ios queued! N ios: %lu\n", ios);
printf("Wall clock time for test %lu milliseconds\n", tm - oldtm);
os_event_wait(gl_ready);
tm = ut_clock();
printf("N ios: %lu\n", ios);
printf("Wall clock time for test %lu milliseconds\n", tm - oldtm);
os_thread_sleep(2000000);
printf("N ios: %lu\n", ios);
ut_a(os_aio_all_slots_free());
}
/*************************************************************************
Initializes the asynchronous io system for tests. */
void
init_aio(void)
/*==========*/
{
bool ret;
void* buf;
buf = (void*)(((ulint)global_buf + 6300) & (~0xFFF));
os_aio_init(160, 5);
file = os_file_create("j:\\tsfile4", OS_FILE_CREATE, OS_FILE_TABLESPACE,
&ret);
if (ret == FALSE) {
ut_a(os_file_get_last_error() == OS_FILE_ALREADY_EXISTS);
file = os_file_create("j:\\tsfile4", OS_FILE_OPEN,
OS_FILE_TABLESPACE, &ret);
ut_a(ret);
}
}
/************************************************************************
Main test function. */
void
main(void)
/*======*/
{
ulint tm, oldtm;
oldtm = ut_clock();
init_aio();
test4();
tm = ut_clock();
printf("Wall clock time for test %lu milliseconds\n", tm - oldtm);
printf("TESTS COMPLETED SUCCESSFULLY!\n");
}
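The test above aligns `global_buf` for file io with `((ulint)global_buf + 6300) & (~0xFFF)`, which lands on a 4096-byte boundary somewhere inside the buffer. A minimal sketch of the same masking idea follows, assuming a 4096-byte alignment. The helper names `align_up_4k` and `aligned_region` are hypothetical and not part of the Innobase source, and the sketch rounds up by 4095 rather than the 6300 used in the test.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: round an address up to the next 4096-byte
boundary. Adding 4095 and clearing the low 12 bits never moves the
address downward. */
static uintptr_t
align_up_4k(uintptr_t addr)
{
	return((addr + 4095) & ~(uintptr_t)0xFFF);
}

/* Hypothetical helper: returns a 4096-byte aligned pointer inside buf,
or NULL if fewer than `need` bytes remain after the aligned start. */
static void*
aligned_region(unsigned char* buf, size_t buf_size, size_t need)
{
	uintptr_t	start	= (uintptr_t)buf;
	uintptr_t	aligned	= align_up_4k(start);
	size_t		skipped	= (size_t)(aligned - start);

	if (skipped > buf_size || buf_size - skipped < need) {
		return(NULL);
	}

	return((void*)aligned);
}
```

Checking the remaining size, as `aligned_region` does, guards against reading past the end of the buffer; the test relies on `global_buf` being large enough instead.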

24
innobase/com/Makefile.am Normal file

@ -0,0 +1,24 @@
# Copyright (C) 2000 MySQL AB & MySQL Finland AB & TCX DataKonsult AB
# & Innobase Oy
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
include ../include/Makefile.i
libs_LIBRARIES = libcom.a
libcom_a_SOURCES = com0com.c com0shm.c
EXTRA_PROGRAMS =

345
innobase/com/com0com.c Normal file

@ -0,0 +1,345 @@
/******************************************************
The communication primitives
(c) 1995 Innobase Oy
Created 9/23/1995 Heikki Tuuri
*******************************************************/
#include "com0com.h"
#ifdef UNIV_NONINL
#include "com0com.ic"
#endif
#include "mem0mem.h"
#include "com0shm.h"
/*
IMPLEMENTATION OF COMMUNICATION PRIMITIVES
==========================================
The primitives provide a uniform function interface for
use in communication. The primitives have been modeled
after the Windows Sockets interface. Below this uniform
API, the precise methods of communication, for example,
shared memory or Berkeley sockets, can be implemented
as subroutines.
*/
struct com_endpoint_struct{
ulint type; /* endpoint type */
void* par; /* type-specific data structures */
ibool bound; /* TRUE if the endpoint has been
bound to an address */
};
/*************************************************************************
Accessor functions for an endpoint */
UNIV_INLINE
ulint
com_endpoint_get_type(
/*==================*/
com_endpoint_t* ep)
{
ut_ad(ep);
return(ep->type);
}
UNIV_INLINE
void
com_endpoint_set_type(
/*==================*/
com_endpoint_t* ep,
ulint type)
{
ut_ad(ep);
ut_ad(type == COM_SHM);
ep->type = type;
}
UNIV_INLINE
void*
com_endpoint_get_par(
/*=================*/
com_endpoint_t* ep)
{
ut_ad(ep);
return(ep->par);
}
UNIV_INLINE
void
com_endpoint_set_par(
/*=================*/
com_endpoint_t* ep,
void* par)
{
ut_ad(ep);
ut_ad(par);
ep->par = par;
}
UNIV_INLINE
ibool
com_endpoint_get_bound(
/*===================*/
com_endpoint_t* ep)
{
ut_ad(ep);
return(ep->bound);
}
UNIV_INLINE
void
com_endpoint_set_bound(
/*===================*/
com_endpoint_t* ep,
ibool bound)
{
ut_ad(ep);
ep->bound = bound;
}
/*************************************************************************
Creates a communications endpoint. */
com_endpoint_t*
com_endpoint_create(
/*================*/
/* out, own: communications endpoint, NULL if
did not succeed */
ulint type) /* in: communication type of endpoint:
only COM_SHM supported */
{
com_endpoint_t* ep;
void* par;
ep = mem_alloc(sizeof(com_endpoint_t));
com_endpoint_set_type(ep, type);
com_endpoint_set_bound(ep, FALSE);
if (type == COM_SHM) {
par = com_shm_endpoint_create();
com_endpoint_set_par(ep, par);
} else {
par = NULL;
ut_error;
}
if (par != NULL) {
return(ep);
} else {
mem_free(ep);
return(NULL);
}
}
/*************************************************************************
Frees a communications endpoint. */
ulint
com_endpoint_free(
/*==============*/
/* out: 0 if succeed, else error number */
com_endpoint_t* ep) /* in, own: communications endpoint */
{
ulint type;
ulint ret;
void* par;
type = com_endpoint_get_type(ep);
par = com_endpoint_get_par(ep);
if (type == COM_SHM) {
ret = com_shm_endpoint_free((com_shm_endpoint_t*)par);
} else {
ret = 0;
ut_error;
}
if (ret) {
return(ret);
} else {
mem_free(ep);
return(0);
}
}
/*************************************************************************
Sets an option, like the maximum datagram size for an endpoint.
The options may vary depending on the endpoint type. */
ulint
com_endpoint_set_option(
/*====================*/
/* out: 0 if succeed, else error number */
com_endpoint_t* ep, /* in: endpoint */
ulint optno, /* in: option number, only
COM_OPT_MAX_DGRAM_SIZE currently supported */
byte* optval, /* in: pointer to a buffer containing the
option value to set */
ulint optlen) /* in: option value buffer length */
{
ulint type;
ulint ret;
void* par;
type = com_endpoint_get_type(ep);
par = com_endpoint_get_par(ep);
if (type == COM_SHM) {
ret = com_shm_endpoint_set_option((com_shm_endpoint_t*)par,
optno, optval, optlen);
} else {
ret = 0;
ut_error;
}
return(ret);
}
/*************************************************************************
Binds a communications endpoint to the specified address. */
ulint
com_bind(
/*=====*/
/* out: 0 if succeed, else error number */
com_endpoint_t* ep, /* in: communications endpoint */
char* name, /* in: address name */
ulint len) /* in: name length */
{
ulint type;
ulint ret;
void* par;
ut_ad(len <= COM_MAX_ADDR_LEN);
if (com_endpoint_get_bound(ep)) {
return(COM_ERR_ALREADY_BOUND);
}
type = com_endpoint_get_type(ep);
par = com_endpoint_get_par(ep);
if (type == COM_SHM) {
ret = com_shm_bind((com_shm_endpoint_t*)par, name, len);
} else {
ret = 0;
ut_error;
}
if (ret == 0) {
com_endpoint_set_bound(ep, TRUE);
}
return(ret);
}
/*************************************************************************
Waits for a datagram to arrive at an endpoint. */
ulint
com_recvfrom(
/*=========*/
/* out: 0 if succeed, else error number */
com_endpoint_t* ep, /* in: communications endpoint */
byte* buf, /* out: datagram buffer; the buffer is
supplied by the caller */
ulint buf_len,/* in: datagram buffer length */
ulint* len, /* out: datagram length */
char* from, /* out: address name buffer; the buffer is
supplied by the caller */
ulint from_len,/* in: address name buffer length */
ulint* addr_len)/* out: address name length */
{
ulint type;
ulint ret;
void* par;
if (!com_endpoint_get_bound(ep)) {
return(COM_ERR_NOT_BOUND);
}
type = com_endpoint_get_type(ep);
par = com_endpoint_get_par(ep);
if (type == COM_SHM) {
ret = com_shm_recvfrom((com_shm_endpoint_t*)par,
buf, buf_len, len, from, from_len,
addr_len);
} else {
ret = 0;
ut_error;
}
return(ret);
}
/*************************************************************************
Sends a datagram to the specified destination. */
ulint
com_sendto(
/*=======*/
/* out: 0 if succeed, else error number */
com_endpoint_t* ep, /* in: communications endpoint */
byte* buf, /* in: datagram buffer */
ulint len, /* in: datagram length */
char* to, /* in: address name buffer */
ulint tolen) /* in: address name length */
{
ulint type;
ulint ret;
void* par;
if (!com_endpoint_get_bound(ep)) {
return(COM_ERR_NOT_BOUND);
}
type = com_endpoint_get_type(ep);
par = com_endpoint_get_par(ep);
if (type == COM_SHM) {
ret = com_shm_sendto((com_shm_endpoint_t*)par, buf, len,
to, tolen);
} else {
ret = 0;
ut_error;
}
return(ret);
}
/*************************************************************************
Gets the maximum datagram size for an endpoint. */
ulint
com_endpoint_get_max_size(
/*======================*/
/* out: maximum size */
com_endpoint_t* ep) /* in: endpoint */
{
ulint type;
ulint ret;
void* par;
type = com_endpoint_get_type(ep);
par = com_endpoint_get_par(ep);
if (type == COM_SHM) {
ret = com_shm_endpoint_get_size((com_shm_endpoint_t*)par);
} else {
ret = 0;
ut_error;
}
return(ret);
}
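The functions in com0com.c all follow one pattern: read the endpoint type, fetch the type-specific data, and hand the call to the matching transport routine. A minimal sketch of that dispatch for the bind case is given below. Every name in it (`endpoint_t`, `EP_SHM`, `shm_bind`, `endpoint_bind`, the `EP_*` codes) is hypothetical and only mirrors the shape of the code above. Unlike com0com.c, which calls `ut_error` on an unknown type, the sketch returns an error code.

```c
#include <stddef.h>

/* Hypothetical transport tags and return codes */
enum { EP_SHM = 1 };
enum { EP_OK = 0, EP_ERR_ALREADY_BOUND = 1, EP_ERR_UNKNOWN_TYPE = 2 };

typedef struct {
	int	type;	/* transport tag */
	void*	par;	/* transport-specific data */
	int	bound;	/* nonzero once bound to an address */
} endpoint_t;

/* Stand-in for a transport-specific bind routine; always succeeds. */
static int
shm_bind(void* par, const char* name, size_t len)
{
	(void)par;
	(void)name;
	(void)len;

	return(EP_OK);
}

/* Uniform entry point: checks the bound flag, dispatches on the type
tag, and marks the endpoint bound only if the transport succeeded. */
static int
endpoint_bind(endpoint_t* ep, const char* name, size_t len)
{
	int	ret;

	if (ep->bound) {
		return(EP_ERR_ALREADY_BOUND);
	}

	switch (ep->type) {
	case EP_SHM:
		ret = shm_bind(ep->par, name, len);
		break;
	default:
		return(EP_ERR_UNKNOWN_TYPE);
	}

	if (ret == EP_OK) {
		ep->bound = 1;
	}

	return(ret);
}
```

Adding another transport in this shape means adding one tag and one `case` per entry point, while callers keep using the same functions.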

1159
innobase/com/com0shm.c Normal file

File diff suppressed because it is too large

12
innobase/com/makefilewin Normal file

@ -0,0 +1,12 @@
include ..\include\makefile.i
com.lib: com0com.obj com0shm.obj
lib -out:..\libs\com.lib com0com.obj com0shm.obj
com0com.obj: com0com.c
$(CCOM) $(CFL) -c com0com.c
com0shm.obj: com0shm.c
$(CCOM) $(CFL) -c com0shm.c

19
innobase/com/ts/makefile Normal file

@ -0,0 +1,19 @@
include ..\..\makefile.i
doall: tscom tscli
tscom: ..\com.lib tscom.c makefile
$(CCOM) $(CFL) -I.. -I..\.. ..\com.lib ..\..\ut.lib ..\..\mem.lib ..\..\sync.lib ..\..\os.lib tscom.c $(LFL)
tscli: ..\com.lib tscli.c makefile
$(CCOM) $(CFL) -I.. -I..\.. ..\com.lib ..\..\ut.lib ..\..\mem.lib ..\..\sync.lib ..\..\os.lib tscli.c $(LFL)

96
innobase/com/ts/tscli.c Normal file

@ -0,0 +1,96 @@
/************************************************************************
The test module for communication
(c) 1995 Innobase Oy
Created 9/26/1995 Heikki Tuuri
*************************************************************************/
#include "../com0com.h"
#include "../com0shm.h"
#include "ut0ut.h"
#include "mem0mem.h"
#include "os0thread.h"
#include "sync0ipm.h"
#include "sync0sync.h"
byte buf[10000];
char addr[150];
void
test1(void)
/*=======*/
{
com_endpoint_t* ep;
ulint ret;
ulint size;
ulint len;
ulint addr_len;
ulint i;
ep = com_endpoint_create(COM_SHM);
ut_a(ep);
size = 8192;
ret = com_endpoint_set_option(ep, COM_OPT_MAX_DGRAM_SIZE,
(byte*)&size, 0);
ut_a(ret == 0);
ret = com_bind(ep, "CLI", 3);
ut_a(ret == 0);
printf("Client endpoint created!\n");
for (i = 0; i < 10000; i++) {
ret = com_sendto(ep, (byte*)"Hello from client!\n", 18, "SRV", 3);
ut_a(ret == 0);
ret = com_recvfrom(ep, buf, 10000, &len, addr, 150, &addr_len);
ut_a(ret == 0);
buf[len] = '\0';
addr[addr_len] = '\0';
/*
printf(
"Message of len %lu\n%s \nreceived from address %s of len %lu\n",
len, buf, addr, addr_len);
*/
}
ret = com_endpoint_free(ep);
ut_ad(ret == 0);
printf("Count of extra system calls in com_shm %lu\n",
com_shm_system_call_count);
printf("Count of extra system calls in ip_mutex %lu\n",
ip_mutex_system_call_count);
}
void
main(void)
/*======*/
{
ulint tm, oldtm;
sync_init();
mem_init();
oldtm = ut_clock();
test1();
ut_ad(mem_all_freed());
tm = ut_clock();
printf("Wall clock time for test %lu milliseconds\n", tm - oldtm);
printf("TESTS COMPLETED SUCCESSFULLY!\n");
}

94
innobase/com/ts/tscom.c Normal file

@ -0,0 +1,94 @@
/************************************************************************
The test module for communication
(c) 1995 Innobase Oy
Created 9/26/1995 Heikki Tuuri
*************************************************************************/
#include "../com0com.h"
#include "../com0shm.h"
#include "ut0ut.h"
#include "mem0mem.h"
#include "os0thread.h"
#include "sync0ipm.h"
#include "sync0sync.h"
byte buf[10000];
char addr[150];
void
test1(void)
/*=======*/
{
com_endpoint_t* ep;
ulint ret;
ulint size;
ulint len;
ulint addr_len;
ulint i;
ep = com_endpoint_create(COM_SHM);
ut_a(ep);
size = 8192;
ret = com_endpoint_set_option(ep, COM_OPT_MAX_DGRAM_SIZE,
(byte*)&size, 0);
ut_a(ret == 0);
ret = com_bind(ep, "SRV", 3);
ut_a(ret == 0);
printf("Server endpoint created!\n");
for (i = 0; i < 50000; i++) {
ret = com_recvfrom(ep, buf, 10000, &len, addr, 150, &addr_len);
ut_a(ret == 0);
buf[len] = '\0';
addr[addr_len] = '\0';
/*
printf(
"Message of len %lu\n%s \nreceived from address %s of len %lu\n",
len, buf, addr, addr_len);
*/
ret = com_sendto(ep, (byte*)"Hello from server!\n", 18, "CLI", 3);
ut_a(ret == 0);
}
ret = com_endpoint_free(ep);
ut_ad(ret == 0);
printf("Count of extra system calls in com_shm %lu\n",
com_shm_system_call_count);
printf("Count of extra system calls in ip_mutex %lu\n",
ip_mutex_system_call_count);
}
void
main(void)
/*======*/
{
ulint tm, oldtm;
sync_init();
mem_init();
oldtm = ut_clock();
test1();
ut_ad(mem_all_freed());
tm = ut_clock();
printf("Wall clock time for test %lu milliseconds\n", tm - oldtm);
printf("TESTS COMPLETED SUCCESSFULLY!\n");
}

22
innobase/configure.in Normal file

@ -0,0 +1,22 @@
# Process this file with autoconf to produce a configure script
AC_INIT(./os/os0file.c)
AM_CONFIG_HEADER(ib_config.h)
AM_INIT_AUTOMAKE(ib, 0.90)
AC_PROG_CC
AC_PROG_RANLIB
AC_CHECK_HEADERS(aio.h)
AC_CHECK_SIZEOF(int, 4)
AC_C_INLINE
AC_C_BIGENDIAN
AC_OUTPUT(Makefile os/Makefile ut/Makefile btr/Makefile
buf/Makefile com/Makefile data/Makefile
dict/Makefile dyn/Makefile
eval/Makefile fil/Makefile fsp/Makefile fut/Makefile
ha/Makefile ibuf/Makefile lock/Makefile log/Makefile
mach/Makefile mem/Makefile mtr/Makefile odbc/Makefile
page/Makefile pars/Makefile que/Makefile
read/Makefile rem/Makefile row/Makefile
srv/Makefile sync/Makefile thr/Makefile trx/Makefile
usr/Makefile)

12
innobase/cry/makefilewin Normal file

@ -0,0 +1,12 @@
include ..\include\makefile.i
doall: cr.exe dcr.exe wro.exe
cr.exe: cry0cry.c
$(CCOM) $(CFLW) -o cr.exe -I.. cry0cry.c ..\ut.lib ..\os.lib
dcr.exe: cry0dcr.c
$(CCOM) $(CFLW) -o dcr.exe -I.. cry0dcr.c ..\ut.lib ..\os.lib
wro.exe: cry0wro.c
$(CCOM) $(CFLW) -o wro.exe -I.. cry0wro.c ..\ut.lib ..\os.lib

25
innobase/data/Makefile.am Normal file

@ -0,0 +1,25 @@
# Copyright (C) 2000 MySQL AB & MySQL Finland AB & TCX DataKonsult AB
# & Innobase Oy
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
include ../include/Makefile.i
libs_LIBRARIES = libdata.a
libdata_a_SOURCES = data0data.c data0type.c
EXTRA_PROGRAMS =

790
innobase/data/data0data.c Normal file

@ -0,0 +1,790 @@
/************************************************************************
SQL data field and tuple
(c) 1994-1996 Innobase Oy
Created 5/30/1994 Heikki Tuuri
*************************************************************************/
#include "data0data.h"
#ifdef UNIV_NONINL
#include "data0data.ic"
#endif
#include "ut0rnd.h"
byte data_error; /* data pointers of tuple fields are initialized
to point here for error checking */
ulint data_dummy; /* this is used to fool the compiler in
dtuple_validate */
byte data_buf[8192]; /* used in generating test tuples */
ulint data_rnd = 756511;
/* Some non-inlined functions used in the MySQL interface: */
void
dfield_set_data_noninline(
dfield_t* field, /* in: field */
void* data, /* in: data */
ulint len) /* in: length or UNIV_SQL_NULL */
{
dfield_set_data(field, data, len);
}
void*
dfield_get_data_noninline(
dfield_t* field) /* in: field */
{
return(dfield_get_data(field));
}
ulint
dfield_get_len_noninline(
dfield_t* field) /* in: field */
{
return(dfield_get_len(field));
}
ulint
dtuple_get_n_fields_noninline(
dtuple_t* tuple) /* in: tuple */
{
return(dtuple_get_n_fields(tuple));
}
dfield_t*
dtuple_get_nth_field_noninline(
dtuple_t* tuple, /* in: tuple */
ulint n) /* in: index of field */
{
return(dtuple_get_nth_field(tuple, n));
}
/*************************************************************************
Creates a dtuple for use in MySQL. */
dtuple_t*
dtuple_create_for_mysql(
/*====================*/
/* out, own: created dtuple */
void** heap, /* out: created memory heap */
ulint n_fields) /* in: number of fields */
{
*heap = (void*)mem_heap_create(500);
return(dtuple_create(*((mem_heap_t**)heap), n_fields));
}
/*************************************************************************
Frees a dtuple used in MySQL. */
void
dtuple_free_for_mysql(
/*==================*/
void* heap) /* in: memory heap where tuple was created */
{
mem_heap_free((mem_heap_t*)heap);
}
/*************************************************************************
Sets the number of fields used in a tuple. Normally this is set in
dtuple_create, but if you later want to make it smaller, you can use this. */
void
dtuple_set_n_fields(
/*================*/
dtuple_t* tuple, /* in: tuple */
ulint n_fields) /* in: number of fields */
{
ut_ad(tuple);
tuple->n_fields = n_fields;
tuple->n_fields_cmp = n_fields;
}
/**************************************************************
Checks that a data field is typed. Asserts an error if not. */
ibool
dfield_check_typed(
/*===============*/
/* out: TRUE if ok */
dfield_t* field) /* in: data field */
{
ut_a(dfield_get_type(field)->mtype <= DATA_SYS);
ut_a(dfield_get_type(field)->mtype >= DATA_VARCHAR);
return(TRUE);
}
/**************************************************************
Checks that a data tuple is typed. Asserts an error if not. */
ibool
dtuple_check_typed(
/*===============*/
/* out: TRUE if ok */
dtuple_t* tuple) /* in: tuple */
{
dfield_t* field;
ulint i;
for (i = 0; i < dtuple_get_n_fields(tuple); i++) {
field = dtuple_get_nth_field(tuple, i);
ut_a(dfield_check_typed(field));
}
return(TRUE);
}
/**************************************************************
Validates the consistency of a tuple which must be complete, i.e.,
all fields must have been set. */
ibool
dtuple_validate(
/*============*/
/* out: TRUE if ok */
dtuple_t* tuple) /* in: tuple */
{
dfield_t* field;
byte* data;
ulint n_fields;
ulint len;
ulint i;
ulint j;
ulint sum = 0; /* A dummy variable used
to prevent the compiler
from erasing the loop below */
ut_a(tuple->magic_n == DATA_TUPLE_MAGIC_N);
n_fields = dtuple_get_n_fields(tuple);
/* We dereference all the data of each field to test
for memory traps */
for (i = 0; i < n_fields; i++) {
field = dtuple_get_nth_field(tuple, i);
len = dfield_get_len(field);
if (len != UNIV_SQL_NULL) {
data = field->data;
for (j = 0; j < len; j++) {
data_dummy += *data; /* fool the compiler not
to optimize out this
code */
data++;
}
}
}
ut_a(dtuple_check_typed(tuple));
return(TRUE);
}
/*****************************************************************
Pretty prints a dfield value according to its data type. */
void
dfield_print(
/*=========*/
dfield_t* dfield) /* in: dfield */
{
byte* data;
ulint len;
ulint mtype;
ulint i;
len = dfield_get_len(dfield);
data = dfield_get_data(dfield);
if (len == UNIV_SQL_NULL) {
printf("NULL");
return;
}
mtype = dtype_get_mtype(dfield_get_type(dfield));
if ((mtype == DATA_CHAR) || (mtype == DATA_VARCHAR)) {
for (i = 0; i < len; i++) {
if (isprint((char)(*data))) {
printf("%c", (char)*data);
} else {
printf(" ");
}
data++;
}
} else if (mtype == DATA_INT) {
ut_a(len == 4); /* only works for 32-bit integers */
printf("%d", (int)mach_read_from_4(data));
} else {
ut_error;
}
}
/*****************************************************************
Pretty prints a dfield value according to its data type. Also the hex string
is printed if a string contains non-printable characters. */
void
dfield_print_also_hex(
/*==================*/
dfield_t* dfield) /* in: dfield */
{
byte* data;
ulint len;
ulint mtype;
ulint i;
ibool print_also_hex;
len = dfield_get_len(dfield);
data = dfield_get_data(dfield);
if (len == UNIV_SQL_NULL) {
printf("NULL");
return;
}
mtype = dtype_get_mtype(dfield_get_type(dfield));
if ((mtype == DATA_CHAR) || (mtype == DATA_VARCHAR)) {
print_also_hex = FALSE;
for (i = 0; i < len; i++) {
if (isprint((char)(*data))) {
printf("%c", (char)*data);
} else {
print_also_hex = TRUE;
printf(" ");
}
data++;
}
if (!print_also_hex) {
return;
}
printf(" Hex: ");
data = dfield_get_data(dfield);
for (i = 0; i < len; i++) {
printf("%02x", (unsigned int)*data);
data++;
}
} else if (mtype == DATA_INT) {
ut_a(len == 4); /* only works for 32-bit integers */
printf("%d", (int)mach_read_from_4(data));
} else {
ut_error;
}
}
/**************************************************************
The following function prints the contents of a tuple. */
void
dtuple_print(
/*=========*/
dtuple_t* tuple) /* in: tuple */
{
dfield_t* field;
ulint n_fields;
ulint i;
n_fields = dtuple_get_n_fields(tuple);
printf("DATA TUPLE: %lu fields;\n", n_fields);
for (i = 0; i < n_fields; i++) {
printf(" %lu:", i);
field = dtuple_get_nth_field(tuple, i);
if (field->len != UNIV_SQL_NULL) {
ut_print_buf(field->data, field->len);
} else {
printf(" SQL NULL");
}
printf(";");
}
printf("\n");
dtuple_validate(tuple);
}
/**************************************************************
The following function prints the contents of a tuple to a buffer. */
ulint
dtuple_sprintf(
/*===========*/
/* out: printed length in bytes */
char* buf, /* in: print buffer */
ulint buf_len,/* in: buf length in bytes */
dtuple_t* tuple) /* in: tuple */
{
dfield_t* field;
ulint n_fields;
ulint len;
ulint i;
len = 0;
n_fields = dtuple_get_n_fields(tuple);
for (i = 0; i < n_fields; i++) {
if (len + 30 > buf_len) {
return(len);
}
len += sprintf(buf + len, " %lu:", i);
field = dtuple_get_nth_field(tuple, i);
if (field->len != UNIV_SQL_NULL) {
if (5 * field->len + len + 30 > buf_len) {
return(len);
}
len += ut_sprintf_buf(buf + len, field->data,
field->len);
} else {
len += sprintf(buf + len, " SQL NULL");
}
len += sprintf(buf + len, ";");
}
return(len);
}
/******************************************************************
Generates a random number: with probability 10/16 it is uniformly
distributed in [0, n1), with probability 5/16 in [0, n2),
and with probability 1/16 in [0, n3). */
static
ulint
dtuple_gen_rnd_ulint(
/*=================*/
/* out: random ulint */
ulint n1,
ulint n2,
ulint n3)
{
ulint m;
ulint n;
m = ut_rnd_gen_ulint() % 16;
if (m < 10) {
n = n1;
} else if (m < 15) {
n = n2;
} else {
n = n3;
}
m = ut_rnd_gen_ulint();
return(m % n);
}
/***************************************************************
Generates a random tuple. */
dtuple_t*
dtuple_gen_rnd_tuple(
/*=================*/
/* out: pointer to the tuple */
mem_heap_t* heap) /* in: memory heap where generated */
{
ulint n_fields;
dfield_t* field;
ulint len;
dtuple_t* tuple;
ulint i;
ulint j;
byte* ptr;
n_fields = dtuple_gen_rnd_ulint(5, 30, 300) + 1;
tuple = dtuple_create(heap, n_fields);
for (i = 0; i < n_fields; i++) {
if (n_fields < 7) {
len = dtuple_gen_rnd_ulint(5, 30, 400);
} else {
len = dtuple_gen_rnd_ulint(7, 5, 17);
}
field = dtuple_get_nth_field(tuple, i);
if (len == 0) {
dfield_set_data(field, NULL, UNIV_SQL_NULL);
} else {
ptr = mem_heap_alloc(heap, len);
dfield_set_data(field, ptr, len - 1);
for (j = 0; j < len; j++) {
*ptr = (byte)(65 +
dtuple_gen_rnd_ulint(22, 22, 22));
ptr++;
}
}
dtype_set(dfield_get_type(field), DATA_VARCHAR,
DATA_ENGLISH, 500, 0);
}
ut_a(dtuple_validate(tuple));
return(tuple);
}
/*******************************************************************
Generates a test tuple for sort and comparison tests. */
void
dtuple_gen_test_tuple(
/*==================*/
dtuple_t* tuple, /* in/out: a tuple with 3 fields */
ulint i) /* in: a number < 512 */
{
ulint j;
dfield_t* field;
void* data = NULL;
ulint len = 0;
for (j = 0; j < 3; j++) {
switch (i % 8) {
case 0:
data = ""; len = 0; break;
case 1:
data = "A"; len = 1; break;
case 2:
data = "AA"; len = 2; break;
case 3:
data = "AB"; len = 2; break;
case 4:
data = "B"; len = 1; break;
case 5:
data = "BA"; len = 2; break;
case 6:
data = "BB"; len = 2; break;
case 7:
len = UNIV_SQL_NULL; break;
}
field = dtuple_get_nth_field(tuple, 2 - j);
dfield_set_data(field, data, len);
dtype_set(dfield_get_type(field), DATA_VARCHAR,
DATA_ENGLISH, 100, 0);
i = i / 8;
}
ut_ad(dtuple_validate(tuple));
}
/*******************************************************************
Generates a test tuple for B-tree speed tests. */
void
dtuple_gen_test_tuple3(
/*===================*/
dtuple_t* tuple, /* in/out: a tuple with >= 3 fields */
ulint i, /* in: a number < 1000000 */
ulint type, /* in: DTUPLE_TEST_FIXED30, ... */
byte* buf) /* in: a buffer of size >= 16 bytes */
{
dfield_t* field;
ulint third_size;
ut_ad(tuple && buf);
ut_ad(i < 1000000);
field = dtuple_get_nth_field(tuple, 0);
ut_strcpy((char*)buf, "0000000");
buf[1] = (byte)('0' + (i / 100000) % 10);
buf[2] = (byte)('0' + (i / 10000) % 10);
buf[3] = (byte)('0' + (i / 1000) % 10);
buf[4] = (byte)('0' + (i / 100) % 10);
buf[5] = (byte)('0' + (i / 10) % 10);
buf[6] = (byte)('0' + (i % 10));
dfield_set_data(field, buf, 8);
dtype_set(dfield_get_type(field), DATA_VARCHAR, DATA_ENGLISH, 100, 0);
field = dtuple_get_nth_field(tuple, 1);
i = i % 1000; /* ut_rnd_gen_ulint() % 1000000; */
ut_strcpy((char*)buf + 8, "0000000");
buf[9] = (byte)('0' + (i / 100000) % 10);
buf[10] = (byte)('0' + (i / 10000) % 10);
buf[11] = (byte)('0' + (i / 1000) % 10);
buf[12] = (byte)('0' + (i / 100) % 10);
buf[13] = (byte)('0' + (i / 10) % 10);
buf[14] = (byte)('0' + (i % 10));
dfield_set_data(field, buf + 8, 8);
dtype_set(dfield_get_type(field), DATA_VARCHAR, DATA_ENGLISH, 100, 0);
field = dtuple_get_nth_field(tuple, 2);
data_rnd += 8757651;
if (type == DTUPLE_TEST_FIXED30) {
third_size = 30;
} else if (type == DTUPLE_TEST_RND30) {
third_size = data_rnd % 30;
} else if (type == DTUPLE_TEST_RND3500) {
third_size = data_rnd % 3500;
} else if (type == DTUPLE_TEST_FIXED2000) {
third_size = 2000;
} else if (type == DTUPLE_TEST_FIXED3) {
third_size = 3;
} else {
ut_error;
}
if (type == DTUPLE_TEST_FIXED30) {
dfield_set_data(field,
"12345678901234567890123456789", third_size);
} else {
dfield_set_data(field, data_buf, third_size);
}
dtype_set(dfield_get_type(field), DATA_VARCHAR, DATA_ENGLISH, 100, 0);
ut_ad(dtuple_validate(tuple));
}
/*******************************************************************
Generates a search tuple for B-tree speed tests. */
void
dtuple_gen_search_tuple3(
/*=====================*/
dtuple_t* tuple, /* in/out: a tuple with 1 or 2 fields */
ulint i, /* in: a number < 1000000 */
byte* buf) /* in: a buffer of size >= 16 bytes */
{
dfield_t* field;
ut_ad(tuple && buf);
ut_ad(i < 1000000);
field = dtuple_get_nth_field(tuple, 0);
ut_strcpy((char*)buf, "0000000");
buf[1] = (byte)('0' + (i / 100000) % 10);
buf[2] = (byte)('0' + (i / 10000) % 10);
buf[3] = (byte)('0' + (i / 1000) % 10);
buf[4] = (byte)('0' + (i / 100) % 10);
buf[5] = (byte)('0' + (i / 10) % 10);
buf[6] = (byte)('0' + (i % 10));
dfield_set_data(field, buf, 8);
dtype_set(dfield_get_type(field), DATA_VARCHAR, DATA_ENGLISH, 100, 0);
if (dtuple_get_n_fields(tuple) == 1) {
return;
}
field = dtuple_get_nth_field(tuple, 1);
i = (i * 1000) % 1000000;
ut_strcpy((char*)buf + 8, "0000000");
buf[9] = (byte)('0' + (i / 100000) % 10);
buf[10] = (byte)('0' + (i / 10000) % 10);
buf[11] = (byte)('0' + (i / 1000) % 10);
buf[12] = (byte)('0' + (i / 100) % 10);
buf[13] = (byte)('0' + (i / 10) % 10);
buf[14] = (byte)('0' + (i % 10));
dfield_set_data(field, buf + 8, 8);
dtype_set(dfield_get_type(field), DATA_VARCHAR, DATA_ENGLISH, 100, 0);
ut_ad(dtuple_validate(tuple));
}
/*******************************************************************
Generates a test tuple for TPC-A speed test. */
void
dtuple_gen_test_tuple_TPC_A(
/*========================*/
dtuple_t* tuple, /* in/out: a tuple with >= 3 fields */
ulint i, /* in: a number < 10000 */
byte* buf) /* in: a buffer of size >= 16 bytes */
{
dfield_t* field;
ulint third_size;
ut_ad(tuple && buf);
ut_ad(i < 10000);
field = dtuple_get_nth_field(tuple, 0);
ut_strcpy((char*)buf, "0000");
buf[0] = (byte)('0' + (i / 1000) % 10);
buf[1] = (byte)('0' + (i / 100) % 10);
buf[2] = (byte)('0' + (i / 10) % 10);
buf[3] = (byte)('0' + (i % 10));
dfield_set_data(field, buf, 5);
dtype_set(dfield_get_type(field), DATA_VARCHAR, DATA_ENGLISH, 100, 0);
field = dtuple_get_nth_field(tuple, 1);
dfield_set_data(field, buf + 8, 5);
dtype_set(dfield_get_type(field), DATA_VARCHAR, DATA_ENGLISH, 100, 0);
field = dtuple_get_nth_field(tuple, 2);
third_size = 90;
dfield_set_data(field, data_buf, third_size);
dtype_set(dfield_get_type(field), DATA_VARCHAR, DATA_ENGLISH, 100, 0);
ut_ad(dtuple_validate(tuple));
}
/*******************************************************************
Generates a search tuple for TPC-A speed test. */
void
dtuple_gen_search_tuple_TPC_A(
/*==========================*/
dtuple_t* tuple, /* in/out: a tuple with 1 field */
ulint i, /* in: a number < 10000 */
byte* buf) /* in: a buffer of size >= 16 bytes */
{
dfield_t* field;
ut_ad(tuple && buf);
ut_ad(i < 10000);
field = dtuple_get_nth_field(tuple, 0);
ut_strcpy((char*)buf, "0000");
buf[0] = (byte)('0' + (i / 1000) % 10);
buf[1] = (byte)('0' + (i / 100) % 10);
buf[2] = (byte)('0' + (i / 10) % 10);
buf[3] = (byte)('0' + (i % 10));
dfield_set_data(field, buf, 5);
dtype_set(dfield_get_type(field), DATA_VARCHAR, DATA_ENGLISH, 100, 0);
ut_ad(dtuple_validate(tuple));
}
/*******************************************************************
Generates a test tuple for TPC-C speed test. */
void
dtuple_gen_test_tuple_TPC_C(
/*========================*/
dtuple_t* tuple, /* in/out: a tuple with >= 12 fields */
ulint i, /* in: a number < 100000 */
byte* buf) /* in: a buffer of size >= 16 bytes */
{
dfield_t* field;
ulint size;
ulint j;
ut_ad(tuple && buf);
ut_ad(i < 100000);
field = dtuple_get_nth_field(tuple, 0);
buf[0] = (byte)('0' + (i / 10000) % 10);
buf[1] = (byte)('0' + (i / 1000) % 10);
buf[2] = (byte)('0' + (i / 100) % 10);
buf[3] = (byte)('0' + (i / 10) % 10);
buf[4] = (byte)('0' + (i % 10));
dfield_set_data(field, buf, 5);
dtype_set(dfield_get_type(field), DATA_VARCHAR, DATA_ENGLISH, 100, 0);
field = dtuple_get_nth_field(tuple, 1);
dfield_set_data(field, buf, 5);
dtype_set(dfield_get_type(field), DATA_VARCHAR, DATA_ENGLISH, 100, 0);
for (j = 0; j < 10; j++) {
field = dtuple_get_nth_field(tuple, 2 + j);
size = 24;
dfield_set_data(field, data_buf, size);
dtype_set(dfield_get_type(field), DATA_VARCHAR, DATA_ENGLISH,
100, 0);
}
ut_ad(dtuple_validate(tuple));
}
/*******************************************************************
Generates a search tuple for TPC-C speed test. */
void
dtuple_gen_search_tuple_TPC_C(
/*==========================*/
dtuple_t* tuple, /* in/out: a tuple with 1 field */
ulint i, /* in: a number < 100000 */
byte* buf) /* in: a buffer of size >= 16 bytes */
{
dfield_t* field;
ut_ad(tuple && buf);
ut_ad(i < 100000);
field = dtuple_get_nth_field(tuple, 0);
buf[0] = (byte)('0' + (i / 10000) % 10);
buf[1] = (byte)('0' + (i / 1000) % 10);
buf[2] = (byte)('0' + (i / 100) % 10);
buf[3] = (byte)('0' + (i / 10) % 10);
buf[4] = (byte)('0' + (i % 10));
dfield_set_data(field, buf, 5);
dtype_set(dfield_get_type(field), DATA_VARCHAR, DATA_ENGLISH, 100, 0);
ut_ad(dtuple_validate(tuple));
}
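The generator functions above all build their keys the same way: the number i is written as a fixed-width, zero-padded string of decimal digits, so that byte-wise comparison of two keys agrees with numeric comparison of the numbers. A minimal sketch of that encoding, assuming a hypothetical helper name key_write_decimal that does not exist in the Innobase sources:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, for illustration only. Writes the n_digits least
significant decimal digits of i into buf, most significant digit first,
left-padded with '0', and appends a terminating NUL. buf must have room
for n_digits + 1 bytes. */
static void
key_write_decimal(unsigned char* buf, unsigned long i, size_t n_digits)
{
    size_t pos;

    assert(buf != NULL);

    /* Fill from the rightmost digit towards the left */
    for (pos = n_digits; pos > 0; pos--) {
        buf[pos - 1] = (unsigned char)('0' + i % 10);
        i /= 10;
    }

    buf[n_digits] = '\0';
}
```

With a width of 5, the value 42 becomes "00042", which sorts before "00100" under memcmp just as 42 sorts before 100.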

93
innobase/data/data0type.c Normal file

@ -0,0 +1,93 @@
/******************************************************
Data types
(c) 1996 Innobase Oy
Created 1/16/1996 Heikki Tuuri
*******************************************************/
#include "data0type.h"
#ifdef UNIV_NONINL
#include "data0type.ic"
#endif
dtype_t dtype_binary_val = {DATA_BINARY, 0, 0, 0};
dtype_t* dtype_binary = &dtype_binary_val;
/*************************************************************************
Validates a data type structure. */
ibool
dtype_validate(
/*===========*/
/* out: TRUE if ok */
dtype_t* type) /* in: type struct to validate */
{
ut_a(type);
ut_a((type->mtype >= DATA_VARCHAR) && (type->mtype <= DATA_SYS));
if (type->mtype == DATA_SYS) {
ut_a(type->prtype >= DATA_ROW_ID);
ut_a(type->prtype <= DATA_MIX_ID);
}
return(TRUE);
}
/*************************************************************************
Prints a data type structure. */
void
dtype_print(
/*========*/
dtype_t* type) /* in: type */
{
ulint mtype;
ulint prtype;
ut_a(type);
printf("DATA TYPE: ");
mtype = type->mtype;
prtype = type->prtype;
if (mtype == DATA_VARCHAR) {
printf("DATA_VARCHAR");
} else if (mtype == DATA_CHAR) {
printf("DATA_CHAR");
} else if (mtype == DATA_BINARY) {
printf("DATA_BINARY");
} else if (mtype == DATA_INT) {
printf("DATA_INT");
} else if (mtype == DATA_MYSQL) {
printf("DATA_MYSQL");
} else if (mtype == DATA_SYS) {
printf("DATA_SYS");
} else {
printf("unknown type %lu", mtype);
}
if ((type->mtype == DATA_SYS)
|| (type->mtype == DATA_VARCHAR)
|| (type->mtype == DATA_CHAR)) {
printf(" ");
if (prtype == DATA_ROW_ID) {
printf("DATA_ROW_ID");
} else if (prtype == DATA_ROLL_PTR) {
printf("DATA_ROLL_PTR");
} else if (prtype == DATA_MIX_ID) {
printf("DATA_MIX_ID");
} else if (prtype == DATA_ENGLISH) {
printf("DATA_ENGLISH");
} else if (prtype == DATA_FINNISH) {
printf("DATA_FINNISH");
} else {
printf("unknown prtype %lu", prtype);
}
}
printf("; len %lu prec %lu\n", type->len, type->prec);
}

11
innobase/data/makefilewin Normal file

@ -0,0 +1,11 @@
include ..\include\makefile.i
data.lib: data0type.obj data0data.obj
lib -out:..\libs\data.lib data0type.obj data0data.obj
data0type.obj: data0type.c
$(CCOM) $(CFL) -c data0type.c
data0data.obj: data0data.c
$(CCOM) $(CFL) -c data0data.c

44
innobase/db/db0err.h Normal file

@ -0,0 +1,44 @@
/******************************************************
Global error codes for the database
(c) 1996 Innobase Oy
Created 5/24/1996 Heikki Tuuri
*******************************************************/
#ifndef db0err_h
#define db0err_h
#define DB_SUCCESS 10
/* The following are error codes */
#define DB_ERROR 11
#define DB_OUT_OF_MEMORY 12
#define DB_OUT_OF_FILE_SPACE 13
#define DB_LOCK_WAIT 14
#define DB_DEADLOCK 15
#define DB_ROLLBACK 16
#define DB_DUPLICATE_KEY 17
#define DB_QUE_THR_SUSPENDED 18
#define DB_MISSING_HISTORY 19 /* required history data has been
deleted due to lack of space in
rollback segment */
#define DB_CLUSTER_NOT_FOUND 30
#define DB_TABLE_NOT_FOUND 31
#define DB_MUST_GET_MORE_FILE_SPACE 32 /* the database has to be stopped
and restarted with more file space */
#define DB_TABLE_IS_BEING_USED 33
#define DB_TOO_BIG_RECORD 34 /* a record in an index would become
bigger than 1/2 free space in a page
frame */
/* The following are partial failure codes */
#define DB_FAIL 1000
#define DB_OVERFLOW 1001
#define DB_UNDERFLOW 1002
#define DB_STRONG_FAIL 1003
#define DB_RECORD_NOT_FOUND 1500
#define DB_END_OF_INDEX 1501
#endif
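One consequence of the numbering above is that DB_SUCCESS is 10, not 0, so a caller cannot test a return value with `if (!err)`; it has to compare against the constant. A minimal sketch of how a caller might check and report these codes, assuming two hypothetical helpers (db_is_success and db_err_name) that are not part of the Innobase sources and covering only a subset of the codes:

```c
#include <assert.h>
#include <string.h>

/* Values copied from db0err.h above; only a subset is repeated here */
#define DB_SUCCESS          10
#define DB_ERROR            11
#define DB_LOCK_WAIT        14
#define DB_DEADLOCK         15
#define DB_DUPLICATE_KEY    17
#define DB_FAIL             1000
#define DB_RECORD_NOT_FOUND 1500
#define DB_END_OF_INDEX     1501

/* Hypothetical helper: TRUE (1) only for DB_SUCCESS. Note that 0 is not
a success value in this numbering. */
static int
db_is_success(unsigned long err)
{
    return(err == DB_SUCCESS);
}

/* Hypothetical helper: maps a code to its symbolic name for logging */
static const char*
db_err_name(unsigned long err)
{
    switch (err) {
    case DB_SUCCESS:          return("DB_SUCCESS");
    case DB_ERROR:            return("DB_ERROR");
    case DB_LOCK_WAIT:        return("DB_LOCK_WAIT");
    case DB_DEADLOCK:         return("DB_DEADLOCK");
    case DB_DUPLICATE_KEY:    return("DB_DUPLICATE_KEY");
    case DB_FAIL:             return("DB_FAIL");
    case DB_RECORD_NOT_FOUND: return("DB_RECORD_NOT_FOUND");
    case DB_END_OF_INDEX:     return("DB_END_OF_INDEX");
    default:                  return("unknown");
    }
}
```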

25
innobase/dict/Makefile.am Normal file

@ -0,0 +1,25 @@
# Copyright (C) 2000 MySQL AB & MySQL Finland AB & TCX DataKonsult AB
# & Innobase Oy
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
include ../include/Makefile.i
libs_LIBRARIES = libdict.a
libdict_a_SOURCES = dict0boot.c dict0crea.c dict0dict.c dict0load.c\
dict0mem.c
EXTRA_PROGRAMS =

361
innobase/dict/dict0boot.c Normal file

@ -0,0 +1,361 @@
/******************************************************
Data dictionary creation and booting
(c) 1996 Innobase Oy
Created 4/18/1996 Heikki Tuuri
*******************************************************/
#include "dict0boot.h"
#ifdef UNIV_NONINL
#include "dict0boot.ic"
#endif
#include "dict0crea.h"
#include "btr0btr.h"
#include "dict0load.h"
#include "trx0trx.h"
#include "srv0srv.h"
#include "ibuf0ibuf.h"
#include "buf0flu.h"
#include "log0recv.h"
#include "os0file.h"
/**************************************************************************
Writes the current value of the row id counter to the dictionary header file
page. */
void
dict_hdr_flush_row_id(void)
/*=======================*/
{
dict_hdr_t* dict_hdr;
dulint id;
mtr_t mtr;
ut_ad(mutex_own(&(dict_sys->mutex)));
id = dict_sys->row_id;
mtr_start(&mtr);
dict_hdr = dict_hdr_get(&mtr);
mlog_write_dulint(dict_hdr + DICT_HDR_ROW_ID, id, MLOG_8BYTES, &mtr);
mtr_commit(&mtr);
}
/*********************************************************************
Creates the file page for the dictionary header. This function is
called only at the database creation. */
static
ibool
dict_hdr_create(
/*============*/
/* out: TRUE if success */
mtr_t* mtr) /* in: mtr */
{
dict_hdr_t* dict_header;
ulint hdr_page_no;
ulint root_page_no;
page_t* page;
ut_ad(mtr);
/* Create the dictionary header file block in a new, allocated file
segment in the system tablespace */
page = fseg_create(DICT_HDR_SPACE, 0,
DICT_HDR + DICT_HDR_FSEG_HEADER, mtr);
hdr_page_no = buf_frame_get_page_no(page);
ut_a(DICT_HDR_PAGE_NO == hdr_page_no);
dict_header = dict_hdr_get(mtr);
/* Start counting row, table, index, and tree ids from
DICT_HDR_FIRST_ID */
mlog_write_dulint(dict_header + DICT_HDR_ROW_ID,
ut_dulint_create(0, DICT_HDR_FIRST_ID),
MLOG_8BYTES, mtr);
mlog_write_dulint(dict_header + DICT_HDR_TABLE_ID,
ut_dulint_create(0, DICT_HDR_FIRST_ID),
MLOG_8BYTES, mtr);
mlog_write_dulint(dict_header + DICT_HDR_INDEX_ID,
ut_dulint_create(0, DICT_HDR_FIRST_ID),
MLOG_8BYTES, mtr);
mlog_write_dulint(dict_header + DICT_HDR_MIX_ID,
ut_dulint_create(0, DICT_HDR_FIRST_ID),
MLOG_8BYTES, mtr);
/* Create the B-tree roots for the clustered indexes of the basic
system tables */
/*--------------------------*/
root_page_no = btr_create(DICT_CLUSTERED | DICT_UNIQUE,
DICT_HDR_SPACE, DICT_TABLES_ID, mtr);
if (root_page_no == FIL_NULL) {
return(FALSE);
}
mlog_write_ulint(dict_header + DICT_HDR_TABLES, root_page_no,
MLOG_4BYTES, mtr);
/*--------------------------*/
root_page_no = btr_create(DICT_UNIQUE, DICT_HDR_SPACE,
DICT_TABLE_IDS_ID, mtr);
if (root_page_no == FIL_NULL) {
return(FALSE);
}
mlog_write_ulint(dict_header + DICT_HDR_TABLE_IDS, root_page_no,
MLOG_4BYTES, mtr);
/*--------------------------*/
root_page_no = btr_create(DICT_CLUSTERED | DICT_UNIQUE,
DICT_HDR_SPACE, DICT_COLUMNS_ID, mtr);
if (root_page_no == FIL_NULL) {
return(FALSE);
}
mlog_write_ulint(dict_header + DICT_HDR_COLUMNS, root_page_no,
MLOG_4BYTES, mtr);
/*--------------------------*/
root_page_no = btr_create(DICT_CLUSTERED | DICT_UNIQUE,
DICT_HDR_SPACE, DICT_INDEXES_ID, mtr);
if (root_page_no == FIL_NULL) {
return(FALSE);
}
mlog_write_ulint(dict_header + DICT_HDR_INDEXES, root_page_no,
MLOG_4BYTES, mtr);
/*--------------------------*/
root_page_no = btr_create(DICT_CLUSTERED | DICT_UNIQUE,
DICT_HDR_SPACE, DICT_FIELDS_ID, mtr);
if (root_page_no == FIL_NULL) {
return(FALSE);
}
mlog_write_ulint(dict_header + DICT_HDR_FIELDS, root_page_no,
MLOG_4BYTES, mtr);
/*--------------------------*/
return(TRUE);
}
/*********************************************************************
Initializes the data dictionary memory structures when the database is
started. This function is also called when the data dictionary is created. */
void
dict_boot(void)
/*===========*/
{
dict_table_t* table;
dict_index_t* index;
dict_hdr_t* dict_hdr;
mtr_t mtr;
mtr_start(&mtr);
/* Create the hash tables etc. */
dict_init();
mutex_enter(&(dict_sys->mutex));
/* Get the dictionary header */
dict_hdr = dict_hdr_get(&mtr);
/* Because we write a new row id to the disk-based data structure
(dictionary header) only when it is divisible by
DICT_HDR_ROW_ID_WRITE_MARGIN, recovery will not restore the latest
value of the row id counter. Therefore we advance the counter at
database startup to avoid handing out overlapping values. Note that
when a user asks for a new row id for the first time after startup,
the counter is divisible by ..._MARGIN, so it is immediately written
to the disk-based header. */
dict_sys->row_id = ut_dulint_add(
ut_dulint_align_up(
mtr_read_dulint(dict_hdr + DICT_HDR_ROW_ID,
MLOG_8BYTES, &mtr),
DICT_HDR_ROW_ID_WRITE_MARGIN),
DICT_HDR_ROW_ID_WRITE_MARGIN);
/* Insert into the dictionary cache the descriptions of the basic
system tables */
/*-------------------------*/
table = dict_mem_table_create("SYS_TABLES", DICT_HDR_SPACE, 8);
dict_mem_table_add_col(table, "NAME", DATA_BINARY, 0, 0, 0);
dict_mem_table_add_col(table, "ID", DATA_BINARY, 0, 0, 0);
dict_mem_table_add_col(table, "N_COLS", DATA_INT, 0, 4, 0);
dict_mem_table_add_col(table, "TYPE", DATA_INT, 0, 4, 0);
dict_mem_table_add_col(table, "MIX_ID", DATA_BINARY, 0, 0, 0);
dict_mem_table_add_col(table, "MIX_LEN", DATA_INT, 0, 4, 0);
dict_mem_table_add_col(table, "CLUSTER_NAME", DATA_BINARY, 0, 0, 0);
dict_mem_table_add_col(table, "SPACE", DATA_INT, 0, 4, 0);
table->id = DICT_TABLES_ID;
dict_table_add_to_cache(table);
dict_sys->sys_tables = table;
index = dict_mem_index_create("SYS_TABLES", "CLUST_IND",
DICT_HDR_SPACE,
DICT_UNIQUE | DICT_CLUSTERED, 1);
dict_mem_index_add_field(index, "NAME", 0);
index->page_no = mtr_read_ulint(dict_hdr + DICT_HDR_TABLES,
MLOG_4BYTES, &mtr);
index->id = DICT_TABLES_ID;
ut_a(dict_index_add_to_cache(table, index));
/*-------------------------*/
index = dict_mem_index_create("SYS_TABLES", "ID_IND", DICT_HDR_SPACE,
DICT_UNIQUE, 1);
dict_mem_index_add_field(index, "ID", 0);
index->page_no = mtr_read_ulint(dict_hdr + DICT_HDR_TABLE_IDS,
MLOG_4BYTES, &mtr);
index->id = DICT_TABLE_IDS_ID;
ut_a(dict_index_add_to_cache(table, index));
/*-------------------------*/
table = dict_mem_table_create("SYS_COLUMNS", DICT_HDR_SPACE, 7);
dict_mem_table_add_col(table, "TABLE_ID", DATA_BINARY, 0, 0, 0);
dict_mem_table_add_col(table, "POS", DATA_INT, 0, 4, 0);
dict_mem_table_add_col(table, "NAME", DATA_BINARY, 0, 0, 0);
dict_mem_table_add_col(table, "MTYPE", DATA_INT, 0, 4, 0);
dict_mem_table_add_col(table, "PRTYPE", DATA_INT, 0, 4, 0);
dict_mem_table_add_col(table, "LEN", DATA_INT, 0, 4, 0);
dict_mem_table_add_col(table, "PREC", DATA_INT, 0, 4, 0);
table->id = DICT_COLUMNS_ID;
dict_table_add_to_cache(table);
dict_sys->sys_columns = table;
index = dict_mem_index_create("SYS_COLUMNS", "CLUST_IND",
DICT_HDR_SPACE,
DICT_UNIQUE | DICT_CLUSTERED, 2);
dict_mem_index_add_field(index, "TABLE_ID", 0);
dict_mem_index_add_field(index, "POS", 0);
index->page_no = mtr_read_ulint(dict_hdr + DICT_HDR_COLUMNS,
MLOG_4BYTES, &mtr);
index->id = DICT_COLUMNS_ID;
ut_a(dict_index_add_to_cache(table, index));
/*-------------------------*/
table = dict_mem_table_create("SYS_INDEXES", DICT_HDR_SPACE, 7);
dict_mem_table_add_col(table, "TABLE_ID", DATA_BINARY, 0, 0, 0);
dict_mem_table_add_col(table, "ID", DATA_BINARY, 0, 0, 0);
dict_mem_table_add_col(table, "NAME", DATA_BINARY, 0, 0, 0);
dict_mem_table_add_col(table, "N_FIELDS", DATA_INT, 0, 4, 0);
dict_mem_table_add_col(table, "TYPE", DATA_INT, 0, 4, 0);
dict_mem_table_add_col(table, "SPACE", DATA_INT, 0, 4, 0);
dict_mem_table_add_col(table, "PAGE_NO", DATA_INT, 0, 4, 0);
/* The '+ 2' below comes from the 2 system fields */
ut_ad(DICT_SYS_INDEXES_PAGE_NO_FIELD == 6 + 2);
ut_ad(DICT_SYS_INDEXES_SPACE_NO_FIELD == 5 + 2);
table->id = DICT_INDEXES_ID;
dict_table_add_to_cache(table);
dict_sys->sys_indexes = table;
index = dict_mem_index_create("SYS_INDEXES", "CLUST_IND",
DICT_HDR_SPACE,
DICT_UNIQUE | DICT_CLUSTERED, 2);
dict_mem_index_add_field(index, "TABLE_ID", 0);
dict_mem_index_add_field(index, "ID", 0);
index->page_no = mtr_read_ulint(dict_hdr + DICT_HDR_INDEXES,
MLOG_4BYTES, &mtr);
index->id = DICT_INDEXES_ID;
ut_a(dict_index_add_to_cache(table, index));
/*-------------------------*/
table = dict_mem_table_create("SYS_FIELDS", DICT_HDR_SPACE, 3);
dict_mem_table_add_col(table, "INDEX_ID", DATA_BINARY, 0, 0, 0);
dict_mem_table_add_col(table, "POS", DATA_INT, 0, 4, 0);
dict_mem_table_add_col(table, "COL_NAME", DATA_BINARY, 0, 0, 0);
table->id = DICT_FIELDS_ID;
dict_table_add_to_cache(table);
dict_sys->sys_fields = table;
index = dict_mem_index_create("SYS_FIELDS", "CLUST_IND",
DICT_HDR_SPACE,
DICT_UNIQUE | DICT_CLUSTERED, 2);
dict_mem_index_add_field(index, "INDEX_ID", 0);
dict_mem_index_add_field(index, "POS", 0);
index->page_no = mtr_read_ulint(dict_hdr + DICT_HDR_FIELDS,
MLOG_4BYTES, &mtr);
index->id = DICT_FIELDS_ID;
ut_a(dict_index_add_to_cache(table, index));
mtr_commit(&mtr);
/*-------------------------*/
/* Load definitions of other indexes on system tables */
dict_load_sys_table(dict_sys->sys_tables);
dict_load_sys_table(dict_sys->sys_columns);
dict_load_sys_table(dict_sys->sys_indexes);
dict_load_sys_table(dict_sys->sys_fields);
/* Initialize the insert buffer table and index for each tablespace */
ibuf_init_at_db_start();
mutex_exit(&(dict_sys->mutex));
}
/*********************************************************************
Inserts the rows describing the basic system tables into those tables
at database creation. */
static
void
dict_insert_initial_data(void)
/*==========================*/
{
/* Does nothing yet */
}
/*********************************************************************
Creates and initializes the data dictionary at the database creation. */
void
dict_create(void)
/*=============*/
{
mtr_t mtr;
mtr_start(&mtr);
dict_hdr_create(&mtr);
mtr_commit(&mtr);
dict_boot();
dict_insert_initial_data();
sync_order_checks_on = TRUE;
}
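The row id comment in dict_boot() describes a counter that is persisted only at multiples of a margin and is advanced past any possibly issued value at startup. The arithmetic can be sketched as below. This is a simulation under stated assumptions, not the Innobase implementation: plain 64-bit integers stand in for dulint, the margin of 256 is an assumed value, and the names row_id_sys_t, row_id_get_new, row_id_take and row_id_after_restart are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for illustration; the real constant is defined in a
header that is not shown here */
#define ROW_ID_WRITE_MARGIN 256

typedef struct {
    uint64_t counter; /* in-memory counter, lost in a crash */
    uint64_t on_disk; /* last value written to the header page */
} row_id_sys_t;

/* Rounds n up to the nearest multiple of margin */
static uint64_t
align_up(uint64_t n, uint64_t margin)
{
    return(((n + margin - 1) / margin) * margin);
}

/* Hands out one id; persists the counter only when it is divisible by
the margin, as the comment in dict_boot() describes */
static uint64_t
row_id_get_new(row_id_sys_t* sys)
{
    uint64_t id = sys->counter;

    if (id % ROW_ID_WRITE_MARGIN == 0) {
        sys->on_disk = id;
    }

    sys->counter++;

    return(id);
}

/* Hands out n ids and returns the last one */
static uint64_t
row_id_take(row_id_sys_t* sys, unsigned n)
{
    uint64_t last = 0;

    while (n-- > 0) {
        last = row_id_get_new(sys);
    }

    return(last);
}

/* Startup value computed from the persisted one: align up, then add one
more margin so that an already issued multiple is never reused */
static uint64_t
row_id_after_restart(uint64_t on_disk)
{
    return(align_up(on_disk, ROW_ID_WRITE_MARGIN) + ROW_ID_WRITE_MARGIN);
}
```

The extra margin matters when the persisted value is itself a multiple of the margin: that id has already been handed out, so aligning up alone would return it a second time.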

1031
innobase/dict/dict0crea.c Normal file

File diff suppressed because it is too large

1870
innobase/dict/dict0dict.c Normal file

File diff suppressed because it is too large

611
innobase/dict/dict0load.c Normal file

@ -0,0 +1,611 @@
/******************************************************
Loads database object definitions from the dictionary
tables into the memory cache
(c) 1996 Innobase Oy
Created 4/24/1996 Heikki Tuuri
*******************************************************/
#include "dict0load.h"
#ifdef UNIV_NONINL
#include "dict0load.ic"
#endif
#include "btr0pcur.h"
#include "btr0btr.h"
#include "page0page.h"
#include "mach0data.h"
#include "dict0dict.h"
#include "dict0boot.h"
/************************************************************************
Loads definitions for table columns. */
static
void
dict_load_columns(
/*==============*/
dict_table_t* table, /* in: table */
mem_heap_t* heap); /* in: memory heap for temporary storage */
/************************************************************************
Loads definitions for table indexes. */
static
void
dict_load_indexes(
/*==============*/
dict_table_t* table, /* in: table */
mem_heap_t* heap); /* in: memory heap for temporary storage */
/************************************************************************
Loads definitions for index fields. */
static
void
dict_load_fields(
/*=============*/
dict_table_t* table, /* in: table */
dict_index_t* index, /* in: index whose fields to load */
mem_heap_t* heap); /* in: memory heap for temporary storage */
/************************************************************************
Loads a table definition and also all its index definitions, and also
the cluster definition if the table is a member in a cluster. */
dict_table_t*
dict_load_table(
/*============*/
/* out: table, NULL if does not exist */
char* name) /* in: table name */
{
dict_table_t* table;
dict_table_t* sys_tables;
mtr_t mtr;
btr_pcur_t pcur;
dict_index_t* sys_index;
dtuple_t* tuple;
mem_heap_t* heap;
dfield_t* dfield;
rec_t* rec;
byte* field;
ulint len;
char* buf;
ulint space;
ulint n_cols;
ut_ad(mutex_own(&(dict_sys->mutex)));
heap = mem_heap_create(1000);
mtr_start(&mtr);
sys_tables = dict_table_get_low("SYS_TABLES");
sys_index = UT_LIST_GET_FIRST(sys_tables->indexes);
tuple = dtuple_create(heap, 1);
dfield = dtuple_get_nth_field(tuple, 0);
dfield_set_data(dfield, name, ut_strlen(name));
dict_index_copy_types(tuple, sys_index, 1);
btr_pcur_open_on_user_rec(sys_index, tuple, PAGE_CUR_GE,
BTR_SEARCH_LEAF, &pcur, &mtr);
rec = btr_pcur_get_rec(&pcur);
if (!btr_pcur_is_on_user_rec(&pcur, &mtr)
|| rec_get_deleted_flag(rec)) {
/* Not found */
btr_pcur_close(&pcur);
mtr_commit(&mtr);
mem_heap_free(heap);
return(NULL);
}
field = rec_get_nth_field(rec, 0, &len);
/* Check if the table name in record is the searched one */
if (len != ut_strlen(name) || ut_memcmp(name, field, len) != 0) {
btr_pcur_close(&pcur);
mtr_commit(&mtr);
mem_heap_free(heap);
return(NULL);
}
ut_a(0 == ut_strcmp("SPACE",
dict_field_get_col(
dict_index_get_nth_field(
dict_table_get_first_index(sys_tables), 9))->name));
field = rec_get_nth_field(rec, 9, &len);
space = mach_read_from_4(field);
ut_a(0 == ut_strcmp("N_COLS",
dict_field_get_col(
dict_index_get_nth_field(
dict_table_get_first_index(sys_tables), 4))->name));
field = rec_get_nth_field(rec, 4, &len);
n_cols = mach_read_from_4(field);
table = dict_mem_table_create(name, space, n_cols);
ut_a(0 == ut_strcmp("ID",
dict_field_get_col(
dict_index_get_nth_field(
dict_table_get_first_index(sys_tables), 3))->name));
field = rec_get_nth_field(rec, 3, &len);
table->id = mach_read_from_8(field);
field = rec_get_nth_field(rec, 5, &len);
table->type = mach_read_from_4(field);
if (table->type == DICT_TABLE_CLUSTER_MEMBER) {
ut_a(0);
field = rec_get_nth_field(rec, 6, &len);
table->mix_id = mach_read_from_8(field);
field = rec_get_nth_field(rec, 8, &len);
buf = mem_heap_alloc(heap, len);
ut_memcpy(buf, field, len);
table->cluster_name = buf;
}
if ((table->type == DICT_TABLE_CLUSTER)
|| (table->type == DICT_TABLE_CLUSTER_MEMBER)) {
field = rec_get_nth_field(rec, 7, &len);
table->mix_len = mach_read_from_4(field);
}
btr_pcur_close(&pcur);
mtr_commit(&mtr);
if (table->type == DICT_TABLE_CLUSTER_MEMBER) {
/* Load the cluster table definition if not yet in
memory cache */
dict_table_get_low(table->cluster_name);
}
dict_load_columns(table, heap);
dict_table_add_to_cache(table);
dict_load_indexes(table, heap);
mem_heap_free(heap);
return(table);
}
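dict_load_table() positions the cursor with PAGE_CUR_GE, that is, on the first record whose key is greater than or equal to the search key, and only then checks whether that record really is the one searched for by comparing length and bytes. The same two-step pattern can be sketched on a sorted array of names. The functions lower_bound and find_exact are hypothetical stand-ins, and a sorted array replaces the B-tree:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the index of the first element of the sorted array
names[0..n-1] that is >= key, or n if there is none. This plays the
role of a PAGE_CUR_GE cursor positioning. */
static size_t
lower_bound(const char* const* names, size_t n, const char* key)
{
    size_t lo = 0;
    size_t hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (strcmp(names[mid], key) < 0) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }

    return(lo);
}

/* Returns the stored name equal to key, or NULL if it does not exist */
static const char*
find_exact(const char* const* names, size_t n, const char* key)
{
    size_t pos = lower_bound(names, n, key);
    size_t len = strlen(key);

    if (pos == n) {
        /* The cursor ran past the last record: not found */
        return(NULL);
    }

    /* The cursor may rest on a larger key, so verify both the length
    and the bytes, as dict_load_table() does */
    if (strlen(names[pos]) != len || memcmp(names[pos], key, len) != 0) {
        return(NULL);
    }

    return(names[pos]);
}
```

The length check is what rejects a search for "SYS_FIELD" when the cursor lands on "SYS_FIELDS": the first nine bytes match, but the names differ.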
/************************************************************************
This function is called when the database is booted. Loads system table
index definitions except for the clustered index which is added to the
dictionary cache at booting before calling this function. */
void
dict_load_sys_table(
/*================*/
dict_table_t* table) /* in: system table */
{
mem_heap_t* heap;
ut_ad(mutex_own(&(dict_sys->mutex)));
heap = mem_heap_create(1000);
dict_load_indexes(table, heap);
mem_heap_free(heap);
}
/************************************************************************
Loads definitions for table columns. */
static
void
dict_load_columns(
/*==============*/
dict_table_t* table, /* in: table */
mem_heap_t* heap) /* in: memory heap for temporary storage */
{
dict_table_t* sys_columns;
dict_index_t* sys_index;
btr_pcur_t pcur;
dtuple_t* tuple;
dfield_t* dfield;
rec_t* rec;
byte* field;
ulint len;
byte* buf;
char* name_buf;
char* name;
ulint mtype;
ulint prtype;
ulint col_len;
ulint prec;
ulint i;
mtr_t mtr;
ut_ad(mutex_own(&(dict_sys->mutex)));
mtr_start(&mtr);
sys_columns = dict_table_get_low("SYS_COLUMNS");
sys_index = UT_LIST_GET_FIRST(sys_columns->indexes);
tuple = dtuple_create(heap, 1);
dfield = dtuple_get_nth_field(tuple, 0);
buf = mem_heap_alloc(heap, 8);
mach_write_to_8(buf, table->id);
dfield_set_data(dfield, buf, 8);
dict_index_copy_types(tuple, sys_index, 1);
btr_pcur_open_on_user_rec(sys_index, tuple, PAGE_CUR_GE,
BTR_SEARCH_LEAF, &pcur, &mtr);
for (i = 0; i < table->n_cols - DATA_N_SYS_COLS; i++) {
rec = btr_pcur_get_rec(&pcur);
ut_a(btr_pcur_is_on_user_rec(&pcur, &mtr));
ut_a(!rec_get_deleted_flag(rec));
field = rec_get_nth_field(rec, 0, &len);
ut_ad(len == 8);
ut_a(ut_dulint_cmp(table->id, mach_read_from_8(field)) == 0);
field = rec_get_nth_field(rec, 1, &len);
ut_ad(len == 4);
ut_a(i == mach_read_from_4(field));
ut_a(0 == ut_strcmp("NAME",
dict_field_get_col(
dict_index_get_nth_field(
dict_table_get_first_index(sys_columns), 4))->name));
field = rec_get_nth_field(rec, 4, &len);
name_buf = mem_heap_alloc(heap, len + 1);
ut_memcpy(name_buf, field, len);
name_buf[len] = '\0';
name = name_buf;
field = rec_get_nth_field(rec, 5, &len);
mtype = mach_read_from_4(field);
field = rec_get_nth_field(rec, 6, &len);
prtype = mach_read_from_4(field);
field = rec_get_nth_field(rec, 7, &len);
col_len = mach_read_from_4(field);
ut_a(0 == ut_strcmp("PREC",
dict_field_get_col(
dict_index_get_nth_field(
dict_table_get_first_index(sys_columns), 8))->name));
field = rec_get_nth_field(rec, 8, &len);
prec = mach_read_from_4(field);
dict_mem_table_add_col(table, name, mtype, prtype, col_len,
prec);
btr_pcur_move_to_next_user_rec(&pcur, &mtr);
}
btr_pcur_close(&pcur);
mtr_commit(&mtr);
}
/************************************************************************
Loads definitions for table indexes. */
static
void
dict_load_indexes(
/*==============*/
dict_table_t* table, /* in: table */
mem_heap_t* heap) /* in: memory heap for temporary storage */
{
dict_table_t* sys_indexes;
dict_index_t* sys_index;
dict_index_t* index;
btr_pcur_t pcur;
dtuple_t* tuple;
dfield_t* dfield;
rec_t* rec;
byte* field;
ulint len;
ulint name_len;
char* name_buf;
ulint type;
ulint space;
ulint page_no;
ulint n_fields;
byte* buf;
ibool is_sys_table;
dulint id;
mtr_t mtr;
ut_ad(mutex_own(&(dict_sys->mutex)));
if ((ut_dulint_get_high(table->id) == 0)
&& (ut_dulint_get_low(table->id) < DICT_HDR_FIRST_ID)) {
is_sys_table = TRUE;
} else {
is_sys_table = FALSE;
}
mtr_start(&mtr);
sys_indexes = dict_table_get_low("SYS_INDEXES");
sys_index = UT_LIST_GET_FIRST(sys_indexes->indexes);
tuple = dtuple_create(heap, 1);
dfield = dtuple_get_nth_field(tuple, 0);
buf = mem_heap_alloc(heap, 8);
mach_write_to_8(buf, table->id);
dfield_set_data(dfield, buf, 8);
dict_index_copy_types(tuple, sys_index, 1);
btr_pcur_open_on_user_rec(sys_index, tuple, PAGE_CUR_GE,
BTR_SEARCH_LEAF, &pcur, &mtr);
for (;;) {
if (!btr_pcur_is_on_user_rec(&pcur, &mtr)) {
break;
}
rec = btr_pcur_get_rec(&pcur);
field = rec_get_nth_field(rec, 0, &len);
ut_ad(len == 8);
if (ut_memcmp(buf, field, len) != 0) {
break;
}
ut_a(!rec_get_deleted_flag(rec));
field = rec_get_nth_field(rec, 1, &len);
ut_ad(len == 8);
id = mach_read_from_8(field);
ut_a(0 == ut_strcmp("NAME",
dict_field_get_col(
dict_index_get_nth_field(
dict_table_get_first_index(sys_indexes), 4))->name));
field = rec_get_nth_field(rec, 4, &name_len);
name_buf = mem_heap_alloc(heap, name_len + 1);
ut_memcpy(name_buf, field, name_len);
name_buf[name_len] = '\0';
field = rec_get_nth_field(rec, 5, &len);
n_fields = mach_read_from_4(field);
field = rec_get_nth_field(rec, 6, &len);
type = mach_read_from_4(field);
field = rec_get_nth_field(rec, 7, &len);
space = mach_read_from_4(field);
ut_a(0 == ut_strcmp("PAGE_NO",
dict_field_get_col(
dict_index_get_nth_field(
dict_table_get_first_index(sys_indexes), 8))->name));
field = rec_get_nth_field(rec, 8, &len);
page_no = mach_read_from_4(field);
if (is_sys_table
&& ((type & DICT_CLUSTERED)
|| ((table == dict_sys->sys_tables)
&& (name_len == ut_strlen("ID_IND"))
&& (0 == ut_memcmp(name_buf, "ID_IND",
name_len))))) {
/* The index was created in memory already in
booting */
} else {
index = dict_mem_index_create(table->name, name_buf,
space, type, n_fields);
index->page_no = page_no;
index->id = id;
dict_load_fields(table, index, heap);
dict_index_add_to_cache(table, index);
}
btr_pcur_move_to_next_user_rec(&pcur, &mtr);
}
btr_pcur_close(&pcur);
mtr_commit(&mtr);
}
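dict_load_indexes() writes the table id into an 8-byte buffer and stops scanning as soon as ut_memcmp() reports that a record's first field differs from that buffer. For this to walk the index in id order, the byte encoding must sort the same way as the numbers do. The sketch below assumes that mach_write_to_8 stores the most significant byte first (big-endian); that assumption is not confirmed by the code shown here, and write_be8, read_be8 and id_matches are hypothetical names using a plain 64-bit integer in place of dulint.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stores n in 8 bytes, most significant byte at the lowest address */
static void
write_be8(unsigned char* b, uint64_t n)
{
    int k;

    for (k = 7; k >= 0; k--) {
        b[k] = (unsigned char)(n & 0xFF);
        n >>= 8;
    }
}

/* Reads back a value stored by write_be8 */
static uint64_t
read_be8(const unsigned char* b)
{
    uint64_t n = 0;
    int k;

    for (k = 0; k < 8; k++) {
        n = (n << 8) | b[k];
    }

    return(n);
}

/* TRUE (1) if the 8-byte record field holds the given id; this mirrors
the byte-wise comparison used to detect the end of a table's records */
static int
id_matches(const unsigned char* rec_field, uint64_t id)
{
    unsigned char buf[8];

    write_be8(buf, id);

    return(memcmp(buf, rec_field, 8) == 0);
}
```

With this layout, memcmp on the encoded bytes orders 255 before 256, which a little-endian layout would not.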
/************************************************************************
Loads definitions for index fields. */
static
void
dict_load_fields(
/*=============*/
dict_table_t* table, /* in: table */
dict_index_t* index, /* in: index whose fields to load */
mem_heap_t* heap) /* in: memory heap for temporary storage */
{
dict_table_t* sys_fields;
dict_index_t* sys_index;
mtr_t mtr;
btr_pcur_t pcur;
dtuple_t* tuple;
dfield_t* dfield;
char* col_name;
rec_t* rec;
byte* field;
ulint len;
byte* buf;
ulint i;
ut_ad(mutex_own(&(dict_sys->mutex)));
UT_NOT_USED(table);
mtr_start(&mtr);
sys_fields = dict_table_get_low("SYS_FIELDS");
sys_index = UT_LIST_GET_FIRST(sys_fields->indexes);
tuple = dtuple_create(heap, 1);
dfield = dtuple_get_nth_field(tuple, 0);
buf = mem_heap_alloc(heap, 8);
mach_write_to_8(buf, index->id);
dfield_set_data(dfield, buf, 8);
dict_index_copy_types(tuple, sys_index, 1);
btr_pcur_open_on_user_rec(sys_index, tuple, PAGE_CUR_GE,
BTR_SEARCH_LEAF, &pcur, &mtr);
for (i = 0; i < index->n_fields; i++) {
rec = btr_pcur_get_rec(&pcur);
ut_a(btr_pcur_is_on_user_rec(&pcur, &mtr));
ut_a(!rec_get_deleted_flag(rec));
field = rec_get_nth_field(rec, 0, &len);
ut_ad(len == 8);
ut_a(ut_memcmp(buf, field, len) == 0);
field = rec_get_nth_field(rec, 1, &len);
ut_ad(len == 4);
ut_a(i == mach_read_from_4(field));
ut_a(0 == ut_strcmp("COL_NAME",
dict_field_get_col(
dict_index_get_nth_field(
dict_table_get_first_index(sys_fields), 4))->name));
field = rec_get_nth_field(rec, 4, &len);
col_name = mem_heap_alloc(heap, len + 1);
ut_memcpy(col_name, field, len);
col_name[len] = '\0';
dict_mem_index_add_field(index, col_name, 0);
btr_pcur_move_to_next_user_rec(&pcur, &mtr);
}
btr_pcur_close(&pcur);
mtr_commit(&mtr);
}
/***************************************************************************
Loads a table object based on the table id. */
dict_table_t*
dict_load_table_on_id(
/*==================*/
/* out: table; NULL if table does not exist */
dulint table_id) /* in: table id */
{
mtr_t mtr;
byte id_buf[8];
btr_pcur_t pcur;
mem_heap_t* heap;
dtuple_t* tuple;
dfield_t* dfield;
dict_index_t* sys_table_ids;
dict_table_t* sys_tables;
rec_t* rec;
byte* field;
ulint len;
dict_table_t* table;
char* name;
ut_ad(mutex_own(&(dict_sys->mutex)));
/* NOTE that the operation of this function is protected by
the dictionary mutex, and therefore no deadlocks can occur
with other dictionary operations. */
mtr_start(&mtr);
/*---------------------------------------------------*/
/* Get the secondary index based on ID for table SYS_TABLES */
sys_tables = dict_sys->sys_tables;
sys_table_ids = dict_table_get_next_index(
dict_table_get_first_index(sys_tables));
heap = mem_heap_create(256);
tuple = dtuple_create(heap, 1);
dfield = dtuple_get_nth_field(tuple, 0);
/* Write the table id in byte format to id_buf */
mach_write_to_8(id_buf, table_id);
dfield_set_data(dfield, id_buf, 8);
dict_index_copy_types(tuple, sys_table_ids, 1);
btr_pcur_open_on_user_rec(sys_table_ids, tuple, PAGE_CUR_GE,
BTR_SEARCH_LEAF, &pcur, &mtr);
rec = btr_pcur_get_rec(&pcur);
if (!btr_pcur_is_on_user_rec(&pcur, &mtr)
|| rec_get_deleted_flag(rec)) {
/* Not found */
btr_pcur_close(&pcur);
mtr_commit(&mtr);
mem_heap_free(heap);
return(NULL);
}
/*---------------------------------------------------*/
/* Now we have the record in the secondary index containing the
table ID and NAME */
rec = btr_pcur_get_rec(&pcur);
field = rec_get_nth_field(rec, 0, &len);
ut_ad(len == 8);
/* Check if the table id in record is the one searched for */
if (ut_dulint_cmp(table_id, mach_read_from_8(field)) != 0) {
btr_pcur_close(&pcur);
mtr_commit(&mtr);
mem_heap_free(heap);
return(NULL);
}
/* Now we get the table name from the record */
field = rec_get_nth_field(rec, 1, &len);
name = mem_heap_alloc(heap, len + 1);
ut_memcpy(name, field, len);
name[len] = '\0';
/* Load the table definition to memory */
table = dict_load_table(name);
ut_a(table);
btr_pcur_close(&pcur);
mtr_commit(&mtr);
mem_heap_free(heap);
return(table);
}

291
innobase/dict/dict0mem.c Normal file

@ -0,0 +1,291 @@
/**********************************************************************
Data dictionary memory object creation
(c) 1996 Innobase Oy
Created 1/8/1996 Heikki Tuuri
***********************************************************************/
#include "dict0mem.h"
#ifdef UNIV_NONINL
#include "dict0mem.ic"
#endif
#include "rem0rec.h"
#include "data0type.h"
#include "mach0data.h"
#include "dict0dict.h"
#include "que0que.h"
#include "pars0pars.h"
#define DICT_HEAP_SIZE 100 /* initial memory heap size when
creating a table or index object */
/**************************************************************************
Creates a table memory object. */
dict_table_t*
dict_mem_table_create(
/*==================*/
/* out, own: table object */
char* name, /* in: table name */
ulint space, /* in: space where the clustered index of
the table is placed; this parameter is
ignored if the table is made a member of
a cluster */
ulint n_cols) /* in: number of columns */
{
dict_table_t* table;
char* str;
mem_heap_t* heap;
ut_ad(name);
heap = mem_heap_create(DICT_HEAP_SIZE);
table = mem_heap_alloc(heap, sizeof(dict_table_t));
table->heap = heap;
str = mem_heap_alloc(heap, 1 + ut_strlen(name));
ut_strcpy(str, name);
table->type = DICT_TABLE_ORDINARY;
table->name = str;
table->space = space;
table->n_def = 0;
table->n_cols = n_cols + DATA_N_SYS_COLS;
table->mem_fix = 0;
table->cached = FALSE;
table->cols = mem_heap_alloc(heap, (n_cols + DATA_N_SYS_COLS)
* sizeof(dict_col_t));
UT_LIST_INIT(table->indexes);
UT_LIST_INIT(table->locks);
table->does_not_fit_in_memory = FALSE;
table->stat_last_estimate_counter = (ulint)(-1);
table->stat_modif_counter = 0;
table->magic_n = DICT_TABLE_MAGIC_N;
return(table);
}
/**************************************************************************
Creates a cluster memory object. */
dict_table_t*
dict_mem_cluster_create(
/*====================*/
/* out, own: cluster object */
char* name, /* in: cluster name */
ulint space, /* in: space where the clustered indexes
of the member tables are placed */
ulint n_cols, /* in: number of columns */
ulint mix_len) /* in: length of the common key prefix in the
cluster */
{
dict_table_t* cluster;
cluster = dict_mem_table_create(name, space, n_cols);
cluster->type = DICT_TABLE_CLUSTER;
cluster->mix_len = mix_len;
return(cluster);
}
/**************************************************************************
Declares a non-published table as a member in a cluster. */
void
dict_mem_table_make_cluster_member(
/*===============================*/
dict_table_t* table, /* in: non-published table */
char* cluster_name) /* in: cluster name */
{
table->type = DICT_TABLE_CLUSTER_MEMBER;
table->cluster_name = cluster_name;
}
/**************************************************************************
Adds a column definition to a table. */
void
dict_mem_table_add_col(
/*===================*/
dict_table_t* table, /* in: table */
char* name, /* in: column name */
ulint mtype, /* in: main datatype */
ulint prtype, /* in: precise type */
ulint len, /* in: length */
ulint prec) /* in: precision */
{
char* str;
dict_col_t* col;
dtype_t* type;
ut_ad(table && name);
ut_ad(table->magic_n == DICT_TABLE_MAGIC_N);
table->n_def++;
col = dict_table_get_nth_col(table, table->n_def - 1);
str = mem_heap_alloc(table->heap, 1 + ut_strlen(name));
ut_strcpy(str, name);
col->ind = table->n_def - 1;
col->name = str;
col->table = table;
col->ord_part = 0;
col->clust_pos = ULINT_UNDEFINED;
type = dict_col_get_type(col);
dtype_set(type, mtype, prtype, len, prec);
}
/**************************************************************************
Creates an index memory object. */
dict_index_t*
dict_mem_index_create(
/*==================*/
/* out, own: index object */
char* table_name, /* in: table name */
char* index_name, /* in: index name */
ulint space, /* in: space where the index tree is placed,
ignored if the index is of the clustered
type */
ulint type, /* in: DICT_UNIQUE, DICT_CLUSTERED, ... ORed */
ulint n_fields) /* in: number of fields */
{
char* str;
dict_index_t* index;
mem_heap_t* heap;
ut_ad(table_name && index_name);
heap = mem_heap_create(DICT_HEAP_SIZE);
index = mem_heap_alloc(heap, sizeof(dict_index_t));
index->heap = heap;
str = mem_heap_alloc(heap, 1 + ut_strlen(index_name));
ut_strcpy(str, index_name);
index->type = type;
index->space = space;
index->name = str;
index->table_name = table_name;
index->table = NULL;
index->n_def = 0;
index->n_fields = n_fields;
index->fields = mem_heap_alloc(heap, 1 + n_fields
* sizeof(dict_field_t));
/* The '1 +' above prevents allocation
of an empty mem block */
index->cached = FALSE;
index->magic_n = DICT_INDEX_MAGIC_N;
return(index);
}
/**************************************************************************
Adds a field definition to an index. NOTE: does not take a copy
of the column name if the field is a column. The memory occupied
by the column name may be released only after publishing the index. */
void
dict_mem_index_add_field(
/*=====================*/
dict_index_t* index, /* in: index */
char* name, /* in: column name */
ulint order) /* in: order criterion; 0 means an ascending
order */
{
dict_field_t* field;
ut_ad(index && name);
ut_ad(index->magic_n == DICT_INDEX_MAGIC_N);
index->n_def++;
field = dict_index_get_nth_field(index, index->n_def - 1);
field->name = name;
field->order = order;
}
/**************************************************************************
Frees an index memory object. */
void
dict_mem_index_free(
/*================*/
dict_index_t* index) /* in: index */
{
mem_heap_free(index->heap);
}
/**************************************************************************
Creates a procedure memory object. */
dict_proc_t*
dict_mem_procedure_create(
/*======================*/
/* out, own: procedure object */
char* name, /* in: procedure name */
char* sql_string, /* in: procedure definition as an SQL
string */
que_fork_t* graph) /* in: parsed procedure graph */
{
dict_proc_t* proc;
proc_node_t* proc_node;
mem_heap_t* heap;
char* str;
ut_ad(name);
heap = mem_heap_create(128);
proc = mem_heap_alloc(heap, sizeof(dict_proc_t));
proc->heap = heap;
str = mem_heap_alloc(heap, 1 + ut_strlen(name));
ut_strcpy(str, name);
proc->name = str;
str = mem_heap_alloc(heap, 1 + ut_strlen(sql_string));
ut_strcpy(str, sql_string);
proc->sql_string = str;
UT_LIST_INIT(proc->graphs);
/* UT_LIST_ADD_LAST(graphs, proc->graphs, graph); */
#ifdef UNIV_DEBUG
UT_LIST_VALIDATE(graphs, que_t, proc->graphs);
#endif
proc->mem_fix = 0;
proc_node = que_fork_get_child(graph);
proc_node->dict_proc = proc;
return(proc);
}
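
The creation functions in dict0mem.c share one ownership pattern: each object gets its own memory heap, the object and its private string copies are allocated from that heap, and the heap pointer is stored in the object so that freeing the heap (as dict_mem_index_free does) releases everything at once. The following is a minimal standalone sketch of that pattern. The names arena_t, arena_create, arena_alloc, arena_free and index_obj_t are hypothetical and only illustrate the idea; they are not the mem0mem interface, and the real heap is more elaborate than a chain of malloc'ed chunks.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical chunk header; the union members only force an alignment
that is suitable for any object placed right after the header. */
typedef union arena_chunk {
	union arena_chunk*	next;
	long double		align_ld;
	long long		align_ll;
} arena_chunk_t;

/* Hypothetical stand-in for a memory heap: all allocations are chained
together and released in one call. */
typedef struct {
	arena_chunk_t*	chunks;		/* every allocation made so far */
	size_t		n_allocs;	/* number of allocations */
} arena_t;

arena_t* arena_create(void)
{
	arena_t*	a = malloc(sizeof(arena_t));

	if (a == NULL) {
		return(NULL);
	}
	a->chunks = NULL;
	a->n_allocs = 0;
	return(a);
}

void* arena_alloc(arena_t* a, size_t size)
{
	arena_chunk_t*	c = malloc(sizeof(arena_chunk_t) + size);

	if (c == NULL) {
		return(NULL);
	}
	c->next = a->chunks;
	a->chunks = c;
	a->n_allocs++;
	return((void*)(c + 1));	/* payload starts after the header */
}

void arena_free(arena_t* a)
{
	arena_chunk_t*	c = a->chunks;

	while (c != NULL) {
		arena_chunk_t*	next = c->next;

		free(c);
		c = next;
	}
	free(a);
}

/* An object that lives inside the arena it owns */
typedef struct {
	arena_t*	heap;		/* arena owning this object */
	char*		name;		/* private copy, from heap */
	size_t		n_fields;
} index_obj_t;

index_obj_t* index_obj_create(const char* name, size_t n_fields)
{
	arena_t*	heap = arena_create();
	index_obj_t*	index;

	if (heap == NULL) {
		return(NULL);
	}
	index = arena_alloc(heap, sizeof(index_obj_t));
	if (index == NULL) {
		arena_free(heap);
		return(NULL);
	}
	index->heap = heap;
	index->name = arena_alloc(heap, strlen(name) + 1);
	if (index->name == NULL) {
		arena_free(heap);
		return(NULL);
	}
	strcpy(index->name, name);
	index->n_fields = n_fields;
	return(index);
}

void index_obj_free(index_obj_t* index)
{
	/* Frees the name copy and the object itself in one go */
	arena_free(index->heap);
}
```

The design choice this illustrates is that the caller never has to track individual allocations: a single free call on the owning heap is enough, which is why dict_mem_index_free is a one-liner.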

21
innobase/dict/makefilewin Normal file

@ -0,0 +1,21 @@
include ..\include\makefile.i
dict.lib: dict0dict.obj dict0boot.obj dict0load.obj dict0mem.obj dict0crea.obj
lib -out:..\libs\dict.lib dict0dict.obj dict0boot.obj dict0load.obj dict0mem.obj dict0crea.obj
dict0dict.obj: dict0dict.c
$(CCOM) $(CFL) -c dict0dict.c
dict0boot.obj: dict0boot.c
$(CCOM) $(CFL) -c dict0boot.c
dict0mem.obj: dict0mem.c
$(CCOM) $(CFL) -c dict0mem.c
dict0crea.obj: dict0crea.c
$(CCOM) $(CFL) -c dict0crea.c
dict0load.obj: dict0load.c
$(CCOM) $(CFL) -c dict0load.c

16
innobase/dict/ts/makefile Normal file

@ -0,0 +1,16 @@
include ..\..\makefile.i
tsdict: ..\dict.lib tsdict.c
$(CCOM) $(CFL) -I.. -I..\.. ..\dict.lib ..\..\data.lib ..\..\buf.lib ..\..\mach.lib ..\..\fil.lib ..\..\ha.lib ..\..\ut.lib ..\..\sync.lib ..\..\mem.lib ..\..\os.lib tsdict.c $(LFL)

73
innobase/dict/ts/tsdict.c Normal file

@ -0,0 +1,73 @@
/************************************************************************
The test module for the data dictionary
(c) 1996 Innobase Oy
Created 1/13/1996 Heikki Tuuri
*************************************************************************/
#include "sync0sync.h"
#include "mem0mem.h"
#include "buf0buf.h"
#include "data0type.h"
#include "..\dict0dict.h"
/************************************************************************
Basic test of data dictionary. */
void
test1(void)
/*=======*/
{
dict_table_t* table;
dict_index_t* index;
table = dict_table_create("TS_TABLE1", 3);
dict_table_add_col(table, "COL1", DATA_INT, 3, 4, 5);
dict_table_add_col(table, "COL2", DATA_INT, 3, 4, 5);
dict_table_add_col(table, "COL3", DATA_INT, 3, 4, 5);
ut_a(0 == dict_table_publish(table));
index = dict_index_create("TS_TABLE1", "IND1",
DICT_UNIQUE | DICT_CLUSTERED | DICT_MIX, 2, 1);
dict_index_add_field(index, "COL2", DICT_DESCEND);
dict_index_add_field(index, "COL1", 0);
ut_a(0 == dict_index_publish(index));
dict_table_print(table);
dict_table_free(table);
ut_a(dict_all_freed());
dict_free_all();
ut_a(dict_all_freed());
}
/************************************************************************
Main test function. */
void
main(void)
/*======*/
{
ulint tm, oldtm;
oldtm = ut_clock();
sync_init();
mem_init();
buf_pool_init(100, 100);
dict_init();
test1();
tm = ut_clock();
printf("Wall clock time for test %lu milliseconds\n", tm - oldtm);
printf("TESTS COMPLETED SUCCESSFULLY!\n");
}

24
innobase/dyn/Makefile.am Normal file

@ -0,0 +1,24 @@
# Copyright (C) 2000 MySQL AB & MySQL Finland AB & TCX DataKonsult AB
# & Innobase Oy
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
include ../include/Makefile.i
libs_LIBRARIES = libdyn.a
libdyn_a_SOURCES = dyn0dyn.c
EXTRA_PROGRAMS =

48
innobase/dyn/dyn0dyn.c Normal file

@ -0,0 +1,48 @@
/******************************************************
The dynamically allocated array
(c) 1996 Innobase Oy
Created 2/5/1996 Heikki Tuuri
*******************************************************/
#include "dyn0dyn.h"
#ifdef UNIV_NONINL
#include "dyn0dyn.ic"
#endif
/****************************************************************
Adds a new block to a dyn array. */
dyn_block_t*
dyn_array_add_block(
/*================*/
/* out: created block */
dyn_array_t* arr) /* in: dyn array */
{
mem_heap_t* heap;
dyn_block_t* block;
ut_ad(arr);
ut_ad(arr->magic_n == DYN_BLOCK_MAGIC_N);
if (arr->heap == NULL) {
UT_LIST_INIT(arr->base);
UT_LIST_ADD_FIRST(list, arr->base, arr);
arr->heap = mem_heap_create(sizeof(dyn_block_t));
}
block = dyn_array_get_last_block(arr);
block->used = block->used | DYN_BLOCK_FULL_FLAG;
heap = arr->heap;
block = mem_heap_alloc(heap, sizeof(dyn_block_t));
block->used = 0;
UT_LIST_ADD_LAST(list, arr->base, block);
return(block);
}
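
dyn_array_add_block marks the current last block as full and chains a fresh block onto the end of the list, so data already stored in earlier blocks is never moved. A minimal standalone sketch of such a block-chained array follows. It is an illustration under simplifying assumptions, not the dyn0dyn implementation: the names chain_arr_t, blk_t, chain_arr_push and the fixed BLOCK_CAP are hypothetical, every block is malloc'ed separately, and the first block is not embedded in the array struct as it appears to be in the original.

```c
#include <assert.h>
#include <stdlib.h>

#define BLOCK_CAP	4	/* elements per block; kept small on
				purpose so that chaining is exercised */

/* One fixed-capacity block in the chain (hypothetical layout) */
typedef struct blk {
	struct blk*	next;
	size_t		used;		/* slots in use in this block */
	unsigned long	data[BLOCK_CAP];
} blk_t;

typedef struct {
	blk_t*	first;
	blk_t*	last;
	size_t	n;			/* total number of elements */
} chain_arr_t;

void chain_arr_init(chain_arr_t* a)
{
	a->first = NULL;
	a->last = NULL;
	a->n = 0;
}

/* Reserves one slot at the end and returns a pointer to it, or NULL if
memory runs out. Existing elements are never moved. */
unsigned long* chain_arr_push(chain_arr_t* a)
{
	blk_t*	b = a->last;

	if (b == NULL || b->used == BLOCK_CAP) {
		/* The last block is full: chain a new one after it */
		b = malloc(sizeof(blk_t));
		if (b == NULL) {
			return(NULL);
		}
		b->next = NULL;
		b->used = 0;
		if (a->last != NULL) {
			a->last->next = b;
		} else {
			a->first = b;
		}
		a->last = b;
	}
	a->n++;
	return(&b->data[b->used++]);
}

/* Returns a pointer to the i'th element, or NULL if out of range */
unsigned long* chain_arr_get(chain_arr_t* a, size_t i)
{
	blk_t*	b = a->first;

	if (i >= a->n) {
		return(NULL);
	}
	/* Every block before the last one is full, so whole blocks can
	be skipped */
	while (i >= BLOCK_CAP) {
		b = b->next;
		i -= BLOCK_CAP;
	}
	return(&b->data[i]);
}

void chain_arr_free(chain_arr_t* a)
{
	blk_t*	b = a->first;

	while (b != NULL) {
		blk_t*	next = b->next;

		free(b);
		b = next;
	}
	chain_arr_init(a);
}
```

Compared with a realloc-based vector, pointers returned by the push operation stay valid for the lifetime of the array, at the cost of a linear walk over the blocks on indexed access.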

9
innobase/dyn/makefilewin Normal file

@ -0,0 +1,9 @@
include ..\include\makefile.i
dyn.lib: dyn0dyn.obj makefile
lib -out:..\libs\dyn.lib dyn0dyn.obj
dyn0dyn.obj: dyn0dyn.c
$(CCOM) $(CFL) -c dyn0dyn.c

12
innobase/dyn/ts/makefile Normal file

@ -0,0 +1,12 @@
include ..\..\makefile.i
tsdyn: ..\dyn.lib tsdyn.c makefile
$(CCOM) $(CFL) -I.. -I..\.. ..\dyn.lib ..\..\mem.lib ..\..\ut.lib ..\..\mach.lib ..\..\sync.lib ..\..\os.lib tsdyn.c $(LFL)

57
innobase/dyn/ts/tsdyn.c Normal file

@ -0,0 +1,57 @@
/************************************************************************
The test module for dynamic array
(c) 1996 Innobase Oy
Created 2/5/1996 Heikki Tuuri
*************************************************************************/
#include "../dyn0dyn.h"
#include "sync0sync.h"
#include "mem0mem.h"
/****************************************************************
Basic test. */
void
test1(void)
/*=======*/
{
dyn_array_t dyn;
ulint i;
ulint* ulint_ptr;
printf("-------------------------------------------\n");
printf("TEST 1. Basic test\n");
dyn_array_create(&dyn);
for (i = 0; i < 1000; i++) {
ulint_ptr = dyn_array_push(&dyn, sizeof(ulint));
*ulint_ptr = i;
}
ut_a(dyn_array_get_n_elements(&dyn) == 1000);
for (i = 0; i < 1000; i++) {
ulint_ptr = dyn_array_get_nth_element(&dyn, i, sizeof(ulint));
ut_a(*ulint_ptr == i);
}
dyn_array_free(&dyn);
}
void
main(void)
{
sync_init();
mem_init();
test1();
ut_ad(sync_all_freed());
ut_ad(mem_all_freed());
printf("TEST SUCCESSFULLY COMPLETED!\n");
}

25
innobase/eval/Makefile.am Normal file

@ -0,0 +1,25 @@
# Copyright (C) 2000 MySQL AB & MySQL Finland AB & TCX DataKonsult AB
# & Innobase Oy
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
include ../include/Makefile.i
libs_LIBRARIES = libeval.a
libeval_a_SOURCES = eval0eval.c eval0proc.c
EXTRA_PROGRAMS =

777
innobase/eval/eval0eval.c Normal file

@ -0,0 +1,777 @@
/******************************************************
SQL evaluator: evaluates simple data structures, like expressions, in
a query graph
(c) 1997 Innobase Oy
Created 12/29/1997 Heikki Tuuri
*******************************************************/
#include "eval0eval.h"
#ifdef UNIV_NONINL
#include "eval0eval.ic"
#endif
#include "data0data.h"
#include "row0sel.h"
/* The RND function seed */
ulint eval_rnd = 128367121;
/* Dummy address used when we should allocate a buffer of size 0 in
the function below */
byte eval_dummy;
/*********************************************************************
Allocate a buffer from global dynamic memory for a value of a que_node.
NOTE that this memory must be explicitly freed when the query graph is
freed. If the node already has an allocated buffer, that buffer is freed
here. NOTE that this is the only function where dynamic memory should be
allocated for a query node val field. */
byte*
eval_node_alloc_val_buf(
/*====================*/
/* out: pointer to allocated buffer */
que_node_t* node, /* in: query graph node; sets the val field
data field to point to the new buffer, and
len field equal to size */
ulint size) /* in: buffer size */
{
dfield_t* dfield;
byte* data;
ut_ad(que_node_get_type(node) == QUE_NODE_SYMBOL
|| que_node_get_type(node) == QUE_NODE_FUNC);
dfield = que_node_get_val(node);
data = dfield_get_data(dfield);
if (data && data != &eval_dummy) {
mem_free(data);
}
if (size == 0) {
data = &eval_dummy;
} else {
data = mem_alloc(size);
}
que_node_set_val_buf_size(node, size);
dfield_set_data(dfield, data, size);
return(data);
}
/*********************************************************************
Free the buffer from global dynamic memory for a value of a que_node,
if it has been allocated in the above function. The freeing for pushed
column values is done in sel_col_prefetch_buf_free. */
void
eval_node_free_val_buf(
/*===================*/
que_node_t* node) /* in: query graph node */
{
dfield_t* dfield;
byte* data;
ut_ad(que_node_get_type(node) == QUE_NODE_SYMBOL
|| que_node_get_type(node) == QUE_NODE_FUNC);
dfield = que_node_get_val(node);
data = dfield_get_data(dfield);
if (que_node_get_val_buf_size(node) > 0) {
ut_a(data);
mem_free(data);
}
}
/*********************************************************************
Evaluates a comparison node. */
ibool
eval_cmp(
/*=====*/
/* out: the result of the comparison */
func_node_t* cmp_node) /* in: comparison node */
{
que_node_t* arg1;
que_node_t* arg2;
int res;
ibool val;
int func;
ut_ad(que_node_get_type(cmp_node) == QUE_NODE_FUNC);
arg1 = cmp_node->args;
arg2 = que_node_get_next(arg1);
res = cmp_dfield_dfield(que_node_get_val(arg1),
que_node_get_val(arg2));
val = TRUE;
func = cmp_node->func;
if (func == '=') {
if (res != 0) {
val = FALSE;
}
} else if (func == '<') {
if (res != -1) {
val = FALSE;
}
} else if (func == PARS_LE_TOKEN) {
if (res == 1) {
val = FALSE;
}
} else if (func == PARS_NE_TOKEN) {
if (res == 0) {
val = FALSE;
}
} else if (func == PARS_GE_TOKEN) {
if (res == -1) {
val = FALSE;
}
} else {
ut_ad(func == '>');
if (res != 1) {
val = FALSE;
}
}
eval_node_set_ibool_val(cmp_node, val);
return(val);
}
/*********************************************************************
Evaluates a logical operation node. */
UNIV_INLINE
void
eval_logical(
/*=========*/
func_node_t* logical_node) /* in: logical operation node */
{
que_node_t* arg1;
que_node_t* arg2;
ibool val1;
ibool val2;
ibool val;
int func;
ut_ad(que_node_get_type(logical_node) == QUE_NODE_FUNC);
arg1 = logical_node->args;
arg2 = que_node_get_next(arg1); /* arg2 is NULL if func is 'NOT' */
val1 = eval_node_get_ibool_val(arg1);
if (arg2) {
val2 = eval_node_get_ibool_val(arg2);
}
func = logical_node->func;
if (func == PARS_AND_TOKEN) {
val = val1 & val2;
} else if (func == PARS_OR_TOKEN) {
val = val1 | val2;
} else if (func == PARS_NOT_TOKEN) {
val = TRUE - val1;
} else {
ut_error;
}
eval_node_set_ibool_val(logical_node, val);
}
/*********************************************************************
Evaluates an arithmetic operation node. */
UNIV_INLINE
void
eval_arith(
/*=======*/
func_node_t* arith_node) /* in: arithmetic operation node */
{
que_node_t* arg1;
que_node_t* arg2;
lint val1;
lint val2;
lint val;
int func;
ut_ad(que_node_get_type(arith_node) == QUE_NODE_FUNC);
arg1 = arith_node->args;
arg2 = que_node_get_next(arg1); /* arg2 is NULL if func is unary '-' */
val1 = eval_node_get_int_val(arg1);
if (arg2) {
val2 = eval_node_get_int_val(arg2);
}
func = arith_node->func;
if (func == '+') {
val = val1 + val2;
} else if ((func == '-') && arg2) {
val = val1 - val2;
} else if (func == '-') {
val = -val1;
} else if (func == '*') {
val = val1 * val2;
} else {
ut_ad(func == '/');
val = val1 / val2;
}
eval_node_set_int_val(arith_node, val);
}
/*********************************************************************
Evaluates an aggregate operation node. */
UNIV_INLINE
void
eval_aggregate(
/*===========*/
func_node_t* node) /* in: aggregate operation node */
{
que_node_t* arg;
lint val;
lint arg_val;
int func;
ut_ad(que_node_get_type(node) == QUE_NODE_FUNC);
val = eval_node_get_int_val(node);
func = node->func;
if (func == PARS_COUNT_TOKEN) {
val = val + 1;
} else {
ut_ad(func == PARS_SUM_TOKEN);
arg = node->args;
arg_val = eval_node_get_int_val(arg);
val = val + arg_val;
}
eval_node_set_int_val(node, val);
}
/*********************************************************************
Evaluates a predefined function node where the function is not relevant
in benchmarks. */
static
void
eval_predefined_2(
/*==============*/
func_node_t* func_node) /* in: predefined function node */
{
que_node_t* arg;
que_node_t* arg1;
que_node_t* arg2;
lint int_val;
byte* data;
ulint len1;
ulint len2;
int func;
ulint i;
ut_ad(que_node_get_type(func_node) == QUE_NODE_FUNC);
arg1 = func_node->args;
if (arg1) {
arg2 = que_node_get_next(arg1);
}
func = func_node->func;
if (func == PARS_PRINTF_TOKEN) {
arg = arg1;
while (arg) {
dfield_print(que_node_get_val(arg));
arg = que_node_get_next(arg);
}
printf("\n");
} else if (func == PARS_ASSERT_TOKEN) {
if (!eval_node_get_ibool_val(arg1)) {
printf("SQL assertion fails in a stored procedure!\n");
}
ut_a(eval_node_get_ibool_val(arg1));
/* This function, or more precisely, a debug procedure,
returns no value */
} else if (func == PARS_RND_TOKEN) {
len1 = (ulint)eval_node_get_int_val(arg1);
len2 = (ulint)eval_node_get_int_val(arg2);
ut_ad(len2 >= len1);
if (len2 > len1) {
int_val = (lint)(len1 +
(eval_rnd % (len2 - len1 + 1)));
} else {
int_val = (lint)len1;
}
eval_rnd = ut_rnd_gen_next_ulint(eval_rnd);
eval_node_set_int_val(func_node, int_val);
} else if (func == PARS_RND_STR_TOKEN) {
len1 = (ulint)eval_node_get_int_val(arg1);
data = eval_node_ensure_val_buf(func_node, len1);
for (i = 0; i < len1; i++) {
data[i] = (byte)(97 + (eval_rnd % 3));
eval_rnd = ut_rnd_gen_next_ulint(eval_rnd);
}
} else {
ut_error;
}
}
/*********************************************************************
Evaluates a notfound-function node. */
UNIV_INLINE
void
eval_notfound(
/*==========*/
func_node_t* func_node) /* in: function node */
{
que_node_t* arg1;
que_node_t* arg2;
sym_node_t* cursor;
sel_node_t* sel_node;
ibool ibool_val;
arg1 = func_node->args;
arg2 = que_node_get_next(arg1);
ut_ad(func_node->func == PARS_NOTFOUND_TOKEN);
cursor = arg1;
ut_ad(que_node_get_type(cursor) == QUE_NODE_SYMBOL);
if (cursor->token_type == SYM_LIT) {
ut_ad(ut_memcmp(dfield_get_data(que_node_get_val(cursor)),
"SQL", 3) == 0);
sel_node = cursor->sym_table->query_graph->last_sel_node;
} else {
sel_node = cursor->alias->cursor_def;
}
if (sel_node->state == SEL_NODE_NO_MORE_ROWS) {
ibool_val = TRUE;
} else {
ibool_val = FALSE;
}
eval_node_set_ibool_val(func_node, ibool_val);
}
/*********************************************************************
Evaluates a substr-function node. */
UNIV_INLINE
void
eval_substr(
/*========*/
func_node_t* func_node) /* in: function node */
{
que_node_t* arg1;
que_node_t* arg2;
que_node_t* arg3;
dfield_t* dfield;
byte* str1;
ulint len1;
ulint len2;
arg1 = func_node->args;
arg2 = que_node_get_next(arg1);
ut_ad(func_node->func == PARS_SUBSTR_TOKEN);
arg3 = que_node_get_next(arg2);
str1 = dfield_get_data(que_node_get_val(arg1));
len1 = (ulint)eval_node_get_int_val(arg2);
len2 = (ulint)eval_node_get_int_val(arg3);
dfield = que_node_get_val(func_node);
dfield_set_data(dfield, str1 + len1, len2);
}
/*********************************************************************
Evaluates a replstr-procedure node. */
static
void
eval_replstr(
/*=========*/
func_node_t* func_node) /* in: function node */
{
que_node_t* arg1;
que_node_t* arg2;
que_node_t* arg3;
que_node_t* arg4;
byte* str1;
byte* str2;
ulint len1;
ulint len2;
arg1 = func_node->args;
arg2 = que_node_get_next(arg1);
ut_ad(que_node_get_type(arg1) == QUE_NODE_SYMBOL);
arg3 = que_node_get_next(arg2);
arg4 = que_node_get_next(arg3);
str1 = dfield_get_data(que_node_get_val(arg1));
str2 = dfield_get_data(que_node_get_val(arg2));
len1 = (ulint)eval_node_get_int_val(arg3);
len2 = (ulint)eval_node_get_int_val(arg4);
if ((dfield_get_len(que_node_get_val(arg1)) < len1 + len2)
|| (dfield_get_len(que_node_get_val(arg2)) < len2)) {
ut_error;
}
ut_memcpy(str1 + len1, str2, len2);
}
/*********************************************************************
Evaluates an instr-function node. */
static
void
eval_instr(
/*=======*/
func_node_t* func_node) /* in: function node */
{
que_node_t* arg1;
que_node_t* arg2;
dfield_t* dfield1;
dfield_t* dfield2;
lint int_val;
byte* str1;
byte* str2;
byte match_char;
ulint len1;
ulint len2;
ulint i;
ulint j;
arg1 = func_node->args;
arg2 = que_node_get_next(arg1);
dfield1 = que_node_get_val(arg1);
dfield2 = que_node_get_val(arg2);
str1 = dfield_get_data(dfield1);
str2 = dfield_get_data(dfield2);
len1 = dfield_get_len(dfield1);
len2 = dfield_get_len(dfield2);
if (len2 == 0) {
ut_error;
}
match_char = str2[0];
for (i = 0; i < len1; i++) {
/* In this outer loop, the number of matched characters is 0 */
if (str1[i] == match_char) {
if (i + len2 > len1) {
break;
}
for (j = 1;; j++) {
/* We have already matched j characters */
if (j == len2) {
int_val = i + 1;
goto match_found;
}
if (str1[i + j] != str2[j]) {
break;
}
}
}
}
int_val = 0;
match_found:
eval_node_set_int_val(func_node, int_val);
}
/*********************************************************************
Evaluates a binary_to_number-function node: a binary string of at most
4 bytes is left-padded with zero bytes to a 4-byte value. */
UNIV_INLINE
void
eval_binary_to_number(
/*==================*/
func_node_t* func_node) /* in: function node */
{
que_node_t* arg1;
dfield_t* dfield;
byte* str1;
byte* str2;
ulint len1;
ulint int_val;
arg1 = func_node->args;
dfield = que_node_get_val(arg1);
str1 = dfield_get_data(dfield);
len1 = dfield_get_len(dfield);
if (len1 > 4) {
ut_error;
}
if (len1 == 4) {
str2 = str1;
} else {
int_val = 0;
str2 = (byte*)&int_val;
ut_memcpy(str2 + (4 - len1), str1, len1);
}
eval_node_copy_and_alloc_val(func_node, str2, 4);
}
/*********************************************************************
Evaluates a concat-function node: copies the values of all the arguments,
in order, into the value buffer of the function node. */
static
void
eval_concat(
/*========*/
func_node_t* func_node) /* in: function node */
{
que_node_t* arg;
dfield_t* dfield;
byte* data;
ulint len;
ulint len1;
arg = func_node->args;
len = 0;
while (arg) {
len1 = dfield_get_len(que_node_get_val(arg));
len += len1;
arg = que_node_get_next(arg);
}
data = eval_node_ensure_val_buf(func_node, len);
arg = func_node->args;
len = 0;
while (arg) {
dfield = que_node_get_val(arg);
len1 = dfield_get_len(dfield);
ut_memcpy(data + len, dfield_get_data(dfield), len1);
len += len1;
arg = que_node_get_next(arg);
}
}
/*********************************************************************
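Evaluates a to_binary-function node: the result is the last len bytes of
the 4-byte first argument, where len (at most 4) is the second argument. */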
UNIV_INLINE
void
eval_to_binary(
/*===========*/
func_node_t* func_node) /* in: function node */
{
que_node_t* arg1;
que_node_t* arg2;
dfield_t* dfield;
byte* str1;
ulint len1;
arg1 = func_node->args;
str1 = dfield_get_data(que_node_get_val(arg1));
arg2 = que_node_get_next(arg1);
len1 = (ulint)eval_node_get_int_val(arg2);
if (len1 > 4) {
ut_error;
}
dfield = que_node_get_val(func_node);
dfield_set_data(dfield, str1 + (4 - len1), len1);
}
/*********************************************************************
Evaluates a predefined function node. */
UNIV_INLINE
void
eval_predefined(
/*============*/
func_node_t* func_node) /* in: function node */
{
que_node_t* arg1;
lint int_val;
byte* str1;
byte* data;
int func;
func = func_node->func;
arg1 = func_node->args;
if (func == PARS_LENGTH_TOKEN) {
int_val = (lint)dfield_get_len(que_node_get_val(arg1));
} else if (func == PARS_TO_CHAR_TOKEN) {
int_val = eval_node_get_int_val(arg1);
data = eval_node_ensure_val_buf(func_node, 11);
sprintf((char*)data, "%10li", int_val);
dfield_set_len(que_node_get_val(func_node), 10);
return;
} else if (func == PARS_TO_NUMBER_TOKEN) {
str1 = dfield_get_data(que_node_get_val(arg1));
int_val = atoi((char*)str1);
} else if (func == PARS_SYSDATE_TOKEN) {
int_val = (lint)ut_time();
} else {
eval_predefined_2(func_node);
return;
}
eval_node_set_int_val(func_node, int_val);
}
/*********************************************************************
Evaluates a function node. */
void
eval_func(
/*======*/
func_node_t* func_node) /* in: function node */
{
que_node_t* arg;
ulint class;
ulint func;
ut_ad(que_node_get_type(func_node) == QUE_NODE_FUNC);
class = func_node->class;
func = func_node->func;
arg = func_node->args;
/* Evaluate first the argument list */
while (arg) {
eval_exp(arg);
/* The functions are not defined for SQL null argument
values, except for eval_cmp and notfound */
if ((dfield_get_len(que_node_get_val(arg)) == UNIV_SQL_NULL)
&& (class != PARS_FUNC_CMP)
&& (func != PARS_NOTFOUND_TOKEN)
&& (func != PARS_PRINTF_TOKEN)) {
ut_error;
}
arg = que_node_get_next(arg);
}
if (class == PARS_FUNC_CMP) {
eval_cmp(func_node);
} else if (class == PARS_FUNC_ARITH) {
eval_arith(func_node);
} else if (class == PARS_FUNC_AGGREGATE) {
eval_aggregate(func_node);
} else if (class == PARS_FUNC_PREDEFINED) {
if (func == PARS_NOTFOUND_TOKEN) {
eval_notfound(func_node);
} else if (func == PARS_SUBSTR_TOKEN) {
eval_substr(func_node);
} else if (func == PARS_REPLSTR_TOKEN) {
eval_replstr(func_node);
} else if (func == PARS_INSTR_TOKEN) {
eval_instr(func_node);
} else if (func == PARS_BINARY_TO_NUMBER_TOKEN) {
eval_binary_to_number(func_node);
} else if (func == PARS_CONCAT_TOKEN) {
eval_concat(func_node);
} else if (func == PARS_TO_BINARY_TOKEN) {
eval_to_binary(func_node);
} else {
eval_predefined(func_node);
}
} else {
ut_ad(class == PARS_FUNC_LOGICAL);
eval_logical(func_node);
}
}
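
eval_instr above implements a substring search whose result is the 1-based position of the first match, with 0 meaning that no match was found. A minimal standalone sketch of the same contract follows. The function name instr_pos and its signature are hypothetical, and it deviates from the original in one labeled respect: an empty search string returns 0 here, whereas eval_instr raises ut_error.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the 1-based position of the first occurrence of needle in hay,
or 0 if there is no occurrence. Both strings are given as byte buffers
with explicit lengths, so they may contain zero bytes. ASSUMPTION: an
empty needle is reported as "not found". */
size_t
instr_pos(
	const unsigned char*	hay,		/* in: string to search */
	size_t			hay_len,	/* in: its length */
	const unsigned char*	needle,		/* in: string to find */
	size_t			needle_len)	/* in: its length */
{
	size_t	i;
	size_t	j;

	if (needle_len == 0 || needle_len > hay_len) {
		return(0);
	}

	/* Only start positions where the whole needle still fits */
	for (i = 0; i + needle_len <= hay_len; i++) {
		for (j = 0; j < needle_len; j++) {
			if (hay[i + j] != needle[j]) {
				break;
			}
		}
		if (j == needle_len) {
			return(i + 1);	/* positions are 1-based */
		}
	}
	return(0);
}
```

Searching for "world" in "hello world" with this sketch gives 7, because the match starts at byte offset 6 and positions count from 1.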

245
innobase/eval/eval0proc.c Normal file

@ -0,0 +1,245 @@
/******************************************************
Executes SQL stored procedures and their control structures
(c) 1998 Innobase Oy
Created 1/20/1998 Heikki Tuuri
*******************************************************/
#include "eval0proc.h"
#ifdef UNIV_NONINL
#include "eval0proc.ic"
#endif
/**************************************************************************
Performs an execution step of an if-statement node. */
que_thr_t*
if_step(
/*====*/
/* out: query thread to run next or NULL */
que_thr_t* thr) /* in: query thread */
{
if_node_t* node;
elsif_node_t* elsif_node;
ut_ad(thr);
node = thr->run_node;
ut_ad(que_node_get_type(node) == QUE_NODE_IF);
if (thr->prev_node == que_node_get_parent(node)) {
/* Evaluate the condition */
eval_exp(node->cond);
if (eval_node_get_ibool_val(node->cond)) {
/* The condition evaluated to TRUE: start execution
from the first statement in the statement list */
thr->run_node = node->stat_list;
} else if (node->else_part) {
thr->run_node = node->else_part;
} else if (node->elsif_list) {
elsif_node = node->elsif_list;
for (;;) {
eval_exp(elsif_node->cond);
if (eval_node_get_ibool_val(elsif_node->cond)) {
/* The condition evaluated to TRUE:
start execution from the first
statement in the statement list */
thr->run_node = elsif_node->stat_list;
break;
}
elsif_node = que_node_get_next(elsif_node);
if (elsif_node == NULL) {
thr->run_node = NULL;
break;
}
}
} else {
thr->run_node = NULL;
}
} else {
/* Move to the next statement */
ut_ad(que_node_get_next(thr->prev_node) == NULL);
thr->run_node = NULL;
}
if (thr->run_node == NULL) {
thr->run_node = que_node_get_parent(node);
}
return(thr);
}
/**************************************************************************
Performs an execution step of a while-statement node. */
que_thr_t*
while_step(
/*=======*/
/* out: query thread to run next or NULL */
que_thr_t* thr) /* in: query thread */
{
while_node_t* node;
ut_ad(thr);
node = thr->run_node;
ut_ad(que_node_get_type(node) == QUE_NODE_WHILE);
ut_ad((thr->prev_node == que_node_get_parent(node))
|| (que_node_get_next(thr->prev_node) == NULL));
/* Evaluate the condition */
eval_exp(node->cond);
if (eval_node_get_ibool_val(node->cond)) {
/* The condition evaluated to TRUE: start execution
from the first statement in the statement list */
thr->run_node = node->stat_list;
} else {
thr->run_node = que_node_get_parent(node);
}
return(thr);
}
/**************************************************************************
Performs an execution step of an assignment statement node. */
que_thr_t*
assign_step(
/*========*/
/* out: query thread to run next or NULL */
que_thr_t* thr) /* in: query thread */
{
assign_node_t* node;
ut_ad(thr);
node = thr->run_node;
ut_ad(que_node_get_type(node) == QUE_NODE_ASSIGNMENT);
/* Evaluate the value to assign */
eval_exp(node->val);
eval_node_copy_val(node->var->alias, node->val);
thr->run_node = que_node_get_parent(node);
return(thr);
}
/**************************************************************************
Performs an execution step of a for-loop node. */
que_thr_t*
for_step(
/*=====*/
/* out: query thread to run next or NULL */
que_thr_t* thr) /* in: query thread */
{
for_node_t* node;
que_node_t* parent;
int loop_var_value;
ut_ad(thr);
node = thr->run_node;
ut_ad(que_node_get_type(node) == QUE_NODE_FOR);
parent = que_node_get_parent(node);
if (thr->prev_node != parent) {
/* Move to the next statement */
thr->run_node = que_node_get_next(thr->prev_node);
if (thr->run_node != NULL) {
return(thr);
}
/* Increment the value of loop_var */
loop_var_value = 1 + eval_node_get_int_val(node->loop_var);
} else {
/* Initialize the loop */
eval_exp(node->loop_start_limit);
eval_exp(node->loop_end_limit);
loop_var_value = eval_node_get_int_val(node->loop_start_limit);
node->loop_end_value = eval_node_get_int_val(
node->loop_end_limit);
}
/* Check if we should do another loop */
if (loop_var_value > node->loop_end_value) {
/* Enough loops done */
thr->run_node = parent;
} else {
eval_node_set_int_val(node->loop_var, loop_var_value);
thr->run_node = node->stat_list;
}
return(thr);
}
/**************************************************************************
Performs an execution step of a return-statement node. */
que_thr_t*
return_step(
/*========*/
/* out: query thread to run next or NULL */
que_thr_t* thr) /* in: query thread */
{
return_node_t* node;
que_node_t* parent;
ut_ad(thr);
node = thr->run_node;
ut_ad(que_node_get_type(node) == QUE_NODE_RETURN);
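parent = node;
while (que_node_get_type(parent) != QUE_NODE_PROC) {
parent = que_node_get_parent(parent);
/* Check before the next dereference: a return statement
must always be enclosed in a procedure node */
ut_a(parent);
}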
thr->run_node = que_node_get_parent(parent);
return(thr);
}

10
innobase/eval/makefilewin Normal file

@ -0,0 +1,10 @@
include ..\include\makefile.i
eval.lib: eval0eval.obj eval0proc.obj
lib -out:..\libs\eval.lib eval0eval.obj eval0proc.obj
eval0eval.obj: eval0eval.c
$(CCOM) $(CFL) -c eval0eval.c
eval0proc.obj: eval0proc.c
$(CCOM) $(CFL) -c eval0proc.c

24
innobase/fil/Makefile.am Normal file

@ -0,0 +1,24 @@
# Copyright (C) 2000 MySQL AB & MySQL Finland AB & TCX DataKonsult AB
# & Innobase Oy
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
include ../include/Makefile.i
libs_LIBRARIES = libfil.a
libfil_a_SOURCES = fil0fil.c
EXTRA_PROGRAMS =

1326
innobase/fil/fil0fil.c Normal file

File diff suppressed because it is too large

10
innobase/fil/makefilewin Normal file

@ -0,0 +1,10 @@
include ..\include\makefile.i
fil.lib: fil0fil.obj
lib -out:..\libs\fil.lib fil0fil.obj
fil0fil.obj: fil0fil.c
$(CCOM) $(CFL) -c fil0fil.c

15
innobase/fil/ts/makefile Normal file

@ -0,0 +1,15 @@
include ..\..\makefile.i
tsfil: ..\fil.lib tsfil.c
$(CCOM) $(CFL) -I.. -I..\.. ..\fil.lib ..\..\ha.lib ..\..\ut.lib ..\..\sync.lib ..\..\mem.lib ..\..\os.lib tsfil.c $(LFL)

329
innobase/fil/ts/tsfil.c Normal file

@ -0,0 +1,329 @@
/************************************************************************
The test module for the file system
(c) 1995 Innobase Oy
Created 10/29/1995 Heikki Tuuri
*************************************************************************/
#include "os0thread.h"
#include "os0file.h"
#include "ut0ut.h"
#include "sync0sync.h"
#include "mem0mem.h"
#include "..\fil0fil.h"
ulint last_thr = 1;
byte global_buf[10000000];
byte global_buf2[20000];
os_file_t files[1000];
os_event_t gl_ready;
mutex_t ios_mutex;
ulint ios;
/*********************************************************************
Test for synchronous file io. */
void
test1(void)
/*=======*/
{
ulint i, j;
void* mess;
bool ret;
void* buf;
ulint rnd, rnd3;
ulint tm, oldtm;
printf("-------------------------------------------\n");
printf("FIL-TEST 1. Test of synchronous file io\n");
/* Align the buffer for file io */
buf = (void*)(((ulint)global_buf + 6300) & (~0xFFF));
rnd = ut_time();
rnd3 = ut_time();
rnd = rnd * 3416133;
rnd3 = rnd3 * 6576681;
oldtm = ut_clock();
for (j = 0; j < 300; j++) {
for (i = 0; i < (rnd3 % 15); i++) {
fil_read((rnd % 1000) / 100, rnd % 100, 0, 8192, buf, NULL);
ut_a(fil_validate());
ret = fil_aio_wait(0, &mess);
ut_a(ret);
ut_a(fil_validate());
ut_a(*((ulint*)buf) == rnd % 1000);
rnd += 1;
}
rnd = rnd + 3416133;
rnd3 = rnd3 + 6576681;
}
tm = ut_clock();
printf("Wall clock time for synchr. io %lu milliseconds\n",
tm - oldtm);
}
/************************************************************************
Io-handler thread function. */
ulint
handler_thread(
/*===========*/
void* arg)
{
ulint segment;
void* mess;
void* buf;
ulint i;
bool ret;
segment = *((ulint*)arg);
buf = (void*)(((ulint)global_buf + 6300) & (~0xFFF));
printf("Thread %lu starts\n", segment);
for (i = 0;; i++) {
ret = fil_aio_wait(segment, &mess);
ut_a(ret);
if ((ulint)mess == 3333) {
os_event_set(gl_ready);
} else {
ut_a((ulint)mess ==
*((ulint*)((byte*)buf + 8192 * (ulint)mess)));
}
mutex_enter(&ios_mutex);
ios++;
mutex_exit(&ios_mutex);
ut_a(ret);
/* printf("Message for thread %lu %lu\n", segment,
(ulint)mess); */
}
return(0);
}
/************************************************************************
Test of io-handler threads */
void
test2(void)
/*=======*/
{
ulint i;
ulint j;
void* buf;
ulint rnd, rnd3;
ulint tm, oldtm;
os_thread_t thr[5];
os_thread_id_t id[5];
ulint n[5];
/* Align the buffer for file io */
buf = (void*)(((ulint)global_buf + 6300) & (~0xFFF));
gl_ready = os_event_create(NULL);
ios = 0;
mutex_create(&ios_mutex);
for (i = 0; i < 5; i++) {
n[i] = i;
thr[i] = os_thread_create(handler_thread, n + i, id + i);
}
printf("-------------------------------------------\n");
printf("FIL-TEST 2. Test of asynchronous file io\n");
rnd = ut_time();
rnd3 = ut_time();
rnd = rnd * 3416133;
rnd3 = rnd3 * 6576681;
oldtm = ut_clock();
for (j = 0; j < 300; j++) {
for (i = 0; i < (rnd3 % 15); i++) {
fil_read((rnd % 1000) / 100, rnd % 100, 0, 8192,
(void*)((byte*)buf + 8192 * (rnd % 1000)),
(void*)(rnd % 1000));
rnd += 1;
}
ut_a(fil_validate());
rnd = rnd + 3416133;
rnd3 = rnd3 + 6576681;
}
ut_a(!os_aio_all_slots_free());
tm = ut_clock();
printf("Wall clock time for asynchr. io %lu milliseconds\n",
tm - oldtm);
fil_read(5, 25, 0, 8192,
(void*)((byte*)buf + 8192 * 1000),
(void*)3333);
tm = ut_clock();
ut_a(fil_validate());
printf("All ios queued! N ios: %lu\n", ios);
printf("Wall clock time for test %lu milliseconds\n", tm - oldtm);
os_event_wait(gl_ready);
tm = ut_clock();
printf("N ios: %lu\n", ios);
printf("Wall clock time for test %lu milliseconds\n", tm - oldtm);
os_thread_sleep(2000000);
printf("N ios: %lu\n", ios);
ut_a(fil_validate());
ut_a(os_aio_all_slots_free());
}
/*************************************************************************
Creates the files for the file system test and adds them to
the file system. */
void
create_files(void)
/*==============*/
{
bool ret;
ulint i, j, k, n;
void* buf;
void* mess;
char name[10];
buf = (void*)(((ulint)global_buf2 + 6300) & (~0xFFF));
name[0] = 't';
name[1] = 's';
name[2] = 'f';
name[3] = 'i';
name[4] = 'l';
name[5] = 'e';
name[8] = '\0';
for (k = 0; k < 10; k++) {
for (i = 0; i < 20; i++) {
name[6] = (char)(k + (ulint)'a');
name[7] = (char)(i + (ulint)'a');
files[i] = os_file_create(name, OS_FILE_CREATE,
OS_FILE_TABLESPACE, &ret);
if (ret == FALSE) {
ut_a(os_file_get_last_error() ==
OS_FILE_ALREADY_EXISTS);
files[i] = os_file_create(name, OS_FILE_OPEN,
OS_FILE_TABLESPACE, &ret);
ut_a(ret);
} else {
for (j = 0; j < 5; j++) {
for (n = 0; n < 8192 / sizeof(ulint); n++) {
*((ulint*)buf + n) =
k * 100 + i * 5 + j;
}
ret = os_aio_write(files[i], buf, 8192 * j,
0, 8192, NULL);
ut_a(ret);
ret = os_aio_wait(0, &mess);
ut_a(ret);
ut_a(mess == NULL);
}
}
ret = os_file_close(files[i]);
ut_a(ret);
if (i == 0) {
fil_space_create("noname", k, OS_FILE_TABLESPACE);
}
ut_a(fil_validate());
fil_node_create(name, 5, k);
}
}
}
/************************************************************************
Frees the spaces in the file system. */
void
free_system(void)
/*=============*/
{
ulint i;
for (i = 0; i < 10; i++) {
fil_space_free(i);
}
}
/************************************************************************
Main test function. */
void
main(void)
/*======*/
{
ulint tm, oldtm;
oldtm = ut_clock();
os_aio_init(160, 5);
sync_init();
mem_init();
fil_init(2); /* Allow only 2 open files at a time */
ut_a(fil_validate());
create_files();
test1();
test2();
free_system();
tm = ut_clock();
printf("Wall clock time for test %lu milliseconds\n", tm - oldtm);
printf("TESTS COMPLETED SUCCESSFULLY!\n");
}

25
innobase/fsp/Makefile.am Normal file

@ -0,0 +1,25 @@
# Copyright (C) 2000 MySQL AB & MySQL Finland AB & TCX DataKonsult AB
# & Innobase Oy
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
include ../include/Makefile.i
libs_LIBRARIES = libfsp.a
libfsp_a_SOURCES = fsp0fsp.c
EXTRA_PROGRAMS =

3365
innobase/fsp/fsp0fsp.c Normal file

File diff suppressed because it is too large

9
innobase/fsp/makefilewin Normal file
View file

@ -0,0 +1,9 @@
include ..\include\makefile.i
fsp.lib: fsp0fsp.obj
lib -out:..\libs\fsp.lib fsp0fsp.obj
fsp0fsp.obj: fsp0fsp.c
$(CCOM) $(CFL) -c fsp0fsp.c

3100
innobase/fsp/trash/FSP0FSP.C Normal file

File diff suppressed because it is too large

891
innobase/fsp/ts/del.c Normal file

@ -0,0 +1,891 @@
/************************************************************************
The test module for the file system and buffer manager
(c) 1995 Innobase Oy
Created 11/16/1995 Heikki Tuuri
*************************************************************************/
#include "string.h"
#include "os0thread.h"
#include "os0file.h"
#include "ut0ut.h"
#include "ut0byte.h"
#include "sync0sync.h"
#include "mem0mem.h"
#include "fil0fil.h"
#include "..\buf0buf.h"
#include "..\buf0buf.h1"
#include "..\buf0buf.h2"
#include "..\buf0flu.h"
#include "..\buf0lru.h"
#include "mtr0buf.h"
#include "mtr0log.h"
#include "fsp0fsp.h"
#include "log0log.h"
os_file_t files[1000];
mutex_t ios_mutex;
ulint ios;
ulint n[5];
/************************************************************************
Io-handler thread function. */
ulint
handler_thread(
/*===========*/
void* arg)
{
ulint segment;
void* mess;
ulint i;
bool ret;
segment = *((ulint*)arg);
printf("Thread %lu starts\n", segment);
for (i = 0;; i++) {
ret = fil_aio_wait(segment, &mess);
ut_a(ret);
buf_page_io_complete((buf_block_t*)mess);
mutex_enter(&ios_mutex);
ios++;
mutex_exit(&ios_mutex);
}
return(0);
}
/************************************************************************
Creates the test database files. */
void
create_db(void)
/*===========*/
{
ulint i;
buf_block_t* block;
byte* frame;
ulint j;
ulint tm, oldtm;
mtr_t mtr;
oldtm = ut_clock();
for (i = 0; i < 1; i++) {
for (j = 0; j < 4096; j++) {
mtr_start(&mtr);
if (j == 0) {
fsp_header_init(i, 4096, &mtr);
block = mtr_page_get(i, j, NULL, &mtr);
} else {
block = mtr_page_create(i, j, &mtr);
}
frame = buf_block_get_frame(block);
mtr_page_x_lock(block, &mtr);
mlog_write_ulint(frame + FIL_PAGE_PREV,
j - 1, MLOG_4BYTES, &mtr);
mlog_write_ulint(frame + FIL_PAGE_NEXT,
j + 1, MLOG_4BYTES, &mtr);
mlog_write_ulint(frame + FIL_PAGE_OFFSET,
j, MLOG_4BYTES, &mtr);
mtr_commit(&mtr);
}
}
tm = ut_clock();
printf("Wall clock time for test %lu milliseconds\n", tm - oldtm);
/* Flush the pool of dirty pages by reading low-offset pages */
for (i = 0; i < 1000; i++) {
mtr_start(&mtr);
block = mtr_page_get(0, i, NULL, &mtr);
frame = buf_block_get_frame(block);
mtr_page_s_lock(block, &mtr);
ut_a(mtr_read_ulint(frame + FIL_PAGE_OFFSET, MLOG_4BYTES,
&mtr) == i);
mtr_commit(&mtr);
}
os_thread_sleep(1000000);
ut_a(buf_all_freed());
}
/*************************************************************************
Creates the files for the file system test and adds them to
the file system. */
void
create_files(void)
/*==============*/
{
bool ret;
ulint i, k;
char name[20];
os_thread_t thr[5];
os_thread_id_t id[5];
strcpy(name, "j:\\tsfile1");
for (k = 0; k < 1; k++) {
for (i = 0; i < 4; i++) {
name[9] = (char)((ulint)'0' + i);
files[i] = os_file_create(name, OS_FILE_CREATE,
OS_FILE_TABLESPACE, &ret);
if (ret == FALSE) {
ut_a(os_file_get_last_error() ==
OS_FILE_ALREADY_EXISTS);
files[i] = os_file_create(
name, OS_FILE_OPEN,
OS_FILE_TABLESPACE, &ret);
ut_a(ret);
}
ret = os_file_set_size(files[i], 4096 * 8192, 0);
ut_a(ret);
ret = os_file_close(files[i]);
ut_a(ret);
if (i == 0) {
fil_space_create("noname", k, OS_FILE_TABLESPACE);
}
ut_a(fil_validate());
fil_node_create(name, 4096, k);
}
}
ios = 0;
mutex_create(&ios_mutex);
for (i = 0; i < 5; i++) {
n[i] = i;
thr[i] = os_thread_create(handler_thread, n + i, id + i);
}
}
/************************************************************************
Reads the test database files. */
void
test1(void)
/*=======*/
{
ulint i, j, k;
buf_block_t* block;
byte* frame;
ulint tm, oldtm;
buf_flush_batch(BUF_FLUSH_LIST, 1000);
os_thread_sleep(1000000);
buf_all_freed();
oldtm = ut_clock();
for (k = 0; k < 1; k++) {
for (i = 0; i < 1; i++) {
for (j = 0; j < 409; j++) {
block = buf_page_get(i, j, NULL);
frame = buf_block_get_frame(block);
buf_page_s_lock(block);
ut_a(*((ulint*)(frame + 16)) == j);
buf_page_s_unlock(block);
buf_page_release(block);
}
}
}
tm = ut_clock();
printf("Wall clock time for test %lu milliseconds\n", tm - oldtm);
}
/************************************************************************
Reads random pages of the test database files. */
void
test2(void)
/*=======*/
{
ulint i, j, k, rnd;
buf_block_t* block;
byte* frame;
ulint tm, oldtm;
oldtm = ut_clock();
rnd = 123;
for (k = 0; k < 100; k++) {
rnd += 23651;
rnd = rnd % 4096;
i = rnd / 4096;
j = rnd % 2048;
block = buf_page_get(i, j, NULL);
frame = buf_block_get_frame(block);
buf_page_s_lock(block);
ut_a(*((ulint*)(frame + 16)) == j);
buf_page_s_unlock(block);
buf_page_release(block);
}
tm = ut_clock();
printf("Wall clock time for random read %lu milliseconds\n",
tm - oldtm);
}
/************************************************************************
Times random page reads, first over a wide page range (no read-ahead)
and then over a narrow one (read-ahead). */
void
test4(void)
/*=======*/
{
ulint i, j, k, rnd;
buf_block_t* block;
byte* frame;
ulint tm, oldtm;
/* Flush the pool of high-offset pages */
for (i = 0; i < 1000; i++) {
block = buf_page_get(0, i, NULL);
frame = buf_block_get_frame(block);
buf_page_s_lock(block);
ut_a(*((ulint*)(frame + 16)) == i);
buf_page_s_unlock(block);
buf_page_release(block);
}
printf("Test starts\n");
oldtm = ut_clock();
rnd = 123;
for (k = 0; k < 400; k++) {
rnd += 4357;
i = 0;
j = 1001 + rnd % 3000;
block = buf_page_get(i, j, NULL);
frame = buf_block_get_frame(block);
buf_page_s_lock(block);
ut_a(*((ulint*)(frame + 16)) == j);
buf_page_s_unlock(block);
buf_page_release(block);
}
tm = ut_clock();
printf(
"Wall clock time for %lu random no read-ahead %lu milliseconds\n",
k, tm - oldtm);
/* Flush the pool of high-offset pages */
for (i = 0; i < 1000; i++) {
block = buf_page_get(0, i, NULL);
frame = buf_block_get_frame(block);
buf_page_s_lock(block);
ut_a(*((ulint*)(frame + 16)) == i);
buf_page_s_unlock(block);
buf_page_release(block);
}
printf("Test starts\n");
oldtm = ut_clock();
rnd = 123;
for (k = 0; k < 400; k++) {
rnd += 4357;
i = 0;
j = 1001 + rnd % 400;
block = buf_page_get(i, j, NULL);
frame = buf_block_get_frame(block);
buf_page_s_lock(block);
ut_a(*((ulint*)(frame + 16)) == j);
buf_page_s_unlock(block);
buf_page_release(block);
}
tm = ut_clock();
printf(
"Wall clock time for %lu random read-ahead %lu milliseconds\n",
k, tm - oldtm);
}
/************************************************************************
Tests speed of CPU algorithms. */
void
test3(void)
/*=======*/
{
ulint i, j;
buf_block_t* block;
ulint tm, oldtm;
for (i = 0; i < 400; i++) {
block = buf_page_get(0, i, NULL);
buf_page_release(block);
}
os_thread_sleep(2000000);
oldtm = ut_clock();
for (j = 0; j < 500; j++) {
for (i = 0; i < 200; i++) {
block = buf_page_get(0, i, NULL);
/*
buf_page_s_lock(block);
buf_page_s_unlock(block);
*/
buf_page_release(block);
}
}
tm = ut_clock();
printf("Wall clock time for %lu page get-release %lu milliseconds\n",
i * j, tm - oldtm);
oldtm = ut_clock();
for (j = 0; j < 500; j++) {
for (i = 0; i < 200; i++) {
block = buf_page_get(0, i, NULL);
/*
buf_page_s_lock(block);
buf_page_s_unlock(block);
*/
buf_page_release(block);
}
}
tm = ut_clock();
printf("Wall clock time for %lu block get-release %lu milliseconds\n",
i * j, tm - oldtm);
oldtm = ut_clock();
for (i = 0; i < 100000; i++) {
block = buf_block_alloc();
buf_block_free(block);
}
tm = ut_clock();
printf("Wall clock time for %lu block alloc-free %lu milliseconds\n",
i, tm - oldtm);
ha_print_info(buf_pool->page_hash);
}
/************************************************************************
Frees the spaces in the file system. */
void
free_system(void)
/*=============*/
{
ulint i;
for (i = 0; i < 1; i++) {
fil_space_free(i);
}
}
/************************************************************************
Test for file space management. */
void
test5(void)
/*=======*/
{
mtr_t mtr;
ulint seg_page;
ulint new_page;
ulint seg_page2;
ulint new_page2;
buf_block_t* block;
bool finished;
ulint i;
ulint reserved;
ulint used;
ulint tm, oldtm;
os_thread_sleep(1000000);
buf_validate();
buf_print();
mtr_start(&mtr);
seg_page = fseg_create(0, 0, 1000, 555, &mtr);
mtr_commit(&mtr);
os_thread_sleep(1000000);
buf_validate();
printf("Segment created: header page %lu\n", seg_page);
mtr_start(&mtr);
block = mtr_page_get(0, seg_page, NULL, &mtr);
new_page = fseg_alloc_free_page(buf_block_get_frame(block) + 1000,
2, FSP_UP, &mtr);
mtr_commit(&mtr);
buf_validate();
buf_print();
printf("Segment page allocated %lu\n", new_page);
finished = FALSE;
while (!finished) {
mtr_start(&mtr);
block = mtr_page_get(0, seg_page, NULL, &mtr);
finished = fseg_free_step(
buf_block_get_frame(block) + 1000, &mtr);
mtr_commit(&mtr);
}
/***********************************************/
os_thread_sleep(1000000);
buf_validate();
buf_print();
mtr_start(&mtr);
seg_page = fseg_create(0, 0, 1000, 557, &mtr);
mtr_commit(&mtr);
ut_a(seg_page == 1);
printf("Segment created: header page %lu\n", seg_page);
new_page = seg_page;
for (i = 0; i < 1023; i++) {
mtr_start(&mtr);
block = mtr_page_get(0, seg_page, NULL, &mtr);
new_page = fseg_alloc_free_page(
buf_block_get_frame(block) + 1000,
new_page + 1, FSP_UP, &mtr);
if (i < FSP_EXTENT_SIZE - 1) {
ut_a(new_page == 2 + i);
} else {
ut_a(new_page == i + FSP_EXTENT_SIZE + 1);
}
printf("%lu %lu; ", i, new_page);
if (i % 10 == 0) {
printf("\n");
}
mtr_commit(&mtr);
}
buf_print();
buf_validate();
mtr_start(&mtr);
block = mtr_page_get(0, seg_page, NULL, &mtr);
mtr_page_s_lock(block, &mtr);
reserved = fseg_n_reserved_pages(buf_block_get_frame(block) + 1000,
&used, &mtr);
ut_a(used == 1024);
ut_a(reserved >= 1024);
printf("Pages used in segment %lu reserved by segment %lu \n",
used, reserved);
mtr_commit(&mtr);
finished = FALSE;
while (!finished) {
mtr_start(&mtr);
block = mtr_page_get(0, seg_page, NULL, &mtr);
finished = fseg_free_step(
buf_block_get_frame(block) + 1000, &mtr);
mtr_commit(&mtr);
}
buf_print();
buf_validate();
/***********************************************/
mtr_start(&mtr);
seg_page = fseg_create(0, 0, 1000, 557, &mtr);
mtr_commit(&mtr);
ut_a(seg_page == 1);
mtr_start(&mtr);
seg_page2 = fseg_create(0, 0, 1000, 558, &mtr);
mtr_commit(&mtr);
ut_a(seg_page2 == 2);
new_page = seg_page;
new_page2 = seg_page2;
for (;;) {
mtr_start(&mtr);
block = mtr_page_get(0, seg_page, NULL, &mtr);
new_page = fseg_alloc_free_page(
buf_block_get_frame(block) + 1000,
new_page + 1, FSP_UP, &mtr);
printf("1:%lu %lu; ", i, new_page);
if (i % 10 == 0) {
printf("\n");
}
new_page = fseg_alloc_free_page(
buf_block_get_frame(block) + 1000,
new_page + 1, FSP_UP, &mtr);
printf("1:%lu %lu; ", i, new_page);
if (i % 10 == 0) {
printf("\n");
}
mtr_commit(&mtr);
mtr_start(&mtr);
block = mtr_page_get(0, seg_page2, NULL, &mtr);
new_page2 = fseg_alloc_free_page(
buf_block_get_frame(block) + 1000,
new_page2 + 1, FSP_UP, &mtr);
printf("2:%lu %lu; ", i, new_page2);
if (i % 10 == 0) {
printf("\n");
}
mtr_commit(&mtr);
if (new_page2 == FIL_NULL) {
break;
}
}
mtr_start(&mtr);
block = mtr_page_get(0, seg_page, NULL, &mtr);
mtr_page_s_lock(block, &mtr);
reserved = fseg_n_reserved_pages(buf_block_get_frame(block) + 1000,
&used, &mtr);
printf("Pages used in segment 1 %lu, reserved by segment %lu \n",
used, reserved);
mtr_commit(&mtr);
mtr_start(&mtr);
block = mtr_page_get(0, seg_page2, NULL, &mtr);
mtr_page_s_lock(block, &mtr);
reserved = fseg_n_reserved_pages(buf_block_get_frame(block) + 1000,
&used, &mtr);
printf("Pages used in segment 2 %lu, reserved by segment %lu \n",
used, reserved);
mtr_commit(&mtr);
for (;;) {
mtr_start(&mtr);
block = mtr_page_get(0, seg_page, NULL, &mtr);
fseg_free_step(
buf_block_get_frame(block) + 1000, &mtr);
block = mtr_page_get(0, seg_page2, NULL, &mtr);
finished = fseg_free_step(
buf_block_get_frame(block) + 1000, &mtr);
mtr_commit(&mtr);
if (finished) {
break;
}
}
mtr_start(&mtr);
seg_page2 = fseg_create(0, 0, 1000, 558, &mtr);
mtr_commit(&mtr);
i = 0;
for (;;) {
mtr_start(&mtr);
block = mtr_page_get(0, seg_page2, NULL, &mtr);
new_page2 = fseg_alloc_free_page(
buf_block_get_frame(block) + 1000,
557, FSP_DOWN, &mtr);
printf("%lu %lu; ", i, new_page2);
mtr_commit(&mtr);
if (new_page2 == FIL_NULL) {
break;
}
i++;
}
for (;;) {
mtr_start(&mtr);
block = mtr_page_get(0, seg_page, NULL, &mtr);
finished = fseg_free_step(
buf_block_get_frame(block) + 1000, &mtr);
mtr_commit(&mtr);
if (finished) {
break;
}
}
for (;;) {
mtr_start(&mtr);
block = mtr_page_get(0, seg_page2, NULL, &mtr);
finished = fseg_free_step(
buf_block_get_frame(block) + 1000, &mtr);
mtr_commit(&mtr);
if (finished) {
break;
}
}
/***************************************/
oldtm = ut_clock();
for (i = 0; i < 1000; i++) {
mtr_start(&mtr);
seg_page = fseg_create(0, 0, 1000, 555, &mtr);
mtr_commit(&mtr);
mtr_start(&mtr);
block = mtr_page_get(0, seg_page, NULL, &mtr);
new_page = fseg_alloc_free_page(buf_block_get_frame(block) + 1000,
2, FSP_UP, &mtr);
mtr_commit(&mtr);
finished = FALSE;
while (!finished) {
mtr_start(&mtr);
block = mtr_page_get(0, seg_page, NULL, &mtr);
finished = fseg_free_step(
buf_block_get_frame(block) + 1000, &mtr);
mtr_commit(&mtr);
}
}
tm = ut_clock();
printf("Wall clock time for %lu seg crea+free %lu millisecs\n",
i, tm - oldtm);
buf_validate();
buf_flush_batch(BUF_FLUSH_LIST, 500);
os_thread_sleep(1000000);
buf_all_freed();
}
/************************************************************************
Main test function. */
void
main(void)
/*======*/
{
ulint tm, oldtm;
ulint n;
oldtm = ut_clock();
os_aio_init(160, 5);
sync_init();
mem_init();
fil_init(26); /* Allow 26 open files at a time */
buf_pool_init(1000, 1000);
log_init();
buf_validate();
ut_a(fil_validate());
create_files();
create_db();
buf_validate();
test5();
/*
test1();
test3();
test4();
test2();
*/
buf_validate();
n = buf_flush_batch(BUF_FLUSH_LIST, 500);
os_thread_sleep(1000000);
buf_all_freed();
free_system();
tm = ut_clock();
printf("Wall clock time for test %lu milliseconds\n", tm - oldtm);
printf("TESTS COMPLETED SUCCESSFULLY!\n");
}

16
innobase/fsp/ts/makefile Normal file

@ -0,0 +1,16 @@
include ..\..\makefile.i
tsfsp: ..\fsp.lib tsfsp.c
$(CCOM) $(CFL) -I.. -I..\.. ..\..\btr.lib ..\..\trx.lib ..\..\pars.lib ..\..\que.lib ..\..\lock.lib ..\..\row.lib ..\..\read.lib ..\..\srv.lib ..\..\com.lib ..\..\usr.lib ..\..\thr.lib ..\..\fut.lib ..\fsp.lib ..\..\page.lib ..\..\dyn.lib ..\..\mtr.lib ..\..\log.lib ..\..\rem.lib ..\..\fil.lib ..\..\buf.lib ..\..\dict.lib ..\..\data.lib ..\..\mach.lib ..\..\ha.lib ..\..\ut.lib ..\..\sync.lib ..\..\mem.lib ..\..\os.lib tsfsp.c $(LFL)

1234
innobase/fsp/ts/tsfsp.c Normal file

File diff suppressed because it is too large

24
innobase/fut/Makefile.am Normal file

@ -0,0 +1,24 @@
# Copyright (C) 2000 MySQL AB & MySQL Finland AB & TCX DataKonsult AB
# & Innobase Oy
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
include ../include/Makefile.i
libs_LIBRARIES = libfut.a
libfut_a_SOURCES = fut0fut.c fut0lst.c
EXTRA_PROGRAMS =

14
innobase/fut/fut0fut.c Normal file

@ -0,0 +1,14 @@
/**********************************************************************
File-based utilities
(c) 1995 Innobase Oy
Created 12/13/1995 Heikki Tuuri
***********************************************************************/
#include "fut0fut.h"
#ifdef UNIV_NONINL
#include "fut0fut.ic"
#endif

516
innobase/fut/fut0lst.c Normal file

@ -0,0 +1,516 @@
/**********************************************************************
File-based list utilities
(c) 1995 Innobase Oy
Created 11/28/1995 Heikki Tuuri
***********************************************************************/
#include "fut0lst.h"
#ifdef UNIV_NONINL
#include "fut0lst.ic"
#endif
#include "buf0buf.h"
/************************************************************************
Adds a node to an empty list. */
static
void
flst_add_to_empty(
/*==============*/
flst_base_node_t* base, /* in: pointer to base node of
empty list */
flst_node_t* node, /* in: node to add */
mtr_t* mtr) /* in: mini-transaction handle */
{
ulint space;
fil_addr_t node_addr;
ulint len;
ut_ad(mtr && base && node);
ut_ad(base != node);
ut_ad(mtr_memo_contains(mtr, buf_block_align(base),
MTR_MEMO_PAGE_X_FIX));
ut_ad(mtr_memo_contains(mtr, buf_block_align(node),
MTR_MEMO_PAGE_X_FIX));
len = flst_get_len(base, mtr);
ut_a(len == 0);
buf_ptr_get_fsp_addr(node, &space, &node_addr);
/* Update first and last fields of base node */
flst_write_addr(base + FLST_FIRST, node_addr, mtr);
flst_write_addr(base + FLST_LAST, node_addr, mtr);
/* Set prev and next fields of node to add */
flst_write_addr(node + FLST_PREV, fil_addr_null, mtr);
flst_write_addr(node + FLST_NEXT, fil_addr_null, mtr);
/* Update len of base node */
mlog_write_ulint(base + FLST_LEN, len + 1, MLOG_4BYTES, mtr);
}
/************************************************************************
Adds a node as the last node in a list. */
void
flst_add_last(
/*==========*/
flst_base_node_t* base, /* in: pointer to base node of list */
flst_node_t* node, /* in: node to add */
mtr_t* mtr) /* in: mini-transaction handle */
{
ulint space;
fil_addr_t node_addr;
ulint len;
fil_addr_t last_addr;
flst_node_t* last_node;
ut_ad(mtr && base && node);
ut_ad(base != node);
ut_ad(mtr_memo_contains(mtr, buf_block_align(base),
MTR_MEMO_PAGE_X_FIX));
ut_ad(mtr_memo_contains(mtr, buf_block_align(node),
MTR_MEMO_PAGE_X_FIX));
len = flst_get_len(base, mtr);
last_addr = flst_get_last(base, mtr);
buf_ptr_get_fsp_addr(node, &space, &node_addr);
/* If the list is not empty, call flst_insert_after */
if (len != 0) {
if (last_addr.page == node_addr.page) {
last_node = buf_frame_align(node) + last_addr.boffset;
} else {
last_node = fut_get_ptr(space, last_addr, RW_X_LATCH,
mtr);
}
flst_insert_after(base, last_node, node, mtr);
} else {
/* else call flst_add_to_empty */
flst_add_to_empty(base, node, mtr);
}
}
/************************************************************************
Adds a node as the first node in a list. */
void
flst_add_first(
/*===========*/
flst_base_node_t* base, /* in: pointer to base node of list */
flst_node_t* node, /* in: node to add */
mtr_t* mtr) /* in: mini-transaction handle */
{
ulint space;
fil_addr_t node_addr;
ulint len;
fil_addr_t first_addr;
flst_node_t* first_node;
ut_ad(mtr && base && node);
ut_ad(base != node);
ut_ad(mtr_memo_contains(mtr, buf_block_align(base),
MTR_MEMO_PAGE_X_FIX));
ut_ad(mtr_memo_contains(mtr, buf_block_align(node),
MTR_MEMO_PAGE_X_FIX));
len = flst_get_len(base, mtr);
first_addr = flst_get_first(base, mtr);
buf_ptr_get_fsp_addr(node, &space, &node_addr);
/* If the list is not empty, call flst_insert_before */
if (len != 0) {
if (first_addr.page == node_addr.page) {
first_node = buf_frame_align(node)
+ first_addr.boffset;
} else {
first_node = fut_get_ptr(space, first_addr,
RW_X_LATCH, mtr);
}
flst_insert_before(base, node, first_node, mtr);
} else {
/* else call flst_add_to_empty */
flst_add_to_empty(base, node, mtr);
}
}
/************************************************************************
Inserts a node after another in a list. */
void
flst_insert_after(
/*==============*/
flst_base_node_t* base, /* in: pointer to base node of list */
flst_node_t* node1, /* in: node to insert after */
flst_node_t* node2, /* in: node to add */
mtr_t* mtr) /* in: mini-transaction handle */
{
ulint space;
fil_addr_t node1_addr;
fil_addr_t node2_addr;
flst_node_t* node3;
fil_addr_t node3_addr;
ulint len;
ut_ad(mtr && node1 && node2 && base);
ut_ad(base != node1);
ut_ad(base != node2);
ut_ad(node2 != node1);
ut_ad(mtr_memo_contains(mtr, buf_block_align(base),
MTR_MEMO_PAGE_X_FIX));
ut_ad(mtr_memo_contains(mtr, buf_block_align(node1),
MTR_MEMO_PAGE_X_FIX));
ut_ad(mtr_memo_contains(mtr, buf_block_align(node2),
MTR_MEMO_PAGE_X_FIX));
buf_ptr_get_fsp_addr(node1, &space, &node1_addr);
buf_ptr_get_fsp_addr(node2, &space, &node2_addr);
node3_addr = flst_get_next_addr(node1, mtr);
/* Set prev and next fields of node2 */
flst_write_addr(node2 + FLST_PREV, node1_addr, mtr);
flst_write_addr(node2 + FLST_NEXT, node3_addr, mtr);
if (!fil_addr_is_null(node3_addr)) {
/* Update prev field of node3 */
node3 = fut_get_ptr(space, node3_addr, RW_X_LATCH, mtr);
flst_write_addr(node3 + FLST_PREV, node2_addr, mtr);
} else {
/* node1 was last in list: update last field in base */
flst_write_addr(base + FLST_LAST, node2_addr, mtr);
}
/* Set next field of node1 */
flst_write_addr(node1 + FLST_NEXT, node2_addr, mtr);
/* Update len of base node */
len = flst_get_len(base, mtr);
mlog_write_ulint(base + FLST_LEN, len + 1, MLOG_4BYTES, mtr);
}
/************************************************************************
Inserts a node before another in a list. */
void
flst_insert_before(
/*===============*/
flst_base_node_t* base, /* in: pointer to base node of list */
flst_node_t* node2, /* in: node to insert */
flst_node_t* node3, /* in: node to insert before */
mtr_t* mtr) /* in: mini-transaction handle */
{
ulint space;
flst_node_t* node1;
fil_addr_t node1_addr;
fil_addr_t node2_addr;
fil_addr_t node3_addr;
ulint len;
ut_ad(mtr && node2 && node3 && base);
ut_ad(base != node2);
ut_ad(base != node3);
ut_ad(node2 != node3);
ut_ad(mtr_memo_contains(mtr, buf_block_align(base),
MTR_MEMO_PAGE_X_FIX));
ut_ad(mtr_memo_contains(mtr, buf_block_align(node2),
MTR_MEMO_PAGE_X_FIX));
ut_ad(mtr_memo_contains(mtr, buf_block_align(node3),
MTR_MEMO_PAGE_X_FIX));
buf_ptr_get_fsp_addr(node2, &space, &node2_addr);
buf_ptr_get_fsp_addr(node3, &space, &node3_addr);
node1_addr = flst_get_prev_addr(node3, mtr);
/* Set prev and next fields of node2 */
flst_write_addr(node2 + FLST_PREV, node1_addr, mtr);
flst_write_addr(node2 + FLST_NEXT, node3_addr, mtr);
if (!fil_addr_is_null(node1_addr)) {
/* Update next field of node1 */
node1 = fut_get_ptr(space, node1_addr, RW_X_LATCH, mtr);
flst_write_addr(node1 + FLST_NEXT, node2_addr, mtr);
} else {
/* node3 was first in list: update first field in base */
flst_write_addr(base + FLST_FIRST, node2_addr, mtr);
}
/* Set prev field of node3 */
flst_write_addr(node3 + FLST_PREV, node2_addr, mtr);
/* Update len of base node */
len = flst_get_len(base, mtr);
mlog_write_ulint(base + FLST_LEN, len + 1, MLOG_4BYTES, mtr);
}
/************************************************************************
Removes a node. */
void
flst_remove(
/*========*/
flst_base_node_t* base, /* in: pointer to base node of list */
flst_node_t* node2, /* in: node to remove */
mtr_t* mtr) /* in: mini-transaction handle */
{
ulint space;
flst_node_t* node1;
fil_addr_t node1_addr;
fil_addr_t node2_addr;
flst_node_t* node3;
fil_addr_t node3_addr;
ulint len;
ut_ad(mtr && node2 && base);
ut_ad(mtr_memo_contains(mtr, buf_block_align(base),
MTR_MEMO_PAGE_X_FIX));
ut_ad(mtr_memo_contains(mtr, buf_block_align(node2),
MTR_MEMO_PAGE_X_FIX));
buf_ptr_get_fsp_addr(node2, &space, &node2_addr);
node1_addr = flst_get_prev_addr(node2, mtr);
node3_addr = flst_get_next_addr(node2, mtr);
if (!fil_addr_is_null(node1_addr)) {
/* Update next field of node1 */
if (node1_addr.page == node2_addr.page) {
node1 = buf_frame_align(node2) + node1_addr.boffset;
} else {
node1 = fut_get_ptr(space, node1_addr, RW_X_LATCH,
mtr);
}
ut_ad(node1 != node2);
flst_write_addr(node1 + FLST_NEXT, node3_addr, mtr);
} else {
/* node2 was first in list: update first field in base */
flst_write_addr(base + FLST_FIRST, node3_addr, mtr);
}
if (!fil_addr_is_null(node3_addr)) {
/* Update prev field of node3 */
if (node3_addr.page == node2_addr.page) {
node3 = buf_frame_align(node2) + node3_addr.boffset;
} else {
node3 = fut_get_ptr(space, node3_addr, RW_X_LATCH,
mtr);
}
ut_ad(node2 != node3);
flst_write_addr(node3 + FLST_PREV, node1_addr, mtr);
} else {
/* node2 was last in list: update last field in base */
flst_write_addr(base + FLST_LAST, node1_addr, mtr);
}
/* Update len of base node */
len = flst_get_len(base, mtr);
ut_ad(len > 0);
mlog_write_ulint(base + FLST_LEN, len - 1, MLOG_4BYTES, mtr);
}
/************************************************************************
Cuts off the tail of the list, including the node given. The number of
nodes which will be removed must be provided by the caller, as this function
does not measure the length of the tail. */
void
flst_cut_end(
/*=========*/
flst_base_node_t* base, /* in: pointer to base node of list */
flst_node_t* node2, /* in: first node to remove */
ulint n_nodes,/* in: number of nodes to remove,
must be >= 1 */
mtr_t* mtr) /* in: mini-transaction handle */
{
ulint space;
flst_node_t* node1;
fil_addr_t node1_addr;
fil_addr_t node2_addr;
ulint len;
ut_ad(mtr && node2 && base);
ut_ad(mtr_memo_contains(mtr, buf_block_align(base),
MTR_MEMO_PAGE_X_FIX));
ut_ad(mtr_memo_contains(mtr, buf_block_align(node2),
MTR_MEMO_PAGE_X_FIX));
ut_ad(n_nodes > 0);
buf_ptr_get_fsp_addr(node2, &space, &node2_addr);
node1_addr = flst_get_prev_addr(node2, mtr);
if (!fil_addr_is_null(node1_addr)) {
/* Update next field of node1 */
if (node1_addr.page == node2_addr.page) {
node1 = buf_frame_align(node2) + node1_addr.boffset;
} else {
node1 = fut_get_ptr(space, node1_addr, RW_X_LATCH,
mtr);
}
flst_write_addr(node1 + FLST_NEXT, fil_addr_null, mtr);
} else {
/* node2 was first in list: update the field in base */
flst_write_addr(base + FLST_FIRST, fil_addr_null, mtr);
}
flst_write_addr(base + FLST_LAST, node1_addr, mtr);
/* Update len of base node */
len = flst_get_len(base, mtr);
ut_ad(len >= n_nodes);
mlog_write_ulint(base + FLST_LEN, len - n_nodes, MLOG_4BYTES, mtr);
}
/************************************************************************
Cuts off the tail of the list, not including the given node. The number of
nodes which will be removed must be provided by the caller, as this function
does not measure the length of the tail. */
void
flst_truncate_end(
/*==============*/
flst_base_node_t* base, /* in: pointer to base node of list */
flst_node_t* node2, /* in: first node not to remove */
ulint n_nodes,/* in: number of nodes to remove */
mtr_t* mtr) /* in: mini-transaction handle */
{
fil_addr_t node2_addr;
ulint len;
ulint space;
ut_ad(mtr && node2 && base);
ut_ad(mtr_memo_contains(mtr, buf_block_align(base),
MTR_MEMO_PAGE_X_FIX));
ut_ad(mtr_memo_contains(mtr, buf_block_align(node2),
MTR_MEMO_PAGE_X_FIX));
if (n_nodes == 0) {
ut_ad(fil_addr_is_null(flst_get_next_addr(node2, mtr)));
return;
}
buf_ptr_get_fsp_addr(node2, &space, &node2_addr);
/* Update next field of node2 */
flst_write_addr(node2 + FLST_NEXT, fil_addr_null, mtr);
flst_write_addr(base + FLST_LAST, node2_addr, mtr);
/* Update len of base node */
len = flst_get_len(base, mtr);
ut_ad(len >= n_nodes);
mlog_write_ulint(base + FLST_LEN, len - n_nodes, MLOG_4BYTES, mtr);
}
/************************************************************************
Validates a file-based list. */
ibool
flst_validate(
/*==========*/
/* out: TRUE if ok */
flst_base_node_t* base, /* in: pointer to base node of list */
mtr_t* mtr1) /* in: mtr */
{
ulint space;
flst_node_t* node;
fil_addr_t node_addr;
fil_addr_t base_addr;
ulint len;
ulint i;
mtr_t mtr2;
ut_ad(base);
ut_ad(mtr_memo_contains(mtr1, buf_block_align(base),
MTR_MEMO_PAGE_X_FIX));
/* We use two mini-transaction handles: the first is used to
lock the base node, and prevent other threads from modifying the
list. The second is used to traverse the list. We cannot run the
second mtr without committing it at times, because if the list
is long, then the x-locked pages could fill the buffer resulting
in a deadlock. */
/* Find out the space id */
buf_ptr_get_fsp_addr(base, &space, &base_addr);
len = flst_get_len(base, mtr1);
node_addr = flst_get_first(base, mtr1);
for (i = 0; i < len; i++) {
mtr_start(&mtr2);
node = fut_get_ptr(space, node_addr, RW_X_LATCH, &mtr2);
node_addr = flst_get_next_addr(node, &mtr2);
mtr_commit(&mtr2); /* Commit mtr2 each round to prevent buffer
becoming full */
}
ut_a(fil_addr_is_null(node_addr));
node_addr = flst_get_last(base, mtr1);
for (i = 0; i < len; i++) {
mtr_start(&mtr2);
node = fut_get_ptr(space, node_addr, RW_X_LATCH, &mtr2);
node_addr = flst_get_prev_addr(node, &mtr2);
mtr_commit(&mtr2); /* Commit mtr2 each round to prevent buffer
becoming full */
}
ut_a(fil_addr_is_null(node_addr));
return(TRUE);
}
/************************************************************************
Prints info of a file-based list. */
void
flst_print(
/*=======*/
flst_base_node_t* base, /* in: pointer to base node of list */
mtr_t* mtr) /* in: mtr */
{
buf_frame_t* frame;
ulint len;
ut_ad(base && mtr);
ut_ad(mtr_memo_contains(mtr, buf_block_align(base),
MTR_MEMO_PAGE_X_FIX));
frame = buf_frame_align(base);
len = flst_get_len(base, mtr);
printf("FILE-BASED LIST:\n");
printf("Base node in space %lu page %lu byte offset %lu; len %lu\n",
buf_frame_get_space_id(frame), buf_frame_get_page_no(frame),
base - frame, len);
}
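
The pointer updates done by flst_insert_before and flst_remove above can be sketched with an in-memory analogue. This is only an illustration under stated assumptions: plain pointers stand in for fil_addr_t file addresses, no mini-transaction logging or page latching is done, and the names base_t, node_t, list_init, list_add_last, list_insert_before and list_remove are hypothetical, not part of the Innobase sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory stand-ins for flst_node_t and flst_base_node_t */
typedef struct node_struct	node_t;
struct node_struct {
	node_t*	prev;	/* plays the role of the FLST_PREV field */
	node_t*	next;	/* plays the role of the FLST_NEXT field */
};

typedef struct {
	node_t*		first;	/* plays the role of FLST_FIRST */
	node_t*		last;	/* plays the role of FLST_LAST */
	unsigned long	len;	/* plays the role of FLST_LEN */
} base_t;

/* Initializes an empty list */
static void
list_init(base_t* base)
{
	base->first = NULL;
	base->last = NULL;
	base->len = 0;
}

/* Appends a node as the last node of the list */
static void
list_add_last(base_t* base, node_t* node)
{
	node->prev = base->last;
	node->next = NULL;
	if (base->last != NULL) {
		base->last->next = node;
	} else {
		base->first = node;
	}
	base->last = node;
	base->len++;
}

/* Inserts node2 before node3, in the same order of steps as
flst_insert_before */
static void
list_insert_before(base_t* base, node_t* node2, node_t* node3)
{
	node_t*	node1 = node3->prev;

	assert(node2 != node3);
	/* Set prev and next fields of node2 */
	node2->prev = node1;
	node2->next = node3;
	if (node1 != NULL) {
		/* Update next field of node1 */
		node1->next = node2;
	} else {
		/* node3 was first in list: update first field in base */
		base->first = node2;
	}
	/* Set prev field of node3 */
	node3->prev = node2;
	base->len++;
}

/* Removes node2, in the same order of steps as flst_remove */
static void
list_remove(base_t* base, node_t* node2)
{
	node_t*	node1 = node2->prev;
	node_t*	node3 = node2->next;

	assert(base->len > 0);
	if (node1 != NULL) {
		node1->next = node3;
	} else {
		/* node2 was first in list */
		base->first = node3;
	}
	if (node3 != NULL) {
		node3->prev = node1;
	} else {
		/* node2 was last in list */
		base->last = node1;
	}
	base->len--;
}
```

The base node caches first, last and len, so every operation has to keep all three consistent. That is why each function in fut0lst.c ends by rewriting FLST_LEN.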

12
innobase/fut/makefilewin Normal file

@ -0,0 +1,12 @@
include ..\include\makefile.i
fut.lib: fut0lst.obj fut0fut.obj
lib -out:..\libs\fut.lib fut0lst.obj fut0fut.obj
fut0lst.obj: fut0lst.c
$(CCOM) $(CFL) -c fut0lst.c
fut0fut.obj: fut0fut.c
$(CCOM) $(CFL) -c fut0fut.c

24
innobase/ha/Makefile.am Normal file

@ -0,0 +1,24 @@
# Copyright (C) 2000 MySQL AB & MySQL Finland AB & TCX DataKonsult AB
# & Innobase Oy
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
include ../include/Makefile.i
libs_LIBRARIES = libha.a
libha_a_SOURCES = ha0ha.c hash0hash.c
EXTRA_PROGRAMS =

317
innobase/ha/ha0ha.c Normal file

@ -0,0 +1,317 @@
/************************************************************************
The hash table with external chains
(c) 1994-1997 Innobase Oy
Created 8/22/1994 Heikki Tuuri
*************************************************************************/
#include "ha0ha.h"
#ifdef UNIV_NONINL
#include "ha0ha.ic"
#endif
#include "buf0buf.h"
/*****************************************************************
Creates a hash table with >= n array cells. The actual number of cells is
chosen to be a prime number slightly bigger than n. */
hash_table_t*
ha_create(
/*======*/
/* out, own: created table */
ibool in_btr_search, /* in: TRUE if the hash table is used in
the btr_search module */
ulint n, /* in: number of array cells */
ulint n_mutexes, /* in: number of mutexes to protect the
hash table: must be a power of 2, or 0 */
ulint mutex_level) /* in: level of the mutexes in the latching
order: this is used in the debug version */
{
hash_table_t* table;
ulint i;
table = hash_create(n);
if (n_mutexes == 0) {
if (in_btr_search) {
table->heap = mem_heap_create_in_btr_search(4096);
} else {
table->heap = mem_heap_create_in_buffer(4096);
}
return(table);
}
hash_create_mutexes(table, n_mutexes, mutex_level);
table->heaps = mem_alloc(n_mutexes * sizeof(void*));
for (i = 0; i < n_mutexes; i++) {
if (in_btr_search) {
table->heaps[i] = mem_heap_create_in_btr_search(4096);
} else {
table->heaps[i] = mem_heap_create_in_buffer(4096);
}
}
return(table);
}
/*****************************************************************
Checks that a hash table node is in the chain. */
ibool
ha_node_in_chain(
/*=============*/
/* out: TRUE if in chain */
hash_cell_t* cell, /* in: hash table cell */
ha_node_t* node) /* in: external chain node */
{
ha_node_t* node2;
node2 = cell->node;
while (node2 != NULL) {
if (node2 == node) {
return(TRUE);
}
node2 = node2->next;
}
return(FALSE);
}
/*****************************************************************
Inserts an entry into a hash table. If an entry with the same fold number
is found, its node is updated to point to the new data, and no new node
is inserted. */
ibool
ha_insert_for_fold(
/*===============*/
/* out: TRUE if succeeded, FALSE if no more
memory could be allocated */
hash_table_t* table, /* in: hash table */
ulint fold, /* in: folded value of data; if a node with
the same fold value already exists, it is
updated to point to the new data, and no new
node is created! */
void* data) /* in: data, must not be NULL */
{
hash_cell_t* cell;
ha_node_t* node;
ha_node_t* prev_node;
ulint hash;
ut_ad(table && data);
ut_ad(!table->mutexes || mutex_own(hash_get_mutex(table, fold)));
hash = hash_calc_hash(fold, table);
cell = hash_get_nth_cell(table, hash);
prev_node = cell->node;
while (prev_node != NULL) {
if (prev_node->fold == fold) {
prev_node->data = data;
return(TRUE);
}
prev_node = prev_node->next;
}
/* We have to allocate a new chain node */
node = mem_heap_alloc(hash_get_heap(table, fold), sizeof(ha_node_t));
if (node == NULL) {
/* It was a btr search type memory heap and at the moment
no more memory could be allocated: return */
ut_ad(hash_get_heap(table, fold)->type & MEM_HEAP_BTR_SEARCH);
return(FALSE);
}
ha_node_set_data(node, data);
node->fold = fold;
node->next = NULL;
prev_node = cell->node;
if (prev_node == NULL) {
cell->node = node;
return(TRUE);
}
while (prev_node->next != NULL) {
prev_node = prev_node->next;
}
prev_node->next = node;
return(TRUE);
}
/***************************************************************
Deletes a hash node. */
void
ha_delete_hash_node(
/*================*/
hash_table_t* table, /* in: hash table */
ha_node_t* del_node) /* in: node to be deleted */
{
HASH_DELETE_AND_COMPACT(ha_node_t, next, table, del_node);
}
/*****************************************************************
Deletes an entry from a hash table. */
void
ha_delete(
/*======*/
hash_table_t* table, /* in: hash table */
ulint fold, /* in: folded value of data */
void* data) /* in: data, must not be NULL and must exist
in the hash table */
{
ha_node_t* node;
ut_ad(!table->mutexes || mutex_own(hash_get_mutex(table, fold)));
node = ha_search_with_data(table, fold, data);
ut_ad(node);
ha_delete_hash_node(table, node);
}
/*********************************************************************
Removes from the chain determined by fold all nodes whose data pointer
points to the page given. */
void
ha_remove_all_nodes_to_page(
/*========================*/
hash_table_t* table, /* in: hash table */
ulint fold, /* in: fold value */
page_t* page) /* in: buffer page */
{
ha_node_t* node;
ut_ad(!table->mutexes || mutex_own(hash_get_mutex(table, fold)));
node = ha_chain_get_first(table, fold);
while (node) {
if (buf_frame_align(ha_node_get_data(node)) == page) {
/* Remove the hash node */
ha_delete_hash_node(table, node);
/* Start again from the first node in the chain
because the deletion may compact the heap of
nodes and move other nodes! */
node = ha_chain_get_first(table, fold);
} else {
node = ha_chain_get_next(table, node);
}
}
}
/*****************************************************************
Validates a hash table. */
ibool
ha_validate(
/*========*/
/* out: TRUE if ok */
hash_table_t* table) /* in: hash table */
{
hash_cell_t* cell;
ha_node_t* node;
ulint i;
for (i = 0; i < hash_get_n_cells(table); i++) {
cell = hash_get_nth_cell(table, i);
node = cell->node;
while (node) {
ut_a(hash_calc_hash(node->fold, table) == i);
node = node->next;
}
}
return(TRUE);
}
/*****************************************************************
Prints info of a hash table. */
void
ha_print_info(
/*==========*/
hash_table_t* table) /* in: hash table */
{
hash_cell_t* cell;
ha_node_t* node;
ulint nodes = 0;
ulint cells = 0;
ulint len = 0;
ulint max_len = 0;
ulint i;
for (i = 0; i < hash_get_n_cells(table); i++) {
cell = hash_get_nth_cell(table, i);
if (cell->node) {
cells++;
len = 0;
node = cell->node;
for (;;) {
len++;
nodes++;
if (ha_chain_get_next(table, node) == NULL) {
break;
}
node = node->next;
}
if (len > max_len) {
max_len = len;
}
}
}
printf("Hash table size %lu, used cells %lu, nodes %lu\n",
hash_get_n_cells(table), cells, nodes);
printf("max chain length %lu\n", max_len);
ut_a(ha_validate(table));
}
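
The update-or-append behaviour of ha_insert_for_fold above can be sketched in a self-contained form. This is a minimal illustration, not the Innobase implementation: it uses malloc instead of memory heaps, takes no mutexes, and assumes the cell is chosen as fold modulo the number of cells (hash_calc_hash itself is not shown in this file). All names beginning with chain_ are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical external chain node, modelled on ha_node_t */
typedef struct chain_node_struct	chain_node_t;
struct chain_node_struct {
	unsigned long	fold;	/* fold value of the data */
	void*		data;	/* pointer to the data */
	chain_node_t*	next;	/* next node in the chain, or NULL */
};

/* Hypothetical table: an array of chain heads */
typedef struct {
	chain_node_t**	cells;
	unsigned long	n_cells;
} chain_table_t;

/* Creates a table with n_cells empty chains; returns NULL if out of memory */
static chain_table_t*
chain_table_create(unsigned long n_cells)
{
	chain_table_t*	table;
	unsigned long	i;

	assert(n_cells > 0);
	table = malloc(sizeof(chain_table_t));
	if (table == NULL) {
		return(NULL);
	}
	table->cells = malloc(n_cells * sizeof(chain_node_t*));
	if (table->cells == NULL) {
		free(table);
		return(NULL);
	}
	for (i = 0; i < n_cells; i++) {
		table->cells[i] = NULL;
	}
	table->n_cells = n_cells;
	return(table);
}

/* Inserts data for fold. If a node with the same fold already exists, only
its data pointer is replaced. Returns 1 on success, 0 if out of memory. */
static int
chain_insert_for_fold(chain_table_t* table, unsigned long fold, void* data)
{
	/* Assumption: the cell index is fold modulo the cell count */
	chain_node_t**	link = &table->cells[fold % table->n_cells];
	chain_node_t*	node;

	assert(data != NULL);
	while (*link != NULL) {
		if ((*link)->fold == fold) {
			(*link)->data = data;
			return(1);
		}
		link = &(*link)->next;
	}
	/* No node with this fold: append a new one at the end of the chain */
	node = malloc(sizeof(chain_node_t));
	if (node == NULL) {
		return(0);
	}
	node->fold = fold;
	node->data = data;
	node->next = NULL;
	*link = node;
	return(1);
}

/* Returns the data stored for fold, or NULL if there is none */
static void*
chain_search(const chain_table_t* table, unsigned long fold)
{
	chain_node_t*	node = table->cells[fold % table->n_cells];

	while (node != NULL) {
		if (node->fold == fold) {
			return(node->data);
		}
		node = node->next;
	}
	return(NULL);
}

/* Counts all nodes in the table */
static unsigned long
chain_count(const chain_table_t* table)
{
	unsigned long	i;
	unsigned long	n = 0;
	chain_node_t*	node;

	for (i = 0; i < table->n_cells; i++) {
		for (node = table->cells[i]; node != NULL; node = node->next) {
			n++;
		}
	}
	return(n);
}
```

A consequence worth noting: two inserts with the same fold leave exactly one node in the chain, and that node points to the data of the later insert.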

152
innobase/ha/hash0hash.c Normal file

@ -0,0 +1,152 @@
/******************************************************
The simple hash table utility
(c) 1997 Innobase Oy
Created 5/20/1997 Heikki Tuuri
*******************************************************/
#include "hash0hash.h"
#ifdef UNIV_NONINL
#include "hash0hash.ic"
#endif
#include "mem0mem.h"
/****************************************************************
Reserves the mutex for a fold value in a hash table. */
void
hash_mutex_enter(
/*=============*/
hash_table_t* table, /* in: hash table */
ulint fold) /* in: fold */
{
mutex_enter(hash_get_mutex(table, fold));
}
/****************************************************************
Releases the mutex for a fold value in a hash table. */
void
hash_mutex_exit(
/*============*/
hash_table_t* table, /* in: hash table */
ulint fold) /* in: fold */
{
mutex_exit(hash_get_mutex(table, fold));
}
/****************************************************************
Reserves all the mutexes of a hash table, in an ascending order. */
void
hash_mutex_enter_all(
/*=================*/
hash_table_t* table) /* in: hash table */
{
ulint i;
for (i = 0; i < table->n_mutexes; i++) {
mutex_enter(table->mutexes + i);
}
}
/****************************************************************
Releases all the mutexes of a hash table. */
void
hash_mutex_exit_all(
/*================*/
hash_table_t* table) /* in: hash table */
{
ulint i;
for (i = 0; i < table->n_mutexes; i++) {
mutex_exit(table->mutexes + i);
}
}
/*****************************************************************
Creates a hash table with >= n array cells. The actual number of cells is
chosen to be a prime number slightly bigger than n. */
hash_table_t*
hash_create(
/*========*/
/* out, own: created table */
ulint n) /* in: number of array cells */
{
hash_cell_t* array;
ulint prime;
hash_table_t* table;
ulint i;
hash_cell_t* cell;
prime = ut_find_prime(n);
table = mem_alloc(sizeof(hash_table_t));
array = ut_malloc(sizeof(hash_cell_t) * prime);
table->array = array;
table->n_cells = prime;
table->n_mutexes = 0;
table->mutexes = NULL;
table->heaps = NULL;
table->heap = NULL;
table->magic_n = HASH_TABLE_MAGIC_N;
/* Initialize the cell array */
for (i = 0; i < prime; i++) {
cell = hash_get_nth_cell(table, i);
cell->node = NULL;
}
return(table);
}
/*****************************************************************
Frees a hash table. */
void
hash_table_free(
/*============*/
hash_table_t* table) /* in, own: hash table */
{
ut_a(table->mutexes == NULL);
ut_free(table->array);
mem_free(table);
}
/*****************************************************************
Creates a mutex array to protect a hash table. */
void
hash_create_mutexes(
/*================*/
hash_table_t* table, /* in: hash table */
ulint n_mutexes, /* in: number of mutexes, must be a
power of 2 */
ulint sync_level) /* in: latching order level of the
mutexes: used in the debug version */
{
ulint i;
ut_a(n_mutexes == ut_2_power_up(n_mutexes));
table->mutexes = mem_alloc(n_mutexes * sizeof(mutex_t));
for (i = 0; i < n_mutexes; i++) {
mutex_create(table->mutexes + i);
mutex_set_level(table->mutexes + i, sync_level);
}
table->n_mutexes = n_mutexes;
}
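
hash_create_mutexes above insists that the number of mutexes is a power of 2. One common reason for such a requirement is that a fold value can then be mapped to a mutex with a single bit mask. The sketch below shows that idea only. The mapping used by hash_get_mutex is not part of this file, so the masking rule here is an assumption, and is_power_of_2 and stripe_for_fold are hypothetical names.

```c
#include <assert.h>

/* Returns 1 if n is a power of 2, 0 otherwise (0 is not a power of 2) */
static int
is_power_of_2(unsigned long n)
{
	return(n != 0 && (n & (n - 1)) == 0);
}

/* Maps a fold value to a mutex index in the range 0 .. n_mutexes - 1.
Assumption: the index is the remainder of fold divided by n_mutexes, which
for a power of 2 reduces to masking the low bits. */
static unsigned long
stripe_for_fold(unsigned long fold, unsigned long n_mutexes)
{
	assert(is_power_of_2(n_mutexes));
	return(fold & (n_mutexes - 1));
}
```

Under this scheme all fold values that share the same low bits share one mutex. A thread that needs every mutex can take them in ascending index order, as hash_mutex_enter_all does, so that two such threads cannot deadlock against each other.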

10
innobase/ha/makefilewin Normal file

@ -0,0 +1,10 @@
include ..\include\makefile.i
ha.lib: ha0ha.obj hash0hash.obj
lib -out:..\libs\ha.lib ha0ha.obj hash0hash.obj
ha0ha.obj: ha0ha.c
$(CCOM) $(CFL) -c ha0ha.c
hash0hash.obj: hash0hash.c
$(CCOM) $(CFL) -c hash0hash.c

12
innobase/ha/ts/makefile Normal file

@ -0,0 +1,12 @@
include ..\..\makefile.i
tsha: ..\ha.lib tsha.c makefile
$(CCOM) $(CFL) -I.. -I..\.. ..\..\btr.lib ..\..\trx.lib ..\..\pars.lib ..\..\que.lib ..\..\lock.lib ..\..\row.lib ..\..\read.lib ..\..\srv.lib ..\..\com.lib ..\..\usr.lib ..\..\thr.lib ..\..\fut.lib ..\..\fsp.lib ..\..\page.lib ..\..\dyn.lib ..\..\mtr.lib ..\..\log.lib ..\..\rem.lib ..\..\fil.lib ..\..\buf.lib ..\..\dict.lib ..\..\data.lib ..\..\mach.lib ..\ha.lib ..\..\ut.lib ..\..\sync.lib ..\..\mem.lib ..\..\os.lib tsha.c $(LFL)

120
innobase/ha/ts/tsha.c Normal file

@ -0,0 +1,120 @@
/************************************************************************
The test module for hash table
(c) 1994, 1995 Innobase Oy
Created 1/25/1994 Heikki Tuuri
*************************************************************************/
#include "ut0ut.h"
#include "ha0ha.h"
#include "mem0mem.h"
#include "sync0sync.h"
ulint ulint_array[200000];
void
test1(void)
{
hash_table_t* table1;
ulint i;
ulint n313 = 313;
ulint n414 = 414;
printf("------------------------------------------------\n");
printf("TEST 1. BASIC TEST\n");
table1 = ha_create(FALSE, 50000, 0, 0);
ha_insert_for_fold(table1, 313, &n313);
ha_insert_for_fold(table1, 313, &n414);
ut_a(ha_validate(table1));
ha_delete(table1, 313, &n313);
ha_delete(table1, 313, &n414);
ut_a(ha_validate(table1));
printf("------------------------------------------------\n");
printf("TEST 2. TEST OF MASSIVE INSERTS AND DELETES\n");
table1 = ha_create(FALSE, 10000, 0, 0);
for (i = 0; i < 200000; i++) {
ulint_array[i] = i;
}
for (i = 0; i < 50000; i++) {
ha_insert_for_fold(table1, i * 7, ulint_array + i);
}
ut_a(ha_validate(table1));
for (i = 0; i < 50000; i++) {
ha_delete(table1, i * 7, ulint_array + i);
}
ut_a(ha_validate(table1));
}
void
test2(void)
{
hash_table_t* table1;
ulint i;
ulint oldtm, tm;
ha_node_t* node;
printf("------------------------------------------------\n");
printf("TEST 3. SPEED TEST\n");
table1 = ha_create(FALSE, 300000, 0, 0);
oldtm = ut_clock();
for (i = 0; i < 200000; i++) {
ha_insert_for_fold(table1, i * 27877, ulint_array + i);
}
tm = ut_clock();
printf("Wall clock time for %lu inserts %lu millisecs\n",
i, tm - oldtm);
oldtm = ut_clock();
for (i = 0; i < 200000; i++) {
node = ha_search(table1, i * 27877);
}
tm = ut_clock();
printf("Wall clock time for %lu searches %lu millisecs\n",
i, tm - oldtm);
oldtm = ut_clock();
for (i = 0; i < 200000; i++) {
ha_delete(table1, i * 27877, ulint_array + i);
}
tm = ut_clock();
printf("Wall clock time for %lu deletes %lu millisecs\n",
i, tm - oldtm);
}
int
main(void)
{
sync_init();
mem_init(1000000);
test1();
test2();
printf("TESTS COMPLETED SUCCESSFULLY!\n");
return(0);
}

22
innobase/ib_config.h Normal file

@ -0,0 +1,22 @@
/* ib_config.h. Generated automatically by configure. */
/* ib_config.h.in. Generated automatically from configure.in by autoheader. */
/* Define as __inline if that's what the C compiler calls it. */
/* #undef inline */
/* Define if your processor stores words with the most significant
byte first (like Motorola and SPARC, unlike Intel and VAX). */
/* #undef WORDS_BIGENDIAN */
/* The number of bytes in a int. */
#define SIZEOF_INT 4
/* Define if you have the <aio.h> header file. */
#define HAVE_AIO_H 1
/* Name of package */
#define PACKAGE "ib"
/* Version number of package */
#define VERSION "0.90"

21
innobase/ib_config.h.in Normal file

@ -0,0 +1,21 @@
/* ib_config.h.in. Generated automatically from configure.in by autoheader. */
/* Define as __inline if that's what the C compiler calls it. */
#undef inline
/* Define if your processor stores words with the most significant
byte first (like Motorola and SPARC, unlike Intel and VAX). */
#undef WORDS_BIGENDIAN
/* The number of bytes in a int. */
#undef SIZEOF_INT
/* Define if you have the <aio.h> header file. */
#undef HAVE_AIO_H
/* Name of package */
#undef PACKAGE
/* Version number of package */
#undef VERSION

24
innobase/ibuf/Makefile.am Normal file

@ -0,0 +1,24 @@
# Copyright (C) 2000 MySQL AB & MySQL Finland AB & TCX DataKonsult AB
# & Innobase Oy
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
include ../include/Makefile.i
libs_LIBRARIES = libibuf.a
libibuf_a_SOURCES = ibuf0ibuf.c
EXTRA_PROGRAMS =

2617
innobase/ibuf/ibuf0ibuf.c Normal file

File diff suppressed because it is too large


@ -0,0 +1,7 @@
include ..\include\makefile.i
ibuf.lib: ibuf0ibuf.obj
lib -out:..\libs\ibuf.lib ibuf0ibuf.obj
ibuf0ibuf.obj: ibuf0ibuf.c
$(CCOM) $(CFL) -c ibuf0ibuf.c


@ -0,0 +1,5 @@
# Makefile included in Makefile.am in every subdirectory
libsdir = ../libs
INCLUDES = -I../../include -I../include

391
innobase/include/btr0btr.h Normal file

@ -0,0 +1,391 @@
/******************************************************
The B-tree
(c) 1994-1996 Innobase Oy
Created 6/2/1994 Heikki Tuuri
*******************************************************/
#ifndef btr0btr_h
#define btr0btr_h
#include "univ.i"
#include "dict0dict.h"
#include "data0data.h"
#include "page0cur.h"
#include "rem0rec.h"
#include "mtr0mtr.h"
#include "btr0types.h"
/* Maximum record size which can be stored on a page, without using the
special big record storage structure */
#define BTR_PAGE_MAX_REC_SIZE (UNIV_PAGE_SIZE / 2 - 200)
/* Maximum key size in a B-tree: the records on non-leaf levels must be
shorter than this */
#define BTR_PAGE_MAX_KEY_SIZE 1024
/* If data in page drops below this limit, we try to compress it.
NOTE! The value has to be > 2 * BTR_PAGE_MAX_KEY_SIZE */
#define BTR_COMPRESS_LIMIT (UNIV_PAGE_SIZE / 4 + 1)
/* Latching modes for the search function (in btr0cur.*) */
#define BTR_SEARCH_LEAF RW_S_LATCH
#define BTR_MODIFY_LEAF RW_X_LATCH
#define BTR_NO_LATCHES RW_NO_LATCH
#define BTR_MODIFY_TREE 33
#define BTR_CONT_MODIFY_TREE 34
#define BTR_SEARCH_PREV 35
#define BTR_MODIFY_PREV 36
/* If this is ORed to the latch mode, it means that the search tuple will be
inserted to the index, at the searched position */
#define BTR_INSERT 512
/* This flag ORed to latch mode says that we do the search in query
optimization */
#define BTR_ESTIMATE 1024
/******************************************************************
Gets a buffer page and declares its latching order level. */
UNIV_INLINE
page_t*
btr_page_get(
/*=========*/
ulint space, /* in: space id */
ulint page_no, /* in: page number */
ulint mode, /* in: latch mode */
mtr_t* mtr); /* in: mtr */
/******************************************************************
Gets the index id field of a page. */
UNIV_INLINE
dulint
btr_page_get_index_id(
/*==================*/
/* out: index id */
page_t* page); /* in: index page */
/************************************************************
Gets the node level field in an index page. */
UNIV_INLINE
ulint
btr_page_get_level_low(
/*===================*/
/* out: level, leaf level == 0 */
page_t* page); /* in: index page */
/************************************************************
Gets the node level field in an index page. */
UNIV_INLINE
ulint
btr_page_get_level(
/*===============*/
/* out: level, leaf level == 0 */
page_t* page, /* in: index page */
mtr_t* mtr); /* in: mini-transaction handle */
/************************************************************
Gets the next index page number. */
UNIV_INLINE
ulint
btr_page_get_next(
/*==============*/
/* out: next page number */
page_t* page, /* in: index page */
mtr_t* mtr); /* in: mini-transaction handle */
/************************************************************
Gets the previous index page number. */
UNIV_INLINE
ulint
btr_page_get_prev(
/*==============*/
/* out: prev page number */
page_t* page, /* in: index page */
mtr_t* mtr); /* in: mini-transaction handle */
/*****************************************************************
Gets pointer to the previous user record in the tree. It is assumed
that the caller has appropriate latches on the page and its neighbor. */
rec_t*
btr_get_prev_user_rec(
/*==================*/
/* out: previous user record, NULL if there is none */
rec_t* rec, /* in: record on leaf level */
mtr_t* mtr); /* in: mtr holding a latch on the page, and if
needed, also to the previous page */
/*****************************************************************
Gets pointer to the next user record in the tree. It is assumed
that the caller has appropriate latches on the page and its neighbor. */
rec_t*
btr_get_next_user_rec(
/*==================*/
/* out: next user record, NULL if there is none */
rec_t* rec, /* in: record on leaf level */
mtr_t* mtr); /* in: mtr holding a latch on the page, and if
needed, also to the next page */
/******************************************************************
Releases the latch on a leaf page and buffer-unfixes it. */
UNIV_INLINE
void
btr_leaf_page_release(
/*==================*/
page_t* page, /* in: page */
ulint latch_mode, /* in: BTR_SEARCH_LEAF or BTR_MODIFY_LEAF */
mtr_t* mtr); /* in: mtr */
/******************************************************************
Gets the child node file address in a node pointer. */
UNIV_INLINE
ulint
btr_node_ptr_get_child_page_no(
/*===========================*/
/* out: child node address */
rec_t* rec); /* in: node pointer record */
/****************************************************************
Creates the root node for a new index tree. */
ulint
btr_create(
/*=======*/
/* out: page number of the created root, FIL_NULL if
did not succeed */
ulint type, /* in: type of the index */
ulint space, /* in: space where created */
dulint index_id,/* in: index id */
mtr_t* mtr); /* in: mini-transaction handle */
/****************************************************************
Frees a B-tree except the root page, which MUST be freed after this
by calling btr_free_root. */
void
btr_free_but_not_root(
/*==================*/
ulint space, /* in: space where created */
ulint root_page_no); /* in: root page number */
/****************************************************************
Frees the B-tree root page. The rest of the tree MUST already have been freed. */
void
btr_free_root(
/*==========*/
ulint space, /* in: space where created */
ulint root_page_no, /* in: root page number */
mtr_t* mtr); /* in: a mini-transaction which has already
been started */
/*****************************************************************
Makes tree one level higher by splitting the root, and inserts
the tuple. It is assumed that mtr contains an x-latch on the tree.
NOTE that the operation of this function must always succeed,
we cannot reverse it: therefore enough free disk space must be
guaranteed to be available before this function is called. */
rec_t*
btr_root_raise_and_insert(
/*======================*/
/* out: inserted record */
btr_cur_t* cursor, /* in: cursor at which to insert: must be
on the root page; when the function returns,
the cursor is positioned on the predecessor
of the inserted record */
dtuple_t* tuple, /* in: tuple to insert */
mtr_t* mtr); /* in: mtr */
/*****************************************************************
Reorganizes an index page. */
void
btr_page_reorganize(
/*================*/
page_t* page, /* in: page to be reorganized */
mtr_t* mtr); /* in: mtr */
/*****************************************************************
Reorganizes an index page. */
void
btr_page_reorganize_low(
/*====================*/
ibool low, /* in: TRUE if locks should not be updated, i.e.,
there cannot exist locks on the page */
page_t* page, /* in: page to be reorganized */
mtr_t* mtr); /* in: mtr */
/*****************************************************************
Decides if the page should be split at the convergence point of
inserts converging to left. */
ibool
btr_page_get_split_rec_to_left(
/*===========================*/
/* out: TRUE if split recommended */
btr_cur_t* cursor, /* in: cursor at which to insert */
rec_t** split_rec);/* out: if split recommended,
the first record on upper half page,
or NULL if tuple should be first */
/*****************************************************************
Decides if the page should be split at the convergence point of
inserts converging to right. */
ibool
btr_page_get_split_rec_to_right(
/*============================*/
/* out: TRUE if split recommended */
btr_cur_t* cursor, /* in: cursor at which to insert */
rec_t** split_rec);/* out: if split recommended,
the first record on upper half page,
or NULL if tuple should be first */
/*****************************************************************
Splits an index page to halves and inserts the tuple. It is assumed
that mtr holds an x-latch to the index tree. NOTE: the tree x-latch
is released within this function! NOTE that the operation of this
function must always succeed, we cannot reverse it: therefore
enough free disk space must be guaranteed to be available before
this function is called. */
rec_t*
btr_page_split_and_insert(
/*======================*/
/* out: inserted record; NOTE: the tree
x-latch is released! NOTE: 2 free disk
pages must be available! */
btr_cur_t* cursor, /* in: cursor at which to insert; when the
function returns, the cursor is positioned
on the predecessor of the inserted record */
dtuple_t* tuple, /* in: tuple to insert */
mtr_t* mtr); /* in: mtr */
/***********************************************************
Inserts a data tuple to a tree on a non-leaf level. It is assumed
that mtr holds an x-latch on the tree. */
void
btr_insert_on_non_leaf_level(
/*=========================*/
dict_tree_t* tree, /* in: tree */
ulint level, /* in: level, must be > 0 */
dtuple_t* tuple, /* in: the record to be inserted */
mtr_t* mtr); /* in: mtr */
/********************************************************************
Sets a record as the predefined minimum record. */
void
btr_set_min_rec_mark(
/*=================*/
rec_t* rec, /* in: record */
mtr_t* mtr); /* in: mtr */
/*****************************************************************
Deletes on the upper level the node pointer to a page. */
void
btr_node_ptr_delete(
/*================*/
dict_tree_t* tree, /* in: index tree */
page_t* page, /* in: page whose node pointer is deleted */
mtr_t* mtr); /* in: mtr */
/****************************************************************
Checks that the node pointer to a page is appropriate. */
ibool
btr_check_node_ptr(
/*===============*/
/* out: TRUE */
dict_tree_t* tree, /* in: index tree */
page_t* page, /* in: index page */
mtr_t* mtr); /* in: mtr */
/*****************************************************************
Tries to merge the page first to the left immediate brother if such a
brother exists, and the node pointers to the current page and to the
brother reside on the same page. If the left brother does not satisfy these
conditions, looks at the right brother. If the page is the only one on that
level, lifts the records of the page to the father page, thus reducing the
tree height. It is assumed that mtr holds an x-latch on the tree and on the
page. If cursor is on the leaf level, mtr must also hold x-latches to
the brothers, if they exist. NOTE: it is assumed that the caller has reserved
enough free extents so that the compression will always succeed if done! */
void
btr_compress(
/*=========*/
btr_cur_t* cursor, /* in: cursor on the page to merge or lift;
the page must not be empty: in record delete
use btr_discard_page if the page would become
empty */
mtr_t* mtr); /* in: mtr */
/*****************************************************************
Discards a page from a B-tree. This is used to remove the last record from
a B-tree page: the whole page must be removed at the same time. This cannot
be used for the root page, which is allowed to be empty. */
void
btr_discard_page(
/*=============*/
btr_cur_t* cursor, /* in: cursor on the page to discard: not on
the root page */
mtr_t* mtr); /* in: mtr */
/************************************************************************
Declares the latching order level for the page latch in the debug version. */
UNIV_INLINE
void
btr_declare_page_latch(
/*===================*/
page_t* page, /* in: page */
ibool leaf); /* in: TRUE if a leaf */
/********************************************************************
Parses the redo log record for setting an index record as the predefined
minimum record. */
byte*
btr_parse_set_min_rec_mark(
/*=======================*/
/* out: end of log record or NULL */
byte* ptr, /* in: buffer */
byte* end_ptr,/* in: buffer end */
page_t* page, /* in: page or NULL */
mtr_t* mtr); /* in: mtr or NULL */
/***************************************************************
Parses a redo log record of reorganizing a page. */
byte*
btr_parse_page_reorganize(
/*======================*/
/* out: end of log record or NULL */
byte* ptr, /* in: buffer */
byte* end_ptr,/* in: buffer end */
page_t* page, /* in: page or NULL */
mtr_t* mtr); /* in: mtr or NULL */
/******************************************************************
Gets the number of pages in a B-tree. */
ulint
btr_get_size(
/*=========*/
/* out: number of pages */
dict_index_t* index, /* in: index */
ulint flag); /* in: BTR_N_LEAF_PAGES or BTR_TOTAL_SIZE */
/*****************************************************************
Prints size info of a B-tree. */
void
btr_print_size(
/*===========*/
dict_tree_t* tree); /* in: index tree */
/******************************************************************
Prints directories and other info of all nodes in the tree. */
void
btr_print_tree(
/*===========*/
dict_tree_t* tree, /* in: tree */
ulint width); /* in: print this many entries from start
and end */
/******************************************************************
Checks the consistency of an index tree. */
void
btr_validate_tree(
/*==============*/
dict_tree_t* tree); /* in: tree */
#define BTR_N_LEAF_PAGES 1
#define BTR_TOTAL_SIZE 2
#ifndef UNIV_NONINL
#include "btr0btr.ic"
#endif
#endif

223
innobase/include/btr0btr.ic Normal file

@ -0,0 +1,223 @@
/******************************************************
The B-tree
(c) 1994-1996 Innobase Oy
Created 6/2/1994 Heikki Tuuri
*******************************************************/
#include "mach0data.h"
#include "mtr0mtr.h"
#include "mtr0log.h"
#define BTR_MAX_NODE_LEVEL 50 /* used in debug checking */
/******************************************************************
Gets a buffer page and declares its latching order level. */
UNIV_INLINE
page_t*
btr_page_get(
/*=========*/
ulint space, /* in: space id */
ulint page_no, /* in: page number */
ulint mode, /* in: latch mode */
mtr_t* mtr) /* in: mtr */
{
page_t* page;
page = buf_page_get(space, page_no, mode, mtr);
#ifdef UNIV_SYNC_DEBUG
if (mode != RW_NO_LATCH) {
buf_page_dbg_add_level(page, SYNC_TREE_NODE);
}
#endif
return(page);
}
/******************************************************************
Sets the index id field of a page. */
UNIV_INLINE
void
btr_page_set_index_id(
/*==================*/
page_t* page, /* in: page to be created */
dulint id, /* in: index id */
mtr_t* mtr) /* in: mtr */
{
mlog_write_dulint(page + PAGE_HEADER + PAGE_INDEX_ID, id,
MLOG_8BYTES, mtr);
}
/******************************************************************
Gets the index id field of a page. */
UNIV_INLINE
dulint
btr_page_get_index_id(
/*==================*/
/* out: index id */
page_t* page) /* in: index page */
{
return(mach_read_from_8(page + PAGE_HEADER + PAGE_INDEX_ID));
}
/************************************************************
Gets the node level field in an index page. */
UNIV_INLINE
ulint
btr_page_get_level_low(
/*===================*/
/* out: level, leaf level == 0 */
page_t* page) /* in: index page */
{
ulint level;
ut_ad(page);
level = mach_read_from_2(page + PAGE_HEADER + PAGE_LEVEL);
ut_ad(level <= BTR_MAX_NODE_LEVEL);
return(level);
}
/************************************************************
Gets the node level field in an index page. */
UNIV_INLINE
ulint
btr_page_get_level(
/*===============*/
/* out: level, leaf level == 0 */
page_t* page, /* in: index page */
mtr_t* mtr) /* in: mini-transaction handle */
{
ut_ad(page && mtr);
return(btr_page_get_level_low(page));
}
/************************************************************
Sets the node level field in an index page. */
UNIV_INLINE
void
btr_page_set_level(
/*===============*/
page_t* page, /* in: index page */
ulint level, /* in: level, leaf level == 0 */
mtr_t* mtr) /* in: mini-transaction handle */
{
ut_ad(page && mtr);
ut_ad(level <= BTR_MAX_NODE_LEVEL);
mlog_write_ulint(page + PAGE_HEADER + PAGE_LEVEL, level,
MLOG_2BYTES, mtr);
}
/************************************************************
Gets the next index page number. */
UNIV_INLINE
ulint
btr_page_get_next(
/*==============*/
/* out: next page number */
page_t* page, /* in: index page */
mtr_t* mtr) /* in: mini-transaction handle */
{
ut_ad(page && mtr);
ut_ad(mtr_memo_contains(mtr, buf_block_align(page),
MTR_MEMO_PAGE_X_FIX)
|| mtr_memo_contains(mtr, buf_block_align(page),
MTR_MEMO_PAGE_S_FIX));
return(mach_read_from_4(page + FIL_PAGE_NEXT));
}
/************************************************************
Sets the next index page field. */
UNIV_INLINE
void
btr_page_set_next(
/*==============*/
page_t* page, /* in: index page */
ulint next, /* in: next page number */
mtr_t* mtr) /* in: mini-transaction handle */
{
ut_ad(page && mtr);
mlog_write_ulint(page + FIL_PAGE_NEXT, next, MLOG_4BYTES, mtr);
}
/************************************************************
Gets the previous index page number. */
UNIV_INLINE
ulint
btr_page_get_prev(
/*==============*/
/* out: prev page number */
page_t* page, /* in: index page */
mtr_t* mtr) /* in: mini-transaction handle */
{
ut_ad(page && mtr);
return(mach_read_from_4(page + FIL_PAGE_PREV));
}
/************************************************************
Sets the previous index page field. */
UNIV_INLINE
void
btr_page_set_prev(
/*==============*/
page_t* page, /* in: index page */
ulint prev, /* in: previous page number */
mtr_t* mtr) /* in: mini-transaction handle */
{
ut_ad(page && mtr);
mlog_write_ulint(page + FIL_PAGE_PREV, prev, MLOG_4BYTES, mtr);
}
/******************************************************************
Gets the child node file address in a node pointer. */
UNIV_INLINE
ulint
btr_node_ptr_get_child_page_no(
/*===========================*/
/* out: child node address */
rec_t* rec) /* in: node pointer record */
{
ulint n_fields;
byte* field;
ulint len;
n_fields = rec_get_n_fields(rec);
/* The child address is in the last field */
field = rec_get_nth_field(rec, n_fields - 1, &len);
ut_ad(len == 4);
return(mach_read_from_4(field));
}
/******************************************************************
Releases the latches on a leaf page and buffer-unfixes it. */
UNIV_INLINE
void
btr_leaf_page_release(
/*==================*/
page_t* page, /* in: page */
ulint latch_mode, /* in: BTR_SEARCH_LEAF or BTR_MODIFY_LEAF */
mtr_t* mtr) /* in: mtr */
{
ut_ad(!mtr_memo_contains(mtr, buf_block_align(page),
MTR_MEMO_MODIFY));
if (latch_mode == BTR_SEARCH_LEAF) {
mtr_memo_release(mtr, buf_block_align(page),
MTR_MEMO_PAGE_S_FIX);
} else {
ut_ad(latch_mode == BTR_MODIFY_LEAF);
mtr_memo_release(mtr, buf_block_align(page),
MTR_MEMO_PAGE_X_FIX);
}
}
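
Illustrative sketch (not part of the commit). The accessors above all follow one pattern: a field lives at a fixed byte offset in the page frame and is read with a mach_read_from_N call. The stand-alone program below mimics that pattern on a plain byte buffer. It assumes the mach_* routines store integers most significant byte first. The offsets, the page size, and every sk_* name are hypothetical, and the redo logging that mlog_write_ulint performs in the setters is deliberately left out.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical layout for illustration only. The real offsets
(FIL_PAGE_PREV, FIL_PAGE_NEXT, PAGE_HEADER + PAGE_LEVEL) are defined in
headers that are not part of this excerpt. */
#define SK_PAGE_SIZE	64
#define SK_OFF_PREV	8
#define SK_OFF_NEXT	12
#define SK_OFF_LEVEL	16
#define SK_MAX_LEVEL	50	/* same bound as BTR_MAX_NODE_LEVEL */

typedef unsigned char	sk_byte;

/* Writes the n lowest bytes of val, most significant byte first. */
static void
sk_write_be(sk_byte* b, unsigned long val, int n)
{
	int	i;

	for (i = n - 1; i >= 0; i--) {
		b[i] = (sk_byte)(val & 0xFF);
		val >>= 8;
	}
}

/* Reads an n-byte unsigned integer stored most significant byte first. */
static unsigned long
sk_read_be(const sk_byte* b, int n)
{
	unsigned long	val = 0;
	int		i;

	for (i = 0; i < n; i++) {
		val = (val << 8) | b[i];
	}
	return(val);
}

/* Zero-fills a page frame. */
static void
sk_page_init(sk_byte* page)
{
	memset(page, 0, SK_PAGE_SIZE);
}

/* Level accessors: a 2-byte field, leaf level == 0. */
static void
sk_page_set_level(sk_byte* page, unsigned long level)
{
	assert(level <= SK_MAX_LEVEL);
	sk_write_be(page + SK_OFF_LEVEL, level, 2);
}

static unsigned long
sk_page_get_level(const sk_byte* page)
{
	return(sk_read_be(page + SK_OFF_LEVEL, 2));
}

/* Sibling links: 4-byte page numbers. */
static void
sk_page_set_next(sk_byte* page, unsigned long next)
{
	sk_write_be(page + SK_OFF_NEXT, next, 4);
}

static unsigned long
sk_page_get_next(const sk_byte* page)
{
	return(sk_read_be(page + SK_OFF_NEXT, 4));
}

static void
sk_page_set_prev(sk_byte* page, unsigned long prev)
{
	sk_write_be(page + SK_OFF_PREV, prev, 4);
}

static unsigned long
sk_page_get_prev(const sk_byte* page)
{
	return(sk_read_be(page + SK_OFF_PREV, 4));
}
```

In the real code the setters take an mtr because every change to a page must also be written to the redo log; the getters only read the frame, which is why the sketch needs no such handle.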

519
innobase/include/btr0cur.h Normal file

@ -0,0 +1,519 @@
/******************************************************
The index tree cursor
(c) 1994-1996 Innobase Oy
Created 10/16/1994 Heikki Tuuri
*******************************************************/
#ifndef btr0cur_h
#define btr0cur_h
#include "univ.i"
#include "dict0dict.h"
#include "data0data.h"
#include "page0cur.h"
#include "btr0types.h"
#include "que0types.h"
#include "row0types.h"
#include "ha0ha.h"
/* Mode flags for btr_cur operations; these can be ORed */
#define BTR_NO_UNDO_LOG_FLAG 1 /* do no undo logging */
#define BTR_NO_LOCKING_FLAG 2 /* do no record lock checking */
#define BTR_KEEP_SYS_FLAG 4 /* sys fields will be found from the
update vector or inserted entry */
#define BTR_CUR_ADAPT
#define BTR_CUR_HASH_ADAPT
/*************************************************************
Returns the page cursor component of a tree cursor. */
UNIV_INLINE
page_cur_t*
btr_cur_get_page_cur(
/*=================*/
/* out: pointer to page cursor component */
btr_cur_t* cursor); /* in: tree cursor */
/*************************************************************
Returns the record pointer of a tree cursor. */
UNIV_INLINE
rec_t*
btr_cur_get_rec(
/*============*/
/* out: pointer to record */
btr_cur_t* cursor); /* in: tree cursor */
/*************************************************************
Invalidates a tree cursor by setting record pointer to NULL. */
UNIV_INLINE
void
btr_cur_invalidate(
/*===============*/
btr_cur_t* cursor); /* in: tree cursor */
/*************************************************************
Returns the page of a tree cursor. */
UNIV_INLINE
page_t*
btr_cur_get_page(
/*=============*/
/* out: pointer to page */
btr_cur_t* cursor); /* in: tree cursor */
/*************************************************************
Returns the tree of a cursor. */
UNIV_INLINE
dict_tree_t*
btr_cur_get_tree(
/*=============*/
/* out: tree */
btr_cur_t* cursor); /* in: tree cursor */
/*************************************************************
Positions a tree cursor at a given record. */
UNIV_INLINE
void
btr_cur_position(
/*=============*/
dict_index_t* index, /* in: index */
rec_t* rec, /* in: record in tree */
btr_cur_t* cursor);/* in: cursor */
/************************************************************************
Searches an index tree and positions a tree cursor on a given level.
NOTE: n_fields_cmp in tuple must be set so that it cannot be compared
to node pointer page number fields on the upper levels of the tree!
Note that if mode is PAGE_CUR_LE, which is used in inserts, then
cursor->up_match and cursor->low_match both will have sensible values.
If mode is PAGE_CUR_GE, then up_match will have a sensible value. */
void
btr_cur_search_to_nth_level(
/*========================*/
dict_index_t* index, /* in: index */
ulint level, /* in: the tree level of search */
dtuple_t* tuple, /* in: data tuple; NOTE: n_fields_cmp in
tuple must be set so that it cannot get
compared to the node ptr page number field! */
ulint mode, /* in: PAGE_CUR_L, ...;
NOTE that if the search is made using a unique
prefix of a record, mode should be PAGE_CUR_LE,
not PAGE_CUR_GE, as the latter may end up on
the previous page of the record! Inserts
should always be made using PAGE_CUR_LE to
search the position! */
ulint latch_mode, /* in: BTR_SEARCH_LEAF, ...;
cursor->left_page is used to store a pointer
to the left neighbor page, in the cases
BTR_SEARCH_PREV and BTR_MODIFY_PREV */
btr_cur_t* cursor, /* out: tree cursor; the cursor page is s- or
x-latched */
ulint has_search_latch,/* in: latch mode the caller
currently has on btr_search_latch:
RW_S_LATCH, or 0 */
mtr_t* mtr); /* in: mtr */
/*********************************************************************
Opens a cursor at either end of an index. */
void
btr_cur_open_at_index_side(
/*=======================*/
ibool from_left, /* in: TRUE if open to the low end,
FALSE if to the high end */
dict_index_t* index, /* in: index */
ulint latch_mode, /* in: latch mode */
btr_cur_t* cursor, /* in: cursor */
mtr_t* mtr); /* in: mtr */
/**************************************************************************
Positions a cursor at a randomly chosen position within a B-tree. */
void
btr_cur_open_at_rnd_pos(
/*====================*/
dict_index_t* index, /* in: index */
ulint latch_mode, /* in: BTR_SEARCH_LEAF, ... */
btr_cur_t* cursor, /* in/out: B-tree cursor */
mtr_t* mtr); /* in: mtr */
/*****************************************************************
Tries to perform an insert to a page in an index tree, next to cursor.
It is assumed that mtr holds an x-latch on the page. The operation does
not succeed if there is too little space on the page. If there is just
one record on the page, the insert will always succeed; this is to
prevent trying to split a page with just one record. */
ulint
btr_cur_optimistic_insert(
/*======================*/
/* out: DB_SUCCESS, DB_LOCK_WAIT,
DB_FAIL, or error number */
ulint flags, /* in: undo logging and locking flags: if not
zero, the parameters index and thr should be
specified */
btr_cur_t* cursor, /* in: cursor on page after which
to insert; cursor stays valid */
dtuple_t* entry, /* in: entry to insert */
rec_t** rec, /* out: pointer to the inserted record if
the insert succeeds */
que_thr_t* thr, /* in: query thread or NULL */
mtr_t* mtr); /* in: mtr */
/*****************************************************************
Performs an insert on a page of an index tree. It is assumed that mtr
holds an x-latch on the tree and on the cursor page. If the insert is
made on the leaf level, to avoid deadlocks, mtr must also own x-latches
to brothers of page, if those brothers exist. */
ulint
btr_cur_pessimistic_insert(
/*=======================*/
/* out: DB_SUCCESS or error number */
ulint flags, /* in: undo logging and locking flags: if not
zero, the parameters index and thr should be
specified */
btr_cur_t* cursor, /* in: cursor after which to insert;
cursor does not stay valid */
dtuple_t* entry, /* in: entry to insert */
rec_t** rec, /* out: pointer to the inserted record if
the insert succeeds */
que_thr_t* thr, /* in: query thread or NULL */
mtr_t* mtr); /* in: mtr */
/*****************************************************************
Updates a record when the update causes no size changes in its fields. */
ulint
btr_cur_update_in_place(
/*====================*/
/* out: DB_SUCCESS or error number */
ulint flags, /* in: undo logging and locking flags */
btr_cur_t* cursor, /* in: cursor on the record to update;
cursor stays valid and positioned on the
same record */
upd_t* update, /* in: update vector */
ulint cmpl_info,/* in: compiler info on secondary index
updates */
que_thr_t* thr, /* in: query thread */
mtr_t* mtr); /* in: mtr */
/*****************************************************************
Tries to update a record on a page in an index tree. It is assumed that mtr
holds an x-latch on the page. The operation does not succeed if there is too
little space on the page or if the update would result in too empty a page,
so that tree compression is recommended. */
ulint
btr_cur_optimistic_update(
/*======================*/
/* out: DB_SUCCESS, or DB_OVERFLOW if the
updated record does not fit, DB_UNDERFLOW
if the page would become too empty */
ulint flags, /* in: undo logging and locking flags */
btr_cur_t* cursor, /* in: cursor on the record to update;
cursor stays valid and positioned on the
same record */
upd_t* update, /* in: update vector; this must also
contain trx id and roll ptr fields */
ulint cmpl_info,/* in: compiler info on secondary index
updates */
que_thr_t* thr, /* in: query thread */
mtr_t* mtr); /* in: mtr */
/*****************************************************************
Performs an update of a record on a page of a tree. It is assumed
that mtr holds an x-latch on the tree and on the cursor page. If the
update is made on the leaf level, to avoid deadlocks, mtr must also
own x-latches to brothers of page, if those brothers exist. */
ulint
btr_cur_pessimistic_update(
/*=======================*/
/* out: DB_SUCCESS or error code */
ulint flags, /* in: undo logging, locking, and rollback
flags */
btr_cur_t* cursor, /* in: cursor on the record to update;
cursor does not stay valid */
upd_t* update, /* in: update vector; this is also allowed
to contain trx id and roll ptr fields, but
the values in update vector have no effect */
ulint cmpl_info,/* in: compiler info on secondary index
updates */
que_thr_t* thr, /* in: query thread */
mtr_t* mtr); /* in: mtr */
/***************************************************************
Marks a clustered index record deleted. Writes an undo log record to
undo log on this delete marking. Writes in the trx id field the id
of the deleting transaction, and in the roll ptr field pointer to the
undo log record created. */
ulint
btr_cur_del_mark_set_clust_rec(
/*===========================*/
/* out: DB_SUCCESS, DB_LOCK_WAIT, or error
number */
ulint flags, /* in: undo logging and locking flags */
btr_cur_t* cursor, /* in: cursor */
ibool val, /* in: value to set */
que_thr_t* thr, /* in: query thread */
mtr_t* mtr); /* in: mtr */
/***************************************************************
Sets a secondary index record delete mark to TRUE or FALSE. */
ulint
btr_cur_del_mark_set_sec_rec(
/*=========================*/
/* out: DB_SUCCESS, DB_LOCK_WAIT, or error
number */
ulint flags, /* in: locking flag */
btr_cur_t* cursor, /* in: cursor */
ibool val, /* in: value to set */
que_thr_t* thr, /* in: query thread */
mtr_t* mtr); /* in: mtr */
/***************************************************************
Sets a secondary index record delete mark to FALSE. This function is
only used by the insert buffer insert merge mechanism. */
void
btr_cur_del_unmark_for_ibuf(
/*========================*/
rec_t* rec, /* in: record to delete unmark */
mtr_t* mtr); /* in: mtr */
/*****************************************************************
Tries to compress a page of the tree on the leaf level. It is assumed
that mtr holds an x-latch on the tree and on the cursor page. To avoid
deadlocks, mtr must also own x-latches to brothers of page, if those
brothers exist. NOTE: it is assumed that the caller has reserved enough
free extents so that the compression will always succeed if done! */
void
btr_cur_compress(
/*=============*/
btr_cur_t* cursor, /* in: cursor on the page to compress;
cursor does not stay valid */
mtr_t* mtr); /* in: mtr */
/*****************************************************************
Tries to compress a page of the tree if it seems useful. It is assumed
that mtr holds an x-latch on the tree and on the cursor page. To avoid
deadlocks, mtr must also own x-latches to brothers of page, if those
brothers exist. NOTE: it is assumed that the caller has reserved enough
free extents so that the compression will always succeed if done! */
ibool
btr_cur_compress_if_useful(
/*=======================*/
/* out: TRUE if compression occurred */
btr_cur_t* cursor, /* in: cursor on the page to compress;
cursor does not stay valid if compression
occurs */
mtr_t* mtr); /* in: mtr */
/***********************************************************
Removes the record on which the tree cursor is positioned. It is assumed
that the mtr has an x-latch on the page where the cursor is positioned,
but no latch on the whole tree. */
ibool
btr_cur_optimistic_delete(
/*======================*/
/* out: TRUE if success, i.e., the page
did not become too empty */
btr_cur_t* cursor, /* in: cursor on the record to delete;
cursor stays valid: if deletion succeeds,
on function exit it points to the successor
of the deleted record */
mtr_t* mtr); /* in: mtr */
/*****************************************************************
Removes the record on which the tree cursor is positioned. Tries
to compress the page if its fillfactor drops below a threshold
or if it is the only page on the level. It is assumed that mtr holds
an x-latch on the tree and on the cursor page. To avoid deadlocks,
mtr must also own x-latches to brothers of page, if those brothers
exist. */
ibool
btr_cur_pessimistic_delete(
/*=======================*/
/* out: TRUE if compression occurred */
ulint* err, /* out: DB_SUCCESS or DB_OUT_OF_FILE_SPACE;
the latter may occur because we may have
to update node pointers on upper levels,
and in the case of variable length keys
these may actually grow in size */
ibool has_reserved_extents, /* in: TRUE if the
caller has already reserved enough free
extents so that he knows that the operation
will succeed */
btr_cur_t* cursor, /* in: cursor on the record to delete;
if compression does not occur, the cursor
stays valid: it points to successor of
deleted record on function exit */
mtr_t* mtr); /* in: mtr */
/***************************************************************
Parses a redo log record of updating a record in-place. */
byte*
btr_cur_parse_update_in_place(
/*==========================*/
/* out: end of log record or NULL */
byte* ptr, /* in: buffer */
byte* end_ptr,/* in: buffer end */
page_t* page); /* in: page or NULL */
/***************************************************************
Parses a redo log record of updating a record, but not in-place. */
byte*
btr_cur_parse_opt_update(
/*=====================*/
/* out: end of log record or NULL */
byte* ptr, /* in: buffer */
byte* end_ptr,/* in: buffer end */
page_t* page, /* in: page or NULL */
mtr_t* mtr); /* in: mtr or NULL */
/********************************************************************
Parses the redo log record for delete marking or unmarking of a clustered
index record. */
byte*
btr_cur_parse_del_mark_set_clust_rec(
/*=================================*/
/* out: end of log record or NULL */
byte* ptr, /* in: buffer */
byte* end_ptr,/* in: buffer end */
page_t* page); /* in: page or NULL */
/********************************************************************
Parses the redo log record for delete marking or unmarking of a secondary
index record. */
byte*
btr_cur_parse_del_mark_set_sec_rec(
/*===============================*/
/* out: end of log record or NULL */
byte* ptr, /* in: buffer */
byte* end_ptr,/* in: buffer end */
page_t* page); /* in: page or NULL */
/***********************************************************************
Estimates the number of rows in a given index range. */
ulint
btr_estimate_n_rows_in_range(
/*=========================*/
/* out: estimated number of rows */
dict_index_t* index, /* in: index */
dtuple_t* tuple1, /* in: range start, may also be empty tuple */
ulint mode1, /* in: search mode for range start */
dtuple_t* tuple2, /* in: range end, may also be empty tuple */
ulint mode2); /* in: search mode for range end */
/***********************************************************************
Estimates the number of different key values in a given index. */
ulint
btr_estimate_number_of_different_key_vals(
/*======================================*/
/* out: estimated number of key values */
dict_index_t* index); /* in: index */
/*######################################################################*/
/* In the pessimistic delete, if the page data size drops below this
limit, merging it to a neighbor is tried */
#define BTR_CUR_PAGE_COMPRESS_LIMIT (UNIV_PAGE_SIZE / 2)
/* A slot in the path array. We store here info on a search path down the
tree. Each slot contains data on a single level of the tree. */
typedef struct btr_path_struct btr_path_t;
struct btr_path_struct{
ulint nth_rec; /* index of the record
where the page cursor stopped on
this level (index in alphabetical
order); value ULINT_UNDEFINED
denotes array end */
ulint n_recs; /* number of records on the page */
};
#define BTR_PATH_ARRAY_N_SLOTS 250 /* size of path array (in slots) */
/* The tree cursor: the definition appears here only for the compiler
to know struct size! */
struct btr_cur_struct {
dict_index_t* index; /* index where positioned */
page_cur_t page_cur; /* page cursor */
page_t* left_page; /* this field is used to store a pointer
to the left neighbor page, in the cases
BTR_SEARCH_PREV and BTR_MODIFY_PREV */
/*------------------------------*/
que_thr_t* thr; /* this field is only used when
btr_cur_search_... is called for an
index entry insertion: the calling
query thread is passed here to be
used in the insert buffer */
/*------------------------------*/
/* The following fields are used in btr_cur_search... to pass
information: */
ulint flag; /* BTR_CUR_HASH, BTR_CUR_HASH_FAIL,
BTR_CUR_BINARY, or
BTR_CUR_INSERT_TO_IBUF */
ulint tree_height; /* Tree height if the search is done
for a pessimistic insert or update
operation */
ulint up_match; /* If the search mode was PAGE_CUR_LE,
the number of matched fields to the
first user record to the right of
the cursor record after
btr_cur_search_...;
for the mode PAGE_CUR_GE, the matched
fields to the first user record AT THE
CURSOR or to the right of it;
NOTE that the up_match and low_match
values may exceed the correct values
for comparison to the adjacent user
record if that record is on a
different leaf page! (See the note in
row_ins_duplicate_key.) */
ulint up_bytes; /* number of matched bytes to the
right at the time cursor positioned;
only used internally in searches: not
defined after the search */
ulint low_match; /* if search mode was PAGE_CUR_LE,
the number of matched fields to the
first user record AT THE CURSOR or
to the left of it after
btr_cur_search_...;
NOT defined for PAGE_CUR_GE or any
other search modes; see also the NOTE
in up_match! */
ulint low_bytes; /* number of matched bytes to the
left at the time cursor positioned;
only used internally in searches: not
defined after the search */
ulint n_fields; /* prefix length used in a hash
search if hash_node != NULL */
ulint n_bytes; /* hash prefix bytes if hash_node !=
NULL */
ulint fold; /* fold value used in the search if
flag is BTR_CUR_HASH */
/*------------------------------*/
btr_path_t* path_arr; /* in estimating the number of
rows in range, we store in this array
information of the path through
the tree */
};
/* Values for the flag documenting the used search method */
#define BTR_CUR_HASH 1 /* successful shortcut using the hash
index */
#define BTR_CUR_HASH_FAIL 2 /* failure using hash, success using
binary search: the misleading hash
reference is stored in the field
hash_node, which may then need to be
updated */
#define BTR_CUR_BINARY 3 /* success using the binary search */
#define BTR_CUR_INSERT_TO_IBUF 4 /* performed the intended insert to
the insert buffer */
/* If pessimistic delete fails because of lack of file space,
there is still a good chance of success a little later: try this many times,
and sleep this many microseconds in between */
#define BTR_CUR_RETRY_DELETE_N_TIMES 100
#define BTR_CUR_RETRY_SLEEP_TIME 50000
extern ulint btr_cur_n_non_sea;
#ifndef UNIV_NONINL
#include "btr0cur.ic"
#endif
#endif
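
Illustrative sketch (not part of the commit). The comment on BTR_CUR_RETRY_DELETE_N_TIMES and BTR_CUR_RETRY_SLEEP_TIME describes a bounded retry: if a pessimistic delete fails for lack of file space, try again a little later. The caller that uses these constants is not in this excerpt, so the loop below is only one plausible reading of that comment. All sk_* names are hypothetical, and the delete operation and the sleep are passed in as function pointers so that the sketch runs without a database or a real delay.

```c
#include <assert.h>

#define SK_SUCCESS		0
#define SK_OUT_OF_FILE_SPACE	1
#define SK_RETRY_N_TIMES	100	/* as BTR_CUR_RETRY_DELETE_N_TIMES */
#define SK_RETRY_SLEEP_USEC	50000UL	/* as BTR_CUR_RETRY_SLEEP_TIME */

typedef int	(*sk_delete_fn)(void* arg);
typedef void	(*sk_sleep_fn)(unsigned long usec);

/* Calls try_delete until it returns something other than
SK_OUT_OF_FILE_SPACE or until SK_RETRY_N_TIMES attempts have been made,
sleeping between attempts. Returns the result of the last attempt and
stores the number of attempts made in *n_tries. */
static int
sk_delete_with_retry(
	sk_delete_fn	try_delete,	/* in: operation to attempt */
	void*		arg,		/* in: argument for try_delete */
	sk_sleep_fn	sleep_fn,	/* in: how to wait between attempts */
	unsigned long*	n_tries)	/* out: attempts made */
{
	unsigned long	tries = 0;
	int		err;

	for (;;) {
		err = try_delete(arg);
		tries++;

		if (err != SK_OUT_OF_FILE_SPACE
		    || tries >= SK_RETRY_N_TIMES) {
			break;
		}

		sleep_fn(SK_RETRY_SLEEP_USEC);
	}

	*n_tries = tries;

	return(err);
}

/* Test double: reports lack of space until *remaining reaches zero. */
static int
sk_fake_delete(void* arg)
{
	unsigned long*	remaining = arg;

	if (*remaining > 0) {
		(*remaining)--;
		return(SK_OUT_OF_FILE_SPACE);
	}
	return(SK_SUCCESS);
}

/* Test double: records the requested sleep time instead of sleeping. */
static unsigned long	sk_slept_usec = 0;

static void
sk_fake_sleep(unsigned long usec)
{
	sk_slept_usec += usec;
}
```

With the values from the header, a caller that never succeeds gives up after 100 attempts and 99 sleeps of 50000 microseconds each, a little under five seconds in total.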

172
innobase/include/btr0cur.ic Normal file

@ -0,0 +1,172 @@
/******************************************************
The index tree cursor
(c) 1994-1996 Innobase Oy
Created 10/16/1994 Heikki Tuuri
*******************************************************/
#include "btr0btr.h"
/*************************************************************
Returns the page cursor component of a tree cursor. */
UNIV_INLINE
page_cur_t*
btr_cur_get_page_cur(
/*=================*/
/* out: pointer to page cursor component */
btr_cur_t* cursor) /* in: tree cursor */
{
return(&(cursor->page_cur));
}
/*************************************************************
Returns the record pointer of a tree cursor. */
UNIV_INLINE
rec_t*
btr_cur_get_rec(
/*============*/
/* out: pointer to record */
btr_cur_t* cursor) /* in: tree cursor */
{
return(page_cur_get_rec(&(cursor->page_cur)));
}
/*************************************************************
Invalidates a tree cursor by setting record pointer to NULL. */
UNIV_INLINE
void
btr_cur_invalidate(
/*===============*/
btr_cur_t* cursor) /* in: tree cursor */
{
page_cur_invalidate(&(cursor->page_cur));
}
/*************************************************************
Returns the page of a tree cursor. */
UNIV_INLINE
page_t*
btr_cur_get_page(
/*=============*/
/* out: pointer to page */
btr_cur_t* cursor) /* in: tree cursor */
{
return(buf_frame_align(page_cur_get_rec(&(cursor->page_cur))));
}
/*************************************************************
Returns the tree of a cursor. */
UNIV_INLINE
dict_tree_t*
btr_cur_get_tree(
/*=============*/
/* out: tree */
btr_cur_t* cursor) /* in: tree cursor */
{
return((cursor->index)->tree);
}
/*************************************************************
Positions a tree cursor at a given record. */
UNIV_INLINE
void
btr_cur_position(
/*=============*/
dict_index_t* index, /* in: index */
rec_t* rec, /* in: record in tree */
btr_cur_t* cursor) /* in: cursor */
{
page_cur_position(rec, btr_cur_get_page_cur(cursor));
cursor->index = index;
}
/*************************************************************************
Checks if compressing an index page where a btr cursor is placed makes
sense. */
UNIV_INLINE
ibool
btr_cur_compress_recommendation(
/*============================*/
/* out: TRUE if compression is recommended */
btr_cur_t* cursor, /* in: btr cursor */
mtr_t* mtr) /* in: mtr */
{
page_t* page;
ut_ad(mtr_memo_contains(mtr, buf_block_align(
btr_cur_get_page(cursor)),
MTR_MEMO_PAGE_X_FIX));
page = btr_cur_get_page(cursor);
if ((page_get_data_size(page) < BTR_CUR_PAGE_COMPRESS_LIMIT)
|| ((btr_page_get_next(page, mtr) == FIL_NULL)
&& (btr_page_get_prev(page, mtr) == FIL_NULL))) {
/* The page fillfactor has dropped below a predefined
minimum value OR the level in the B-tree contains just
one page: we recommend compression if this is not the
root page. */
if (dict_tree_get_page((cursor->index)->tree)
== buf_frame_get_page_no(page)) {
/* It is the root page */
return(FALSE);
}
return(TRUE);
}
return(FALSE);
}
/*************************************************************************
Checks if the record on which the cursor is placed can be deleted without
making tree compression necessary (or recommended). */
UNIV_INLINE
ibool
btr_cur_can_delete_without_compress(
/*================================*/
/* out: TRUE if can be deleted without
recommended compression */
btr_cur_t* cursor, /* in: btr cursor */
mtr_t* mtr) /* in: mtr */
{
ulint rec_size;
page_t* page;
ut_ad(mtr_memo_contains(mtr, buf_block_align(
btr_cur_get_page(cursor)),
MTR_MEMO_PAGE_X_FIX));
rec_size = rec_get_size(btr_cur_get_rec(cursor));
page = btr_cur_get_page(cursor);
if ((page_get_data_size(page) - rec_size < BTR_CUR_PAGE_COMPRESS_LIMIT)
|| ((btr_page_get_next(page, mtr) == FIL_NULL)
&& (btr_page_get_prev(page, mtr) == FIL_NULL))
|| (page_get_n_recs(page) < 2)) {
/* The page fillfactor will drop below a predefined
minimum value, OR the level in the B-tree contains just
one page, OR the page will become empty: we recommend
compression if this is not the root page. */
if (dict_tree_get_page((cursor->index)->tree)
== buf_frame_get_page_no(page)) {
/* It is the root page */
return(TRUE);
}
return(FALSE);
}
return(TRUE);
}
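/* Illustrative sketch, not part of the Innobase source: the two checks
above reduce to a few page properties. The struct toy_page, the toy_*
function names and the value of TOY_COMPRESS_LIMIT are assumptions made
for this example only; the real BTR_CUR_PAGE_COMPRESS_LIMIT is defined
elsewhere in the tree. */

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an index page; not an Innobase type. */
struct toy_page {
	size_t	data_size;	/* bytes occupied by records */
	size_t	n_recs;		/* number of user records */
	bool	has_prev;	/* a left sibling exists on this level */
	bool	has_next;	/* a right sibling exists on this level */
	bool	is_root;	/* the page is the root of the tree */
};

/* Assumed threshold, chosen only for the example. */
#define TOY_COMPRESS_LIMIT	8192

/* Mirrors btr_cur_compress_recommendation: an underfilled page, or a
page that is alone on its level, should be compressed unless it is
the root. */
bool
toy_compress_recommended(const struct toy_page* page)
{
	bool	alone;

	assert(page != NULL);
	alone = !page->has_prev && !page->has_next;

	if (page->data_size < TOY_COMPRESS_LIMIT || alone) {
		return(!page->is_root);
	}
	return(false);
}

/* Mirrors btr_cur_can_delete_without_compress: the same test, applied
to the page as it would look after removing rec_size bytes. The
comparison is written as an addition to avoid unsigned underflow. */
bool
toy_can_delete_without_compress(const struct toy_page* page, size_t rec_size)
{
	bool	alone;

	assert(page != NULL);
	alone = !page->has_prev && !page->has_next;

	if (page->data_size < TOY_COMPRESS_LIMIT + rec_size
	    || alone || page->n_recs < 2) {
		return(page->is_root);
	}
	return(true);
}
```

/* Note that the root page inverts the answer: compression is never
recommended for it, so a delete on the root can always proceed without
compression. */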

486
innobase/include/btr0pcur.h Normal file

@ -0,0 +1,486 @@
/******************************************************
The index tree persistent cursor
(c) 1996 Innobase Oy
Created 2/23/1996 Heikki Tuuri
*******************************************************/
#ifndef btr0pcur_h
#define btr0pcur_h
#include "univ.i"
#include "dict0dict.h"
#include "data0data.h"
#include "mtr0mtr.h"
#include "page0cur.h"
#include "btr0cur.h"
#include "btr0btr.h"
#include "btr0types.h"
/* Relative positions for a stored cursor position */
#define BTR_PCUR_ON 1
#define BTR_PCUR_BEFORE 2
#define BTR_PCUR_AFTER 3
/******************************************************************
Allocates memory for a persistent cursor object and initializes the cursor. */
btr_pcur_t*
btr_pcur_create_for_mysql(void);
/*============================*/
/* out, own: persistent cursor */
/******************************************************************
Frees the memory for a persistent cursor object. */
void
btr_pcur_free_for_mysql(
/*====================*/
btr_pcur_t* cursor); /* in, own: persistent cursor */
/******************************************************************
Copies the stored position of a pcur to another pcur. */
void
btr_pcur_copy_stored_position(
/*==========================*/
btr_pcur_t* pcur_receive, /* in: pcur which will receive the
position info */
btr_pcur_t* pcur_donate); /* in: pcur from which the info is
copied */
/******************************************************************
Sets the old_rec and old_rec_buf fields to NULL and marks the old position
as not stored. */
UNIV_INLINE
void
btr_pcur_init(
/*==========*/
btr_pcur_t* pcur); /* in: persistent cursor */
/******************************************************************
Initializes and opens a persistent cursor to an index tree. It should be
closed with btr_pcur_close. */
UNIV_INLINE
void
btr_pcur_open(
/*==========*/
dict_index_t* index, /* in: index */
dtuple_t* tuple, /* in: tuple on which search done */
ulint mode, /* in: PAGE_CUR_L, ...;
NOTE that if the search is made using a unique
prefix of a record, mode should be
PAGE_CUR_LE, not PAGE_CUR_GE, as the latter
may end up on the previous page from the
record! */
ulint latch_mode,/* in: BTR_SEARCH_LEAF, ... */
btr_pcur_t* cursor, /* in: memory buffer for persistent cursor */
mtr_t* mtr); /* in: mtr */
/******************************************************************
Opens a persistent cursor to an index tree without initializing the
cursor. */
UNIV_INLINE
void
btr_pcur_open_with_no_init(
/*=======================*/
dict_index_t* index, /* in: index */
dtuple_t* tuple, /* in: tuple on which search done */
ulint mode, /* in: PAGE_CUR_L, ...;
NOTE that if the search is made using a unique
prefix of a record, mode should be
PAGE_CUR_LE, not PAGE_CUR_GE, as the latter
may end up on the previous page of the
record! */
ulint latch_mode,/* in: BTR_SEARCH_LEAF, ... */
btr_pcur_t* cursor, /* in: memory buffer for persistent cursor */
ulint has_search_latch,/* in: latch mode the caller
currently has on btr_search_latch:
RW_S_LATCH, or 0 */
mtr_t* mtr); /* in: mtr */
/*********************************************************************
Opens a persistent cursor at either end of an index. */
UNIV_INLINE
void
btr_pcur_open_at_index_side(
/*========================*/
ibool from_left, /* in: TRUE if open to the low end,
FALSE if to the high end */
dict_index_t* index, /* in: index */
ulint latch_mode, /* in: latch mode */
btr_pcur_t* pcur, /* in: cursor */
ibool do_init, /* in: TRUE if should be initialized */
mtr_t* mtr); /* in: mtr */
/******************************************************************
Gets the up_match value for a pcur after a search. */
UNIV_INLINE
ulint
btr_pcur_get_up_match(
/*==================*/
/* out: number of matched fields at the cursor
or to the right if search mode was PAGE_CUR_GE,
otherwise undefined */
btr_pcur_t* cursor); /* in: memory buffer for persistent cursor */
/******************************************************************
Gets the low_match value for a pcur after a search. */
UNIV_INLINE
ulint
btr_pcur_get_low_match(
/*===================*/
/* out: number of matched fields at the cursor
or to the right if search mode was PAGE_CUR_LE,
otherwise undefined */
btr_pcur_t* cursor); /* in: memory buffer for persistent cursor */
/******************************************************************
If mode is PAGE_CUR_G or PAGE_CUR_GE, opens a persistent cursor on the first
user record satisfying the search condition, in the case PAGE_CUR_L or
PAGE_CUR_LE, on the last user record. If no such user record exists, then
in the first case sets the cursor after last in tree, and in the latter case
before first in tree. The latching mode must be BTR_SEARCH_LEAF or
BTR_MODIFY_LEAF. */
void
btr_pcur_open_on_user_rec(
/*======================*/
dict_index_t* index, /* in: index */
dtuple_t* tuple, /* in: tuple on which search done */
ulint mode, /* in: PAGE_CUR_L, ... */
ulint latch_mode, /* in: BTR_SEARCH_LEAF or
BTR_MODIFY_LEAF */
btr_pcur_t* cursor, /* in: memory buffer for persistent
cursor */
mtr_t* mtr); /* in: mtr */
/**************************************************************************
Positions a cursor at a randomly chosen position within a B-tree. */
UNIV_INLINE
void
btr_pcur_open_at_rnd_pos(
/*=====================*/
dict_index_t* index, /* in: index */
ulint latch_mode, /* in: BTR_SEARCH_LEAF, ... */
btr_pcur_t* cursor, /* in/out: B-tree pcur */
mtr_t* mtr); /* in: mtr */
/******************************************************************
Frees the possible old_rec_buf buffer of a persistent cursor and sets the
latch mode of the persistent cursor to BTR_NO_LATCHES. */
UNIV_INLINE
void
btr_pcur_close(
/*===========*/
btr_pcur_t* cursor); /* in: persistent cursor */
/******************************************************************
The position of the cursor is stored by taking an initial segment of the
record the cursor is positioned on, before, or after, and copying it to the
cursor data structure. NOTE that the page where the cursor is positioned
must not be empty! */
void
btr_pcur_store_position(
/*====================*/
btr_pcur_t* cursor, /* in: persistent cursor */
mtr_t* mtr); /* in: mtr */
/******************************************************************
If the latch mode of the cursor is BTR_SEARCH_LEAF or BTR_MODIFY_LEAF,
releases the page latch and bufferfix reserved by the cursor.
NOTE! In the case of BTR_MODIFY_LEAF, there should not exist changes
made by the current mini-transaction to the data protected by the
cursor latch, as then the latch must not be released until mtr_commit. */
void
btr_pcur_release_leaf(
/*==================*/
btr_pcur_t* cursor, /* in: persistent cursor */
mtr_t* mtr); /* in: mtr */
/*************************************************************
Gets the rel_pos field for a cursor whose position has been stored. */
UNIV_INLINE
ulint
btr_pcur_get_rel_pos(
/*=================*/
/* out: BTR_PCUR_ON, ... */
btr_pcur_t* cursor);/* in: persistent cursor */
/******************************************************************
Restores the stored position of a persistent cursor, bufferfixing the page and
obtaining the specified latches. If the cursor position was saved when the
(1) cursor was positioned on a user record: this function restores the position
to the last record LESS OR EQUAL to the stored record;
(2) cursor was positioned on a page infimum record: restores the position to
the last record LESS than the user record which was the successor of the page
infimum;
(3) cursor was positioned on the page supremum: restores to the first record
GREATER than the user record which was the predecessor of the supremum. */
ibool
btr_pcur_restore_position(
/*======================*/
/* out: TRUE if the cursor position
was stored when it was on a user record
and it can be restored on a user record
whose ordering fields are identical to
the ones of the original user record */
ulint latch_mode, /* in: BTR_SEARCH_LEAF, ... */
btr_pcur_t* cursor, /* in: detached persistent cursor */
mtr_t* mtr); /* in: mtr */
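/* Illustrative sketch, not part of the Innobase source: the three
restore rules above, replayed on a sorted array of integer keys instead
of an index tree. All toy_* names are hypothetical. Index -1 stands for
'before first' and index n for 'after last'. */

```c
#include <assert.h>
#include <stddef.h>

/* Relative positions, numbered like BTR_PCUR_ON, BTR_PCUR_BEFORE and
BTR_PCUR_AFTER. */
enum toy_rel_pos { TOY_ON = 1, TOY_BEFORE = 2, TOY_AFTER = 3 };

/* Hypothetical stored position: the key of the remembered record and
where the cursor stood relative to it. */
struct toy_stored_pos {
	int			key;
	enum toy_rel_pos	rel_pos;
};

/* Returns the restored index in the sorted array keys[0..n-1].
*same is set to 1 only if the position was stored ON a record and a
record with an identical key was found again. */
long
toy_restore(const int* keys, long n, struct toy_stored_pos pos, int* same)
{
	long	i;

	assert(same != NULL);
	assert(keys != NULL || n == 0);
	*same = 0;

	if (pos.rel_pos == TOY_AFTER) {
		/* Rule (3): first record GREATER than the stored key */
		for (i = 0; i < n && keys[i] <= pos.key; i++) {
		}
		return(i);
	}

	/* Rule (1): last record LESS OR EQUAL to the stored key;
	rule (2): last record strictly LESS than the stored key */
	for (i = n - 1; i >= 0; i--) {
		if (keys[i] < pos.key
		    || (pos.rel_pos == TOY_ON && keys[i] == pos.key)) {
			break;
		}
	}
	if (pos.rel_pos == TOY_ON && i >= 0 && keys[i] == pos.key) {
		*same = 1;
	}
	return(i);
}
```

/* If the remembered record was deleted while the cursor was detached,
the ON case lands on its predecessor and *same stays 0, which is the
situation the return value of btr_pcur_restore_position reports. */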
/*************************************************************
Sets the mtr field for a pcur. */
UNIV_INLINE
void
btr_pcur_set_mtr(
/*=============*/
btr_pcur_t* cursor, /* in: persistent cursor */
mtr_t* mtr); /* in, own: mtr */
/*************************************************************
Gets the mtr field for a pcur. */
UNIV_INLINE
mtr_t*
btr_pcur_get_mtr(
/*=============*/
/* out: mtr */
btr_pcur_t* cursor); /* in: persistent cursor */
/******************************************************************
Commits the pcur mtr and sets the pcur latch mode to BTR_NO_LATCHES,
that is, the cursor becomes detached. If there have been modifications
to the page where pcur is positioned, this can be used instead of
btr_pcur_release_leaf. Function btr_pcur_store_position should be used
before calling this, if restoration of cursor is wanted later. */
UNIV_INLINE
void
btr_pcur_commit(
/*============*/
btr_pcur_t* pcur); /* in: persistent cursor */
/******************************************************************
Differs from btr_pcur_commit in that we can specify the mtr to commit. */
UNIV_INLINE
void
btr_pcur_commit_specify_mtr(
/*========================*/
btr_pcur_t* pcur, /* in: persistent cursor */
mtr_t* mtr); /* in: mtr to commit */
/******************************************************************
Tests if a cursor is detached, that is, if the latch mode is BTR_NO_LATCHES. */
UNIV_INLINE
ibool
btr_pcur_is_detached(
/*=================*/
/* out: TRUE if detached */
btr_pcur_t* pcur); /* in: persistent cursor */
/*************************************************************
Moves the persistent cursor to the next record in the tree. If no records are
left, the cursor stays 'after last in tree'. */
UNIV_INLINE
ibool
btr_pcur_move_to_next(
/*==================*/
/* out: TRUE if the cursor was not after last
in tree */
btr_pcur_t* cursor, /* in: persistent cursor; NOTE that the
function may release the page latch */
mtr_t* mtr); /* in: mtr */
/*************************************************************
Moves the persistent cursor to the previous record in the tree. If no records
are left, the cursor stays 'before first in tree'. */
ibool
btr_pcur_move_to_prev(
/*==================*/
/* out: TRUE if the cursor was not before first
in tree */
btr_pcur_t* cursor, /* in: persistent cursor; NOTE that the
function may release the page latch */
mtr_t* mtr); /* in: mtr */
/*************************************************************
Moves the persistent cursor to the next user record in the tree. If no user
records are left, the cursor ends up 'after last in tree'. */
UNIV_INLINE
ibool
btr_pcur_move_to_next_user_rec(
/*===========================*/
/* out: TRUE if the cursor moved forward,
ending on a user record */
btr_pcur_t* cursor, /* in: persistent cursor; NOTE that the
function may release the page latch */
mtr_t* mtr); /* in: mtr */
/*************************************************************
Moves the persistent cursor to the first record on the next page.
Releases the latch on the current page, and bufferunfixes it.
Note that there must not be modifications on the current page,
as then the x-latch can be released only in mtr_commit. */
void
btr_pcur_move_to_next_page(
/*=======================*/
btr_pcur_t* cursor, /* in: persistent cursor; must be on the
last record of the current page */
mtr_t* mtr); /* in: mtr */
/*************************************************************
Moves the persistent cursor backward if it is on the first record
of the page. Releases the latch on the current page, and bufferunfixes
it. Note that to prevent a possible deadlock, the operation first
stores the position of the cursor, releases the leaf latch, acquires
necessary latches and restores the cursor position again before returning.
The alphabetical position of the cursor is guaranteed to be sensible
on return, but it may happen that the cursor is not positioned on the
last record of any page, because the structure of the tree may have
changed while the cursor had no latches. */
void
btr_pcur_move_backward_from_page(
/*=============================*/
btr_pcur_t* cursor, /* in: persistent cursor, must be on the
first record of the current page */
mtr_t* mtr); /* in: mtr */
/*************************************************************
Returns the btr cursor component of a persistent cursor. */
UNIV_INLINE
btr_cur_t*
btr_pcur_get_btr_cur(
/*=================*/
/* out: pointer to btr cursor component */
btr_pcur_t* cursor); /* in: persistent cursor */
/*************************************************************
Returns the page cursor component of a persistent cursor. */
UNIV_INLINE
page_cur_t*
btr_pcur_get_page_cur(
/*==================*/
/* out: pointer to page cursor component */
btr_pcur_t* cursor); /* in: persistent cursor */
/*************************************************************
Returns the page of a persistent cursor. */
UNIV_INLINE
page_t*
btr_pcur_get_page(
/*==============*/
/* out: pointer to the page */
btr_pcur_t* cursor);/* in: persistent cursor */
/*************************************************************
Returns the record of a persistent cursor. */
UNIV_INLINE
rec_t*
btr_pcur_get_rec(
/*=============*/
/* out: pointer to the record */
btr_pcur_t* cursor);/* in: persistent cursor */
/*************************************************************
Checks if the persistent cursor is on a user record. */
UNIV_INLINE
ibool
btr_pcur_is_on_user_rec(
/*====================*/
btr_pcur_t* cursor, /* in: persistent cursor */
mtr_t* mtr); /* in: mtr */
/*************************************************************
Checks if the persistent cursor is after the last user record on
a page. */
UNIV_INLINE
ibool
btr_pcur_is_after_last_on_page(
/*===========================*/
btr_pcur_t* cursor, /* in: persistent cursor */
mtr_t* mtr); /* in: mtr */
/*************************************************************
Checks if the persistent cursor is before the first user record on
a page. */
UNIV_INLINE
ibool
btr_pcur_is_before_first_on_page(
/*=============================*/
btr_pcur_t* cursor, /* in: persistent cursor */
mtr_t* mtr); /* in: mtr */
/*************************************************************
Checks if the persistent cursor is before the first user record in
the index tree. */
UNIV_INLINE
ibool
btr_pcur_is_before_first_in_tree(
/*=============================*/
btr_pcur_t* cursor, /* in: persistent cursor */
mtr_t* mtr); /* in: mtr */
/*************************************************************
Checks if the persistent cursor is after the last user record in
the index tree. */
UNIV_INLINE
ibool
btr_pcur_is_after_last_in_tree(
/*===========================*/
btr_pcur_t* cursor, /* in: persistent cursor */
mtr_t* mtr); /* in: mtr */
/*************************************************************
Moves the persistent cursor to the next record on the same page. */
UNIV_INLINE
void
btr_pcur_move_to_next_on_page(
/*==========================*/
btr_pcur_t* cursor, /* in: persistent cursor */
mtr_t* mtr); /* in: mtr */
/*************************************************************
Moves the persistent cursor to the previous record on the same page. */
UNIV_INLINE
void
btr_pcur_move_to_prev_on_page(
/*==========================*/
btr_pcur_t* cursor, /* in: persistent cursor */
mtr_t* mtr); /* in: mtr */
/* The persistent B-tree cursor structure. This is used mainly for SQL
selects, updates, and deletes. */
struct btr_pcur_struct{
btr_cur_t btr_cur; /* a B-tree cursor */
ulint latch_mode; /* see FIXME note below!
BTR_SEARCH_LEAF, BTR_MODIFY_LEAF,
BTR_MODIFY_TREE, or BTR_NO_LATCHES,
depending on the latching state of
the page and tree where the cursor is
positioned; the last value means that
the cursor is not currently positioned:
we say then that the cursor is
detached; it can be restored to
attached if the old position was
stored in old_rec */
ulint old_stored; /* BTR_PCUR_OLD_STORED
or BTR_PCUR_OLD_NOT_STORED */
rec_t* old_rec; /* if cursor position is stored,
contains an initial segment of the
latest record cursor was positioned
either on, before, or after */
ulint rel_pos; /* BTR_PCUR_ON, BTR_PCUR_BEFORE, or
BTR_PCUR_AFTER, depending on whether
cursor was on, before, or after the
old_rec record */
dulint modify_clock; /* the modify clock value of the
buffer block when the cursor position
was stored */
ulint pos_state; /* see FIXME note below!
BTR_PCUR_IS_POSITIONED,
BTR_PCUR_WAS_POSITIONED,
BTR_PCUR_NOT_POSITIONED */
ulint search_mode; /* PAGE_CUR_G, ... */
/*-----------------------------*/
/* NOTE that the following fields may possess dynamically allocated
memory, which should be freed if not needed anymore! */
mtr_t* mtr; /* NULL, or this field may contain
a mini-transaction which holds the
latch on the cursor page */
byte* old_rec_buf; /* NULL, or a dynamically allocated
buffer for old_rec */
ulint buf_size; /* old_rec_buf size if old_rec_buf
is not NULL */
};
#define BTR_PCUR_IS_POSITIONED 1997660512 /* FIXME: currently, the state
can be BTR_PCUR_IS_POSITIONED,
though it really should be
BTR_PCUR_WAS_POSITIONED,
because we have no obligation
to commit the cursor with
mtr; similarly latch_mode may
be out of date */
#define BTR_PCUR_WAS_POSITIONED 1187549791
#define BTR_PCUR_NOT_POSITIONED 1328997689
#define BTR_PCUR_OLD_STORED 908467085
#define BTR_PCUR_OLD_NOT_STORED 122766467
#ifndef UNIV_NONINL
#include "btr0pcur.ic"
#endif
#endif


@ -0,0 +1,598 @@
/******************************************************
The index tree persistent cursor
(c) 1996 Innobase Oy
Created 2/23/1996 Heikki Tuuri
*******************************************************/
/*************************************************************
Gets the rel_pos field for a cursor whose position has been stored. */
UNIV_INLINE
ulint
btr_pcur_get_rel_pos(
/*=================*/
/* out: BTR_PCUR_ON, ... */
btr_pcur_t* cursor) /* in: persistent cursor */
{
ut_ad(cursor);
ut_ad(cursor->old_rec);
ut_ad(cursor->old_stored == BTR_PCUR_OLD_STORED);
ut_ad((cursor->pos_state == BTR_PCUR_WAS_POSITIONED)
|| (cursor->pos_state == BTR_PCUR_IS_POSITIONED));
return(cursor->rel_pos);
}
/*************************************************************
Sets the mtr field for a pcur. */
UNIV_INLINE
void
btr_pcur_set_mtr(
/*=============*/
btr_pcur_t* cursor, /* in: persistent cursor */
mtr_t* mtr) /* in, own: mtr */
{
ut_ad(cursor);
cursor->mtr = mtr;
}
/*************************************************************
Gets the mtr field for a pcur. */
UNIV_INLINE
mtr_t*
btr_pcur_get_mtr(
/*=============*/
/* out: mtr */
btr_pcur_t* cursor) /* in: persistent cursor */
{
ut_ad(cursor);
return(cursor->mtr);
}
/*************************************************************
Returns the btr cursor component of a persistent cursor. */
UNIV_INLINE
btr_cur_t*
btr_pcur_get_btr_cur(
/*=================*/
/* out: pointer to btr cursor component */
btr_pcur_t* cursor) /* in: persistent cursor */
{
return(&(cursor->btr_cur));
}
/*************************************************************
Returns the page cursor component of a persistent cursor. */
UNIV_INLINE
page_cur_t*
btr_pcur_get_page_cur(
/*==================*/
/* out: pointer to page cursor component */
btr_pcur_t* cursor) /* in: persistent cursor */
{
return(btr_cur_get_page_cur(&(cursor->btr_cur)));
}
/*************************************************************
Returns the page of a persistent cursor. */
UNIV_INLINE
page_t*
btr_pcur_get_page(
/*==============*/
/* out: pointer to the page */
btr_pcur_t* cursor) /* in: persistent cursor */
{
ut_ad(cursor->pos_state == BTR_PCUR_IS_POSITIONED);
return(page_cur_get_page(btr_pcur_get_page_cur(cursor)));
}
/*************************************************************
Returns the record of a persistent cursor. */
UNIV_INLINE
rec_t*
btr_pcur_get_rec(
/*=============*/
/* out: pointer to the record */
btr_pcur_t* cursor) /* in: persistent cursor */
{
ut_ad(cursor->pos_state == BTR_PCUR_IS_POSITIONED);
ut_ad(cursor->latch_mode != BTR_NO_LATCHES);
return(page_cur_get_rec(btr_pcur_get_page_cur(cursor)));
}
/******************************************************************
Gets the up_match value for a pcur after a search. */
UNIV_INLINE
ulint
btr_pcur_get_up_match(
/*==================*/
/* out: number of matched fields at the cursor
or to the right if search mode was PAGE_CUR_GE,
otherwise undefined */
btr_pcur_t* cursor) /* in: memory buffer for persistent cursor */
{
btr_cur_t* btr_cursor;
ut_ad((cursor->pos_state == BTR_PCUR_WAS_POSITIONED)
|| (cursor->pos_state == BTR_PCUR_IS_POSITIONED));
btr_cursor = btr_pcur_get_btr_cur(cursor);
ut_ad(btr_cursor->up_match != ULINT_UNDEFINED);
return(btr_cursor->up_match);
}
/******************************************************************
Gets the low_match value for a pcur after a search. */
UNIV_INLINE
ulint
btr_pcur_get_low_match(
/*===================*/
/* out: number of matched fields at the cursor
or to the right if search mode was PAGE_CUR_LE,
otherwise undefined */
btr_pcur_t* cursor) /* in: memory buffer for persistent cursor */
{
btr_cur_t* btr_cursor;
ut_ad((cursor->pos_state == BTR_PCUR_WAS_POSITIONED)
|| (cursor->pos_state == BTR_PCUR_IS_POSITIONED));
btr_cursor = btr_pcur_get_btr_cur(cursor);
ut_ad(btr_cursor->low_match != ULINT_UNDEFINED);
return(btr_cursor->low_match);
}
/*************************************************************
Checks if the persistent cursor is after the last user record on
a page. */
UNIV_INLINE
ibool
btr_pcur_is_after_last_on_page(
/*===========================*/
btr_pcur_t* cursor, /* in: persistent cursor */
mtr_t* mtr) /* in: mtr */
{
UT_NOT_USED(mtr);
ut_ad(cursor->pos_state == BTR_PCUR_IS_POSITIONED);
ut_ad(cursor->latch_mode != BTR_NO_LATCHES);
return(page_cur_is_after_last(btr_pcur_get_page_cur(cursor)));
}
/*************************************************************
Checks if the persistent cursor is before the first user record on
a page. */
UNIV_INLINE
ibool
btr_pcur_is_before_first_on_page(
/*=============================*/
btr_pcur_t* cursor, /* in: persistent cursor */
mtr_t* mtr) /* in: mtr */
{
UT_NOT_USED(mtr);
ut_ad(cursor->pos_state == BTR_PCUR_IS_POSITIONED);
ut_ad(cursor->latch_mode != BTR_NO_LATCHES);
return(page_cur_is_before_first(btr_pcur_get_page_cur(cursor)));
}
/*************************************************************
Checks if the persistent cursor is on a user record. */
UNIV_INLINE
ibool
btr_pcur_is_on_user_rec(
/*====================*/
btr_pcur_t* cursor, /* in: persistent cursor */
mtr_t* mtr) /* in: mtr */
{
ut_ad(cursor->pos_state == BTR_PCUR_IS_POSITIONED);
ut_ad(cursor->latch_mode != BTR_NO_LATCHES);
if ((btr_pcur_is_before_first_on_page(cursor, mtr))
|| (btr_pcur_is_after_last_on_page(cursor, mtr))) {
return(FALSE);
}
return(TRUE);
}
/*************************************************************
Checks if the persistent cursor is before the first user record in
the index tree. */
UNIV_INLINE
ibool
btr_pcur_is_before_first_in_tree(
/*=============================*/
btr_pcur_t* cursor, /* in: persistent cursor */
mtr_t* mtr) /* in: mtr */
{
ut_ad(cursor->pos_state == BTR_PCUR_IS_POSITIONED);
ut_ad(cursor->latch_mode != BTR_NO_LATCHES);
if (btr_page_get_prev(btr_pcur_get_page(cursor), mtr) != FIL_NULL) {
return(FALSE);
}
return(page_cur_is_before_first(btr_pcur_get_page_cur(cursor)));
}
/*************************************************************
Checks if the persistent cursor is after the last user record in
the index tree. */
UNIV_INLINE
ibool
btr_pcur_is_after_last_in_tree(
/*===========================*/
btr_pcur_t* cursor, /* in: persistent cursor */
mtr_t* mtr) /* in: mtr */
{
ut_ad(cursor->pos_state == BTR_PCUR_IS_POSITIONED);
ut_ad(cursor->latch_mode != BTR_NO_LATCHES);
if (btr_page_get_next(btr_pcur_get_page(cursor), mtr) != FIL_NULL) {
return(FALSE);
}
return(page_cur_is_after_last(btr_pcur_get_page_cur(cursor)));
}
/*************************************************************
Moves the persistent cursor to the next record on the same page. */
UNIV_INLINE
void
btr_pcur_move_to_next_on_page(
/*==========================*/
btr_pcur_t* cursor, /* in: persistent cursor */
mtr_t* mtr) /* in: mtr */
{
UT_NOT_USED(mtr);
ut_ad(cursor->pos_state == BTR_PCUR_IS_POSITIONED);
ut_ad(cursor->latch_mode != BTR_NO_LATCHES);
page_cur_move_to_next(btr_pcur_get_page_cur(cursor));
cursor->old_stored = BTR_PCUR_OLD_NOT_STORED;
}
/*************************************************************
Moves the persistent cursor to the previous record on the same page. */
UNIV_INLINE
void
btr_pcur_move_to_prev_on_page(
/*==========================*/
btr_pcur_t* cursor, /* in: persistent cursor */
mtr_t* mtr) /* in: mtr */
{
UT_NOT_USED(mtr);
ut_ad(cursor->pos_state == BTR_PCUR_IS_POSITIONED);
ut_ad(cursor->latch_mode != BTR_NO_LATCHES);
page_cur_move_to_prev(btr_pcur_get_page_cur(cursor));
cursor->old_stored = BTR_PCUR_OLD_NOT_STORED;
}
/*************************************************************
Moves the persistent cursor to the next user record in the tree. If no user
records are left, the cursor ends up 'after last in tree'. */
UNIV_INLINE
ibool
btr_pcur_move_to_next_user_rec(
/*===========================*/
/* out: TRUE if the cursor moved forward,
ending on a user record */
btr_pcur_t* cursor, /* in: persistent cursor; NOTE that the
function may release the page latch */
mtr_t* mtr) /* in: mtr */
{
ut_ad(cursor->pos_state == BTR_PCUR_IS_POSITIONED);
ut_ad(cursor->latch_mode != BTR_NO_LATCHES);
cursor->old_stored = BTR_PCUR_OLD_NOT_STORED;
loop:
if (btr_pcur_is_after_last_on_page(cursor, mtr)) {
if (btr_pcur_is_after_last_in_tree(cursor, mtr)) {
return(FALSE);
}
btr_pcur_move_to_next_page(cursor, mtr);
} else {
btr_pcur_move_to_next_on_page(cursor, mtr);
}
if (btr_pcur_is_on_user_rec(cursor, mtr)) {
return(TRUE);
}
goto loop;
}
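/* Illustrative sketch, not part of the Innobase source: the loop above,
replayed on a singly linked list of leaf pages. Each toy page has an
infimum at slot 0, user records at slots 1..n_recs and a supremum at
slot n_recs + 1. The toy_* names are hypothetical, and the sketch
assumes that moving to the next page places the cursor on that page's
infimum. */

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical leaf page: an array of user records plus a link to the
right sibling. */
struct toy_leaf {
	const int*		recs;
	size_t			n_recs;
	struct toy_leaf*	next;
};

/* Hypothetical cursor: a page and a slot number on it. */
struct toy_cursor {
	struct toy_leaf*	page;
	size_t			slot;
};

/* TRUE if the cursor is on neither the infimum nor the supremum. */
static int
toy_on_user_rec(const struct toy_cursor* c)
{
	return(c->slot >= 1 && c->slot <= c->page->n_recs);
}

/* Value of the user record under the cursor. */
int
toy_value(const struct toy_cursor* c)
{
	assert(toy_on_user_rec(c));
	return(c->page->recs[c->slot - 1]);
}

/* Returns 1 if the cursor moved forward and ended on a user record,
0 if it is 'after last in tree'. Empty pages are skipped because
neither of their two slots is a user record. */
int
toy_move_to_next_user_rec(struct toy_cursor* c)
{
	assert(c != NULL && c->page != NULL);

	for (;;) {
		if (c->slot == c->page->n_recs + 1) {
			/* After last on page */
			if (c->page->next == NULL) {
				return(0);
			}
			c->page = c->page->next;
			c->slot = 0;
		} else {
			c->slot++;
		}
		if (toy_on_user_rec(c)) {
			return(1);
		}
	}
}
```

/* Starting from the infimum of the first page, repeated calls visit
every user record in order and then return 0 with the cursor left on
the supremum of the last page. */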
/*************************************************************
Moves the persistent cursor to the next record in the tree. If no records are
left, the cursor stays 'after last in tree'. */
UNIV_INLINE
ibool
btr_pcur_move_to_next(
/*==================*/
/* out: TRUE if the cursor was not after last
in tree */
btr_pcur_t* cursor, /* in: persistent cursor; NOTE that the
function may release the page latch */
mtr_t* mtr) /* in: mtr */
{
ut_ad(cursor->pos_state == BTR_PCUR_IS_POSITIONED);
ut_ad(cursor->latch_mode != BTR_NO_LATCHES);
cursor->old_stored = BTR_PCUR_OLD_NOT_STORED;
if (btr_pcur_is_after_last_on_page(cursor, mtr)) {
if (btr_pcur_is_after_last_in_tree(cursor, mtr)) {
return(FALSE);
}
btr_pcur_move_to_next_page(cursor, mtr);
return(TRUE);
}
btr_pcur_move_to_next_on_page(cursor, mtr);
return(TRUE);
}
/******************************************************************
Commits the pcur mtr and sets the pcur latch mode to BTR_NO_LATCHES,
that is, the cursor becomes detached. If there have been modifications
to the page where pcur is positioned, this can be used instead of
btr_pcur_release_leaf. Function btr_pcur_store_position should be used
before calling this, if restoration of cursor is wanted later. */
UNIV_INLINE
void
btr_pcur_commit(
/*============*/
btr_pcur_t* pcur) /* in: persistent cursor */
{
ut_a(pcur->pos_state == BTR_PCUR_IS_POSITIONED);
pcur->latch_mode = BTR_NO_LATCHES;
mtr_commit(pcur->mtr);
pcur->pos_state = BTR_PCUR_WAS_POSITIONED;
}
/******************************************************************
Differs from btr_pcur_commit in that we can specify the mtr to commit. */
UNIV_INLINE
void
btr_pcur_commit_specify_mtr(
/*========================*/
btr_pcur_t* pcur, /* in: persistent cursor */
mtr_t* mtr) /* in: mtr to commit */
{
ut_a(pcur->pos_state == BTR_PCUR_IS_POSITIONED);
pcur->latch_mode = BTR_NO_LATCHES;
mtr_commit(mtr);
pcur->pos_state = BTR_PCUR_WAS_POSITIONED;
}
/******************************************************************
Sets the pcur latch mode to BTR_NO_LATCHES and the position state to
BTR_PCUR_WAS_POSITIONED; unlike btr_pcur_commit, no mtr is committed. */
UNIV_INLINE
void
btr_pcur_detach(
/*============*/
btr_pcur_t* pcur) /* in: persistent cursor */
{
ut_a(pcur->pos_state == BTR_PCUR_IS_POSITIONED);
pcur->latch_mode = BTR_NO_LATCHES;
pcur->pos_state = BTR_PCUR_WAS_POSITIONED;
}
/******************************************************************
Tests if a cursor is detached, that is, if the latch mode is BTR_NO_LATCHES. */
UNIV_INLINE
ibool
btr_pcur_is_detached(
/*=================*/
/* out: TRUE if detached */
btr_pcur_t* pcur) /* in: persistent cursor */
{
if (pcur->latch_mode == BTR_NO_LATCHES) {
return(TRUE);
}
return(FALSE);
}
/******************************************************************
Sets the old_rec and old_rec_buf fields to NULL and marks the old position
as not stored. */
UNIV_INLINE
void
btr_pcur_init(
/*==========*/
btr_pcur_t* pcur) /* in: persistent cursor */
{
pcur->old_stored = BTR_PCUR_OLD_NOT_STORED;
pcur->old_rec_buf = NULL;
pcur->old_rec = NULL;
}
/******************************************************************
Initializes and opens a persistent cursor to an index tree. It should be
closed with btr_pcur_close. */
UNIV_INLINE
void
btr_pcur_open(
/*==========*/
dict_index_t* index, /* in: index */
dtuple_t* tuple, /* in: tuple on which search done */
ulint mode, /* in: PAGE_CUR_L, ...;
NOTE that if the search is made using a unique
prefix of a record, mode should be
PAGE_CUR_LE, not PAGE_CUR_GE, as the latter
may end up on the previous page from the
record! */
ulint latch_mode,/* in: BTR_SEARCH_LEAF, ... */
btr_pcur_t* cursor, /* in: memory buffer for persistent cursor */
mtr_t* mtr) /* in: mtr */
{
btr_cur_t* btr_cursor;
/* Initialize the cursor */
btr_pcur_init(cursor);
cursor->latch_mode = latch_mode;
cursor->search_mode = mode;
/* Search with the tree cursor */
btr_cursor = btr_pcur_get_btr_cur(cursor);
btr_cur_search_to_nth_level(index, 0, tuple, mode, latch_mode,
btr_cursor, 0, mtr);
cursor->pos_state = BTR_PCUR_IS_POSITIONED;
}
/******************************************************************
Opens a persistent cursor to an index tree without initializing the
cursor. */
UNIV_INLINE
void
btr_pcur_open_with_no_init(
/*=======================*/
dict_index_t* index, /* in: index */
dtuple_t* tuple, /* in: tuple on which search done */
ulint mode, /* in: PAGE_CUR_L, ...;
NOTE that if the search is made using a unique
prefix of a record, mode should be
PAGE_CUR_LE, not PAGE_CUR_GE, as the latter
may end up on the previous page of the
record! */
ulint latch_mode,/* in: BTR_SEARCH_LEAF, ... */
btr_pcur_t* cursor, /* in: memory buffer for persistent cursor */
ulint has_search_latch,/* in: latch mode the caller
currently has on btr_search_latch:
RW_S_LATCH, or 0 */
mtr_t* mtr) /* in: mtr */
{
btr_cur_t* btr_cursor;
cursor->latch_mode = latch_mode;
cursor->search_mode = mode;
/* Search with the tree cursor */
btr_cursor = btr_pcur_get_btr_cur(cursor);
btr_cur_search_to_nth_level(index, 0, tuple, mode, latch_mode,
btr_cursor, has_search_latch, mtr);
cursor->pos_state = BTR_PCUR_IS_POSITIONED;
cursor->old_stored = BTR_PCUR_OLD_NOT_STORED;
}
/*********************************************************************
Opens a persistent cursor at either end of an index. */
UNIV_INLINE
void
btr_pcur_open_at_index_side(
/*========================*/
ibool from_left, /* in: TRUE if open to the low end,
FALSE if to the high end */
dict_index_t* index, /* in: index */
ulint latch_mode, /* in: latch mode */
btr_pcur_t* pcur, /* in: cursor */
ibool do_init, /* in: TRUE if should be initialized */
mtr_t* mtr) /* in: mtr */
{
pcur->latch_mode = latch_mode;
if (from_left) {
pcur->search_mode = PAGE_CUR_G;
} else {
pcur->search_mode = PAGE_CUR_L;
}
if (do_init) {
btr_pcur_init(pcur);
}
btr_cur_open_at_index_side(from_left, index, latch_mode,
btr_pcur_get_btr_cur(pcur), mtr);
pcur->pos_state = BTR_PCUR_IS_POSITIONED;
pcur->old_stored = BTR_PCUR_OLD_NOT_STORED;
}
/**************************************************************************
Positions a cursor at a randomly chosen position within a B-tree. */
UNIV_INLINE
void
btr_pcur_open_at_rnd_pos(
/*=====================*/
dict_index_t* index, /* in: index */
ulint latch_mode, /* in: BTR_SEARCH_LEAF, ... */
btr_pcur_t* cursor, /* in/out: B-tree pcur */
mtr_t* mtr) /* in: mtr */
{
/* Initialize the cursor */
cursor->latch_mode = latch_mode;
cursor->search_mode = PAGE_CUR_G;
btr_pcur_init(cursor);
btr_cur_open_at_rnd_pos(index, latch_mode,
btr_pcur_get_btr_cur(cursor), mtr);
cursor->pos_state = BTR_PCUR_IS_POSITIONED;
cursor->old_stored = BTR_PCUR_OLD_NOT_STORED;
}
/******************************************************************
Frees the possible memory heap of a persistent cursor and sets the latch
mode of the persistent cursor to BTR_NO_LATCHES. */
UNIV_INLINE
void
btr_pcur_close(
/*===========*/
btr_pcur_t* cursor) /* in: persistent cursor */
{
if (cursor->old_rec_buf != NULL) {
mem_free(cursor->old_rec_buf);
cursor->old_rec = NULL;
cursor->old_rec_buf = NULL;
}
cursor->btr_cur.page_cur.rec = NULL;
cursor->old_rec = NULL;
cursor->old_stored = BTR_PCUR_OLD_NOT_STORED;
cursor->latch_mode = BTR_NO_LATCHES;
cursor->pos_state = BTR_PCUR_NOT_POSITIONED;
}
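
The open and close routines above share one pattern: every open leaves the cursor positioned with no old record stored, and close frees any stored record buffer and returns the cursor to the not-positioned state. The following is a minimal sketch of that lifecycle only. All `sketch_*` names and the enum values are hypothetical stand-ins, not the Innobase definitions, and the real routines additionally perform the B-tree search and latch handling.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the BTR_PCUR_* constants; values are arbitrary. */
enum sketch_pos_state { SKETCH_NOT_POSITIONED, SKETCH_IS_POSITIONED };
enum sketch_old_stored { SKETCH_OLD_NOT_STORED, SKETCH_OLD_STORED };

typedef struct sketch_pcur {
	enum sketch_pos_state	pos_state;
	enum sketch_old_stored	old_stored;
	unsigned char*		old_rec_buf;	/* heap copy of a stored
						record, or NULL */
} sketch_pcur_t;

/* Models the tail of the open routines: after the (omitted) tree search
the cursor is positioned and no old record is stored. */
static void
sketch_pcur_open(sketch_pcur_t* cursor)
{
	assert(cursor != NULL);

	cursor->pos_state = SKETCH_IS_POSITIONED;
	cursor->old_stored = SKETCH_OLD_NOT_STORED;
}

/* Models btr_pcur_close(): release the record buffer if one exists, then
reset the state fields. */
static void
sketch_pcur_close(sketch_pcur_t* cursor)
{
	assert(cursor != NULL);

	if (cursor->old_rec_buf != NULL) {
		free(cursor->old_rec_buf);
		cursor->old_rec_buf = NULL;
	}

	cursor->old_stored = SKETCH_OLD_NOT_STORED;
	cursor->pos_state = SKETCH_NOT_POSITIONED;
}
```

A caller would zero-initialize a `sketch_pcur_t`, call `sketch_pcur_open()`, and always pair it with `sketch_pcur_close()` so that a buffer attached in between is not leaked.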

innobase/include/btr0sea.h Normal file

@ -0,0 +1,269 @@
/************************************************************************
The index tree adaptive search
(c) 1996 Innobase Oy
Created 2/17/1996 Heikki Tuuri
*************************************************************************/
#ifndef btr0sea_h
#define btr0sea_h
#include "univ.i"
#include "rem0rec.h"
#include "dict0dict.h"
#include "btr0types.h"
#include "mtr0mtr.h"
#include "ha0ha.h"
/*********************************************************************
Creates and initializes the adaptive search system at a database start. */
void
btr_search_sys_create(
/*==================*/
ulint hash_size); /* in: hash index hash table size */
/************************************************************************
Returns search info for an index. */
UNIV_INLINE
btr_search_t*
btr_search_get_info(
/*================*/
/* out: search info; search mutex reserved */
dict_index_t* index); /* in: index */
/*********************************************************************
Creates and initializes a search info struct. */
btr_search_t*
btr_search_info_create(
/*===================*/
/* out, own: search info struct */
mem_heap_t* heap); /* in: heap where created */
/*************************************************************************
Updates the search info. */
UNIV_INLINE
void
btr_search_info_update(
/*===================*/
dict_index_t* index, /* in: index of the cursor */
btr_cur_t* cursor);/* in: cursor which was just positioned */
/**********************************************************************
Tries to guess the right search position based on the search pattern info
of the index. */
ibool
btr_search_guess_on_pattern(
/*========================*/
/* out: TRUE if succeeded */
dict_index_t* index, /* in: index */
btr_search_t* info, /* in: index search info */
dtuple_t* tuple, /* in: logical record */
ulint mode, /* in: PAGE_CUR_L, ... */
ulint latch_mode, /* in: BTR_SEARCH_LEAF, ... */
btr_cur_t* cursor, /* out: tree cursor */
mtr_t* mtr); /* in: mtr */
/**********************************************************************
Tries to guess the right search position based on the hash search info
of the index. Note that if mode is PAGE_CUR_LE, which is used in inserts,
and the function returns TRUE, then cursor->up_match and cursor->low_match
both have sensible values. */
ibool
btr_search_guess_on_hash(
/*=====================*/
/* out: TRUE if succeeded */
dict_index_t* index, /* in: index */
btr_search_t* info, /* in: index search info */
dtuple_t* tuple, /* in: logical record */
ulint mode, /* in: PAGE_CUR_L, ... */
ulint latch_mode, /* in: BTR_SEARCH_LEAF, ... */
btr_cur_t* cursor, /* out: tree cursor */
ulint has_search_latch,/* in: latch mode the caller
currently has on btr_search_latch:
RW_S_LATCH, RW_X_LATCH, or 0 */
mtr_t* mtr); /* in: mtr */
/************************************************************************
Moves or deletes hash entries for moved records. If new_page is already hashed,
then the hash index for page, if any, is dropped. If new_page is not hashed,
and page is hashed, then a new hash index is built on new_page with the same
parameters as page (this often happens when a page is split). */
void
btr_search_move_or_delete_hash_entries(
/*===================================*/
page_t* new_page, /* in: records are copied to this page */
page_t* page); /* in: index page */
/************************************************************************
Drops a page hash index. */
void
btr_search_drop_page_hash_index(
/*============================*/
page_t* page); /* in: index page, s- or x-latched */
/************************************************************************
Drops a page hash index when a page is freed from a fseg to the file system.
Drops possible hash index if the page happens to be in the buffer pool. */
void
btr_search_drop_page_hash_when_freed(
/*=================================*/
ulint space, /* in: space id */
ulint page_no); /* in: page number */
/************************************************************************
Updates the page hash index when a single record is inserted on a page. */
void
btr_search_update_hash_node_on_insert(
/*==================================*/
btr_cur_t* cursor);/* in: cursor which was positioned to the
place to insert using btr_cur_search_...,
and the new record has been inserted next
to the cursor */
/************************************************************************
Updates the page hash index when a single record is inserted on a page. */
void
btr_search_update_hash_on_insert(
/*=============================*/
btr_cur_t* cursor);/* in: cursor which was positioned to the
place to insert using btr_cur_search_...,
and the new record has been inserted next
to the cursor */
/************************************************************************
Updates the page hash index when a single record is deleted from a page. */
void
btr_search_update_hash_on_delete(
/*=============================*/
btr_cur_t* cursor);/* in: cursor which was positioned on the
record to delete using btr_cur_search_...,
the record is not yet deleted */
/************************************************************************
Prints info of the search system. */
void
btr_search_print_info(void);
/*=======================*/
/************************************************************************
Prints info of searches on an index. */
void
btr_search_index_print_info(
/*========================*/
dict_index_t* index); /* in: index */
/************************************************************************
Prints info of searches on a table. */
void
btr_search_table_print_info(
/*========================*/
char* name); /* in: table name */
/************************************************************************
Validates the search system. */
ibool
btr_search_validate(void);
/*=====================*/
/* Search info directions */
#define BTR_SEA_NO_DIRECTION 1
#define BTR_SEA_LEFT 2
#define BTR_SEA_RIGHT 3
#define BTR_SEA_SAME_REC 4
/* The search info struct in an index */
struct btr_search_struct{
/* The following 4 fields are currently not used: */
rec_t* last_search; /* pointer to the lower limit record of the
previous search; NULL if not known */
ulint n_direction; /* number of consecutive searches in the
same direction */
ulint direction; /* BTR_SEA_NO_DIRECTION, BTR_SEA_LEFT,
BTR_SEA_RIGHT, BTR_SEA_SAME_REC,
or BTR_SEA_SAME_PAGE */
dulint modify_clock; /* value of modify clock at the time
last_search was stored */
/*----------------------*/
/* The following 4 fields are not protected by any latch: */
page_t* root_guess; /* the root page frame as of the last time it
was fetched, or NULL */
ulint hash_analysis; /* when this exceeds a certain value, the
hash analysis starts; this is reset if no
success noticed */
ibool last_hash_succ; /* TRUE if the last search would have
succeeded, or did succeed, using the hash
index; NOTE that the value here is not exact:
it is not calculated for every search, and the
calculation itself is not always accurate! */
ulint n_hash_potential;/* number of consecutive searches which would
have succeeded, or did succeed, using the hash
index */
/*----------------------*/
ulint n_fields; /* recommended prefix length for hash search:
number of full fields */
ulint n_bytes; /* recommended prefix: number of bytes in
an incomplete field */
ulint side; /* BTR_SEARCH_LEFT_SIDE or
BTR_SEARCH_RIGHT_SIDE, depending on whether
the leftmost record of several records with
the same prefix should be indexed in the
hash index */
/*----------------------*/
ulint n_hash_succ; /* number of successful hash searches thus
far */
ulint n_hash_fail; /* number of failed hash searches */
ulint n_patt_succ; /* number of successful pattern searches thus
far */
ulint n_searches; /* number of searches */
};
/* The hash index system */
typedef struct btr_search_sys_struct btr_search_sys_t;
struct btr_search_sys_struct{
hash_table_t* hash_index;
};
extern btr_search_sys_t* btr_search_sys;
/* The latch protecting the adaptive search system: this latch protects the
(1) positions of records on those pages where a hash index has been built.
NOTE: It does not protect values of non-ordering fields within a record from
being updated in-place! We can use fact (1) to perform unique searches on
indexes. */
extern rw_lock_t* btr_search_latch_temp;
#define btr_search_latch (*btr_search_latch_temp)
extern ulint btr_search_n_succ;
extern ulint btr_search_n_hash_fail;
/* After a change in n_fields or n_bytes in info, we wait this many rounds
before starting the hash analysis again: this saves CPU time when there
is no hope of building a hash index. */
#define BTR_SEARCH_HASH_ANALYSIS 17
#define BTR_SEARCH_LEFT_SIDE 1
#define BTR_SEARCH_RIGHT_SIDE 2
/* Limit of consecutive searches for trying a search shortcut on the search
pattern */
#define BTR_SEARCH_ON_PATTERN_LIMIT 3
/* Limit of consecutive searches for trying a search shortcut using the hash
index */
#define BTR_SEARCH_ON_HASH_LIMIT 3
#ifndef UNIV_NONINL
#include "btr0sea.ic"
#endif
#endif


@ -0,0 +1,65 @@
/************************************************************************
The index tree adaptive search
(c) 1996 Innobase Oy
Created 2/17/1996 Heikki Tuuri
*************************************************************************/
#include "dict0mem.h"
#include "btr0cur.h"
#include "buf0buf.h"
/*************************************************************************
Updates the search info. */
void
btr_search_info_update_slow(
/*========================*/
btr_search_t* info, /* in: search info */
btr_cur_t* cursor);/* in: cursor which was just positioned */
/************************************************************************
Returns search info for an index. */
UNIV_INLINE
btr_search_t*
btr_search_get_info(
/*================*/
/* out: search info; search mutex reserved */
dict_index_t* index) /* in: index */
{
ut_ad(index);
return(index->search_info);
}
/*************************************************************************
Updates the search info. */
UNIV_INLINE
void
btr_search_info_update(
/*===================*/
dict_index_t* index, /* in: index of the cursor */
btr_cur_t* cursor) /* in: cursor which was just positioned */
{
btr_search_t* info;
ut_ad(!rw_lock_own(&btr_search_latch, RW_LOCK_SHARED)
&& !rw_lock_own(&btr_search_latch, RW_LOCK_EX));
info = btr_search_get_info(index);
info->hash_analysis++;
if (info->hash_analysis < BTR_SEARCH_HASH_ANALYSIS) {
/* Do nothing */
return;
}
ut_ad(cursor->flag != BTR_CUR_HASH);
btr_search_info_update_slow(info, cursor);
}
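
The gating in btr_search_info_update() can be read as a cheap counter check in front of an expensive analysis step: the counter is incremented on every call, and the slow path is skipped until the counter reaches BTR_SEARCH_HASH_ANALYSIS. The sketch below isolates that logic. The `sketch_*` names are hypothetical, and the stand-in slow path only counts its invocations; the real btr_search_info_update_slow() is not shown in this diff and, per the header comment on hash_analysis, may reset the counter, which this sketch does not model.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the value of BTR_SEARCH_HASH_ANALYSIS from btr0sea.h. */
#define SKETCH_HASH_ANALYSIS	17

/* Hypothetical, reduced stand-in for btr_search_t. */
typedef struct sketch_search_info {
	unsigned long	hash_analysis;	/* calls seen so far */
	unsigned long	n_slow_updates;	/* times the slow path ran */
} sketch_search_info_t;

/* Stand-in for btr_search_info_update_slow(): only records that it ran. */
static void
sketch_info_update_slow(sketch_search_info_t* info)
{
	info->n_slow_updates++;
}

/* Same control flow as btr_search_info_update(): increment the counter and
return early while it is below the threshold. */
static void
sketch_info_update(sketch_search_info_t* info)
{
	assert(info != NULL);

	info->hash_analysis++;

	if (info->hash_analysis < SKETCH_HASH_ANALYSIS) {
		/* Do nothing */
		return;
	}

	sketch_info_update_slow(info);
}
```

With a zeroed struct, the first 16 calls take the early return and the 17th is the first to reach the slow path. Because this stand-in never resets the counter, every later call reaches it as well.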

Some files were not shown because too many files have changed in this diff