mirror of
https://github.com/MariaDB/server.git
synced 2025-01-17 20:42:30 +01:00
1a96259191
- WL#3240 "log DROP TABLE in Maria"
- similarly, log RENAME TABLE, REPAIR/OPTIMIZE TABLE, and DELETE no_WHERE_clause
  (== the DELETE which just truncates the files)
- create_rename_lsn added to MARIA_SHARE's state
- all these operations (except DROP TABLE) also update the table's
  create_rename_lsn, which is needed for the correctness of Recovery (see
  function comment of _ma_repair_write_log_record() in ma_check.c)
- write a COMMIT record when transaction commits.
- don't log REDOs/UNDOs if this is an internal temporary table like inside
  ALTER TABLE (I expect this to be a big win). There was already no logging
  for user-created "CREATE TEMPORARY" tables.
- don't fsync files/directories if the table is not transactional
- in translog_write_record(), autogenerate a 2-byte-id for the table and log
  the "id->name" pair (LOGREC_FILE_ID); log LOGREC_LONG_TRANSACTION_ID;
  automatically store the table's 2-byte-id in any log record.
- preparations for Checkpoint: translog_get_horizon(); pausing Checkpoint
  when some dirty pages are unknown; capturing trn->rec_lsn,
  trn->first_undo_lsn for Checkpoint and log's low-water-mark computing.
- assertions, comments.

storage/maria/Makefile.am:
  more files to build
storage/maria/ha_maria.cc:
  - logging a REPAIR log record if REPAIR/OPTIMIZE was successful.
  - ha_maria::data_file_type does not have to be set in every info() call,
    just do it once in open().
  - if the caller said that transactionality can be disabled (like if the
    caller is ALTER TABLE), i.e. thd->transaction.on==FALSE, then we
    temporarily disable transactionality of the table in external_lock();
    that will ensure that no REDOs/UNDOs are logged for this possibly massive
    write operation (they are not needed, as if any write fails, the table
    will be dropped). We re-enable in external_lock(F_UNLCK), which in
    ALTER TABLE happens before the tmp table replaces the original one
    (which is good, as the final table will thus have a REDO RENAME and a
    correct create_rename_lsn).
  - when we commit we also have to write a log record, so
    trnman_commit_trn() calls become ma_commit() calls
  - at the end of the engine's initialization, we are potentially entering a
    dangerous multi-threaded world (clients are going to be accepted), and so
    some assertions about mutex ownership become enforceable; for that we set
    maria_multi_threaded=TRUE (see ma_control_file.c)
storage/maria/ha_maria.h:
  new member ha_maria::save_transactional (see also ha_maria.cc)
storage/maria/ma_blockrec.c:
  - fixing comments according to discussion with Monty
  - if a table is transactional but temporarily non-transactional (like in
    ALTER TABLE), we need to give a sensible LSN to the pages (and, if we
    give 0, pagecache asserts).
  - translog_write_record() now takes care of storing the share's 2-byte-id
    in the log record
storage/maria/ma_blockrec.h:
  fixing comment according to discussion with Monty
storage/maria/ma_check.c:
  When REPAIR/OPTIMIZE modify the data/index file, if this is a transactional
  table, they must sync it; if they remove files or rename files, they must
  sync the directory, so that everything is durable. This is just applying to
  REPAIR/OPTIMIZE the logic already implemented in CREATE/DROP/RENAME a few
  months ago. Adding a function to write a LOGREC_REPAIR_TABLE at the end of
  REPAIR/OPTIMIZE (called only by ha_maria, not by maria_chk), and to update
  the table's create_rename_lsn.
storage/maria/ma_close.c:
  fix for a future bug
storage/maria/ma_control_file.c:
  ensuring that if Maria is running in multi-threaded mode, anybody wanting
  to write to the control file and update last_checkpoint_lsn/last_logno
  owns the log's lock.
storage/maria/ma_control_file.h:
  see ma_control_file.c
storage/maria/ma_create.c:
  when creating a table:
  - sync it and its directory only if this is a transactional table and
    there is a log (no point in syncing in maria_chk)
  - decouple the two uses of linkname/linkname_ptr (for index file and for
    data file) into more variables, as we need to know all links until the
    moment we write the LOGREC_CREATE_TABLE.
  - set share.data_file_type early so that _ma_initialize_data_file() knows
    it (Monty's bugfix so that a table always has at least a bitmap page when
    it is created; so the data file is not 0 bytes anymore).
  - log a LOGREC_CREATE_TABLE; it contains the bytes which we have just
    written to the index file's header. Update the table's create_rename_lsn.
  - syncing of kfile had been bugified in a previous merge, correcting
  - syncing of dfile is now needed as it's not empty anymore
  - in _ma_initialize_data_file(), use the share's block_size and not the
    global one. This is a gratuitous change, both variables are equal, just
    that I find it more future-proof to use the share-bound variable rather
    than the global one.
storage/maria/ma_delete_all.c:
  log a LOGREC_DELETE_ALL record when doing ma_delete_all_rows(); update
  create_rename_lsn then.
storage/maria/ma_delete_table.c:
  - logging LOGREC_DROP_TABLE; knowing if this is needed requires knowing if
    the table is transactional, which requires opening the table.
  - we need to sync directories only if the table is transactional
storage/maria/ma_extra.c:
  questions
storage/maria/ma_init.c:
  when maria_end() is called, engine is not multithreaded
storage/maria/ma_loghandler.c:
  - translog_inited has to be visible to ma_create() (see how it is used in
    ma_create())
  - checkpoint record will be a single record, not three
  - no REDO for TRUNCATE (TRUNCATE calls ma_create() internally so will log
    a REDO_CREATE)
  - adding REDO for DELETE no_WHERE_clause (fast DELETE of all rows by
    truncating the files), REPAIR.
  - MY_WAIT_IF_FULL to wait&retry if a log write hits a full disk
  - in translog_write_record(), if MARIA_SHARE does not yet have a 2-byte-id,
    generate one for it and log LOGREC_FILE_ID; automatically store this
    short id into log records.
  - in translog_write_record(), if transaction has not logged its long trid,
    log LOGREC_LONG_TRANSACTION_ID.
  - For Checkpoint, we need to know the current end-of-log: adding
    translog_get_horizon().
  - For Control File, adding an assertion that the thread owns the log's
    lock (control file is protected by this lock)
storage/maria/ma_loghandler.h:
  Changes in log records (see ma_loghandler.c). New prototypes, new functions.
storage/maria/ma_loghandler_lsn.h:
  adding a type LSN_WITH_FLAGS especially for TRN::first_undo_lsn, where the
  most significant byte is used for flags.
storage/maria/ma_open.c:
  storing the create_rename_lsn in the index file's header (in the state,
  precisely) and retrieving it from there.
storage/maria/ma_pagecache.c:
  - my set_if_bigger was wrong, correcting it
  - if the first_in_switch list is not empty, it means that changed_blocks
    misses some dirty pages, so Checkpoint cannot run and needs to wait. A
    variable missing_blocks_in_changed_list is added to tell that (should it
    be named missing_blocks_in_changed_blocks?)
  - pagecache_collect_changed_blocks_with_lsn() now also tells the minimum
    rec_lsn (needed for low-water mark computation).
storage/maria/ma_pagecache.h:
  see ma_pagecache.c
storage/maria/ma_panic.c:
  comment
storage/maria/ma_range.c:
  comment
storage/maria/ma_rename.c:
  - logging LOGREC_RENAME_TABLE; knowing if this is needed requires knowing
    if the table is transactional, which requires opening the table.
  - update create_rename_lsn
  - we need to sync directories only if the table is transactional
storage/maria/ma_static.c:
  comment
storage/maria/ma_test_all.sh:
  - tip for Valgrind-ing ma_test_all
  - do "export maria_path=somepath" before calling ma_test_all, if you want
    to run ma_test_all out of storage/maria (useful to have parallel runs,
    like one normal and one Valgrind; they must not use the same tables so
    need to run in different directories)
storage/maria/maria_def.h:
  - state now contains, in memory and on disk, the create_rename_lsn
  - share now contains a 2-byte-id
storage/maria/trnman.c:
  preparations for Checkpoint: capture trn->rec_lsn, trn->first_undo_lsn;
  minimum first_undo_lsn needed to know log's low-water-mark
storage/maria/trnman.h:
  using most significant byte of first_undo_lsn to hold miscellaneous flags,
  for now TRANSACTION_LOGGED_LONG_ID. dummy_transaction_object is already
  declared in ma_static.c.
storage/maria/trnman_public.h:
  dummy_transaction_object was declared in all files including
  trnman_public.h, while in fact it's a single object. New prototype.
storage/maria/unittest/ma_test_loghandler-t.c:
  update for new prototype
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
  update for new prototype
storage/maria/unittest/ma_test_loghandler_multithread-t.c:
  update for new prototype
storage/maria/unittest/ma_test_loghandler_pagecache-t.c:
  update for new prototype
storage/maria/ma_commit.c:
  function which wraps:
  - writing a LOGREC_COMMIT record (== commit on disk)
  - calling trnman_commit_trn() (= commit in memory)
storage/maria/ma_commit.h:
  new header file
.tree-is-private:
  this file is now needed to keep our tree private (don't push it to public
  trees). When 5.1 is merged into mysql-maria, we can abandon our
  maria-specific post-commit trigger; .tree_is_private will take care of
  keeping commit mails private. Don't push this file to public trees.
1159 lines
38 KiB
C
/* Copyright (C) 2006 MySQL AB & MySQL Finland AB & TCX DataKonsult AB

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; version 2 of the License.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software
   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA */

/* Create a MARIA table */

#include "ma_ftdefs.h"
#include "ma_sp_defs.h"
#include <my_bit.h>
#include "ma_blockrec.h"
#include "trnman_public.h"

#if defined(MSDOS) || defined(__WIN__)
#ifdef __WIN__
#include <fcntl.h>
#else
#include <process.h>			/* Prototype for getpid */
#endif
#endif
#include <m_ctype.h>

static int compare_columns(MARIA_COLUMNDEF **a, MARIA_COLUMNDEF **b);

/*
  Old options is used when recreating database, from maria_chk
*/

int maria_create(const char *name, enum data_file_type datafile_type,
                 uint keys,MARIA_KEYDEF *keydefs,
                 uint columns, MARIA_COLUMNDEF *columndef,
                 uint uniques, MARIA_UNIQUEDEF *uniquedefs,
                 MARIA_CREATE_INFO *ci,uint flags)
{
  register uint i,j;
  File dfile,file;
  int errpos,save_errno, create_mode= O_RDWR | O_TRUNC, res;
  myf create_flag;
  uint length,max_key_length,packed,pack_bytes,pointer,real_length_diff,
       key_length,info_length,key_segs,options,min_key_length_skip,
       base_pos,long_varchar_count,varchar_length,
       unique_key_parts,fulltext_keys,offset, not_block_record_extra_length;
  uint max_field_lengths, extra_header_size;
  ulong reclength, real_reclength,min_pack_length;
  char filename[FN_REFLEN], dlinkname[FN_REFLEN], *dlinkname_ptr= NULL,
       klinkname[FN_REFLEN], *klinkname_ptr= NULL;
  ulong pack_reclength;
  ulonglong tot_length,max_rows, tmp;
  enum en_fieldtype type;
  enum data_file_type org_datafile_type= datafile_type;
  MARIA_SHARE share;
  MARIA_KEYDEF *keydef,tmp_keydef;
  MARIA_UNIQUEDEF *uniquedef;
  HA_KEYSEG *keyseg,tmp_keyseg;
  MARIA_COLUMNDEF *column, *end_column;
  ulong *rec_per_key_part;
  my_off_t key_root[HA_MAX_POSSIBLE_KEY], kfile_size_before_extension;
  MARIA_CREATE_INFO tmp_create_info;
  my_bool tmp_table= FALSE; /* cache for presence of HA_OPTION_TMP_TABLE */
  my_bool forced_packed;
  myf sync_dir= 0;
  uchar *log_data= NULL;
  DBUG_ENTER("maria_create");
  DBUG_PRINT("enter", ("keys: %u  columns: %u  uniques: %u  flags: %u",
                       keys, columns, uniques, flags));

  DBUG_ASSERT(maria_block_size && maria_block_size % IO_SIZE == 0);
  LINT_INIT(dfile);
  LINT_INIT(file);

  if (!ci)
  {
    bzero((char*) &tmp_create_info,sizeof(tmp_create_info));
    ci=&tmp_create_info;
  }

  if (keys + uniques > MARIA_MAX_KEY || columns == 0)
  {
    DBUG_RETURN(my_errno=HA_WRONG_CREATE_OPTION);
  }
  errpos=0;
  options=0;
  bzero((byte*) &share,sizeof(share));

  if (flags & HA_DONT_TOUCH_DATA)
  {
    org_datafile_type= ci->org_data_file_type;
    if (!(ci->old_options & HA_OPTION_TEMP_COMPRESS_RECORD))
      options=ci->old_options &
        (HA_OPTION_COMPRESS_RECORD | HA_OPTION_PACK_RECORD |
         HA_OPTION_READ_ONLY_DATA | HA_OPTION_CHECKSUM |
         HA_OPTION_TMP_TABLE | HA_OPTION_DELAY_KEY_WRITE);
    else
    {
      /* Uncompressing rows */
      options=ci->old_options &
        (HA_OPTION_CHECKSUM | HA_OPTION_TMP_TABLE | HA_OPTION_DELAY_KEY_WRITE);
    }
  }

  if (ci->reloc_rows > ci->max_rows)
    ci->reloc_rows=ci->max_rows;		/* Check if wrong parameter */

  if (!(rec_per_key_part=
        (ulong*) my_malloc((keys + uniques)*HA_MAX_KEY_SEG*sizeof(long),
                           MYF(MY_WME | MY_ZEROFILL))))
    DBUG_RETURN(my_errno);

  /* Start by checking fields and field-types used */

  varchar_length=long_varchar_count=packed= not_block_record_extra_length=
    pack_reclength= max_field_lengths= 0;
  reclength= min_pack_length= ci->null_bytes;
  forced_packed= 0;

  for (column= columndef, end_column= column + columns ;
       column != end_column ;
       column++)
  {
    /* Fill in not used struct parts */
    column->offset= reclength;
    column->empty_pos= 0;
    column->empty_bit= 0;
    column->fill_length= column->length;

    reclength+= column->length;
    type= column->type;
    if (type == FIELD_SKIP_PRESPACE && datafile_type == BLOCK_RECORD)
      type= FIELD_NORMAL;                  /* SKIP_PRESPACE not supported */

    if (type != FIELD_NORMAL && type != FIELD_CHECK)
    {
      column->empty_pos= packed/8;
      column->empty_bit= (1 << (packed & 7));
      if (type == FIELD_BLOB)
      {
        forced_packed= 1;
        packed++;
        share.base.blobs++;
        if (pack_reclength != INT_MAX32)
        {
          if (column->length == 4+portable_sizeof_char_ptr)
            pack_reclength= INT_MAX32;
          else
          {
            /* Add max possible blob length */
            pack_reclength+= (1 << ((column->length-
                                     portable_sizeof_char_ptr)*8));
          }
        }
        max_field_lengths+= (column->length - portable_sizeof_char_ptr);
      }
      else if (type == FIELD_SKIP_PRESPACE ||
               type == FIELD_SKIP_ENDSPACE)
      {
        forced_packed= 1;
        max_field_lengths+= column->length > 255 ? 2 : 1;
        not_block_record_extra_length++;
        packed++;
      }
      else if (type == FIELD_VARCHAR)
      {
        varchar_length+= column->length-1; /* Used for min_pack_length */
        pack_reclength++;
        not_block_record_extra_length++;
        max_field_lengths++;
        packed++;
        column->fill_length= 1;
        /* We must test for 257 as length includes pack-length */
        if (test(column->length >= 257))
        {
          long_varchar_count++;
          max_field_lengths++;
          column->fill_length= 2;
        }
      }
      else if (type == FIELD_SKIP_ZERO)
        packed++;
      else
      {
        if (!column->null_bit)
          min_pack_length+= column->length;
        else
          not_block_record_extra_length+= column->length;
        column->empty_pos= 0;
        column->empty_bit= 0;
      }
    }
    else					/* FIELD_NORMAL */
    {
      if (!column->null_bit)
      {
        min_pack_length+= column->length;
        share.base.fixed_not_null_fields++;
        share.base.fixed_not_null_fields_length+= column->length;
      }
      else
        not_block_record_extra_length+= column->length;
    }
  }

  if (datafile_type == STATIC_RECORD && forced_packed)
  {
    /* Can't use fixed length records, revert to block records */
    datafile_type= BLOCK_RECORD;
  }

  if (datafile_type == DYNAMIC_RECORD)
    options|= HA_OPTION_PACK_RECORD;	/* Must use packed records */

  if (datafile_type == STATIC_RECORD)
  {
    /* We can't use checksum with static length rows */
    flags&= ~HA_CREATE_CHECKSUM;
    options&= ~HA_OPTION_CHECKSUM;
    min_pack_length+= varchar_length;
    packed= 0;
  }
  if (datafile_type != BLOCK_RECORD)
    min_pack_length+= not_block_record_extra_length;

  if ((packed & 7) == 1)
  {
    /*
      Not optimal packing, try to remove a 1 byte length zero-field as
      this will get same record length, but smaller pack overhead
    */
    while (column != columndef)
    {
      column--;
      if (column->type == (int) FIELD_SKIP_ZERO && column->length == 1)
      {
        column->type=(int) FIELD_NORMAL;
        column->empty_pos= 0;
        column->empty_bit= 0;
        packed--;
        min_pack_length++;
        break;
      }
    }
  }

  if (flags & HA_CREATE_TMP_TABLE)
  {
    options|= HA_OPTION_TMP_TABLE;
    tmp_table= TRUE;
    create_mode|= O_EXCL | O_NOFOLLOW;
    /* "CREATE TEMPORARY" tables are not crash-safe (dropped at restart) */
    ci->transactional= FALSE;
  }
  share.base.null_bytes= ci->null_bytes;
  share.base.original_null_bytes= ci->null_bytes;
  share.base.transactional= ci->transactional;
  share.base.max_field_lengths= max_field_lengths;
  share.base.field_offsets= 0;			/* for future */

  if (pack_reclength != INT_MAX32)
    pack_reclength+= max_field_lengths + long_varchar_count;

  if (flags & HA_CREATE_CHECKSUM || (options & HA_OPTION_CHECKSUM))
  {
    options|= HA_OPTION_CHECKSUM;
    min_pack_length++;
    pack_reclength++;
  }
  if (flags & HA_CREATE_DELAY_KEY_WRITE)
    options|= HA_OPTION_DELAY_KEY_WRITE;
  if (flags & HA_CREATE_RELIES_ON_SQL_LAYER)
    options|= HA_OPTION_RELIES_ON_SQL_LAYER;

  pack_bytes= (packed + 7) / 8;
  if (pack_reclength != INT_MAX32)
    pack_reclength+= reclength+pack_bytes +
      test(test_all_bits(options, HA_OPTION_CHECKSUM | HA_PACK_RECORD));
  min_pack_length+= pack_bytes;
  /* Calculate min possible row length for rows-in-block */
  extra_header_size= MAX_FIXED_HEADER_SIZE;
  if (ci->transactional)
    extra_header_size= TRANS_MAX_FIXED_HEADER_SIZE;
  share.base.min_row_length= (extra_header_size + share.base.null_bytes +
                              pack_bytes);
  if (!ci->data_file_length && ci->max_rows)
  {
    if (pack_reclength == INT_MAX32 ||
        (~(ulonglong) 0)/ci->max_rows < (ulonglong) pack_reclength)
      ci->data_file_length= ~(ulonglong) 0;
    else
      ci->data_file_length=(ulonglong) ci->max_rows*pack_reclength;
  }
  else if (!ci->max_rows)
  {
    if (datafile_type == BLOCK_RECORD)
    {
      uint rows_per_page= ((maria_block_size - PAGE_OVERHEAD_SIZE) /
                           (min_pack_length + extra_header_size +
                            DIR_ENTRY_SIZE));
      ulonglong data_file_length= ci->data_file_length;
      if (!data_file_length)
        data_file_length= ((((ulonglong) 1 << ((BLOCK_RECORD_POINTER_SIZE-1) *
                                               8)) -1));
      if (rows_per_page > 0)
      {
        set_if_smaller(rows_per_page, MAX_ROWS_PER_PAGE);
        ci->max_rows= data_file_length / maria_block_size * rows_per_page;
      }
      else
        ci->max_rows= data_file_length / (min_pack_length +
                                          extra_header_size +
                                          DIR_ENTRY_SIZE);
    }
    else
      ci->max_rows=(ha_rows) (ci->data_file_length/(min_pack_length +
                                                    ((options &
                                                      HA_OPTION_PACK_RECORD) ?
                                                     3 : 0)));
  }
  max_rows= (ulonglong) ci->max_rows;
  if (datafile_type == BLOCK_RECORD)
  {
    /* The + 1 is for the record position within the page */
    pointer= maria_get_pointer_length((ci->data_file_length /
                                       maria_block_size), 3) + 1;
    set_if_smaller(pointer, BLOCK_RECORD_POINTER_SIZE);

    if (!max_rows)
      max_rows= (((((ulonglong) 1 << ((pointer-1)*8)) -1) * maria_block_size) /
                 min_pack_length);
  }
  else
  {
    if (datafile_type != STATIC_RECORD)
      pointer= maria_get_pointer_length(ci->data_file_length,
                                        maria_data_pointer_size);
    else
      pointer= maria_get_pointer_length(ci->max_rows, maria_data_pointer_size);
    if (!max_rows)
      max_rows= ((((ulonglong) 1 << (pointer*8)) -1) / min_pack_length);
  }

  real_reclength=reclength;
  if (datafile_type == STATIC_RECORD)
  {
    if (reclength <= pointer)
      reclength=pointer+1;		/* reserve place for delete link */
  }
  else
    reclength+= long_varchar_count;	/* We need space for varchar! */

  max_key_length=0; tot_length=0 ; key_segs=0;
  fulltext_keys=0;
  share.state.rec_per_key_part=rec_per_key_part;
  share.state.key_root=key_root;
  share.state.key_del= HA_OFFSET_ERROR;
  if (uniques)
    max_key_length= MARIA_UNIQUE_HASH_LENGTH + pointer;

  for (i=0, keydef=keydefs ; i < keys ; i++ , keydef++)
  {
    share.state.key_root[i]= HA_OFFSET_ERROR;
    min_key_length_skip=length=real_length_diff=0;
    key_length=pointer;
    if (keydef->flag & HA_SPATIAL)
    {
#ifdef HAVE_SPATIAL
      /* BAR TODO to support 3D and more dimensions in the future */
      uint sp_segs=SPDIMS*2;
      keydef->flag=HA_SPATIAL;

      if (flags & HA_DONT_TOUCH_DATA)
      {
        /*
          Called by maria_chk - i.e. table structure was taken from
          MYI file and SPATIAL key *does have* additional sp_segs keysegs.
          keydef->seg here points right at the GEOMETRY segment,
          so we only need to decrease keydef->keysegs.
          (see maria_recreate_table() in _ma_check.c)
        */
        keydef->keysegs-=sp_segs-1;
      }

      for (j=0, keyseg=keydef->seg ; (int) j < keydef->keysegs ;
           j++, keyseg++)
      {
        if (keyseg->type != HA_KEYTYPE_BINARY &&
            keyseg->type != HA_KEYTYPE_VARBINARY1 &&
            keyseg->type != HA_KEYTYPE_VARBINARY2)
        {
          my_errno=HA_WRONG_CREATE_OPTION;
          goto err_no_lock;
        }
      }
      keydef->keysegs+=sp_segs;
      key_length+=SPLEN*sp_segs;
      length++;                              /* At least one length byte */
      min_key_length_skip+=SPLEN*2*SPDIMS;
#else
      my_errno= HA_ERR_UNSUPPORTED;
      goto err_no_lock;
#endif /*HAVE_SPATIAL*/
    }
    else if (keydef->flag & HA_FULLTEXT)
    {
      keydef->flag=HA_FULLTEXT | HA_PACK_KEY | HA_VAR_LENGTH_KEY;
      options|=HA_OPTION_PACK_KEYS;             /* Using packed keys */

      for (j=0, keyseg=keydef->seg ; (int) j < keydef->keysegs ;
           j++, keyseg++)
      {
        if (keyseg->type != HA_KEYTYPE_TEXT &&
            keyseg->type != HA_KEYTYPE_VARTEXT1 &&
            keyseg->type != HA_KEYTYPE_VARTEXT2)
        {
          my_errno=HA_WRONG_CREATE_OPTION;
          goto err_no_lock;
        }
        if (!(keyseg->flag & HA_BLOB_PART) &&
            (keyseg->type == HA_KEYTYPE_VARTEXT1 ||
             keyseg->type == HA_KEYTYPE_VARTEXT2))
        {
          /* Make a flag that this is a VARCHAR */
          keyseg->flag|= HA_VAR_LENGTH_PART;
          /* Store in bit_start number of bytes used to pack the length */
          keyseg->bit_start= ((keyseg->type == HA_KEYTYPE_VARTEXT1)?
                              1 : 2);
        }
      }

      fulltext_keys++;
      key_length+= HA_FT_MAXBYTELEN+HA_FT_WLEN;
      length++;                              /* At least one length byte */
      min_key_length_skip+=HA_FT_MAXBYTELEN;
      real_length_diff=HA_FT_MAXBYTELEN-FT_MAX_WORD_LEN_FOR_SORT;
    }
    else
    {
      /* Test if prefix compression */
      if (keydef->flag & HA_PACK_KEY)
      {
        /* Can't use space_compression on number keys */
        if ((keydef->seg[0].flag & HA_SPACE_PACK) &&
            keydef->seg[0].type == (int) HA_KEYTYPE_NUM)
          keydef->seg[0].flag&= ~HA_SPACE_PACK;

        /* Only use HA_PACK_KEY when first segment is a variable length key */
        if (!(keydef->seg[0].flag & (HA_SPACE_PACK | HA_BLOB_PART |
                                     HA_VAR_LENGTH_PART)))
        {
          /* pack relative to previous key */
          keydef->flag&= ~HA_PACK_KEY;
          keydef->flag|= HA_BINARY_PACK_KEY | HA_VAR_LENGTH_KEY;
        }
        else
        {
          keydef->seg[0].flag|=HA_PACK_KEY;	/* for easier internal tests */
          keydef->flag|=HA_VAR_LENGTH_KEY;
          options|=HA_OPTION_PACK_KEYS;		/* Using packed keys */
        }
      }
      if (keydef->flag & HA_BINARY_PACK_KEY)
        options|=HA_OPTION_PACK_KEYS;		/* Using packed keys */

      if (keydef->flag & HA_AUTO_KEY && ci->with_auto_increment)
        share.base.auto_key=i+1;
      for (j=0, keyseg=keydef->seg ; j < keydef->keysegs ; j++, keyseg++)
      {
        /* numbers are stored with high byte first to make compression easier */
        switch (keyseg->type) {
        case HA_KEYTYPE_SHORT_INT:
        case HA_KEYTYPE_LONG_INT:
        case HA_KEYTYPE_FLOAT:
        case HA_KEYTYPE_DOUBLE:
        case HA_KEYTYPE_USHORT_INT:
        case HA_KEYTYPE_ULONG_INT:
        case HA_KEYTYPE_LONGLONG:
        case HA_KEYTYPE_ULONGLONG:
        case HA_KEYTYPE_INT24:
        case HA_KEYTYPE_UINT24:
        case HA_KEYTYPE_INT8:
          keyseg->flag|= HA_SWAP_KEY;
          break;
        case HA_KEYTYPE_VARTEXT1:
        case HA_KEYTYPE_VARTEXT2:
        case HA_KEYTYPE_VARBINARY1:
        case HA_KEYTYPE_VARBINARY2:
          if (!(keyseg->flag & HA_BLOB_PART))
          {
            /* Make a flag that this is a VARCHAR */
            keyseg->flag|= HA_VAR_LENGTH_PART;
            /* Store in bit_start number of bytes used to pack the length */
            keyseg->bit_start= ((keyseg->type == HA_KEYTYPE_VARTEXT1 ||
                                 keyseg->type == HA_KEYTYPE_VARBINARY1) ?
                                1 : 2);
          }
          break;
        default:
          break;
        }
        if (keyseg->flag & HA_SPACE_PACK)
        {
          DBUG_ASSERT(!(keyseg->flag & HA_VAR_LENGTH_PART));
          keydef->flag |= HA_SPACE_PACK_USED | HA_VAR_LENGTH_KEY;
          options|=HA_OPTION_PACK_KEYS;		/* Using packed keys */
          length++;                            /* At least one length byte */
          min_key_length_skip+=keyseg->length;
          if (keyseg->length >= 255)
          {					/* prefix may be 3 bytes */
            min_key_length_skip+=2;
            length+=2;
          }
        }
        if (keyseg->flag & (HA_VAR_LENGTH_PART | HA_BLOB_PART))
        {
          DBUG_ASSERT(!test_all_bits(keyseg->flag,
                                     (HA_VAR_LENGTH_PART | HA_BLOB_PART)));
          keydef->flag|=HA_VAR_LENGTH_KEY;
          length++;                            /* At least one length byte */
          options|=HA_OPTION_PACK_KEYS;		/* Using packed keys */
          min_key_length_skip+=keyseg->length;
          if (keyseg->length >= 255)
          {					/* prefix may be 3 bytes */
            min_key_length_skip+=2;
            length+=2;
          }
        }
        key_length+= keyseg->length;
        if (keyseg->null_bit)
        {
          key_length++;
          options|=HA_OPTION_PACK_KEYS;
          keyseg->flag|=HA_NULL_PART;
          keydef->flag|=HA_VAR_LENGTH_KEY | HA_NULL_PART_KEY;
        }
      }
    } /* if HA_FULLTEXT */
    key_segs+=keydef->keysegs;
    if (keydef->keysegs > HA_MAX_KEY_SEG)
    {
      my_errno=HA_WRONG_CREATE_OPTION;
      goto err_no_lock;
    }
    /*
      key_segs may be 0 in the case when we only want to be able to
      add one row into the table. This can happen with some DISTINCT queries
      in MySQL
    */
    if ((keydef->flag & (HA_NOSAME | HA_NULL_PART_KEY)) == HA_NOSAME &&
        key_segs)
      share.state.rec_per_key_part[key_segs-1]=1L;
    length+=key_length;
    /*
      A key can't be longer than half an index block (as we have
      to be able to put at least 2 keys on an index block for the key
      algorithms to work).
    */
    if (length > maria_max_key_length())
    {
      my_errno=HA_WRONG_CREATE_OPTION;
      goto err_no_lock;
    }
    keydef->block_length= maria_block_size;
    keydef->keylength= (uint16) key_length;
    keydef->minlength= (uint16) (length-min_key_length_skip);
    keydef->maxlength= (uint16) length;

    if (length > max_key_length)
      max_key_length= length;
    tot_length+= ((max_rows/(ulong) (((uint) maria_block_size-5)/
                                     (length*2))) *
                  maria_block_size);
  }

unique_key_parts=0;
|
|
offset=reclength-uniques*MARIA_UNIQUE_HASH_LENGTH;
|
|
for (i=0, uniquedef=uniquedefs ; i < uniques ; i++ , uniquedef++)
|
|
{
|
|
uniquedef->key=keys+i;
|
|
unique_key_parts+=uniquedef->keysegs;
|
|
share.state.key_root[keys+i]= HA_OFFSET_ERROR;
|
|
tot_length+= (max_rows/(ulong) (((uint) maria_block_size-5)/
|
|
((MARIA_UNIQUE_HASH_LENGTH + pointer)*2)))*
|
|
(ulong) maria_block_size;
|
|
}
|
|
keys+=uniques; /* Each unique has 1 key */
|
|
key_segs+=uniques; /* Each unique has 1 key seg */
|
|
|
|
base_pos=(MARIA_STATE_INFO_SIZE + keys * MARIA_STATE_KEY_SIZE +
|
|
key_segs * MARIA_STATE_KEYSEG_SIZE);
|
|
info_length= base_pos+(uint) (MARIA_BASE_INFO_SIZE+
|
|
keys * MARIA_KEYDEF_SIZE+
|
|
uniques * MARIA_UNIQUEDEF_SIZE +
|
|
(key_segs + unique_key_parts)*HA_KEYSEG_SIZE+
|
|
columns*MARIA_COLUMNDEF_SIZE);
|
|
|
|
DBUG_PRINT("info", ("info_length: %u", info_length));
|
|
/* There are only 16 bits for the total header length. */
|
|
if (info_length > 65535)
|
|
{
|
|
my_printf_error(0, "Maria table '%s' has too many columns and/or "
|
|
"indexes and/or unique constraints.",
|
|
MYF(0), name + dirname_length(name));
|
|
my_errno= HA_WRONG_CREATE_OPTION;
|
|
goto err_no_lock;
|
|
}
|
|
|
|
bmove(share.state.header.file_version,(byte*) maria_file_magic,4);
|
|
ci->old_options=options| (ci->old_options & HA_OPTION_TEMP_COMPRESS_RECORD ?
|
|
HA_OPTION_COMPRESS_RECORD |
|
|
HA_OPTION_TEMP_COMPRESS_RECORD: 0);
|
|
mi_int2store(share.state.header.options,ci->old_options);
|
|
mi_int2store(share.state.header.header_length,info_length);
|
|
mi_int2store(share.state.header.state_info_length,MARIA_STATE_INFO_SIZE);
|
|
mi_int2store(share.state.header.base_info_length,MARIA_BASE_INFO_SIZE);
|
|
mi_int2store(share.state.header.base_pos,base_pos);
|
|
share.state.header.data_file_type= datafile_type;
|
|
share.state.header.org_data_file_type= org_datafile_type;
|
|
share.state.header.language= (ci->language ?
|
|
ci->language : default_charset_info->number);
|
|
|
|
share.state.dellink = HA_OFFSET_ERROR;
|
|
share.state.first_bitmap_with_space= 0;
|
|
share.state.create_rename_lsn= 0;
|
|
share.state.process= (ulong) getpid();
|
|
share.state.unique= (ulong) 0;
|
|
share.state.update_count=(ulong) 0;
|
|
share.state.version= (ulong) time((time_t*) 0);
|
|
share.state.sortkey= (ushort) ~0;
|
|
share.state.auto_increment=ci->auto_increment;
|
|
share.options=options;
|
|
share.base.rec_reflength=pointer;
|
|
share.base.block_size= maria_block_size;
|
|
|
|
/* Get estimate for index file length (this may be wrong for FT keys) */
|
|
  tmp= (tot_length + maria_block_size * keys *
        MARIA_INDEX_BLOCK_MARGIN) / maria_block_size;
  /*
    use maximum of key_file_length we calculated and key_file_length value we
    got from MYI file header (see also mariapack.c:save_state)
  */
  share.base.key_reflength=
    maria_get_pointer_length(max(ci->key_file_length,tmp),3);
  share.base.keys= share.state.header.keys= keys;
  share.state.header.uniques= uniques;
  share.state.header.fulltext_keys= fulltext_keys;
  mi_int2store(share.state.header.key_parts,key_segs);
  mi_int2store(share.state.header.unique_key_parts,unique_key_parts);

  maria_set_all_keys_active(share.state.key_map, keys);

  share.base.keystart= share.state.state.key_file_length=
    MY_ALIGN(info_length, maria_block_size);
  share.base.max_key_block_length= maria_block_size;
  share.base.max_key_length= ALIGN_SIZE(max_key_length+4);
  share.base.records= ci->max_rows;
  share.base.reloc= ci->reloc_rows;
  share.base.reclength= real_reclength;
  share.base.pack_reclength= reclength+ test(options & HA_OPTION_CHECKSUM);
  share.base.max_pack_length= pack_reclength;
  share.base.min_pack_length= min_pack_length;
  share.base.pack_bytes= pack_bytes;
  share.base.fields= columns;
  share.base.pack_fields= packed;
#ifdef USE_RAID
  share.base.raid_type= ci->raid_type;
  share.base.raid_chunks= ci->raid_chunks;
  share.base.raid_chunksize= ci->raid_chunksize;
#endif

  /* max_data_file_length and max_key_file_length are recalculated on open */
  if (tmp_table)
    share.base.max_data_file_length= (my_off_t) ci->data_file_length;
  else if (ci->transactional && translog_inited)
  {
    /*
      We have checked translog_inited above, because maria_chk may call us
      (via maria_recreate_table()) and it does not have a log.
    */
    sync_dir= MY_SYNC_DIR;
  }

  if (datafile_type == BLOCK_RECORD)
    share.base.min_block_length= share.base.min_row_length;
  else
  {
    share.base.min_block_length=
      (share.base.pack_reclength+3 < MARIA_EXTEND_BLOCK_LENGTH &&
       ! share.base.blobs) ?
      max(share.base.pack_reclength,MARIA_MIN_BLOCK_LENGTH) :
      MARIA_EXTEND_BLOCK_LENGTH;
  }
  if (! (flags & HA_DONT_TOUCH_DATA))
    share.state.create_time= (long) time((time_t*) 0);

  pthread_mutex_lock(&THR_LOCK_maria);

  if (ci->index_file_name)
  {
    char *iext= strrchr(ci->index_file_name, '.');
    int have_iext= iext && !strcmp(iext, MARIA_NAME_IEXT);
    if (tmp_table)
    {
      char *path;
      /* chop off the table name, temporary tables use a generated name */
      if ((path= strrchr(ci->index_file_name, FN_LIBCHAR)))
        *path= '\0';
      fn_format(filename, name, ci->index_file_name, MARIA_NAME_IEXT,
                MY_REPLACE_DIR | MY_UNPACK_FILENAME | MY_APPEND_EXT);
    }
    else
    {
      fn_format(filename, ci->index_file_name, "", MARIA_NAME_IEXT,
                MY_UNPACK_FILENAME | (have_iext ? MY_REPLACE_EXT :
                                      MY_APPEND_EXT));
    }
    fn_format(klinkname, name, "", MARIA_NAME_IEXT,
              MY_UNPACK_FILENAME | MY_APPEND_EXT);
    klinkname_ptr= klinkname;
    /*
      Don't create the table if the link or file exists, to ensure that one
      doesn't accidentally destroy another table.
      Don't sync dir now if the data file has the same path.
    */
    create_flag=
      (ci->data_file_name &&
       !strcmp(ci->index_file_name, ci->data_file_name)) ? 0 : sync_dir;
  }
  else
  {
    fn_format(filename, name, "", MARIA_NAME_IEXT,
              MY_UNPACK_FILENAME |
              ((flags & HA_DONT_TOUCH_DATA) ? MY_RETURN_REAL_PATH : 0) |
              MY_APPEND_EXT);
    /*
      Replace the current file.
      Don't sync dir now if the data file has the same path.
    */
    create_flag= MY_DELETE_OLD | (!ci->data_file_name ? 0 : sync_dir);
  }

  /*
    If a MRG_MARIA table is in use, the mapped MARIA tables are open,
    but no entry is made in the table cache for them.
    A TRUNCATE command checks for the table in the cache only and could
    be fooled into believing the table is not open.
    Pull the emergency brake in this situation. (Bug #8306)
  */
  if (_ma_test_if_reopen(filename))
  {
    my_printf_error(0, "MARIA table '%s' is in use "
                    "(most likely by a MERGE table). Try FLUSH TABLES.",
                    MYF(0), name + dirname_length(name));
    goto err;
  }

  if ((file= my_create_with_symlink(klinkname_ptr, filename, 0, create_mode,
                                    MYF(MY_WME | create_flag))) < 0)
    goto err;
  errpos=1;

  if (!(flags & HA_DONT_TOUCH_DATA))
  {
    if (ci->data_file_name)
    {
      char *dext= strrchr(ci->data_file_name, '.');
      int have_dext= dext && !strcmp(dext, MARIA_NAME_DEXT);

      if (tmp_table)
      {
        char *path;
        /* chop off the table name, temporary tables use a generated name */
        if ((path= strrchr(ci->data_file_name, FN_LIBCHAR)))
          *path= '\0';
        fn_format(filename, name, ci->data_file_name, MARIA_NAME_DEXT,
                  MY_REPLACE_DIR | MY_UNPACK_FILENAME | MY_APPEND_EXT);
      }
      else
      {
        fn_format(filename, ci->data_file_name, "", MARIA_NAME_DEXT,
                  MY_UNPACK_FILENAME |
                  (have_dext ? MY_REPLACE_EXT : MY_APPEND_EXT));
      }
      fn_format(dlinkname, name, "", MARIA_NAME_DEXT,
                MY_UNPACK_FILENAME | MY_APPEND_EXT);
      dlinkname_ptr= dlinkname;
      create_flag= 0;
    }
    else
    {
      fn_format(filename, name, "", MARIA_NAME_DEXT,
                MY_UNPACK_FILENAME | MY_APPEND_EXT);
      create_flag= MY_DELETE_OLD;
    }
    if ((dfile=
         my_create_with_symlink(dlinkname_ptr, filename, 0, create_mode,
                                MYF(MY_WME | create_flag | sync_dir))) < 0)
      goto err;
    errpos=3;

    share.data_file_type= datafile_type;
    if (_ma_initialize_data_file(dfile, &share))
      goto err;
  }
  DBUG_PRINT("info", ("write state info and base info"));
  if (_ma_state_info_write(file, &share.state, 2) ||
      _ma_base_info_write(file, &share.base))
    goto err;
  DBUG_PRINT("info", ("base_pos: %d  base_info_size: %d",
                      base_pos, MARIA_BASE_INFO_SIZE));
  DBUG_ASSERT(my_tell(file,MYF(0)) == base_pos+ MARIA_BASE_INFO_SIZE);

  /* Write key and keyseg definitions */
  DBUG_PRINT("info", ("write key and keyseg definitions"));
  for (i=0 ; i < share.base.keys - uniques; i++)
  {
    uint sp_segs=(keydefs[i].flag & HA_SPATIAL) ? 2*SPDIMS : 0;

    if (_ma_keydef_write(file, &keydefs[i]))
      goto err;
    for (j=0 ; j < keydefs[i].keysegs-sp_segs ; j++)
      if (_ma_keyseg_write(file, &keydefs[i].seg[j]))
        goto err;
#ifdef HAVE_SPATIAL
    for (j=0 ; j < sp_segs ; j++)
    {
      HA_KEYSEG sseg;
      sseg.type= SPTYPE;
      sseg.language= 7;                         /* Binary */
      sseg.null_bit= 0;
      sseg.bit_start= 0;
      sseg.bit_end= 0;
      sseg.bit_length= 0;
      sseg.bit_pos= 0;
      sseg.length= SPLEN;
      sseg.null_pos= 0;
      sseg.start= j*SPLEN;
      sseg.flag= HA_SWAP_KEY;
      if (_ma_keyseg_write(file, &sseg))
        goto err;
    }
#endif
  }
  /* Create extra keys for unique definitions */
  offset= reclength-uniques*MARIA_UNIQUE_HASH_LENGTH;
  bzero((char*) &tmp_keydef,sizeof(tmp_keydef));
  bzero((char*) &tmp_keyseg,sizeof(tmp_keyseg));
  for (i=0; i < uniques ; i++)
  {
    tmp_keydef.keysegs=1;
    tmp_keydef.flag= HA_UNIQUE_CHECK;
    tmp_keydef.block_length= (uint16) maria_block_size;
    tmp_keydef.keylength= MARIA_UNIQUE_HASH_LENGTH + pointer;
    tmp_keydef.minlength=tmp_keydef.maxlength=tmp_keydef.keylength;
    tmp_keyseg.type= MARIA_UNIQUE_HASH_TYPE;
    tmp_keyseg.length= MARIA_UNIQUE_HASH_LENGTH;
    tmp_keyseg.start= offset;
    offset+= MARIA_UNIQUE_HASH_LENGTH;
    if (_ma_keydef_write(file,&tmp_keydef) ||
        _ma_keyseg_write(file,(&tmp_keyseg)))
      goto err;
  }

  /* Save unique definition */
  DBUG_PRINT("info", ("write unique definitions"));
  for (i=0 ; i < share.state.header.uniques ; i++)
  {
    HA_KEYSEG *keyseg_end;
    keyseg= uniquedefs[i].seg;
    if (_ma_uniquedef_write(file, &uniquedefs[i]))
      goto err;
    for (keyseg= uniquedefs[i].seg, keyseg_end= keyseg+ uniquedefs[i].keysegs;
         keyseg < keyseg_end;
         keyseg++)
    {
      switch (keyseg->type) {
      case HA_KEYTYPE_VARTEXT1:
      case HA_KEYTYPE_VARTEXT2:
      case HA_KEYTYPE_VARBINARY1:
      case HA_KEYTYPE_VARBINARY2:
        if (!(keyseg->flag & HA_BLOB_PART))
        {
          keyseg->flag|= HA_VAR_LENGTH_PART;
          keyseg->bit_start= ((keyseg->type == HA_KEYTYPE_VARTEXT1 ||
                               keyseg->type == HA_KEYTYPE_VARBINARY1) ?
                              1 : 2);
        }
        break;
      default:
        DBUG_ASSERT((keyseg->flag & HA_VAR_LENGTH_PART) == 0);
        break;
      }
      if (_ma_keyseg_write(file, keyseg))
        goto err;
    }
  }
  DBUG_PRINT("info", ("write field definitions"));
  if (datafile_type == BLOCK_RECORD)
  {
    /* Store columns in a more efficient order */
    MARIA_COLUMNDEF **col_order, **pos;
    if (!(col_order= (MARIA_COLUMNDEF**) my_malloc(share.base.fields *
                                                   sizeof(MARIA_COLUMNDEF*),
                                                   MYF(MY_WME))))
      goto err;
    for (column= columndef, pos= col_order ;
         column != end_column ;
         column++, pos++)
      *pos= column;
    qsort(col_order, share.base.fields, sizeof(*col_order),
          (qsort_cmp) compare_columns);
    for (i=0 ; i < share.base.fields ; i++)
    {
      if (_ma_columndef_write(file, col_order[i]))
      {
        my_free((gptr) col_order, MYF(0));
        goto err;
      }
    }
    my_free((gptr) col_order, MYF(0));
  }
  else
  {
    for (i=0 ; i < share.base.fields ; i++)
      if (_ma_columndef_write(file, &columndef[i]))
        goto err;
  }

  if ((kfile_size_before_extension= my_tell(file,MYF(0))) == MY_FILEPOS_ERROR)
    goto err;
#ifndef DBUG_OFF
  if (kfile_size_before_extension != info_length)
    DBUG_PRINT("warning",("info_length: %u != used_length: %u",
                          info_length, (uint)kfile_size_before_extension));
#endif

  if (sync_dir)
  {
    /*
      We log the first bytes and then the size to which we extend; this is
      to not log 1 KB of mostly zeroes if this is a small table.
    */
    char empty_string[]= "";
    LEX_STRING log_array[TRANSLOG_INTERNAL_PARTS + 3];
    uint total_rec_length= 0;
    uint i;
    log_array[TRANSLOG_INTERNAL_PARTS + 0].length= 1 + 2 +
      kfile_size_before_extension;
    /* we may need up to 64 KB here, so don't use the stack */
    log_data= my_malloc(log_array[TRANSLOG_INTERNAL_PARTS + 0].length, MYF(0));
    if ((log_data == NULL) ||
        my_pread(file, 1 + 2 + log_data, kfile_size_before_extension,
                 0, MYF(MY_NABP)))
      goto err_no_lock;
    /*
      Remember if the data file was created or not, to know if Recovery can
      do it or not, in the future.
    */
    log_data[0]= test(flags & HA_DONT_TOUCH_DATA);
    int2store(log_data + 1, kfile_size_before_extension);
    log_array[TRANSLOG_INTERNAL_PARTS + 0].str= log_data;
    /* symlink description is also needed for re-creation by Recovery: */
    log_array[TRANSLOG_INTERNAL_PARTS + 1].str=
      dlinkname_ptr ? dlinkname : empty_string;
    log_array[TRANSLOG_INTERNAL_PARTS + 1].length=
      strlen(log_array[TRANSLOG_INTERNAL_PARTS + 1].str);
    log_array[TRANSLOG_INTERNAL_PARTS + 2].str=
      klinkname_ptr ? klinkname : empty_string;
    log_array[TRANSLOG_INTERNAL_PARTS + 2].length=
      strlen(log_array[TRANSLOG_INTERNAL_PARTS + 2].str);
    for (i= TRANSLOG_INTERNAL_PARTS;
         i < (sizeof(log_array)/sizeof(log_array[0])); i++)
      total_rec_length+= log_array[i].length;
    /*
      For this record to be of any use for Recovery, we need the upper
      MySQL layer to be crash-safe, which it is not now (that would require
      work using the ddl_log of sql/sql_table.cc); when it is, we should
      reconsider the moment of writing this log record (before or after op,
      under THR_LOCK_maria or not...), how to use it in Recovery, and force
      the log. For now this record is just informative.
      Note that in case of TRUNCATE TABLE we also come here.
      When in CREATE/TRUNCATE (or DROP or RENAME or REPAIR) we have not called
      external_lock(), so have no TRN. It does not matter, as all these
      operations are non-transactional and sync their files.
    */
    if (unlikely(translog_write_record(&share.state.create_rename_lsn,
                                       LOGREC_REDO_CREATE_TABLE,
                                       &dummy_transaction_object, NULL,
                                       total_rec_length,
                                       sizeof(log_array)/sizeof(log_array[0]),
                                       log_array, NULL)))
      goto err_no_lock;
    /*
      Store the LSN into the file; this is needed for Recovery to not be
      confused if a DROP+CREATE happened (applying REDOs to the wrong table).
      If such a direct my_pwrite() to a fixed offset is too "hackish", we can
      call _ma_state_info_write() again, but it will be less efficient.
    */
    lsn_store(log_data, share.state.create_rename_lsn);
    if (my_pwrite(file, log_data, LSN_STORE_SIZE,
                  sizeof(share.state.header) + 2, MYF(MY_NABP)))
      goto err_no_lock;
    my_free(log_data, MYF(0));
  }

  /* Enlarge files */
  DBUG_PRINT("info", ("enlarge to keystart: %lu",
                      (ulong) share.base.keystart));
  if (my_chsize(file,(ulong) share.base.keystart,0,MYF(0)))
    goto err;

  if (sync_dir && my_sync(file, MYF(0)))
    goto err;

  if (! (flags & HA_DONT_TOUCH_DATA))
  {
#ifdef USE_RELOC
    if (my_chsize(dfile,share.base.min_pack_length*ci->reloc_rows,0,MYF(0)))
      goto err;
#endif
    errpos=2;
    if ((sync_dir && my_sync(dfile, MYF(0))) || my_close(dfile,MYF(0)))
      goto err;
  }
  pthread_mutex_unlock(&THR_LOCK_maria);
  res= 0;
  my_free((char*) rec_per_key_part,MYF(0));
  errpos=0;
  if (my_close(file,MYF(0)))
    res= my_errno;
  DBUG_RETURN(res);

err:
  pthread_mutex_unlock(&THR_LOCK_maria);

err_no_lock:
  save_errno=my_errno;
  switch (errpos) {
  case 3:
    VOID(my_close(dfile,MYF(0)));
    /* fall through */
  case 2:
    if (! (flags & HA_DONT_TOUCH_DATA))
      my_delete_with_symlink(fn_format(filename,name,"",MARIA_NAME_DEXT,
                                       MY_UNPACK_FILENAME | MY_APPEND_EXT),
                             sync_dir);
    /* fall through */
  case 1:
    VOID(my_close(file,MYF(0)));
    if (! (flags & HA_DONT_TOUCH_DATA))
      my_delete_with_symlink(fn_format(filename,name,"",MARIA_NAME_IEXT,
                                       MY_UNPACK_FILENAME | MY_APPEND_EXT),
                             sync_dir);
  }
  my_free(log_data, MYF(MY_ALLOW_ZERO_PTR));
  my_free((char*) rec_per_key_part, MYF(0));
  DBUG_RETURN(my_errno=save_errno);             /* return the fatal errno */
}


uint maria_get_pointer_length(ulonglong file_length, uint def)
{
  DBUG_ASSERT(def >= 2 && def <= 7);
  if (file_length)                              /* If not default */
  {
#ifdef NOT_YET_READY_FOR_8_BYTE_POINTERS
    if (file_length >= (ULL(1) << 56))
      def=8;
    else
#endif
    if (file_length >= (ULL(1) << 48))
      def=7;
    else if (file_length >= (ULL(1) << 40))
      def=6;
    else if (file_length >= (ULL(1) << 32))
      def=5;
    else if (file_length >= (ULL(1) << 24))
      def=4;
    else if (file_length >= (ULL(1) << 16))
      def=3;
    else
      def=2;
  }
  return def;
}


/*
  Sort columns for records-in-block

  IMPLEMENTATION
    Sort columns in the following order:

    Fixed size, not null columns
    Fixed length, null fields
    Variable length fields (CHAR, VARCHAR)
    Blobs

    For fields of the same kind, keep them in their original order
*/

static inline int sign(longlong a)
{
  return a < 0 ? -1 : (a > 0 ? 1 : 0);
}


static int compare_columns(MARIA_COLUMNDEF **a_ptr, MARIA_COLUMNDEF **b_ptr)
{
  MARIA_COLUMNDEF *a= *a_ptr, *b= *b_ptr;
  enum en_fieldtype a_type, b_type;

  a_type= ((a->type == FIELD_NORMAL || a->type == FIELD_CHECK) ?
           FIELD_NORMAL : a->type);
  b_type= ((b->type == FIELD_NORMAL || b->type == FIELD_CHECK) ?
           FIELD_NORMAL : b->type);

  if (a_type == FIELD_NORMAL && !a->null_bit)
  {
    if (b_type != FIELD_NORMAL || b->null_bit)
      return -1;
    return sign((long) (a->offset - b->offset));
  }
  if (b_type == FIELD_NORMAL && !b->null_bit)
    return 1;
  if (a_type == b_type)
    return sign((long) (a->offset - b->offset));
  if (a_type == FIELD_NORMAL)
    return -1;
  if (b_type == FIELD_NORMAL)
    return 1;
  if (a_type == FIELD_BLOB)
    return 1;
  if (b_type == FIELD_BLOB)
    return -1;
  return sign((long) (a->offset - b->offset));
}


/* Initialize data file */

int _ma_initialize_data_file(File dfile, MARIA_SHARE *share)
{
  if (share->data_file_type == BLOCK_RECORD)
  {
    if (my_chsize(dfile, share->base.block_size, 0, MYF(MY_WME)))
      return 1;
    share->state.state.data_file_length= share->base.block_size;
    _ma_bitmap_delete_all(share);
  }
  return 0;
}