decrease for INSERTs
Bulk inserts (multiple row, CREATE ... SELECT, INSERT ... SELECT) into
MyISAM tables were performed inefficiently. This was mainly affecting
use cases where read_buffer_size was considerably large (>256K) and low
number of rows was inserted (e.g. 30-100).
The problem was that during I/O cache initialization (this happens
before each bulk insert) allocated I/O buffer was unnecessarily
initialized to '\0'.
This was happening because of mess in flag values. MyISAM informs I/O
cache to wait for free space (if out of disk space) by passing
MY_WAIT_IF_FULL flag. Since MY_WAIT_IF_FULL and MY_ZEROFILL have the
same values, memory allocator was initializing memory to '\0'.
The performance gain provided with this patch may only be visible with
non-debug binaries, since safemalloc always initializes allocated memory
to 0xA5A5...
- Reserver namespace and place in frm for TABLE_CHECKSUM and PAGE_CHECKSUM create options
- Added syncing of directory when creating .frm files
- Portability fixes
- Added missing cast that could cause bugs
- Code cleanups
- Made some bit functions inline
- Moved things out of myisam.h to my_handler.h to make them more accessable
- Renamed some myisam variables and defines to make them more globaly usable (as they are used outside of MyISAM)
- Fixed bugs in error conditions
- Use compiler time asserts instead of run time
- Fixed indentation
HA_EXTRA_PREPARE_FOR_DELETE -> HA_EXTRA_PREPARE_FOR_DROP as the old name was wrong
(Added a define for old value to ensure we don't break any old code)
Added HA_EXTRA_PREPARE_FOR_RENAME as a signal for rename (before we used a DROP signal which is wrong)
- Initialize error messages early to get better errors when mysqld or an engine fails to start
- Fix windows bug that query_performance_frequency was not initialized if registry code failed
- thread_stack -> my_thread_stack_size
The following type conversions was done:
- Changed byte to uchar
- Changed gptr to uchar*
- Change my_string to char *
- Change my_size_t to size_t
- Change size_s to size_t
Removed declaration of byte, gptr, my_string, my_size_t and size_s.
Following function parameter changes was done:
- All string functions in mysys/strings was changed to use size_t
instead of uint for string lengths.
- All read()/write() functions changed to use size_t (including vio).
- All protocoll functions changed to use size_t instead of uint
- Functions that used a pointer to a string length was changed to use size_t*
- Changed malloc(), free() and related functions from using gptr to use void *
as this requires fewer casts in the code and is more in line with how the
standard functions work.
- Added extra length argument to dirname_part() to return the length of the
created string.
- Changed (at least) following functions to take uchar* as argument:
- db_dump()
- my_net_write()
- net_write_command()
- net_store_data()
- DBUG_DUMP()
- decimal2bin() & bin2decimal()
- Changed my_compress() and my_uncompress() to use size_t. Changed one
argument to my_uncompress() from a pointer to a value as we only return
one value (makes function easier to use).
- Changed type of 'pack_data' argument to packfrm() to avoid casts.
- Changed in readfrm() and writefrom(), ha_discover and handler::discover()
the type for argument 'frmdata' to uchar** to avoid casts.
- Changed most Field functions to use uchar* instead of char* (reduced a lot of
casts).
- Changed field->val_xxx(xxx, new_ptr) to take const pointers.
Other changes:
- Removed a lot of not needed casts
- Added a few new cast required by other changes
- Added some cast to my_multi_malloc() arguments for safety (as string lengths
needs to be uint, not size_t).
- Fixed all calls to hash-get-key functions to use size_t*. (Needed to be done
explicitely as this conflict was often hided by casting the function to
hash_get_key).
- Changed some buffers to memory regions to uchar* to avoid casts.
- Changed some string lengths from uint to size_t.
- Changed field->ptr to be uchar* instead of char*. This allowed us to
get rid of a lot of casts.
- Some changes from true -> TRUE, false -> FALSE, unsigned char -> uchar
- Include zlib.h in some files as we needed declaration of crc32()
- Changed MY_FILE_ERROR to be (size_t) -1.
- Changed many variables to hold the result of my_read() / my_write() to be
size_t. This was needed to properly detect errors (which are
returned as (size_t) -1).
- Removed some very old VMS code
- Changed packfrm()/unpackfrm() to not be depending on uint size
(portability fix)
- Removed windows specific code to restore cursor position as this
causes slowdown on windows and we should not mix read() and pread()
calls anyway as this is not thread safe. Updated function comment to
reflect this. Changed function that depended on original behavior of
my_pwrite() to itself restore the cursor position (one such case).
- Added some missing checking of return value of malloc().
- Changed definition of MOD_PAD_CHAR_TO_FULL_LENGTH to avoid 'long' overflow.
- Changed type of table_def::m_size from my_size_t to ulong to reflect that
m_size is the number of elements in the array, not a string/memory
length.
- Moved THD::max_row_length() to table.cc (as it's not depending on THD).
Inlined max_row_length_blob() into this function.
- More function comments
- Fixed some compiler warnings when compiled without partitions.
- Removed setting of LEX_STRING() arguments in declaration (portability fix).
- Some trivial indentation/variable name changes.
- Some trivial code simplifications:
- Replaced some calls to alloc_root + memcpy to use
strmake_root()/strdup_root().
- Changed some calls from memdup() to strmake() (Safety fix)
- Simpler loops in client-simple.c
Showstopper and regression against 5.0.24.
Previously, we ignored seek() errors (see Bug#22828) and let seek()s
against pipes fail. Now, since we check that a seek didn't fail,
and return without reading, this bug popped up.
This restores the behavior for file-ish objects that could never be
seek()ed.
- When cache memory can't be allocated size is recaclulated using 3/4 of
the requested memory. This number is rounded up to the nearest
min_cache step.
However with the previous implementation the new cache size might
become bigger than requested because of this rounding and thus we get
an infinit loop.
- This patch fixes this problem by ensuring that the new cache size
always will be smaller on the second and subsequent iterations until
we reach min_cache.
(Mostly in DBUG_PRINT() and unused arguments)
Fixed bug in query cache when used with traceing (--with-debug)
Fixed memory leak in mysqldump
Removed warnings from mysqltest scripts (replaced -- with #)
- The io cache flag seek_not_done was not set properly in the
reinit_io_cache function call and this led my_seek to be called
desipite an invalid file handle.
- Added a test in reinit_io_cache to ensure we have a valid file
handle before setting seek_not_done flag.
- Because my_seek actually is capable of returning an error code we should
exploit that in the best possible way.
- There might be kernel errors or other errors we can't predict and capturing
the return value of all system calls gives us better understanding of
possible errors.
- The io cache flag seek_not_done was not set properly in the reinit_
io_chache function call and this led my_seek to be called despite an
invalid file handle.
- Added a test in reinit_io_cache to ensure we have a valid file handle
before setting seek_not_done flag.
OPTIMIZE TABLE with myisam_repair_threads > 1 performs a non-quick
parallel repair. This means that it does not only rebuild all
indexes, but also the data file.
Non-quick parallel repair works so that there is one thread per
index. The first of the threads rebuilds also the new data file.
The problem was that all threads shared the read io cache on the
old data file. If there were holes (deleted records) in the table,
the first thread skipped them, writing only contiguous, non-deleted
records to the new data file. Then it built the new index so that
its entries pointed to the correct record positions. But the other
threads didn't know the new record positions, but put the positions
from the old data file into the index.
The new design is so that there is a shared io cache which is filled
by the first thread (the data file writer) with the new contiguous
records and read by the other threads. Now they know the new record
positions.
Another problem was that for the parallel repair of compressed
tables a common bit_buff and rec_buff was used. I changed it so
that thread specific buffers are used for parallel repair.
A similar problem existed for checksum calculation. I made this
multi-thread safe too.
A wrong cast led to numeric overflow for data files
greater than 4GB. The parallel repair assumed end of
file after reading the amount of data that the file
was bigger than 4GB. It truncated the data file and
noted the number of records it found so far in the
index file header as the number of rows in the table.
Removing the cast fixed the problem.
I added some cosmetic changes too.
The normal repair worked because it uses a different
function to read from the data file.
Added two status variables:
binlog_cache_use - counts number of transactions that used somehow
transaction temporary binary log.
binlog_cache_disk_use - counts number of transactions that required
disk I/O for storing info in this this binary log.