MDEV-5817 query cache bug (returning inconsistent/old result
set) with aria table parallel inserts, row format = page
The problem is that for transactional aria tables
(row_type=PAGE and transactional=1), maria_lock_database()
didn't flush the state or the query cache.
Not flushing the state is correct for transactional tables, as that is
done by checkpoint, but not flushing the query cache was wrong: cached
results of concurrent SELECT queries were not deleted, so stale data
could be returned.
Fixed by introducing a flush of the query cache as part of commit, if the table has changed.
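A minimal sketch of that commit-time invalidation, with made-up helper
names (TableState, invalidate_query_cache); the server's real
query-cache API is not shown:

    #include <stdio.h>

    /* Hypothetical stand-in for the server's query-cache invalidation. */
    static void invalidate_query_cache(const char *db, const char *table)
    {
      printf("invalidating cached results for %s.%s\n", db, table);
    }

    struct TableState { bool changed_since_last_commit; };

    /* Commit-time hook: drop cached SELECT results only if the table
       actually changed, matching the fix described above. */
    static void commit_table(TableState *state, const char *db,
                             const char *table)
    {
      if (state->changed_since_last_commit)
      {
        invalidate_query_cache(db, table);
        state->changed_since_last_commit= false;
      }
    }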
Changes:
- maria_create() now uses a bit in the parameter flags to check whether
the table should be encrypted, instead of using maria_encrypted_tables
(see the sketch after this list).
- Don't encrypt tables that are to be converted to S3
- Added encrypted flag to ARIA_TABLE_CAPABILITIES
- maria_chk --description now prints whether the table is encrypted. Other
operations are not allowed on encrypted tables.
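For illustration, the flag-bit approach could look like the sketch below;
HA_CREATE_ENCRYPTED and the bit values are made up, not the real
maria_create() constants:

    #include <stdint.h>

    /* Hypothetical create-flag bits for illustration only. */
    static const uint32_t HA_CREATE_TMP_TABLE= 1U << 0;
    static const uint32_t HA_CREATE_ENCRYPTED= 1U << 1;

    /* The create path tests a caller-supplied bit instead of the global
       maria_encrypted_tables, so callers such as the S3 conversion can
       opt out per table. */
    static bool table_should_be_encrypted(uint32_t create_flags)
    {
      return (create_flags & HA_CREATE_ENCRYPTED) != 0;
    }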
Limit increased from 1000 to 2000.
Avoid stack overflow by storing keys and pages on the stack in
recursive functions only when there is plenty of space left on it.
Other things:
- Use less stack space for b-tree operations as we now only allocate as
much space as needed instead of always allocating HA_MAX_KEY_LENGTH.
- Replaced most usage of my_safe_alloca() in Aria with the stack_alloc
interface (sketched below).
- Moved my_setstacksize() to mysys/my_pthread.c
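The underlying pattern, sketched with made-up names (the real stack_alloc
interface differs):

    #include <alloca.h>
    #include <stdlib.h>
    #include <stddef.h>

    /* Keep this much stack headroom free for deeper recursion. */
    static const size_t STACK_MARGIN= 32 * 1024;

    /* Sketch: take the buffer from the stack only when the caller still
       has room, otherwise fall back to the heap. Must be a macro, since
       alloca() memory is only valid inside the calling function. 'ptr'
       must be a void*; '*on_heap' tells the caller whether free() is
       needed later. */
    #define STACK_OR_HEAP_ALLOC(ptr, size, stack_left, on_heap)        \
      do {                                                             \
        if ((size_t) (size) + STACK_MARGIN <= (size_t) (stack_left))   \
        {                                                              \
          (ptr)= alloca(size);                                         \
          *(on_heap)= false;                                           \
        }                                                              \
        else                                                           \
        {                                                              \
          (ptr)= malloc(size);                                         \
          *(on_heap)= true;                                            \
        }                                                              \
      } while (0)

Each level of a recursive b-tree descent can then pass down a reduced
stack_left value, so deep trees fall back to the heap instead of
overflowing the stack.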
The MDEV-20265 commit e746f451d5
introduces DBUG_ASSERT(right_op == r_tbl) in
st_select_lex::add_cross_joined_table(), and that assertion would
fail in several tests that exercise joins. That commit was skipped
in this merge, and a separate fix of MDEV-20265 will be necessary in 10.4.
When using field_conv(), which is called in case of a field1=field2 copy in
fill_records(), the full varstring was copied, including uninitialized bytes.
This caused valgrind to complain about usage of uninitialized bytes when
using Aria static length records.
Fixed by not using memcpy over the whole varstring and instead copying just
the real bytes.
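A sketch of that idea, assuming a simplified varstring layout with a one-
or two-byte little-endian length prefix (names are illustrative):

    #include <string.h>
    #include <stddef.h>

    typedef unsigned char uchar;

    /* Copy only the bytes a varstring actually uses: the length prefix
       plus 'data_len' data bytes. The uninitialized tail of 'from' is
       never read, so tools like valgrind stay quiet. */
    static void copy_varstring(uchar *to, const uchar *from,
                               unsigned length_bytes)
    {
      size_t data_len= (length_bytes == 1)
        ? (size_t) from[0]
        : ((size_t) from[0] | ((size_t) from[1] << 8));
      memcpy(to, from, length_bytes + data_len);
    }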
- pcretest.c could use a macro with a side effect
- maria_chk could access freed memory
- Initialized some variables that could be accessed uninitialized
- Fixed compiler warning in my_atomic-t.c
MDEV-19486 and one more similar bug appeared because the handler::write_row()
interface allows the storage engine to modify the row buffer, but callers are
not prepared for that, so more bugs of this kind are possible in the future.
handler::write_row(), handler::ha_write_row(): make the buffer argument const
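Schematically, the tightened interface looks like the sketch below (heavily
simplified; the real handler class is much larger):

    typedef unsigned char uchar;

    /* Simplified sketch: a const buffer lets the compiler enforce that
       the engine does not modify the caller's row. */
    class handler
    {
    public:
      int ha_write_row(const uchar *buf) { return write_row(buf); }
      virtual ~handler() {}
    protected:
      virtual int write_row(const uchar *buf)= 0;
    };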
The bug was that when long item strings were converted to VARCHAR,
Type_handler::string_type_handler() didn't take the maximum VARCHAR
length into account. The resulting Aria temporary table was created with
a VARCHAR field of length 1 when it should have been 65537. This caused
MariaDB to send impossible records to ma_write(), and Aria eventually
reported the table as crashed.
Fixed by updating Type_handler::string_type_handler() to not create too long
VARCHAR fields. To make things extra safe, I also added checks when
writing dynamic Aria records to ensure we detect a wrong record during write
instead of during read.
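Schematically, the handler selection now respects the VARCHAR limit; the
names below are made up, not the real Type_handler code. (65537 truncated
to a 16-bit length gives 1, which would explain the length-1 field above.)

    #include <stdint.h>

    /* Max octet length a VARCHAR column can hold (64K - 1). */
    static const uint32_t MAX_VARCHAR_OCTETS= 65535;

    enum string_field_type { FIELD_VARCHAR, FIELD_BLOB };

    /* Strings longer than the VARCHAR limit must map to a BLOB type
       instead of wrapping around to a tiny VARCHAR. */
    static string_field_type string_type_for(uint32_t octet_length)
    {
      return octet_length > MAX_VARCHAR_OCTETS ? FIELD_BLOB
                                               : FIELD_VARCHAR;
    }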
There was a bug in the page cache: it didn't take into account that
another thread could be waiting for a page to be read by read_big_block().
Fixed by releasing all waiters in read_big_block().
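The shape of such a fix, sketched with pthreads and made-up names (the
real page cache keeps its own wait queues):

    #include <pthread.h>

    struct Block
    {
      pthread_mutex_t lock;
      pthread_cond_t  cond;
      bool            read_done;
    };

    /* Once the big-block read finishes, wake *every* waiter; a single
       signal could leave other readers of the same page blocked
       forever. */
    static void finish_big_block_read(Block *block)
    {
      pthread_mutex_lock(&block->lock);
      block->read_done= true;
      pthread_cond_broadcast(&block->cond);  /* not pthread_cond_signal() */
      pthread_mutex_unlock(&block->lock);
    }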
MDEV-19585 Assertion with S3 table and flush_tables
The limit has to be increased so that MariaDB can create system tables.
It should not have any notable impact on performance.
There should not be any notable performance differences between 1K and 4K,
especially for temporary tables. In most cases using bigger blocks is also
faster (with the possible exception of key reads of non-fixed-length
keys).
The error occurred because the aria_copy_to_s3() function tried to copy the
.frm file of a partition, but a partition does not have its own .frm file.
The same is true for aria_rename_s3().
To fix this, a new parameter was added to those two functions to specify
whether the .frm file must be copied or not. The parameter is set to 'false'
for partitions.
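Illustratively, the two functions gain a flag along these lines (parameter
names are made up and the argument lists are abbreviated):

    /* Hypothetical, abbreviated signatures: the added last argument tells
       the copy/rename code whether a .frm file exists to handle; the
       partitioning code passes false. Real argument lists differ. */
    int aria_copy_to_s3(const char *path, const char *database,
                        const char *table_name, bool copy_frm);
    int aria_rename_s3(const char *from, const char *to, bool rename_frm);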
Also there was another issue with EXCHANGE PARTITION. Briefly, there is the
following sequence of operations (see exchange_name_with_ddl_log() for
details):
1) rename swap table to temporary table,
2) rename partition to swap table,
3) rename temporary table to partition.
On step (1) the .frm file is renamed too. On step (2) the swap table does not
have a .frm file, as the partition does not have one. On step (3) the
partition will have a .frm file, because it is renamed from the temporary
table. All of this causes errors at different stages of table access. To fix
it, the .frm file is not touched at all for S3 during the EXCHANGE PARTITION
operation. This is implemented in ha_s3::rename_table() by additionally
checking current_thd->lex->alter_info.partition_flags (see also
ALTER_PARTITION_EXCHANGE in sql_yacc.yy).
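Schematically, the added check reduces to something like this (the bit
value is made up; in the server ALTER_PARTITION_EXCHANGE comes from
sql_yacc.yy):

    typedef unsigned long partition_flags_t;

    /* Made-up bit value for illustration. */
    static const partition_flags_t ALTER_PARTITION_EXCHANGE= 1UL << 4;

    /* ha_s3::rename_table(), schematically: leave the .frm file alone
       when the statement is an EXCHANGE PARTITION. */
    static bool should_rename_frm(partition_flags_t partition_flags)
    {
      return (partition_flags & ALTER_PARTITION_EXCHANGE) == 0;
    }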
The problem was that the code in maria_extra() assumed that there could be
only one table open when doing maria_extra(MA_FORCE_REOPEN).
However, in the case of triggers, there can be multiple copies of
the table open.
Fixed by removing the assert.
cmake -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_BUILD_TYPE=Debug
Maintainer mode makes all warnings errors. This patch fixes the warnings,
mostly about the deprecated `register` keyword.
Too many warnings came from Mroonga, so I gave up on it.
- When recovery failed, errors would not be printed on
new lines.
- Print more information if file lengths are changed
- Added logging of table name for entries INCOMPLETE_LOG and
REDO_REPAIR_TABLE
MDEV-18461 Aria crash recovery failures
This does not fix the bug reported in the MDEV, but
now we get an error message describing the problem instead of
an assertion failure.