Remove the dead code in Spider related to Spider's HandlerSocket
support. The code has been disabled for a long time and it is
unlikely that it will ever be enabled.
- rm all files under storage/spider/hs_client/ except hs_compat.h
- rm storage/spider/spd_db_handlersocket.*
- unifdef -UHS_HAS_SQLCOM -UHAVE_HANDLERSOCKET \
-m storage/spider/spd_* storage/spider/ha_spider.* storage/spider/hs_client/*
- remove relevant files from storage/spider/CMakeLists.txt
tablespace truncation
- InnoDB fails with an out-of-bounds write error after temporary
tablespace truncation. This issue was caused by
commit c507678b20 (MDEV-28699):
InnoDB fails to clear freed ranges when the shrinking size
is the last offset of a freed range.
Columns added to TABLE_STATISTICS
- ROWS_INSERTED, ROWS_DELETED, ROWS_UPDATED, KEY_READ_HITS and
KEY_READ_MISSES.
Columns added to CLIENT_STATISTICS and USER_STATISTICS:
- KEY_READ_HITS and KEY_READ_MISSES.
User visible changes (except new columns):
- CLIENT_STATISTICS and USER_STATISTICS have the columns KEY_READ_HITS and
KEY_READ_MISSES added after the column ROWS_UPDATED and before
SELECT_COMMANDS.
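For illustration, a minimal sketch of querying the new columns
(this assumes statistics collection is enabled, e.g. via the userstat
setting):

  SET GLOBAL userstat=1;
  SELECT TABLE_SCHEMA, TABLE_NAME, ROWS_INSERTED, ROWS_DELETED,
         ROWS_UPDATED, KEY_READ_HITS, KEY_READ_MISSES
  FROM information_schema.TABLE_STATISTICS;
  SELECT USER, KEY_READ_HITS, KEY_READ_MISSES
  FROM information_schema.USER_STATISTICS;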
Other changes:
- Do not collect table statistics for system tables like index_stats,
table_stats, performance_schema, information_schema etc., as the user
has no control over these and they only generate noise in the statistics.
- All row variables that are part of user_stats are moved to
'struct rows_stats' to make it easy to clear all of them at once.
- ha_read_key_misses added to STATUS_VAR
Notes:
- userstat.result has a changed number of rows for handler_read_key.
This is because use-stat-tables is now disabled for the test.
EE_ numbers occupy the same range as OS Exxx errors, and my_errno is
generally used for OS errors and for HA_ERR_ handler errors, which
occupy a different range and can therefore be freely mixed with OS errors.
Changes:
- Fixed MyISAM and Aria parallel repair to work with the tmp file limit.
This required adding current_thd to all parallel workers and adding
protection in my_malloc_size_cb_func() and temp_file_size_cb_func() to
be able to handle shared THDs. I removed the old code in MyISAM that
set current_thd() as it only worked with virtual indexed
columns and I wanted to keep the Aria and MyISAM code identical.
Other things:
- Improved error messages from Aria parallel repair and
create_internal_tmp_table_from_heap().
Two new variables added:
- max_tmp_space_usage: Limits the temporary space allowance per user.
- max_total_tmp_space_usage: Limits the temporary space allowance for
all users.
New status variables: tmp_space_used & max_tmp_space_used
New field in information_schema.PROCESSLIST: TMP_SPACE_USED
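A sketch of the user-visible interface, using the variable and status
names introduced above (the SESSION/GLOBAL scopes shown here are
assumptions):

  SET SESSION max_tmp_space_usage = 64*1024*1024;         -- per-user allowance
  SET GLOBAL max_total_tmp_space_usage = 1024*1024*1024;  -- server-wide allowance
  SHOW STATUS LIKE 'tmp_space_used';
  SHOW GLOBAL STATUS LIKE 'max_tmp_space_used';
  SELECT ID, TMP_SPACE_USED FROM information_schema.PROCESSLIST;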
The temporary space is counted for:
- All SQL level temporary files. This includes files for filesort,
transaction temporary space, analyze, binlog_stmt_cache etc.
It does not include engine internal temporary files used for repair,
alter table, index pre sorting etc.
- All internal on disk temporary tables created as part of resolving a
SELECT, multi-source update etc.
Special cases:
- When doing a commit, the last flush of the binlog_stmt_cache
will not cause an error even if the temporary space limit is exceeded.
This is to avoid giving errors on commit. This means that a user
can temporarily go over the limit with up to binlog_stmt_cache_size.
Noteworthy issue:
- One has to be careful when using small values for max_tmp_space_usage
together with binary logging and non-transactional tables.
If the binary log entry for the query is bigger than
binlog_stmt_cache_size and one hits the limit of max_tmp_space_usage
when flushing the entry to disk, the query will abort and the
binary log will not contain the last changes to the table.
This will also stop the slave!
This is also true for all Aria tables as Aria cannot do rollback
(except in case of crashes)!
One way to avoid this is to use @@binlog_format=statement for
queries that update a lot of rows.
Implementation:
- All writes to temporary files or internal temporary tables, that
increases the file size, are routed through temp_file_size_cb_func()
which updates and checks the temp space usage.
- Most of the temporary file monitoring is done inside IO_CACHE; for
Aria tables, the monitoring is done inside the Aria engine.
- MY_TRACK and MY_TRACK_WITH_LIMIT are new flags for init_io_cache().
MY_TRACK means that we track the file usage. MY_TRACK_WITH_LIMIT means
that we track the file usage and give an error if the limit is
breached. The latter is used to not give an error on commit when the
binlog_stmt_cache is flushed.
- global_tmp_space_used contains the total tmp space used so far.
This is needed to quickly check against max_total_tmp_space_usage.
- Temporary space errors use EE_LOCAL_TMP_SPACE_FULL and
handler errors use HA_ERR_LOCAL_TMP_SPACE_FULL.
This is needed until we move general errors to their own error space
so that they cannot conflict with system error numbers.
- The return values of my_chsize() and mysql_file_chsize() have changed
so that -1 is returned in the case where my_chsize() could not decrease
the file size (very unlikely, and will not happen on modern systems).
All calls to _chsize() are updated to check for > 0 as the error
condition.
- At the destruction of THD we check that THD::tmp_file_space == 0
- At server end we check that global_tmp_space_used == 0
- As a precaution against errors in the tmp_space_used code, one can set
max_tmp_space_usage and max_total_tmp_space_usage to 0 to disable
the tmp space quota errors.
- truncate_io_cache() function added.
- Aria tables using static or dynamic row length are registered in 8K
increments to avoid some calls to update_tmp_file_size().
Other things:
- Ensure that all handler errors are registered. Before, some engine
errors could be printed as "Unknown error".
- Fixed a bug in filesort() that caused an assert if there was an error
when writing to the temporary file.
- Fixed compute_window_func() to take write errors into account.
- In case of parallel replication, rpl_group_info::cleanup_context()
could call trans_rollback() with thd->error set, which would cause
an assert. Fixed by resetting the error before calling trans_rollback().
- Fixed a bug in subselect3.inc which caused the following test to use
heap tables with a low value for max_heap_table_size.
- Fixed a bug in sql_expression_cache where it did not overflow a
heap table to an Aria table.
- Added Max_tmp_disk_space_used to slow query log.
- Fixed some bugs in log_slow_innodb.test
- FLUSH GLOBAL STATUS now resets most global_status_vars.
At this stage, this is mainly to be used for testing.
- FLUSH SESSION STATUS added as an alias for FLUSH STATUS.
- FLUSH STATUS does not require any privilege (before required RELOAD).
- FLUSH GLOBAL STATUS requires RELOAD privilege.
- All global status reset moved to FLUSH GLOBAL STATUS.
- Replication semisync status variables are now reset by
FLUSH GLOBAL STATUS.
- In test cases, the only changes are:
- Replace FLUSH STATUS with FLUSH GLOBAL STATUS
- Replace FLUSH STATUS with FLUSH STATUS; FLUSH GLOBAL STATUS.
This was only done in a few tests where the test was using SHOW STATUS
for both local and global variables.
- Uptime_since_flush_status is now always provided, independent of
whether ENABLED_PROFILING was set when compiling MariaDB.
- @@global.Uptime_since_flush_status is reset on FLUSH GLOBAL STATUS
and @@session.Uptime_since_flush_status is reset on FLUSH SESSION STATUS.
- When connected, @@session.Uptime_since_flush_status is set to 0.
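A short illustration of the new semantics described above:

  FLUSH SESSION STATUS;   -- alias for FLUSH STATUS; no privilege required
  FLUSH GLOBAL STATUS;    -- resets most global status variables; needs RELOAD
  SHOW GLOBAL STATUS LIKE 'Uptime_since_flush_status';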
Added SHOW_LONGLONG_NOFLUSH to mark the variables that should not be
flushed.
New variables cleared as part of FLUSH GLOBAL STATUS:
- Rpl_semi_sync_master_request_ack
- Rpl_semi_sync_master_get_ack
- Rpl_semi_sync_slave_send_ack
- Slave_skipped_errors
This task is to ensure we have a clear definition and rules of how to
repair or optimize a table.
The rules are:
- REPAIR should be used with tables that are crashed and
unreadable (hardware issues such as unreadable blocks, blocks with
'unexpected data' etc.)
- OPTIMIZE TABLE should be used to optimize the storage layout for the
table (recover space from deleted rows and optimize the index
structure).
- ALTER TABLE table_name FORCE should be used to rebuild the .frm file
(the table definition) and the table (with the original table row
format). If the table is from an older MariaDB/MySQL release with a
different storage format, it will convert the data to the new
format. ALTER TABLE ... FORCE is used as part of mariadb-upgrade.
Here follows some more background:
The 3 ways to repair a table are:
1) "ALTER TABLE table_name FORCE" (no other options).
As an alias we allow: "ALTER TABLE table_name ENGINE=original_engine"
2) "REPAIR TABLE" (without FORCE)
3) "OPTIMIZE TABLE"
All of the above commands will optimize row space usage (which means that
space will be needed to hold a temporary copy of the table) and
re-generate all indexes. They will also try to replicate the original
table definition as closely as possible.
For ALTER TABLE and "REPAIR TABLE without FORCE", the following will hold:
If the table is from an older MariaDB version and data conversion is
needed (for example for old type HASH columns, MySQL JSON type or new
TIMESTAMP format) "ALTER TABLE table_name FORCE, algorithm=COPY" will be
used.
The differences between the algorithms are:
1) Will use the fastest algorithm the engine supports to do a full repair
of the table (except if data conversion is needed).
2) Will use the storage engine internal REPAIR facility (MyISAM, Aria).
If the engine does not support REPAIR then
"ALTER TABLE FORCE, ALGORITHM=COPY" will be used.
If there were data incompatibilities (which means that FORCE was used)
then there will be a warning after REPAIR that ALTER TABLE FORCE is
still needed.
The reason for this is that REPAIR may be able to work around data
errors (wrong or incompatible data, crashed or unreadable sectors) that
ALTER TABLE cannot handle.
3) Will use the storage engine internal OPTIMIZE. If the engine does not
support OPTIMIZE, then "ALTER TABLE FORCE" is used.
The above will ensure that ALTER TABLE FORCE is able to
correct almost any errors in the row or index data. In case of
corrupted blocks, REPAIR, possibly followed by ALTER TABLE, is needed.
This is important as mariadb-upgrade executes ALTER TABLE table_name
FORCE for any table that must be re-created.
Bugs fixed with InnoDB tables when using ALTER TABLE FORCE:
- No error for INNODB_DEFAULT_ROW_FORMAT=COMPACT even if row length
would be too wide. (Independent of innodb_strict_mode).
- Tables using symlinks will be symlinked after any of the above commands
(Independent of the setting of --symbolic-links)
If one specifies an algorithm together with ALTER TABLE FORCE, things
will work as before (except if data conversion is required as then
the COPY algorithm is enforced).
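For example (t1 is a placeholder table):

  ALTER TABLE t1 FORCE, ALGORITHM=INPLACE;  -- honored as before, unless data
                                            -- conversion enforces COPY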
ALTER TABLE .. OPTIMIZE ALL PARTITIONS will work as before.
Other things:
- FORCE argument added to REPAIR to allow one to first run the internal
repair to fix damaged blocks and then follow it with ALTER TABLE
(see the example after this list).
- REPAIR will not update frm_version if ha_check_for_upgrade() finds
that the table is still incompatible with the current version. In this
case the REPAIR will end with an error.
- REPAIR for storage engines that do not have native repair, like InnoDB,
now uses ALTER TABLE FORCE.
- REPAIR csv-table USE_FRM now works.
- It did not work before as CSV tables had the extension list in the
wrong order.
- Default error message length for %M increased from 128 to 256 to not
truncate information from REPAIR.
- Documented HA_ADMIN_XX variables related to repair.
- Added HA_ADMIN_NEEDS_DATA_CONVERSION to signal that we have to
do data conversions when converting the table (and thus ALTER TABLE
copy algorithm is needed).
- Fixed typo in error message (caused test changes).
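A sketch of the repair flow mentioned above for a badly corrupted
table (t1 is a placeholder):

  REPAIR TABLE t1 FORCE;  -- engine-internal repair first, fixing damaged blocks
  ALTER TABLE t1 FORCE;   -- then a full rebuild, doing any pending data conversion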
Remove alter_algorithm but keep the variable as no-op (with a warning).
The reasons for removing alter_algorithm are:
- alter_algorithm was introduced as a replacement for the
old_alter_table that was used to force the usage of the original
alter table algorithm (copy) in the cases where the new alter
algorithm did not work. The new option was added as a way to force
the usage of a specific algorithm when it should instead have made
it possible to disable algorithms that would not work for some
reason.
- alter_algorithm introduced some cases where ALTER TABLE would not
work without specifying the ALGORITHM=XXX option together with
ALTER TABLE.
- Having different values of alter_algorithm on master and slave could
cause the slave to stop unexpectedly.
- ALTER TABLE FORCE, as used by mariadb-upgrade, would not always work
if alter_algorithm was set for the server.
- As part of MDEV-33449 "improving repair of tables" it became
clear that alter_algorithm made it harder to provide a better and
more consistent ALTER TABLE FORCE and REPAIR TABLE, and it would be
better to remove it.
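After this change the variable still parses but has no effect; a
minimal sketch of the documented no-op behavior:

  SET SESSION alter_algorithm=INPLACE;  -- accepted as a no-op, with a warning
  SHOW WARNINGS;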
MDEV-32188 make TIMESTAMP use whole 32-bit unsigned range
This was done by changing DTVAL to use longlong to internally store
the date/timestamp instead of an int.
I had to preserve the old binary representation used for dates in
TABLE_TYPE=BIN. One consequence is that TABLE_TYPE=BIN cannot support
the full date bit range.
MDEV-32188 make TIMESTAMP use whole 32-bit unsigned range
- Changed usage of timeval to my_timeval as the timeval parts on Windows
are 32 bits long, which causes some compiler issues on Windows.
This patch extends the timestamp from
2038-01-19 03:14:07.999999 to 2106-02-07 06:28:15.999999
for 64 bit hardware and OS where 'long' is 64 bits.
This is true for 64 bit Linux but not for Windows.
This is done by treating the 32 bit stored int as unsigned instead of
signed. This is safe as MariaDB has never accepted dates before the epoch
(1970).
The benefit of this approach is that for normal timestamps the storage
is compatible with earlier versions.
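A small sketch of the extended range on platforms where it applies:

  CREATE TABLE t1 (ts TIMESTAMP);
  INSERT INTO t1 VALUES ('2038-01-19 03:14:08');  -- previously out of range
  INSERT INTO t1 VALUES ('2106-02-07 06:28:15');  -- the new maximum value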
However, for tables using system versioning we previously stored a
timestamp with the year 2038 as the 'max timestamp', which is used to
detect current values. This patch stores the new 2106 year max value
as the max timestamp. This means that old tables using system
versioning need to be updated with mariadb-upgrade when moving them
to 11.4. That will be done in a separate commit.
Various help message improvements:
* MySQL->MariaDB, mysqld->mariadbd, "mysqld daemon" -> "mariadbd process"
* typos
* don't specify defaults directly in the help message
* don't say that an option is deprecated, mark it as such
* missing spaces in the middle of the text
etc
* use new deprecated printer for all deprecated server options
* restore alphabetic option sorting order
* move deprecated printer from mysqld.cc to my_getopt.c
* in --help print deprecation message at the end of the option help
* move 'ALL' help text where it belongs - to other SET options, and
with a correct indentation.
* consistently end either all or none of the command-line option help
strings with a dot - my_print_help() needs that.
It's about 50/50 now, so let's do none; fewer line wraps in --help
* remove trailing spaces from command-line option help strings
Currently there are mechanisms to mark a system variable as
deprecated, but they are only used to print warning messages
when a deprecated variable is set.
Leverage the existing mechanisms in order to make the
deprecation information available at the --help output of mysqld by:
* Moving the deprecation information (i.e. the `deprecation_substitute`
attribute) from the `sys_var` class into the `my_option` struct.
As every `sys_var` contains its own `my_option` struct, the access
to the deprecation information remains available to `sys_var`
objects. `my_getopt` functions, which work directly with
`my_option` structs, gain access to this information while building
the --help output.
* For plugin variables, leveraging the `PLUGIN_VAR_DEPRECATED` flag
and setting the `deprecation_substitute` attribute accordingly when
building the `my_option` objects.
* Changing the `option_cmp` function to use the `deprecation_substitute`
attribute instead of the name when sorting the options. This way
deprecated options and their substitutes will be grouped together.
All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer
Amazon Web Services, Inc.
- InnoDB page compression works only on COMPACT or DYNAMIC row
format tables, so InnoDB should throw an error when ALTER TABLE
tries to enable PAGE_COMPRESSED for a ROW_FORMAT=REDUNDANT table.
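Expected behavior after the fix (hypothetical table):

  CREATE TABLE t1 (a INT) ENGINE=InnoDB ROW_FORMAT=REDUNDANT;
  ALTER TABLE t1 PAGE_COMPRESSED=1;  -- must now be rejected with an error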
Correct the second parameter for strxnmov to prevent potential buffer
overflows. The second parameter must be one less than the size of the
input buffer to avoid writing past the end of the buffer.
While the second parameter is usually correct, there are exceptions
that need fixing.
This commit addresses the issue within frm_file_exists() and other
affected places.
trx_undo_mem_create_at_db_start(): Invoke recv_sys_t::recover()
instead of buf_page_get_gen(), so that all undo log pages will be
recovered correctly. Failure to do this could prevent InnoDB from
starting up due to "Data structure corruption", or it could
potentially lead to a situation where InnoDB starts up but some
transactions were recovered incorrectly.
recv_sys_t::recover(): Only acquire a buffer-fix on the pages,
not a shared latch. This is adequate protection, because this function
is only being invoked during early startup when no "users" are modifying
buffer pool pages. The only writes are due to server bootstrap
(the data files being created) or crash recovery (changes from
ib_logfile0 being applied).
buf_page_get_gen(): Assert that the function is not invoked while crash
recovery is in progress, and that the special mode BUF_GET_RECOVER is
only invoked during crash recovery or server bootstrap.
All this should really have been part of
commit 850d61736d (MDEV-32042).
in buf_dblwr_t::init_or_load_pages()
- InnoDB fails to set the TRX_SYS_DOUBLEWRITE_SPACE_ID_STORED
flag in transaction system header page while recreating
the undo log tablespaces
buf_dblwr_t::init_or_load_pages(): Tried to reset the
space id and to write into the doublewrite buffer even
when read_only mode was enabled.
In srv_all_undo_tablespaces_open(), InnoDB should try to
open the extra unused undo tablespaces instead of trying to
create them.
This bug was found related to MDEV-34212.
Some InnoDB tests, most notably innodb.table_flags,64k, would
occasionally fail. I am able to reproduce this locally, sporadically,
on a MemorySanitizer build.
The following is a minimal .test file for reproducing this:
--source include/have_innodb.inc
SELECT @@innodb_page_size;
and the .opt file:
--innodb-undo-tablespaces=0 --innodb-page-size=64k
--innodb-buffer-pool-size=20m
This bug was revealed due to the recent
commit 466ae1cf81
which set innodb_fast_shutdown=0 during server bootstrap
in our regression test driver.
Due to the bug, a write of undo page 50 in the system tablespace
was discarded in buf_page_t::flush(). A subsequent InnoDB startup
failed because an old version of that page would point to a
freed undo log page 300.
mtr_t::commit_shrink(): Only invoke fil_space_t::set_create_lsn()
on undo tablespaces, which will be fully reinitialized or created
from scratch. On the system tablespace, we must only adjust
the file size, to avoid writing pages that are beyond the end
of the tablespace. Thanks to Thirunarayanan Balathandayuthapani
for providing this fix.
innobase_end(): Do not attempt to shrink the system tablespace if
innodb_read_only=ON or innodb_force_recovery>4. This fixes a regression
due to commit 2d6c2f22a4 (MDEV-32452).
This bug was caught when testing a fix of MDEV-34200, which adds
SET GLOBAL innodb_fast_shutdown=0 to the test innodb.undo_upgrade.
The patch for MDEV-31340 fixed the following bugs:
MDEV-33084 LASTVAL(t1) and LASTVAL(T1) do not work well with lower-case-table-names=0
MDEV-33085 Tables T1 and t1 do not work well with ENGINE=CSV and lower-case-table-names=0
MDEV-33086 SHOW OPEN TABLES IN DB1 -- is case insensitive with lower-case-table-names=0
MDEV-33088 Cannot create triggers in the database `MYSQL`
MDEV-33103 LOCK TABLE t1 AS t2 -- alias is not case sensitive with lower-case-table-names=0
MDEV-33108 TABLE_STATISTICS and INDEX_STATISTICS are case insensitive with lower-case-table-names=0
MDEV-33109 DROP DATABASE MYSQL -- does not drop SP with lower-case-table-names=0
MDEV-33110 HANDLER commands are case insensitive with lower-case-table-names=0
MDEV-33119 User is case insensitive in INFORMATION_SCHEMA.VIEWS
MDEV-33120 System log table names are case insensitive with lower-case-table-names=0
Backporting the fixes from 11.5 to 10.5
BUF_LRU_MIN_LEN (256) is too high a value for low buffer pool (BP) sizes.
For example, for a BP size lower than 80M with a 16K page size, the limit
is more than 5% of the total BP, and for the lowest BP size of 5M it is
80% of the BP.
Non-data objects like explicit locks could occupy part of the BP,
reducing the pages available for the LRU. If the LRU reaches the minimum
limit and no free pages are available, the server would hang, with the
page cleaner unable to free any more pages.
Fix: To avoid such a hang, we adjust the LRU limit to be lower than the
limit for data objects as checked in
buf_LRU_check_size_of_non_data_objects(), i.e. one page less than 5% of
the BP.
trx_free_at_shutdown(): Similar to trx_t::commit_in_memory(),
clear the detailed_error (FOREIGN KEY constraint error) before
invoking trx_t::free(). We only do this on debug instrumented
builds in order to avoid a debug assertion failure on shutdown.
This regression was introduced in 10.6 by the following commit:
commit b6a2472489
MDEV-27891: SIGSEGV in InnoDB buffer pool resize
During DML, we check in buf_pool_t::running_out whether the buffer pool
is running out of data pages. If 75% of the buffer pool is occupied by
non-data pages, we roll back the current transaction and exit with
ER_LOCK_TABLE_FULL.
The integer division (n_chunks_new / 4) becomes zero whenever the total
number of chunks is < 4, making the check completely ineffective in
such cases. The check is also inaccurate for larger chunks.
Fix-1: Correct the check in buf_pool_t::running_out.
Fix-2: While waiting for a free page, check
buf_LRU_check_size_of_non_data_objects().
A wide_handler is shared among ha_spider of partitions of the same
spider table, where the last partition is designated the owner of the
wide_handler, and is responsible for its deallocation. Therefore, in
case of failure, we only reset wide_handler in error handling if the
current ha_spider is the owner of the wide_handler; otherwise it would
result in a segfault in the destructor of ha_spider, or during
ha_spider::close().
The init, init_error, and init_error_time fields of a SPIDER_SHARE
should only be assigned when actually doing the initialisation of a
SPIDER_SHARE, otherwise they could result in spurious failures from
spider_get_share() in a subsequent statement.
This regression was introduced in 10.6 by the following commit:
commit 898dcf93a8
(Cleanup the lock creation)
It removed one important optimization for lock bitmap pre-allocation.
We pre-allocate about 8 bytes of extra space along with every lock object
to adjust for similar locks on newly created records on the same page by
the same transaction. When it is exhausted, a new lock object is created
with a similar 8-byte pre-allocation. With this optimization removed we
are left with only 1 byte of pre-allocation. When a large number of
records are inserted and locked in a single page, we end up creating too
many new locks, almost in O(n^2) order.
Fix-1: Bring back LOCK_PAGE_BITMAP_MARGIN for pre-allocation.
Fix-2: Use the extra space (40 bytes) for bitmap in trx->lock.rec_pool.