Commit graph

190256 commits

Author SHA1 Message Date
Monty
dbcd3384e0 MDEV-7947 strcmp() takes 0.37% in OLTP RO
This patch ensures that all identical character sets share the same
cs->csname.
This allows us to replace strcmp() in my_charset_same() with comparisons
of pointers. This fixes a long-standing performance issue that could cause
a strcmp() for every item sent through the protocol class to the end user.

One consequence of this patch is that we don't allow one to add a character
definition in the Index.xml file that changes the csname of an existing
character set. This is by design as changing character set names of existing
ones is extremely dangerous, especially as some storage engines just record
character set numbers.

As we now have a hash over character set's csname, we can in the future
use that for faster access to a specific character set. This could be done
by changing the hash to non-unique and using the hash to find the next
character set with same csname.
2020-07-23 10:54:33 +03:00
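The pointer-comparison idea in the commit above can be sketched as follows. This is a minimal illustration, not the server code: `CharsetInfo`, `intern_name()` and `charset_same()` are hypothetical names, and the table is not thread-safe. It assumes only that every distinct name is stored exactly once, so two names are equal exactly when their pointers are equal.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Hypothetical stand-in for a character set descriptor; not the MariaDB struct.
struct CharsetInfo {
  const char *csname;  // always points into the interning table below
};

// Return the canonical pointer for a name. std::unordered_set is node-based,
// so the address of a stored string stays valid while the table grows.
inline const char *intern_name(const std::string &name) {
  static std::unordered_set<std::string> table;
  return table.insert(name).first->c_str();
}

// With interned names, equality is a pointer comparison instead of strcmp().
inline bool charset_same(const CharsetInfo &a, const CharsetInfo &b) {
  return a.csname == b.csname;
}
```

Two descriptors built from `intern_name("utf8")` compare equal by pointer even if the second name was assembled at run time, while a descriptor for a different name never does.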
Monty
46ffd47f42 Fixed wrong free in comp_err 2020-07-23 10:54:33 +03:00
Monty
d55f8a249e Disable maria.max_length when using valgrind (too slow) 2020-07-23 10:54:33 +03:00
Monty
747479aba2 Fixed removed warning from valgrind in Protocol::store_str
The problem was that field_count is not initialized for the Protocol
variable used when printing metadata.
2020-07-23 10:54:32 +03:00
Monty
61c15ebe32 Remove String::lex_string() and String::lex_cstring()
- Better to use 'String *' directly.
- Added String::get_value(LEX_STRING*) for the few cases where we want to
  convert a String to LEX_CSTRING.

Other things:
- Use StringBuffer for some functions to avoid mallocs
2020-07-23 10:54:32 +03:00
Monty
2682458128 Use larger buffer when reading binary and relay logs
- Should speed up replication
2020-07-23 10:54:32 +03:00
Monty
c89e927a56 Clean up Item_uint() & Item_int()
- Removed val_str() and print() as these are handled by Item_int()
- Use local StringBuffer for Item_int::print() to avoid mallocs
2020-07-23 10:54:32 +03:00
Marko Mäkelä
5e76e234f5 Merge 10.4 into 10.5 2020-07-23 09:19:06 +03:00
Marko Mäkelä
5f2628d1ee MDEV-22778 Slow InnoDB shutdown on large instance
Starting with MDEV-17441 we would no longer have os_once,
and we would always initialize zip_pad_info_t::mutex and
dict_table_t::autoinc_mutex, even for tables that are not in
ROW_FORMAT=COMPRESSED and do not include any AUTO_INCREMENT column.

mutex_free() on those unnecessary objects would make shutdown very slow
compared to older versions.

Let us use std::mutex for those two mutexes, to reduce the overhead.
The critical sections protected by these mutexes are very small, and
therefore contention or the need for any instrumentation should
be unlikely.
2020-07-23 08:28:17 +03:00
Oleksandr Byelkin
ddb8309e8c MDEV-21997 Server crashes in LEX::create_item_ident_sp upon use of unknown identifier
If there is no current_select and the variable is not found among the SP variables, it can only be an error.
2020-07-22 15:03:22 +02:00
Thirunarayanan Balathandayuthapani
92014bd1c6 MDEV-23252 Assertion failure 'req_type.is_dblwr_recover() || err == DB_SUCCESS' for page_compressed tables
- This issue is caused by a5584b13d1
(MDEV-15528), which added os_file_punch_hole() to fil_io()
but fails to handle a failure of os_file_punch_hole(). InnoDB should
handle the DB_IO_NO_PUNCH_HOLE error and silently transform it to
DB_SUCCESS. InnoDB should set the punch hole flag correctly when
the tablespace is loaded.

fil_node_t::read_page0(): Set the punch hole flag when tablespace is loaded

fil_io(): Handle the DB_IO_NO_PUNCH_HOLE error

buf_flush_free_pages(): Check the punch hole condition earlier, using
the tablespace punch hole flag
2020-07-22 16:10:56 +05:30
Thirunarayanan Balathandayuthapani
d96027c84a MDEV-23254 Replace FSP_FLAGS_HAS_PAGE_COMPRESSION with fil_space_t::is_compressed
InnoDB should replace FSP_FLAGS_HAS_PAGE_COMPRESSION check with
fil_space_t::is_compressed(). fil_space_t::is_compressed() checks
for both non full crc32 and crc32 format.
2020-07-22 16:10:56 +05:30
Jan Lindström
3d01576af2 Fix regex on test. 2020-07-22 16:10:56 +05:30
Thirunarayanan Balathandayuthapani
1ca52b969a MDEV-23254 Replace FSP_FLAGS_HAS_PAGE_COMPRESSION with fil_space_t::is_compressed
InnoDB should replace FSP_FLAGS_HAS_PAGE_COMPRESSION check with
fil_space_t::is_compressed(). fil_space_t::is_compressed() checks
for both non full crc32 and crc32 format.
2020-07-22 14:42:19 +05:30
Jan Lindström
8c7f7bae47 Fix regex on test. 2020-07-22 08:48:14 +03:00
sjaakola
7bffe468b2 MDEV-21910 Deadlock between BF abort and manual KILL command
When a high-priority replication slave applier encounters a lock conflict in InnoDB,
it will force the conflicting lock holder transaction (victim) to roll back.
This is a must in the multi-master synchronous replication model to avoid cluster lock-up.
This high-priority victim abort (aka "brute force" (BF) abort) is started
from the InnoDB lock manager while holding the victim's transaction (trx) mutex.
Depending on the execution state of the victim transaction, it may happen that the
BF abort will call THD::awake() to wake up the victim transaction for the rollback.
Now, if BF abort requires THD::awake() to be called, then the applier thread executes
the locking protocol: victim trx mutex -> victim THD::LOCK_thd_data

If, at the same time, another DBMS super user issues a KILL command to abort the same victim,
it will execute the locking protocol: victim THD::LOCK_thd_data -> victim trx mutex.
These two locking protocols acquire mutexes in opposite order, hence an unresolvable mutex locking
deadlock may occur.

The fix in this commit adds the THD::wsrep_aborter flag to synchronize who can kill the victim.
This flag is set both when BF abort is called for from InnoDB and by the KILL command.
Either path of victim killing will bail out if the victim's wsrep_killed is already
set, to avoid mutex conflicts with the other aborter execution. THD::wsrep_aborter
records the aborter THD's ID. This is needed to preserve the right to kill
the victim from different locations for the same aborter thread.
It is also good for error logging, to see who is responsible for the abort.

A new test case was added in galera.galera_bf_kill_debug.test for the scenario where
a wsrep applier thread and a manual KILL command try to kill the same idle victim.
2020-07-22 08:20:10 +03:00
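The "first aborter wins, the same aborter may re-enter" rule described in the commit above can be sketched as follows. This is an illustration under stated assumptions, not the server's implementation: `Victim` and `try_claim_abort()` are hypothetical names, and an atomic compare-and-swap stands in for whatever synchronization the server actually uses around THD::wsrep_aborter.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical victim session; 0 means "nobody is aborting this victim yet".
struct Victim {
  std::atomic<unsigned long> aborter{0};
};

// Returns true if the caller identified by my_id (non-zero) may go on to kill
// the victim, false if another aborter already owns the job.
inline bool try_claim_abort(Victim &v, unsigned long my_id) {
  unsigned long expected = 0;
  if (v.aborter.compare_exchange_strong(expected, my_id))
    return true;             // we are the first aborter
  return expected == my_id;  // re-entry by the same aborter is allowed
}
```

A path that gets `false` bails out before taking the second mutex, so the two kill paths never hold their locks in opposite order against the same victim.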
Marko Mäkelä
4ec032b492 Merge 10.4 into 10.5 2020-07-21 17:33:16 +03:00
Marko Mäkelä
b1538f4d60 Merge 10.3 into 10.4 2020-07-21 16:36:47 +03:00
Marko Mäkelä
b75563cdfd MDEV-15880: ASAN heap-use-after-free with innodb_evict_tables_on_commit_debug
trx_update_mod_tables_timestamp(): When implementing
innodb_evict_tables_on_commit_debug, do not evict tables
on which transactional locks exist.

This debug variable was broken since its introduction in
commit 947b0b5722.
2020-07-21 16:03:08 +03:00
Monty
e26c822aa0 MDEV-16929 Assertion ... in close_thread_tables upon killing connection
Problem was that the code didn't handle a transaction created in innodb
as part of a failed mysql_lock_tables()
2020-07-21 15:12:53 +03:00
Monty
fc48c8ff4c MDEV-21953 deadlock between BACKUP STAGE BLOCK_COMMIT and parallel repl.
The issue was:
T1, a parallel slave worker thread, is waiting for another worker thread to
commit. While waiting, it has the MDL_BACKUP_COMMIT lock.
T2, working for mariabackup, is doing BACKUP STAGE BLOCK_COMMIT and blocks
all commits.
This causes a deadlock as the thread that T1 is waiting for can't commit.

Fixed by moving locking of MDL_BACKUP_COMMIT from ha_commit_trans() to
commit_one_phase_2()

Other things:
- Added a new argument to ha_commit_one_phase() to signal if the
  transaction was a write transaction.
- Ensured that ha_maria::implicit_commit() is always called under
  MDL_BACKUP_COMMIT. This code is not needed in 10.5
- Ensure that MDL_Request values 'type' and 'ticket' are always
  initialized. This makes it easier to check the state of the MDL_Request.
- Moved thd->store_globals() earlier in handle_rpl_parallel_thread() as
  thd->init_for_queries() could use an MDL that could crash if store_globals
  were not called.
- Don't call ha_enable_transactions() in THD::init_for_queries() as this
  is both slow (uses MDL locks) and not needed.
2020-07-21 12:42:42 +03:00
Eugene Kosov
c4d5b6b157 MDEV-22899 Assertion `field->col->is_binary() || field->prefix_len % field->col->mbmaxlen == 0' failed in dict_index_add_to_cache
is_part_of_a_key(): detect whether a TEXT field is a part of some key

ha_innobase::can_convert_blob(): now correctly detects whether our blob
is a part of some key. Previously the check didn't work in some cases.
2020-07-20 18:53:16 +03:00
Aleksey Midenkov
af83ed9f0e MDEV-20661 Virtual fields are not recalculated on system fields value assignment
Fix stale virtual field value in 4 cases: when virtual field depends
on row_start/row_end in timestamp/trx_id versioned table. row_start
dep is recalculated in vers_update_fields() (SQL and InnoDB
layer). row_end dep is recalculated on history row insert.
2020-07-20 18:28:08 +03:00
Aleksey Midenkov
af57c65809 MDEV-22061 InnoDB: Assertion of missing row in sec index row_start upon REPLACE on a system-versioned table
make_versioned_helper() appended new update field unconditionally
while it should check if this field already exists in update vector.

Misc renames to conform versioning prefix. vers_update_fields() name
conforms with sql layer TABLE::vers_update_fields().
2020-07-20 18:28:07 +03:00
Thirunarayanan Balathandayuthapani
c89366866b MDEV-22970 Possible corruption of page_compressed tables, or
when scrubbing is enabled

buf_read_recv_pages(): Ignore the page to read if it is already
present in the freed ranges.

store_freed_or_init_rec(): Store the ranges only if scrubbing
is enabled or the tablespace is page_compressed.

recv_init_crash_recovery_space(): Add the freed range only when
scrubbing is enabled or the tablespace is page_compressed.

range_set::contains(): Check whether the value is present in the ranges.

range_set::remove_if_exists(): Remove the value if it exists in the ranges.

mtr_t::init(): Handle the scenario where a mini-transaction may allocate
a page that had just been freed.

recv_sys_t::parse(): Note down the FREE and INIT redo log irrespective
of STORE value.

Removed innodb_tablespaces_scrubbing from test case
2020-07-20 18:52:10 +05:30
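The two range_set operations named in the commit above can be sketched as follows. This is a simplified illustration, not the InnoDB class: `RangeSet`, `add_range()` and the `std::map` layout are assumptions, and the sketch expects callers never to add overlapping ranges.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical simplified range set: disjoint closed intervals [first, last].
class RangeSet {
  std::map<uint32_t, uint32_t> ranges_;  // key = first, value = last

  // Find the interval containing value, or end() if there is none.
  std::map<uint32_t, uint32_t>::iterator find(uint32_t value) {
    auto it = ranges_.upper_bound(value);  // first interval starting after value
    if (it == ranges_.begin()) return ranges_.end();
    --it;                                  // last interval starting at or before value
    return value <= it->second ? it : ranges_.end();
  }

 public:
  // Add [first, last]; assumes it does not overlap an existing interval.
  void add_range(uint32_t first, uint32_t last) { ranges_[first] = last; }

  bool contains(uint32_t value) { return find(value) != ranges_.end(); }

  // Remove value if present, splitting its interval when needed.
  void remove_if_exists(uint32_t value) {
    auto it = find(value);
    if (it == ranges_.end()) return;
    const uint32_t first = it->first, last = it->second;
    ranges_.erase(it);
    if (first < value) ranges_[first] = value - 1;
    if (value < last) ranges_[value + 1] = last;
  }
};
```

Removing a page number from the middle of a freed range leaves two smaller ranges on either side, which is the behaviour a page that is freed and then allocated again would need.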
Marko Mäkelä
4d4865de6f Merge 10.4 into 10.5 2020-07-20 15:55:59 +03:00
Marko Mäkelä
6c165b4bd6 MDEV-21910: Null-merge 10.4 to 10.5 (FIXME: really merge this!) 2020-07-20 15:37:12 +03:00
Marko Mäkelä
4b959bd8df Merge 10.3 into 10.4 2020-07-20 15:34:59 +03:00
Marko Mäkelä
acc58fd835 Merge 10.2 into 10.3 2020-07-20 15:11:59 +03:00
Marko Mäkelä
ca9276e37e Merge 10.1 into 10.2 2020-07-20 14:53:24 +03:00
Marko Mäkelä
57ec42bc32 MDEV-23190 InnoDB data file extension is not crash-safe
When InnoDB is extending a data file, it is updating the FSP_SIZE
field in the first page of the data file.

In commit 8451e09073 (MDEV-11556)
we removed a work-around for this bug and made recovery stricter,
by making it track changes to FSP_SIZE via redo log records, and
extend the data files before any changes are being applied to them.

It turns out that the function fsp_fill_free_list() is not crash-safe
with respect to this when it is initializing the change buffer bitmap
page (page 1, or generally, N*innodb_page_size+1). It uses a separate
mini-transaction that is committed (and will be written to the redo
log file) before the mini-transaction that actually extended the data
file. Hence, recovery can observe a reference to a page that is
beyond the current end of the data file.

fsp_fill_free_list(): Initialize the change buffer bitmap page in
the same mini-transaction.

The rest of the changes are fixing a bug that the use of the separate
mini-transaction was attempting to work around. Namely, we must ensure
that no other thread will access the change buffer bitmap page before
our mini-transaction has been committed and all page latches have been
released.

That is, for read-ahead as well as neighbour flushing, we must avoid
accessing pages that might not yet be durably part of the tablespace.

fil_space_t::committed_size: The size of the tablespace
as persisted by mtr_commit().

fil_space_t::max_page_number_for_io(): Limit the highest page
number for I/O batches to committed_size.

MTR_MEMO_SPACE_X_LOCK: Replaces MTR_MEMO_X_LOCK for fil_space_t::latch.

mtr_x_space_lock(): Replaces mtr_x_lock() for fil_space_t::latch.

mtr_memo_slot_release_func(): When releasing MTR_MEMO_SPACE_X_LOCK,
copy space->size to space->committed_size. In this way, read-ahead
or flushing will never be invoked on pages that do not yet exist
according to FSP_SIZE.
2020-07-20 14:48:56 +03:00
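The relationship between size, committed_size and the I/O limit described in the commit above can be sketched as follows. This is a simplified model, not the InnoDB code: `Tablespace`, `extend()` and `release_x_latch()` are hypothetical, and only the member names committed_size and max_page_number_for_io() come from the commit message.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical simplified tablespace; sizes are counted in pages.
struct Tablespace {
  uint32_t size = 0;            // current size, may include uncommitted growth
  uint32_t committed_size = 0;  // size as of the last committed extension

  // Extend the file inside a (notional) mini-transaction.
  void extend(uint32_t pages) { size += pages; }

  // On releasing the exclusive latch at commit, publish the new size.
  void release_x_latch() { committed_size = size; }

  // Clamp an upper page-number limit for read-ahead or flushing batches.
  uint32_t max_page_number_for_io(uint32_t limit) const {
    return std::min(limit, committed_size);
  }
};
```

Until the extending mini-transaction releases its latch, batches are clamped to the previously committed size, so they cannot touch pages that recovery might not yet know about.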
Marko Mäkelä
98e2c17e9e Cleanup: Remove fil_check_adress_in_tablespace() 2020-07-20 14:48:56 +03:00
Marko Mäkelä
14543afd59 Cleanup: Remove unused AbstractCallback::m_free_limit 2020-07-20 14:48:56 +03:00
Marko Mäkelä
0a7faed75a MDEV-22771 Instant extension of CHAR column is wrongly allowed
commit 854c219a7f (MDEV-17301)
broke a constraint: Fixed-length columns cannot be extended in InnoDB
without rebuilding the table.

ha_innobase::can_convert_string(): Correct the condition. We must
not allow any instantaneous change to the length of CHAR columns
measured in characters. For any format other than ROW_FORMAT=REDUNDANT,
we can allow the length in bytes to be extended if mbminlen<mbmaxlen held
before the change of the character set.
2020-07-20 14:15:56 +03:00
Varun Gupta
c400ef2586 Making the stat_tables_innodb test deterministic 2020-07-18 10:19:13 +05:30
Alexey Botchkov
2cae58f891 MDEV-18371 Server crashes in ha_innobase::cmp_ref upon UPDATE with PARTITION clause.
m_file[0] is not always a good sample.
2020-07-17 12:20:23 +04:00
Julius Goryavsky
a1e52e7f32 MDEV-20401: revert unnecessary change 2020-07-16 16:40:37 +02:00
Julius Goryavsky
1ba8df4c60 MDEV-20401: revert unnecessary change 2020-07-16 16:31:27 +02:00
Julius Goryavsky
956f21c3b0 Merge remote-tracking branch 'origin/bb-10.4-MDEV-21910' into 10.4 2020-07-16 13:03:29 +02:00
Julius Goryavsky
b3cae9db11 MDEV-20401: Server incorrectly auto-sets lower_case_file_system value
Server auto-sets lower_case_file_system value based on default
datadir's behavior instead of using the directory specified
by the user through the configuration file or command line options.

This patch fixes this problem.
2020-07-16 12:28:36 +02:00
Julius Goryavsky
4412a461a1 MDEV-20401: Server incorrectly auto-sets lower_case_file_system value
Server auto-sets lower_case_file_system value based on default
datadir's behavior instead of using the directory specified
by the user through the configuration file or command line options.

This patch fixes this problem.
2020-07-16 12:17:01 +02:00
Marko Mäkelä
054f10365c Merge 10.4 into 10.5 2020-07-16 07:15:06 +03:00
Marko Mäkelä
3280edda89 Merge 10.3 into 10.4 2020-07-16 06:57:50 +03:00
Marko Mäkelä
73aa31fbfd Merge 10.2 into 10.3 2020-07-16 06:55:23 +03:00
Marko Mäkelä
147d4b1ec0 MDEV-21347 innodb_log_optimize_ddl=OFF is not crash safe
In commit 0f90728bc0 (MDEV-16809)
we introduced the configuration option innodb_log_optimize_ddl
for controlling whether native index creation or table-rebuild
in InnoDB should avoid writing full redo log.

Fungo Wang reported that this option is causing occasional failures.
The reason is that pages may be written to data files in an
inconsistent state. Applying log records to such inconsistent pages
may fail.

The solution is to always invoke PageBulk::finish() before page latches
may be released, to ensure that the page contents is in a consistent
state.

Something similar was implemented in MySQL 8.0.13:
mysql/mysql-server@d1254b9473

buf_block_t::skip_flush_check: Remove. Suppressing consistency checks
is a bad idea.

PageBulk::needs_finish(): New predicate: Determine whether
PageBulk::finish() must fix up the page.

PageBulk::init(): Clear PAGE_DIRECTION to ensure that needs_finish()
will hold. We change the field from PAGE_NO_DIRECTION to 0
and back without writing redo log. This trick avoids the need
to introduce any new data member to PageBulk.

PageBulk::insert(): Replace some high-level accessors to bypass
debug assertions related to PAGE_HEAP_TOP that we will be violating
until finish() has been executed.

PageBulk::finish(): Tolerate m_rec_no==0. We must invoke this also
on an empty page, to ensure that PAGE_HEAP_TOP is initialized.

PageBulk::commit(): Always invoke finish().

PageBulk::release(), BtrBulk::pageSplit(), BtrBulk::storeExt(),
BtrBulk::finish(): Invoke PageBulk::finish().
2020-07-16 06:35:15 +03:00
Marko Mäkelä
fee11c7727 Make page validation stricter
page_simple_validate_old(), page_simple_validate_new():
Require PAGE_N_DIR_SLOTS to be at least 2.
2020-07-15 19:41:01 +03:00
Marko Mäkelä
38b4c07833 MDEV-23183 Infinite loop on page_validate() on corrupted page
MDEV-22721 (commit eba2d10ac5)
inadvertently introduced an infinite loop.

page_validate(): Remove the infinite loop.
2020-07-15 19:41:01 +03:00
Daniel Black
20512a68d8 MDEV-23175: my_timer_milliseconds ftime deprecated - clock_gettime replacement
Linux glibc has deprecated ftime, resulting in a compile error on Fedora-32.

Per the manual, clock_gettime is the suggested replacement. Because my_timer_milliseconds
is a relative time used largely by the performance schema, CLOCK_MONOTONIC_COARSE
is used. This has been available since Linux-2.6.32.

The low overhead is shown in the unit test:

    $ unittest/mysys/my_rdtsc-t
    1..11
    # ----- Routine ---------------
    # myt.cycles.routine          :             5
    # myt.nanoseconds.routine     :            11
    # myt.microseconds.routine    :            13
    # myt.milliseconds.routine    :            18
    # myt.ticks.routine           :            17
    # ----- Frequency -------------
    # myt.cycles.frequency        :    3596597014
    # myt.nanoseconds.frequency   :    1000000000
    # myt.microseconds.frequency  :       1000000
    # myt.milliseconds.frequency  :          1039
    # myt.ticks.frequency         :           103
    # ----- Resolution ------------
    # myt.cycles.resolution       :             1
    # myt.nanoseconds.resolution  :             1
    # myt.microseconds.resolution :             1
    # myt.milliseconds.resolution :             1
    # myt.ticks.resolution        :             1
    # ----- Overhead --------------
    # myt.cycles.overhead         :           118
    # myt.nanoseconds.overhead    :           234
    # myt.microseconds.overhead   :           222
    # myt.milliseconds.overhead   :            30
    # myt.ticks.overhead          :          4946
    ok 1 - my_timer_init() did not crash
    ok 2 - The cycle timer is strictly increasing
    ok 3 - The cycle timer is implemented
    ok 4 - The nanosecond timer is increasing
    ok 5 - The nanosecond timer is implemented
    ok 6 - The microsecond timer is increasing
    ok 7 - The microsecond timer is implemented
    ok 8 - The millisecond timer is increasing
    ok 9 - The millisecond timer is implemented
    ok 10 - The tick timer is increasing
    ok 11 - The tick timer is implemented
2020-07-15 16:23:27 +03:00
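A millisecond timer built on clock_gettime, as the commit above describes, might look like the sketch below. It is not the patch itself: `timer_milliseconds()` is a hypothetical name, and the fallback to CLOCK_MONOTONIC where the Linux-specific coarse clock is missing is an assumption added for portability.

```cpp
#include <cassert>
#include <cstdint>
#include <time.h>

// Milliseconds from a monotonic clock; only the difference between two calls
// is meaningful, the absolute value has no defined epoch.
inline uint64_t timer_milliseconds() {
#ifdef CLOCK_MONOTONIC_COARSE
  const clockid_t clock_id = CLOCK_MONOTONIC_COARSE;  // Linux-specific, low overhead
#else
  const clockid_t clock_id = CLOCK_MONOTONIC;         // portable POSIX fallback
#endif
  struct timespec ts;
  if (clock_gettime(clock_id, &ts) != 0)
    return 0;  // this sketch reports failure as 0
  return static_cast<uint64_t>(ts.tv_sec) * 1000 +
         static_cast<uint64_t>(ts.tv_nsec) / 1000000;
}
```

The coarse clock trades resolution for a cheaper read, which suits a timer that only needs millisecond granularity.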
Marko Mäkelä
e67daa5653 Merge 10.4 into 10.5 2020-07-15 14:51:22 +03:00
Vladislav Vaintroub
9c8420fe8c Fix compile warning 2020-07-15 09:49:48 +02:00