Commit graph

10147 commits

Author SHA1 Message Date
Marko Mäkelä
2255649939 Merge 10.6 into 10.7 2021-09-17 20:23:17 +03:00
Marko Mäkelä
ae8c8d8874 Merge 10.5 into 10.6 2021-09-17 20:07:38 +03:00
Marko Mäkelä
699de65d5e Merge 10.4 into 10.5 2021-09-17 19:57:13 +03:00
Jan Lindström
d3b35598fc MDEV-26053 : TRUNCATE on table with Foreign Key Constraint no longer replicated to other nodes
Problem was that there was extra condition !thd->lex->no_write_to_binlog
before call to begin TOI. It seems that this variable is not initialized.
TRUNCATE does not support [NO_WRITE_TO_BINLOG | LOCAL] keywords, thus
we should not check this condition. All this was hidden in a macro,
so I decided to remove those macros that were used only a few places
with actual function calls.
2021-09-17 07:18:37 +03:00
Marko Mäkelä
03c09837fc Merge 10.5 into 10.6 2021-09-16 20:17:12 +03:00
Monty
0d47945b58 Fixed bug in aria_chk that overwrote sort_buffer_length
This bug happens when one runs aria_chk on multiple tables. It does not
affect REPAIR TABLE.
aria_chk tries to optimize the sort buffer size to minimize memory usage
when used with small tables. The bug was that the adjusted value was
used as a base for the next table, which could cause problems.
2021-09-15 21:21:03 +03:00
Monty
f03fee06b0 Improve error messages from Aria
- Error on commit now returns HA_ERR_COMMIT_ERROR instead of
  HA_ERR_INTERNAL_ERROR
- If checkpoint fails, it will now print out where it failed.
2021-09-15 19:27:34 +03:00
Marko Mäkelä
58aaa67064 Merge 10.6 into 10.7 2021-08-31 11:01:19 +03:00
Marko Mäkelä
e94172c2a0 Merge 10.5 into 10.6 2021-08-31 11:00:41 +03:00
Marko Mäkelä
e62120cec7 Merge 10.4 into 10.5 2021-08-31 10:04:56 +03:00
Marko Mäkelä
0464761126 Merge 10.3 into 10.4 2021-08-31 09:22:21 +03:00
Marko Mäkelä
e835cc851e Merge 10.2 into 10.3 2021-08-31 08:36:59 +03:00
Marko Mäkelä
fda704c82c Fix GCC 11 -Wmaybe-uninitialized for PLUGIN_PERFSCHEMA
init_mutex_v1_t: Stop lying that the mutex parameter is const.
GCC 11.2.0 assumes that it is and could complain about any mysql_mutex_t
being uninitialized even after mysql_mutex_init() as long as
PLUGIN_PERFSCHEMA is enabled.

init_rwlock_v1_t, init_cond_v1_t: Remove untruthful const qualifiers.

Note: init_socket_v1_t is expecting that the socket fd has already
been created before PSI_SOCKET_CALL(init_socket), and therefore that
parameter really is being treated as a pointer to const.
2021-08-30 11:52:59 +03:00
Sergei Golubchik
0299ec29d4 cleanup: MY_BITMAP mutex
in about a hundred of users of MY_BITMAP, only two were using its
built-in mutex, and only one of those two was actually needing it.

Remove the mutex from MY_BITMAP, remove all associated conditions
and checks in bitmap functions. Use an external LOCK_temp_pool
mutex and temp_pool_set_next/temp_pool_clear_bit acccessors.

Remove bitmap_init/bitmap_free, always use my_* versions.
2021-08-26 23:39:52 +02:00
Marko Mäkelä
c949772680 Merge 10.6 into 10.7 2021-08-16 08:24:00 +03:00
Vladislav Vaintroub
9ac1ac0061 Fix clang-cl warning 2021-08-11 11:26:35 +02:00
Vladislav Vaintroub
3cc23c9973 MDEV-25602 Eliminate the rest of __WIN__ in Connect 2021-08-05 11:32:41 +02:00
Marko Mäkelä
ddcb242b3c Merge 10.6 into 10.7 2021-08-04 11:52:39 +03:00
Oleksandr Byelkin
6efb5e9f5e Merge branch '10.5' into 10.6 2021-08-02 10:11:41 +02:00
Oleksandr Byelkin
ae6bdc6769 Merge branch '10.4' into 10.5 2021-07-31 23:19:51 +02:00
Oleksandr Byelkin
7841a7eb09 Merge branch '10.3' into 10.4 2021-07-31 22:59:58 +02:00
Marko Mäkelä
f50eb0d398 Merge 10.2 into 10.3 2021-07-27 10:47:17 +03:00
Marko Mäkelä
da094188f6 MDEV-24393 InnoDB disregards --skip-external-locking
On POSIX systems, InnoDB would unconditionally acquire advisory locks
on the files that it opens. On Linux, this would be observable by
a large number of entries in /proc/locks.

Other storage engines would only acquire advisory locks on files
based on the Boolean configuration parameter external_locking.

Let InnoDB do the same.

NOTE: The --skip-external-locking is activated by default. To have
InnoDB acquire advisory locks, --external-locking must be specified.

Reviewed by: Sergei Golubchik
2021-07-27 08:52:59 +03:00
Michael Okoko
6cd3588f0e Improve documentation of json parser functions
Signed-off-by: Michael Okoko <okokomichaels@outlook.com>
2021-07-22 21:51:49 +03:00
Eric Herman
105e4148bf Add json_normalize function to json_lib
This patch implements a library for normalizing json documents.
The algorithm is:
* Recursively sort json keys according to utf8mb4_bin collation.
* Normalize numbers to be of the form [-]<digit>.<frac>E<exponent>
* All unneeded whitespace and line endings are removed.
* Arrays are not sorted.

Co-authored-by: Vicențiu Ciorbaru <vicentiu@mariadb.org>
2021-07-21 16:32:11 +03:00
Eric Herman
7b587fcbe7 fix json typo s/UNINITALIZED/UNINITIALIZED/ 2021-07-21 16:32:11 +03:00
Monty
b1d81974b2 Added support to MEM_ROOT for write protected memory
This is useful for thing like Item_true and Item_false that we
allocated and initalize once and want to ensure that nothing can
change them

Main changes:
- Memory protection is achived by allocating memory with mmap() and
  protect it from write with mprotect()
- init_alloc_root(...,MY_ROOT_USE_MPROTECT) will create a
  memroot that one can later use with protect_root() to turn it
  read only or turn it back to read-write. All allocations to this
  memroot is done with mmap() to ensure page alligned allocations.
- alloc_root() code was rearranged to combine normal and valgrind code.
- init_alloc_root() now changes block size to be power of 2's, to get less
  memory fragmentation.
- Changed MEM_ROOT structure to make it smaller. Also renamed
  MEM_ROOT m_psi_key to psi_key.
- Moved MY_THREAD_SPECIFIC marker in MEM_ROOT from block size (old hack)
  to flags.
- Added global variable my_system_page_size. This is initialized at
  startup.
2021-07-18 19:59:35 +03:00
Marko Mäkelä
4b0070f642 MDEV-26029: Implement my_test_if_thinly_provisioned() for ScaleFlux
This is based on code that was contributed by Ning Zheng and Ray Kuan
from ScaleFlux.
2021-06-29 15:20:16 +03:00
Marko Mäkelä
30edd5549d MDEV-26029: Sparse files are inefficient on thinly provisioned storage
The MariaDB implementation of page_compressed tables for InnoDB used
sparse files. In the worst case, in the data file, every data page
will consist of some data followed by a hole. This may be extremely
inefficient in some file systems.

If the underlying storage device is thinly provisioned (can compress
data on the fly), it would be good to write regular files (with sequences
of NUL bytes at the end of each page_compressed block) and let the
storage device take care of compressing the data.

For reads, sparse file regions and regions containing NUL bytes will be
indistinguishable.

my_test_if_disable_punch_hole(): A new predicate for detecting thinly
provisioned storage. (Not implemented yet.)

innodb_atomic_writes: Correct the comment.

buf_flush_page(): Support all values of fil_node_t::punch_hole.
On a thinly provisioned storage device, we will always write
NUL-padded innodb_page_size bytes also for page_compressed tables.

buf_flush_freed_pages(): Remove a redundant condition.

fil_space_t::atomic_write_supported: Remove. (This was duplicating
fil_node_t::atomic_write.)

fil_space_t::punch_hole: Remove. (Duplicated fil_node_t::punch_hole.)

fil_node_t: Remove magic_n, and consolidate flags into bitfields.
For punch_hole we introduce a third value that indicates a
thinly provisioned storage device.

fil_node_t::find_metadata(): Detect all attributes of the file.
2021-06-29 15:18:22 +03:00
Marko Mäkelä
891a927e80 Merge 10.5 into 10.6 2021-06-26 11:53:28 +03:00
Marko Mäkelä
759deaa0a2 MDEV-26010 fixup: Use acquire/release memory order
In commit 5f22511e35 we depend on
Total Store Ordering. For correct operation on ISAs that implement
weaker memory ordering, we must explicitly use release/acquire stores
and loads on buf_page_t::oldest_modification_ to prevent a race condition
when buf_page_t::list does not happen to be on the same cache line.

buf_page_t::clear_oldest_modification(): Assert that the block is
not in buf_pool.flush_list, and use std::memory_order_release.

buf_page_t::oldest_modification_acquire(): Read oldest_modification_
with std::memory_order_acquire. In this way, if the return value is 0,
the caller may safely assume that it will not observe the buf_page_t
as being in buf_pool.flush_list, even if it is not holding
buf_pool.flush_list_mutex.

buf_flush_relocate_on_flush_list(), buf_LRU_free_page():
Invoke buf_page_t::oldest_modification_acquire().
2021-06-26 11:16:40 +03:00
Marko Mäkelä
4dfec8b230 Merge 10.5 into 10.6 2021-06-21 17:49:33 +03:00
Marko Mäkelä
a42c80bd48 Merge 10.4 into 10.5 2021-06-21 14:22:22 +03:00
Monty
af33202af7 Added DDL_options_st *thd_ddl_options(const MYSQL_THD thd)
This is used by InnoDB to detect if CREATE...SELECT is used

Other things:
- Changed InnoDB to use thd_ddl_options()
- Removed lock checking code for create...select (Approved by Marko)
2021-06-14 17:03:19 +03:00
Marko Mäkelä
1bd681c8b3 MDEV-25506 (3 of 3): Do not delete .ibd files before commit
This is a complete rewrite of DROP TABLE, also as part of other DDL,
such as ALTER TABLE, CREATE TABLE...SELECT, TRUNCATE TABLE.

The background DROP TABLE queue hack is removed.
If a transaction needs to drop and create a table by the same name
(like TRUNCATE TABLE does), it must first rename the table to an
internal #sql-ib name. No committed version of the data dictionary
will include any #sql-ib tables, because whenever a transaction
renames a table to a #sql-ib name, it will also drop that table.
Either the rename will be rolled back, or the drop will be committed.

Data files will be unlinked after the transaction has been committed
and a FILE_RENAME record has been durably written. The file will
actually be deleted when the detached file handle returned by
fil_delete_tablespace() will be closed, after the latches have been
released. It is possible that a purge of the delete of the SYS_INDEXES
record for the clustered index will execute fil_delete_tablespace()
concurrently with the DDL transaction. In that case, the thread that
arrives later will wait for the other thread to finish.

HTON_TRUNCATE_REQUIRES_EXCLUSIVE_USE: A new handler flag.
ha_innobase::truncate() now requires that all other references to
the table be released in advance. This was implemented by Monty.

ha_innobase::delete_table(): If CREATE TABLE..SELECT is detected,
we will "hijack" the current transaction, drop the table in
the current transaction and commit the current transaction.
This essentially fixes MDEV-21602. There is a FIXME comment about
making the check less failure-prone.

ha_innobase::truncate(), ha_innobase::delete_table():
Implement a fast path for temporary tables. We will no longer allow
temporary tables to use the adaptive hash index.

dict_table_t::mdl_name: The original table name for the purpose of
acquiring MDL in purge, to prevent a race condition between a
DDL transaction that is dropping a table, and purge processing
undo log records of DML that had executed before the DDL operation.
For #sql-backup- tables during ALTER TABLE...ALGORITHM=COPY, the
dict_table_t::mdl_name will differ from dict_table_t::name.

dict_table_t::parse_name(): Use mdl_name instead of name.

dict_table_rename_in_cache(): Update mdl_name.

For the internal FTS_ tables of FULLTEXT INDEX, purge would
acquire MDL on the FTS_ table name, but not on the main table,
and therefore it would be able to run concurrently with a
DDL transaction that is dropping the table. Previously, the
DROP TABLE queue hack prevented a race between purge and DDL.
For now, we introduce purge_sys.stop_FTS() to prevent purge from
opening any table, while a DDL transaction that may drop FTS_
tables is in progress. The function fts_lock_table(), which will
be invoked before the dictionary is locked, will wait for
purge to release any table handles.

trx_t::drop_table_statistics(): Drop statistics for the table.
This replaces dict_stats_drop_index(). We will drop or rename
persistent statistics atomically as part of DDL transactions.
On lock conflict for dropping statistics, we will fail instantly
with DB_LOCK_WAIT_TIMEOUT, because we will be holding the
exclusive data dictionary latch.

trx_t::commit_cleanup(): Separated from trx_t::commit_in_memory().
Relax an assertion around fts_commit() and allow DB_LOCK_WAIT_TIMEOUT
in addition to DB_DUPLICATE_KEY. The call to fts_commit() is
entirely misplaced here and may obviously break the consistency
of transactions that affect FULLTEXT INDEX. It needs to be fixed
separately.

dict_table_t::n_foreign_key_checks_running: Remove (MDEV-21175).
The counter was a work-around for missing meta-data locking (MDL)
on the SQL layer, and not really needed in MariaDB.

ER_TABLE_IN_FK_CHECK: Replaced with ER_UNUSED_28.

HA_ERR_TABLE_IN_FK_CHECK: Remove.

row_ins_check_foreign_constraints(): Do not acquire
dict_sys.latch either. The SQL-layer MDL will protect us.

This was reviewed by Thirunarayanan Balathandayuthapani
and tested by Matthias Leich.
2021-06-09 17:06:07 +03:00
Vladislav Vaintroub
b81803f065 MDEV-22221: MariaDB with WolfSSL doesn't support AES-GCM cipher for SSL
Enable AES-GCM for SSL (only).

AES-GCM for encryption plugins remains disabled (aes-t fails, on some bug
in GCM or CTR padding)
2021-06-09 15:44:55 +02:00
Vladislav Vaintroub
5ba4c4200c MDEV-25870 Windows - fix ARM64 cross-compilation 2021-06-07 23:15:36 +02:00
Vladislav Vaintroub
3d6eb7afcf MDEV-25602 get rid of __WIN__ in favor of standard _WIN32
This fixed the MySQL bug# 20338 about misuse of double underscore
prefix __WIN__, which was old MySQL's idea of identifying Windows
Replace it by _WIN32 standard symbol for targeting Windows OS
(both 32 and 64 bit)

Not that connect storage engine is not fixed in this patch (must be
fixed in "upstream" branch)
2021-06-06 13:21:03 +02:00
Krunal Bauskar
2a4f72b7d2 MDEV-25807: ARM build failure due to missing ISB instruction on ARMv6
Debian has support for 3 different arm machine and probably one which is
failing is a pretty old version which doesn't support the said instruction.

So we can make it specific to _aarch64_ (as per the arm official reference
manual ISB should be defined by all ARMv8 processors). [as per the ARMv7 specs
even 32 bits processor should support it but not sure which exact version
Debian has under armel].

In either case, the said performance issue will have an impact mainly with a
high-end processors with extreme parallelism so it safe to limit it to _aarch64_.

Corrects: MDEV-24630
2021-06-01 13:51:39 +10:00
Georg Richter
be8e51c459 MDEV-25511: Command line tools don't support CRL parameters
Enable CRL support for GnuTLS (which was implemented by CONC-433).
2021-05-31 08:29:37 +02:00
Sergei Golubchik
7700626fe6 remove thread_pool_priv.h 2021-05-19 22:54:14 +02:00
Monty
79d9a725f9 Added checking to protect against simultaneous double free in safemalloc
If two threads would call sf_free() / free_memory() at the same time,
bad_ptr() would not detect this. Fixed by adding extra detection
when working with the memory region under sf_mutex.

Other things:
- If safe_malloc crashes while mutex is hold, stack trace printing will
  hang because we malloc is called by my_open(), which is used by stack
  trace printing code. Fixed by adding MY_NO_REGISTER flag to my_open,
  which will disable the malloc() call to remmeber the file name.
2021-05-19 22:54:14 +02:00
Monty
b8c3159594 MDEV-24285 support oracle build-in function: sys_guid
SYS_GUID() returns same as UUID(), but without any '-'

author: woqutech
2021-05-19 22:54:11 +02:00
Monty
a206658b98 Change CHARSET_INFO character set and collaction names to LEX_CSTRING
This change removed 68 explict strlen() calls from the code.

The following renames was done to ensure we don't use the old names
when merging code from earlier releases, as using the new variables
for print function could result in crashes:
- charset->csname renamed to charset->cs_name
- charset->name renamed to charset->coll_name

Almost everything where mechanical changes except:
- Changed to use the new Protocol::store(LEX_CSTRING..) when possible
- Changed to use field->store(LEX_CSTRING*, CHARSET_INFO*) when possible
- Changed to use String->append(LEX_CSTRING&) when possible

Other things:
- There where compiler issues with ensuring that all character set names
  points to the same string: gcc doesn't allow one to use integer constants
  when defining global structures (constant char * pointers works fine).
  To get around this, I declared defines for each character set name
  length.
2021-05-19 22:54:07 +02:00
Monty
b6ff139aa3 Reduce usage of strlen()
Changes:
- To detect automatic strlen() I removed the methods in String that
  uses 'const char *' without a length:
  - String::append(const char*)
  - Binary_string(const char *str)
  - String(const char *str, CHARSET_INFO *cs)
  - append_for_single_quote(const char *)
  All usage of append(const char*) is changed to either use
  String::append(char), String::append(const char*, size_t length) or
  String::append(LEX_CSTRING)
- Added STRING_WITH_LEN() around constant string arguments to
  String::append()
- Added overflow argument to escape_string_for_mysql() and
  escape_quotes_for_mysql() instead of returning (size_t) -1 on overflow.
  This was needed as most usage of the above functions never tested the
  result for -1 and would have given wrong results or crashes in case
  of overflows.
- Added Item_func_or_sum::func_name_cstring(), which returns LEX_CSTRING.
  Changed all Item_func::func_name()'s to func_name_cstring()'s.
  The old Item_func_or_sum::func_name() is now an inline function that
  returns func_name_cstring().str.
- Changed Item::mode_name() and Item::func_name_ext() to return
  LEX_CSTRING.
- Changed for some functions the name argument from const char * to
  to const LEX_CSTRING &:
  - Item::Item_func_fix_attributes()
  - Item::check_type_...()
  - Type_std_attributes::agg_item_collations()
  - Type_std_attributes::agg_item_set_converter()
  - Type_std_attributes::agg_arg_charsets...()
  - Type_handler_hybrid_field_type::aggregate_for_result()
  - Type_handler_geometry::check_type_geom_or_binary()
  - Type_handler::Item_func_or_sum_illegal_param()
  - Predicant_to_list_comparator::add_value_skip_null()
  - Predicant_to_list_comparator::add_value()
  - cmp_item_row::prepare_comparators()
  - cmp_item_row::aggregate_row_elements_for_comparison()
  - Cursor_ref::print_func()
- Removes String_space() as it was only used in one cases and that
  could be simplified to not use String_space(), thanks to the fixed
  my_vsnprintf().
- Added some const LEX_CSTRING's for common strings:
  - NULL_clex_str, DATA_clex_str, INDEX_clex_str.
- Changed primary_key_name to a LEX_CSTRING
- Renamed String::set_quick() to String::set_buffer_if_not_allocated() to
  clarify what the function really does.
- Rename of protocol function:
  bool store(const char *from, CHARSET_INFO *cs) to
  bool store_string_or_null(const char *from, CHARSET_INFO *cs).
  This was done to both clarify the difference between this 'store' function
  and also to make it easier to find unoptimal usage of store() calls.
- Added Protocol::store(const LEX_CSTRING*, CHARSET_INFO*)
- Changed some 'const char*' arrays to instead be of type LEX_CSTRING.
- class Item_func_units now used LEX_CSTRING for name.

Other things:
- Fixed a bug in mysql.cc:construct_prompt() where a wrong escape character
  in the prompt would cause some part of the prompt to be duplicated.
- Fixed a lot of instances where the length of the argument to
  append is known or easily obtain but was not used.
- Removed some not needed 'virtual' definition for functions that was
  inherited from the parent. I added override to these.
- Fixed Ordered_key::print() to preallocate needed buffer. Old code could
  case memory overruns.
- Simplified some loops when adding char * to a String with delimiters.
2021-05-19 22:27:48 +02:00
Monty
da85ad7987 Optimize Sql_alloc
- Remove 'dummy_for_valgrind' overrun marker as this doesn't help much.
  The element also distorts the sizes of objects a bit, which makes it
  harder to calculate gain in object sizes when doing size optimizations.
- Replace usage of thd_get_current_thd() with _current_thd()
- Avoid one extra call indirection when using thd_get_current_thd(), which
  is used by Sql_alloc, by replacing it with _current_thd()
2021-05-19 22:27:27 +02:00
Monty
fa7d4abf16 Added typedef decimal_digits_t (uint16) for number of digits in most
aspects of decimals and integers

For fields and Item's uint8 should be good enough. After
discussions with Alexander Barkov we choose uint16 (for now)
as some format functions may accept +256 digits.

The reason for this patch was to make the usage and storage of decimal
digits simlar. Before this patch decimals was stored/used as uint8,
int and uint.  The lengths for numbers where also using a lot of
different types.

Changed most decimal variables and functions to use the new typedef.

squash! af7f09106b6c1dc20ae8c480bff6fd22d266b184

Use decimal_digits_t for all aspects of digits (total, precision
and scale), both for decimals and integers.
2021-05-19 22:27:27 +02:00
Rucha Deodhar
2fdb556e04 MDEV-8334: Rename utf8 to utf8mb3
This patch changes the main name of 3 byte character set from utf8 to
utf8mb3. New old_mode UTF8_IS_UTF8MB3 is added and set TRUE by default,
so that utf8 would mean utf8mb3. If not set, utf8 would mean utf8mb4.
2021-05-19 06:48:36 +02:00
Marko Mäkelä
c290c0d7e0 Simplify a preprocessor condition
Compilers older than GCC 4.8 are not supported anyway.
2021-05-17 18:10:11 +03:00
Jan Lindström
bee1bb056d MDEV-9609 : wsrep_debug only logs DDL information on originating node
Added DDL logging to applier and replaying also so that
DDL is logged on other than originating node.

wsrep.h
	Removed wsrep_thd_is_local conditions and cleaned up
	the macros. Removed WSREP_TO_ISOLATION_END.

Event_job_data::execute
change_password
acl_set_default_role
mysql_execute_command
	Replaced macro by function call

wsrep_to_isolation_begin
wsrep_to_isolation_end
	If execution is not local log DDL-information when
	wsrep_debug is enabled

No new tests required as current regression setting is
already testing these code paths.
2021-05-15 13:24:22 +03:00
Sergei Golubchik
0b116d160a 5.7.34 2021-05-03 11:22:07 +02:00
Marko Mäkelä
4930f9c94b Merge 10.5 into 10.6 2021-04-21 11:45:00 +03:00
Marko Mäkelä
80ed136e6d Merge 10.4 into 10.5 2021-04-21 09:01:01 +03:00
Oleksandr Byelkin
a3099a3b4a MDEV-24312 master_host has 60 character limit, increase to 255 bytes
Also increase user name up to 128.

The work was started by Rucha Deodhar <rucha.deodhar@mariadb.com>,
contains audit plugin fixes by Alexey Botchkov <holyfoot@askmonty.org>.
2021-04-20 16:36:56 +02:00
Monty
031f11717d Fix all warnings given by UBSAN
The easiest way to compile and test the server with UBSAN is to run:
./BUILD/compile-pentium64-ubsan
and then run mysql-test-run.
After this commit, one should be able to run this without any UBSAN
warnings. There is still a few compiler warnings that should be fixed
at some point, but these do not expose any real bugs.

The 'special' cases where we disable, suppress or circumvent UBSAN are:
- ref10 source (as here we intentionally do some shifts that UBSAN
  complains about.
- x86 version of optimized int#korr() methods. UBSAN do not like unaligned
  memory access of integers.  Fixed by using byte_order_generic.h when
  compiling with UBSAN
- We use smaller thread stack with ASAN and UBSAN, which forced me to
  disable a few tests that prints the thread stack size.
- Verifying class types does not work for shared libraries. I added
  suppression in mysql-test-run.pl for this case.
- Added '#ifdef WITH_UBSAN' when using integer arithmetic where it is
  safe to have overflows (two cases, in item_func.cc).

Things fixed:
- Don't left shift signed values
  (byte_order_generic.h, mysqltest.c, item_sum.cc and many more)
- Don't assign not non existing values to enum variables.
- Ensure that bool and enum values are properly initialized in
  constructors.  This was needed as UBSAN checks that these types has
  correct values when one copies an object.
  (gcalc_tools.h, ha_partition.cc, item_sum.cc, partition_element.h ...)
- Ensure we do not called handler functions on unallocated objects or
  deleted objects.
  (events.cc, sql_acl.cc).
- Fixed bugs in Item_sp::Item_sp() where we did not call constructor
  on Query_arena object.
- Fixed several cast of objects to an incompatible class!
  (Item.cc, Item_buff.cc, item_timefunc.cc, opt_subselect.cc, sql_acl.cc,
   sql_select.cc ...)
- Ensure we do not do integer arithmetic that causes over or underflows.
  This includes also ++ and -- of integers.
  (Item_func.cc, Item_strfunc.cc, item_timefunc.cc, sql_base.cc ...)
- Added JSON_VALUE_UNITIALIZED to json_value_types and ensure that
  value_type is initialized to this instead of to -1, which is not a valid
  enum value for json_value_types.
- Ensure we do not call memcpy() when second argument could be null.
- Fixed that Item_func_str::make_empty_result() creates an empty string
  instead of a null string (safer as it ensures we do not do arithmetic
  on null strings).

Other things:

- Changed struct st_position to an OBJECT and added an initialization
  function to it to ensure that we do not copy or use uninitialized
  members. The change to a class was also motived that we used "struct
  st_position" and POSITION randomly trough the code which was
  confusing.
- Notably big rewrite in sql_acl.cc to avoid using deleted objects.
- Changed in sql_partition to use '^' instead of '-'. This is safe as
  the operator is either 0 or 0x8000000000000000ULL.
- Added check for select_nr < INT_MAX in JOIN::build_explain() to
  avoid bug when get_select() could return NULL.
- Reordered elements in POSITION for better alignment.
- Changed sql_test.cc::print_plan() to use pointers instead of objects.
- Fixed bug in find_set() where could could execute '1 << -1'.
- Added variable have_sanitizer, used by mtr.  (This variable was before
  only in 10.5 and up).  It can now have one of two values:
  ASAN or UBSAN.
- Moved ~Archive_share() from ha_archive.cc to ha_archive.h and marked
  it virtual. This was an effort to get UBSAN to work with loaded storage
  engines. I kept the change as the new place is better.
- Added in CONNECT engine COLBLK::SetName(), to get around a wrong cast
  in tabutil.cpp.
- Added HAVE_REPLICATION around usage of rgi_slave, to get embedded
  server to compile with UBSAN. (Patch from Marko).
- Added #ifdef for powerpc64 to avoid a bug in old gcc versions related
  to integer arithmetic.

Changes that should not be needed but had to be done to suppress warnings
from UBSAN:

- Added static_cast<<uint16_t>> around shift to get rid of a LOT of
  compiler warnings when using UBSAN.
- Had to change some '/' of 2 base integers to shift to get rid of
  some compile time warnings.

Reviewed by:
- Json changes: Alexey Botchkov
- Charset changes in ctype-uca.c: Alexander Barkov
- InnoDB changes & Embedded server: Marko Mäkelä
- sql_acl.cc changes: Vicențiu Ciorbaru
- build_explain() changes: Sergey Petrunia
2021-04-20 12:30:09 +03:00
Marko Mäkelä
1900c2ede5 Merge 10.5 into 10.6 2021-04-08 10:11:36 +03:00
Daniel Black
553ef1a78b MDEV-13115: Implement SELECT SKIP LOCKED
Adds an implementation for SELECT ... FOR UPDATE SKIP LOCKED /
SELECT ... LOCK IN SHARED MODE SKIP LOCKED

This is implemented only InnoDB at the moment, not in RockDB yet.

This adds a new hander flag HA_CAN_SKIP_LOCKED than
will be used when the storage engine advertises the flag.

When a storage engine indicates this flag it will get
TL_WRITE_SKIP_LOCKED and TL_READ_SKIP_LOCKED transaction types.

The Lex structure has been updated to store both the FOR UPDATE/LOCK IN
SHARE as well as the SKIP LOCKED so the SHOW CREATE VIEW
implementation is simplier.

"SELECT FOR UPDATE ... SKIP LOCKED" combined with CREATE TABLE AS or
INSERT.. SELECT on the result set is not safe for STATEMENT based
replication. MIXED replication will replicate this as row based events."

Thanks to guidance from Facebook commit
193896c466
This helped verify basic test case, and components that need implementing
(even though every part was implemented differently).

Thanks Marko for guidance on simplier InnoDB implementation.

Reviewers: Marko, Monty
2021-04-08 16:51:36 +10:00
Daniel Black
058484687a Add TL_FIRST_WRITE in SQL layer for determining R/W
Use < TL_FIRST_WRITE for determining a READ transaction.

Use TL_FIRST_WRITE as the relative operator replacing TL_WRITE_ALLOW_WRITE
as the minimium WRITE lock type.
2021-04-08 16:51:36 +10:00
Marko Mäkelä
7b48da4d7e Merge 10.4 into 10.5 2021-04-08 07:47:49 +03:00
Daniel Black
f69c1c9dcb MDEV-19508: SI_KERNEL is not on all implementations
SI_USER is, however in FreeBSD there are a couple of non-kernel
user signal infomations above SI_KERNEL.

Put a fallback just in case there is nothing available.
2021-04-07 14:01:56 +10:00
Monty
81258f1432 MDEV-17913 Encrypted transactional Aria tables remain corrupt after crash recovery, automatic repairment does not work
This was because of a wrong test in encryption code that wrote random
numbers over the LSN for pages for transactional Aria tables during repair.
The effect was that after an ALTER TABLE ENABLE KEYS of a encrypted
recovery of the tables would not work.

Fixed by changing testing of !share->now_transactional to
!share->base.born_transactional.

Other things:
- Extended Aria check_table() to check for wrong (= too big) LSN numbers.
- If check_table() failed just because of wrong LSN or TRN numbers,
  a following repair table will just do a zerofill which is much faster.
- Limit number of LSN errors in one check table to MAX_LSN_ERROR (10).
- Removed old obsolete test of 'if (error_count & 2)'. Changed error_count
  and warning_count from bits to numbers of errors/warnings as this is
  more useful.
2021-04-06 14:57:22 +03:00
Marko Mäkelä
75db05c599 Merge 10.5 into 10.6 (except MDEV-24630)
Commit 76d2846a71 was for 10.5 only.
It caused some performance regression on 10.6 in some cases,
likely related to the removal of ib_mutex_t in MDEV-21452.
2021-03-30 13:28:05 +03:00
Krunal Bauskar
76d2846a71 MDEV-24630: MY_RELAX_CPU assembly instruction upgrade/research for
memory barrier on ARM

As suggested in the said JIRA ticket based on the contribution done by
the community (in an attempt to optimize the spin-loop) the said approach
was evaluated against MariaDB Server 10.5 and found to help improve
throughput in the range of 2-5%.

Note: 10.6 timing graph and model are different as home-brew
mutexes are replaced with pthread mutexes. Said patch has mixed
impact on 10.6 so not recommended for 10.6.
2021-03-30 13:26:19 +03:00
Marko Mäkelä
8c2e3259c1 MDEV-24302 follow-up: RESET MASTER hangs
As pointed out by Andrei Elkin, the previous fix did not fix one
race condition that may have caused the observed hang.

innodb_log_flush_request(): If we are enqueueing the very first
request at the same time the log write is being completed,
we must ensure that a near-concurrent call to log_flush_notify()
will not result in a missed notification. We guarantee this by
release-acquire operations on log_requests.start and
log_sys.flushed_to_disk_lsn.

log_flush_notify_and_unlock(): Cleanup: Always release the mutex.

log_sys_t::get_flushed_lsn(): Use acquire memory order.

log_sys_t::set_flushed_lsn(): Use release memory order.

log_sys_t::set_lsn(): Use release memory order.

log_sys_t::get_lsn(): Use relaxed memory order by default, and
allow the caller to specify acquire memory order explicitly.
Whenever the log_sys.mutex is being held or when log writes are
prohibited during startup, we can use a relaxed load. Likewise,
in some assertions where reading a stale value of log_sys.lsn
should not matter, we can use a relaxed load.

This will cause some additional instructions to be emitted on
architectures that do not implement Total Store Ordering (TSO),
such as POWER, ARM, and RISC-V Weak Memory Ordering (RVWMO).
2021-03-30 10:29:11 +03:00
Krunal Bauskar
0f6f72965b MDEV-25281: Switch to use non-atomic (vs atomic) distributed counter to
track page-access counter

As part of MDEV-21212, n_page_gets that is meant to track page access,
is ported to use distributed counter that default uses atomic sub-counters.

n_page_gets originally was a non-atomic counter that represented an approximate
value of pages tracked. Using the said analogy it doesn't need to be
an atomic distributed counter.

This patch introduces an interface that allows distributed counter to be
used with atomic and non-atomic sub-counter (through template) and also
port n_page_gets to use non-atomic distributed counter using the said
updated interface.
2021-03-29 16:15:28 +03:00
Daniel Black
460d480c74 MDEV-5536: add systemd socket activation
Systemd has a socket activation feature where a mariadb.socket
definition defines the sockets to listen to, and passes those
file descriptors directly to mariadbd to use when a connection
occurs.

The new functionality is utilized when starting as follows:

  systemctl start mariadb.socket

The mariadb.socket definition only needs to contain the network
information, ListenStream= directives, the mariadb.service
definition is still used for service instigation.

When mariadbd is started in this way, the socket, port, bind-address
backlog are all assumed to be self contained in the mariadb.socket
definition and as such the mariadb settings and command line
arguments of these network settings are ignored.
See man systemd.socket for how to limit this to specific ports.

Extra ports, those specified with extra_port in socket activation
mode, are those with a FileDescriptorName=extra. These need
to be in a separate service name like mariadb-extra.socket and
these require a Service={mariadb.service} directive to map to the
original service. Extra ports need systemd v227 or greater
(not RHEL/Centos7 - v219) when FileDescriptorName= was added,
otherwise the extra ports are treated like ordinary ports.

The number of sockets isn't limited when using systemd socket activation
(except by operating system limits on file descriptors and a minimal
amount of memory used per file descriptor). The systemd sockets passed
can include any ownership or permissions, including those the
mariadbd process wouldn't normally have the permission to create.

This implementation is compatible with mariadb.service definitions.
Those services started with:

  systemctl start mariadb.service

does actually start the mariadb.service and used all the my.cnf
settings of sockets and ports like it previously did.
2021-03-28 13:53:55 +11:00
Marko Mäkelä
356c149603 Merge 10.5 into 10.6 2021-03-26 11:50:32 +02:00
Daniel Black
bcb9ca4105 MEM_CHECK_DEFINED: replace HAVE_valgrind
HAVE_valgrind_or_MSAN to HAVE_valgrind was incorrect in
af784385b4.

In my_valgrind.h when clang exists (hence no __has_feature(memory_sanitizer),
and -DWITH_VALGRIND=1, but without memcheck.h, we end up with a MEM_CHECK_DEFINED
being empty.

If we are also doing a CMAKE_BUILD_TYPE=Debug this results a number of
[-Werror,-Wunused-variable] errors because MEM_CHECK_DEFINED is empty.
With MEM_CHECK_DEFINED empty, there becomes no uses of this of the
fixed field and innodb variables in this patch.

So we stop using HAVE_valgrind as catchall and use the name
HAVE_CHECK_MEM to indicate that a CHECK_MEM_DEFINED function exists.

Reviewer: Monty

Corrects: af784385b4
2021-03-26 07:58:49 +11:00
Eugene Kosov
1274bb7729 MDEV-25223 change fil_system_t::space_list and fil_system_t::named_spaces from UT_LIST to ilist
Mostly a refactoring. Also, debug functions were added for ease of life while debugging
2021-03-24 15:15:18 +03:00
Marko Mäkelä
00528a0445 Merge 10.5 into 10.6 2021-03-19 13:35:18 +02:00
Marko Mäkelä
be881ec457 Merge 10.4 into 10.5 2021-03-19 13:09:21 +02:00
Marko Mäkelä
44d70c01f0 Merge 10.3 into 10.4 2021-03-19 11:42:44 +02:00
Marko Mäkelä
19052b6deb Merge 10.2 into 10.3 2021-03-18 12:34:48 +02:00
Marko Mäkelä
f87a944c79 Merge 10.5 into 10.6 2021-03-16 15:51:26 +02:00
Vladislav Vaintroub
031b3dfc22 MDEV-25123 support MSVC ASAN 2021-03-12 08:44:55 +01:00
Sergey Zolotarev
d317350a76 Remove trailing newline from error message in dlerror() on Windows
1. Add FORMAT_MESSAGE_MAX_WIDTH_MASK flag to remove trailing \r\n from
   the error message returned by FormatMessageA().
2. Also, add FORMAT_MESSAGE_IGNORE_INSERTS to avoid potential problems
   with system messages that have argument placeholders.

Quote from FormatMessage docs on MSDN:

If this function is called without FORMAT_MESSAGE_IGNORE_INSERTS, the
Arguments parameter must contain enough parameters to satisfy all insertion
sequences in the message string, and they must be of the correct type.
Therefore, do not use untrusted or unknown message strings with inserts
enabled because they can contain more insertion sequences than Arguments
provides, or those that may be of the wrong type. In particular, it is
unsafe to take an arbitrary system error code returned from an API and use
FORMAT_MESSAGE_FROM_SYSTEM without FORMAT_MESSAGE_IGNORE_INSERTS.
2021-03-08 18:23:42 +01:00
Sergei Golubchik
2c0b3141f3 update wsrep service version after 7345d37141 2021-03-08 14:54:05 +01:00
Julius Goryavsky
7345d37141 MDEV-24853: Duplicate key generated during cluster configuration change
Incorrect processing of an auto-incrementing field in the
WSREP-related code during applying transactions results in
a duplicate key being created. This is due to the fact that
at the beginning of the write_row() and update_row() functions,
the values of the auto-increment parameters are used, which
are read from the parameters of the current thread, but further
along the code other values are used, which are read from global
variables (when applying a transaction). This can happen when
the cluster configuration has changed while applying a transaction
(for example in the high_priority_service mode for Galera 4).
Further during IST processing duplicating key is detected, and
processing of the DB_DUPLICATE_KEY return code (inside innodb,
in the write_row() handler) results in a call to the
wsrep_thd_self_abort() function.
2021-03-08 11:15:08 +01:00
Marko Mäkelä
03ff588d15 Merge 10.5 into 10.6 2021-03-05 16:05:47 +02:00
Marko Mäkelä
10d544aa7b Merge 10.4 into 10.5 2021-03-05 12:54:43 +02:00
Marko Mäkelä
fcc9f8b10c Remove unused HA_EXTRA_FAKE_START_STMT
This is fixup for commit f06a0b5338.
2021-03-05 10:40:16 +02:00
Rinat Ibragimov
b3abcf80a1 MDEV-6536: make --bind=hostname to listen on both IPv6 and IPv4 addresses
Binding to a hostname now makes MariaDB server to listen on all addresses
that hostname resolves to.

Rebased to 10.6 by Daniel Black

Closes: #1668
2021-03-05 08:25:52 +11:00
Marko Mäkelä
80ac9ec1cc MDEV-24973 Performance schema duplicates rarely executed code for mutex operations
The PERFORMANCE_SCHEMA wrapper for mutex and rw-lock operations is
causing a lot of unlikely code to be inlined in each invocation.
The impact of this may have been emphasized in MariaDB 10.6, because
InnoDB now uses the common implementation of mutexes and condition
variables (MDEV-21452).

By default, we build with cmake -DPLUGIN_PERFSCHEMA enabled,
but at runtime no instrumentation will be enabled. Similar to
commit eba2d10ac5
we had better avoid inlining the rarely executed code in order to reduce
the code size and to improve the efficiency of the instruction cache.

This change was extensively tested by Axel Schwenke with and without
--enable-performance-schema (with no individual instruments enabled).
Removing the inline functions did not cause any performance regression
in either case. There seemed to be a tiny improvement, possibly due
to reduced code size and better instruction cache hit rate.
2021-03-02 14:32:37 +02:00
Marko Mäkelä
b47304eb02 Merge 10.5 into 10.6 2021-02-26 15:02:13 +02:00
Daniel Black
e0ba68ba34 MDEV-23510: arm64 lf_hash alignment of pointers
Like the 10.2 version 1635686b50,
except C++ on internal functions for my_assume_aligned.

volatile != atomic.

volatile has no memory barrier schemantics, its for mmaped IO
so lets allow some optimizer gains and stop pretending it helps
with memory atomicity.

The MDEV lists a SEGV an assumption is made that an address was
partially read. As C packs structs strictly in order and on arm64 the
cache line size is 128 bits. A pointer (link - 64 bits), followed
by a hashnr (uint32 - 32 bits), leaves the following key (uchar *
64 bits), neither naturally aligned to any pointer and worse, split
across a cache line which is the processors view of an atomic
reservation of memory.

lf_dynarray_lvalue is assumed to return a 64 bit aligned address.

As a solution move the 32bit hashnr to the end so we don't get the
*key pointer split across two cache lines.

Tested by: Krunal Bauskar
Reviewer: Marko Mäkelä
2021-02-25 10:06:15 +11:00
Marko Mäkelä
94b4578704 Merge 10.5 into 10.6 2021-02-17 19:39:05 +02:00
Sergei Golubchik
25d9d2e37f Merge branch 'bb-10.4-release' into bb-10.5-release 2021-02-15 16:43:15 +01:00
Sergei Golubchik
00a313ecf3 Merge branch 'bb-10.3-release' into bb-10.4-release
Note, the fix for "MDEV-23328 Server hang due to Galera lock conflict resolution"
was null-merged. 10.4 version of the fix is coming up separately
2021-02-12 17:44:22 +01:00
Marko Mäkelä
b19ec8848c Merge 10.5 into 10.6 2021-02-11 09:26:53 +02:00
Monty
bd5ac03896 Make maria_data_root const char*
This allow one to remove some casts like:
maria_data_root= (char *)".";

It also removes warnings from icc.
2021-02-08 12:16:29 +02:00
Monty
5d6ad2ad66 Added 'const' to arguments in get_one_option and find_typeset()
One should not change the program arguments!
This change also reduces warnings from the icc compiler.

Almost all changes are just syntax changes (adding const to
'get_one_option function' declarations).

Other changes:
- Added a few cast of 'argument' from 'const char*' to 'char *'. This
  was mainly in calls to 'external' functions we don't have control of.
- Ensure that all reset of 'password command line argument' are similar.
  (In almost all cases it was just adding a comment and a cast)
- In mysqlbinlog.cc and mysqld.cc there was a few cases that changed
  the command line argument. These places where changed to instead allocate
  the option in a MEM_ROOT to avoid changing the argument. Some of this
  code was changed to ensure that different programs did parsing the
  same way. Added a test case for the changes in mysqlbinlog.cc
- Changed a few variables that took their value from command line options
  from 'char *' to 'const char *'.
2021-02-08 12:16:29 +02:00
Marko Mäkelä
1110beccd4 Merge 10.5 into 10.6 2021-02-02 15:15:53 +02:00
Sergei Golubchik
2676c9aad7 galera fixes related to THD::LOCK_thd_kill
Since 2017 (c2118a08b1) THD::awake() no longer requires LOCK_thd_data.
It uses LOCK_thd_kill, and this latter mutex is used to prevent
a thread of dying, not LOCK_thd_data as before.
2021-02-02 10:02:17 +01:00
Sergei Golubchik
60ea09eae6 Merge branch '10.2' into 10.3 2021-02-01 13:49:33 +01:00
David CARLIER
b1241585b2
Mac M1 build support proposal (minus rocksdb option) (#1743) 2021-01-30 17:04:27 +02:00
FX Coudert
52795c2f78 Apple Silicon is a 64-bit platform 2021-01-30 16:35:46 +02:00
Marko Mäkelä
3cef4f8f0f MDEV-515 Reduce InnoDB undo logging for insert into empty table
We implement an idea that was suggested by Michael 'Monty' Widenius
in October 2017: When InnoDB is inserting into an empty table or partition,
we can write a single undo log record TRX_UNDO_EMPTY, which will cause
ROLLBACK to clear the table.

For this to work, the insert into an empty table or partition must be
covered by an exclusive table lock that will be held until the transaction
has been committed or rolled back, or the INSERT operation has been
rolled back (and the table is empty again), in lock_table_x_unlock().

Clustered index records that are covered by the TRX_UNDO_EMPTY record
will carry DB_TRX_ID=0 and DB_ROLL_PTR=1<<55, and thus they cannot
be distinguished from what MDEV-12288 leaves behind after purging the
history of row-logged operations.

Concurrent non-locking reads must be adjusted: If the read view was
created before the INSERT into an empty table, then we must continue
to imagine that the table is empty, and not try to read any records.
If the read view was created after the INSERT was committed, then
all records must be visible normally. To implement this, we introduce
the field dict_table_t::bulk_trx_id.

This special handling only applies to the very first INSERT statement
of a transaction for the empty table or partition. If a subsequent
statement in the transaction is modifying the initially empty table again,
we must enable row-level undo logging, so that we will be able to
roll back to the start of the statement in case of an error (such as
duplicate key).

INSERT IGNORE will continue to use row-level logging and locking, because
implementing it would require the ability to roll back the latest row.
Since the undo log that we write only allows us to roll back the entire
statement, we cannot support INSERT IGNORE. We will introduce a
handler::extra() parameter HA_EXTRA_IGNORE_INSERT to indicate to storage
engines that INSERT IGNORE is being executed.

In many test cases, we add an extra record to the table, so that during
the 'interesting' part of the test, row-level locking and logging will
be used.

Replicas will continue to use row-level logging and locking until
MDEV-24622 has been addressed. Likewise, this optimization will be
disabled in Galera cluster until MDEV-24623 enables it.

dict_table_t::bulk_trx_id: The latest active or committed transaction
that initiated an insert into an empty table or partition.
Protected by exclusive table lock and a clustered index leaf page latch.

ins_node_t::bulk_insert: Whether bulk insert was initiated.

trx_t::mod_tables: Use C++11 style accessors (emplace instead of insert).
Unlike earlier, this collection will cover also temporary tables.

trx_mod_table_time_t: Add start_bulk_insert(), end_bulk_insert(),
is_bulk_insert(), was_bulk_insert().

trx_undo_report_row_operation(): Before accessing any undo log pages,
invoke trx->mod_tables.emplace() in order to determine whether undo
logging was disabled, or whether this is the first INSERT and we are
supposed to write a TRX_UNDO_EMPTY record.

row_ins_clust_index_entry_low(): If we are inserting into an empty
clustered index leaf page, set the ins_node_t::bulk_insert flag for
the subsequent trx_undo_report_row_operation() call.

lock_rec_insert_check_and_lock(), lock_prdt_insert_check_and_lock():
Remove the redundant parameter 'flags' that can be checked in the caller.

btr_cur_ins_lock_and_undo(): Simplify the logic. Correctly write
DB_TRX_ID,DB_ROLL_PTR after invoking trx_undo_report_row_operation().

trx_mark_sql_stat_end(), ha_innobase::extra(HA_EXTRA_IGNORE_INSERT),
ha_innobase::external_lock(): Invoke trx_t::end_bulk_insert() so that
the next statement will not be covered by table-level undo logging.

ReadView::changes_visible(trx_id_t) const: New accessor for the case
where the trx_id_t is not read from a potentially corrupted index page
but directly from the memory. In this case, we can skip a sanity check.

row_sel(), row_sel_try_search_shortcut(), row_search_mvcc():
row_sel_try_search_shortcut_for_mysql(),
row_merge_read_clustered_index(): Check dict_table_t::bulk_trx_id.

row_sel_clust_sees(): Replaces lock_clust_rec_cons_read_sees().

lock_sec_rec_cons_read_sees(): Replaced with lower-level code.

btr_root_page_init(): Refactored from btr_create().

dict_index_t::clear(), dict_table_t::clear(): Empty an index or table,
for the ROLLBACK of an INSERT operation.

ROW_T_EMPTY, ROW_OP_EMPTY: Note a concurrent ROLLBACK of an INSERT
into an empty table.

This is joint work with Thirunarayanan Balathandayuthapani,
who created a working prototype.
Thanks to Matthias Leich for extensive testing.
2021-01-25 18:41:27 +02:00
Sergei Golubchik
a216672dab MDEV-16341 Wrong length for USER columns in performance_schema tables
use USERNAME_CHAR_LENGTH and HOSTNAME_LENGTH for perfschema
USER and HOST columns
2021-01-11 21:54:48 +01:00
Marko Mäkelä
92abdcca5a Merge 10.5 into 10.6 2021-01-07 09:08:09 +02:00
Oleksandr Byelkin
02e7bff882 Merge commit '10.4' into 10.5 2021-01-06 10:53:00 +01:00
Marko Mäkelä
6268bdadf7 Merge 10.5 into 10.6 2021-01-04 10:52:32 +02:00
Marko Mäkelä
c1a7a82bca WolfSSL v4.6.0-stable 2021-01-02 11:56:41 +02:00
Marko Mäkelä
c1f0afb102 WolfSSL v4.6.0-stable 2021-01-01 19:15:46 +02:00
Oleksandr Byelkin
478b83032b Merge branch '10.3' into 10.4 2020-12-25 09:13:28 +01:00
Oleksandr Byelkin
25561435e0 Merge branch '10.2' into 10.3 2020-12-23 19:28:02 +01:00
Marko Mäkelä
c36a2a0d1c Merge 10.5 into 10.6 2020-12-17 14:56:08 +02:00
Etienne Guesnet
2c7247622a AIX workaround for GCC TOC bug 2020-12-16 08:07:04 +11:00
Etienne Guesnet
2f5d372444 Add build on AIX 2020-12-16 08:07:04 +11:00
Sergei Golubchik
e189faf0b3 document that a fulltext parser plugin can replace mysql_add_word callback 2020-12-10 08:45:20 +01:00
Eugene Kosov
a50cb4867a MDEV-24334 make monitor_set_tbl global variable thread-safe
Atomic_relaxed<T>: add fetch_or() and fetch_and()

innodb_init(): rely on a zero-initialization of a global variable

monitor_set_tbl: make Atomic_relaxed<ulint> array and use proper operations
for setting bit, unsetting bit and reading bit

Reviewed by: Marko Mäkelä
2020-12-03 11:55:36 +03:00
Marko Mäkelä
a13fac9eee Merge 10.5 into 10.6 2020-12-03 08:12:47 +02:00
Marko Mäkelä
f146969fb3 MDEV-22929 fixup: root_name() clash with clang++ <fstream>
The clang++ -stdlib=libc++ header file <fstream> depends on
<filesystem> that defines a member function path::root_name(),
which conflicts with the rather unused #define root_name()
that had been introduced in
commit 7c58e97bf6.

Because an instrumented -stdlib=libc++ (rather than the default
-stdlib=libstdc++) is easier to build for a working -fsanitize=memory
(cmake -DWITH_MSAN=ON), let us remove the conflicting #define for now.
2020-12-03 07:45:48 +02:00
Vladislav Vaintroub
295f3e4cfb MDEV-19237 Skip sending metadata when possible for binary protocol.
Do not resend metadata, if metadata does not change between prepare and
execute of prepared statement, or between executes.

Currently, metadata of *every* prepared statement will be checksummed,
and change is detected once checksum changes.

This is not from ideal, performance-wise. The code for
better/faster detection of unchanged metadata, is already in place, but
currently disabled due to PS bugs, such as MDEV-23913.
2020-11-23 19:28:29 +01:00
Marko Mäkelä
a62a675fd2 Merge 10.5 into 10.6 2020-11-23 17:57:58 +02:00
Daniel Black
8cc5d2845c MDEV-24125: linux large pages - Revert "Fixed centos 6 build failure"
This reverts commit 6cf8f05fd9.

Original patch assumed that MAP_HUGETLB as consistent across
achitectures which isn't the case. Defining it unconditionally
broke large pages on every achitecutre where the value differed
from x86_64.

With the EOL for Centos/RHEL6 announced in 10.5.7, <3.8 linux
kernels are no longer supported.
2020-11-17 07:53:55 +11:00
Oleksandr Byelkin
10aa576483 Merge branch '10.5' into 10.6 2020-11-14 20:05:35 +01:00
Marko Mäkelä
d7a5824899 Merge 10.4 into 10.5 2020-11-13 21:54:21 +02:00
Marko Mäkelä
254bb1c35b Merge 10.5 into 10.6 2020-11-12 15:54:08 +02:00
sjaakola
ad432ef4c0 MDEV-24119 MDL BF-BF Conflict caused by TRUNCATE TABLE
This PR fixes same issue as MDEV-21577 for TRUNCATE TABLE.
MDEV-21577 fixed TOI replication for OPTIMIZE, REPAIR and ALTER TABLE
operating on FK child table. It was later found out that also TRUNCATE
has similar problem and needs a fix.

The actual fix is to do FK parent table lookup before TRUNCATE TOI
isolation and append found FK parent table names in certification key
list for the write set.

PR contains also new test scenario in galera_ddl_fk_conflict test where
FK child has two FK parent tables and there are two DML transactions operating
on both parent tables.

For development convenience, new TO isolation macro was added:
WSREP_TO_ISOLATION_BEGIN_IF and WSREP_TO_ISOLATION_BEGIN_ALTER macro was changed
to skip the goto statement.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2020-11-11 17:46:50 +02:00
Marko Mäkelä
4cbfdeca84 MDEV-24109 InnoDB hangs with innodb_flush_sync=OFF
MDEV-23855 broke the handling of innodb_flush_sync=OFF.
That parameter is supposed to limit the page write rate
in case the log capacity is being exceeded and log checkpoints
are needed.

With this fix, the following should pass:
./mtr --mysqld=--loose-innodb-flush-sync=0

One of our best regression tests for page flushing is
encryption.innochecksum. With innodb_page_size=16k and
innodb_flush_sync=OFF it would likely hang without this fix.

log_sys.last_checkpoint_lsn: Declare as Atomic_relaxed<lsn_t>
so that we are allowed to read the value while not holding
log_sys.mutex.

buf_flush_wait_flushed(): Let the page cleaner perform the flushing
also if innodb_flush_sync=OFF. After the page cleaner has
completed, perform a checkpoint if it is needed, because
buf_flush_sync_for_checkpoint() will not be run if
innodb_flush_sync=OFF.

buf_flush_ahead(): Simplify the condition. We do not really care
whether buf_flush_page_cleaner() is running.

buf_flush_page_cleaner(): Evaluate innodb_flush_sync at the low
level. If innodb_flush_sync=OFF, rate-limit the batches to
innodb_io_capacity_max pages per second.

Reviewed by: Vladislav Vaintroub
2020-11-04 16:55:36 +02:00
sjaakola
4d6c661144 MDEV-21577 MDL BF-BF conflict
Some DDL statements appear to acquire MDL locks for a table referenced by
foreign key constraint from the actual affected table of the DDL statement.
OPTIMIZE, REPAIR and ALTER TABLE belong to this class of DDL statements.

Earlier MariaDB version did not take this in consideration, and appended
only affected table in the certification key list in write set.
Because of missing certification information, it could happen that e.g.
OPTIMIZE table for FK child table could be allowed to apply in parallel
with DML operating on the foreign key parent table, and this could lead to
unhandled MDL lock conflicts between two high priority appliers (BF).

The fix in this patch, changes the TOI replication for OPTIMIZE, REPAIR and
ALTER TABLE statements so that before the execution of respective DDL
statement, there is foreign key parent search round. This FK parent search
contains following steps:
* open and lock the affected table (with permissive shared locks)
* iterate over foreign key contstraints and collect and array of Fk parent
  table names
* close all tables open for the THD and release MDL locks
* do the actual TOI replication with the affected table and FK parent
  table names as key values

The patch contains also new mtr test for verifying that the above mentioned
DDL statements replicate without problems when operating on FK child table.
The mtr test scenario #1, which can be used to check if some other DDL
(on top of OPTIMIZE, REPAIR and ALTER) could cause similar excessive FK
parent table locking.

Reviewed-by: Aleksey Midenkov <aleksey.midenkov@mariadb.com>
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2020-11-03 19:40:06 +02:00
Marko Mäkelä
c498250888 Merge 10.5 into 10.6 2020-11-03 19:11:36 +02:00
Marko Mäkelä
133b4b46fe Merge 10.4 into 10.5 2020-11-03 16:24:47 +02:00
Marko Mäkelä
533a13af06 Merge 10.3 into 10.4 2020-11-03 14:49:17 +02:00
Marko Mäkelä
c7f322c91f Merge 10.2 into 10.3 2020-11-02 15:48:47 +02:00
Marko Mäkelä
8036d0a359 MDEV-22387: Do not violate __attribute__((nonnull))
This follows up commit
commit 94a520ddbe and
commit 7c5519c12d.

After these changes, the default test suites on a
cmake -DWITH_UBSAN=ON build no longer fail due to passing
null pointers as parameters that are declared to never be null,
but plenty of other runtime errors remain.
2020-11-02 14:19:21 +02:00
Marko Mäkelä
09a1f0075a Merge 10.5 into 10.6 2020-11-02 12:49:19 +02:00
Marko Mäkelä
898521e2dd Merge 10.4 into 10.5 2020-10-30 11:15:30 +02:00
Vicențiu Ciorbaru
76fabe816f Expose utf8mb4_bin charset for plugins
Cleanup other linker errors
2020-10-29 15:01:33 +02:00
Marko Mäkelä
7b2bb67113 Merge 10.3 into 10.4 2020-10-29 13:38:38 +02:00
Marko Mäkelä
a8de8f261d Merge 10.2 into 10.3 2020-10-28 10:01:50 +02:00
Varun Gupta
b94e8e4b25 MDEV-23867: insert... select crash in compute_window_func
There are 2 issues here:

Issue #1: memory allocation.
An IO_CACHE that uses encryption uses a larger buffer (it needs space for the encrypted data,
decrypted data, IO_CACHE_CRYPT struct to describe encryption parameters etc).

Issue #2: IO_CACHE::seek_not_done
When IO_CACHE objects are cloned, they still share the file descriptor.
This means, operation on one IO_CACHE may change the file read position
which will confuse other IO_CACHEs using it.

The fix of these issues would be:
Allocate the buffer to also include the extra size needed for encryption.
Perform seek again after one IO_CACHE reads the file.
2020-10-23 22:36:47 +05:30
Sergei Golubchik
e864e6b181 compilation failure with new C/C
define symbols as C/C does to avoid "macro redefined" warnings
2020-10-23 15:53:41 +02:00
Marko Mäkelä
7cffb5f6e8 MDEV-23399: Performance regression with write workloads
The buffer pool refactoring in MDEV-15053 and MDEV-22871 shifted
the performance bottleneck to the page flushing.

The configuration parameters will be changed as follows:

innodb_lru_flush_size=32 (new: how many pages to flush on LRU eviction)
innodb_lru_scan_depth=1536 (old: 1024)
innodb_max_dirty_pages_pct=90 (old: 75)
innodb_max_dirty_pages_pct_lwm=75 (old: 0)

Note: The parameter innodb_lru_scan_depth will only affect LRU
eviction of buffer pool pages when a new page is being allocated. The
page cleaner thread will no longer evict any pages. It used to
guarantee that some pages will remain free in the buffer pool. Now, we
perform that eviction 'on demand' in buf_LRU_get_free_block().
The parameter innodb_lru_scan_depth(srv_LRU_scan_depth) is used as follows:
 * When the buffer pool is being shrunk in buf_pool_t::withdraw_blocks()
 * As a buf_pool.free limit in buf_LRU_list_batch() for terminating
   the flushing that is initiated e.g., by buf_LRU_get_free_block()
The parameter also used to serve as an initial limit for unzip_LRU
eviction (evicting uncompressed page frames while retaining
ROW_FORMAT=COMPRESSED pages), but now we will use a hard-coded limit
of 100 or unlimited for invoking buf_LRU_scan_and_free_block().

The status variables will be changed as follows:

innodb_buffer_pool_pages_flushed: This includes also the count of
innodb_buffer_pool_pages_LRU_flushed and should work reliably,
updated one by one in buf_flush_page() to give more real-time
statistics. The function buf_flush_stats(), which we are removing,
was not called in every code path. For both counters, we will use
regular variables that are incremented in a critical section of
buf_pool.mutex. Note that show_innodb_vars() directly links to the
variables, and reads of the counters will *not* be protected by
buf_pool.mutex, so you cannot get a consistent snapshot of both variables.

The following INFORMATION_SCHEMA.INNODB_METRICS counters will be
removed, because the page cleaner no longer deals with writing or
evicting least recently used pages, and because the single-page writes
have been removed:
* buffer_LRU_batch_flush_avg_time_slot
* buffer_LRU_batch_flush_avg_time_thread
* buffer_LRU_batch_flush_avg_time_est
* buffer_LRU_batch_flush_avg_pass
* buffer_LRU_single_flush_scanned
* buffer_LRU_single_flush_num_scan
* buffer_LRU_single_flush_scanned_per_call

When moving to a single buffer pool instance in MDEV-15058, we missed
some opportunity to simplify the buf_flush_page_cleaner thread. It was
unnecessarily using a mutex and some complex data structures, even
though we always have a single page cleaner thread.

Furthermore, the buf_flush_page_cleaner thread had separate 'recovery'
and 'shutdown' modes where it was waiting to be triggered by some
other thread, adding unnecessary latency and potential for hangs in
relatively rarely executed startup or shutdown code.

The page cleaner was also running two kinds of batches in an
interleaved fashion: "LRU flush" (writing out some least recently used
pages and evicting them on write completion) and the normal batches
that aim to increase the MIN(oldest_modification) in the buffer pool,
to help the log checkpoint advance.

The buf_pool.flush_list flushing was being blocked by
buf_block_t::lock for no good reason. Furthermore, if the FIL_PAGE_LSN
of a page is ahead of log_sys.get_flushed_lsn(), that is, what has
been persistently written to the redo log, we would trigger a log
flush and then resume the page flushing. This would unnecessarily
limit the performance of the page cleaner thread and trigger the
infamous messages "InnoDB: page_cleaner: 1000ms intended loop took 4450ms.
The settings might not be optimal" that were suppressed in
commit d1ab89037a unless log_warnings>2.

Our revised algorithm will make log_sys.get_flushed_lsn() advance at
the start of buf_flush_lists(), and then execute a 'best effort' to
write out all pages. The flush batches will skip pages that were modified
since the log was written, or are are currently exclusively locked.
The MDEV-13670 message "page_cleaner: 1000ms intended loop took" message
will be removed, because by design, the buf_flush_page_cleaner() should
not be blocked during a batch for extended periods of time.

We will remove the single-page flushing altogether. Related to this,
the debug parameter innodb_doublewrite_batch_size will be removed,
because all of the doublewrite buffer will be used for flushing
batches. If a page needs to be evicted from the buffer pool and all
100 least recently used pages in the buffer pool have unflushed
changes, buf_LRU_get_free_block() will execute buf_flush_lists() to
write out and evict innodb_lru_flush_size pages. At most one thread
will execute buf_flush_lists() in buf_LRU_get_free_block(); other
threads will wait for that LRU flushing batch to finish.

To improve concurrency, we will replace the InnoDB ib_mutex_t and
os_event_t native mutexes and condition variables in this area of code.
Most notably, this means that the buffer pool mutex (buf_pool.mutex)
is no longer instrumented via any InnoDB interfaces. It will continue
to be instrumented via PERFORMANCE_SCHEMA.

For now, both buf_pool.flush_list_mutex and buf_pool.mutex will be
declared with MY_MUTEX_INIT_FAST (PTHREAD_MUTEX_ADAPTIVE_NP). The critical
sections of buf_pool.flush_list_mutex should be shorter than those for
buf_pool.mutex, because in the worst case, they cover a linear scan of
buf_pool.flush_list, while the worst case of a critical section of
buf_pool.mutex covers a linear scan of the potentially much longer
buf_pool.LRU list.

mysql_mutex_is_owner(), safe_mutex_is_owner(): New predicate, usable
with SAFE_MUTEX. Some InnoDB debug assertions need this predicate
instead of mysql_mutex_assert_owner() or mysql_mutex_assert_not_owner().

buf_pool_t::n_flush_LRU, buf_pool_t::n_flush_list:
Replaces buf_pool_t::init_flush[] and buf_pool_t::n_flush[].
The number of active flush operations.

buf_pool_t::mutex, buf_pool_t::flush_list_mutex: Use mysql_mutex_t
instead of ib_mutex_t, to have native mutexes with PERFORMANCE_SCHEMA
and SAFE_MUTEX instrumentation.

buf_pool_t::done_flush_LRU: Condition variable for !n_flush_LRU.

buf_pool_t::done_flush_list: Condition variable for !n_flush_list.

buf_pool_t::do_flush_list: Condition variable to wake up the
buf_flush_page_cleaner when a log checkpoint needs to be written
or the server is being shut down. Replaces buf_flush_event.
We will keep using timed waits (the page cleaner thread will wake
_at least_ once per second), because the calculations for
innodb_adaptive_flushing depend on fixed time intervals.

buf_dblwr: Allocate statically, and move all code to member functions.
Use a native mutex and condition variable. Remove code to deal with
single-page flushing.

buf_dblwr_check_block(): Make the check debug-only. We were spending
a significant amount of execution time in page_simple_validate_new().

flush_counters_t::unzip_LRU_evicted: Remove.

IORequest: Make more members const. FIXME: m_fil_node should be removed.

buf_flush_sync_lsn: Protect by std::atomic, not page_cleaner.mutex
(which we are removing).

page_cleaner_slot_t, page_cleaner_t: Remove many redundant members.

pc_request_flush_slot(): Replaces pc_request() and pc_flush_slot().

recv_writer_thread: Remove. Recovery works just fine without it, if we
simply invoke buf_flush_sync() at the end of each batch in
recv_sys_t::apply().

recv_recovery_from_checkpoint_finish(): Remove. We can simply call
recv_sys.debug_free() directly.

srv_started_redo: Replaces srv_start_state.

SRV_SHUTDOWN_FLUSH_PHASE: Remove. logs_empty_and_mark_files_at_shutdown()
can communicate with the normal page cleaner loop via the new function
flush_buffer_pool().

buf_flush_remove(): Assert that the calling thread is holding
buf_pool.flush_list_mutex. This removes unnecessary mutex operations
from buf_flush_remove_pages() and buf_flush_dirty_pages(),
which replace buf_LRU_flush_or_remove_pages().

buf_flush_lists(): Renamed from buf_flush_batch(), with simplified
interface. Return the number of flushed pages. Clarified comments and
renamed min_n to max_n. Identify LRU batch by lsn=0. Merge all the functions
buf_flush_start(), buf_flush_batch(), buf_flush_end() directly to this
function, which was their only caller, and remove 2 unnecessary
buf_pool.mutex release/re-acquisition that we used to perform around
the buf_flush_batch() call. At the start, if not all log has been
durably written, wait for a background task to do it, or start a new
task to do it. This allows the log write to run concurrently with our
page flushing batch. Any pages that were skipped due to too recent
FIL_PAGE_LSN or due to them being latched by a writer should be flushed
during the next batch, unless there are further modifications to those
pages. It is possible that a page that we must flush due to small
oldest_modification also carries a recent FIL_PAGE_LSN or is being
constantly modified. In the worst case, all writers would then end up
waiting in log_free_check() to allow the flushing and the checkpoint
to complete.

buf_do_flush_list_batch(): Clarify comments, and rename min_n to max_n.
Cache the last looked up tablespace. If neighbor flushing is not applicable,
invoke buf_flush_page() directly, avoiding a page lookup in between.

buf_flush_space(): Auxiliary function to look up a tablespace for
page flushing.

buf_flush_page(): Defer the computation of space->full_crc32(). Never
call log_write_up_to(), but instead skip persistent pages whose latest
modification (FIL_PAGE_LSN) is newer than the redo log. Also skip
pages on which we cannot acquire a shared latch without waiting.

buf_flush_try_neighbors(): Do not bother checking buf_fix_count
because buf_flush_page() will no longer wait for the page latch.
Take the tablespace as a parameter, and only execute this function
when innodb_flush_neighbors>0. Avoid repeated calls of page_id_t::fold().

buf_flush_relocate_on_flush_list(): Declare as cold, and push down
a condition from the callers.

buf_flush_check_neighbor(): Take id.fold() as a parameter.

buf_flush_sync(): Ensure that the buf_pool.flush_list is empty,
because the flushing batch will skip pages whose modifications have
not yet been written to the log or were latched for modification.

buf_free_from_unzip_LRU_list_batch(): Remove redundant local variables.

buf_flush_LRU_list_batch(): Let the caller buf_do_LRU_batch() initialize
the counters, and report n->evicted.
Cache the last looked up tablespace. If neighbor flushing is not applicable,
invoke buf_flush_page() directly, avoiding a page lookup in between.

buf_do_LRU_batch(): Return the number of pages flushed.

buf_LRU_free_page(): Only release and re-acquire buf_pool.mutex if
adaptive hash index entries are pointing to the block.

buf_LRU_get_free_block(): Do not wake up the page cleaner, because it
will no longer perform any useful work for us, and we do not want it
to compete for I/O while buf_flush_lists(innodb_lru_flush_size, 0)
writes out and evicts at most innodb_lru_flush_size pages. (The
function buf_do_LRU_batch() may complete after writing fewer pages if
more than innodb_lru_scan_depth pages end up in buf_pool.free list.)
Eliminate some mutex release-acquire cycles, and wait for the LRU
flush batch to complete before rescanning.

buf_LRU_check_size_of_non_data_objects(): Simplify the code.

buf_page_write_complete(): Remove the parameter evict, and always
evict pages that were part of an LRU flush.

buf_page_create(): Take a pre-allocated page as a parameter.

buf_pool_t::free_block(): Free a pre-allocated block.

recv_sys_t::recover_low(), recv_sys_t::apply(): Preallocate the block
while not holding recv_sys.mutex. During page allocation, we may
initiate a page flush, which in turn may initiate a log flush, which
would require acquiring log_sys.mutex, which should always be acquired
before recv_sys.mutex in order to avoid deadlocks. Therefore, we must
not be holding recv_sys.mutex while allocating a buffer pool block.

BtrBulk::logFreeCheck(): Skip a redundant condition.

row_undo_step(): Do not invoke srv_inc_activity_count() for every row
that is being rolled back. It should suffice to invoke the function in
trx_flush_log_if_needed() during trx_t::commit_in_memory() when the
rollback completes.

sync_check_enable(): Remove. We will enable innodb_sync_debug from the
very beginning.

Reviewed by: Vladislav Vaintroub
2020-10-15 17:04:56 +03:00
Marko Mäkelä
6ce0a6f9ad Merge 10.5 into 10.6 2020-09-24 10:21:26 +03:00
Marko Mäkelä
882ce206db Merge 10.4 into 10.5 2020-09-23 11:32:43 +03:00
Marko Mäkelä
3a423088ac Merge 10.3 into 10.4 2020-09-21 12:29:00 +03:00
Marko Mäkelä
cbcb4ecabb Merge 10.2 into 10.3 2020-09-21 11:04:04 +03:00
Vladislav Vaintroub
ccbe6bb6fc MDEV-19935 Create unified CRC-32 interface
Add CRC32C code to mysys. The x86-64 implementation uses PCMULQDQ in addition to CRC32 instruction
after Intel whitepaper, and is ported from rocksdb code.

Optimized ARM and POWER CRC32 were already present in mysys.
2020-09-17 16:07:37 +02:00
Jan Lindström
224c950462 MDEV-23101 : SIGSEGV in lock_rec_unlock() when Galera is enabled
Remove incorrect BF (brute force) handling from lock_rec_has_to_wait_in_queue
and move condition to correct callers. Add a function to report
BF lock waits and assert if incorrect BF-BF lock wait happens.

wsrep_report_bf_lock_wait
	Add a new function to report BF lock wait.

wsrep_assert_no_bf_bf_wait
	Add a new function to check do we have a
	BF-BF wait and if we have report this case
	and assert as it is a bug.

lock_rec_has_to_wait
	Use new wsrep_assert_bf_wait to check BF-BF wait.

lock_rec_create_low
lock_table_create
	Use new function to report BF lock waits.

lock_rec_insert_by_trx_age
lock_grant_and_move_on_page
lock_grant_and_move_on_rec
	Assert that trx is not Galera as VATS is not compatible
	with Galera.

lock_rec_add_to_queue
	If there is conflicting lock in a queue make sure that
	transaction is BF.

lock_rec_has_to_wait_in_queue
	Remove incorrect BF handling. If there is conflicting
	locks in a queue all transactions must wait.

lock_rec_dequeue_from_page
lock_rec_unlock
	If there is conflicting lock make sure it is not
	BF-BF case.

lock_rec_queue_validate
	Add Galera record locking rules comment and use
	new function to report BF lock waits.

All attempts to reproduce the original assertion have been
failed. Therefore, there is no test case on this commit.
2020-09-10 13:18:12 +03:00
Marko Mäkelä
5ff7e68c7e Merge 10.4 into 10.5 2020-09-04 18:44:44 +03:00
Marko Mäkelä
b0c194cab4 MDEV-23633 fixup: Add missing semicolon 2020-09-04 11:40:17 +03:00
Marko Mäkelä
24f510bba4 MDEV-23633 MY_RELAX_CPU performs unnecessary compare-and-swap on ARM
This follows up MDEV-14374, which was filed against MariaDB Server 10.3.
Back then, on a 48-core Qualcomm Centriq 2400, the performance of
delay loops for spinloops was tested both with and without the dummy
compare-and-swap operation, and it was decided to keep the dummy
operation.

On target architectures where nothing special is available (other than
x86 (IA-32, AMD64) or POWER), we perform a dummy compare-and-swap operation.
This is contrary to the idea of the x86 PAUSE instruction and the
__ppc_get_timebase(), which aim to keep the memory bus idle for a while,
to allow other cores to better execute code while a spinloop is waiting
for something to be changed.

On MariaDB Server 10.4 and another implementation of the ARMv8 ISA,
omitting the dummy compare-and-swap improved performance by up to 12%.
So, let us avoid the dummy compare-and-swap on ARM.

For now, we are retaining the dummy compare-and-swap on other ISAs
(such as SPARC, MIPS, S390x, RISC-V) because we do not have any
performance data for them.
2020-09-04 10:31:41 +03:00
Marko Mäkelä
1cda462f46 Merge 10.3 into 10.4 2020-09-03 16:55:14 +03:00
Oleksandr Byelkin
5edf3e0388 Merge branch '10.5' into 10.6 2020-09-02 14:36:14 +02:00
Vladislav Vaintroub
32a29afea7 MDEV-23238 - remove async client from server code.
It is already in libmariadb, and server (also that client in server)
does not need it.

It does not work in embedded either since it relies on non-blocking sockets
2020-09-01 21:30:52 +02:00
Yuqi Gu
151fc0ed88 MDEV-23495: Refine Arm64 PMULL runtime check in MariaDB
Raspberry Pi 4 supports crc32 but doesn't support pmull (MDEV-23030).

The PR #1645 offers a solution to fix this issue. But it does not consider
the condition that the target platform does support crc32 but not support PMULL.

In this condition, it should leverage the Arm64 crc32 instruction (__crc32c) and
just only skip parallel computation (pmull/vmull) rather than skip all hardware
crc32 instruction of computation.

The PR also removes unnecessary CRC32_ZERO branch in 'crc32c_aarch64' for MariaDB,
formats the indent and coding style.

Change-Id: I76371a6bd767b4985600e8cca10983d71b7e9459
Signed-off-by: Yuqi Gu <yuqi.gu@arm.com>
2020-08-21 20:41:35 +03:00
Monty
3ef65f2783 Added DBUG_PUSH_EMPTY and DBUG_POP_EMPTY to speed up DBUG 2020-08-20 19:34:11 +03:00
Marko Mäkelä
65c43bcfe2 After-merge fix of the Windows build 2020-08-20 13:35:26 +03:00
Marko Mäkelä
d5d8756de3 Merge 10.4 into 10.5 2020-08-20 12:52:44 +03:00