Commit graph

10172 commits

Author SHA1 Message Date
Eric Herman
7b587fcbe7 fix json typo s/UNINITALIZED/UNINITIALIZED/ 2021-07-21 16:32:11 +03:00
Monty
b1d81974b2 Added support to MEM_ROOT for write protected memory
This is useful for thing like Item_true and Item_false that we
allocated and initalize once and want to ensure that nothing can
change them

Main changes:
- Memory protection is achived by allocating memory with mmap() and
  protect it from write with mprotect()
- init_alloc_root(...,MY_ROOT_USE_MPROTECT) will create a
  memroot that one can later use with protect_root() to turn it
  read only or turn it back to read-write. All allocations to this
  memroot is done with mmap() to ensure page alligned allocations.
- alloc_root() code was rearranged to combine normal and valgrind code.
- init_alloc_root() now changes block size to be power of 2's, to get less
  memory fragmentation.
- Changed MEM_ROOT structure to make it smaller. Also renamed
  MEM_ROOT m_psi_key to psi_key.
- Moved MY_THREAD_SPECIFIC marker in MEM_ROOT from block size (old hack)
  to flags.
- Added global variable my_system_page_size. This is initialized at
  startup.
2021-07-18 19:59:35 +03:00
Marko Mäkelä
4b0070f642 MDEV-26029: Implement my_test_if_thinly_provisioned() for ScaleFlux
This is based on code that was contributed by Ning Zheng and Ray Kuan
from ScaleFlux.
2021-06-29 15:20:16 +03:00
Marko Mäkelä
30edd5549d MDEV-26029: Sparse files are inefficient on thinly provisioned storage
The MariaDB implementation of page_compressed tables for InnoDB used
sparse files. In the worst case, in the data file, every data page
will consist of some data followed by a hole. This may be extremely
inefficient in some file systems.

If the underlying storage device is thinly provisioned (can compress
data on the fly), it would be good to write regular files (with sequences
of NUL bytes at the end of each page_compressed block) and let the
storage device take care of compressing the data.

For reads, sparse file regions and regions containing NUL bytes will be
indistinguishable.

my_test_if_disable_punch_hole(): A new predicate for detecting thinly
provisioned storage. (Not implemented yet.)

innodb_atomic_writes: Correct the comment.

buf_flush_page(): Support all values of fil_node_t::punch_hole.
On a thinly provisioned storage device, we will always write
NUL-padded innodb_page_size bytes also for page_compressed tables.

buf_flush_freed_pages(): Remove a redundant condition.

fil_space_t::atomic_write_supported: Remove. (This was duplicating
fil_node_t::atomic_write.)

fil_space_t::punch_hole: Remove. (Duplicated fil_node_t::punch_hole.)

fil_node_t: Remove magic_n, and consolidate flags into bitfields.
For punch_hole we introduce a third value that indicates a
thinly provisioned storage device.

fil_node_t::find_metadata(): Detect all attributes of the file.
2021-06-29 15:18:22 +03:00
Marko Mäkelä
891a927e80 Merge 10.5 into 10.6 2021-06-26 11:53:28 +03:00
Marko Mäkelä
759deaa0a2 MDEV-26010 fixup: Use acquire/release memory order
In commit 5f22511e35 we depend on
Total Store Ordering. For correct operation on ISAs that implement
weaker memory ordering, we must explicitly use release/acquire stores
and loads on buf_page_t::oldest_modification_ to prevent a race condition
when buf_page_t::list does not happen to be on the same cache line.

buf_page_t::clear_oldest_modification(): Assert that the block is
not in buf_pool.flush_list, and use std::memory_order_release.

buf_page_t::oldest_modification_acquire(): Read oldest_modification_
with std::memory_order_acquire. In this way, if the return value is 0,
the caller may safely assume that it will not observe the buf_page_t
as being in buf_pool.flush_list, even if it is not holding
buf_pool.flush_list_mutex.

buf_flush_relocate_on_flush_list(), buf_LRU_free_page():
Invoke buf_page_t::oldest_modification_acquire().
2021-06-26 11:16:40 +03:00
Marko Mäkelä
4dfec8b230 Merge 10.5 into 10.6 2021-06-21 17:49:33 +03:00
Marko Mäkelä
a42c80bd48 Merge 10.4 into 10.5 2021-06-21 14:22:22 +03:00
Monty
af33202af7 Added DDL_options_st *thd_ddl_options(const MYSQL_THD thd)
This is used by InnoDB to detect if CREATE...SELECT is used

Other things:
- Changed InnoDB to use thd_ddl_options()
- Removed lock checking code for create...select (Approved by Marko)
2021-06-14 17:03:19 +03:00
Marko Mäkelä
1bd681c8b3 MDEV-25506 (3 of 3): Do not delete .ibd files before commit
This is a complete rewrite of DROP TABLE, also as part of other DDL,
such as ALTER TABLE, CREATE TABLE...SELECT, TRUNCATE TABLE.

The background DROP TABLE queue hack is removed.
If a transaction needs to drop and create a table by the same name
(like TRUNCATE TABLE does), it must first rename the table to an
internal #sql-ib name. No committed version of the data dictionary
will include any #sql-ib tables, because whenever a transaction
renames a table to a #sql-ib name, it will also drop that table.
Either the rename will be rolled back, or the drop will be committed.

Data files will be unlinked after the transaction has been committed
and a FILE_RENAME record has been durably written. The file will
actually be deleted when the detached file handle returned by
fil_delete_tablespace() will be closed, after the latches have been
released. It is possible that a purge of the delete of the SYS_INDEXES
record for the clustered index will execute fil_delete_tablespace()
concurrently with the DDL transaction. In that case, the thread that
arrives later will wait for the other thread to finish.

HTON_TRUNCATE_REQUIRES_EXCLUSIVE_USE: A new handler flag.
ha_innobase::truncate() now requires that all other references to
the table be released in advance. This was implemented by Monty.

ha_innobase::delete_table(): If CREATE TABLE..SELECT is detected,
we will "hijack" the current transaction, drop the table in
the current transaction and commit the current transaction.
This essentially fixes MDEV-21602. There is a FIXME comment about
making the check less failure-prone.

ha_innobase::truncate(), ha_innobase::delete_table():
Implement a fast path for temporary tables. We will no longer allow
temporary tables to use the adaptive hash index.

dict_table_t::mdl_name: The original table name for the purpose of
acquiring MDL in purge, to prevent a race condition between a
DDL transaction that is dropping a table, and purge processing
undo log records of DML that had executed before the DDL operation.
For #sql-backup- tables during ALTER TABLE...ALGORITHM=COPY, the
dict_table_t::mdl_name will differ from dict_table_t::name.

dict_table_t::parse_name(): Use mdl_name instead of name.

dict_table_rename_in_cache(): Update mdl_name.

For the internal FTS_ tables of FULLTEXT INDEX, purge would
acquire MDL on the FTS_ table name, but not on the main table,
and therefore it would be able to run concurrently with a
DDL transaction that is dropping the table. Previously, the
DROP TABLE queue hack prevented a race between purge and DDL.
For now, we introduce purge_sys.stop_FTS() to prevent purge from
opening any table, while a DDL transaction that may drop FTS_
tables is in progress. The function fts_lock_table(), which will
be invoked before the dictionary is locked, will wait for
purge to release any table handles.

trx_t::drop_table_statistics(): Drop statistics for the table.
This replaces dict_stats_drop_index(). We will drop or rename
persistent statistics atomically as part of DDL transactions.
On lock conflict for dropping statistics, we will fail instantly
with DB_LOCK_WAIT_TIMEOUT, because we will be holding the
exclusive data dictionary latch.

trx_t::commit_cleanup(): Separated from trx_t::commit_in_memory().
Relax an assertion around fts_commit() and allow DB_LOCK_WAIT_TIMEOUT
in addition to DB_DUPLICATE_KEY. The call to fts_commit() is
entirely misplaced here and may obviously break the consistency
of transactions that affect FULLTEXT INDEX. It needs to be fixed
separately.

dict_table_t::n_foreign_key_checks_running: Remove (MDEV-21175).
The counter was a work-around for missing meta-data locking (MDL)
on the SQL layer, and not really needed in MariaDB.

ER_TABLE_IN_FK_CHECK: Replaced with ER_UNUSED_28.

HA_ERR_TABLE_IN_FK_CHECK: Remove.

row_ins_check_foreign_constraints(): Do not acquire
dict_sys.latch either. The SQL-layer MDL will protect us.

This was reviewed by Thirunarayanan Balathandayuthapani
and tested by Matthias Leich.
2021-06-09 17:06:07 +03:00
Vladislav Vaintroub
b81803f065 MDEV-22221: MariaDB with WolfSSL doesn't support AES-GCM cipher for SSL
Enable AES-GCM for SSL (only).

AES-GCM for encryption plugins remains disabled (aes-t fails, on some bug
in GCM or CTR padding)
2021-06-09 15:44:55 +02:00
Vladislav Vaintroub
5ba4c4200c MDEV-25870 Windows - fix ARM64 cross-compilation 2021-06-07 23:15:36 +02:00
Vladislav Vaintroub
3d6eb7afcf MDEV-25602 get rid of __WIN__ in favor of standard _WIN32
This fixed the MySQL bug# 20338 about misuse of double underscore
prefix __WIN__, which was old MySQL's idea of identifying Windows
Replace it by _WIN32 standard symbol for targeting Windows OS
(both 32 and 64 bit)

Not that connect storage engine is not fixed in this patch (must be
fixed in "upstream" branch)
2021-06-06 13:21:03 +02:00
Krunal Bauskar
2a4f72b7d2 MDEV-25807: ARM build failure due to missing ISB instruction on ARMv6
Debian has support for 3 different arm machine and probably one which is
failing is a pretty old version which doesn't support the said instruction.

So we can make it specific to _aarch64_ (as per the arm official reference
manual ISB should be defined by all ARMv8 processors). [as per the ARMv7 specs
even 32 bits processor should support it but not sure which exact version
Debian has under armel].

In either case, the said performance issue will have an impact mainly with a
high-end processors with extreme parallelism so it safe to limit it to _aarch64_.

Corrects: MDEV-24630
2021-06-01 13:51:39 +10:00
Georg Richter
be8e51c459 MDEV-25511: Command line tools don't support CRL parameters
Enable CRL support for GnuTLS (which was implemented by CONC-433).
2021-05-31 08:29:37 +02:00
Sergei Golubchik
7700626fe6 remove thread_pool_priv.h 2021-05-19 22:54:14 +02:00
Monty
79d9a725f9 Added checking to protect against simultaneous double free in safemalloc
If two threads would call sf_free() / free_memory() at the same time,
bad_ptr() would not detect this. Fixed by adding extra detection
when working with the memory region under sf_mutex.

Other things:
- If safe_malloc crashes while mutex is hold, stack trace printing will
  hang because we malloc is called by my_open(), which is used by stack
  trace printing code. Fixed by adding MY_NO_REGISTER flag to my_open,
  which will disable the malloc() call to remmeber the file name.
2021-05-19 22:54:14 +02:00
Monty
b8c3159594 MDEV-24285 support oracle build-in function: sys_guid
SYS_GUID() returns same as UUID(), but without any '-'

author: woqutech
2021-05-19 22:54:11 +02:00
Monty
a206658b98 Change CHARSET_INFO character set and collaction names to LEX_CSTRING
This change removed 68 explict strlen() calls from the code.

The following renames was done to ensure we don't use the old names
when merging code from earlier releases, as using the new variables
for print function could result in crashes:
- charset->csname renamed to charset->cs_name
- charset->name renamed to charset->coll_name

Almost everything where mechanical changes except:
- Changed to use the new Protocol::store(LEX_CSTRING..) when possible
- Changed to use field->store(LEX_CSTRING*, CHARSET_INFO*) when possible
- Changed to use String->append(LEX_CSTRING&) when possible

Other things:
- There where compiler issues with ensuring that all character set names
  points to the same string: gcc doesn't allow one to use integer constants
  when defining global structures (constant char * pointers works fine).
  To get around this, I declared defines for each character set name
  length.
2021-05-19 22:54:07 +02:00
Monty
b6ff139aa3 Reduce usage of strlen()
Changes:
- To detect automatic strlen() I removed the methods in String that
  uses 'const char *' without a length:
  - String::append(const char*)
  - Binary_string(const char *str)
  - String(const char *str, CHARSET_INFO *cs)
  - append_for_single_quote(const char *)
  All usage of append(const char*) is changed to either use
  String::append(char), String::append(const char*, size_t length) or
  String::append(LEX_CSTRING)
- Added STRING_WITH_LEN() around constant string arguments to
  String::append()
- Added overflow argument to escape_string_for_mysql() and
  escape_quotes_for_mysql() instead of returning (size_t) -1 on overflow.
  This was needed as most usage of the above functions never tested the
  result for -1 and would have given wrong results or crashes in case
  of overflows.
- Added Item_func_or_sum::func_name_cstring(), which returns LEX_CSTRING.
  Changed all Item_func::func_name()'s to func_name_cstring()'s.
  The old Item_func_or_sum::func_name() is now an inline function that
  returns func_name_cstring().str.
- Changed Item::mode_name() and Item::func_name_ext() to return
  LEX_CSTRING.
- Changed for some functions the name argument from const char * to
  to const LEX_CSTRING &:
  - Item::Item_func_fix_attributes()
  - Item::check_type_...()
  - Type_std_attributes::agg_item_collations()
  - Type_std_attributes::agg_item_set_converter()
  - Type_std_attributes::agg_arg_charsets...()
  - Type_handler_hybrid_field_type::aggregate_for_result()
  - Type_handler_geometry::check_type_geom_or_binary()
  - Type_handler::Item_func_or_sum_illegal_param()
  - Predicant_to_list_comparator::add_value_skip_null()
  - Predicant_to_list_comparator::add_value()
  - cmp_item_row::prepare_comparators()
  - cmp_item_row::aggregate_row_elements_for_comparison()
  - Cursor_ref::print_func()
- Removes String_space() as it was only used in one cases and that
  could be simplified to not use String_space(), thanks to the fixed
  my_vsnprintf().
- Added some const LEX_CSTRING's for common strings:
  - NULL_clex_str, DATA_clex_str, INDEX_clex_str.
- Changed primary_key_name to a LEX_CSTRING
- Renamed String::set_quick() to String::set_buffer_if_not_allocated() to
  clarify what the function really does.
- Rename of protocol function:
  bool store(const char *from, CHARSET_INFO *cs) to
  bool store_string_or_null(const char *from, CHARSET_INFO *cs).
  This was done to both clarify the difference between this 'store' function
  and also to make it easier to find unoptimal usage of store() calls.
- Added Protocol::store(const LEX_CSTRING*, CHARSET_INFO*)
- Changed some 'const char*' arrays to instead be of type LEX_CSTRING.
- class Item_func_units now used LEX_CSTRING for name.

Other things:
- Fixed a bug in mysql.cc:construct_prompt() where a wrong escape character
  in the prompt would cause some part of the prompt to be duplicated.
- Fixed a lot of instances where the length of the argument to
  append is known or easily obtain but was not used.
- Removed some not needed 'virtual' definition for functions that was
  inherited from the parent. I added override to these.
- Fixed Ordered_key::print() to preallocate needed buffer. Old code could
  case memory overruns.
- Simplified some loops when adding char * to a String with delimiters.
2021-05-19 22:27:48 +02:00
Monty
da85ad7987 Optimize Sql_alloc
- Remove 'dummy_for_valgrind' overrun marker as this doesn't help much.
  The element also distorts the sizes of objects a bit, which makes it
  harder to calculate gain in object sizes when doing size optimizations.
- Replace usage of thd_get_current_thd() with _current_thd()
- Avoid one extra call indirection when using thd_get_current_thd(), which
  is used by Sql_alloc, by replacing it with _current_thd()
2021-05-19 22:27:27 +02:00
Monty
fa7d4abf16 Added typedef decimal_digits_t (uint16) for number of digits in most
aspects of decimals and integers

For fields and Item's uint8 should be good enough. After
discussions with Alexander Barkov we choose uint16 (for now)
as some format functions may accept +256 digits.

The reason for this patch was to make the usage and storage of decimal
digits simlar. Before this patch decimals was stored/used as uint8,
int and uint.  The lengths for numbers where also using a lot of
different types.

Changed most decimal variables and functions to use the new typedef.

squash! af7f09106b6c1dc20ae8c480bff6fd22d266b184

Use decimal_digits_t for all aspects of digits (total, precision
and scale), both for decimals and integers.
2021-05-19 22:27:27 +02:00
Rucha Deodhar
2fdb556e04 MDEV-8334: Rename utf8 to utf8mb3
This patch changes the main name of 3 byte character set from utf8 to
utf8mb3. New old_mode UTF8_IS_UTF8MB3 is added and set TRUE by default,
so that utf8 would mean utf8mb3. If not set, utf8 would mean utf8mb4.
2021-05-19 06:48:36 +02:00
Marko Mäkelä
c290c0d7e0 Simplify a preprocessor condition
Compilers older than GCC 4.8 are not supported anyway.
2021-05-17 18:10:11 +03:00
Jan Lindström
bee1bb056d MDEV-9609 : wsrep_debug only logs DDL information on originating node
Added DDL logging to applier and replaying also so that
DDL is logged on other than originating node.

wsrep.h
	Removed wsrep_thd_is_local conditions and cleaned up
	the macros. Removed WSREP_TO_ISOLATION_END.

Event_job_data::execute
change_password
acl_set_default_role
mysql_execute_command
	Replaced macro by function call

wsrep_to_isolation_begin
wsrep_to_isolation_end
	If execution is not local log DDL-information when
	wsrep_debug is enabled

No new tests required as current regression setting is
already testing these code paths.
2021-05-15 13:24:22 +03:00
Sergei Golubchik
0b116d160a 5.7.34 2021-05-03 11:22:07 +02:00
Marko Mäkelä
4930f9c94b Merge 10.5 into 10.6 2021-04-21 11:45:00 +03:00
Marko Mäkelä
80ed136e6d Merge 10.4 into 10.5 2021-04-21 09:01:01 +03:00
Oleksandr Byelkin
a3099a3b4a MDEV-24312 master_host has 60 character limit, increase to 255 bytes
Also increase user name up to 128.

The work was started by Rucha Deodhar <rucha.deodhar@mariadb.com>,
contains audit plugin fixes by Alexey Botchkov <holyfoot@askmonty.org>.
2021-04-20 16:36:56 +02:00
Monty
031f11717d Fix all warnings given by UBSAN
The easiest way to compile and test the server with UBSAN is to run:
./BUILD/compile-pentium64-ubsan
and then run mysql-test-run.
After this commit, one should be able to run this without any UBSAN
warnings. There is still a few compiler warnings that should be fixed
at some point, but these do not expose any real bugs.

The 'special' cases where we disable, suppress or circumvent UBSAN are:
- ref10 source (as here we intentionally do some shifts that UBSAN
  complains about.
- x86 version of optimized int#korr() methods. UBSAN do not like unaligned
  memory access of integers.  Fixed by using byte_order_generic.h when
  compiling with UBSAN
- We use smaller thread stack with ASAN and UBSAN, which forced me to
  disable a few tests that prints the thread stack size.
- Verifying class types does not work for shared libraries. I added
  suppression in mysql-test-run.pl for this case.
- Added '#ifdef WITH_UBSAN' when using integer arithmetic where it is
  safe to have overflows (two cases, in item_func.cc).

Things fixed:
- Don't left shift signed values
  (byte_order_generic.h, mysqltest.c, item_sum.cc and many more)
- Don't assign not non existing values to enum variables.
- Ensure that bool and enum values are properly initialized in
  constructors.  This was needed as UBSAN checks that these types has
  correct values when one copies an object.
  (gcalc_tools.h, ha_partition.cc, item_sum.cc, partition_element.h ...)
- Ensure we do not called handler functions on unallocated objects or
  deleted objects.
  (events.cc, sql_acl.cc).
- Fixed bugs in Item_sp::Item_sp() where we did not call constructor
  on Query_arena object.
- Fixed several cast of objects to an incompatible class!
  (Item.cc, Item_buff.cc, item_timefunc.cc, opt_subselect.cc, sql_acl.cc,
   sql_select.cc ...)
- Ensure we do not do integer arithmetic that causes over or underflows.
  This includes also ++ and -- of integers.
  (Item_func.cc, Item_strfunc.cc, item_timefunc.cc, sql_base.cc ...)
- Added JSON_VALUE_UNITIALIZED to json_value_types and ensure that
  value_type is initialized to this instead of to -1, which is not a valid
  enum value for json_value_types.
- Ensure we do not call memcpy() when second argument could be null.
- Fixed that Item_func_str::make_empty_result() creates an empty string
  instead of a null string (safer as it ensures we do not do arithmetic
  on null strings).

Other things:

- Changed struct st_position to an OBJECT and added an initialization
  function to it to ensure that we do not copy or use uninitialized
  members. The change to a class was also motived that we used "struct
  st_position" and POSITION randomly trough the code which was
  confusing.
- Notably big rewrite in sql_acl.cc to avoid using deleted objects.
- Changed in sql_partition to use '^' instead of '-'. This is safe as
  the operator is either 0 or 0x8000000000000000ULL.
- Added check for select_nr < INT_MAX in JOIN::build_explain() to
  avoid bug when get_select() could return NULL.
- Reordered elements in POSITION for better alignment.
- Changed sql_test.cc::print_plan() to use pointers instead of objects.
- Fixed bug in find_set() where could could execute '1 << -1'.
- Added variable have_sanitizer, used by mtr.  (This variable was before
  only in 10.5 and up).  It can now have one of two values:
  ASAN or UBSAN.
- Moved ~Archive_share() from ha_archive.cc to ha_archive.h and marked
  it virtual. This was an effort to get UBSAN to work with loaded storage
  engines. I kept the change as the new place is better.
- Added in CONNECT engine COLBLK::SetName(), to get around a wrong cast
  in tabutil.cpp.
- Added HAVE_REPLICATION around usage of rgi_slave, to get embedded
  server to compile with UBSAN. (Patch from Marko).
- Added #ifdef for powerpc64 to avoid a bug in old gcc versions related
  to integer arithmetic.

Changes that should not be needed but had to be done to suppress warnings
from UBSAN:

- Added static_cast<<uint16_t>> around shift to get rid of a LOT of
  compiler warnings when using UBSAN.
- Had to change some '/' of 2 base integers to shift to get rid of
  some compile time warnings.

Reviewed by:
- Json changes: Alexey Botchkov
- Charset changes in ctype-uca.c: Alexander Barkov
- InnoDB changes & Embedded server: Marko Mäkelä
- sql_acl.cc changes: Vicențiu Ciorbaru
- build_explain() changes: Sergey Petrunia
2021-04-20 12:30:09 +03:00
Marko Mäkelä
1900c2ede5 Merge 10.5 into 10.6 2021-04-08 10:11:36 +03:00
Daniel Black
553ef1a78b MDEV-13115: Implement SELECT SKIP LOCKED
Adds an implementation for SELECT ... FOR UPDATE SKIP LOCKED /
SELECT ... LOCK IN SHARED MODE SKIP LOCKED

This is implemented only InnoDB at the moment, not in RockDB yet.

This adds a new hander flag HA_CAN_SKIP_LOCKED than
will be used when the storage engine advertises the flag.

When a storage engine indicates this flag it will get
TL_WRITE_SKIP_LOCKED and TL_READ_SKIP_LOCKED transaction types.

The Lex structure has been updated to store both the FOR UPDATE/LOCK IN
SHARE as well as the SKIP LOCKED so the SHOW CREATE VIEW
implementation is simplier.

"SELECT FOR UPDATE ... SKIP LOCKED" combined with CREATE TABLE AS or
INSERT.. SELECT on the result set is not safe for STATEMENT based
replication. MIXED replication will replicate this as row based events."

Thanks to guidance from Facebook commit
193896c466
This helped verify basic test case, and components that need implementing
(even though every part was implemented differently).

Thanks Marko for guidance on simplier InnoDB implementation.

Reviewers: Marko, Monty
2021-04-08 16:51:36 +10:00
Daniel Black
058484687a Add TL_FIRST_WRITE in SQL layer for determining R/W
Use < TL_FIRST_WRITE for determining a READ transaction.

Use TL_FIRST_WRITE as the relative operator replacing TL_WRITE_ALLOW_WRITE
as the minimium WRITE lock type.
2021-04-08 16:51:36 +10:00
Marko Mäkelä
7b48da4d7e Merge 10.4 into 10.5 2021-04-08 07:47:49 +03:00
Daniel Black
f69c1c9dcb MDEV-19508: SI_KERNEL is not on all implementations
SI_USER is, however in FreeBSD there are a couple of non-kernel
user signal infomations above SI_KERNEL.

Put a fallback just in case there is nothing available.
2021-04-07 14:01:56 +10:00
Monty
81258f1432 MDEV-17913 Encrypted transactional Aria tables remain corrupt after crash recovery, automatic repairment does not work
This was because of a wrong test in encryption code that wrote random
numbers over the LSN for pages for transactional Aria tables during repair.
The effect was that after an ALTER TABLE ENABLE KEYS of a encrypted
recovery of the tables would not work.

Fixed by changing testing of !share->now_transactional to
!share->base.born_transactional.

Other things:
- Extended Aria check_table() to check for wrong (= too big) LSN numbers.
- If check_table() failed just because of wrong LSN or TRN numbers,
  a following repair table will just do a zerofill which is much faster.
- Limit number of LSN errors in one check table to MAX_LSN_ERROR (10).
- Removed old obsolete test of 'if (error_count & 2)'. Changed error_count
  and warning_count from bits to numbers of errors/warnings as this is
  more useful.
2021-04-06 14:57:22 +03:00
Marko Mäkelä
75db05c599 Merge 10.5 into 10.6 (except MDEV-24630)
Commit 76d2846a71 was for 10.5 only.
It caused some performance regression on 10.6 in some cases,
likely related to the removal of ib_mutex_t in MDEV-21452.
2021-03-30 13:28:05 +03:00
Krunal Bauskar
76d2846a71 MDEV-24630: MY_RELAX_CPU assembly instruction upgrade/research for
memory barrier on ARM

As suggested in the said JIRA ticket based on the contribution done by
the community (in an attempt to optimize the spin-loop) the said approach
was evaluated against MariaDB Server 10.5 and found to help improve
throughput in the range of 2-5%.

Note: 10.6 timing graph and model are different as home-brew
mutexes are replaced with pthread mutexes. Said patch has mixed
impact on 10.6 so not recommended for 10.6.
2021-03-30 13:26:19 +03:00
Marko Mäkelä
8c2e3259c1 MDEV-24302 follow-up: RESET MASTER hangs
As pointed out by Andrei Elkin, the previous fix did not fix one
race condition that may have caused the observed hang.

innodb_log_flush_request(): If we are enqueueing the very first
request at the same time the log write is being completed,
we must ensure that a near-concurrent call to log_flush_notify()
will not result in a missed notification. We guarantee this by
release-acquire operations on log_requests.start and
log_sys.flushed_to_disk_lsn.

log_flush_notify_and_unlock(): Cleanup: Always release the mutex.

log_sys_t::get_flushed_lsn(): Use acquire memory order.

log_sys_t::set_flushed_lsn(): Use release memory order.

log_sys_t::set_lsn(): Use release memory order.

log_sys_t::get_lsn(): Use relaxed memory order by default, and
allow the caller to specify acquire memory order explicitly.
Whenever the log_sys.mutex is being held or when log writes are
prohibited during startup, we can use a relaxed load. Likewise,
in some assertions where reading a stale value of log_sys.lsn
should not matter, we can use a relaxed load.

This will cause some additional instructions to be emitted on
architectures that do not implement Total Store Ordering (TSO),
such as POWER, ARM, and RISC-V Weak Memory Ordering (RVWMO).
2021-03-30 10:29:11 +03:00
Krunal Bauskar
0f6f72965b MDEV-25281: Switch to use non-atomic (vs atomic) distributed counter to
track page-access counter

As part of MDEV-21212, n_page_gets that is meant to track page access,
is ported to use distributed counter that default uses atomic sub-counters.

n_page_gets originally was a non-atomic counter that represented an approximate
value of pages tracked. Using the said analogy it doesn't need to be
an atomic distributed counter.

This patch introduces an interface that allows distributed counter to be
used with atomic and non-atomic sub-counter (through template) and also
port n_page_gets to use non-atomic distributed counter using the said
updated interface.
2021-03-29 16:15:28 +03:00
Daniel Black
460d480c74 MDEV-5536: add systemd socket activation
Systemd has a socket activation feature where a mariadb.socket
definition defines the sockets to listen to, and passes those
file descriptors directly to mariadbd to use when a connection
occurs.

The new functionality is utilized when starting as follows:

  systemctl start mariadb.socket

The mariadb.socket definition only needs to contain the network
information, ListenStream= directives, the mariadb.service
definition is still used for service instigation.

When mariadbd is started in this way, the socket, port, bind-address
backlog are all assumed to be self contained in the mariadb.socket
definition and as such the mariadb settings and command line
arguments of these network settings are ignored.
See man systemd.socket for how to limit this to specific ports.

Extra ports, those specified with extra_port in socket activation
mode, are those with a FileDescriptorName=extra. These need
to be in a separate service name like mariadb-extra.socket and
these require a Service={mariadb.service} directive to map to the
original service. Extra ports need systemd v227 or greater
(not RHEL/Centos7 - v219) when FileDescriptorName= was added,
otherwise the extra ports are treated like ordinary ports.

The number of sockets isn't limited when using systemd socket activation
(except by operating system limits on file descriptors and a minimal
amount of memory used per file descriptor). The systemd sockets passed
can include any ownership or permissions, including those the
mariadbd process wouldn't normally have the permission to create.

This implementation is compatible with mariadb.service definitions.
Those services started with:

  systemctl start mariadb.service

does actually start the mariadb.service and used all the my.cnf
settings of sockets and ports like it previously did.
2021-03-28 13:53:55 +11:00
Marko Mäkelä
356c149603 Merge 10.5 into 10.6 2021-03-26 11:50:32 +02:00
Daniel Black
bcb9ca4105 MEM_CHECK_DEFINED: replace HAVE_valgrind
HAVE_valgrind_or_MSAN to HAVE_valgrind was incorrect in
af784385b4.

In my_valgrind.h when clang exists (hence no __has_feature(memory_sanitizer),
and -DWITH_VALGRIND=1, but without memcheck.h, we end up with a MEM_CHECK_DEFINED
being empty.

If we are also doing a CMAKE_BUILD_TYPE=Debug this results a number of
[-Werror,-Wunused-variable] errors because MEM_CHECK_DEFINED is empty.
With MEM_CHECK_DEFINED empty, there becomes no uses of this of the
fixed field and innodb variables in this patch.

So we stop using HAVE_valgrind as catchall and use the name
HAVE_CHECK_MEM to indicate that a CHECK_MEM_DEFINED function exists.

Reviewer: Monty

Corrects: af784385b4
2021-03-26 07:58:49 +11:00
Eugene Kosov
1274bb7729 MDEV-25223 change fil_system_t::space_list and fil_system_t::named_spaces from UT_LIST to ilist
Mostly a refactoring. Also, debug functions were added for ease of life while debugging
2021-03-24 15:15:18 +03:00
Marko Mäkelä
00528a0445 Merge 10.5 into 10.6 2021-03-19 13:35:18 +02:00
Marko Mäkelä
be881ec457 Merge 10.4 into 10.5 2021-03-19 13:09:21 +02:00
Marko Mäkelä
44d70c01f0 Merge 10.3 into 10.4 2021-03-19 11:42:44 +02:00
Marko Mäkelä
19052b6deb Merge 10.2 into 10.3 2021-03-18 12:34:48 +02:00
Marko Mäkelä
f87a944c79 Merge 10.5 into 10.6 2021-03-16 15:51:26 +02:00
Vladislav Vaintroub
031b3dfc22 MDEV-25123 support MSVC ASAN 2021-03-12 08:44:55 +01:00