When using the default innodb_log_buffer_size=2m, mariadb-backup --backup
would spend a lot of time re-reading and re-parsing the log. For reads,
it would be beneficial to memory-map the entire ib_logfile0 to the
address space (typically 48 bits or 256 TiB) and read it from there,
both during --backup and --prepare.
We will introduce the Boolean read-only parameter innodb_log_file_mmap
that will be OFF by default on most platforms, to avoid aggressive
read-ahead of the entire ib_logfile0 in when only a tiny portion would be
accessed. On Linux and FreeBSD the default is innodb_log_file_mmap=ON,
because those platforms define a specific mmap(2) option for enabling
such read-ahead and therefore it can be assumed that the default would
be on-demand paging. This parameter will only have impact on the initial
InnoDB startup and recovery. Any writes to the log will use regular I/O,
except when the ib_logfile0 is stored in a specially configured file system
that is backed by persistent memory (Linux "mount -o dax").
We also experimented with allowing writes of the ib_logfile0 via a
memory mapping and decided against it. A fundamental problem would be
unnecessary read-before-write in case of a major page fault, that is,
when a new, not yet cached, virtual memory page in the circular
ib_logfile0 is being written to. There appears to be no way to tell
the operating system that we do not care about the previous contents of
the page, or that the page fault handler should just zero it out.
Many references to HAVE_PMEM have been replaced with references to
HAVE_INNODB_MMAP.
The predicate log_sys.is_pmem() has been replaced with
log_sys.is_mmap() && !log_sys.is_opened().
Memory-mapped regular files differ from MAP_SYNC (PMEM) mappings in the
way that an open file handle to ib_logfile0 will be retained. In both
code paths, log_sys.is_mmap() will hold. Holding a file handle open will
allow log_t::clear_mmap() to disable the interface with fewer operations.
It should be noted that ever since
commit 685d958e38 (MDEV-14425)
most 64-bit Linux platforms on our CI platforms
(s390x a.k.a. IBM System Z being a notable exception) read and write
/dev/shm/*/ib_logfile0 via a memory mapping, pretending that it is
persistent memory (mount -o dax). So, the memory mapping based log
parsing that this change is enabling by default on Linux and FreeBSD
has already been extensively tested on Linux.
::log_mmap(): If a log cannot be opened as PMEM and the desired access
is read-only, try to open a read-only memory mapping.
xtrabackup_copy_mmap_snippet(), xtrabackup_copy_mmap_logfile():
Copy the InnoDB log in mariadb-backup --backup from a memory
mapped file.
Building of MariaDB server version 11.0 and 11.1 fails
on MacOS 12.x (Monterey). Build failure happened on generating
header files from the error messages contained in the file
errmsg-utf8.txt. This process is performed by the utility comp_err
that crashes when it is run on MacOS Monterey.
comp_err invokes my_init at the very beginning of start before
initialization of thread environment done. While executing my_init
the function my_readlink is called. my_readlink is wrapper around
the system call readlink with extra errors handling. In case the
system call readlink returns error the following block of code is run
if (my_thread_var)
my_errno= errno;
my_thred_var is macros that expanded to invocation of
my_pthread_getspecific() against supplied thread specific key
THR_KEY_mysys. Unfortunately, the tsd key THR_KEY_mysys is initialized
right after the call of my_init() so return value of pthread_getspecific
is platform dependent. On Linux pthread_getspecific returns NULL
if key is not a valid TSD key. On MacOS, the effect of calling
pthread_getspecific() with a key value not obtained from pthread_key_create()
is undefined. So, on MacOS pthread_getspecific() returns some invalid address
where the errno value is written. It leads to a crash latter when the library
API function pthread_self() is called.
To fix the issue, initialization of thread environment is moved at very
beginning of the main() function in order to run it as the first step
to have full initiliazed environment at the moment when my_init()
is ivoked.
don't include my_progname in the error message, my_error starts from it
automatically, resulting in, like
/usr/bin/mysqladmin: Notice: /usr/bin/mysqladmin is deprecated and will be removed in a future release, use command 'mariadb-admin'
and remove "Notice" so that the problem description would directly
follow the executable name.
make the check to work when the executable is in the PATH
(so, invoked simply like 'mysql' and thus readlink cannot find it)
fix the check in mysql_install_db and mysql_secure_installation to not
print the warning if the intermediate path contains "mysql" substring
add this message also to
* mysql_waitpid
* mysql_convert_table_format
* mysql_find_rows
* mysql_setpermissions
* mysqlaccess
* mysqld_multi
* mysqld_safe
* mysqldumpslow
* mysqlhotcopy
* mysql_ldb
Closes#2273
Eventually mysql symlinks will go away, as MariaDB and MySQL keep
diverging and we do not want to make it impossible to install
MariaDB and MySQL side-by-side when users want it.
It also useful if people start using MariaDB tools with MariaDB.
If the exe doesn't begine with "mariadb" or is a symlink,
print a warning to use the resolved name.
In my_readlink, add check on my_thread_var as its used by comp_err
and other build utils that also use my_init.
This avoids LF->CRLF conversion by the C runtime, which historically has
been rather buggy (see MDEV-9409)
Disabling text mode also fixes the --binary-mode in command line client
to work the same on Windows, as it does elsewhere.
The user-visible effect is that some text files, e.g output of mysqldump
or mysqlbinlog will not have CRLF end-of-lines,but LF. That should be
acceptable, as even Notepad can read this Unix EOLs since 2018
(on older Windows, Wordpad can)
Leave error log in text(CRLF) mode for now, for the sake of old Windows.
- Use corresponding entry in the manifest, as described in
https://docs.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page
- If if ANSI codepage is UTF8 (i.e for Windows 1903 and later)
Use UTF8 as default client charset
Set console codepage(s) to UTF8, in case process is using console
- Allow some previously disabled MTR tests, that used Unicode for in "exec",
for the recent Windows versions
This is useful for thing like Item_true and Item_false that we
allocated and initalize once and want to ensure that nothing can
change them
Main changes:
- Memory protection is achived by allocating memory with mmap() and
protect it from write with mprotect()
- init_alloc_root(...,MY_ROOT_USE_MPROTECT) will create a
memroot that one can later use with protect_root() to turn it
read only or turn it back to read-write. All allocations to this
memroot is done with mmap() to ensure page alligned allocations.
- alloc_root() code was rearranged to combine normal and valgrind code.
- init_alloc_root() now changes block size to be power of 2's, to get less
memory fragmentation.
- Changed MEM_ROOT structure to make it smaller. Also renamed
MEM_ROOT m_psi_key to psi_key.
- Moved MY_THREAD_SPECIFIC marker in MEM_ROOT from block size (old hack)
to flags.
- Added global variable my_system_page_size. This is initialized at
startup.
Add CRC32C code to mysys. The x86-64 implementation uses PCMULQDQ in addition to CRC32 instruction
after Intel whitepaper, and is ported from rocksdb code.
Optimized ARM and POWER CRC32 were already present in mysys.
Existing implementation used my_checksum (from mysys)
for calculating table checksum and binlog checksum.
This implementation was optimized for powerpc only and lacked
SIMD implementation for x86 (using clmul) and ARM
(using ACLE) instead used zlib-crc32.
mariabackup had its own copy of the crc32 implementation
using hardware optimized implementation only for x86 and lagged
hardware based implementation for powerpc and ARM.
Patch helps unifies all such calls and help aggregate all of them
using an unified interface my_checksum().
Said unification also enables hardware optimized calls for all
architecture viz. x86, ARM, POWERPC.
Default always fallback to zlib crc32.
Thanks to Daniel Black for reviewing, fixing and testing
PowerPC changes. Thanks to Marko and Daniel for early code feedback.
- Do not scan registry to check if TCPIP is supported.
- Do not read registry under HKEY_LOCAL_MACHINE\SOFTWARE\MySQL anymore.
- Do not load threadpool function dynamically, it is available
since Win7.
- simplify win32_init_tcp_ip(), and return error of WSAStartup() fails.
- Correct comment in my_parameter_handler()
don't rely on imprecise my_file_opened | my_stream_opened,
scan the array for open handles instead.
also, remove my_print_open_files() and embed it in my_end()
to have one array scan instead of two.
To use:
- Compile with -DUSE_MY_LIKELY
- Change (with replace) all likely/unlikely to my_likely/my_/unlikely
replace likely my_likely unlikely my_unlikely -- *c *h
- Start mysqld with -T
- run some test
- When mysqld has shut down cleanely, report will be on stderr
- Fix win64 pointer truncation warnings
(usually coming from misusing 0x%lx and long cast in DBUG)
- Also fix printf-format warnings
Make the above mentioned warnings fatal.
- fix pthread_join on Windows to set return value.
If compiling a non DBUG binary with
-DDBUG_ASSERT_AS_PRINTF asserts will be
changed to printf + stack trace (of stack
trace are enabled).
- Changed #ifndef DBUG_OFF to
#ifdef DBUG_ASSERT_EXISTS
for those DBUG_OFF that was just used to enable
assert
- Assert checking that could greatly impact
performance where changed to DBUG_ASSERT_SLOW which
is not affected by DBUG_ASSERT_AS_PRINTF
- Added one extra option to my_print_stacktrace() to
get more silent in case of stack trace printing as
part of assert.
Also, implement MDEV-11027 a little differently from 5.5 and 10.0:
recv_apply_hashed_log_recs(): Change the return type back to void
(DB_SUCCESS was always returned).
Report progress also via systemd using sd_notifyf().
my_thread_global_init() + my_thrad_global_end() repeatadily.
This caused THR_KEY_mysys to be allocated multiple times.
Deletion of THR_KEY_mysys was originally in my_thread_global_end() but was
moved to my_end() as DBUG uses THR_KEY_mysys and DBUG is released after
my_thread_global_end() is called.
Releasing DBUG before my_thread_global_end() and move THR_KEY_mysys back
into my_thread_global_end() could be a solution, but as safe_mutex and other
things called by my_thread_global_end is using DBUG it may not be completely
safe.
To solve this, I used the simple solution to add a marker that THR_KEY_mysys
is created and not re-create it in my_thread_global_init if it already
exists.
Added MAX_STATEMENT_TIME user variable to automaticly kill queries after a given time limit has expired.
- Added timer functions based on pthread_cond_timedwait
- Added kill_handlerton() to signal storage engines about kill/timeout
- Added support for GRANT ... MAX_STATEMENT_TIME=#
- Copy max_statement_time to current user, if stored in mysql.user
- Added status variable max_statement_time_exceeded
- Added KILL_TIMEOUT
- Removed digest hash from performance schema tests as they change all the time.
- Updated test results that changed because of the new user variables or new fields in mysql.user
This functionallity is inspired by work done by Davi Arnaut at twitter.
Test case is copied from Davi's work.
Documentation can be found at
https://kb.askmonty.org/en/how-to-limittimeout-queries/
mysql-test/r/mysqld--help.result:
Updated for new help message
mysql-test/suite/perfschema/r/all_instances.result:
Added new mutex
mysql-test/suite/sys_vars/r/max_statement_time_basic.result:
Added testing of max_statement_time
mysql-test/suite/sys_vars/t/max_statement_time_basic.test:
Added testing of max_statement_time
mysql-test/t/max_statement_time.test:
Added testing of max_statement_time
mysys/CMakeLists.txt:
Added thr_timer
mysys/my_init.c:
mysys/mysys_priv.h:
Added new mutex and condition variables
Added new mutex and condition variables
mysys/thr_timer.c:
Added timer functions based on pthread_cond_timedwait()
This can be compiled with HAVE_TIMER_CREATE to benchmark agains timer_create()/timer_settime()
sql/lex.h:
Added MAX_STATEMENT_TIME
sql/log_event.cc:
Safety fix (timeout should be threated as an interrupted query)
sql/mysqld.cc:
Added support for timers
Added status variable max_statement_time_exceeded
sql/share/errmsg-utf8.txt:
Added ER_QUERY_TIMEOUT
sql/signal_handler.cc:
Added support for KILL_TIMEOUT
sql/sql_acl.cc:
Added support for GRANT ... MAX_STATEMENT_TIME=#
Copy max_statement_time to current user
sql/sql_class.cc:
Added timer functionality to THD.
Added thd_kill_timeout()
sql/sql_class.h:
Added timer functionality to THD.
Added KILL_TIMEOUT
Added max_statement_time variable in similar manner as long_query_time was done.
sql/sql_connect.cc:
Added handling of max_statement_time_exceeded
sql/sql_parse.cc:
Added starting and stopping timers for queries.
sql/sql_show.cc:
Added max_statement_time_exceeded for user/connects status in MariaDB 10.0
sql/sql_yacc.yy:
Added support for GRANT ... MAX_STATEMENT_TIME=# syntax, to be enabled in 10.0
sql/structs.h:
Added max_statement_time user resource
sql/sys_vars.cc:
Added max_statement_time variables
mysql-test/suite/roles/create_and_drop_role_invalid_user_table.test
Removed test as we require all fields in mysql.user table.
scripts/mysql_system_tables.sql
scripts/mysql_system_tables_data.sql
scripts/mysql_system_tables_fix.sql
Updated mysql.user with new max_statement_time field
and small collateral changes
mysql-test/lib/My/Test.pm:
somehow with "print" we get truncated writes sometimes
mysql-test/suite/perfschema/r/digest_table_full.result:
md5 hashes of statement digests differ, because yacc token codes are different in mariadb
mysql-test/suite/perfschema/r/dml_handler.result:
host table is not ported over yet
mysql-test/suite/perfschema/r/information_schema.result:
host table is not ported over yet
mysql-test/suite/perfschema/r/nesting.result:
this differs, because we don't rewrite general log queries, and multi-statement
packets are logged as a one entry. this result file is identical to what mysql-5.6.5
produces with the --log-raw option.
mysql-test/suite/perfschema/r/relaylog.result:
MariaDB modifies the binlog index file directly, while MySQL 5.6 has a feature "crash-safe binlog index" and modifies a special "crash-safe" shadow copy of the index file and then moves it over. That's why this test shows "NONE" index file writes in MySQL and "MANY" in MariaDB.
mysql-test/suite/perfschema/r/server_init.result:
MariaDB initializes the "manager" resources from the "manager" thread, and starts this thread only when --flush-time is not 0. MySQL 5.6 initializes "manager" resources unconditionally on server startup.
mysql-test/suite/perfschema/r/stage_mdl_global.result:
this differs, because MariaDB disables query cache when query_cache_size=0. MySQL does not
do that, and this causes useless mutex locks and waits.
mysql-test/suite/perfschema/r/statement_digest.result:
md5 hashes of statement digests differ, because yacc token codes are different in mariadb
mysql-test/suite/perfschema/r/statement_digest_consumers.result:
md5 hashes of statement digests differ, because yacc token codes are different in mariadb
mysql-test/suite/perfschema/r/statement_digest_long_query.result:
md5 hashes of statement digests differ, because yacc token codes are different in mariadb
mysql-test/suite/rpl/r/rpl_mixed_drop_create_temp_table.result:
will be updated to match 5.6 when alfranio.correia@oracle.com-20110512172919-c1b5kmum4h52g0ni and anders.song@greatopensource.com-20110105052107-zoab0bsf5a6xxk2y are merged
mysql-test/suite/rpl/r/rpl_non_direct_mixed_mixing_engines.result:
will be updated to match 5.6 when anders.song@greatopensource.com-20110105052107-zoab0bsf5a6xxk2y is merged
mysql-test/suite/innodb/t/group_commit_crash.test:
remove autoincrement to avoid rbr being used for insert ... select
mysql-test/suite/innodb/t/group_commit_crash_no_optimize_thread.test:
remove autoincrement to avoid rbr being used for insert ... select
mysys/my_addr_resolve.c:
a pointer to a buffer is returned to the caller -> the buffer cannot be on the stack
mysys/stacktrace.c:
my_vsnprintf() is ok here, in 5.5
Fixed memory leak printing when doing 'mysqld --version', 'mysqld --debug --help' and 'mysqld --debug --help --verbose'
mysys/my_init.c:
Moved checking if we should call DBUG_END() before my_thread_end() as otherwise we will not free DBUG variables and files.
mysys/thr_lock.c:
Fixed compiler warning
sql/mysqld.cc:
Fixed memory leaks when using mysqld --help and mysqld --version
Added --debug as an option that works for all builds. For non debug builds we now get a warning.
strings/dtoa.c:
Fixed valgrind warning (c could contain data outside of the given string)