In commit fd9ca2a742 (MDEV-23295)
we added a debug assertion, which caught a similar bug.
prepare_inplace_alter_table_dict(): If we had promised that
ALGORITHM=INPLACE or ALGORITHM=NOCOPY is supported, we must
preserve the ROW_FORMAT.
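For illustration, a sketch with made-up names (not from the original
commit): after an ALTER TABLE that was promised as ALGORITHM=INPLACE
or NOCOPY, the row format must stay as declared:

  CREATE TABLE t1 (a INT PRIMARY KEY) ENGINE=InnoDB ROW_FORMAT=COMPACT;
  ALTER TABLE t1 FORCE, ALGORITHM=INPLACE;
  SHOW TABLE STATUS LIKE 't1'; -- Row_format must still report Compact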
The parameters innodb_thread_concurrency and innodb_commit_concurrency
were useful years ago when both computing resources and the implementation
of some shared data structures were limited. MySQL 5.0 or 5.1 had trouble
scaling beyond 8 concurrent connections. Most of the scalability bottlenecks
have been removed since then, and the transactions per second delivered
by MariaDB Server 10.5 should not dramatically drop upon exceeding the
'optimal' number of connections.
Hence, enabling any concurrency throttling for InnoDB actually makes
things worse. We have seen many customers mistakenly set these
parameters to a small value like 16 or 64 and then complain that the
server was slow.
Ignoring the parameters allows us to remove some normally unused code
and data structures, which could slightly improve performance.
innodb_thread_concurrency, innodb_commit_concurrency,
innodb_replication_delay, innodb_concurrency_tickets,
innodb_thread_sleep_delay, innodb_adaptive_max_sleep_delay:
Deprecate and ignore; hard-wire to 0.
The column INFORMATION_SCHEMA.INNODB_TRX.trx_concurrency_tickets
will always report 0.
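For example (a sketch; the exact warning text may differ):

  SET GLOBAL innodb_thread_concurrency = 16;
  -- produces a deprecation warning; the setting is ignored
  SELECT @@GLOBAL.innodb_thread_concurrency;
  -- always returns 0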
Problem: rpl_parallel2 was failing non-deterministically.
Analysis:
When FLUSH TABLES WITH READ LOCK is executed, it allows all worker
threads to complete their ongoing transactions and then pauses them.
At this point, FTWRL proceeds to acquire the global read lock. FTWRL
first blocks threads from starting new commits, then upgrades the lock
to block the commit of existing transactions.
Step 1:
FLUSH TABLES WITH READ LOCK - blocks new commits.
Step 2:
* The STOP SLAVE command sets 'force_abort=1', which unblocks the
workers, and they continue to execute events.
* T1: Waits in the 'record_gtid' call to update the 'gtid_slave_pos'
table with its current GTID, but is blocked because of Step 1.
* T2: Holds the COMMIT lock and waits for T1 to commit.
Step 3:
FLUSH TABLES WITH READ LOCK - waits to upgrade to BLOCK_COMMIT.
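A two-session sketch of the interleaving (an mtr-style pseudo-script
with made-up connection names, assuming a configured replica; the
hang is timing-dependent):

  # Step 1: FTWRL blocks new commits, then tries to upgrade the lock
  --connection con1
  --send FLUSH TABLES WITH READ LOCK
  # Step 2: force_abort=1 lets the workers resume; T1 then blocks in
  # record_gtid, T2 waits for T1, and the upgrade never succeeds
  --connection con2
  STOP SLAVE;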
This results in a deadlock. When the STOP SLAVE command allows paused
workers to proceed, the workers should skip the execution of all
further events, similar to 'conservative' parallel mode.
Solution:
Assign 1 to skip_event_group when we are aborted in do_ftwrl_wait.
rpl_parallel_entry->pause_sub_id is reset only when force_abort is off,
in rpl_pause_after_ftwrl.
/home/buildbot/buildbot/build/storage/xtradb/mtr/mtr0mtr.cc:97:37: error: invalid access to non-static data member ‘fil_space_t::latch’ of NULL object [-Werror=invalid-offsetof]
Changing the ROUND() result data type for *INT and hex hybrid input:
- ROUND(x,NULL) creates a column with the same type as x.
The old code created a DOUBLE column, which was not relevant at all.
This change simplifies the code a lot.
- ROUND(x,non_constant) creates a column of the INT, BIGINT or DECIMAL
data type (depending on the exact type of x).
The old code created a column of the DOUBLE data type,
which led to precision loss. Hence MDEV-23366.
- ROUND(bigint_30,negative_constant) creates a column of the DECIMAL(30,0)
data type. The old code created DECIMAL(29,0), which looked strange:
the data type was promoted to a wider one, but the maximum length was
reduced. Now the length attribute is preserved.
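For illustration, a sketch with made-up names showing the new rules:

  CREATE TABLE t1 (a BIGINT, n INT);
  CREATE TABLE t2 AS SELECT
    ROUND(a, NULL) AS r1, -- same data type as a (BIGINT), not DOUBLE
    ROUND(a, n)    AS r2  -- BIGINT, not DOUBLE (no precision loss)
  FROM t1;
  SHOW CREATE TABLE t2;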
failed or late ER_PERIOD_FIELD_WRONG_ATTRIBUTES upon attempt to create
existing table
Analysis: The error state is not stored when the field is checked in
Table_period_info::check_field().
Fix: Store the error state by setting res to true.
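A possible reproduction sketch (made-up names; a generated column is
one attribute that Table_period_info::check_field() rejects):

  CREATE TABLE t1 (a INT);
  CREATE TABLE t1 (s DATE, e DATE AS ('2001-01-01') VIRTUAL,
                   PERIOD FOR app (s, e));

The second statement must now fail immediately with
ER_PERIOD_FIELD_WRONG_ATTRIBUTES instead of failing late or not at all.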
Item_func_round::fix_arg_int() did not take into account cases
when the result of ROUND(bigint_subject,negative_precision)
could go outside of the BIGINT range. The old code only incremented
max_length, but did not extend the data type.
Fixing to extend the data type (together with the max_length increment).
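For example (a sketch):

  SELECT ROUND(9223372036854775807, -1);

The mathematically correct result, 9223372036854775810, does not fit
in BIGINT; the result data type is now extended (to a DECIMAL) instead
of overflowing.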
MDEV-22871 refactored the InnoDB buf_pool.page_hash to use a simple
rw-lock implementation that avoids a spinloop between non-contended
read-lock requests, simply using std::atomic::fetch_add() for the
lock acquisition.
Alas, in a write-heavy stress test on a 56-core system with 1,000
concurrent client connections, the server would stop processing
any transactions every now and then. The reason turned out to be
false sharing. Attaching a debugger to the server during one such
hang revealed that 22 of the 1,033 threads were polling in
page_hash_latch::read_lock_wait() on the same object, which appeared
to be in unlocked state (no readers or writers). All 22 requests were
for accessing an undo log page, with a distinct page number.
To eliminate such false sharing, we will make buf_pool.page_hash.array
contain one page_hash_latch per CPU data cache line. On AMD64, this
will grow the array by a factor of 8/7, or almost 15%. For a 50GiB
buffer pool of 16KiB pages, the buf_pool.page_hash.array would
grow from 25MiB to 28.6MiB. On other instruction set architectures,
the incurred memory overhead may be smaller.
Thanks to Vladislav Vaintroub for noticing this anomaly.
The condition in Item_func_round::fix_arg_int() to decide whether:
- we can preserve the data type of args[0] versus
- the result can go outside of the args[0] data type
was wrong.
The data type of the first argument can be preserved in these cases:
- TRUNCATE(x, n)
- ROUND(x, n>=0)
Fixing the condition accordingly.
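For illustration (a sketch): truncation, and rounding with a
non-negative scale, can never make an integer value grow, so the type
of args[0] is safe to preserve; only rounding with a negative scale
can need a wider type:

  SELECT TRUNCATE(9223372036854775807, -1); -- fits: truncation shortens
  SELECT ROUND(9223372036854775807, 1);     -- fits: scale >= 0
  SELECT ROUND(9223372036854775807, -1);    -- may exceed the BIGINT range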
Sorry, this is what should have been pushed to 10.3 instead of
7f4c749d64.
(The push of d63631c3fa
necessitated an extra merge of that fix.
That commit was merged to 10.4 in
da78e952fb327161311a590eb902c5c55da0f2fc.)
On MariaDB 10.4 (commit 4db4b77365),
the query results would not be sorted, which creates random result
differences. Let us explicitly sort the results already in 10.3
in order to avoid future merge trouble.
- Adding optional qualifiers to data types:
CREATE TABLE t1 (a schema.DATE);
Qualifiers now work only for three pre-defined schemas:
mariadb_schema
oracle_schema
maxdb_schema
These schemas are virtual (hard-coded) for now, but may turn into real
databases on disk in the future.
- mariadb_schema.TYPE now always resolves to a true MariaDB data
type TYPE without sql_mode specific translations.
- oracle_schema.DATE translates to MariaDB DATETIME.
- maxdb_schema.TIMESTAMP translates to MariaDB DATETIME.
- Fixing SHOW CREATE TABLE to use a qualifier for a data type TYPE
if the current sql_mode translates TYPE to something else.
The above changes fix the reported problem, so this script:
SET sql_mode=ORACLE;
CREATE TABLE t2 AS SELECT mariadb_date_column FROM t1;
is now replicated as:
SET sql_mode=ORACLE;
CREATE TABLE t2 (mariadb_date_column mariadb_schema.DATE);
and the slave can unambiguously treat DATE as the true MariaDB DATE
without ORACLE specific translation to DATETIME.
Similarly,
SET sql_mode=MAXDB;
CREATE TABLE t2 AS SELECT mariadb_timestamp_column FROM t1;
is now replicated as:
SET sql_mode=MAXDB;
CREATE TABLE t2 (mariadb_timestamp_column mariadb_schema.TIMESTAMP);
so the slave treats TIMESTAMP as the true MariaDB TIMESTAMP
without MAXDB specific translation to DATETIME.
- Include a link to the relevant KB article for more info
- Use spaces around the equal sign for better readability and so that
the examples are more aligned with the general style in the KB
- Load plugins with just the base name; the .so suffix is optional and
redundant
Verified by running before and after:
mariadb --skip-column-names -e "select plugin_name, plugin_status,
plugin_type, plugin_library, plugin_license from
information_schema.all_plugins order by plugin_name, plugin_library"
Nothing but exactly this line changed, so there are no side effects:
-S3 NOT INSTALLED STORAGE ENGINE ha_s3.so GPL
+S3 ACTIVE STORAGE ENGINE ha_s3.so GPL
Also enrich config file with link to KB and unify option syntax and
standard comments.
Fixing ROUND(date,0), TRUNCATE(date,x), FLOOR(date), CEILING(date)
to return the `int(8) unsigned` data type.
Details:
1. Cleanup: moving virtual implementations
- Type_handler_temporal_result::Item_func_int_val_fix_length_and_dec()
- Type_handler_temporal_result::Item_func_round_fix_length_and_dec()
to Type_handler_date_common. Other temporal data type handlers
override these methods anyway. So they were only DATE specific.
This change makes the code clearer.
2. Backporting DTCollation_numeric from 10.5, to reuse the code easier.
3. Adding the `preferred_attrs` argument to Item_func_round::fix_arg_int().
Now the Type_handler_xxx::Item_func_round_val_fix_length_and_dec()
implementations work as follows:
- The INT-alike and YEAR handlers copy preferred_attrs from args[0].
- The DATE handler passes explicit attributes, to get `int(8) unsigned`.
- The hex hybrid handler passes NULL, so fix_arg_int() calculates attributes.
4. Type_handler_date_common::Item_func_int_val_fix_length_and_dec()
now sets the type handler and attributes to get `int(8) unsigned`.
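For example (a sketch):

  CREATE TABLE t1 AS SELECT
    FLOOR(DATE'2020-08-01')    AS f,
    CEILING(DATE'2020-08-01')  AS c,
    ROUND(DATE'2020-08-01', 0) AS r;
  SHOW CREATE TABLE t1; -- f, c and r are now `int(8) unsigned`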
In the case of NATURAL JOIN / USING, mark all fields (one table cannot
be opened in any case, so the optimisation is not worth it).
IMHO, the table should be checked for used fields and filled in after
prepare, when we will have complete information about used fields, but
that is too big a change for a bugfix. It will be made later by Serg's
patch.
The purpose of the InnoDB doublewrite buffer is to make InnoDB
tolerant against cases where the server was killed in the middle
of a page write. (In Linux, killing a process may interrupt a
write system call, typically on a 4096-byte boundary.)
There may exist multiple copies of a page number in the doublewrite
buffer. Recovery should choose the latest valid copy of the page.
By design, the FIL_PAGE_LSN must not precede the latest checkpoint LSN
nor be later than the end of the recovered log.
For page_compressed and encrypted pages, we were missing proper
consistency checks. In the 10.4 data set generated for MDEV-23231,
the data file contained a valid page_compressed page, and an
identical copy of that page was also present in the doublewrite
buffer. But, recovery would incorrectly consider the page invalid
and restore an uncompressed copy of the same page that had been
written before the log checkpoint. (In fact, no redo log was to
be applied to that page.)
buf_dblwr_process(): Validate the FIL_PAGE_LSN in the doublewrite
buffer pages, and always skip page 0, because those pages should
have been recovered by Datafile::restore_from_doublewrite() if
necessary.
Datafile::restore_from_doublewrite(): Choose the latest applicable
page from the doublewrite buffer.
recv_dblwr_t::find_page(): Also validate encrypted or
page_compressed pages.
recv_dblwr_t::validate_page(): New function to validate a page,
either a copy in a data file or in the doublewrite buffer.
Also validate encrypted or page_compressed pages.
This is joint work with Thirunarayanan Balathandayuthapani.
row_vers_impl_x_locked_low(): clust_offsets may point to memory
that is allocated by mem_heap_alloc() and may have been freed.
For initializing clust_offsets, try to use the stack-allocated
buffer instead of a pointer that may point to freed memory.
This fixes a regression that was introduced in
commit f0aa073f2b (MDEV-20950).