This patch provides a new mode FULL_NODUP to binlog_row_image system
variable. With FULL_NODUP mode, all columns are included in before
image, but only updated columns are included in after image for UPDATE.
While all columns are included in the after image for INSERT.
FULL_NODUP is for replacing FULL mode. It includes all data of
the before and after image as FULL mode, but it uses less storage
especially in the case that only a few columns are updated.
Note: It will binlog full before and after image for all modes if the
table has no primary key. FULL_NODUP follows the behavior.
This allows to simplify net_real_read() and net_real_write() a bit.
Removed some superfluous #ifdef/ifndef MYSQL_SERVER from net_serv.cc
The code always runs in server, either normal or embedded.
Dead code for switching socket between blocking and non-blocking modes,
is also removed.
Removed pthread_kill() with alarm signal that woke up main thread on
server shutdown. Used shutdown(2) on polling sockets instead, to the same
effect.
Removed yet another superstitious pthread_kill(), that ran on non-Windows
in terminate_slave_thread().
The reason was that Event e11 was re-executed before
"ALTER EVENT e11 DISABLE" had been executed.
Fixed by increasing re-schedule time
Other things:
- Removed double accounting of 'execution_count'. It was incremented in
top->mark_last_executed(thd) that was executed a few lines earlier.
MDEV-32441 SENT_ROWS shows random wrong values when stored function
is selected.
MDEV-32281 EXAMINED_ROWS is not populated in
information_schema.processlist upon SELECT.
Added ROWS_SENT to information_schema.processlist
This is to have the same information as Percona server (SENT_ROWS)
To ensure that information_schema.processlist has correct values for
sent_rows and examined_rows I introduced two new variables to hold the
total counts so far. This was needed as stored functions and stored
procedures will reset the normal counters to be able to count rows for
each statement individually for slow query log.
Other things:
- Selects with functions shows in processlist the total examined_rows
and sent_rows by the main statement and all functions.
- Stored procedures shows in processlist examined_rows and sent_rows
per stored procedure statement.
- Fixed some double accounting for sent_rows and examined_rows.
- HANDLER operations now also supports send_rows and examined_rows.
- Display sizes for MEMORY_USED, MAX_MEMORY_USED, EXAMINED_ROWS and
QUERY_ID in information_schema.processlist changed to 10 characters.
- EXAMINED_ROWS and SENT_ROWS changed to bigint.
- INSERT RETURNING and DELETE RETURNING now updates SENT_ROWS.
- As thd is always up to date with examined_rows, we do not need
to handle examined row counting for unions or filesort.
- I renamed SORT_INFO::examined_rows to m_examined_rows to ensure that
we don't get bugs in merges that tries to use examined_rows.
- Removed calls of type "thd->set_examined_row_count(0)" as they are
not needed anymore.
- Removed JOIN::join_examined_rows
- Removed not used functions:
THD::set_examined_row_count()
- Made inline some functions that where called for each row.
Compute binlog checksums (when enabled) already when writing events
into the statement or transaction caches, where before it was done
when the caches are copied to the real binlog file. This moves the
checksum computation outside of holding LOCK_log, improving
scalabitily.
At stmt/trx cache write time, the final end_log_pos values are not
known, so with this patch these will be set to 0. Events that are
written directly to the binlog file (not through stmt/trx cache) keep
the correct end_log_pos value. The GTID and COMMIT/XID events at the
start and end of event groups are written directly, so the zero
end_log_pos is only for events in the middle of event groups, which
do not negatively affect replication.
An option --binlog-legacy-event-pos, off by default, is provided to
disable this behavior to provide backwards compatibility with any
external applications that might rely on end_log_pos in events in the
middle of event groups.
Checksums cannot be pre-computed when binlog encryption is enabled, as
encryption relies on correct end_log_pos to provide part of the
nonce/IV.
Checksum pre-computation is also disabled for WSREP/Galera, as it uses
events differently in its write-sets and so on. Extending pre-computation of
checksums to Galera where it makes sense could be added in a future patch.
The current --binlog-checksum configuration is saved in
binlog_cache_data at transaction start and used to pre-compute
checksums in cache, if applicable. When the cache is later copied to
the binlog, a check is made if the saved value still matches the
configured global value; if so, the events are block-copied directly
into the binlog file. If --binlog-checksum was changed during the
transaction, events are re-written to the binlog file one-by-one and
the checksums recomputed/discarded as appropriate.
Reviewed-by: Monty <monty@mariadb.org>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
- Introduced the variable "innodb_truncate_temporary_tablespace_now"
to shrink the temporary tablespace.
Steps for shrinking the temporary tablespace:
1) Find the last used extent in temporary tablespace
by iterating through the BITMAP in extent descriptor pages
2) If the last used extent is lesser than user specified size
then set desired target size to user specified size
3) Store the page contents of "to be modified" extent
descriptor pages, latches the "to be modified" extent
descriptor pages and check for buffer pool memory availability
4) Update the FSP_SIZE and FSP_FREE_LIMIT in header page
5) Remove the "to be truncated" pages from FSP_FREE and
FSP_FREE_FRAG list
6) Reset the bitmap in the last descriptor pages for the
"to be truncated" pages.
7) Clear the freed range in temporary tablespace which
are to be truncated.
8) Evict the "to be truncated" temporary tablespace pages
from LRU list.
9) In case of multiple files, calculate the truncated last
file size and do truncation in last file
10) Commit the mini-transaction for shrinking the tablespace
This is a preparatory commit for pre-computing checksums outside of
holding LOCK_log, no functional changes.
Which checksum algorithm is used (if any) when writing an event does not
belong in the event, it is a property of the log being written to.
Instead decide the checksum algorithm when constructing the
Log_event_writer object, and store it there.
Introduce a client-only Log_event::read_checksum_alg to be able to
print the checksum read, and a
Format_description_log_event::source_checksum_alg which is the
checksum algorithm (if any) to use when reading events from a log.
Also eliminate some redundant `enum` keywords on the enum_binlog_checksum_alg
type.
Reviewed-by: Monty <monty@mariadb.org>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
Like sql_mode, we factor out of ON_CHECK function for export, to be
used in get_options() during server startup, for validation of
--redirect_url value.
Adding a global/session var `redirect_url' of string type. The initial
value is empty. Can be supplied in mysqld with --redirect-url or set
in --init-connect. A valid redirect_url should be of the format
{mysql,mariadb}://host[:port]
where <host> is an arbitrary string not containing colons, and <port>
is a number between 0 and 65535 inclusive.
The variable will be used by the server to notify clients that they
should connect to another server, specified by the value of the
variable, if not empty.
The notification is done by the inclusion of the variable in
session_track_system_variable.
Merge sys_var_charptr with sys_var_charptr_base, as well as merge
Sys_var_session_lexstring into Sys_var_lexstring. Also refactored
update methods of sys_var_charptr accordingly.
Because the class is more generic, session_update() calls
sys_var_charptr::session_update() which does not assume a buffer field
associated with THD, but instead call strdup/free, we get rid of
THD::default_master_connection_buff accordingly. This also makes THD
smaller by ~192 bytes, and there can be many thousands of concurrent
THDs.
New Feature:
============
This patch extends the START SLAVE UNTIL command with options
SQL_BEFORE_GTIDS and SQL_AFTER_GTIDS to allow user control of
whether the replica stops before or after a provided GTID state. Its
syntax is:
START SLAVE UNTIL (SQL_BEFORE_GTIDS|SQL_AFTER_GTIDS)=”<gtid_list>”
When providing SQL_BEFORE_GTIDS=”<gtid_list>”, for each domain
specified in the gtid_list, the replica will execute transactions up
to the GTID found, and immediately stop processing events in that
domain (without executing the transaction of the specified GTID).
Once all domains have stopped, the replica will stop. Events
originating from domains that are not specified in the list are not
replicated.
START SLAVE UNTIL SQL_AFTER_GTIDS=”<gtid_list>” is an alias to the
default behavior of START SLAVE UNTIL master_gtid_pos=”<gtid_list>”.
That is, the replica will only execute transactions originating from
domain ids provided in the list, and will stop once all transactions
provided in the UNTIL list have all been executed.
Example:
=========
If a primary server has a binary log consisting of the following GTIDs:
0-1-1
1-1-1
0-1-2
1-1-2
0-1-3
1-1-3
If a fresh replica (i.e. one with an empty GTID position,
@@gtid_slave_pos='') is started with SQL_BEFORE_GTIDS, i.e.
START SLAVE UNTIL SQL_BEFORE_GTIDS=”1-1-2”
The resulting gtid_slave_pos of the replica will be “1-1-1”.
This is because the replica will execute only events from domain 1
until it sees the transaction with sequence number 2, and
immediately stop without executing it.
If the replica is started with SQL_AFTER_GTIDS, i.e.
START SLAVE UNTIL SQL_AFTER_GTIDS=”1-1-2”
then the resulting gtid_slave_pos of the replica will be “1-1-2”.
This is because it will only execute events from domain 1 until it
has executed the provided GTID.
Reviewed By:
============
Kristian Nielson <knielsen@knielsen-hq.org>
- InnoDB fails to check the overflow buffer while applying
the operation to the table that was rebuilt. This is caused
by commit 3cef4f8f0f (MDEV-515).
The MDEV-29693 conflict resolution is from Monty, as well as is
a bug fix where ANALYZE TABLE wrongly built histograms for
single-column PRIMARY KEY.
Also includes a fix for safe_malloc error reporting.
Other things:
- Copied main.log_slow from 10.4 to avoid mtr issue
Disabled test:
- spider/bugfix.mdev_27239 because we started to get
+Error 1429 Unable to connect to foreign data source: localhost
-Error 1158 Got an error reading communication packets
- main.delayed
- Bug#54332 Deadlock with two connections doing LOCK TABLE+INSERT DELAYED
This part is disabled for now as it fails randomly with different
warnings/errors (no corruption).
The error is caused by MDEV-30165 fix with the following commit:
d13a57ae81
There is logical error in lock_release_on_prepare_try():
if (supremum_bit)
lock_rec_unlock_supremum(*cell, lock);
else
lock_rec_dequeue_from_page(lock, false);
Because there can be other bits set in the lock's bitmap, and the lock
type can be suitable for releasing criteria, but the above logic
releases only supremum bit of the lock.
The fix is to release lock if it suits for releasing criteria and unlock
supremum if supremum is locked otherwise.
Tere is also the test for the case, which was reported by QA team. I
placed it in a separate files, because it requires debug build.
Reviewed by: Marko Mäkelä
- InnoDB fails to report the error when encryption configuration
wasn't passed. This patch addresses the issue by adding
the error while loading the tablespace and deferring the
tablespace creation.
There are many filesystem related errors that can occur with
MariaBackup. These already outputed to stderr with a good description of
the error. Many of these are permission or resource (file descriptor)
limits where the assertion and resulting core crash doesn't offer
developers anything more than the log message. To the user, assertions
and core crashes come across as poor error handling.
As such we return an error and handle this all the way up the stack.
Before starting to go over the format string, prepare the current time
zone information incase '%z' or '%Z' is encountered.
This information can be obtained as given below:
A) If timezone is not set ( meaning we are working with system timezone):
Get the MYSQL_TIME representation for current time and GMT time using
current thread variable for timezone and timezone variable for UTC
respectively. This MYSQL_TIME variable will be used to calculate time
difference. Also convert current time in second to tm structure to
get system timezone information.
B) If timezone is set as offset:
Get timezone information using current timezone information and store
in appropriate variable.
C) If timezone is set as some place (example: Europe/Berlin)
Get timezone information by searching the timezone. During internal
timezone search, information like timeoffset from UTC and abbrevation
is stored in another relevant structure. Hence use the same information.
In commit 5ea5291 @sanja-byelkin for unknown reason switched the file mode
for 3 Galera tzinfo related test files from 644 -> 755. This exists only
from branch 10.6 onward:
$ git checkout 10.5
$ find mysql-test -executable -name *.test -or -executable -name *.result
(no results)
$ git checkout 10.6
$ find mysql-test -executable -name *.test -or -executable -name *.result
mysql-test/suite/galera/t/mysql_tzmysql-test/suite/galera/t/mysql_tzinfo_to_sql.test
mysql-test/suite/galera/t/mariadb_tzinfo_to_sql.test
mysql-test/suite/galera/r/mariadb_tzinfo_to_sql.resultinfo_to_sql.test
mysql-test/suite/galera/t/mariadb_tzinfo_to_sql.test
mysql-test/suite/galera/r/mariadb_tzinfo_to_sql.result
No test file nor test result file should be executable, so run chmod -x
on them.
All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer Amazon Web
Services, Inc.
This allows a user to to change the default value of MAX_SEL_ARGS (16000)
in the rare case where they neeed more generated SEL_ARGS (as part of
the range optimizer)
Raise notes if indexes cannot be used:
- in case of data type or collation mismatch (diferent error messages).
- in case if a table field was replaced to something else
(e.g. Item_func_conv_charset) during a condition rewrite.
Added option to write warnings and notes to the slow query log for
slow queries.
New variables added/changed:
- note_verbosity, with is a set of the following options:
basic - All old notes
unusable_keys - Print warnings about keys that cannot be used
for select, delete or update.
explain - Print unusable_keys warnings for EXPLAIN querys.
The default is 'basic,explain'. This means that for old installations
the only notable new behavior is that one will get notes about
unusable keys when one does an EXPLAIN for a query. One can turn all
of all notes by either setting note_verbosity to "" or setting sql_notes=0.
- log_slow_verbosity has a new option 'warnings'. If this is set
then warnings and notes generated are printed in the slow query log
(up to log_slow_max_warnings times per statement).
- log_slow_max_warnings - Max number of warnings written to
slow query log.
Other things:
- One can now use =ALL for any 'set' variable to set all options at once.
For example using "note_verbosity=ALL" in a config file or
"SET @@note_verbosity=ALL' in SQL.
- mysqldump will in the future use @@note_verbosity=""' instead of
@sql_notes=0 to disable notes.
- Added "enum class Data_type_compatibility" and changing the return type
of all Field::can_optimize*() methods from "bool" to this new data type.
Reviewer & Co-author: Alexander Barkov <bar@mariadb.com>
- The code that prints out the notes comes mainly from Alexander
remove old deprecation helpers that were not used anywhere.
create new deprecation helpers and enforce their usage
this also removes inconsistencies in reporting deprecation:
sometimes it was ER_WARN_DEPRECATED_SYNTAX (1287),
sometimes ER_WARN_DEPRECATED_SYNTAX_NO_REPLACEMENT (1681),
sometimes a warning, sometimes a note.
it should always be
* ER_WARN_DEPRECATED_SYNTAX
* a warning (because it's something actionable, not purely informational)
In particular:
* @@debug
deprecated since 5.5.37
* sr_YU locale
deprecated since 10.0.11
* "engine_condition_pushdown" in the @@optimizer_switch
deprecated since 10.1.1
* @@date_format, @@datetime_format, @@time_format, @@max_tmp_tables
deprecated since 10.1.2
* @@wsrep_causal_reads
deprecated since 10.1.3
* "parser" in mroonga table comment
deprecated since 10.2.11
In `pseudo_slave_mode=1` aka "psedo-slave" mode any prepared XA
transaction disconnects from the user session, as if the user
connection drops. The xid of such transaction remains in the server,
and should the prepared transaction be read-only, it is marked.
The marking makes sure that the following termination of the
read-only transaction ends up with ER_XA_RBROLLBACK.
This did not take place actually for `pseudo_slave_mode=1` read-only.
Fixed with checking the read-only status of a prepared transaction
at time it disconnects from the `pseudo_slave_mode=1` session, to mark
its xid when that's the case.
trx_t::set_skip_lock_inheritance() must be invoked at the very beginning
of lock_release_on_prepare(). Currently trx_t::set_skip_lock_inheritance()
is invoked at the end of lock_release_on_prepare() when lock_sys and trx
are released, and there can be a case when locks on prepare are released,
but "not inherit gap locks" bit has not yet been set, and page split
inherits lock to supremum.
Also reset supremum bit and rebuild waiting queue when XA is prepared.
Reviewed by: Marko Mäkelä