mirror of
https://github.com/MariaDB/server.git
synced 2025-11-19 12:09:41 +01:00
The scenario of the bug is the following. Before killing the server some
transaction A starts undo log writing in some undo segment U of rseg R.
It writes its trx_id into the undo log header. Then new trx_id is assigned
to transaction B, but undo log hasn't been started yet. Then transaction
A commits and writes trx_no into its undo log header. Transaction B
starts writing undo log into the undo segment U. So we have the
following undo logs in the undo segments U:
... undo log 1...
... undo log 2...
...
undo log A, trx_id: L, trx_no: M, ...
undo log B, trx_id: N, trx_no: 0, ...
Where L < N < M.
Then server is killed.
On recovery the maximum trx_no is extracted from each rseg, and the
maximum trx_no among all rsegs plus one is considered as a new value
for server-wide transaction id/no counter.
For each undo segment of each rseg we read the last undo log header. If
the last undo log is committed, then we read trx_no from the header,
otherwise we treat trx_id as trx_no. The maximum trx_no from all undo
log segments of the current rseg is treated as the maximum trx_no of the
rseg.
For the above case the undo log of transaction B is not committed and
its trx_no is 0. So we read trx_id and treat it as trx_no. But M < N. If
U is the last modified undo segment in rseg R, and trx_(id/no) N is the
maximum trx_no among all rsegs, then there can be the case when after
recovery some transaction with trx_no_C, such as N < trx_no_C <= M, is
committed.
During a purging we store trx_no of the last parsed undo log of a
committed transaction in purge_sys.tail.trx_no. So if the last parsed
undo log is the undo log of transaction A(transaction B was rolled back
on recovery and its undo log was also removed from the undo segment U),
then purse_sys.tail.trx_no = M. Than if some other transaction C with
trx_no_C <= M is being committed and purged, we hit
"tail.trx_no <= last_trx_no" assertion failure in
purge_sys_t::choose_next_log(), because purge queue is min-heap of
(trx_no, trx_sys.rseg_array index) pairs, where the key is trx_no, and it
must not be that trx_no of the last parsed undo log of a committed
transaction is greater than the last trx_no of the rseg at the top of
the queue.
The fix is to read the trx_no of the previous to last undo log in undo
segment, if the last undo log in that undo segment is not committed, and
set trx_no=max(trx_id of the last undo log, trx_no of the previous to
last undo log) during recovery.
We can do this because we need to extract the maximum
value of trx_no or trx_id of the undo log segment, and the maximum value
is either trx_id of the last undo log or trx_no of the previous to
last undo log, because undo segment can be assigned only to the one
transaction at time, and undo logs in the undo segment are ordered by
trx_id.
Reviewed by Marko Mäkelä.
|
||
|---|---|---|
| .. | ||
| collections | ||
| include | ||
| lib | ||
| main | ||
| std_data | ||
| suite | ||
| asan.supp | ||
| CMakeLists.txt | ||
| dgcov.pl | ||
| lsan.supp | ||
| mariadb-stress-test.pl | ||
| mariadb-test-run.pl | ||
| mtr.out-of-source | ||
| purify.supp | ||
| README | ||
| README-gcov | ||
| README.stress | ||
| suite.pm | ||
| valgrind.supp | ||
This directory contains test suites for the MariaDB server. To run currently existing test cases, execute ./mysql-test-run in this directory. Some tests are known to fail on some platforms or be otherwise unreliable. In the file collections/smoke_test there is a list of tests that are expected to be stable. In general you do not have to have to do "make install", and you can have a co-existing MariaDB installation, the tests will not conflict with it. To run the tests in a source directory, you must do "make" first. In Red Hat distributions, you should run the script as user "mysql". The user is created with nologin shell, so the best bet is something like # su - # cd /usr/share/mysql-test # su -s /bin/bash mysql -c ./mysql-test-run This will use the installed MariaDB executables, but will run a private copy of the server process (using data files within /usr/share/mysql-test), so you need not start the mysqld service beforehand. You can omit --skip-test-list option if you want to check whether the listed failures occur for you. To clean up afterwards, remove the created "var" subdirectory, e.g. # su -s /bin/bash - mysql -c "rm -rf /usr/share/mysql-test/var" If tests fail on your system, please read the following manual section for instructions on how to report the problem: https://mariadb.com/kb/en/reporting-bugs If you want to use an already running MySQL server for specific tests, use the --extern option to mysql-test-run. Please note that in this mode, you are expected to provide names of the tests to run. For example, here is the command to run the "alias" and "analyze" tests with an external server: # mysql-test-run --extern socket=/tmp/mysql.sock alias analyze To match your setup, you might need to provide other relevant options. With no test names on the command line, mysql-test-run will attempt to execute the default set of tests, which will certainly fail, because many tests cannot run with an external server (they need to control the options with which the server is started, restart the server during execution, etc.) You can create your own test cases. To create a test case, create a new file in the main subdirectory using a text editor. The file should have a .test extension. For example: # xemacs t/test_case_name.test In the file, put a set of SQL statements that create some tables, load test data, and run some queries to manipulate it. Your test should begin by dropping the tables you are going to create and end by dropping them again. This ensures that you can run the test over and over again. If you are using mysqltest commands in your test case, you should create the result file as follows: # mysql-test-run --record test_case_name or # mysqltest --record < t/test_case_name.test If you only have a simple test case consisting of SQL statements and comments, you can create the result file in one of the following ways: # mysql-test-run --record test_case_name # mysql test < t/test_case_name.test > r/test_case_name.result # mysqltest --record --database test --result-file=r/test_case_name.result < t/test_case_name.test When this is done, take a look at r/test_case_name.result. If the result is incorrect, you have found a bug. In this case, you should edit the test result to the correct results so that we can verify that the bug is corrected in future releases. If you want to submit your test case you can send it to developers@lists.mariadb.org or attach it to a bug report on http://mariadb.org/jira/. If the test case is really big or if it contains 'not public' data, then put your .test file and .result file(s) into a tar.gz archive, add a README that explains the problem, ftp the archive to ftp://ftp.mariadb.org/private and submit a report to https://mariadb.org/jira about it. The latest information about mysql-test-run can be found at: https://mariadb.com/kb/en/mariadb/mysqltest/ If you want to create .rdiff files, check https://mariadb.com/kb/en/mariadb/mysql-test-auxiliary-files/