This is an addendum to the fix for MDEV-7026. The ARM memory model is
similar to that of PowerPC and thus needs the same semantics with
respect to memory barriers. That is, os_atomic_test_and_set_*_release()
must be a store with a release barrier followed by a full
barrier. Unlike x86 using __sync_lock_test_and_set() which is
implemented as “exclusive load with acquire barriers + exclusive store”
is insufficient in contexts where os_atomic_test_and_set_*_release()
macros are used.
As man page of open(2) suggested, we should open the same file in the same
mode, to have better performance. For some data files, we will first call
os_file_create_simple_no_error_handling_func() to open them, and then call
os_file_create_func() again. We have to make sure if DIRECT IO is specified,
these two functions should both open file with O_DIRECT.
Reviewed-by: Sunny Bains <sunny.bains@oracle.com>
RB: 8981
Scenario:
1. The purge thread takes an undo log record and parses it and forms
the record to be purged. We have the primary and secondary keys
to locate the actual records.
2. Using the secondary index key, we search in the secondary index.
One record is found.
3. Then it is checked if this record can be purged. The answer is we
can purge this record. To determine this we look up the clustered
index record. Either there is no corresponding clustered index
record, or the matching clustered index record is delete marked.
4. Then we check whether the secondary index record is delete marked.
We find that it is not delete marked. We report warning in optimized
build and assert in debug build.
Problem:
In step 3, we report that the record is purgeable even though it is
not delete marked. This is because of inconsistency between the
following members of purge_node_t structure - found_clust, ref and pcur.
Solution:
In the row_purge_reposition_pcur(), if the persistent cursor restore
fails, then reset the purge_node_t->found_clust member. This will
keep the members of purge_node_t structure in a consistent state.
rb#8813 approved by Marko.
if XA PREPARE transactions hold explicit locks.
innobase_shutdown_for_mysql(): Call trx_sys_close() before lock_sys_close()
(and dict_close()) so that trx_free_prepared() will see all locks intact.
RB: 8561
Reviewed-by: Vasil Dimov <vasil.dimov@oracle.com>
PROBLEM
Create time is calculated as last status change time of .frm file.
The first problem was that innodb was passing file name as
"table_name#po#p0.frm" to the stat() call which calculates the create time.
Since there is no frm file with this name create_time will be stored as NULL.
The second problem is ha_partition::info() updates stats for create time
when HA_STATUS_CONST flag was set ,where as innodb calculates this statistic
when HA_STATUS_TIME is set,which causes create_time to be set as NULL.
Fix
Pass proper .frm name to stat() call and calculate create time when
HA_STATUS_CONST flag is set.
TO USE A SECOND WATCH PAGE PER INSTANCE
Description:
BUF_POOL_WATCH_SIZE is also initialized to number of purge threads.
so BUF_POOL_WATCH_SIZE will never be lesser than number of purge threads.
From the code, there is no scope for purge thread to skip buf_pool_watch_unset.
So there can be at most one buffer pool watch active per purge thread.
In other words, there is no chance for purge thread instance to hold a watch
when setting another watch.
Solution:
Adding code comments to clarify the issue.
Reviewed-by: Marko Mäkelä <marko.makela@oracle.com>
Approved via Bug page.
MYISAM TABLE CAUSES THE SERVER TO CRASH
Issue:
-----
During index maintanence, R-tree node might need a split.
In some cases the square of mbr could be calculated to
infinite (as in this case) or to NaN. This is currently
not handled. This is specific to MyISAM.
SOLUTION:
---------
If the calculated value in "mbr_join_square" is infinite or
NaN, set it to max double value.
Initialization of output parameters of "pick_seeds" is
required if calculation is infinite (or negative infinite).
Similar to the fix made for INNODB as part of Bug#19533996.
RESERVATION AND SIGNAL COUNT
Problem:
Reservation and Signal count value shows negative value for show engine
innodb statement.
Solution:
This is happening due to counter overflow error. Reservation and Signal
count values are defined as unsigned long but these variables are converted to
long while printing it. Change Reservation and Signal count values as unsigned
long datatype while printing it.
Reviewed-by: Marko Mäkelä <marko.makela@oracle.com>
Approved in bug page.
MYISAM TABLE CAUSES THE SERVER TO CRASH
Issue:
-----
During index maintanence, R-tree node might need a split.
In some cases the square of mbr could be calculated to
infinite (as in this case) or to NaN. This is currently
not handled. This is specific to MyISAM.
SOLUTION:
---------
If the calculated value in "mbr_join_square" is infinite or
NaN, set it to max double value.
Initialization of output parameters of "pick_seeds" is
required if calculation is infinite (or negative infinite).
Similar to the fix made for INNODB as part of Bug#19533996.
Analysis: after a red-black-tree lookup we use node withouth
checking did lookup succeed or not. This lead to situation
where NULL-pointer was used.
Fix: Add additional check that found node from red-back-tree
is valid.
Analysis: On master when executing (single/multi) row INSERTs/REPLACEs
InnoDB fallback to old style autoinc locks (table locks)
only if another transaction has already acquired the AUTOINC lock.
Instead on slave as we are executing log_events and sql_command
is not correctly set, InnoDB does not use new style autoinc
locks when it could.
Fix: Use new style autoinc locks also when
thd_sql_command(user_thd) == SQLCOM_END i.e. this is RBR event.
Problem:
This is a coding mistake during error handling. When the specified foreign
key constraint is wrong because of data type mismatch, the resulting
foreign key object will not have valid foreign->id (it will be NULL.)
Solution:
While removing the foreign key object from dictionary cache during error
handling, ensure that foreign->id is not null before using it.
rb#8204 approved by Sunny.
ISSUE:
------
There can be up to MERGEBUFF2 number of sorted merge chunks,
We need enough buffer space for at least one record from
each merge chunks. If estimates are wrong(very low) and we
allocate buffer space for less than MERGEBUFF2, then we will
have issue in merge_buffers, if actual number of rows to be
sorted is bigger than estimate and external filesort is
chosen.
SOLUTION:
---------
Set number of rows to sort to be at least MERGEBUFF2.
causes server crash
Analysis: If wrong data types used on foreign constraint there
was possibility that foreign->id is NULL when incorrect
foreign constraint was removed from the dictionary cache.
Fix: Add guard foreign->id != NULL before trying to lookup
or remove the foreign constraint from dictionary cache.
Tested using user database where problem was repeatable.
Analysis: Purge thread does not have thd and no access to
handlerton.
Fix: If thd does not exists we use sql_print_warning instead
of push_warning_printf.
CRASHES WITH AUTO_INCREMENT COLUMN
Description:- Creating a federated table with AUTO_INCREMENT
column using LIKE clause results in a server crash.
Analysis:- Creating a federated table with AUTO_INCREMENT
column using LIKE clause results in a federated server
crash due to the uninitialized connection structure(mysql).
Also due to unassigned connection string for the remote
server, at the time of preparation of "create_info"
structure, the creation of any federated table using LIKE
clause fails with an error, "ERROR 1 (HY000): server name:
'' doesn't exist!". This bug is not only with
AUTO_INCREMENT but in all creations of federated tables with
LIKE clause.
Fix :- In ha_federated::info(), "mysql->insert_id" assigned
to "stats.auto_increment_value" only when there is an
active connection. This fixes the crash issue. For creating
the federated table with LIKE clause, connection string is
assigned at the time of preparation of "create_info"
structure.
CRASHES ON EVERY START ATTEMPT
Description:
------------
push_warning_printf function is used to print the warning message
to the client. So this function should not invoke while recovering
the server. Moreover current_thd is NULL while starting the server.
Solution:
---------
- Avoiding the warning to be printed while recovery.
This patch already pushed in mysql-5.6.
Problem was that repair() did lock and unlock tables, which leaved already locked tables in wrong state
include/my_check_opt.h:
Added option T_NO_LOCKS to disable locking during repair()
Fixed duplicated bit T_NO_CREATE_RENAME_LSN
mysql-test/suite/rpl/r/myisam_external_lock.result:
Test case for MDEV-6871
mysql-test/suite/rpl/t/myisam_external_lock-slave.opt:
Test case for MDEV-6871
mysql-test/suite/rpl/t/myisam_external_lock.test:
Test case for MDEV-6871
storage/maria/ha_maria.cc:
Don't lock tables during enable_indexes()
Removed some calls to current_thd
storage/myisam/ha_myisam.cc:
Don't lock tables during enable_indexes()
Removed some calls to current_thd
On PPC64 high-loaded server may crash due to assertion failure in InnoDB
rwlocks code.
This happened because load order between "recursive" and "writer_thread"
wasn't properly enforced.
Use traditional statistics estimation by default (innodb-stats-traditional=true).
There could be performance regression for customers if there is a lot of
open table operations.
innodb_stats_sample_pages
Analysis: If you set the number of analyzed pages
to very low number compared to actual pages on
that table/index it randomly pics those pages
(default 8 pages), this leads to fact that query
after analyze table returns different results. If
the index tree is small, smaller than 10 *
n_sample_pages + total_external_size, then the
estimate is ok. For bigger index trees it is
common that we do not see any borders between
key values in the few pages we pick. But still
there may be n_sample_pages different key values,
or even more. And it just tries to
approximate to n_sample_pages (8).
Fix: (1) Introduced new dynamic configuration variable
innodb_stats_sample_traditional that retains
the current design. Default false.
(2) If traditional sample is not used we use
n_sample_pages = max(min(srv_stats_sample_pages,
index->stat_index_size),
log2(index->stat_index_size)*
srv_stats_sample_pages);
(3) Introduced new dynamic configuration variable
stat_modified_counter (default = 0) if set
sets lower bound for row updates when statistics is re-estimated.
If user has provided upper bound for how many rows needs to be updated
before we calculate new statistics we use minimum of provided value
and 1/16 of table every 16th round. If no upper bound is provided
(srv_stats_modified_counter = 0, default) then calculate new statistics
if 1 / 16 of table has been modified
since the last time a statistics batch was run.
We calculate statistics at most every 16th round, since we may have
a counter table which is very small and updated very often.
@param t table
@return true if the table has changed too much and stats need to be
recalculated
*/
#define DICT_TABLE_CHANGED_TOO_MUCH(t) \
((ib_int64_t) (t)->stat_modified_counter > (srv_stats_modified_counter ? \
ut_min(srv_stats_modified_counter, (16 + (t)->stat_n_rows / 16)) : \
16 + (t)->stat_n_rows / 16))
The bug was that full memory barrier was missing in the code that ensures that
a waiter on an InnoDB mutex will not go to sleep unless it is guaranteed to be
woken up again by another thread currently holding the mutex. This made
possible a race where a thread could get stuck waiting for a mutex that is in
fact no longer locked. If that thread was also holding other critical locks,
this could stall the entire server. There is an error monitor thread than can
break the stall, it runs about once per second. But if the error monitor
thread itself got stuck or was not running, then the entire server could hang
infinitely.
This was introduced on i386/amd64 platforms in 5.5.40 and 10.0.13 by an
incorrect patch that tried to fix the similar problem for PowerPC.
This commit reverts the incorrect PowerPC patch, and instead implements a fix
for PowerPC that does not change i386/amd64 behaviour, making PowerPC work
similarly to i386/amd64.