Commit graph

179879 commits

Author SHA1 Message Date
Sergey Vojtovich
b04f2a0f01 MDEV-14529 - InnoDB rw-locks: optimize memory barriers
Relax memory barrier for lock_word.

rw_lock_lock_word_decr() - used to acquire rw-lock, thus we only need to issue
ACQUIRE when we succeed locking.

rw_lock_x_lock_func_nowait() - same as above, but used to attempt to acquire
X-lock.

rw_lock_s_unlock_func() - used to release S-lock, RELEASE is what we need here.

rw_lock_x_unlock_func() - used to release X-lock. Ideally we'd need only RELEASE
here, but because waiters must be loaded after lock_word is
stored, we have to issue both ACQUIRE and RELEASE.

rw_lock_sx_unlock_func() - same as above, but used to release SX-lock.

rw_lock_s_lock_spin(), rw_lock_x_lock_func(), rw_lock_sx_lock_func() -
fetch-and-store to waiters has to issue only ACQUIRE memory barrier, so that
waiters are stored before lock_word is loaded.

Note that there is a violation of the RELEASE-ACQUIRE protocol here, because we do
on lock:

  my_atomic_fas32_explicit((int32*) &lock->waiters, 1, MY_MEMORY_ORDER_ACQUIRE);
  my_atomic_load32_explicit(&lock->lock_word, MY_MEMORY_ORDER_RELAXED);

on unlock:

  my_atomic_add32_explicit(&lock->lock_word, X_LOCK_DECR, MY_MEMORY_ORDER_ACQ_REL);
  my_atomic_load32_explicit((int32*) &lock->waiters, MY_MEMORY_ORDER_RELAXED);

That is, we effectively synchronize ACQUIRE on lock_word with ACQUIRE on waiters.
This was already the case before this patch. A simple fix may have a negative
performance impact. A proper fix requires refactoring of lock_word.
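
The barrier choice described above for taking and releasing lock_word can be
illustrated with a minimal C++11 sketch. It uses std::atomic instead of the
my_atomic wrappers, and toy_lock, TOY_UNLOCKED, toy_try_lock() and toy_unlock()
are hypothetical names chosen for illustration, not InnoDB code. It does not
model waiters or the ACQ_REL case of the X-lock release.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Assumed value of a free lock in this toy model (not the InnoDB constant).
constexpr int32_t TOY_UNLOCKED = 1;

// Hypothetical stand-in for an rw-lock that only has a lock_word.
struct toy_lock {
  std::atomic<int32_t> lock_word{TOY_UNLOCKED};
};

// Try to take the lock by moving lock_word from TOY_UNLOCKED to 0.
// ACQUIRE is requested only for the successful exchange; a failed
// attempt is just a RELAXED load.
inline bool toy_try_lock(toy_lock &l) {
  int32_t expected = TOY_UNLOCKED;
  return l.lock_word.compare_exchange_strong(
      expected, 0, std::memory_order_acquire, std::memory_order_relaxed);
}

// Release the lock. RELEASE makes the writes done inside the critical
// section visible to the next thread that acquires the lock.
inline void toy_unlock(toy_lock &l) {
  l.lock_word.fetch_add(TOY_UNLOCKED, std::memory_order_release);
}
```

A caller would pair toy_try_lock() with toy_unlock() around the protected
data; a second toy_try_lock() on a held lock simply returns false.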
2017-12-08 17:55:41 +04:00
Sergey Vojtovich
51bb18f989 MDEV-14529 - InnoDB rw-locks: optimize memory barriers
Relax memory barrier for waiters: these 2 stores must be completed before
os_event_set() finishes. This is guaranteed by RELEASE barrier issued by
mutex.exit() of os_event_set().
2017-12-08 17:55:41 +04:00
Sergey Vojtovich
5b624f00fc MDEV-14529 - InnoDB rw-locks: optimize memory barriers
Remove volatile modifier from waiters: it is not meant for inter-thread
communication; use appropriate atomic operations instead.

Changed waiters to int32_t, a my_atomic-friendly type.
2017-12-08 17:55:41 +04:00
Sergey Vojtovich
57d20f1132 MDEV-14529 - InnoDB rw-locks: optimize memory barriers
Remove volatile modifier from lock_word: it is not meant for inter-thread
communication; use appropriate atomic operations instead.
2017-12-08 17:55:41 +04:00
Sergey Vojtovich
c73e77da0f MDEV-14529 - InnoDB rw-locks: optimize memory barriers
Change lock_word from lint to int32_t: the latter is a my_atomic_*-friendly type.
2017-12-08 17:55:41 +04:00
Monty
db715ff392 Add extra mutex check to wsrep_aborting_thd_enqueue 2017-12-08 11:38:22 +02:00
Monty
68cd543539 Search for galera libraries also in /usr/lib64/galera-3
This is where Codership's official RPMs put them
2017-12-08 11:38:22 +02:00
Monty
c4581735d0 Cleanups
- Remove unused thd_rpl_is_parallel()
- Remove unused mysql_notify_thread_having_shared_lock()
- Remove unneeded LOCK_thread_count from MYSQL_BIN_LOG::reset_logs()
  - LOCK_thread_count is not protecting against rollback, so this
    code and comment are not needed
- Remove mutex_locks in slave.cc that are not needed.
  Added THD::assert_not_linked() to ensure that it was safe to remove
- Fixed non-repeatable test load_data_stmt_view
- Updated binlog_killed to test removal of mutex
  (thanks to Andrei Elkin for test)
- More code comments
2017-12-08 11:38:22 +02:00
Varun Gupta
6d63a03490 MDEV-11297: Add support for LIMIT clause in GROUP_CONCAT() 2017-12-08 12:21:26 +05:30
Marko Mäkelä
3aa618a969 MDEV-13820 trx_id_check() fails during row_log_table_apply()
When logging ROW_T_INSERT or ROW_T_UPDATE records, we did not normalize
the DB_TRX_ID of the current transaction into 0 if the current transaction
had started (modifying other tables) before the ALTER TABLE started.

MDEV-13654 introduced this normalization for ROW_T_DELETE
and for all operations with ADD PRIMARY KEY, in row_log_table_get_pk().
2017-12-07 14:35:32 +02:00
Marko Mäkelä
4ea5b126c5 Merge bb-10.2-ext into 10.3 2017-12-07 08:21:00 +02:00
Marko Mäkelä
bce4065129 Merge 10.2 into bb-10.2-ext 2017-12-07 08:18:43 +02:00
Marko Mäkelä
51c73a431f Merge 10.1 into 10.2 2017-12-07 08:17:50 +02:00
Marko Mäkelä
447931c6ab Post-fix for MDEV-14587
dict_stats_process_entry_from_defrag_pool(): Release the mutex
2017-12-07 08:14:49 +02:00
Marko Mäkelä
976f6fb1b6 Merge bb-10.2-ext into 10.3 2017-12-06 19:36:33 +02:00
Marko Mäkelä
ce07676502 Merge 10.2 into bb-10.2-ext 2017-12-06 19:34:03 +02:00
Marko Mäkelä
77fb7ccba4 Follow-up fix to MDEV-13201 Assertion srv_undo_sources || ... failed on shutdown during DDL operation
Introduce the debug flag trx_t::persistent_stats to suppress the
assertion for the updates of persistent statistics during fast
shutdown.

dict_stats_exec_sql(): Do execute the statement even though shutdown
has been initiated.
2017-12-06 18:52:28 +02:00
Marko Mäkelä
7dc6066dea MDEV-14511 Use fewer transactions for updating InnoDB persistent statistics
dict_stats_exec_sql(): Expect the caller to always provide a transaction.
Remove some redundant assertions. The caller must hold dict_sys->mutex,
but holding dict_operation_lock is only necessary for accessing
data dictionary tables, which we are not accessing.

dict_stats_save_index_stat(): Acquire dict_sys->mutex
for invoking dict_stats_exec_sql().

dict_stats_save(), dict_stats_update_for_index(), dict_stats_update(),
dict_stats_drop_index(), dict_stats_delete_from_table_stats(),
dict_stats_delete_from_index_stats(), dict_stats_drop_table(),
dict_stats_rename_in_table_stats(), dict_stats_rename_in_index_stats(),
dict_stats_rename_table(): Use a single caller-provided
transaction that is started and committed or rolled back by the caller.

dict_stats_process_entry_from_recalc_pool(): Let the caller provide
a transaction object.

ha_innobase::open(): Pass a transaction to dict_stats_init().

ha_innobase::create(), ha_innobase::discard_or_import_tablespace():
Pass a transaction to dict_stats_update().

ha_innobase::rename_table(): Pass a transaction to
dict_stats_rename_table(). We do not use the same transaction
as the one that updated the data dictionary tables, because
we already released the dict_operation_lock. (FIXME: there is
a race condition; a lock wait on SYS_* tables could occur
in another DDL transaction until the data dictionary transaction
is committed.)

ha_innobase::info_low(): Pass a transaction to dict_stats_update()
when calculating persistent statistics.

alter_stats_norebuild(), alter_stats_rebuild(): Update the
persistent statistics as well. In this way, a single transaction
will be used for updating the statistics of a whole table, even
for partitioned tables.

ha_innobase::commit_inplace_alter_table(): Drop statistics for
all partitions when adding or dropping virtual columns, so that
the statistics will be recalculated on the next handler::open().
This is a refactored version of Oracle Bug#22469660 fix.

RecLock::add_to_waitq(), lock_table_enqueue_waiting():
Do not allow a lock wait to occur for updating statistics
in a data dictionary transaction, such as DROP TABLE. Instead,
return the previously unused error code DB_QUE_THR_SUSPENDED.

row_merge_lock_table(), row_mysql_lock_table(): Remove dead code
for handling DB_QUE_THR_SUSPENDED.

row_drop_table_for_mysql(), row_truncate_table_for_mysql():
Drop the statistics as part of the data dictionary transaction.
After TRUNCATE TABLE, the statistics will be recalculated on
subsequent ha_innobase::open(), similar to how the logic after
the above-mentioned Oracle Bug#22469660 fix in
ha_innobase::commit_inplace_alter_table() works.

btr_defragment_thread(): Use a single transaction object for
updating defragmentation statistics.

dict_stats_save_defrag_stats(),
dict_stats_process_entry_from_defrag_pool(),
dict_defrag_process_entries_from_defrag_pool(),
dict_stats_save_defrag_summary():
Add a parameter for the transaction.

dict_stats_empty_table(): Make public. This will be called by
row_truncate_table_for_mysql() after dropping persistent statistics,
to clear the memory-based statistics as well.
2017-12-06 18:52:28 +02:00
Sergei Petrunia
2c1e4d4d7a MDEV-14563: Wrong query plan for query with no PK
Part #2: Don't use the new code for the clustered PK; it is handled
in a special way right above.
2017-12-06 12:35:17 +03:00
Sergei Petrunia
a6254e5e7d MDEV-14563: Wrong query plan for query with no PK
TABLE_SHARE::init_from_binary_frm_image() calls handler_file->index_flags()
before it has set TABLE_SHARE::primary_key (it is 0 while it should be
MAX_KEY in my example).
This causes MyRocks to report wrong index flags (it thinks it's a PK while
it is not), which causes invalid query plans later on.

Do the only thing that seems feasible: adjust field->part_of_key to have
the correct value in ha_rocksdb::open.
2017-12-06 12:35:17 +03:00
Sergei Petrunia
c3803914c5 MDEV-14433: RocksDB may show empty or incorrect output with rocksdb_strict_collation_check=off
Part#1: Set field->part_of_key correctly for PK fields.
2017-12-06 12:35:17 +03:00
Marko Mäkelä
f1f2b7742f MDEV-13626 Merge InnoDB test cases from MySQL 5.7 (part 4) 2017-12-06 10:40:58 +02:00
Marko Mäkelä
afe6aef5ff Adjust the test innodb.virtual_stats and rename to gcol.innodb_virtual_stats 2017-12-06 10:37:08 +02:00
Marko Mäkelä
b1cd5ca2af Import innodb.virtual_stats from MySQL 5.7 2017-12-06 10:35:09 +02:00
Marko Mäkelä
e9bc0f75ef MDEV-5834 cleanup: Inline two tiny functions 2017-12-06 10:32:24 +02:00
Marko Mäkelä
1d526f31fb Merge 10.1 into 10.2 2017-12-05 14:23:57 +02:00
Marko Mäkelä
63cbb98275 MDEV-14587 dict_stats_process_entry_from_defrag_pool() fails to call dict_table_close() when index==NULL
dict_stats_process_entry_from_defrag_pool(): Simplify the logic,
and always call dict_table_close() when dict_table_open() returned
a non-NULL handle.
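
The rule "always close what was successfully opened" can be sketched in C++
with a small scope guard. This is an illustration of the idea, not the code in
the commit: toy_table, toy_table_open(), toy_table_close(), toy_table_guard and
toy_process_entry() are hypothetical names, and the real functions take
different arguments.

```cpp
#include <cassert>

// Hypothetical table handle that counts outstanding opens.
struct toy_table {
  int open_count = 0;
  bool has_index = false;
};

// Hypothetical open/close pair; open returns NULL when given NULL.
inline toy_table *toy_table_open(toy_table *t) {
  if (t) ++t->open_count;
  return t;
}
inline void toy_table_close(toy_table *t) { --t->open_count; }

// Guard that closes the handle on every exit path, but only if the
// open returned a non-NULL handle.
class toy_table_guard {
 public:
  explicit toy_table_guard(toy_table *t) : t_(toy_table_open(t)) {}
  ~toy_table_guard() { if (t_) toy_table_close(t_); }
  toy_table_guard(const toy_table_guard &) = delete;
  toy_table_guard &operator=(const toy_table_guard &) = delete;
  toy_table *get() const { return t_; }
 private:
  toy_table *t_;
};

// Returns true when the entry was processed. The early return for a
// missing index no longer leaks the open handle.
inline bool toy_process_entry(toy_table *t) {
  toy_table_guard guard(t);
  if (!guard.get()) return false;             // open failed: nothing to close
  if (!guard.get()->has_index) return false;  // index==NULL case: still closed
  return true;
}
```

After any call to toy_process_entry(), open_count is back to its previous
value, whichever branch was taken.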
2017-12-05 13:25:09 +02:00
Heinz Wiesinger
a34b976d8e Add "leaves" algorithm to oqgraph.
This algorithm returns all reachable leaf nodes from a given origin,
or all root nodes that can reach a given destination.
2017-12-05 13:11:02 +02:00
Marko Mäkelä
d1ab89037a MDEV-13670/MDEV-14550 Error log flood : "InnoDB: page_cleaner: 1000ms intended loop took N ms. The settings might not be optimal."
Silence the error log output that was introduced in MySQL 5.7
(MariaDB 10.2.2) if log_warnings=2 or less.

We should still figure out what these messages really indicate
and how to solve the problems.

pc_sleep_if_needed(): Add a parameter for the current time,
so that there will be fewer successive calls to ut_time_ms()
with no I/O between them.

buf_flush_page_cleaner_coordinator(): Exit the first loop
whenever shutdown has been requested. At the start of the loop,
call ut_time_ms() only once. Do not display the message if
log_warnings=2 or less.
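
A rough C++ sketch of the two ideas above: pass an already sampled timestamp
into the sleep calculation, and emit the message only above log_warnings=2.
The names toy_sleep_ms_needed() and toy_should_warn() are hypothetical, and
the arithmetic is an assumption about what such a helper might compute, not
the InnoDB implementation.

```cpp
#include <cassert>
#include <cstdint>

// How long to sleep until next_loop_time_ms, given a timestamp that the
// caller sampled once at the start of the loop. No clock call is made here.
inline uint64_t toy_sleep_ms_needed(uint64_t next_loop_time_ms,
                                    uint64_t now_ms) {
  return next_loop_time_ms > now_ms ? next_loop_time_ms - now_ms : 0;
}

// The "intended loop took N ms" message is silenced when log_warnings
// is 2 or less.
inline bool toy_should_warn(unsigned log_warnings) {
  return log_warnings > 2;
}
```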
2017-12-05 12:58:09 +02:00
Vesa Pentti
5868a184fa Revert "MDEV-12501 -- set --maturity-level by default"
This reverts commit 1af2d7ba23.
2017-12-05 08:49:28 +00:00
Vesa Pentti
1af2d7ba23 MDEV-12501 -- set --maturity-level by default
* Note: breaking change; since this commit, a plugin that has
    worked so far might get rejected due to plugin maturity
  * mariabackup is not affected (allows all plugins)
  * VERSION file defines SERVER_MATURITY, which defines the
    corresponding numeric value as SERVER_MATURITY_LEVEL in
    include/mysql_version.h
  * The default value for 'plugin_maturity' is SERVER_MATURITY_LEVEL - 1
  * Logs a warning if a plugin has maturity lower than
    SERVER_MATURITY_LEVEL
  * Tests suppress the plugin maturity warning
  * Tests use --plugin-maturity=unknown by default so as not to fail
    due to the stricter plugin maturity handling
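
The maturity rules listed above can be sketched in C++ as follows. The enum
ordering (unknown lowest, stable highest) and all toy_* names are assumptions
made for illustration; clamping the default at "unknown" is also an assumption
that the commit message does not state.

```cpp
#include <cassert>

// Assumed ordering of maturity levels, lowest first.
enum toy_maturity {
  TOY_UNKNOWN = 0,
  TOY_EXPERIMENTAL,
  TOY_ALPHA,
  TOY_BETA,
  TOY_GAMMA,
  TOY_STABLE
};

// Default threshold: one level below the server's own maturity.
inline int toy_default_plugin_maturity(int server_level) {
  return server_level > TOY_UNKNOWN ? server_level - 1 : TOY_UNKNOWN;
}

// A plugin is accepted when it is at least as mature as the threshold.
inline bool toy_plugin_accepted(int plugin_level, int threshold) {
  return plugin_level >= threshold;
}

// A warning is logged when the plugin is less mature than the server.
inline bool toy_plugin_warns(int plugin_level, int server_level) {
  return plugin_level < server_level;
}
```

With a stable server, a gamma plugin would be accepted by default but would
still produce a warning, while a beta plugin would be rejected.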
2017-12-04 21:12:35 +00:00
Marko Mäkelä
8be7548085 Follow-up to MDEV-12698: Adjust some comments
The function dict_stats_update_if_needed() replaced
row_update_statistics_if_needed(). Adjust the comments accordingly.
2017-12-04 13:43:02 +02:00
Varun Gupta
60c446584c MDEV-7773: Aggregate stored functions
This commit implements aggregate stored functions. The basic idea behind
the feature is:

* Implement a special instruction FETCH GROUP NEXT ROW that will pause
the execution of the stored function. When the instruction is reached,
execution of the initial query resumes "as if" the function returned.
This gives the server the opportunity to advance to the next row in the
result set.

* Stored aggregates behave like regular aggregate functions. The
implementation thus resides in the class Item_sum_sp. Because it is
an aggregate function, for each new row in the group, the
Item_sum_sp::add() method will be called. This is when execution resumes
and the function does another iteration to "add" one extra element to
the final result.

* When the end of group is reached, val_xxx() method will be called for
the item. This case is handled by another execute step for the stored
function, only with a special flag to force a call to the return
handler. See Item_sum_sp::execute() for details.

To allow these pause-and-resume semantics, we must preserve the function
context across executions. This is stored in Item_sp::sp_query_arena only for
aggregate stored functions, but has no impact for regular functions.

We also require aggregate functions to include the "FETCH GROUP NEXT ROW"
instruction.
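
The calling pattern described above (state kept across rows, one add() per row,
val_xxx() at the end of the group) can be modelled with a toy C++ class. This
is not Item_sum_sp: toy_stored_aggregate and toy_aggregate_group() are
hypothetical names, and a plain sum stands in for the body of the stored
function.

```cpp
#include <cassert>
#include <cstddef>

// Toy aggregate whose state survives between rows, standing in for the
// preserved function context of an aggregate stored function.
class toy_stored_aggregate {
 public:
  void clear() { sum_ = 0; }                       // start of a new group
  void add(long row_value) { sum_ += row_value; }  // one call per row
  long val() const { return sum_; }                // end of group: return
 private:
  long sum_ = 0;
};

// Drives the aggregate the way a query executor would for one group.
inline long toy_aggregate_group(const long *rows, std::size_t n) {
  toy_stored_aggregate agg;
  agg.clear();
  for (std::size_t i = 0; i < n; ++i) {
    agg.add(rows[i]);  // corresponds to resuming until the next row fetch
  }
  return agg.val();
}
```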

Signed-off-by: Vicențiu Ciorbaru <vicentiu@mariadb.org>
2017-12-04 13:22:29 +02:00
Vicențiu Ciorbaru
7448b01bb5 Remove the side effect of setting m_sp from Item_sp::init_result_field
Item_sp::init_result_field no longer takes sp_head* parameter. It
expects the m_sp member to be already set to something valid.
2017-12-04 13:22:29 +02:00
Varun Gupta
c12d1ed48e Refactor parts of Item_func_sp into Item_sp
In preparation for implementing custom aggregate functions, refactor
the common code between regular stored functions and aggregate stored
functions. This includes:

* initialising SP result field
* executing a SP
* access checks

In addition, refactor sp_head::execute_function to take two extra
parameters, a function rcontext and a Query_arena. These two parameters
were initially initialised and destroyed within
sp_head::execute_function, but for aggregate stored functions we will
require control over their lifetime. The owner of these objects now
becomes Item_sp.

Signed-off-by: Vicențiu Ciorbaru <vicentiu@mariadb.org>
2017-12-04 13:22:29 +02:00
Marko Mäkelä
bd8fd3b7c3 Remove references to UNIV_SYNC_DEBUG which was merged with UNIV_DEBUG 2017-12-04 11:48:12 +02:00
Marko Mäkelä
b213f57dc3 Follow-up to MDEV-12288: Avoid mutex acquisition in trx_rw_is_active(0)
Suggested-by: Sergey Vojtovich <svoj@mariadb.org>
2017-12-04 11:29:00 +02:00
Marko Mäkelä
751ad74491 Silence -Wimplicit-fallthrough 2017-12-04 11:28:10 +02:00
Monty
d7b0b8ddac MDEV-10688 rpl.rpl_row_log_innodb failed in buildbot
The problem was that Binlog_checkpoint can happen at random times.
Fixed by not writing binlog_checkpoint for the rpl_log test.

Other things:
- Removed unused variable "$keep_gtid_events"
- Added option for show_binlog_events to skip binlog_checkpoint
2017-12-03 15:21:53 +02:00
Monty
60df17e95a Remove compiler warnings 2017-12-03 13:58:36 +02:00
Monty
52ca07c2a0 Add direct join support for Spider
Includes Spider patches
- 062_mariadb-10.2.0.direct_join_1and3.diff
- 063_mariadb-10.2.0.direct_join_for_single_partition.diff
- Test cases from Kentoku

Allows Spider to push full joins to the Spider engine through the
create_group_by interface.

Other things:
- Increased MYSQL_VERSION_ID to check for 10211 (latest 10.2 version)
- Fix for const_table at calling create_group_by().

Original author: Kentoku SHIBA
2017-12-03 13:58:36 +02:00
Jacob Mathew
bfaf2d6e35 Changes to fix 64-bit Windows build errors and warnings.
Signed-off-by: Monty <monty@mariadb.org>
2017-12-03 13:58:36 +02:00
Kentoku SHIBA
207594afac merge Spider 3.3.13
New features in 3.3.13 are:
- Join Push Down for 1 by 1 table and single partition.
2017-12-03 13:58:36 +02:00
Kentoku SHIBA
e53ef202bd Adding direct update/delete to the server and to the partition engine.
Add support for direct update and direct delete requests for spider.
A direct update/delete request handles all qualified rows in a single
operation rather than one row at a time.

Contains Spiral patches:
006_mariadb-10.2.0.direct_update_rows.diff      MDEV-7704
008_mariadb-10.2.0.partition_direct_update.diff MDEV-7706
010_mariadb-10.2.0.direct_update_rows2.diff     MDEV-7708
011_mariadb-10.2.0.aggregate.diff               MDEV-7709
027_mariadb-10.2.0.force_bulk_update.diff       MDEV-7724
061_mariadb-10.2.0.mariadb-10.1.8.diff          MDEV-12870

- The differences compared to the original patches:
  - Most of the parameters of the new functions are unnecessary.  The
    unnecessary parameters have been removed.
  - Changed bit positions for new handler flags upon consideration of
    handler flags not needed by other Spiral patches and handler flags
    merged from MySQL.
  - Added info_push() (Was originally part of bulk access patch)
  - Didn't include code related to handler socket
  - Added HA_CAN_DIRECT_UPDATE_AND_DELETE

Original author: Kentoku SHIBA
First reviewer:  Jacob Mathew
Second reviewer: Michael Widenius
2017-12-03 13:58:36 +02:00
Michael Widenius
d1e4ecec07 Spider fix: Stop searching next when m_top_entry=NO_CURRENT_PART_ID 2017-12-03 13:58:36 +02:00
Michael Widenius
352feb49c4 Cleanups for ha_partition.cc
- Ensure that var= doesn't have a space before =
- Fixed DBUG_PRINT to use %u for unsigned types
- Use "enter" when printing function arguments
- Fixed typos
- Added some extra DBUG_PRINT
- Removed not needed assignment
2017-12-03 13:58:36 +02:00
Monty
b016e1ba7f MDEV-7702 Spiral patch 004_mariadb-10.0.15.slave-trx-retry.diff
This is about adding more options to force slave retries

Two new variables have been added:
slave_transaction_retry_errors
- Tells the slave thread to retry a transaction for replication when a
  query event returns an error from the provided list. Deadlock and
  elapsed lock wait timeout errors are automatically added to this list
slave_transaction_retry_interval
- Interval at which the slave SQL thread will retry a transaction
  in case it failed with a deadlock, an elapsed lock wait
  timeout, or an error listed in slave_transaction_retry_errors

Other changes:
- Simplified code for slave_skip_errors (to be aligned with
  slave_transaction_retry_errors)
- Renamed print_slave_skip_errors() to make_slave_skip_errors_printable()
- Remove printing error from init_slave_skip_errors as my_bitmap_init()
  will do that if needed.
- Generalize has_temporary_error()
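
The retry behaviour of the two new variables can be sketched in C++ as a loop
over a set of retryable error codes. This is a simplified model, not the
replication code: toy_retry_policy and toy_run_with_retries() are hypothetical,
the retry limit is an assumed extra parameter, and the wait for the retry
interval is only indicated by a comment.

```cpp
#include <cassert>
#include <set>

// Hypothetical policy: which error codes are worth retrying, and how
// many retries are allowed (the limit is an assumption of this sketch).
struct toy_retry_policy {
  std::set<int> retry_errors;
  unsigned max_retries;
};

// Calls attempt() until it returns 0, returns an error that is not in
// the retry list, or the retry limit is reached. Returns the last error.
template <typename Attempt>
int toy_run_with_retries(const toy_retry_policy &p, Attempt attempt) {
  int err = attempt();
  for (unsigned tries = 0;
       err != 0 && p.retry_errors.count(err) != 0 && tries < p.max_retries;
       ++tries) {
    // A real server would sleep for the retry interval here.
    err = attempt();
  }
  return err;
}
```

For example, a policy built as toy_retry_policy{{1213, 1205}, 3} would retry
on the MySQL deadlock (1213) and lock wait timeout (1205) error codes and give
up immediately on anything else.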
2017-12-03 13:58:35 +02:00
Monty
3907ff2d24 Fix index scan cleanup in the partition engine.
Spiral Patch 057: 057_mariadb-10.2.0.partition_index_end.diff MDEV-12999

Original author: Kentoku SHIBA
First reviewer:  Jacob Mathew
Second reviewer: Michael Widenius
2017-12-03 13:58:35 +02:00
Monty
f26e14e2d8 Adding option to tell that cmp_ref handler call is expensive
- In Spider, calling cmp_ref() can be very expensive. In ha_partition.cc
  we no longer sort rows according to position for the Spider
  engine.
- Removed Spider specific call info(HA_EXTRA_STARTING_ORDERED_INDEX_SCAN)
  from handle_ordered_index_scan(). It caused performance issues and
  does not change results for queries with ORDER BY.
- The visible effect of this patch is that for some storage engines,
  rows may be returned in a different order if there is no ORDER BY clause.

- Based on Spiral Patch 052:
  052_mariadb-10.2.0.add_partition_skip_pk_sort_for_non_clustered_index
  MDEV-7748
- The major difference from the original patch is that there is no variable to
  restore the old behaviour.

Other things:
- Optimized ha_partition::cmp_ref() and cmp_part_ids() to make them
  simpler and faster.
- Changed arguments to cmp_key_part_id() to be the same as
  cmp_key_rowid_part_id to simplify code.

Original author: Kentoku SHIBA
First reviewer:  Jacob Mathew
Second reviewer: Michael Widenius
2017-12-03 13:58:35 +02:00
Monty
dc17ac1638 Adding support for auto_increment in the partition engine.
Contains Spiral patches:
022_mariadb-10.2.0.auto_increment.diff               MDEV-7720
030_mariadb-10.2.0.partition_auto_inc_init.diff      MDEV-7726

These patches have the following differences compared to the original
patches:
- Added the new #defines for the feature in spd_environ.h instead of in
  handler.h because these #defines are needed by Spider and are not needed
  by the server.
- Cleaned up code related to the removed variable m_need_info_for_auto_inc.
- Changed variable assignment in lock_auto_increment() and
  unlock_auto_increment() so that the assignments are done under locks.
- Added a test case.
- Added test result changes resulting from a bug that was fixed by these
  patches.

Original author: Kentoku SHIBA
First reviewer:  Jacob Mathew
Second reviewer: Michael Widenius
2017-12-03 13:58:35 +02:00