The MDEV-29693 conflict resolution is from Monty, as well as is
a bug fix where ANALYZE TABLE wrongly built histograms for
single-column PRIMARY KEY.
Also includes a fix for safe_malloc error reporting.
Other things:
- Copied main.log_slow from 10.4 to avoid mtr issue
Disabled test:
- spider/bugfix.mdev_27239 because we started to get
+Error 1429 Unable to connect to foreign data source: localhost
-Error 1158 Got an error reading communication packets
- main.delayed
- Bug#54332 Deadlock with two connections doing LOCK TABLE+INSERT DELAYED
This part is disabled for now as it fails randomly with different
warnings/errors (no corruption).
The system variable spider_disable_group_by_handler, if on, will
disable the spider group by handler (gbh), and such disablement serves
as workaround to bugs caused by gbh, labelled with spider-gbh on jira,
including MDEV-26247, MDEV-28998, MDEV-29163, MDEV-30392, MDEV-31645.
Tests for these tickets are added accordingly with the workaround in
place.
The direct aggregate mechanism sems to be only intended to work when
otherwise a full table scan query will be executed from the spider
node and the aggregation done at the spider node too. Typically this
happens in sub_select(). In the test spider.direct_aggregate_part
direct aggregate allows to send COUNT statements directly to the data
nodes and adds up the results at the spider node, instead of iterating
over the rows one by one at the spider node.
By contrast, the group by handler (GBH) typically sends aggregated
queries directly to data nodes, in which case DA does not improve the
situation here.
That is why we should fix it by disabling DA when GBH is used.
There are other reasons supporting this change. First, the creation of
GBH results in a call to change_to_use_tmp_fields() (as opposed to
setup_copy_fields()) which causes the spider DA function
spider_db_fetch_for_item_sum_funcs() to work on wrong items. Second,
the spider DA function only calls direct_add() on the items, and the
follow-up add() needs to be called by the sql layer code. In
do_select(), after executing the query with the GBH, it seems that the
required add() would not necessarily be called.
Disabling DA when GBH is used does fix the bug. There are a few
other things included in this commit to improve the situation with
spider DA:
1. Add a session variable that allows user to disable DA completely,
this will help as a temporary measure if/when further bugs with DA
emerge.
2. Move the increment of direct_aggregate_count to the spider DA
function. Currently this is done in rather bizarre and random
locations.
3. Fix the spider_db_mbase_row creation so that the last of its row
field (sentinel) is NULL. The code is already doing a null check, but
somehow the sentinel field is on an invalid address, causing the
segfaults. With a correct implementation of the row creation, we can
avoid such segfaults.
The existing (incorrect) overriding mechanism is:
Non-minus-one var value overrides table param overrides default value.
Before MDEV-27169, unspecified var value is -1. So if the user sets
both the var to be a value other than -1 and the table param, the var
value will prevail, which is incorrect.
After MDEV-27169, unspecified var value is default value. So if the
user does not set the var but sets the table param, the default value
will prevail, which is even more incorrect.
This patch fixes it so that table param, if specified, always
overrides var value, and the latter if not specified or set to -1,
falls back to the default value
We achieve this by replacing all such overriding in spd_param.cc with
macros that override in the correct way, and removing all the
"overriding -1" lines involving table params in
spider_set_connect_info_default() except for those table params not
defined as sysvar/thdvar in spd_params.cc
We also introduced macros for non-overriding sysvar and thdvar, so
that the code is cleaner and less error-prone
In server versions where MDEV-27169 has not been applied, we also
backport the patch, that is, replacing -1 default values with real
default values
In server versions where MDEV-28006 has not been applied, we do the
same for udf params
Extracted out common subroutines, gave more meaningful names etc,
added comments etc.
Also:
- Documented active servers load balancing reads, and other fields in
SPIDER_SHARE etc.
- Removed commented out code
- Documented and refactored self-reference check
- Removed some unnecessary functions
- Renamed unhelpful roop_count
- Refactored spider_get_{sts,crd}, where we turn get_type into an enum
- Cleaned up spider_mbase_handler::show_table_status() and
spider_mbase_handler::show_index()
When the variable, spider_semi_table_lock, is 1, Spider sends
LOCK TABLES before each SQL execution. The feature is for
non-transactional remote tables and adds some overhead to query
executions.
We change the default value of the plugin variable to 0 and then
deprecate the variable because it is rare to use non-transactional
engines these days and the variable complicates the code.
The variable, spider_semi_table_lock_connection, should be too
deprecated because it is for tweaking the semi-table locking.
Delete the deprecated variable, spider_use_handler and related code.
Spider now does not supports accessing data nodes via handler
statements. Thus, the notion of SQL kinds are no longer useful.
We too discard it.
Deprecate the server variable spider_internal_offset and the corresponding
table parameters "ios" and "internal_offset".
We believe this variable is not much use to users, therefore decided to
deprecate it.
Reviewed-by: Nayuta Yanagisawa <nayuta.yanagisawa@hey.com>
The variables, spider_crd_mode and spider_sts_mode, specify the
ways to fetch statistics from data nodes.
Using the SHOW command seems to work for any cases. Thus, we deprecate
the variables.
We deprecate the variable spider_xa_register_mode because there is
no need to perform a two phase commit for a read-only transaction.
Note that the variable only affects Spider's internal XA transactions.
The variable spider_internal_limit is for specifying the number of
records acquired by Spider from each remote server.
There seems not to be much use of the variable. Thus, we deprecate it
and the corresponding table options.
Reviewed-by: Nayuta Yanagisawa <nayuta.yanagisawa@hey.com>
Configuring UDFs via plugin variables looks not a good idea.
The more variables Spider has, the more complex it becomes.
Further, I expect that only a few users use Spider UDFs.
Deprecate the following plugin variables regarding Spider UDFs:
* spider_udf_ds_bulk_insert_rows
* spider_udf_ds_table_loop_mode
* spider_udf_ds_use_real_table
* spider_udf_ct_bulk_insert_interval
* spider_udf_ct_bulk_insert_rows
spider_udf_table_lock_mutex_count and spider_udf_table_mon_mutex_count
are also for tweaking UDFs but they are already read-only. So,
there is no need to deprecate them.
"#ifdef WITH_PARTITION_STORAGE_ENGINE ... #endif" appears frequently
in the Spider code base. However, there is no need to maintain such
ifdefs because Spider is disabled if the partitioning engine is disabled.