mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-04-07 15:55:41 +02:00

Author	SHA1	Message	Date
Sergei Petrunia	9020baf126	Trivial fix: Make test_if_cheaper_ordering() use actual_rec_per_key() Discovered this while working on MDEV-34720: test_if_cheaper_ordering() uses rec_per_key, while the original estimate for the access method is produced in best_access_path() by using actual_rec_per_key(). Make test_if_cheaper_ordering() also use actual_rec_per_key(). Also make several getter function "const" to make this compile. Also adjusted the testcase to handle this (the change backported from 11.0)	2024-08-25 16:05:00 +03:00
Monty	a1b6befc78	Fixed crash in is_stat_table() when using hash joins. Other usage if persistent statistics is checking 'stats_is_read' in caller, which is why this was not noticed earlier. Other things: - Simplified no_stat_values_provided	2023-10-19 16:17:01 +03:00
Monty	e3b36b8f1b	MDEV-31957 Concurrent ALTER and ANALYZE collecting statistics can result in stale statistical data Example of what causes the problem: T1: ANALYZE TABLE starts to collect statistics T2: ALTER TABLE starts by deleting statistics for all changed fields, then creates a temp table and copies data to it. T1: ANALYZE ends and writes to the statistics tables. T2: ALTER TABLE renames temp table in place of the old table. Now the statistics from analyze matches the old deleted tables. Fixed by waiting to delete old statistics until ALTER TABLE is the only one using the old table and ensure that rename of columns can handle swapping of column names. rename_columns_in_stat_table() (former rename_column_in_stat_tables()) now takes a list of columns to rename. It uses the following algorithm to update column_stats to be able to handle circular renames - While there are columns to be renamed and it is the first loop or last rename loop did change something. - Loop over all columns to be renamed - Change column name in column_stat - If fail because of duplicate key - If this is first change attempt for this column - Change column name to a temporary column name - If there was a conflicting row, replace it with the current row. else - Remove entry from column list - Loop over all remaining columns in the list - Remove the conflicting row - Change column from temporary name to final name in column_stat Other things: - Don't flush tables for every operation. Only flush when all updates are done. - Rename of columns was not handled in case of ALGORITHM=copy (old bug). - Fixed that we do not collect statistics for hidden hash columns used by UNIQUE constraint on long values. - Fixed that we do not collect statistics for blob columns referred by generated virtual columns. This was achieved by storing the fields for which we want to have statistics in table->has_value_set instead of in table->read_set. - Rename of indexes was not handled for persistent statistics. - This is now handled similar as rename of columns. Renamed columns are now stored in 'rename_stat_indexes' and handled in Alter_info::delete_statistics() together with drooped indexes. - ALTER TABLE .. ADD INDEX may instead of creating a new index rename an existing generated foreign key index. This was not reflected in the index_stats table because this was handled in mysql_prepare_create_table instead instead of in the mysql_alter() code. Fixed by adding a call in mysql_prepare_create_table() to drop the changed index. I also had to change the code that 'marked the index' to be ignored with code that would not destroy the original index name. Reviewer: Sergei Petrunia <sergey@mariadb.com>	2023-10-03 08:25:30 +03:00
Monty	a6bf4b5807	MDEV-29693 ANALYZE TABLE still flushes table definition cache when engine-independent statistics is used This commits enables reloading of engine-independent statistics without flushing the table from table definition cache. This is achieved by allowing multiple version of the TABLE_STATISTICS_CB object and having independent pointers to it in TABLE and TABLE_SHARE. The TABLE_STATISTICS_CB object have reference pointers and are freed when no one is pointing to it anymore. TABLE's TABLE_STATISTICS_CB pointer is updated to use the TABLE_SHARE's pointer when read_statistics_for_tables() is called at the beginning of a query. Main changes: - read_statistics_for_table() will allocate an new TABLE_STATISTICS_CB object. - All get_stat_values() functions has a new parameter that tells where collected data should be stored. get_stat_values() are not using the table_field object anymore to store data. - All get_stat_values() functions returns 1 if they found any data in the statistics tables. Other things: - Fixed INSERT DELAYED to not read statistics tables. - Removed Statistics_state from TABLE_STATISTICS_CB as this is not needed anymore as wer are not changing TABLE_SHARE->stats_cb while calculating or loading statistics. - Store values used with store_from_statistical_minmax_field() in TABLE_STATISTICS_CB::mem_root. This allowed me to remove the function delete_stat_values_for_table_share(). - Field_blob::store_from_statistical_minmax_field() is implemented but is not normally used as we do not yet support EIS statistics for blobs. For example Field_blob::update_min() and Field_blob::update_max() are not implemented. Note that the function can be called if there is an concurrent "ALTER TABLE MODIFY field BLOB" running because of a bug in ALTER TABLE where it deletes entries from column_stats before it has an exclusive lock on the table. - Use result of field->val_str(&val) as a pointer to the result instead of val (safetly fix). - Allocate memory for collected statistics in THD::mem_root, not in in TABLE::mem_root. This could cause the TABLE object to grow if a ANALYZE TABLE was run many times on the same table. This was done in allocate_statistics_for_table(), create_min_max_statistical_fields_for_table() and create_min_max_statistical_fields_for_table_share(). - Store in TABLE_STATISTICS_CB::stats_available which statistics was found in the statistics tables. - Removed index_table from class Index_prefix_calc as it was not used. - Added TABLE_SHARE::LOCK_statistics to ensure we don't load EITS in parallel. First thread will load it, others will reuse the loaded data. - Eliminate read_histograms_for_table(). The loading happens within read_statistics_for_tables() if histograms are needed. One downside is that if we have read statistics without histograms before and someone requires histograms, we have to read all statistics again (once) from the statistics tables. A smaller downside is the need to call alloc_root() for each individual histogram. Before we could allocate all the space for histograms with a single alloc_root. - Fixed bug in MyISAM and Aria where they did not properly notice that table had changed after analyze table. This was not a problem before this patch as then the MyISAM and Aria tables where flushed as part of ANALYZE table which did hide this issue. - Fixed a bug in ANALYZE table where table->records could be seen as 0 in collect_statistics_for_table(). The effect of this unlikely bug was that a full table scan could be done even if analyze_sample_percentage was not set to 1. - Changed multiple mallocs in a row to use multi_alloc_root(). - Added a mutex protection in update_statistics_for_table() to ensure that several tables are not updating the statistics at the same time. Some of the changes in sql_statistics.cc are based on a patch from Oleg Smirnov <olernov@gmail.com> Co-authored-by: Oleg Smirnov <olernov@gmail.com> Co-authored-by: Vicentiu Ciorbaru <cvicentiu@gmail.com> Reviewer: Sergei Petrunia <sergey@mariadb.com>	2023-08-18 13:28:39 +03:00
Sergei Golubchik	e841957416	Merge branch '10.3' into 10.4	2021-02-23 09:25:57 +01:00
Sergei Golubchik	0ab1e3914c	Merge branch '10.2' into 10.3	2021-02-22 22:42:27 +01:00
Varun Gupta	a461e4d306	MDEV-19474: Histogram statistics are used even with optimizer_use_condition_selectivity=3 The issue here was histogram statistics were being used even when the level of optimizer_use_condition_selectivity doesn't allow usage of statistics from histogram. The histogram statistics are read for a table only when optimizer_use_condition_selectivity > 3. But the TABLE structure can be stored in the internal table cache and be reused for the next query. So in this case the histogram statistics will be available for the next query. The fix would be to make sure to use the histogram statistics only when optimizer_use_condition_selectivity > 3.	2021-02-16 11:53:13 +05:30
Marko Mäkelä	4b959bd8df	Merge 10.3 into 10.4	2020-07-20 15:34:59 +03:00
Marko Mäkelä	acc58fd835	Merge 10.2 into 10.3	2020-07-20 15:11:59 +03:00
Marko Mäkelä	ca9276e37e	Merge 10.1 into 10.2	2020-07-20 14:53:24 +03:00
Varun Gupta	dfdfeecb03	MDEV-22851: Engine independent index statistics are incorrect for large tables on Windows An oveflow was happening on windows because on Windows sizeof(ulong) is 4 bytes while it is 8 bytes on Linux. Switched avg_frequency and avg length for column statistics to ulonglong. Switched avg_frequency for index statistics to ulonglong.	2020-07-15 11:27:32 +05:30
Marko Mäkelä	8059148154	Merge 10.3 into 10.4	2020-06-03 07:32:09 +03:00
Marko Mäkelä	8300f639a1	Merge 10.2 into 10.3	2020-06-02 10:25:11 +03:00
Marko Mäkelä	d72eebaa3d	Merge 10.1 into 10.2	2020-06-01 09:33:03 +03:00
Sergey Vojtovich	c279878493	Thread safe histograms loading Previously multiple threads were allowed to load histograms concurrently. There were no known problems caused by this. But given amount of data races in this code, it'd happen sooner or later. To avoid scalability bottleneck, histograms loading is protected by per-TABLE_SHARE atomic variable. Whenever histograms were loaded by preceding statement (hot-path), a scalable load-acquire check is performed. Whenever histograms have to be loaded anew, mutual exclusion for loaders is established by atomic variable. If histograms are being loaded concurrently, statement waits until load is completed. - Table_statistics::total_hist_size moved to TABLE_STATISTICS_CB: only meaningful within TABLE_SHARE (not used for collected stats). - TABLE_STATISTICS_CB::histograms_can_be_read and TABLE_STATISTICS_CB::histograms_are_read are replaced with a tri state atomic variable. - Simplified away alloc_histograms_for_table_share(). Note: there's still likely a data race if a thread attempts accessing histograms data after it failed to load it (because of concurrent load). It was there previously and goes out of the scope of this effort. One way of fixing it could be reviving TABLE::histograms_are_read and adding appropriate checks whenever it is needed. Part of MDEV-19061 - table_share used for reading statistical tables is not protected	2020-05-29 21:53:54 +04:00
Marko Mäkelä	c11e5cdd12	Merge 10.3 into 10.4	2019-10-10 11:19:25 +03:00
Marko Mäkelä	892378fb9d	Merge 10.2 into 10.3	2019-10-09 13:25:11 +03:00
Marko Mäkelä	24232ec12c	Merge 10.1 into 10.2	2019-10-09 08:30:23 +03:00
Sergey Vojtovich	adefaeffcc	MDEV-19536 - Server crash or ASAN heap-use-after-free in is_temporary_table / read_statistics_for_tables_if_needed Regression after `279a907`, read_statistics_for_tables_if_needed() was called after open_normal_and_derived_tables() failure. Fixed by moving read_statistics_for_tables() call to a branch of get_schema_stat_record() where result of open_normal_and_derived_tables() is checked. Removed THD::force_read_stats, added read_statistics_for_tables() instead. Simplified away statistics_for_command_is_needed().	2019-10-07 13:30:22 +04:00
Sergey Vojtovich	e43791d4dc	Cleanup EITS Moved EITS allocation inside read_statistics_for_tables_if_needed(). Removed redundant is_safe argument.	2019-10-02 15:23:59 +04:00
Oleksandr Byelkin	c07325f932	Merge branch '10.3' into 10.4	2019-05-19 20:55:37 +02:00
Marko Mäkelä	be85d3e61b	Merge 10.2 into 10.3	2019-05-14 17:18:46 +03:00
Marko Mäkelä	26a14ee130	Merge 10.1 into 10.2	2019-05-13 17:54:04 +03:00
Vicențiu Ciorbaru	cb248f8806	Merge branch '5.5' into 10.1	2019-05-11 22:19:05 +03:00
Marko Mäkelä	c64265f3f9	Merge 10.3 into 10.4	2018-12-12 14:09:48 +02:00
Marko Mäkelä	839cf16bb2	Merge 10.2 into 10.3	2018-12-12 13:46:06 +02:00
Marko Mäkelä	db1210f939	Merge 10.1 into 10.2	2018-12-12 12:13:43 +02:00
Marko Mäkelä	f77f8f6d1a	Merge 10.0 into 10.1	2018-12-12 10:48:53 +02:00
Varun Gupta	9207a838ed	MDEV-17255: New optimizer defaults and ANALYZE TABLE Added to new values to the server variable use_stat_tables. The values are COMPLEMENTARY_FOR_QUERIES and PREFERABLY_FOR_QUERIES. Both these values don't allow to collect EITS for queries like analyze table t1; To collect EITS we would need to use the syntax with persistent like analyze table t1 persistent for columns (col1,col2...) index (idx1, idx2...) / ALL Changing the default value from NEVER to PREFERABLY_FOR_QUERIES.	2018-12-09 13:25:27 +05:30
Varun Gupta	4886d14827	MDEV-17032: Estimates are higher for partitions of a table with @@use_stat_tables= PREFERABLY The problem here is EITS statistics does not calculate statistics for the partitions of the table. So a temporary solution would be to not read EITS statistics for partitioned tables. Also disabling reading of EITS for columns that participate in the partition list of a table.	2018-12-07 19:59:45 +05:30
Sergei Golubchik	57e0da50bb	Merge branch '10.2' into 10.3	2018-09-28 16:37:06 +02:00
Oleksandr Byelkin	28f08d3753	Merge branch '10.1' into 10.2	2018-09-14 08:47:22 +02:00
Oleksandr Byelkin	31081593aa	Merge branch '11.0' into 10.1	2018-09-06 22:45:19 +02:00
Varun Gupta	a9c09c95bd	MDEV-15306: Wrong/Unexpected result with the value optimizer_use_condition_selectivity set to 4 Currently for selectivity calculation we perform range analysis for a column even when we don't have any statistics(EITS). This makes less sense but is used to catch contradiction for WHERE condition. So the solution is to not perform range analysis for selectivity calculation for columns that do not have statistics.	2018-08-29 02:17:37 +05:30
Marko Mäkelä	7830fb7f45	Merge 10.2 into 10.3	2018-08-28 12:22:56 +03:00
Igor Babaev	4eac5df3fc	MDEV-16934 Query with very large IN clause lists runs slowly This patch introduces support for the system variable eq_range_index_dive_limit that existed in MySQL starting from 5.6. The variable sets a limit for index dives into equality ranges. Index dives are performed by optimizer to estimate the number of rows in range scans. Index dives usually provide good estimate but they are pretty expensive. To estimate the number of rows in equality ranges statistical data on indexes can be employed. Its usage gives not so good estimates but it's cheap. So if the number of equality dives required by an index scan exceeds the set limit no dives for equality ranges are performed by the optimizer for this index. As the new system variable is introduced in a stable version the default value for it is set to a special value meaning there is no limit for the number of index dives performed by the optimizer. The patch partially uses the MySQL code for WL 5957 'Statistics-based Range optimization for many ranges'.	2018-08-17 14:28:39 -07:00
Marko Mäkelä	05459706f2	Merge 10.2 into 10.3	2018-08-03 15:57:23 +03:00
Igor Babaev	095dc81158	MDEV-16757 Memory leak after adding manually min/max statistical data for blob column ANALYZE TABLE <table> does not collect statistical data on min/max values for BLOB columns of <table>. However these values can be added into mysql.column_stats manually by executing proper statements. Unfortunately this led to a memory leak because the memory allocated for these values was never freed. This patch provides the server with a function to free memory allocated for min/max statistical values of BLOB types. Temporarily changed the test case until MDEV-16711 is fixed as without this fix the test case for MDEV-16757 did not fail only for 10.0.	2018-07-15 16:24:24 -07:00
Igor Babaev	c89bb15c31	MDEV-16757 Memory leak after adding manually min/max statistical data for blob column ANALYZE TABLE <table> does not collect statistical data on min/max values for BLOB columns of <table>. However these values can be added into mysql.column_stats manually by executing proper statements. Unfortunately this led to a memory leak because the memory allocated for these values was never freed. This patch provides the server with a function to free memory allocated for min/max statistical values of BLOB types.	2018-07-13 17:48:45 -07:00
Varun Gupta	10f6b7001b	MDEV-9744: session optimizer_use_condition_selectivity=5 causing SQL Error (1918): Encountered illegal value '' when converting to DECIMAL The issue was that EITS data was allocated but then not read for some reason (one being to avoid a deadlock), then the optimizer was using these bzero'ed buffers as EITS statistics. This should not be allowed, we should use statistcs for a table only when we have successfully loaded/read the stats from the statistical tables.	2018-04-02 13:14:30 +03:00
Monty	a7e352b54d	Changed database, tablename and alias to be LEX_CSTRING This was done in, among other things: - thd->db and thd->db_length - TABLE_LIST tablename, db, alias and schema_name - Audit plugin database name - lex->db - All db and table names in Alter_table_ctx - st_select_lex db Other things: - Changed a lot of functions to take const LEX_CSTRING* as argument for db, table_name and alias. See init_one_table() as an example. - Changed some function arguments from LEX_CSTRING to const LEX_CSTRING - Changed some lists from LEX_STRING to LEX_CSTRING - threads_mysql.result changed because process list_db wasn't always correctly updated - New append_identifier() function that takes LEX_CSTRING* as arguments - Added new element tmp_buff to Alter_table_ctx to separate temp name handling from temporary space - Ensure we store the length after my_casedn_str() of table/db names - Removed not used version of rename_table_in_stat_tables() - Changed Natural_join_column::table_name and db_name() to never return NULL (used for print) - thd->get_db() now returns db as a printable string (thd->db.str or "")	2018-01-30 21:33:55 +02:00
Monty	5a759d31f7	Changing field::field_name and Item::name to LEX_CSTRING Benefits of this patch: - Removed a lot of calls to strlen(), especially for field_string - Strings generated by parser are now const strings, less chance of accidently changing a string - Removed a lot of calls with LEX_STRING as parameter (changed to pointer) - More uniform code - Item::name_length was not kept up to date. Now fixed - Several bugs found and fixed (Access to null pointers, access of freed memory, wrong arguments to printf like functions) - Removed a lot of casts from (const char) to (char) Changes: - This caused some ABI changes - lex_string_set now uses LEX_CSTRING - Some fucntions are now taking const char* instead of char* - Create_field::change and after changed to LEX_CSTRING - handler::connect_string, comment and engine_name() changed to LEX_CSTRING - Checked printf() related calls to find bugs. Found and fixed several errors in old code. - A lot of changes from LEX_STRING to LEX_CSTRING, especially related to parsing and events. - Some changes from LEX_STRING and LEX_STRING & to LEX_CSTRING* - Some changes for char* to const char* - Added printf argument checking for my_snprintf() - Introduced null_clex_str, star_clex_string, temp_lex_str to simplify code - Added item_empty_name and item_used_name to be able to distingush between items that was given an empty name and items that was not given a name This is used in sql_yacc.yy to know when to give an item a name. - select table_name."' is not anymore same as table_name. - removed not used function Item::rename() - Added comparision of item->name_length before some calls to my_strcasecmp() to speed up comparison - Moved Item_sp_variable::make_field() from item.h to item.cc - Some minimal code changes to avoid copying to const char * - Fixed wrong error message in wsrep_mysql_parse() - Fixed wrong code in find_field_in_natural_join() where real_item() was set when it shouldn't - ER_ERROR_ON_RENAME was used with extra arguments. - Removed some (wrong) ER_OUTOFMEMORY, as alloc_root will already give the error. TODO: - Check possible unsafe casts in plugin/auth_examples/qa_auth_interface.c - Change code to not modify LEX_CSTRING for database name (as part of lower_case_table_names)	2017-04-23 22:35:46 +03:00
Alexey Botchkov	83f7151da5	MDEV-10435 crash with bad stat tables. Functions from sql/statistics.cc don't seem to expect stat tables to fail or to have inadequate structure. Table open errors suppressed and some validity checks added. Invalid tables reported to the server log.	2016-12-09 17:13:43 +04:00
Nirbhay Choubey	3daf89ced9	MDEV-10957: Assertion failure when dropping a myisam table with wsrep_replicate_myisam enabled Internal updates to system statistical tables could wrongly trigger an additional total-order replication if wsrep_repli -cate_myisam is enabled. Fixed by adding a check to skip total-order replication for stat tables. Test: galera.galera_var_replicate_myisam_on	2016-11-02 09:45:43 -04:00
Igor Babaev	9d4a0dde0a	Fixed bug mdev-11096. 1. When min/max value is provided the null flag for it must be set to 0 in the bitmap Culumn_statistics::column_stat_nulls. 2. When the calculation of the selectivity of the range condition over a column requires min and max values for the column then we have to check that these values are provided.	2016-10-26 20:45:35 -07:00
Sergey Petrunya	fd4c9af398	MDEV-6442: Assertion `join->best_read < double(...)' failed with optimizer_use_condition_selectivity >=3 - Fix the crash by making get_column_range_cardinality() to handle the special case where Column_stats objects is an all-zeros object (the question of what is the point of having Field::read_stats point to such object remains a mystery) - Added a few comments. Learning the code still.	2014-10-06 15:29:22 +04:00
Sergey Vojtovich	d466ed90fd	MDEV-6473 - main.statistics fails on PPC64 mysql.column_stats wasn't stored/restored properly on big-endian with histogram_type=DOUBLE_PREC_HB. Store histogram values using int2store()/uint2korr(). Note that this patch invalidates previously calculated histogram values on big-endian.	2014-07-23 12:55:26 +04:00
Sergey Petrunya	79a8a6130b	Code cleanup: - Move [some] engine-agnostic tests from t/selectivity.test to t/selectivity_no_engine.test - Move Histogram::point_selectivity to sql_statistics.cc	2014-03-27 13:08:00 +04:00
Sergey Petrunya	ab061a2bb3	MDEV-5926, MDEV-4362 post-fixes: - Histogram::find_bucket() should not walk off the end of the value range. - Address review feedback in Histogram::point_selectivity(): different handling for zero-width buckets, and explanations.	2014-03-27 12:30:49 +04:00
Sergey Petrunya	dee11f9633	MDEV-4362: {division by zero when lookup constant is outside the value table} - Fix Histogram::point_selectivity() to work in the case where the passed value_pos=0 (or 1) and the first (or the last) bucket in the histogram has zero value-range (i.e one value).	2014-03-26 21:05:31 +04:00

1 2

66 commits