mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-16 12:02:42 +01:00

Author	SHA1	Message	Date
Oleksandr Byelkin	dd7d9d7fb1	Merge branch '11.4' into 11.5	2024-05-23 17:01:43 +02:00
Sergei Golubchik	f9807aadef	Merge branch '10.11' into 11.0	2024-05-12 12:18:28 +02:00
Oleksandr Byelkin	c9b1ebee2f	Merge branch '10.6' into 10.11	2024-04-26 08:02:49 +02:00
Monty	0ccdf54b64	Check and remove high stack usage I checked all stack overflow potential problems found with gcc -Wstack-usage=16384 and clang -Wframe-larger-than=16384 -no-inline Fixes: Added '#pragma clang diagnostic ignored "-Wframe-larger-than="' to a lot of function to where stack usage large but resonable. - Added stack check warnings to BUILD scrips when using clang and debug. Function changed to use malloc instead allocating things on stack: - read_bootstrap_query() now allocates line_buffer (20000 bytes) with malloc() instead of using stack. This has a small performance impact but this is not releant for bootstrap. - mroonga grn_select() used 65856 bytes on stack. Changed it to use malloc(). - Wsrep_schema::replay_transaction() and Wsrep_schema::recover_sr_transactions(). - Connect zipOpen3() Not fixed: - mroonga/vendor/groonga/lib/expr.c grn_proc_call() uses 43712 byte on stack. However this is not easy to fix as the stack used is caused by a lot of code generated by defines. - Most changes in mroonga/groonga where only adding of pragmas to disable stack warnings. - rocksdb/options/options_helper.cc uses 20288 of stack space. (no reason to fix except to get rid of the compiler warning) - Causes using alloca() where the allocation size is resonable. - An issue in libmariadb (reported to connectors).	2024-04-23 14:12:31 +03:00
Alexander Barkov	fd247cc21f	MDEV-31340 Remove MY_COLLATION_HANDLER::strcasecmp() This patch also fixes: MDEV-33050 Build-in schemas like oracle_schema are accent insensitive MDEV-33084 LASTVAL(t1) and LASTVAL(T1) do not work well with lower-case-table-names=0 MDEV-33085 Tables T1 and t1 do not work well with ENGINE=CSV and lower-case-table-names=0 MDEV-33086 SHOW OPEN TABLES IN DB1 -- is case insensitive with lower-case-table-names=0 MDEV-33088 Cannot create triggers in the database `MYSQL` MDEV-33103 LOCK TABLE t1 AS t2 -- alias is not case sensitive with lower-case-table-names=0 MDEV-33109 DROP DATABASE MYSQL -- does not drop SP with lower-case-table-names=0 MDEV-33110 HANDLER commands are case insensitive with lower-case-table-names=0 MDEV-33119 User is case insensitive in INFORMATION_SCHEMA.VIEWS MDEV-33120 System log table names are case insensitive with lower-cast-table-names=0 - Removing the virtual function strnncoll() from MY_COLLATION_HANDLER - Adding a wrapper function CHARSET_INFO::streq(), to compare two strings for equality. For now it calls strnncoll() internally. In the future it will turn into a virtual function. - Adding new accent sensitive case insensitive collations: - utf8mb4_general1400_as_ci - utf8mb3_general1400_as_ci They implement accent sensitive case insensitive comparison. The weight of a character is equal to the code point of its upper case variant. These collations use Unicode-14.0.0 casefolding data. The result of my_charset_utf8mb3_general1400_as_ci.strcoll() is very close to the former my_charset_utf8mb3_general_ci.strcasecmp() There is only a difference in a couple dozen rare characters, because: - the switch from "tolower" to "toupper" comparison, to make utf8mb3_general1400_as_ci closer to utf8mb3_general_ci - the switch from Unicode-3.0.0 to Unicode-14.0.0 This difference should be tolarable. See the list of affected characters in the MDEV description. Note, utf8mb4_general1400_as_ci correctly handles non-BMP characters! Unlike utf8mb4_general_ci, it does not treat all BMP characters as equal. - Adding classes representing names of the file based database objects: Lex_ident_db Lex_ident_table Lex_ident_trigger Their comparison collation depends on the underlying file system case sensitivity and on --lower-case-table-names and can be either my_charset_bin or my_charset_utf8mb3_general1400_as_ci. - Adding classes representing names of other database objects, whose names have case insensitive comparison style, using my_charset_utf8mb3_general1400_as_ci: Lex_ident_column Lex_ident_sys_var Lex_ident_user_var Lex_ident_sp_var Lex_ident_ps Lex_ident_i_s_table Lex_ident_window Lex_ident_func Lex_ident_partition Lex_ident_with_element Lex_ident_rpl_filter Lex_ident_master_info Lex_ident_host Lex_ident_locale Lex_ident_plugin Lex_ident_engine Lex_ident_server Lex_ident_savepoint Lex_ident_charset engine_option_value::Name - All the mentioned Lex_ident_xxx classes implement a method streq(): if (ident1.streq(ident2)) do_equal(); This method works as a wrapper for CHARSET_INFO::streq(). - Changing a lot of "LEX_CSTRING name" to "Lex_ident_xxx name" in class members and in function/method parameters. - Replacing all calls like system_charset_info->coll->strcasecmp(ident1, ident2) to ident1.streq(ident2) - Taking advantage of the c++11 user defined literal operator for LEX_CSTRING (see m_strings.h) and Lex_ident_xxx (see lex_ident.h) data types. Use example: const Lex_ident_column primary_key_name= "PRIMARY"_Lex_ident_column; is now a shorter version of: const Lex_ident_column primary_key_name= Lex_ident_column({STRING_WITH_LEN("PRIMARY")});	2024-04-18 15:22:10 +04:00
Oleksandr Byelkin	48af85db21	Merge branch '10.11' into 11.0	2023-11-08 17:09:44 +01:00
Marko Mäkelä	5a8fca5a4f	Merge 10.6 into 10.10	2023-10-23 18:43:36 +03:00
Monty	a1b6befc78	Fixed crash in is_stat_table() when using hash joins. Other usage if persistent statistics is checking 'stats_is_read' in caller, which is why this was not noticed earlier. Other things: - Simplified no_stat_values_provided	2023-10-19 16:17:01 +03:00
Marko Mäkelä	be24e75229	Merge 10.11 into 11.0	2023-10-19 08:12:16 +03:00
Monty	a49ebf71af	Fixed memory leak when using histograms This was introduced in last merge with 10.6 The reason is that 10.6 does not need anything special to free histograms as everything is allocated on a memroot. In 10.10 histograms is using the vector class, which has some problems: - No automatic free - No memory usage accounting (we should at some point remove vector usage because of the above problem) Fixed by expliciting freeing histograms when freeing TABLE_STATISTICS objects.	2023-10-17 15:12:49 +03:00
Monty	ec277a70e8	Do not create histograms for single column unique key The intentention was always to not create histograms for single value unique keys (as histograms is not useful in this case), but because of a bug in the code this was still done. The changes in the test cases was mainly because hist_size is now NULL for these kind of columns.	2023-10-14 13:43:26 +03:00
Marko Mäkelä	d5e15424d8	Merge 10.6 into 10.10 The MDEV-29693 conflict resolution is from Monty, as well as is a bug fix where ANALYZE TABLE wrongly built histograms for single-column PRIMARY KEY. Also includes a fix for safe_malloc error reporting. Other things: - Copied main.log_slow from 10.4 to avoid mtr issue Disabled test: - spider/bugfix.mdev_27239 because we started to get +Error 1429 Unable to connect to foreign data source: localhost -Error 1158 Got an error reading communication packets - main.delayed - Bug#54332 Deadlock with two connections doing LOCK TABLE+INSERT DELAYED This part is disabled for now as it fails randomly with different warnings/errors (no corruption).	2023-10-14 13:36:11 +03:00
Monty	fdcb443e62	Remember first error in Dummy_error_handler Use Dummy_error_handler in open_stat_tables() to ignore all errors when opening statistics tables.	2023-10-10 11:12:26 +03:00
Monty	4c8d2410b6	Give warnings if open_stat_table_for_ddl() fails The warning is given in case of table not found or if there is a lock timeout. The warning is needed as in case of a lock timeout then the persistent table stats are going to be wrong.	2023-10-03 08:25:31 +03:00
Monty	e3b36b8f1b	MDEV-31957 Concurrent ALTER and ANALYZE collecting statistics can result in stale statistical data Example of what causes the problem: T1: ANALYZE TABLE starts to collect statistics T2: ALTER TABLE starts by deleting statistics for all changed fields, then creates a temp table and copies data to it. T1: ANALYZE ends and writes to the statistics tables. T2: ALTER TABLE renames temp table in place of the old table. Now the statistics from analyze matches the old deleted tables. Fixed by waiting to delete old statistics until ALTER TABLE is the only one using the old table and ensure that rename of columns can handle swapping of column names. rename_columns_in_stat_table() (former rename_column_in_stat_tables()) now takes a list of columns to rename. It uses the following algorithm to update column_stats to be able to handle circular renames - While there are columns to be renamed and it is the first loop or last rename loop did change something. - Loop over all columns to be renamed - Change column name in column_stat - If fail because of duplicate key - If this is first change attempt for this column - Change column name to a temporary column name - If there was a conflicting row, replace it with the current row. else - Remove entry from column list - Loop over all remaining columns in the list - Remove the conflicting row - Change column from temporary name to final name in column_stat Other things: - Don't flush tables for every operation. Only flush when all updates are done. - Rename of columns was not handled in case of ALGORITHM=copy (old bug). - Fixed that we do not collect statistics for hidden hash columns used by UNIQUE constraint on long values. - Fixed that we do not collect statistics for blob columns referred by generated virtual columns. This was achieved by storing the fields for which we want to have statistics in table->has_value_set instead of in table->read_set. - Rename of indexes was not handled for persistent statistics. - This is now handled similar as rename of columns. Renamed columns are now stored in 'rename_stat_indexes' and handled in Alter_info::delete_statistics() together with drooped indexes. - ALTER TABLE .. ADD INDEX may instead of creating a new index rename an existing generated foreign key index. This was not reflected in the index_stats table because this was handled in mysql_prepare_create_table instead instead of in the mysql_alter() code. Fixed by adding a call in mysql_prepare_create_table() to drop the changed index. I also had to change the code that 'marked the index' to be ignored with code that would not destroy the original index name. Reviewer: Sergei Petrunia <sergey@mariadb.com>	2023-10-03 08:25:30 +03:00
Monty	f009c4da91	Small corrections to MDEV-29693 ANALYZE TABLE 32 bit MariaDB crashed in innodb.innodb-16k and a few other tests. Fixed by using correct sizeof() calls. Histograms where not read if first read was without histograms.	2023-09-05 19:37:07 +03:00
Monty	a6bf4b5807	MDEV-29693 ANALYZE TABLE still flushes table definition cache when engine-independent statistics is used This commits enables reloading of engine-independent statistics without flushing the table from table definition cache. This is achieved by allowing multiple version of the TABLE_STATISTICS_CB object and having independent pointers to it in TABLE and TABLE_SHARE. The TABLE_STATISTICS_CB object have reference pointers and are freed when no one is pointing to it anymore. TABLE's TABLE_STATISTICS_CB pointer is updated to use the TABLE_SHARE's pointer when read_statistics_for_tables() is called at the beginning of a query. Main changes: - read_statistics_for_table() will allocate an new TABLE_STATISTICS_CB object. - All get_stat_values() functions has a new parameter that tells where collected data should be stored. get_stat_values() are not using the table_field object anymore to store data. - All get_stat_values() functions returns 1 if they found any data in the statistics tables. Other things: - Fixed INSERT DELAYED to not read statistics tables. - Removed Statistics_state from TABLE_STATISTICS_CB as this is not needed anymore as wer are not changing TABLE_SHARE->stats_cb while calculating or loading statistics. - Store values used with store_from_statistical_minmax_field() in TABLE_STATISTICS_CB::mem_root. This allowed me to remove the function delete_stat_values_for_table_share(). - Field_blob::store_from_statistical_minmax_field() is implemented but is not normally used as we do not yet support EIS statistics for blobs. For example Field_blob::update_min() and Field_blob::update_max() are not implemented. Note that the function can be called if there is an concurrent "ALTER TABLE MODIFY field BLOB" running because of a bug in ALTER TABLE where it deletes entries from column_stats before it has an exclusive lock on the table. - Use result of field->val_str(&val) as a pointer to the result instead of val (safetly fix). - Allocate memory for collected statistics in THD::mem_root, not in in TABLE::mem_root. This could cause the TABLE object to grow if a ANALYZE TABLE was run many times on the same table. This was done in allocate_statistics_for_table(), create_min_max_statistical_fields_for_table() and create_min_max_statistical_fields_for_table_share(). - Store in TABLE_STATISTICS_CB::stats_available which statistics was found in the statistics tables. - Removed index_table from class Index_prefix_calc as it was not used. - Added TABLE_SHARE::LOCK_statistics to ensure we don't load EITS in parallel. First thread will load it, others will reuse the loaded data. - Eliminate read_histograms_for_table(). The loading happens within read_statistics_for_tables() if histograms are needed. One downside is that if we have read statistics without histograms before and someone requires histograms, we have to read all statistics again (once) from the statistics tables. A smaller downside is the need to call alloc_root() for each individual histogram. Before we could allocate all the space for histograms with a single alloc_root. - Fixed bug in MyISAM and Aria where they did not properly notice that table had changed after analyze table. This was not a problem before this patch as then the MyISAM and Aria tables where flushed as part of ANALYZE table which did hide this issue. - Fixed a bug in ANALYZE table where table->records could be seen as 0 in collect_statistics_for_table(). The effect of this unlikely bug was that a full table scan could be done even if analyze_sample_percentage was not set to 1. - Changed multiple mallocs in a row to use multi_alloc_root(). - Added a mutex protection in update_statistics_for_table() to ensure that several tables are not updating the statistics at the same time. Some of the changes in sql_statistics.cc are based on a patch from Oleg Smirnov <olernov@gmail.com> Co-authored-by: Oleg Smirnov <olernov@gmail.com> Co-authored-by: Vicentiu Ciorbaru <cvicentiu@gmail.com> Reviewer: Sergei Petrunia <sergey@mariadb.com>	2023-08-18 13:28:39 +03:00
Sergei Golubchik	0005f2f06c	Merge branch 'bb-10.11-release' into bb-11.0-release	2023-06-05 19:27:00 +02:00
Monty	3bdc5542dc	MDEV-30812: Improve output cardinality estimates for hash join Introduces @@optimizer_switch flag: hash_join_cardinality When this option is on, use EITS statistics to produce tighter bounds for hash join output cardinality. This patch is an extension / replacement to a similar patch in 10.6 New features: - optimizer_switch hash_join_cardinality is on by default - records_out is set to fanout when HASH is used - Fixed bug in is_eits_usable: The function did not work with views	2023-05-03 21:44:57 +03:00
Oleksandr Byelkin	f0f1f2de0e	Merge branch '10.6' into 10.8	2023-05-03 11:33:57 +02:00
Oleksandr Byelkin	043d69bbcc	Merge branch '10.5' into 10.6	2023-05-03 09:51:25 +02:00
Oleksandr Byelkin	edf8ce5b97	Merge branch 'bb-10.4-release' into bb-10.5-release	2023-05-02 13:54:54 +02:00
Sergei Petrunia	85cc831880	MDEV-31067: selectivity_from_histogram >1.0 for a DOUBLE_PREC_HB histogram Variant #2. When Histogram::point_selectivity() sees that the point value of interest falls into one bucket, it tries to guess whether the bucket has many different (unpopular) values or a few popular values. (The number of rows is fixed, as it's a Height-balanced histogram). The basis for this guess is the "width" of the value range the bucket covers. Buckets covering wider value ranges are assumed to contain values with proportionally lower frequencies. This is just a [brave] guesswork. For a very narrow bucket, it may produce an estimate that's larger than total #rows in the bucket or even in the whole table. Remove the guesswork and replace it with basic logic: return either the per-table average selectivity of col=const, or selectivity of one bucket, whichever is lower.	2023-04-28 22:39:25 +03:00
Marko Mäkelä	2e431ff7e6	Merge 10.11 into 11.0	2023-02-16 13:34:45 +02:00
Marko Mäkelä	dbab3e8d90	Merge 10.6 into 10.8	2023-02-10 13:43:53 +02:00
Marko Mäkelä	6aec87544c	Merge 10.5 into 10.6	2023-02-10 13:03:01 +02:00
Marko Mäkelä	c41c79650a	Merge 10.4 into 10.5	2023-02-10 12:02:11 +02:00
Vicențiu Ciorbaru	08c852026d	Apply clang-tidy to remove empty constructors / destructors This patch is the result of running run-clang-tidy -fix -header-filter=.* -checks='-,modernize-use-equals-default' . Code style changes have been done on top. The result of this change leads to the following improvements: 1. Binary size reduction. For a -DBUILD_CONFIG=mysql_release build, the binary size is reduced by ~400kb. * A raw -DCMAKE_BUILD_TYPE=Release reduces the binary size by ~1.4kb. 2. Compiler can better understand the intent of the code, thus it leads to more optimization possibilities. Additionally it enabled detecting unused variables that had an empty default constructor but not marked so explicitly. Particular change required following this patch in sql/opt_range.cc result_keys, an unused template class Bitmap now correctly issues unused variable warnings. Setting Bitmap template class constructor to default allows the compiler to identify that there are no side-effects when instantiating the class. Previously the compiler could not issue the warning as it assumed Bitmap class (being a template) would not be performing a NO-OP for its default constructor. This prevented the "unused variable warning".	2023-02-09 16:09:08 +02:00
Monty	c443dbff0e	Ensure that test_quick_select doesn't return more rows than in the table Other changes: - In test_quick_select(), assume that if table->used_stats_records is 0 then the table has 0 rows. - Fixed prepare_simple_select() to populate table->used_stat_records - Enusre that set_statistics_for_tables() doesn't cause used_stats_records to be 0 when using stat_tables. - To get blackhole to work with replication, set stats.records to 2 so that test_quick_select() doesn't assume the table is empty.	2023-01-30 15:22:20 +02:00
Marko Mäkelä	3c2a5ad3e8	Merge 10.7 into 10.8	2022-07-01 17:53:06 +03:00
Marko Mäkelä	62a20f8047	Merge 10.5 into 10.6	2022-07-01 15:24:50 +03:00
Marko Mäkelä	f09687094c	Merge 10.4 into 10.5	2022-07-01 14:42:02 +03:00
Marko Mäkelä	392ee571c1	Merge 10.3 into 10.4	2022-07-01 13:10:36 +03:00
Marko Mäkelä	045771c050	Fix most clang-15 -Wunused-but-set-variable Also, refactor trx_i_s_common_fill_table() to remove dead code. Warnings about yynerrs in Bison-generated yyparse() will remain for now.	2022-07-01 09:48:36 +03:00
Sergei Petrunia	ce4956f322	Code cleanup	2022-01-19 18:14:07 +03:00
Sergei Petrunia	db8f15be93	MDEV-27229: Estimation for filtered rows less precise ... #5 Followup: remove this line from get_column_range_cardinality() set_if_bigger(res, col_stats->get_avg_frequency()); and make sure it is only used with the binary histograms. For JSON histograms, it makes the estimates unnecessarily imprecise.	2022-01-19 18:10:12 +03:00
Sergei Petrunia	1d14176ec4	MDEV-26519: Improved histograms: Make JSON parser efficient Previous JSON parser was using an API which made the parsing inefficient: the same JSON contents was parsed again and again. Switch to using a lower-level parsing API which allows to do parsing in an efficient way.	2022-01-19 18:10:11 +03:00
Sergei Petrunia	106c785e2d	MDEV-26911: Unexpected ER_DUP_KEY, ASAN errors, double free detected in ... When loading the histogram, use table->field[N], not table->s->field[N]. When we used the latter we would corrupt the fields's default value. One of the consequences of that would be that AUTO_INCREMENT fields would stop working correctly.	2022-01-19 18:10:11 +03:00
Sergei Petrunia	05877df472	MDEV-26849: JSON Histograms: point selectivity estimates are off .. for non-existent values. Handle this special case.	2022-01-19 18:10:11 +03:00
Sergei Petrunia	27539cd2c8	MDEV-26801: Valgrind/MSAN errors in Column_statistics_collected::finish ... The problem was introduced in fix for MDEV-26724. That patch has made it possible for histogram collection to fail. In particular, it fails for non-assigned characters. When histogram construction fails, we also abort the computation of COUNT(DISTINCT). When we try to use the value, we get valgrind failures. Switched the code to abort the statistics collection in this case.	2022-01-19 18:10:11 +03:00
Sergei Petrunia	3936dc3353	MDEV-26724 Endless loop in json_escape_to_string upon ... empty string Part#3: - make json_escape() return different errors on conversion error and on out-of-space condition. - Make histogram code handle conversion errors.	2022-01-19 18:10:11 +03:00
Sergei Petrunia	943b8fccf9	MDEV-26710: Histogram field in mysql.column_stats is too short Change it to LONGBLOB. Also, update_statistics_for_table() should not "swallow" an error from open_stat_tables.	2022-01-19 18:10:11 +03:00
Sergei Petrunia	702f4efcd9	More "straightforward" memory management Do not put Histogram objects on MEM_ROOT at all	2022-01-19 18:10:10 +03:00
Sergei Petrunia	9271bd17f7	More code cleanups Remove Histogram_*::is_available(), it is not applicable anymore. Fix compilation on Windows	2022-01-19 18:10:10 +03:00
Sergei Petrunia	1d98168547	Move JSON histograms code into its own files	2022-01-19 18:10:10 +03:00
Sergei Petrunia	4ab2b78b65	Histogram code cleanup and fixes Factor the code that updates count, count_distinct, count_distinct_single_occurrence into class Basic_stats_collector Change from Histogram_builder and its descendant Histogram_builder_json to Histogram_builder (the interface), and Histogram_binary_builder, Histogram_json_builder. In Histogram_json_builder, do not forget to collect the right bound of the right-most bucket.	2022-01-19 18:10:10 +03:00
Sergei Petrunia	e675f44624	Code cleanup: don't duplicate the position-in-interval code	2022-01-19 18:10:09 +03:00
Sergei Petrunia	899dfb078e	Code cleanups part #3	2022-01-19 18:10:09 +03:00
Sergei Petrunia	859c14ff01	Better names: s/histogram_/histogram/, s/Histogram_json/Histogram_json_hb/	2022-01-19 18:10:09 +03:00
Sergei Petrunia	fc6a4a33b2	Cleanup histogram collection code	2022-01-19 18:10:09 +03:00

1 2 3 4 5 ...

312 commits