mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-30 18:41:56 +01:00

Author	SHA1	Message	Date
Sergei Petrunia	ce4956f322	Code cleanup	2022-01-19 18:14:07 +03:00
Sergei Petrunia	db8f15be93	MDEV-27229: Estimation for filtered rows less precise ... #5 Followup: remove this line from get_column_range_cardinality() set_if_bigger(res, col_stats->get_avg_frequency()); and make sure it is only used with the binary histograms. For JSON histograms, it makes the estimates unnecessarily imprecise.	2022-01-19 18:10:12 +03:00
Sergei Petrunia	1d14176ec4	MDEV-26519: Improved histograms: Make JSON parser efficient Previous JSON parser was using an API which made the parsing inefficient: the same JSON contents was parsed again and again. Switch to using a lower-level parsing API which allows to do parsing in an efficient way.	2022-01-19 18:10:11 +03:00
Sergei Petrunia	05877df472	MDEV-26849: JSON Histograms: point selectivity estimates are off .. for non-existent values. Handle this special case.	2022-01-19 18:10:11 +03:00
Sergei Petrunia	702f4efcd9	More "straightforward" memory management Do not put Histogram objects on MEM_ROOT at all	2022-01-19 18:10:10 +03:00
Sergei Petrunia	9271bd17f7	More code cleanups Remove Histogram_*::is_available(), it is not applicable anymore. Fix compilation on Windows	2022-01-19 18:10:10 +03:00
Sergei Petrunia	1d98168547	Move JSON histograms code into its own files	2022-01-19 18:10:10 +03:00
Sergei Petrunia	4ab2b78b65	Histogram code cleanup and fixes Factor the code that updates count, count_distinct, count_distinct_single_occurrence into class Basic_stats_collector Change from Histogram_builder and its descendant Histogram_builder_json to Histogram_builder (the interface), and Histogram_binary_builder, Histogram_json_builder. In Histogram_json_builder, do not forget to collect the right bound of the right-most bucket.	2022-01-19 18:10:10 +03:00
Sergei Petrunia	859c14ff01	Better names: s/histogram_/histogram/, s/Histogram_json/Histogram_json_hb/	2022-01-19 18:10:09 +03:00
Sergei Petrunia	fc6a4a33b2	Cleanup histogram collection code	2022-01-19 18:10:09 +03:00
Sergei Petrunia	02a67307d3	Fix compilation on Windows	2022-01-19 18:10:09 +03:00
Sergei Petrunia	3486bf4110	Code cleanup + reduce the diff size	2022-01-19 18:10:09 +03:00
Sergei Petrunia	a93b377863	Fix histogram memory management There are "local" histograms that are allocated by one thread for one TABLE object, and "global" that are allocated for TABLE_SHARE.	2022-01-19 18:10:09 +03:00
Sergei Petrunia	fcf58a5e0f	Code cleanup part#2: do not copy key values in xxx_selectivity() functions	2022-01-19 18:10:09 +03:00
Sergei Petrunia	2a1cdbabec	Fix JSON parsing: future-proof data representation in JSON, code cleanup	2022-01-19 18:10:09 +03:00
Sergei Petrunia	a0b4a86822	Code cleanup part #2 .	2022-01-19 18:10:09 +03:00
Sergei Petrunia	72c0ba43b2	Code cleanup part #1	2022-01-19 18:10:09 +03:00
Sergei Petrunia	f76e310ace	Rename histogram_type=JSON to JSON_HB	2022-01-19 18:10:09 +03:00
Michael Okoko	bff65a813e	Implement point selectivity for JSON histograms * Also merges tests relating to JSON statistics into one file Signed-off-by: Michael Okoko <okokomichaels@outlook.com>	2022-01-19 18:10:08 +03:00
Michael Okoko	547f805311	Refactor histogram point selectivity Signed-off-by: Michael Okoko <okokomichaels@outlook.com>	2022-01-19 18:10:08 +03:00
Michael Okoko	63cbd0748b	replace range_selectivity methods for Histograms and add tests Signed-off-by: Michael Okoko <okokomichaels@outlook.com>	2022-01-19 18:10:08 +03:00
Michael Okoko	c129689ddc	Use binary search to compute range selectivity * it also adds an "explain select" statement to the test so that the fprintf calls can print the computed intervals to mysqld.1.err Signed-off-by: Michael Okoko <okokomichaels@outlook.com>	2022-01-19 18:10:08 +03:00
Michael Okoko	69f24c238e	Use generic Histogram_base class for Histogram_builders This fixes the wrong calculation for avg_frequency in json histograms by replacing the specific histogram objects with the generic Histogram_base class. It also restores get/set size functions as they were useful in calculating fields for binary histogram. Signed-off-by: Michael Okoko <okokomichaels@outlook.com>	2022-01-19 18:10:08 +03:00
Sergei Petrunia	21e0f5487f	MDEV-21130: Histograms: use JSON as on-disk format A demo of how to use in-memory data structure for histogram. The patch shows how to * convert string form of data to binary form * compare two values in binary form * compute a fraction for val in [X, Y] range. grep for GSOC-TODO for notes.	2022-01-19 18:10:08 +03:00
Michael Okoko	fe2e516a50	inform test result of zero hist_size for json histogram Signed-off-by: Michael Okoko <okokomichaels@outlook.com>	2022-01-19 18:10:08 +03:00
Michael Okoko	bf4d0dcfe2	implement parse and serialize for histogram json	2022-01-19 18:10:08 +03:00
Michael Okoko	9bba595528	remove unneeded shared methods Signed-off-by: Michael Okoko <okokomichaels@outlook.com>	2022-01-19 18:10:08 +03:00
Michael Okoko	1fa7af749e	Split histogram classes into JSON and binary classes Signed-off-by: Michael Okoko <okokomichaels@outlook.com>	2022-01-19 18:10:08 +03:00
Sergei Petrunia	1998b787ac	MDEV-21130: Histograms: use JSON as on-disk format Preparation for handling different kinds of histograms: - In Column_statistics, change "Histogram histogram" into "Histogram *histogram_". This allows for different kinds of Histogram classes with virtual functions. - [Almost] remove the usage of Histogram->set_values and Histogram->set_size. The code outside the histogram should not make any assumptions about what/how is stored in the Histogram. - Introduce drafts of methods to read/save histograms to/from disk.	2022-01-19 18:10:08 +03:00
Michael Okoko	9954aecc2b	Store bucket bounds and extend test cases for JSON histogram This fixes the memory allocation for json histogram builder and add more column types for testing. Some challenges at the moment include: * Garbage value at the end of JSON array still persists. * Garbage value also gets appended to bucket values if the column is a primary key. * There's a memory leak resulting in a "Warning: Memory not freed" message at the end of tests. Signed-off-by: Michael Okoko <okokomichaels@outlook.com>	2022-01-19 18:10:07 +03:00
Michael Okoko	2aca7b0c33	Prepare JSON as valid histogram_type Signed-off-by: Michael Okoko <okokomichaels@outlook.com>	2022-01-19 18:10:07 +03:00
Sergei Golubchik	e841957416	Merge branch '10.3' into 10.4	2021-02-23 09:25:57 +01:00
Sergei Golubchik	0ab1e3914c	Merge branch '10.2' into 10.3	2021-02-22 22:42:27 +01:00
Varun Gupta	a461e4d306	MDEV-19474: Histogram statistics are used even with optimizer_use_condition_selectivity=3 The issue here was histogram statistics were being used even when the level of optimizer_use_condition_selectivity doesn't allow usage of statistics from histogram. The histogram statistics are read for a table only when optimizer_use_condition_selectivity > 3. But the TABLE structure can be stored in the internal table cache and be reused for the next query. So in this case the histogram statistics will be available for the next query. The fix would be to make sure to use the histogram statistics only when optimizer_use_condition_selectivity > 3.	2021-02-16 11:53:13 +05:30
Marko Mäkelä	4b959bd8df	Merge 10.3 into 10.4	2020-07-20 15:34:59 +03:00
Marko Mäkelä	acc58fd835	Merge 10.2 into 10.3	2020-07-20 15:11:59 +03:00
Marko Mäkelä	ca9276e37e	Merge 10.1 into 10.2	2020-07-20 14:53:24 +03:00
Varun Gupta	dfdfeecb03	MDEV-22851: Engine independent index statistics are incorrect for large tables on Windows An overflow was happening on Windows because sizeof(ulong) is 4 bytes there, while it is 8 bytes on Linux. Switched avg_frequency and avg length for column statistics to ulonglong. Switched avg_frequency for index statistics to ulonglong.	2020-07-15 11:27:32 +05:30
Marko Mäkelä	8059148154	Merge 10.3 into 10.4	2020-06-03 07:32:09 +03:00
Marko Mäkelä	8300f639a1	Merge 10.2 into 10.3	2020-06-02 10:25:11 +03:00
Marko Mäkelä	d72eebaa3d	Merge 10.1 into 10.2	2020-06-01 09:33:03 +03:00
Sergey Vojtovich	c279878493	Thread safe histograms loading Previously multiple threads were allowed to load histograms concurrently. There were no known problems caused by this. But given amount of data races in this code, it'd happen sooner or later. To avoid scalability bottleneck, histograms loading is protected by per-TABLE_SHARE atomic variable. Whenever histograms were loaded by preceding statement (hot-path), a scalable load-acquire check is performed. Whenever histograms have to be loaded anew, mutual exclusion for loaders is established by atomic variable. If histograms are being loaded concurrently, statement waits until load is completed. - Table_statistics::total_hist_size moved to TABLE_STATISTICS_CB: only meaningful within TABLE_SHARE (not used for collected stats). - TABLE_STATISTICS_CB::histograms_can_be_read and TABLE_STATISTICS_CB::histograms_are_read are replaced with a tri state atomic variable. - Simplified away alloc_histograms_for_table_share(). Note: there's still likely a data race if a thread attempts accessing histograms data after it failed to load it (because of concurrent load). It was there previously and goes out of the scope of this effort. One way of fixing it could be reviving TABLE::histograms_are_read and adding appropriate checks whenever it is needed. Part of MDEV-19061 - table_share used for reading statistical tables is not protected	2020-05-29 21:53:54 +04:00
Marko Mäkelä	c11e5cdd12	Merge 10.3 into 10.4	2019-10-10 11:19:25 +03:00
Marko Mäkelä	892378fb9d	Merge 10.2 into 10.3	2019-10-09 13:25:11 +03:00
Marko Mäkelä	24232ec12c	Merge 10.1 into 10.2	2019-10-09 08:30:23 +03:00
Sergey Vojtovich	adefaeffcc	MDEV-19536 - Server crash or ASAN heap-use-after-free in is_temporary_table / read_statistics_for_tables_if_needed Regression after `279a907`, read_statistics_for_tables_if_needed() was called after open_normal_and_derived_tables() failure. Fixed by moving read_statistics_for_tables() call to a branch of get_schema_stat_record() where result of open_normal_and_derived_tables() is checked. Removed THD::force_read_stats, added read_statistics_for_tables() instead. Simplified away statistics_for_command_is_needed().	2019-10-07 13:30:22 +04:00
Sergey Vojtovich	e43791d4dc	Cleanup EITS Moved EITS allocation inside read_statistics_for_tables_if_needed(). Removed redundant is_safe argument.	2019-10-02 15:23:59 +04:00
Oleksandr Byelkin	c07325f932	Merge branch '10.3' into 10.4	2019-05-19 20:55:37 +02:00
Marko Mäkelä	be85d3e61b	Merge 10.2 into 10.3	2019-05-14 17:18:46 +03:00
Marko Mäkelä	26a14ee130	Merge 10.1 into 10.2	2019-05-13 17:54:04 +03:00

93 commits