Commit graph

43 commits

Author SHA1 Message Date
Lena Startseva
a4234f0410 MDEV-27691: make working view-protocol
Update tests for version 10.8
2022-09-27 18:50:18 +07:00
Sergei Petrunia
51bce3c59a MDEV-28882: Assertion `tmp >= 0' failed in best_access_path
Histogram_json_hb::range_selectivity() may return small negative
numbers due to rounding errors in the histogram.

Make sure the returned value is non-negative.
Add an assert to catch negative values that are not small.

(attempt )
2022-06-22 13:39:48 +03:00
Sergei Petrunia
4842a56356 JSON_HB histogram: represent values of BIT() columns in hex always 2022-01-19 18:10:12 +03:00
Sergei Petrunia
dae20dde4e MDEV-26901: Estimation for filtered rows less precise ...
In Histogram_json_hb::point_selectivity(), do return selectivity of 0.0
when the histogram says so.

The logic of "Do not return 0.0 estimate as it causes a multiply-by-zero
meltdown in cost and cardinality calculations" is moved into
records_in_column_ranges() where it is one *once* per column pair (as
opposed to doing once per range, which can cause the error to add-up
to large number when there are many ranges)
2022-01-19 18:10:12 +03:00
Sergei Petrunia
db8f15be93 MDEV-27229: Estimation for filtered rows less precise ...
Followup: remove this line from get_column_range_cardinality()

      set_if_bigger(res, col_stats->get_avg_frequency());

and make sure it is only used with the binary histograms.
For JSON histograms, it makes the estimates unnecessarily imprecise.
2022-01-19 18:10:12 +03:00
Sergei Petrunia
d3e511d421 MDEV-27243: Estimation for filtered rows less precise ...
Added a testcase
2022-01-19 18:10:12 +03:00
Sergei Petrunia
531dd708ef MDEV-27229: Estimation for filtered rows less precise ...
Fix special handling for values that are right next to buckets with ndv=1.
2022-01-19 18:10:12 +03:00
Sergei Petrunia
905634dc3f MDEV-27230: Estimation for filtered rows less precise ...
Fix the code in Histogram_json_hb::range_selectivity that handles
special cases: a non-inclusive endpoint hitting a bucket boundary...
2022-01-19 18:10:12 +03:00
Sergei Petrunia
d8d57d2c27 MDEV-26764: JSON_HB Histograms: handle BINARY and unassigned characters
Encode such characters in hex.
2022-01-19 18:10:12 +03:00
Sergei Petrunia
748b293c14 More test coverage 2022-01-19 18:10:12 +03:00
Sergei Petrunia
c2d2c1e727 MDEV-26519: Improved histograms
Save extra information in the histogram:

    "target_histogram_size": nnn,
    "collected_at": "(date and time)",
    "collected_by": "(server version)",
2022-01-19 18:10:12 +03:00
Sergei Petrunia
a0916cf5a2 MDEV-26519: Improved histograms: Better error reporting, test coverage
Also report JSON histogram load errors into error log, like it is already
done with other histogram/statistics load errors.

Add test coverage to see what happens if one upgrades but does NOT run
mysql_upgrade.
2022-01-19 18:10:11 +03:00
Sergei Petrunia
a0f93f433a Rename histogram_hb_v2 -> histogram_hb 2022-01-19 18:10:11 +03:00
Sergei Petrunia
1d14176ec4 MDEV-26519: Improved histograms: Make JSON parser efficient
Previous JSON parser was using an API which made the parsing
inefficient: the same JSON contents was parsed again and again.

Switch to using a lower-level parsing API which allows to do
parsing in an efficient way.
2022-01-19 18:10:11 +03:00
Sergei Petrunia
eb6a9ad705 MDEV-26886: Estimation for filtered rows less precise with JSON histogram
- Make Histogram_json_hb::range_selectivity handle singleton buckets
  specially when computing selectivity of the max. endpoint bound.
  (for min. endpoint, we already do that).

- Also, fixed comments for Histogram_json_hb::find_bucket
2022-01-19 18:10:11 +03:00
Sergei Petrunia
106c785e2d MDEV-26911: Unexpected ER_DUP_KEY, ASAN errors, double free detected in ...
When loading the histogram, use table->field[N], not table->s->field[N].

When we used the latter we would corrupt the fields's default value. One
of the consequences of that would be that AUTO_INCREMENT fields would
stop working correctly.
2022-01-19 18:10:11 +03:00
Sergei Petrunia
ac0194bd0e MDEV-26892: JSON histograms become invalid with a specific (corrupt) value ..
Handle the case where the last value in the table cannot be represented
in utf8mb4.
2022-01-19 18:10:11 +03:00
Sergei Petrunia
05877df472 MDEV-26849: JSON Histograms: point selectivity estimates are off
.. for non-existent values.

Handle this special case.
2022-01-19 18:10:11 +03:00
Sergei Petrunia
f3f78bed85 MDEV-26750: Estimation for filtered rows is far off with JSON_HB histogram
Fix a bug in position_in_interval(). Do not overwrite one interval endpoint
with another.
2022-01-19 18:10:11 +03:00
Sergei Petrunia
27539cd2c8 MDEV-26801: Valgrind/MSAN errors in Column_statistics_collected::finish ...
The problem was introduced in fix for MDEV-26724. That patch has made it
possible for histogram collection to fail. In particular, it fails for
non-assigned characters.

When histogram construction fails, we also abort the computation of
COUNT(DISTINCT). When we try to use the value, we get valgrind failures.

Switched the code to abort the statistics collection in this case.
2022-01-19 18:10:11 +03:00
Sergei Petrunia
93d5980435 MDEV-26709: JSON histogram may contain bucketS than histogram_size allows
When computing bucket_capacity= records/histogram->get_width(), round
the value UP, not down.
2022-01-19 18:10:11 +03:00
Sergei Petrunia
3936dc3353 MDEV-26724 Endless loop in json_escape_to_string upon ... empty string
Part#3:
- make json_escape() return different errors on conversion error
  and on out-of-space condition.
- Make histogram code handle conversion errors.
2022-01-19 18:10:11 +03:00
Sergei Petrunia
b17f33a04b MDEV-26737: Outdated VARIABLE_COMMENT for HISTOGRAM_TYPE in I_S.SYSTEM_VARIABLES
Fix the description
2022-01-19 18:10:11 +03:00
Sergei Petrunia
5d66eeb3a1 MDEV-26724 Endless loop in json_escape_to_string upon ... empty string
.. part#2: correctly pass the charset to JSON [un]escape functions
2022-01-19 18:10:11 +03:00
Sergei Petrunia
43a8d9f156 MDEV-26595: ASAN use-after-poison my_strnxfrm_simple_internal / Histogram_json_hb::range_selectivity
Add testcase
2022-01-19 18:10:11 +03:00
Sergei Petrunia
5ef350a7f1 MDEV-26589: Assertion failure upon DECODE_HISTOGRAM with NULLs
Item_func_decode_histogram::val_str should correctly set null_value
when "decoding" JSON histogram.
2022-01-19 18:10:11 +03:00
Sergei Petrunia
5c709ef18c MDEV-26724 Endless loop in json_escape_to_string upon ... empty string
Correctly handle empty string when [un]escaping JSON
2022-01-19 18:10:11 +03:00
Sergei Petrunia
61cd4f4412 MDEV-26711: Values in JSON histograms are not properly quoted
Escape values when serializing to JSON. Un-escape when reading back.
2022-01-19 18:10:11 +03:00
Sergei Petrunia
28ad128585 Fix off-by-one error in Histogram_json_hb::find_bucket 2022-01-19 18:10:10 +03:00
Sergei Petrunia
b179640219 MDEV-26590: Stack smashing/buffer overflow in Histogram_json_hb::parse
Provide buffer of sufficient size.
2022-01-19 18:10:10 +03:00
Sergei Petrunia
382250c05c Address review input 2022-01-19 18:10:10 +03:00
Sergei Petrunia
6375873c9a Fixes in opt_histogram_json.cc in the last commits
Aslo add more test coverage
2022-01-19 18:10:10 +03:00
Sergei Petrunia
223fa6a891 Make tests pass
- Fix bad tests in statistics_json test: make them meaningful and make them
  work on windows
- Fix analyze_debug.test: correctly handle errors during ANALYZE
2022-01-19 18:10:10 +03:00
Sergei Petrunia
2a1cdbabec Fix JSON parsing: future-proof data representation in JSON, code cleanup 2022-01-19 18:10:09 +03:00
Sergei Petrunia
f76e310ace Rename histogram_type=JSON to JSON_HB 2022-01-19 18:10:09 +03:00
Michael Okoko
058a90e6f5 Use existing statistics test to improve coverage for JSON statistics
Signed-off-by: Michael Okoko <okokomichaels@outlook.com>
2022-01-19 18:10:09 +03:00
Michael Okoko
bff65a813e Implement point selectivity for JSON histograms
* Also merges tests relating to JSON statistics into one file

Signed-off-by: Michael Okoko <okokomichaels@outlook.com>
2022-01-19 18:10:08 +03:00
Michael Okoko
3d952cd8bd Improve tests and test results to cover larger cases
Signed-off-by: Michael Okoko <okokomichaels@outlook.com>
2022-01-19 18:10:08 +03:00
Michael Okoko
c129689ddc Use binary search to compute range selectivity
* it also adds an "explain select" statement to the test so that the fprintf calls
  can print the computed intervals to mysqld.1.err

Signed-off-by: Michael Okoko <okokomichaels@outlook.com>
2022-01-19 18:10:08 +03:00
Michael Okoko
e778d12f83 report parse error when parsing JSON histogram fails
Signed-off-by: Michael Okoko <okokomichaels@outlook.com>
2022-01-19 18:10:08 +03:00
Michael Okoko
9954aecc2b Store bucket bounds and extend test cases for JSON histogram
This fixes the memory allocation for json histogram builder and add more column types for testing.
Some challenges at the moment include:
* Garbage value at the end of JSON array still persists.
* Garbage value also gets appended to bucket values if the column is a primary key.
* There's a memory leak resulting in a "Warning: Memory not freed" message at the end of tests.

Signed-off-by: Michael Okoko <okokomichaels@outlook.com>
2022-01-19 18:10:07 +03:00
Michael Okoko
237447de63 rough base for json histogram builder
Signed-off-by: Michael Okoko <okokomichaels@outlook.com>
2022-01-19 18:10:07 +03:00
Michael Okoko
79cdb535da add json statistics test and change histogram column type to blob 2022-01-19 18:10:07 +03:00