Sergei Petrunia
d8d57d2c27
MDEV-26764: JSON_HB Histograms: handle BINARY and unassigned characters
...
Encode such characters in hex.
2022-01-19 18:10:12 +03:00
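The hex fallback mentioned in MDEV-26764 can be illustrated with a small sketch. It is written in Python rather than the server's C++, and the function name `encode_endpoint` and the `start` / `start_hex` keys are assumptions made for illustration; the commit message only states that such characters are encoded in hex. The sketch covers undecodable bytes only, not unassigned code points.

```python
def encode_endpoint(raw: bytes) -> dict:
    """Return a JSON-ready dict for one bucket endpoint (hypothetical layout).

    Bytes that decode as UTF-8 are stored as text; anything else is
    stored as an uppercase hex string so no information is lost.
    """
    try:
        return {"start": raw.decode("utf-8")}
    except UnicodeDecodeError:
        # Not representable as text: fall back to hex.
        return {"start_hex": raw.hex().upper()}
```

A reader of such a histogram would check which of the two keys is present and decode accordingly.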
Sergei Petrunia
748b293c14
More test coverage
2022-01-19 18:10:12 +03:00
Sergei Petrunia
c2d2c1e727
MDEV-26519: Improved histograms
...
Save extra information in the histogram:
"target_histogram_size": nnn,
"collected_at": "(date and time)",
"collected_by": "(server version)",
2022-01-19 18:10:12 +03:00
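As a rough illustration of the three extra fields listed above, the following Python sketch builds such a header and prints it as JSON. The helper name `histogram_header`, the timestamp format, and the sample values (254, the version string) are assumptions; only the three key names come from the commit message.

```python
import json
from datetime import datetime


def histogram_header(target_size: int, server_version: str, when: datetime) -> dict:
    """Build the extra metadata fields stored alongside the buckets."""
    return {
        "target_histogram_size": target_size,
        # The exact date/time format is an assumption for this sketch.
        "collected_at": when.strftime("%Y-%m-%d %H:%M:%S"),
        "collected_by": server_version,
    }


# Placeholder values, not taken from a real server.
header = histogram_header(254, "10.8.0-MariaDB", datetime(2022, 1, 19, 18, 10, 12))
print(json.dumps(header, indent=2))
```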
Sergei Petrunia
a0916cf5a2
MDEV-26519: Improved histograms: Better error reporting, test coverage
...
Also report JSON histogram load errors to the error log, as is already
done for other histogram/statistics load errors.
Add test coverage to see what happens if one upgrades but does NOT run
mysql_upgrade.
2022-01-19 18:10:11 +03:00
Sergei Petrunia
a0f93f433a
Rename histogram_hb_v2 -> histogram_hb
2022-01-19 18:10:11 +03:00
Sergei Petrunia
1d14176ec4
MDEV-26519: Improved histograms: Make JSON parser efficient
...
The previous JSON parser used an API that made parsing
inefficient: the same JSON content was parsed again and again.
Switch to a lower-level parsing API that allows parsing
to be done efficiently.
2022-01-19 18:10:11 +03:00
Sergei Petrunia
be55ad0d34
MDEV-27062: Make histogram_type=JSON_HB the new default
2022-01-19 18:10:11 +03:00
Sergei Petrunia
eb6a9ad705
MDEV-26886: Estimation for filtered rows less precise with JSON histogram
...
- Make Histogram_json_hb::range_selectivity handle singleton buckets
specially when computing selectivity of the max. endpoint bound.
(for min. endpoint, we already do that).
- Also, fixed comments for Histogram_json_hb::find_bucket
2022-01-19 18:10:11 +03:00
Sergei Petrunia
ac0194bd0e
MDEV-26892: JSON histograms become invalid with a specific (corrupt) value ..
...
Handle the case where the last value in the table cannot be represented
in utf8mb4.
2022-01-19 18:10:11 +03:00
Sergei Petrunia
05877df472
MDEV-26849: JSON Histograms: point selectivity estimates are off
...
.. for non-existent values.
Handle this special case.
2022-01-19 18:10:11 +03:00
Sergei Petrunia
f3f78bed85
MDEV-26750: Estimation for filtered rows is far off with JSON_HB histogram
...
Fix a bug in position_in_interval(). Do not overwrite one interval endpoint
with another.
2022-01-19 18:10:11 +03:00
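The kind of computation a function like position_in_interval() performs can be sketched as a linear interpolation that reads both endpoints and modifies neither. This is a Python illustration under assumptions, not the server code: the numeric-only signature, the clamping, and the fallback for a degenerate interval are all choices made here.

```python
def position_in_interval(value: float, lo: float, hi: float) -> float:
    """Relative position of `value` inside [lo, hi], clamped to [0, 1].

    Both endpoints are only read; neither is reassigned, so the
    interval seen by the caller stays intact.
    """
    if hi <= lo:
        # Degenerate interval; returning 0.0 is an assumption of this sketch.
        return 0.0
    pos = (value - lo) / (hi - lo)
    return min(1.0, max(0.0, pos))
```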
Sergei Petrunia
93d5980435
MDEV-26709: JSON histogram may contain more buckets than histogram_size allows
...
When computing bucket_capacity= records/histogram->get_width(), round
the value UP, not down.
2022-01-19 18:10:11 +03:00
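A short Python sketch shows why the rounding direction matters. The function names are illustrative; `width` stands for the value returned by histogram->get_width() in the commit message.

```python
def bucket_capacity(records: int, width: int) -> int:
    """Rows per bucket, rounded UP (ceiling division)."""
    return (records + width - 1) // width


def buckets_needed(records: int, capacity: int) -> int:
    """How many buckets of the given capacity are needed to hold all rows."""
    return (records + capacity - 1) // capacity


records, width = 10, 4
rounded_down = records // width                  # 2 rows per bucket
rounded_up = bucket_capacity(records, width)     # 3 rows per bucket
print(buckets_needed(records, rounded_down))     # more buckets than width
print(buckets_needed(records, rounded_up))       # fits within width
```

With 10 rows and a width of 4, rounding down gives a capacity of 2 and therefore 5 buckets, one more than allowed; rounding up gives a capacity of 3 and 4 buckets.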
Sergei Petrunia
3936dc3353
MDEV-26724 Endless loop in json_escape_to_string upon ... empty string
...
Part#3:
- make json_escape() return different errors on conversion error
and on out-of-space condition.
- Make histogram code handle conversion errors.
2022-01-19 18:10:11 +03:00
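The value of separating the two error conditions can be sketched as follows. This is a Python illustration, not the server's json_escape(): the `EscapeError` enum, both function names, and the capacity-doubling retry policy are assumptions. The point is that a caller can retry with a larger buffer on an out-of-space error, while a conversion error will never succeed on retry and must be handled differently.

```python
import json
from enum import Enum


class EscapeError(Enum):
    CONVERSION = "conversion"      # input cannot be converted to text
    OUT_OF_SPACE = "out_of_space"  # output buffer is too small


def escape_into(raw: bytes, capacity: int):
    """Escape `raw` for JSON; return (escaped, None) or (None, error)."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return None, EscapeError.CONVERSION
    escaped = json.dumps(text)[1:-1]  # strip the surrounding quotes
    if len(escaped) > capacity:
        return None, EscapeError.OUT_OF_SPACE
    return escaped, None


def escape_with_retry(raw: bytes, capacity: int = 4):
    """Grow the buffer only on OUT_OF_SPACE; give up on CONVERSION."""
    while True:
        escaped, err = escape_into(raw, capacity)
        if err is EscapeError.OUT_OF_SPACE:
            capacity = max(1, capacity * 2)  # always makes progress
            continue
        return escaped, err
```

If both conditions shared one error code, a retry loop of this shape would never terminate on unconvertible input.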
Sergei Petrunia
5d66eeb3a1
MDEV-26724 Endless loop in json_escape_to_string upon ... empty string
...
.. part#2: correctly pass the charset to JSON [un]escape functions
2022-01-19 18:10:11 +03:00
Sergei Petrunia
5c709ef18c
MDEV-26724 Endless loop in json_escape_to_string upon ... empty string
...
Correctly handle empty string when [un]escaping JSON
2022-01-19 18:10:11 +03:00
Sergei Petrunia
61cd4f4412
MDEV-26711: Values in JSON histograms are not properly quoted
...
Escape values when serializing to JSON. Un-escape when reading back.
2022-01-19 18:10:11 +03:00
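The escape-on-write, un-escape-on-read round trip can be sketched with Python's standard json module. The function names are illustrative and this is not the server's implementation; it only demonstrates the property the fix is after, including for the empty string.

```python
import json


def serialize_value(value: str) -> str:
    """Escape a value and wrap it in quotes, as it would appear in JSON."""
    return json.dumps(value)


def parse_value(token: str) -> str:
    """Un-escape a quoted JSON string back to the original value."""
    return json.loads(token)


# Values containing quotes, backslashes, and control characters
# must survive the round trip unchanged.
samples = ["", 'a"b', "back\\slash", "line\nbreak"]
for v in samples:
    assert parse_value(serialize_value(v)) == v
```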
Sergei Petrunia
d03daaf8a8
Use JSON_NAME, not the "histogram_hb_v2" constant
2022-01-19 18:10:10 +03:00
Sergei Petrunia
28ad128585
Fix off-by-one error in Histogram_json_hb::find_bucket
2022-01-19 18:10:10 +03:00
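The commit does not say where the off-by-one was, but a bucket lookup of this kind is easy to get wrong at exact boundary values. The Python sketch below is an assumption-laden illustration: it models buckets by a sorted list of start values and returns the index of the bucket that should contain a given value.

```python
from bisect import bisect_right


def find_bucket(starts: list, value) -> int:
    """Index of the last bucket whose start is <= value.

    `starts` must be sorted ascending. Values below the first start
    are clamped to bucket 0.
    """
    # bisect_right returns the insertion point after any equal entries,
    # so subtracting 1 lands on the bucket that begins at or before value.
    idx = bisect_right(starts, value) - 1
    return max(idx, 0)
```

Using bisect_left here instead would place a value equal to a bucket's start into the preceding bucket, which is the typical shape of an off-by-one in such a search.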
Sergei Petrunia
b179640219
MDEV-26590: Stack smashing/buffer overflow in Histogram_json_hb::parse
...
Provide buffer of sufficient size.
2022-01-19 18:10:10 +03:00
Sergei Petrunia
382250c05c
Address review input
2022-01-19 18:10:10 +03:00
Sergei Petrunia
cf8927e9cb
Fix the previous cset: next() should have element_count as parameter
2022-01-19 18:10:10 +03:00
Sergei Petrunia
b6121ca36a
Fix compile warnings/error on Windows
2022-01-19 18:10:10 +03:00
Sergei Petrunia
6375873c9a
Fixes in opt_histogram_json.cc in the last commits
...
Also add more test coverage
2022-01-19 18:10:10 +03:00
Sergei Petrunia
ace961a1e7
Fix compile error on Windows
2022-01-19 18:10:10 +03:00
Sergei Petrunia
f460272054
MDEV-26519: JSON Histograms: improve histogram collection
...
Basic ideas:
1. Store "popular" values in their own buckets.
2. Also store ndv (Number of Distinct Values) in each bucket.
Because of #1, the buckets are now variable-size, so store the size in
each bucket.
Adjust selectivity estimation functions accordingly.
2022-01-19 18:10:10 +03:00
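The two basic ideas above can be sketched in Python. This is a simplified illustration under assumptions, not the collection code itself: the function name, the `start` / `size` / `ndv` keys, and the rule that a value is "popular" when its row count reaches the per-bucket capacity are all choices made for this sketch.

```python
from itertools import groupby


def build_histogram(sorted_values: list, n_buckets: int) -> list:
    """Build variable-size buckets from an already sorted list of values."""
    total = len(sorted_values)
    capacity = -(-total // n_buckets)  # ceiling division
    buckets, current = [], None

    for value, group in groupby(sorted_values):
        count = len(list(group))
        if count >= capacity:
            # Popular value: close any open bucket, then give it its own.
            if current:
                buckets.append(current)
                current = None
            buckets.append({"start": value, "rows": count, "ndv": 1})
            continue
        if current and current["rows"] + count > capacity:
            buckets.append(current)
            current = None
        if current is None:
            current = {"start": value, "rows": 0, "ndv": 0}
        current["rows"] += count
        current["ndv"] += 1

    if current:
        buckets.append(current)
    # Store each bucket's size as a fraction of all rows, plus its ndv.
    return [{"start": b["start"], "size": b["rows"] / total, "ndv": b["ndv"]}
            for b in buckets]


hist = build_histogram([1, 2, 2, 2, 2, 2, 3, 4, 5, 6], 4)
```

In this example the value 2 occupies half of the rows and ends up alone in a bucket with ndv 1, while the values 3, 4 and 5 share one bucket with ndv 3. Because bucket sizes differ, each bucket has to record its own size.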
Sergei Petrunia
d64e104810
Fix compilation on Windows
2022-01-19 18:10:10 +03:00
Sergei Petrunia
5ddbd72af4
Correctly decode string field values for pos_in_interval_for_string call
2022-01-19 18:10:10 +03:00
Sergei Petrunia
e0f42d32e5
Fix compilation on Windows part #3
2022-01-19 18:10:10 +03:00
Sergei Petrunia
9271bd17f7
More code cleanups
...
Remove Histogram_*::is_available(), it is not applicable anymore.
Fix compilation on Windows
2022-01-19 18:10:10 +03:00
Sergei Petrunia
1d98168547
Move JSON histograms code into its own files
2022-01-19 18:10:10 +03:00