Commit graph

44 commits

Author SHA1 Message Date
Sergei Golubchik
aebd16201f don't use session locale for the error log 2024-05-27 12:39:04 +02:00
Monty
dfdedd46e4 MDEV-32188 make TIMESTAMP use whole 32-bit unsigned range
This patch extends the timestamp from
2038-01-19 03:14:07.999999 to 2106-02-07 06:28:15.999999
for 64 bit hardware and OS where 'long' is 64 bits.
This is true for 64 bit Linux but not for Windows.

This is done by treating the 32 bit stored int as unsigned instead of
signed.  This is safe as MariaDB has never accepted dates before the epoch
(1970).
The benefit of this approach is that for normal timestamps the storage is
compatible with earlier versions.

However, for tables using system versioning we previously stored a
timestamp with the year 2038 as the 'max timestamp', which is used to
detect current values.  This patch stores the new year-2106 max value
as the max timestamp. This means that old tables using system
versioning need to be updated with mariadb-upgrade when moving them
to 11.4. That will be done in a separate commit.
2024-05-27 12:39:02 +02:00
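The effect of reading the stored 32-bit value as unsigned instead of signed can be checked with a few lines of arithmetic. This is a minimal sketch, not server code: the helper names `as_signed` and `to_utc` are made up for illustration, and the values are plain seconds since the Unix epoch in UTC.

```python
import struct
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def as_signed(bits: int) -> int:
    """Reinterpret a 32-bit pattern as a signed integer (hypothetical helper)."""
    return struct.unpack("<i", struct.pack("<I", bits))[0]

def to_utc(seconds: int) -> datetime:
    """Convert seconds since the epoch to a UTC datetime (hypothetical helper)."""
    return EPOCH + timedelta(seconds=seconds)

# Largest value under the signed interpretation: the old 2038 limit.
print(to_utc(0x7FFFFFFF))  # 2038-01-19 03:14:07+00:00
# Largest value under the unsigned interpretation: the new 2106 limit.
print(to_utc(0xFFFFFFFF))  # 2106-02-07 06:28:15+00:00
# The same bit pattern is negative when signed, one second past 2038 when unsigned.
print(as_signed(0x80000000), to_utc(0x80000000))
```

Because bit patterns up to 0x7FFFFFFF mean the same thing under both readings, existing stored timestamps keep their value, which matches the compatibility claim in the commit message.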
Marko Mäkelä
d5e15424d8 Merge 10.6 into 10.10
The MDEV-29693 conflict resolution is from Monty, as is a bug fix for
ANALYZE TABLE wrongly building histograms for a single-column
PRIMARY KEY.
Also includes a fix for safe_malloc error reporting.

Other things:
- Copied main.log_slow from 10.4 to avoid mtr issue

Disabled test:
- spider/bugfix.mdev_27239 because we started to get
  +Error	1429 Unable to connect to foreign data source: localhost
  -Error	1158 Got an error reading communication packets
- main.delayed
  - Bug#54332 Deadlock with two connections doing LOCK TABLE+INSERT DELAYED
    This part is disabled for now as it fails randomly with different
    warnings/errors (no corruption).
2023-10-14 13:36:11 +03:00
Sergei Petrunia
51bce3c59a MDEV-28882: Assertion `tmp >= 0' failed in best_access_path
Histogram_json_hb::range_selectivity() may return small negative
numbers due to rounding errors in the histogram.

Make sure the returned value is non-negative.
Add an assert to catch negative values that are not small.

(attempt #2)
2022-06-22 13:39:48 +03:00
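The pattern described here (clamp tiny negative results, assert on large ones) can be sketched as follows. This is an illustration only: the function name `sanitize_selectivity` and the tolerance `ROUNDING_TOLERANCE` are assumptions, and the commit message does not say what threshold the server uses.

```python
ROUNDING_TOLERANCE = 1e-6  # hypothetical threshold for "small" negative values

def sanitize_selectivity(sel: float) -> float:
    """Clamp rounding noise to zero; fail loudly on clearly negative input."""
    # A value far below zero indicates a real bug, not a rounding error.
    assert sel >= -ROUNDING_TOLERANCE, f"selectivity {sel} is too negative"
    return max(sel, 0.0)

print(sanitize_selectivity(-1e-12))  # 0.0
print(sanitize_selectivity(0.25))    # 0.25
```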
Oleksandr Byelkin
307b2991d6 Fix JSON statistics time format and add tests for it and the server version. 2022-02-07 08:44:32 +01:00
Marko Mäkelä
93756c992f MDEV-27229 fixup: GCC -Wunused-function 2022-01-25 09:00:18 +02:00
Marko Mäkelä
852534dc99 MDEV-26519 fixup: GCC 11 -Og -Wmaybe-uninitialized
GCC does not understand that the variable have_ndv determines
whether the variable ndv_ll is initialized. Let us add a
redundant initialization to pacify GCC.
2022-01-20 08:24:03 +02:00
Sergei Petrunia
ce4956f322 Code cleanup 2022-01-19 18:14:07 +03:00
Sergei Petrunia
4842a56356 JSON_HB histogram: represent values of BIT() columns in hex always 2022-01-19 18:10:12 +03:00
Sergei Petrunia
dae20dde4e MDEV-26901: Estimation for filtered rows less precise ... #4
In Histogram_json_hb::point_selectivity(), do return selectivity of 0.0
when the histogram says so.

The logic of "Do not return 0.0 estimate as it causes a multiply-by-zero
meltdown in cost and cardinality calculations" is moved into
records_in_column_ranges() where it is done *once* per column pair (as
opposed to once per range, which can cause the error to add up
to a large number when there are many ranges).
2022-01-19 18:10:12 +03:00
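Why the placement of the non-zero floor matters can be shown with a small sketch. The two functions below are hypothetical simplifications, not the server's records_in_column_ranges(); the floor of one row (`min_rows`) is an assumed value chosen for illustration.

```python
def rows_floor_per_range(range_sels, total_rows, min_rows=1.0):
    """Apply the non-zero floor to every range, then sum (the old behaviour)."""
    return sum(max(sel * total_rows, min_rows) for sel in range_sels)

def rows_floor_once(range_sels, total_rows, min_rows=1.0):
    """Sum the raw per-range estimates, then apply the floor a single time."""
    return max(sum(sel * total_rows for sel in range_sels), min_rows)

# 100 ranges that the histogram says match nothing, in a 1000-row table.
sels = [0.0] * 100
print(rows_floor_per_range(sels, 1000))  # 100.0
print(rows_floor_once(sels, 1000))       # 1.0
```

With many empty ranges the per-range floor inflates the estimate in proportion to the number of ranges, while the single floor still avoids a zero result.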
Sergei Petrunia
db8f15be93 MDEV-27229: Estimation for filtered rows less precise ... #5
Followup: remove this line from get_column_range_cardinality()

      set_if_bigger(res, col_stats->get_avg_frequency());

and make sure it is only used with the binary histograms.
For JSON histograms, it makes the estimates unnecessarily imprecise.
2022-01-19 18:10:12 +03:00
Sergei Petrunia
531dd708ef MDEV-27229: Estimation for filtered rows less precise ... #5
Fix special handling for values that are right next to buckets with ndv=1.
2022-01-19 18:10:12 +03:00
Sergei Petrunia
905634dc3f MDEV-27230: Estimation for filtered rows less precise ...
Fix the code in Histogram_json_hb::range_selectivity that handles
special cases: a non-inclusive endpoint hitting a bucket boundary...
2022-01-19 18:10:12 +03:00
Sergei Petrunia
08f1c4a2e0 MDEV-27203: Valgrind / MSAN errors in Histogram_json_hb::parse_bucket
In read_bucket_endpoint(), handle all possible parser states.
2022-01-19 18:10:12 +03:00
Sergei Petrunia
d8d57d2c27 MDEV-26764: JSON_HB Histograms: handle BINARY and unassigned characters
Encode such characters in hex.
2022-01-19 18:10:12 +03:00
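One way to picture the hex fallback is the sketch below. It is an assumption-laden illustration: the key names `text` and `hex` are invented here, and a failed UTF-8 decode stands in for the server's own test for binary or unassigned characters, which the commit message does not describe.

```python
def encode_endpoint(raw: bytes) -> dict:
    """Store a bucket endpoint as text when possible, otherwise as hex."""
    try:
        return {"text": raw.decode("utf-8")}
    except UnicodeDecodeError:
        # Bytes that are not valid text are kept losslessly as hex digits.
        return {"hex": raw.hex()}

def decode_endpoint(item: dict) -> bytes:
    """Reverse encode_endpoint()."""
    if "hex" in item:
        return bytes.fromhex(item["hex"])
    return item["text"].encode("utf-8")

print(encode_endpoint(b"abc"))       # {'text': 'abc'}
print(encode_endpoint(b"\xff\xfe"))  # {'hex': 'fffe'}
```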
Sergei Petrunia
748b293c14 More test coverage 2022-01-19 18:10:12 +03:00
Sergei Petrunia
c2d2c1e727 MDEV-26519: Improved histograms
Save extra information in the histogram:

    "target_histogram_size": nnn,
    "collected_at": "(date and time)",
    "collected_by": "(server version)",
2022-01-19 18:10:12 +03:00
Sergei Petrunia
a0916cf5a2 MDEV-26519: Improved histograms: Better error reporting, test coverage
Also report JSON histogram load errors to the error log, as is already
done for other histogram/statistics load errors.

Add test coverage to see what happens if one upgrades but does NOT run
mysql_upgrade.
2022-01-19 18:10:11 +03:00
Sergei Petrunia
a0f93f433a Rename histogram_hb_v2 -> histogram_hb 2022-01-19 18:10:11 +03:00
Sergei Petrunia
1d14176ec4 MDEV-26519: Improved histograms: Make JSON parser efficient
The previous JSON parser used an API which made the parsing
inefficient: the same JSON content was parsed again and again.

Switch to a lower-level parsing API that allows the parsing
to be done efficiently.
2022-01-19 18:10:11 +03:00
Sergei Petrunia
be55ad0d34 MDEV-27062: Make histogram_type=JSON_HB the new default 2022-01-19 18:10:11 +03:00
Sergei Petrunia
eb6a9ad705 MDEV-26886: Estimation for filtered rows less precise with JSON histogram
- Make Histogram_json_hb::range_selectivity handle singleton buckets
  specially when computing selectivity of the max. endpoint bound.
  (for min. endpoint, we already do that).

- Also, fixed comments for Histogram_json_hb::find_bucket
2022-01-19 18:10:11 +03:00
Sergei Petrunia
ac0194bd0e MDEV-26892: JSON histograms become invalid with a specific (corrupt) value ..
Handle the case where the last value in the table cannot be represented
in utf8mb4.
2022-01-19 18:10:11 +03:00
Sergei Petrunia
05877df472 MDEV-26849: JSON Histograms: point selectivity estimates are off
.. for non-existent values.

Handle this special case.
2022-01-19 18:10:11 +03:00
Sergei Petrunia
f3f78bed85 MDEV-26750: Estimation for filtered rows is far off with JSON_HB histogram
Fix a bug in position_in_interval(). Do not overwrite one interval endpoint
with another.
2022-01-19 18:10:11 +03:00
Sergei Petrunia
93d5980435 MDEV-26709: JSON histogram may contain more buckets than histogram_size allows
When computing bucket_capacity= records/histogram->get_width(), round
the value UP, not down.
2022-01-19 18:10:11 +03:00
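The rounding direction decides whether the bucket limit holds. The sketch below assumes equal-capacity buckets and uses a hypothetical helper `bucket_count`; the real collector is more involved, but the arithmetic is the same.

```python
def bucket_count(records: int, width: int, round_up: bool) -> int:
    """Number of buckets needed when each holds `capacity` records."""
    if round_up:
        capacity = -(-records // width)  # ceiling division
    else:
        capacity = records // width      # floor division
    capacity = max(1, capacity)          # guard against a zero capacity
    return -(-records // capacity)

# 10 records, at most 4 buckets allowed.
print(bucket_count(10, 4, round_up=False))  # 5, one more than allowed
print(bucket_count(10, 4, round_up=True))   # 4
```

Rounding down gives a capacity of 2 and therefore 5 buckets; rounding up gives a capacity of 3 and the records fit in 4.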
Sergei Petrunia
3936dc3353 MDEV-26724 Endless loop in json_escape_to_string upon ... empty string
Part#3:
- make json_escape() return different errors on conversion error
  and on out-of-space condition.
- Make histogram code handle conversion errors.
2022-01-19 18:10:11 +03:00
Sergei Petrunia
5d66eeb3a1 MDEV-26724 Endless loop in json_escape_to_string upon ... empty string
.. part#2: correctly pass the charset to JSON [un]escape functions
2022-01-19 18:10:11 +03:00
Sergei Petrunia
5c709ef18c MDEV-26724 Endless loop in json_escape_to_string upon ... empty string
Correctly handle empty string when [un]escaping JSON
2022-01-19 18:10:11 +03:00
Sergei Petrunia
61cd4f4412 MDEV-26711: Values in JSON histograms are not properly quoted
Escape values when serializing to JSON. Un-escape when reading back.
2022-01-19 18:10:11 +03:00
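The need for escaping is easy to reproduce. In this sketch Python's standard `json` module stands in for the server's own escape and un-escape routines, and `naive_quote` is a hypothetical helper that shows what happens without escaping.

```python
import json

def naive_quote(value: str) -> str:
    """Wrap a value in double quotes without escaping anything."""
    return '"' + value + '"'

value = 'say "hi"\\ and\nbye'

# Escaping on write and un-escaping on read round-trips the value.
escaped = json.dumps(value)
print(json.loads(escaped) == value)  # True

# Without escaping, the embedded quote ends the string early.
try:
    json.loads(naive_quote(value))
    naive_ok = True
except json.JSONDecodeError:
    naive_ok = False
print(naive_ok)  # False
```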
Sergei Petrunia
d03daaf8a8 Use JSON_NAME, not the "histogram_hb_v2" constant 2022-01-19 18:10:10 +03:00
Sergei Petrunia
28ad128585 Fix off-by-one error in Histogram_json_hb::find_bucket 2022-01-19 18:10:10 +03:00
Sergei Petrunia
b179640219 MDEV-26590: Stack smashing/buffer overflow in Histogram_json_hb::parse
Provide buffer of sufficient size.
2022-01-19 18:10:10 +03:00
Sergei Petrunia
382250c05c Address review input 2022-01-19 18:10:10 +03:00
Sergei Petrunia
cf8927e9cb Fix the previous cset: next() should have element_count as parameter 2022-01-19 18:10:10 +03:00
Sergei Petrunia
b6121ca36a Fix compile warnings/error on Windows 2022-01-19 18:10:10 +03:00
Sergei Petrunia
6375873c9a Fixes in opt_histogram_json.cc in the last commits
Also add more test coverage
2022-01-19 18:10:10 +03:00
Sergei Petrunia
ace961a1e7 Fix compile error on Windows 2022-01-19 18:10:10 +03:00
Sergei Petrunia
f460272054 MDEV-26519: JSON Histograms: improve histogram collection
Basic ideas:
1. Store "popular" values in their own buckets.
2. Also store ndv (Number of Distinct Values) in each bucket.

Because of #1, the buckets are now variable-size, so store the size in
each bucket.

Adjust selectivity estimation functions accordingly.
2022-01-19 18:10:10 +03:00
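The two ideas above can be sketched as a toy histogram builder. Everything here is an assumption made for illustration: the function name, the dictionary keys, the rule that a value is "popular" when it alone fills a bucket, and the use of row counts for `size`. The server's actual collection code and JSON layout may differ.

```python
from itertools import groupby

def build_histogram(values, n_buckets):
    """Build variable-size buckets; popular values get a bucket of their own."""
    values = sorted(values)
    capacity = max(1, -(-len(values) // n_buckets))  # rounded up
    buckets, current = [], None
    for value, run in groupby(values):
        count = len(list(run))
        if count >= capacity:
            # Popular value: close the open bucket and give it its own.
            if current is not None:
                buckets.append(current)
                current = None
            buckets.append({"start": value, "size": count, "ndv": 1})
            continue
        if current is None:
            current = {"start": value, "size": 0, "ndv": 0}
        current["size"] += count
        current["ndv"] += 1  # one more distinct value in this bucket
        if current["size"] >= capacity:
            buckets.append(current)
            current = None
    if current is not None:
        buckets.append(current)
    return buckets

data = [1, 2, 3] + [5] * 10 + [7, 8]
for bucket in build_histogram(data, 3):
    print(bucket)
```

For this input the value 5 ends up alone in a bucket of size 10 with ndv 1, while the neighbouring buckets are smaller and record 3 and 2 distinct values, which is why each bucket has to carry its own size.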
Sergei Petrunia
d64e104810 Fix compilation on Windows 2022-01-19 18:10:10 +03:00
Sergei Petrunia
5ddbd72af4 Correctly decode string field values for pos_in_interval_for_string call 2022-01-19 18:10:10 +03:00
Sergei Petrunia
e0f42d32e5 Fix compilation on Windows part #3 2022-01-19 18:10:10 +03:00
Sergei Petrunia
9271bd17f7 More code cleanups
Remove Histogram_*::is_available(), it is not applicable anymore.
Fix compilation on Windows
2022-01-19 18:10:10 +03:00
Sergei Petrunia
1d98168547 Move JSON histograms code into its own files 2022-01-19 18:10:10 +03:00