Commit graph

302 commits

Author SHA1 Message Date
Marko Mäkelä
33907f9ec6 Merge 11.4 into 11.7 2024-12-02 17:51:17 +02:00
Marko Mäkelä
2719cc4925 Merge 10.11 into 11.4 2024-12-02 11:35:34 +02:00
Marko Mäkelä
3d23adb766 Merge 10.6 into 10.11 2024-11-29 13:43:17 +02:00
Brandon Nesterenko
78d7bb1d27 MDEV-34348: Miscellaneous fixes
Partial commit of the greater MDEV-34348 scope.
MDEV-34348: MariaDB is violating clang-16 -Wcast-function-type-strict

Various additional fixes, each too small to put into
their own commit.

Reviewed By:
============
Marko Mäkelä <marko.makela@mariadb.com>
2024-11-23 08:14:23 -07:00
Brandon Nesterenko
840fe316d4 MDEV-34348: my_hash_get_key fixes
Partial commit of the greater MDEV-34348 scope.
MDEV-34348: MariaDB is violating clang-16 -Wcast-function-type-strict

Change the type of my_hash_get_key to:
 1) Return const
 2) Change the context parameter to be const void*

Also fix casting in hash adjacent areas.

Reviewed By:
============
Marko Mäkelä <marko.makela@mariadb.com>
2024-11-23 08:14:22 -07:00
Brandon Nesterenko
dbfee9fc2b MDEV-34348: Consolidate cmp function declarations
Partial commit of the greater MDEV-34348 scope.
MDEV-34348: MariaDB is violating clang-16 -Wcast-function-type-strict

The functions queue_compare, qsort2_cmp, and qsort_cmp2
all had similar interfaces, and were used interchangable
and unsafely cast to one another.

This patch consolidates the functions all into the
qsort_cmp2 interface.

Reviewed By:
============
Marko Mäkelä <marko.makela@mariadb.com>
2024-11-23 08:14:22 -07:00
Sergei Golubchik
062f8eb37d cleanup: key algorithm vs key flags
the information about index algorithm was stored in two
places inconsistently split between both.

BTREE index could have key->algorithm == HA_KEY_ALG_BTREE, if the user
explicitly specified USING BTREE or HA_KEY_ALG_UNDEF, if not.

RTREE index had key->algorithm == HA_KEY_ALG_RTREE
and always had key->flags & HA_SPATIAL

FULLTEXT index had  key->algorithm == HA_KEY_ALG_FULLTEXT
and always had key->flags & HA_FULLTEXT

HASH index had key->algorithm == HA_KEY_ALG_HASH or HA_KEY_ALG_UNDEF

long unique index always had key->algorithm == HA_KEY_ALG_LONG_HASH

In this commit:

All indexes except BTREE and HASH always have key->algorithm
set, HA_SPATIAL and HA_FULLTEXT flags are not used anymore (except
for storage to keep frms backward compatible).

As a side effect ALTER TABLE now detects FULLTEXT index renames correctly
2024-11-05 14:00:47 -08:00
Sergei Golubchik
32e6f8ff2e cleanup: remove unconditional #ifdef's 2024-11-05 14:00:47 -08:00
Marko Mäkelä
43465352b9 Merge 11.4 into 11.6 2024-10-03 16:09:56 +03:00
Marko Mäkelä
12a91b57e2 Merge 10.11 into 11.2 2024-10-03 13:24:43 +03:00
Marko Mäkelä
63913ce5af Merge 10.6 into 10.11 2024-10-03 10:55:08 +03:00
Sergei Golubchik
b88f1267e4 MDEV-33373 part 2: Unexpected ER_FILE_NOT_FOUND upon reading from logging table after crash recovery
CSV engine shoud set my_errno if use it.
2024-09-30 13:50:51 +02:00
Alexander Barkov
4e805aed85 Merge remote-tracking branch 'origin/11.4' into 11.5 2024-07-10 12:17:09 +04:00
Oleksandr Byelkin
2447dda2c0 Merge branch '10.11' into 11.1 2024-07-08 22:40:16 +02:00
Marko Mäkelä
27a3366663 Merge 10.6 into 10.11 2024-06-27 10:26:09 +03:00
Dave Gosselin
db0c28eff8 MDEV-33746 Supply missing override markings
Find and fix missing virtual override markings.  Updates cmake
maintainer flags to include -Wsuggest-override and
-Winconsistent-missing-override.
2024-06-20 11:32:13 -04:00
Yuchen Pei
2d3e2c58b6
Merge branch '10.11' into 11.1 2024-05-31 10:54:31 +10:00
Marko Mäkelä
22ba7e4ff8 Merge 10.6 into 10.11 2024-05-30 16:04:00 +03:00
Monty
b9f5793176 MDEV-9101 Limit size of created disk temporary files and tables
Two new variables added:
- max_tmp_space_usage : Limits the the temporary space allowance per user
- max_total_tmp_space_usage: Limits the temporary space allowance for
  all users.

New status variables: tmp_space_used & max_tmp_space_used
New field in information_schema.process_list: TMP_SPACE_USED

The temporary space is counted for:
- All SQL level temporary files. This includes files for filesort,
  transaction temporary space, analyze, binlog_stmt_cache etc.
  It does not include engine internal temporary files used for repair,
  alter table, index pre sorting etc.
- All internal on disk temporary tables created as part of resolving a
  SELECT, multi-source update etc.

Special cases:
- When doing a commit, the last flush of the binlog_stmt_cache
  will not cause an error even if the temporary space limit is exceeded.
  This is to avoid giving errors on commit. This means that a user
  can temporary go over the limit with up to binlog_stmt_cache_size.

Noteworthy issue:
- One has to be careful when using small values for max_tmp_space_limit
  together with binary logging and with non transactional tables.
  If a the binary log entry for the query is bigger than
  binlog_stmt_cache_size and one hits the limit of max_tmp_space_limit
  when flushing the entry to disk, the query will abort and the
  binary log will not contain the last changes to the table.
  This will also stop the slave!
  This is also true for all Aria tables as Aria cannot do rollback
  (except in case of crashes)!
  One way to avoid it is to use @@binlog_format=statement for
  queries that updates a lot of rows.

Implementation:
- All writes to temporary files or internal temporary tables, that
  increases the file size, are routed through temp_file_size_cb_func()
  which updates and checks the temp space usage.
- Most of the temporary file monitoring is done inside IO_CACHE.
  Temporary file monitoring is done inside the Aria engine.
- MY_TRACK and MY_TRACK_WITH_LIMIT are new flags for ini_io_cache().
  MY_TRACK means that we track the file usage. TRACK_WITH_LIMIT means
  that we track the file usage and we give an error if the limit is
  breached. This is used to not give an error on commit when
  binlog_stmp_cache is flushed.
- global_tmp_space_used contains the total tmp space used so far.
  This is needed quickly check against max_total_tmp_space_usage.
- Temporary space errors are using EE_LOCAL_TMP_SPACE_FULL and
  handler errors are using HA_ERR_LOCAL_TMP_SPACE_FULL.
  This is needed until we move general errors to it's own error space
  so that they cannot conflict with system error numbers.
- Return value of my_chsize() and mysql_file_chsize() has changed
  so that -1 is returned in the case my_chsize() could not decrease
  the file size (very unlikely and will not happen on modern systems).
  All calls to _chsize() are updated to check for > 0 as the error
  condition.
- At the destruction of THD we check that THD::tmp_file_space == 0
- At server end we check that global_tmp_space_used == 0
- As a precaution against errors in the tmp_space_used code, one can set
  max_tmp_space_usage and max_total_tmp_space_usage to 0 to disable
  the tmp space quota errors.
- truncate_io_cache() function added.
- Aria tables using static or dynamic row length are registered in 8K
  increments to avoid some calls to update_tmp_file_size().

Other things:
- Ensure that all handler errors are registered.  Before, some engine
  errors could be printed as "Unknown error".
- Fixed bug in filesort() that causes a assert if there was an error
  when writing to the temporay file.
- Fixed that compute_window_func() now takes into account write errors.
- In case of parallel replication, rpl_group_info::cleanup_context()
  could call trans_rollback() with thd->error set, which would cause
  an assert. Fixed by resetting the error before calling trans_rollback().
- Fixed bug in subselect3.inc which caused following test to use
  heap tables with low value for max_heap_table_size
- Fixed bug in sql_expression_cache where it did not overflow
  heap table to Aria table.
- Added Max_tmp_disk_space_used to slow query log.
- Fixed some bugs in log_slow_innodb.test
2024-05-27 12:39:04 +02:00
Monty
c4cad8d50c MDEV-33449 improving repair of tables
This task is to ensure we have a clear definition and rules of how to
repair or optimize a table.

The rules are:

- REPAIR should be used with tables that are crashed and are
  unreadable (hardware issues with not readable blocks, blocks with
  'unexpected data' etc)
- OPTIMIZE table should be used to optimize the storage layout for the
  table (recover space for delete rows and optimize the index
  structure.
- ALTER TABLE table_name FORCE should be used to rebuild the .frm file
  (the table definition) and the table (with the original table row
  format). If the table is from and older MariaDB/MySQL release with a
  different storage format, it will convert the data to the new
  format. ALTER TABLE ... FORCE is used as part of mariadb-upgrade

Here follows some more background:

The 3 ways to repair a table are:
1) ALTER TABLE table_name FORCE" (not other options).
   As an alias we allow: "ALTER TABLE table_name ENGINE=original_engine"
2) "REPAIR TABLE" (without FORCE)
3) "OPTIMIZE TABLE"

All of the above commands will optimize row space usage (which means that
space will be needed to hold a temporary copy of the table) and
re-generate all indexes. They will also try to replicate the original
table definition as exact as possible.

For ALTER TABLE and "REPAIR TABLE without FORCE", the following will hold:
If the table is from an older MariaDB version and data conversion is
needed (for example for old type HASH columns, MySQL JSON type or new
TIMESTAMP format) "ALTER TABLE table_name FORCE, algorithm=COPY" will be
used.

The differences between the algorithms are
1) Will use the fastest algorithm the engine supports to do a full repair
   of the table (except if data conversions are is needed).
2) Will use the storage engine internal REPAIR facility (MyISAM, Aria).
   If the engine does not support REPAIR then
   "ALTER TABLE FORCE, ALGORITHM=COPY" will be used.
   If there was data incompatibilities (which means that FORCE was used)
   then there will be a warning after REPAIR that ALTER TABLE FORCE is
   still needed.
   The reason for this is that REPAIR may be able to go around data
   errors (wrong incompatible data, crashed or unreadable sectors) that
   ALTER TABLE cannot do.
3) Will use the storage engine internal OPTIMIZE. If engine does not
   support optimize, then "ALTER TABLE FORCE" is used.

The above will ensure that ALTER TABLE FORCE is able to
correct almost any errors in the row or index data.  In case of
corrupted blocks then REPAIR possible followed by ALTER TABLE is needed.
This is important as mariadb-upgrade executes ALTER TABLE table_name
FORCE for any table that must be re-created.

Bugs fixed with InnoDB tables when using ALTER TABLE FORCE:
- No error for INNODB_DEFAULT_ROW_FORMAT=COMPACT even if row length
  would be too wide. (Independent of innodb_strict_mode).
- Tables using symlinks will be symlinked after any of the above commands
  (Independent of the setting of --symbolic-links)

If one specifies an algorithm together with ALTER TABLE FORCE, things
will work as before (except if data conversion is required as then
the COPY algorithm is enforced).

ALTER TABLE .. OPTIMIZE ALL PARTITIONS will work as before.

Other things:
- FORCE argument added to REPAIR to allow one to first run internal
  repair to fix damaged blocks and then follow it with ALTER TABLE.
- REPAIR will not update frm_version if ha_check_for_upgrade() finds
  that table is still incompatible with current version. In this case the
  REPAIR will end with an error.
- REPAIR for storage engines that does not have native repair, like InnoDB,
  is now using ALTER TABLE FORCE.
- REPAIR csv-table USE_FRM now works.
  - It did not work before as CSV tables had extension list in wrong
    order.
- Default error messages length for %M increased from 128 to 256 to not
  cut information from REPAIR.
- Documented HA_ADMIN_XX variables related to repair.
- Added HA_ADMIN_NEEDS_DATA_CONVERSION to signal that we have to
  do data conversions when converting the table (and thus ALTER TABLE
  copy algorithm is needed).
- Fixed typo in error message (caused test changes).
2024-05-27 12:39:03 +02:00
Alexander Barkov
310fd6ff69 Backporting bugs fixes fixed by MDEV-31340 from 11.5
The patch for MDEV-31340 fixed the following bugs:

MDEV-33084 LASTVAL(t1) and LASTVAL(T1) do not work well with lower-case-table-names=0
MDEV-33085 Tables T1 and t1 do not work well with ENGINE=CSV and lower-case-table-names=0
MDEV-33086 SHOW OPEN TABLES IN DB1 -- is case insensitive with lower-case-table-names=0
MDEV-33088 Cannot create triggers in the database `MYSQL`
MDEV-33103 LOCK TABLE t1 AS t2 -- alias is not case sensitive with lower-case-table-names=0
MDEV-33108 TABLE_STATISTICS and INDEX_STATISTICS are case insensitive with lower-case-table-names=0
MDEV-33109 DROP DATABASE MYSQL -- does not drop SP with lower-case-table-names=0
MDEV-33110 HANDLER commands are case insensitive with lower-case-table-names=0
MDEV-33119 User is case insensitive in INFORMATION_SCHEMA.VIEWS
MDEV-33120 System log table names are case insensitive with lower-cast-table-names=0

Backporting the fixes from 11.5 to 10.5
2024-05-21 14:58:01 +04:00
Alexander Barkov
fd247cc21f MDEV-31340 Remove MY_COLLATION_HANDLER::strcasecmp()
This patch also fixes:
  MDEV-33050 Build-in schemas like oracle_schema are accent insensitive
  MDEV-33084 LASTVAL(t1) and LASTVAL(T1) do not work well with lower-case-table-names=0
  MDEV-33085 Tables T1 and t1 do not work well with ENGINE=CSV and lower-case-table-names=0
  MDEV-33086 SHOW OPEN TABLES IN DB1 -- is case insensitive with lower-case-table-names=0
  MDEV-33088 Cannot create triggers in the database `MYSQL`
  MDEV-33103 LOCK TABLE t1 AS t2 -- alias is not case sensitive with lower-case-table-names=0
  MDEV-33109 DROP DATABASE MYSQL -- does not drop SP with lower-case-table-names=0
  MDEV-33110 HANDLER commands are case insensitive with lower-case-table-names=0
  MDEV-33119 User is case insensitive in INFORMATION_SCHEMA.VIEWS
  MDEV-33120 System log table names are case insensitive with lower-cast-table-names=0

- Removing the virtual function strnncoll() from MY_COLLATION_HANDLER

- Adding a wrapper function CHARSET_INFO::streq(), to compare
  two strings for equality. For now it calls strnncoll() internally.
  In the future it will turn into a virtual function.

- Adding new accent sensitive case insensitive collations:
    - utf8mb4_general1400_as_ci
    - utf8mb3_general1400_as_ci
  They implement accent sensitive case insensitive comparison.
  The weight of a character is equal to the code point of its
  upper case variant. These collations use Unicode-14.0.0 casefolding data.

  The result of
     my_charset_utf8mb3_general1400_as_ci.strcoll()
  is very close to the former
     my_charset_utf8mb3_general_ci.strcasecmp()

  There is only a difference in a couple dozen rare characters, because:
    - the switch from "tolower" to "toupper" comparison, to make
      utf8mb3_general1400_as_ci closer to utf8mb3_general_ci
    - the switch from Unicode-3.0.0 to Unicode-14.0.0
  This difference should be tolarable. See the list of affected
  characters in the MDEV description.

  Note, utf8mb4_general1400_as_ci correctly handles non-BMP characters!
  Unlike utf8mb4_general_ci, it does not treat all BMP characters
  as equal.

- Adding classes representing names of the file based database objects:

    Lex_ident_db
    Lex_ident_table
    Lex_ident_trigger

  Their comparison collation depends on the underlying
  file system case sensitivity and on --lower-case-table-names
  and can be either my_charset_bin or my_charset_utf8mb3_general1400_as_ci.

- Adding classes representing names of other database objects,
  whose names have case insensitive comparison style,
  using my_charset_utf8mb3_general1400_as_ci:

  Lex_ident_column
  Lex_ident_sys_var
  Lex_ident_user_var
  Lex_ident_sp_var
  Lex_ident_ps
  Lex_ident_i_s_table
  Lex_ident_window
  Lex_ident_func
  Lex_ident_partition
  Lex_ident_with_element
  Lex_ident_rpl_filter
  Lex_ident_master_info
  Lex_ident_host
  Lex_ident_locale
  Lex_ident_plugin
  Lex_ident_engine
  Lex_ident_server
  Lex_ident_savepoint
  Lex_ident_charset
  engine_option_value::Name

- All the mentioned Lex_ident_xxx classes implement a method streq():

  if (ident1.streq(ident2))
     do_equal();

  This method works as a wrapper for CHARSET_INFO::streq().

- Changing a lot of "LEX_CSTRING name" to "Lex_ident_xxx name"
  in class members and in function/method parameters.

- Replacing all calls like
    system_charset_info->coll->strcasecmp(ident1, ident2)
  to
    ident1.streq(ident2)

- Taking advantage of the c++11 user defined literal operator
  for LEX_CSTRING (see m_strings.h) and Lex_ident_xxx (see lex_ident.h)
  data types. Use example:

  const Lex_ident_column primary_key_name= "PRIMARY"_Lex_ident_column;

  is now a shorter version of:

  const Lex_ident_column primary_key_name=
    Lex_ident_column({STRING_WITH_LEN("PRIMARY")});
2024-04-18 15:22:10 +04:00
Monty
d9d0e78039 Add limits for how many IO operations a table access will do
This solves the current problem in the optimizer
- SELECT FROM big_table
  - SELECT from small_table where small_table.eq_ref_key=big_table.id

The old code assumed that each eq_ref access will cause an IO.
As the cost of IO is high, this dominated the cost for the later table
which caused the optimizer to prefer table scans + join cache over
index reads.

This patch fixes this issue by limit the number of expected IO calls,
for rows and index separately, to the size of the table or index or
the number of accesses that we except in a range for the index.

The major changes are:

- Adding a new structure ALL_READ_COST that is mainly used in
  best_access_path() to hold the costs parts of the cost we are
  calculating. This allows us to limit the number of IO when multiplying
  the cost with the previous row combinations.
- All storage engine cost functions are changed to return IO_AND_CPU_COST.
  The virtual cost functions should now return in IO_AND_CPU_COST.io
  the number of disk blocks that will be accessed instead of the cost
  of the access.
- We are not limiting the io_blocks for table or index scans as we
  assume that engines may not store these in the 'hot' part of the
  cache. Table and index scan also uses much less IO blocks than
  key accesses, so the original issue is not as critical with scans.

Other things:
  OPT_RANGE now holds a 'Cost_estimate cost' instead a lot of different
  costs. All the old costs, like index_only_read, can be extracted
  from 'cost'.
- Added to the start of some functions 'handler *file= table->file'
  to shorten the code that is using the handler.
- handler->cost() is used to change a ALL_READ_COST or IO_AND_CPU_COST
  to 'cost in milliseconds'
- New functions:  handler::index_blocks() and handler::row_blocks()
  which are used to limit the IO.
- Added index_cost and row_cost to Cost_estimate and removed all not
  needed members.
- Removed cost coefficients from Cost_estimate as these don't make sense
  when costs (except IO_BLOCKS) are in milliseconds.
- Removed handler::avg_io_cost() and replaced it with DISK_READ_COST.
- Renamed best_range_rowid_filter_for_partial_join() to
  best_range_rowid_filter() as using the old name made rows too long.
- Changed all SJ_MATERIALIZATION_INFO 'Cost_estimate' variables to
  'double' as Cost_estimate power was not used for these and thus
  just caused storage and performance overhead.
- Changed cost_for_index_read() to use 'worst_seeks' to only limit
  IO, not number of table accesses. With this patch worst_seeks is
  probably not needed anymore, but I kept it around just in case.
- Applying cost for filter got to be much shorter and easier thanks
  to the API changes.
- Adjusted cost for fulltext keys in collaboration with Sergei Golubchik.
- Most test changes caused by this patch is that table scans are changed
  to use indexes.
- Added ha_seq::keyread_time() and ha_seq::key_scan_time() to get
  make checking number of potential IO blocks easier during debugging.
2023-02-02 23:57:30 +03:00
Monty
b66cdbd1ea Changing all cost calculation to be given in milliseconds
This makes it easier to compare different costs and also allows
the optimizer to optimizer different storage engines more reliably.

- Added tests/check_costs.pl, a tool to verify optimizer cost calculations.
  - Most engine costs has been found with this program. All steps to
    calculate the new costs are documented in Docs/optimizer_costs.txt

- User optimizer_cost variables are given in microseconds (as individual
  costs can be very small). Internally they are stored in ms.
- Changed DISK_READ_COST (was DISK_SEEK_BASE_COST) from a hard disk cost
  (9 ms) to common SSD cost (400MB/sec).
- Removed cost calculations for hard disks (rotation etc).
- Changed the following handler functions to return IO_AND_CPU_COST.
  This makes it easy to apply different cost modifiers in ha_..time()
  functions for io and cpu costs.
  - scan_time()
  - rnd_pos_time() & rnd_pos_call_time()
  - keyread_time()
- Enhanched keyread_time() to calculate the full cost of reading of a set
  of keys with a given number of ranges and optional number of blocks that
  need to be accessed.
- Removed read_time() as keyread_time() + rnd_pos_time() can do the same
  thing and more.
- Tuned cost for: heap, myisam, Aria, InnoDB, archive and MyRocks.
  Used heap table costs for json_table. The rest are using default engine
  costs.
- Added the following new optimizer variables:
  - optimizer_disk_read_ratio
  - optimizer_disk_read_cost
  - optimizer_key_lookup_cost
  - optimizer_row_lookup_cost
  - optimizer_row_next_find_cost
  - optimizer_scan_cost
- Moved all engine specific cost to OPTIMIZER_COSTS structure.
- Changed costs to use 'records_out' instead of 'records_read' when
  recalculating costs.
- Split optimizer_costs.h to optimizer_costs.h and optimizer_defaults.h.
  This allows one to change costs without having to compile a lot of
  files.
- Updated costs for filter lookup.
- Use a better cost estimate in best_extension_by_limited_search()
  for the sorting cost.
- Fixed previous issues with 'filtered' explain column as we are now
  using 'records_out' (min rows seen for table) to calculate filtering.
  This greatly simplifies the filtering code in
  JOIN_TAB::save_explain_data().

This change caused a lot of queries to be optimized differently than
before, which exposed different issues in the optimizer that needs to
be fixed.  These fixes are in the following commits.  To not have to
change the same test case over and over again, the changes in the test
cases are done in a single commit after all the critical change sets
are done.

InnoDB changes:
- Updated InnoDB to not divide big range cost with 2.
- Added cost for InnoDB (innobase_update_optimizer_costs()).
- Don't mark clustered primary key with HA_KEYREAD_ONLY. This will
  prevent that the optimizer is trying to use index-only scans on
  the clustered key.
- Disabled ha_innobase::scan_time() and ha_innobase::read_time() and
  ha_innobase::rnd_pos_time() as the default engine cost functions now
  works good for InnoDB.

Other things:
- Added  --show-query-costs (\Q) option to mysql.cc to show the query
  cost after each query (good when working with query costs).
- Extended my_getopt with GET_ADJUSTED_VALUE which allows one to adjust
  the value that user is given. This is used to change cost from
  microseconds (user input) to milliseconds (what the server is
  internally using).
- Added include/my_tracker.h  ; Useful include file to quickly test
  costs of a function.
- Use handler::set_table() in all places instead of 'table= arg'.
- Added SHOW_OPTIMIZER_COSTS to sys variables. These are input and
  shown in microseconds for the user but stored as milliseconds.
  This is to make the numbers easier to read for the user (less
  pre-zeros).  Implemented in 'Sys_var_optimizer_cost' class.
- In test_quick_select() do not use index scans if 'no_keyread' is set
  for the table. This is what we do in other places of the server.
- Added THD parameter to Unique::get_use_cost() and
  check_index_intersect_extension() and similar functions to be able
  to provide costs to called functions.
- Changed 'records' to 'rows' in optimizer_trace.
- Write more information to optimizer_trace.
- Added INDEX_BLOCK_FILL_FACTOR_MUL (4) and INDEX_BLOCK_FILL_FACTOR_DIV (3)
  to calculate usage space of keys in b-trees. (Before we used numeric
  constants).
- Removed code that assumed that b-trees has similar costs as binary
  trees. Replaced with engine calls that returns the cost.
- Added Bitmap::find_first_bit()
- Added timings to join_cache for ANALYZE table (patch by Sergei Petrunia).
- Added records_init and records_after_filter to POSITION to remember
  more of what best_access_patch() calculates.
- table_after_join_selectivity() changed to recalculate 'records_out'
  based on the new fields from best_access_patch()

Bug fixes:
- Some queries did not update last_query_cost (was 0). Fixed by moving
  setting thd->...last_query_cost in JOIN::optimize().
- Write '0' as number of rows for const tables with a matching row.

Some internals:
- Engine cost are stored in OPTIMIZER_COSTS structure.  When a
  handlerton is created, we also created a new cost variable for the
  handlerton. We also create a new variable if the user changes a
  optimizer cost for a not yet loaded handlerton either with command
  line arguments or with SET
  @@global.engine.optimizer_cost_variable=xx.
- There are 3 global OPTIMIZER_COSTS variables:
  default_optimizer_costs   The default costs + changes from the
                            command line without an engine specifier.
  heap_optimizer_costs      Heap table costs, used for temporary tables
  tmp_table_optimizer_costs The cost for the default on disk internal
                            temporary table (MyISAM or Aria)
- The engine cost for a table is stored in table_share. To speed up
  accesses the handler has a pointer to this. The cost is copied
  to the table on first access. If one wants to change the cost one
  must first update the global engine cost and then do a FLUSH TABLES.
  This was done to be able to access the costs for an open table
  without any locks.
- When a handlerton is created, the cost are updated the following way:
  See sql/keycaches.cc for details:
  - Use 'default_optimizer_costs' as a base
  - Call hton->update_optimizer_costs() to override with the engines
    default costs.
  - Override the costs that the user has specified for the engine.
  - One handler open, copy the engine cost from handlerton to TABLE_SHARE.
  - Call handler::update_optimizer_costs() to allow the engine to update
    cost for this particular table.
  - There are two costs stored in THD. These are copied to the handler
    when the table is used in a query:
    - optimizer_where_cost
    - optimizer_scan_setup_cost
- Simply code in best_access_path() by storing all cost result in a
  structure. (Idea/Suggestion by Igor)
2023-02-02 23:54:45 +03:00
Sergei Golubchik
a398fcbff6 MDEV-26635 ROW_NUMBER is not 0 for errors not caused because of rows 2021-10-26 17:29:40 +02:00
Sergei Golubchik
25d9d2e37f Merge branch 'bb-10.4-release' into bb-10.5-release 2021-02-15 16:43:15 +01:00
Sergei Golubchik
00a313ecf3 Merge branch 'bb-10.3-release' into bb-10.4-release
Note, the fix for "MDEV-23328 Server hang due to Galera lock conflict resolution"
was null-merged. 10.4 version of the fix is coming up separately
2021-02-12 17:44:22 +01:00
Nikita Malyavin
21809f9a45 MDEV-17556 Assertion `bitmap_is_set_all(&table->s->all_set)' failed
The assertion failed in handler::ha_reset upon SELECT under
READ UNCOMMITTED from table with index on virtual column.

This was the debug-only failure, though the problem is mush wider:
* MY_BITMAP is a structure containing my_bitmap_map, the latter is a raw
 bitmap.
* read_set, write_set and vcol_set of TABLE are the pointers to MY_BITMAP
* The rest of MY_BITMAPs are stored in TABLE and TABLE_SHARE
* The pointers to the stored MY_BITMAPs, like orig_read_set etc, and
 sometimes all_set and tmp_set, are assigned to the pointers.
* Sometimes tmp_use_all_columns is used to substitute the raw bitmap
 directly with all_set.bitmap
* Sometimes even bitmaps are directly modified, like in
TABLE::update_virtual_field(): bitmap_clear_all(&tmp_set) is called.

The last three bullets in the list, when used together (which is mostly
always) make the program flow cumbersome and impossible to follow,
notwithstanding the errors they cause, like this MDEV-17556, where tmp_set
pointer was assigned to read_set, write_set and vcol_set, then its bitmap
was substituted with all_set.bitmap by dbug_tmp_use_all_columns() call,
and then bitmap_clear_all(&tmp_set) was applied to all this.

To untangle this knot, the rule should be applied:
* Never substitute bitmaps! This patch is about this.
 orig_*, all_set bitmaps are never substituted already.

This patch changes the following function prototypes:
* tmp_use_all_columns, dbug_tmp_use_all_columns
 to accept MY_BITMAP** and to return MY_BITMAP * instead of my_bitmap_map*
* tmp_restore_column_map, dbug_tmp_restore_column_maps to accept
 MY_BITMAP* instead of my_bitmap_map*

These functions now will substitute read_set/write_set/vcol_set directly,
and won't touch underlying bitmaps.
2021-01-27 00:50:55 +10:00
Michael Widenius
58e759a939 Added 'final' to some classes to improve generated code
Final added to:
- All reasonable classes inhereted from Field
- All classes inhereted from Protocol
- Almost all Handler classes
- Some important Item classes

The stripped size of mariadbd is just 4K smaller, but several object files
showed notable improvements in common execution paths.
- Checked field.o and item_sum.o

Other things:
- Added 'override' to a few class functions touched by this patch.
- Removed 'virtual' from a new class functions that had/got 'override'
- Changed Protocol_discard to inherit from Protocol instad of Protocol_text
2020-08-04 17:27:32 +02:00
Monty
10b88deb74 Changes needed for ColumnStore and insert cache
MCOL-3875 Columnstore write cache

The main change is to change thr_lock function get_status to
return a value that indicates we have to abort the lock.

Other thing:
- Made start_bulk_insert() and end_bulk_insert() protected so that the
  insert cache can use these
2020-06-14 19:39:42 +03:00
Sergei Golubchik
c1c5222cae cleanup: PSI key is *always* the first argument 2020-03-10 19:24:23 +01:00
Sergei Golubchik
7c58e97bf6 perfschema memory related instrumentation changes 2020-03-10 19:24:22 +01:00
Alexander Barkov
530f3f7cfc MDEV-20806 Federated does not work with INET6, returns NULL with warning ER_TRUNCATED_WRONG_VALUE 2019-10-12 07:25:53 +04:00
Alexander Barkov
a0d3a351b8 MDEV-20790 CSV table with INET6 can be created and inserted into, but cannot be read from 2019-10-11 22:51:01 +04:00
Alexander Barkov
afe6eb499d Revert "MDEV-20342 Turn Field::flags from a member to a method"
This reverts commit e86010f909.

Reverting on Monty's request, as this change makes merging
things from 10.5 to 10.2 much harder.
2019-08-14 20:27:00 +04:00
Alexander Barkov
e86010f909 MDEV-20342 Turn Field::flags from a member to a method 2019-08-14 13:33:01 +04:00
Marko Mäkelä
624dd71b94 Merge 10.4 into 10.5 2019-08-13 18:57:00 +03:00
Eugene Kosov
c9aa495fb6 MDEV-19955 make argument of handler::ha_write_row() const
MDEV-19486 and one more similar bug appeared because handler::write_row() interface
welcomes to modify buffer by storage engine. But callers are not ready for that
thus bugs are possible in future.

handler::write_row():
handler::ha_write_row(): make argument const
2019-07-05 13:14:19 +03:00
Robert Bindar
bf70430ead MDEV-17709 Remove handlerton::state 2019-06-06 22:09:31 +04:00
Oleksandr Byelkin
c07325f932 Merge branch '10.3' into 10.4 2019-05-19 20:55:37 +02:00
Marko Mäkelä
be85d3e61b Merge 10.2 into 10.3 2019-05-14 17:18:46 +03:00
Vicențiu Ciorbaru
cb248f8806 Merge branch '5.5' into 10.1 2019-05-11 22:19:05 +03:00
Vicențiu Ciorbaru
5543b75550 Update FSF Address
* Update wrong zip-code
2019-05-11 21:29:06 +03:00
Monty
163b34fe25 Optimize flush tables with read lock (FTWRL) to not wait for select's
Part of MDEV-5336 Implement LOCK FOR BACKUP

The idea is that instead of waiting in close_cached_tables() for all
tables to be closed, we instead call flush_tables() that does:
- Flush not used objects in table cache to free memory
- Collect all tables that are open
- Call HA_EXTRA_FLUSH on the objects, to get them into "closed state"
- Added HA_EXTRA_FLUSH support to archive and CSV
- Added multi-user protection to HA_EXTRA_FLUSH in MyISAM and Aria

The benefit compared to old code is:
- FTWRL doesn't have to wait for long running read operations or
  open HANDLER's
2018-12-09 22:12:25 +02:00
Sergei Golubchik
f0f0d07250 MDEV-14500 filesort to support engines with slow rnd_pos
If the engine wants to avoid rnd_pos() - force a temporary table
before a filesort. But don't do it if addon_fields are used.
2018-11-20 15:06:03 +01:00
Alexander Barkov
e61568ee93 Merge remote-tracking branch 'origin/10.3' into 10.4 2018-07-03 14:02:05 +04:00
Sergei Golubchik
36e59752e7 Merge branch '10.2' into 10.3 2018-06-30 16:39:20 +02:00
Vicențiu Ciorbaru
aa59ecec89 Merge branch '10.0' into 10.1 2018-06-12 18:55:27 +03:00
Vicențiu Ciorbaru
170bec36c0 Merge branch '5.5' into 10.0 2018-06-12 17:59:31 +03:00
Sergei Golubchik
e7ca377cb7 MDEV-16342 SHOW ENGINES: MyISAM description is useless
rewrite tautological engine descriptions
2018-06-11 09:57:54 +02:00