mirror of
https://github.com/MariaDB/server.git
synced 2025-04-13 18:55:33 +02:00

For the adaptive hash index, dtuple_fold() and rec_fold() were employing a slow rolling hash algorithm, computing hash values ("fold") for one field and one byte at a time, while depending on calls to rec_get_offsets(). We already have optimized implementations of CRC-32C and have been successfully using that function in some other InnoDB tables, but not yet in the adaptive hash index. Any linear function such as any CRC will fail the avalanche test that any cryptographically secure hash function is expected to pass: any single-bit change in the input key should affect on average half the bits in the output. But we always were happy with less than cryptographically secure: in fact, ut_fold_ulint_pair() or ut_fold_binary() are just about as linear as any CRC, using a combination of multiplication and addition, partly carry-less. It is worth noting that exclusive-or corresponds to carry-less subtraction or addition in a binary Galois field, or GF(2). We only need some way of reducing key prefixes into hash values. The CRC-32C should be better than a Rabin–Karp rolling hash algorithm. Compared to the old hash algorithm, it has the drawback that there will be only 32 bits of entropy before we choose the hash table cell by a modulus operation. The size of each adaptive hash index array is (innodb_buffer_pool_size / 512) / innodb_adaptive_hash_index_parts. With the maximum number of partitions (512), we would not exceed 1<<32 elements per array until the buffer pool size exceeds 1<<50 bytes (1 PiB). We would hit other limits before that: the virtual address space on many contemporary 64-bit processor implementations is only 48 bits (256 TiB). So, we can simply go for the SIMD accelerated CRC-32C. rec_fold(): Take a combined parameter n_bytes_fields. Determine the length of each field on the fly, and compute CRC-32C over a single contiguous range of bytes, from the start of the record payload area to the end of the last full or partial field. For secondary index records in ROW_FORMAT=REDUNDANT, also the data area that is reserved for NULL values (to facilitate in-place updates between NULL and NOT NULL values) will be included in the count. Luckily, InnoDB always zero-initialized such unused area; refer to data_write_sql_null() in rec_convert_dtuple_to_rec_old(). For other than ROW_FORMAT=REDUNDANT, no space is allocated for NULL values, and therefore the CRC-32C will only cover the actual payload of the key prefix. dtuple_fold(): For ROW_FORMAT=REDUNDANT, include the dummy NULL values in the CRC-32C, so that the values will be comparable with rec_fold(). innodb_ahi-t: A unit test for rec_fold() and dtuple_fold(). btr_search_build_page_hash_index(), btr_search_drop_page_hash_index(): Use a fixed-size stack buffer for computing the fold values, to avoid dynamic memory allocation. btr_search_drop_page_hash_index(): Do not release part.latch if we need to invoke multiple batches of rec_fold(). dtuple_t: Allocate fewer bits for the fields. The maximum number of data fields is about 1023, so uint16_t will be fine for them. The info_bits is stored in less than 1 byte. ut_pair_min(), ut_pair_cmp(): Remove. We can actually combine and compare int(n_fields << 16 | n_bytes). PAGE_CUR_LE_OR_EXTENDS, PAGE_CUR_DBG: Remove. These were never defined, because they would only work with latin1_swedish_ci if at all. btr_cur_t::check_mismatch(): Replaces !btr_search_check_guess(). cmp_dtuple_rec_bytes(): Replaces cmp_dtuple_rec_with_match_bytes(). Determine the offsets of fields on the fly. page_cur_try_search_shortcut_bytes(): This caller of cmp_dtuple_rec_bytes() will not be invoked on the change buffer tree. cmp_dtuple_rec_leaf(): Replaces cmp_dtuple_rec_with_match() for comparing leaf-page records. buf_block_t::ahi_left_bytes_fields: Consolidated Atomic_relaxed<uint32_t> of curr_left_side << 31 | curr_n_bytes << 16 | curr_n_fields. The other set of parameters (n_fields, n_bytes, left_side) was removed as redundant. btr_search_update_hash_node_on_insert(): Merged to btr_search_update_hash_on_insert(). btr_search_build_page_hash_index(): Take combined left_bytes_fields instead of n_fields, n_bytes, left_side. btr_search_update_block_hash_info(), btr_search_update_hash_ref(): Merged to btr_search_info_update_hash(). btr_cur_t::n_bytes_fields: Replaces n_bytes << 16 | n_fields. We also remove many redundant checks of btr_search.enabled. If we are holding any btr_sea::partition::latch, then a nonnull pointer in buf_block_t::index must imply that the adaptive hash index is enabled. Reviewed by: Vladislav Lesin
367 lines
11 KiB
C++
367 lines
11 KiB
C++
/*****************************************************************************
|
|
|
|
Copyright (c) 1994, 2016, Oracle and/or its affiliates. All Rights Reserved.
|
|
Copyright (c) 2019, 2021, MariaDB Corporation.
|
|
|
|
This program is free software; you can redistribute it and/or modify it under
|
|
the terms of the GNU General Public License as published by the Free Software
|
|
Foundation; version 2 of the License.
|
|
|
|
This program is distributed in the hope that it will be useful, but WITHOUT
|
|
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
|
|
FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
|
|
|
|
You should have received a copy of the GNU General Public License along with
|
|
this program; if not, write to the Free Software Foundation, Inc.,
|
|
51 Franklin Street, Fifth Floor, Boston, MA 02110-1335 USA
|
|
|
|
*****************************************************************************/
|
|
|
|
/******************************************************************//**
|
|
@file include/ut0ut.h
|
|
Various utilities
|
|
|
|
Created 1/20/1994 Heikki Tuuri
|
|
***********************************************************************/
|
|
|
|
#ifndef ut0ut_h
|
|
#define ut0ut_h
|
|
|
|
/* Do not include univ.i because univ.i includes this. */
|
|
|
|
#include <ostream>
|
|
#include <sstream>
|
|
#include <string.h>
|
|
|
|
#ifndef UNIV_INNOCHECKSUM
|
|
|
|
#include "db0err.h"
|
|
|
|
#include <time.h>
|
|
|
|
#ifndef MYSQL_SERVER
|
|
#include <ctype.h>
|
|
#endif /* MYSQL_SERVER */
|
|
|
|
#include <stdarg.h>
|
|
|
|
#include <string>
|
|
|
|
/** Index name prefix in fast index creation, as a string constant */
|
|
#define TEMP_INDEX_PREFIX_STR "\377"
|
|
|
|
#define ut_max std::max
|
|
#define ut_min std::min
|
|
|
|
/*************************************************************//**
|
|
Calculates fast the remainder of n/m when m is a power of two.
|
|
@param n in: numerator
|
|
@param m in: denominator, must be a power of two
|
|
@return the remainder of n/m */
|
|
template <typename T> inline T ut_2pow_remainder(T n, T m){return n & (m - 1);}
|
|
/*************************************************************//**
|
|
Calculates the biggest multiple of m that is not bigger than n
|
|
when m is a power of two. In other words, rounds n down to m * k.
|
|
@param n in: number to round down
|
|
@param m in: alignment, must be a power of two
|
|
@return n rounded down to the biggest possible integer multiple of m */
|
|
template <typename T> inline T ut_2pow_round(T n, T m) { return n & ~(m - 1); }
|
|
/********************************************************//**
|
|
Calculates the smallest multiple of m that is not smaller than n
|
|
when m is a power of two. In other words, rounds n up to m * k.
|
|
@param n in: number to round up
|
|
@param m in: alignment, must be a power of two
|
|
@return n rounded up to the smallest possible integer multiple of m */
|
|
#define UT_CALC_ALIGN(n, m) ((n + m - 1) & ~(m - 1))
|
|
template <typename T> inline T ut_calc_align(T n, T m)
|
|
{ return static_cast<T>(UT_CALC_ALIGN(n, m)); }
|
|
|
|
/**********************************************************//**
|
|
Returns the number of milliseconds since some epoch. The
|
|
value may wrap around. It should only be used for heuristic
|
|
purposes.
|
|
@return ms since epoch */
|
|
ulint
|
|
ut_time_ms(void);
|
|
/*============*/
|
|
#endif /* !UNIV_INNOCHECKSUM */
|
|
|
|
/** Determine how many bytes (groups of 8 bits) are needed to
|
|
store the given number of bits.
|
|
@param b in: bits
|
|
@return number of bytes (octets) needed to represent b */
|
|
#define UT_BITS_IN_BYTES(b) (((b) + 7) >> 3)
|
|
|
|
/** Determines if a number is zero or a power of two.
|
|
@param[in] n number
|
|
@return nonzero if n is zero or a power of two; zero otherwise */
|
|
#define ut_is_2pow(n) (!((n) & ((n) - 1)))
|
|
|
|
/** Functor that compares two C strings. Can be used as a comparator for
|
|
e.g. std::map that uses char* as keys. */
|
|
struct ut_strcmp_functor
|
|
{
|
|
bool operator()(
|
|
const char* a,
|
|
const char* b) const
|
|
{
|
|
return(strcmp(a, b) < 0);
|
|
}
|
|
};
|
|
|
|
/**********************************************************//**
|
|
Prints a timestamp to a file. */
|
|
void
|
|
ut_print_timestamp(
|
|
/*===============*/
|
|
FILE* file) /*!< in: file where to print */
|
|
ATTRIBUTE_COLD __attribute__((nonnull));
|
|
|
|
#ifndef UNIV_INNOCHECKSUM
|
|
|
|
/**********************************************************//**
|
|
Sprintfs a timestamp to a buffer, 13..14 chars plus terminating NUL. */
|
|
void
|
|
ut_sprintf_timestamp(
|
|
/*=================*/
|
|
char* buf); /*!< in: buffer where to sprintf */
|
|
|
|
/*************************************************************//**
|
|
Prints the contents of a memory buffer in hex and ascii. */
|
|
void
|
|
ut_print_buf(
|
|
/*=========*/
|
|
FILE* file, /*!< in: file where to print */
|
|
const void* buf, /*!< in: memory buffer */
|
|
ulint len); /*!< in: length of the buffer */
|
|
|
|
/*************************************************************//**
|
|
Prints the contents of a memory buffer in hex. */
|
|
void
|
|
ut_print_buf_hex(
|
|
/*=============*/
|
|
std::ostream& o, /*!< in/out: output stream */
|
|
const void* buf, /*!< in: memory buffer */
|
|
ulint len) /*!< in: length of the buffer */
|
|
MY_ATTRIBUTE((nonnull));
|
|
/*************************************************************//**
|
|
Prints the contents of a memory buffer in hex and ascii. */
|
|
void
|
|
ut_print_buf(
|
|
/*=========*/
|
|
std::ostream& o, /*!< in/out: output stream */
|
|
const void* buf, /*!< in: memory buffer */
|
|
ulint len) /*!< in: length of the buffer */
|
|
MY_ATTRIBUTE((nonnull));
|
|
|
|
/* Forward declaration of transaction handle */
|
|
struct trx_t;
|
|
|
|
/** Get a fixed-length string, quoted as an SQL identifier.
|
|
If the string contains a slash '/', the string will be
|
|
output as two identifiers separated by a period (.),
|
|
as in SQL database_name.identifier.
|
|
@param [in] trx transaction (NULL=no quotes).
|
|
@param [in] name table name.
|
|
@retval String quoted as an SQL identifier.
|
|
*/
|
|
std::string
|
|
ut_get_name(
|
|
const trx_t* trx,
|
|
const char* name);
|
|
|
|
/**********************************************************************//**
|
|
Outputs a fixed-length string, quoted as an SQL identifier.
|
|
If the string contains a slash '/', the string will be
|
|
output as two identifiers separated by a period (.),
|
|
as in SQL database_name.identifier. */
|
|
void
|
|
ut_print_name(
|
|
/*==========*/
|
|
FILE* ef, /*!< in: stream */
|
|
const trx_t* trx, /*!< in: transaction */
|
|
const char* name); /*!< in: table name to print */
|
|
|
|
/**********************************************************************//**
|
|
Catenate files. */
|
|
void
|
|
ut_copy_file(
|
|
/*=========*/
|
|
FILE* dest, /*!< in: output file */
|
|
FILE* src); /*!< in: input file to be appended to output */
|
|
|
|
/*************************************************************//**
|
|
Convert an error number to a human readable text message. The
|
|
returned string is static and should not be freed or modified.
|
|
@return string, describing the error */
|
|
const char*
|
|
ut_strerr(
|
|
/*======*/
|
|
dberr_t num); /*!< in: error number */
|
|
|
|
#endif /* !UNIV_INNOCHECKSUM */
|
|
|
|
namespace ib {
|
|
|
|
/** This is a wrapper class, used to print any unsigned integer type
|
|
in hexadecimal format. The main purpose of this data type is to
|
|
overload the global operator<<, so that we can print the given
|
|
wrapper value in hex. */
|
|
struct hex {
|
|
explicit hex(uintmax_t t): m_val(t) {}
|
|
const uintmax_t m_val;
|
|
};
|
|
|
|
/** This is an overload of the global operator<< for the user defined type
|
|
ib::hex. The unsigned value held in the ib::hex wrapper class will be printed
|
|
into the given output stream in hexadecimal format.
|
|
@param[in,out] lhs the output stream into which rhs is written.
|
|
@param[in] rhs the object to be written into lhs.
|
|
@retval reference to the output stream. */
|
|
inline
|
|
std::ostream&
|
|
operator<<(
|
|
std::ostream& lhs,
|
|
const hex& rhs)
|
|
{
|
|
std::ios_base::fmtflags ff = lhs.flags();
|
|
lhs << std::showbase << std::hex << rhs.m_val;
|
|
lhs.setf(ff);
|
|
return(lhs);
|
|
}
|
|
|
|
/** This is a wrapper class, used to print any number in IEC style */
|
|
struct bytes_iec {
|
|
explicit bytes_iec(unsigned long long t): m_val(t) {}
|
|
double get_double() const { return static_cast<double>(m_val); }
|
|
const unsigned long long m_val;
|
|
};
|
|
|
|
/** Like hex operator above, except for bytes_iec */
|
|
std::ostream &operator<<(std::ostream &lhs, const bytes_iec &rhs);
|
|
|
|
/** The class logger is the base class of all the error log related classes.
|
|
It contains a std::ostringstream object. The main purpose of this class is
|
|
to forward operator<< to the underlying std::ostringstream object. Do not
|
|
use this class directly, instead use one of the derived classes. */
|
|
class logger
|
|
{
|
|
protected:
|
|
/* This class must not be used directly */
|
|
ATTRIBUTE_COLD ATTRIBUTE_NOINLINE logger() = default;
|
|
public:
|
|
template<typename T> ATTRIBUTE_COLD ATTRIBUTE_NOINLINE
|
|
logger& operator<<(const T& rhs)
|
|
{
|
|
m_oss << rhs;
|
|
return *this;
|
|
}
|
|
|
|
/** Handle a fixed character string in the same way as a pointer to
|
|
an unknown-length character string, to reduce object code bloat. */
|
|
template<size_t N> logger& operator<<(const char (&rhs)[N])
|
|
{ return *this << static_cast<const char*>(rhs); }
|
|
|
|
/** Output an error code name */
|
|
ATTRIBUTE_COLD logger& operator<<(dberr_t err);
|
|
|
|
/** Append a string.
|
|
@param buf string buffer
|
|
@param size buffer size
|
|
@return the output stream */
|
|
ATTRIBUTE_COLD __attribute__((noinline))
|
|
std::ostream &write(const char *buf, std::streamsize size)
|
|
{
|
|
return m_oss.write(buf, size);
|
|
}
|
|
|
|
std::ostream &write(const byte *buf, std::streamsize size)
|
|
{ return write(reinterpret_cast<const char*>(buf), size); }
|
|
|
|
std::ostringstream m_oss;
|
|
};
|
|
|
|
/** The class info is used to emit informational log messages. It is to be
|
|
used similar to std::cout. But the log messages will be emitted only when
|
|
the dtor is called. The preferred usage of this class is to make use of
|
|
unnamed temporaries as follows:
|
|
|
|
info() << "The server started successfully.";
|
|
|
|
In the above usage, the temporary object will be destroyed at the end of the
|
|
statement and hence the log message will be emitted at the end of the
|
|
statement. If a named object is created, then the log message will be emitted
|
|
only when it goes out of scope or destroyed. */
|
|
class info : public logger {
|
|
public:
|
|
ATTRIBUTE_COLD
|
|
~info();
|
|
};
|
|
|
|
/** The class warn is used to emit warnings. Refer to the documentation of
|
|
class info for further details. */
|
|
class warn : public logger {
|
|
public:
|
|
ATTRIBUTE_COLD
|
|
~warn();
|
|
};
|
|
|
|
/** The class error is used to emit error messages. Refer to the
|
|
documentation of class info for further details. */
|
|
class error : public logger {
|
|
public:
|
|
ATTRIBUTE_COLD
|
|
~error();
|
|
/** Indicates that error::~error() was invoked. Can be used to
|
|
determine if error messages were logged during innodb code execution.
|
|
@return true if there were error messages, false otherwise. */
|
|
static bool was_logged() { return logged; }
|
|
|
|
private:
|
|
/** true if error::~error() was invoked, false otherwise */
|
|
static bool logged;
|
|
};
|
|
|
|
/** The class fatal is used to emit an error message and stop the server
|
|
by crashing it. Use this class when MySQL server needs to be stopped
|
|
immediately. Refer to the documentation of class info for usage details. */
|
|
class fatal : public logger {
|
|
public:
|
|
ATTRIBUTE_NORETURN
|
|
~fatal();
|
|
};
|
|
|
|
/** Emit an error message if the given predicate is true, otherwise emit a
|
|
warning message */
|
|
class error_or_warn : public logger {
|
|
public:
|
|
ATTRIBUTE_COLD
|
|
error_or_warn(bool pred)
|
|
: m_error(pred)
|
|
{}
|
|
|
|
ATTRIBUTE_COLD
|
|
~error_or_warn();
|
|
private:
|
|
const bool m_error;
|
|
};
|
|
|
|
/** Emit a fatal message if the given predicate is true, otherwise emit a
|
|
error message. */
|
|
class fatal_or_error : public logger {
|
|
public:
|
|
ATTRIBUTE_COLD
|
|
fatal_or_error(bool pred)
|
|
: m_fatal(pred)
|
|
{}
|
|
|
|
ATTRIBUTE_COLD
|
|
~fatal_or_error();
|
|
private:
|
|
const bool m_fatal;
|
|
};
|
|
|
|
} // namespace ib
|
|
|
|
#endif
|
|
|