fil_node_t::find_metadata() tries to find out whether file
is on an SSD, and the disk sector size.
On Windows, it opens the corresponding volume for finding this data.
This does not go well, if datadir is on network path/UNC. The volume name
is invalid, CreateFile() function fails, and a cryptic (from the end user
perspective) error is reported. Like this
[ERROR] InnoDB: File \\.\\\workpc\work: 'CreateFile()' returned OS error 203.
The fix is not to report error if open volume failed, and the path was not
on fixed disk, i.e not on HDD or SSD. This is not a fatal error, there is
a fallback anyway.
- wait notification, tpool_wait_begin/tpool_wait_end - to notify the
threadpool that current thread is going to wait
Use it to wait for IOs to complete and also when purge waits for workers.
Before commit 90c52e5291 introduced
aligned_malloc(), InnoDB always used a pattern of over-allocating
memory and invoking ut_align() to guarantee the desired alignment.
It is cleaner to invoke aligned_malloc() and aligned_free() directly.
ut_align(): Remove. In assertions, ut_align_down() can be used instead.
The function fsp_header_get_space_id() returns ulint instead of
uint32_t, only to be able to complain that the two adjacent
tablespace ID fields in the page differ. Remove the function,
and merge the check to the callers.
Also, make some more use of aligned_malloc().
Ensure that pointer to a buffer which is used for both reads and writes
is properly aligned.
I've intentially put permanent check right before io_submit() call instead
of the time of setting iocb date. I think that'll make check better.
LinuxAIOHandler::resever_slot(): check arguments alignment in debug builds
Replace all io_context* occurrences with io_context_t
Even in release mode die immediately when some io_* functions return
EINVAL. This always means some programming bug and it's better to fail fast.
LinuxAIOHandler::resubmit(): fix condition. Stop ignoring -1 return code which
corresponds to EPERM and io_submit() really can return this one.
Use io_destroy() to stop leaking io_context_t.
Make m_aio_ctx std::vector instead of C array. I think that internal check
for index overflow might be useful.
Add debug assertions for EFAULT because for me receiving it
looks like a programming bug.
Almost all threads have gone
- the "ticking" threads, that sleep a while then do some work)
(srv_monitor_thread, srv_error_monitor_thread, srv_master_thread)
were replaced with timers. Some timers are periodic,
e.g the "master" timer.
- The btr_defragment_thread is also replaced by a timer , which
reschedules it self when current defragment "item" needs throttling
- the buf_resize_thread and buf_dump_threads are substitutes with tasks
Ditto with page cleaner workers.
- purge workers threads are not tasks as well, and purge cleaner
coordinator is a combination of a task and timer.
- All AIO is outsourced to tpool, Innodb just calls thread_pool::submit_io()
and provides the callback.
- The srv_slot_t was removed, and innodb_debug_sync used in purge
is currently not working, and needs reimplementation.
Ignore GetDiskFreeSpace() errors in os_file_get_status_win32
The call is only used to calculate filesystem block size, and this in
turn is only shown in information_schema.sys_tablespaces.FS_BLOCK_SIZE.
There is no other use of this field, it does not affect any Innodb
functionality
Replace ut_usectime() with my_interval_timer(),
which is equivalent, but monotonically counting nanoseconds
instead of counting the microseconds of real time.
os_event_wait_time_low(): Use my_hrtime() instead of ut_usectime().
FIXME: Set a clock attribute on the condition variable that allows
a monotonic clock to be chosen as the time base, so that the wait
is immune to adjustments of the system clock.
According to the code, it was Windows specific "simulated AIO"
workaround. The simulated s not supported on Windows anymore.
Thus, remove the dead code
Many InnoDB internal variables and counters were only exposed
in an unstructured fashion via SHOW ENGINE INNODB STATUS.
Expose more variables via SHOW STATUS. Many of these were
exported in XtraDB.
Also, introduce SHOW_SIZE_T and use the proper size for
exporting the InnoDB variables.
Remove some unnecessary indirection via export_vars, and
bind some variables directly.
dict_sys_t::rough_size(): Replaces dict_sys_get_size()
and includes the hash table sizes.
This is based on a contribution by Tony Liu from ServiceNow.
Some I/O functions and macros that are declared in os0file.h used to
return a Boolean status code (nonzero on success). In MySQL 5.7, they
were changed to return dberr_t instead. Alas, in MariaDB Server 10.2,
some uses of functions were not adjusted to the changed return value.
Until MDEV-19231, the valid values of dberr_t were always nonzero.
This means that some code that was incorrectly checking for a zero
return value from the functions would never detect a failure.
After MDEV-19231, some tests for ALTER ONLINE TABLE would fail with
cmake -DPLUGIN_PERFSCHEMA=NO. It turned out that the wrappers
pfs_os_file_read_no_error_handling_int_fd_func() and
pfs_os_file_write_int_fd_func() were wrongly returning
bool instead of dberr_t. Also the callers of these functions were
wrongly expecting bool (nonzero on success) instead of dberr_t.
This mistake had been made when the addition of these functions was
merged from MySQL 5.6.36 and 5.7.18 into MariaDB Server 10.2.7.
This fix also reverts commit 40becbc3c7
which attempted to work around the problem.
InnoDB duplicates file descriptor returned by create_temp_file() to
workaround further inconsistent use of this descriptor.
Use mysys file descriptors consistently for innobase_mysql_tmpfile(NULL).
Mostly close it by appropriate mysys wrappers.
Windows does atomic writes, as long as they are aligned and multiple
of sector size. this is documented in MSDN.
Fix innodb.doublewrite test to always use doublewrite buffer,
(even if atomic writes are autodetected)
Problem:
io_getevents() - read asynchronous I/O events from the completion
queue. For each IO event, the res field in io_event tells whether IO
event is succeeded or not. To see if the IO actually succeeded we
always need to check event.res (negative=error,
positive=bytesread/written).
LinuxAIOHandler::collect() doesn't check event.res value for each event.
which leads to incorrect value in n_bytes for IO context (or IO Slot).
Fix:
Added a check for event.res negative value.
RB: 20871
Reviewed by : annamalai.gurusami@oracle.com