This follows up commit
commit 94a520ddbe and
commit 7c5519c12d.
After these changes, the default test suites on a
cmake -DWITH_UBSAN=ON build no longer fail due to passing
null pointers as parameters that are declared to never be null,
but plenty of other runtime errors remain.
There are 2 issues here:
Issue #1: memory allocation.
An IO_CACHE that uses encryption uses a larger buffer (it needs space for the encrypted data,
decrypted data, IO_CACHE_CRYPT struct to describe encryption parameters etc).
Issue #2: IO_CACHE::seek_not_done
When IO_CACHE objects are cloned, they still share the file descriptor.
This means, operation on one IO_CACHE may change the file read position
which will confuse other IO_CACHEs using it.
The fix of these issues would be:
Allocate the buffer to also include the extra size needed for encryption.
Perform seek again after one IO_CACHE reads the file.
order with Galera and encrypt-tmp-files=1
Problem:- If trans_cache (IO_CACHE) uses encrypted tmp file
then on next DML server will crash.
Case:-
Lets take a case , we have a table t1 , We try to do 2 inserts in t1
1. A really long insert so that trans_cache has to use temp_file
2. Just a small insert
Analysis:- Actually server crashes from inside of galera
library.
/lib64/libc.so.6(abort+0x175)[0x7fb5ba779dc5]
/usr/lib64/galera/libgalera_smm.so(_ZN6galera3FSMINS_9TrxHandle5State...
mysys/stacktrace.c:247(my_print_stacktrace)[0x7fb5a714940e]
sql/signal_handler.cc:160(handle_fatal_signal)[0x7fb5a715c1bd]
sql/wsrep_hton.cc:257(wsrep_rollback)[0x7fb5bcce923a]
sql/wsrep_hton.cc:268(wsrep_rollback)[0x7fb5bcce9368]
sql/handler.cc:1658(ha_rollback_trans(THD*, bool))[0x7fb5bcd4f41a]
sql/handler.cc:1483(ha_commit_trans(THD*, bool))[0x7fb5bcd4f804]
but actual issue is not in galera but in mariadb, because for 2nd
insert we should never call rollback. We are calling rollback because
log_and_order fails it fails because write_cache fails , It fails
because after reinit_io_cache(trans_cache) , my_b_bytes_in_cache says 0
so we look into tmp_file for data , which is obviously wrong since temp
was used for previous insert and it no longer exist.
wsrep_write_cache_inc() reads the IO_CACHE in a loop, filling it with
my_b_fill() until it returns "0 bytes read". Later
MYSQL_BIN_LOG::write_cache() does the same. wsrep_write_cache_inc()
assumes that reading a zero bytes past EOF leaves the old data in the
cache
Solution:- There is two issue in my_b_encr_read
1st we should never equal read_end to info->buffer. I mean this
does not make sense read_end should always point to end of buffer.
2nd For most of the case(apart from async IO_CACHE) info->pos_in_file
should be equal to info->buffer position wrt to temp file , since
in this case we are not changing info->buffer it should remain
unchanged.
Handle string length as size_t, consistently (almost always:))
Change function prototypes to accept size_t, where in the past
ulong or uint were used. change local/member variables to size_t
when appropriate.
This fix excludes rocksdb, spider,spider, sphinx and connect for now.
--encrypt-binlog and --encrypt-tmp-files used to mean
"encrypt XXX if encryption is available, otherwise don't encrypt",
now they mean "encrypt or fail with an error".
Instead of encrypt(src, dst, key, iv) that encrypts all
data in one go, now we have encrypt_init(key,iv),
encrypt_update(src,dst), and encrypt_finish(dst).
This also causes collateral changes in the internal my_crypt.cc
encryption functions and in the encryption service.
There are wrappers to provide the old all-at-once encryption
functionality. But binlog events are often written piecewise,
they'll need the new api.