The cause of im_daemon_life_cycle.imtest random failures was the following
behaviour of some implementations of LINUX threads: let's suppose that a
process has several threads (in LINUX threads, there is a separate process for
each thread). When the main process gets killed, the parent receives SIGCHLD
before all threads (child processes) die. In other words, the parent receives
SIGCHLD, when its child is not completely dead.
In terms of IM, that means that IM-angel receives SIGCHLD when IM-main is not dead
and still holds some resources. After receiving SIGCHLD, IM-angel restarts
IM-main, but IM-main failed to initialize, because previous instance (copy) of
IM-main still holds server socket (TCP-port).
Another problem here was that IM-angel restarted IM-main only if it was killed
by signal. If it exited with error, IM-angel thought it's intended / graceful
shutdown and exited itself.
So, when the second instance of IM-main failed to initialize, IM-angel thought
it's intended shutdown and quit.
The fix is
1. to change IM-angel so that it restarts IM-main if it exited with error code;
2. to change IM-main so that it returns proper exit code in case of failure.
im_daemon_life_cycle fails randomly.
1. Move IM-angel functionality into a separate file, create Angel class.
2. Be more verbose;
3. Fix typo in FLUSH INSTANCES implementation;
4. Polishing.
Corrected spelling in copyright text
Makefile.am:
Don't update the files from BitKeeper
Many files:
Removed "MySQL Finland AB & TCX DataKonsult AB" from copyright header
Adjusted year(s) in copyright header
Many files:
Added GPL copyright text
Removed files:
Docs/Support/colspec-fix.pl
Docs/Support/docbook-fixup.pl
Docs/Support/docbook-prefix.pl
Docs/Support/docbook-split
Docs/Support/make-docbook
Docs/Support/make-makefile
Docs/Support/test-make-manual
Docs/Support/test-make-manual-de
Docs/Support/xwf
- BUG#22306: STOP INSTANCE can not be applied for instances in Crashed,
Failed and Abandoned;
- BUG#23476: DROP INSTANCE does not work
- BUG#23215: STOP INSTANCE takes too much time
BUG#22306:
The problem was that STOP INSTANCE checked that mysqld is up and running.
If it was not so, STOP INSTANCE reported an error. Now, STOP INSTANCE
reports an error if the instance has been started (mysqld can be down).
BUG#23476:
The problem was that DROP INSTANCE tried to stop inactive instance. The fix is
trivial.
BUG#23215:
The problem was that locks were not acquired properly, so the
instance-monitoring thread could not acquire the mutex, holded by the
query-processing thread.
The fix is to simplify locking scheme by moving instance-related information to
Instance-class out of Guardian-class. This allows to get rid of storing a
separate list of Instance-information in Guardian and keeping it synchronized
with the original list in Instance_map.
- change some return types from int to bool;
- add [ERROR] tag to log_error() output;
- add [INFO] tag to log_info() output;
- change log messages to be more consistent.
1) add support for joinable threads to Thread class;
2) move checking of thread model to Manager from mysqlmanager.cc,
because it is needed only for IM-main process.
Alik's patch for BUG#22306: STOP INSTANCE can not be applied for
instances in Crashed, Failed and Abandoned" to ease review process.
Evaluate global variable linuxthreads before starting threads to avoid
a race.
spawned threads with a reusable class Thread.
This is the second idea implemented in the Alik's patch for
BUG#22306: STOP INSTANCE can not be applied for instances in Crashed,
Failed and Abandoned.
Commiting separately to ease review process.
The problem was that IM stoped guarded instances on shutdown,
but didn't wait for them to stop.
The fix is to wait for guarded instances to stop before exitting
from the main thread.
The idea is that Instance-monitoring thread should add itself
to Thread_registry so that it will be taken into account on shutdown.
However, Thread_registry should not signal it on shutdown in order to
not interrupt wait()/waitpid().
- WL#3158: IM: Instance configuration extensions;
- WL#3159: IM: --bootstrap and --start-default-instance modes
The following new statements have been added:
- CREATE INSTANCE;
- DROP INSTANCE;
The behaviour of the following statements have been changed:
- SET;
- UNSET;
- FLUSH INSTANCES;
- SHOW INSTANCES;
- SHOW INSTANCE OPTIONS;
tests fail on FreeBSD.
The patch contains of the following:
- make Instance Manager, running in the daemon mode, dump
the pid of angel-process in the special file;
- default value of angel-pid-file-name is 'mysqlmanager.angel.pid';
- if ordinary (IM) pid-file-name is specified in the configuration,
angel-pid-file-name is updated according to the following
rule: extension of the basename of pid-file-name is replaced by
'.angel.pid.
For example:
- pid-file-name: /tmp/im.pid
=> angel-pid-file-name: /tmp/im.angel.pid
- pid-file-name: /tmp/im.txt
=> angel-pid-file-name: /tmp/im.angel.pid
- pid-file-name: /tmp/5.0/im
=> angel-pid-file-name: /tmp/5.0/im.angel.pid
- add support for configuration option to customize angel
pid file name;
- fix test suite to use angel pid to kill Instance Manager
by all means if something went wrong.
Background
----------
The problem is that on some OSes (FreeBSD for one) Instance
Manager does not get SIGTERM, so can not shutdown gracefully.
Test suite wasn't able to cope with it, so this leads to the
mess in test results.
The problem should be split into two:
- fix signal handling;
- fix test suite.
This patch fixes test suite so that it will be able to kill
uncooperative Instance Manager. In order to achieve this,
test suite needs to know PID of IM Angel process.
duration of the whole 'flush instances'. As a consequence,
it was possible to query instance map, while it is in the
inconsistent state. The patch was reworked after review.