Sometimes I would get "can't connect to database" and other mysql errors reported in the LMCE Launch Manager. Other times the Orbiter Gen would fail or the Orbiter launch would just fail.
I knew it was unlikely to be hardware, because if I rebooted and it made it all the way through to Media Director and Orbiter startup, the system would run 100% reliably for up to a week.
After a day or two of looking at the system, I found the problem: during startup, MySQL runs a fast check of all the database tables. One of the pluto databases was corrupted, and it was corrupted in a way that would actually crash the mysqld process when it tried to check it. The mysqld process would immediately get respawned, but anything that had already connected to the DB during bootup would be disconnected.
So I'd be getting messages about "can't connect to mysql using socket", but when I would try it myself from the command line it would work fine.
I found it by disabling the mysqlcheck in the mysql startup scripts and enabling error logging in the /etc/mysql/my.cnf file. I could then see which DB table it was crashing on.
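For anyone who wants to narrow it down by hand, here is a minimal sketch of running the same kind of fast check one database at a time. This is not what I did (I used the error log). The function name check_each_database is made up for illustration, and it assumes the mysql and mysqlcheck clients can log in without prompting for a password.

```shell
# Hypothetical helper: run "mysqlcheck --fast" against each database in turn,
# so that a crash or failure can be pinned to a single database.
# Assumption: the mysql and mysqlcheck clients can log in without prompting
# (for example via credentials in ~/.my.cnf).
check_each_database() {
  for db in $(mysql -N -B -e 'SHOW DATABASES'); do
    echo "checking $db"
    if ! mysqlcheck --fast "$db"; then
      echo "mysqlcheck failed on $db"
    fi
  done
}

# Usage (not run here): check_each_database
```

If mysqld dies partway through, the mysqlcheck call for that database should return an error, which at least tells you where to look in the error log.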
Then I just had to open the DB with the mysql command line client and optimize the table. This fixed the table enough to stop it crashing during the mysqlcheck step. But it is still corrupt in some way (the logs report incorrect timestamps), so I will have to do a full fix tonight and try to unload/reload the database. From memory the table was "File". I can't remember which pluto DB, but it is easy enough to find: cd /var/lib/mysql/ then do a find . -name "File.*" (the table's files on disk carry extensions such as .frm, so the pattern needs the wildcard).
Of course, the symptoms make perfect sense now: any processes that were connected to the DB during bootup, when mysql was in the middle of checking and crashed/respawned, would be disconnected and fail. As my DB tables for Myth grew, the check would take longer, until it got to the point where the crash was hitting just the right point in the bootup process to disconnect the processes started before the mysqlcheck.
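The two steps above can be sketched roughly like this. The helper names find_table_files and optimize_table are made up for illustration, and the database name has to be filled in by you since I can't remember which pluto DB it was. Note the pattern is "File.*" rather than plain "File", because the table's files on disk have extensions (File.frm and so on).

```shell
# Hypothetical helper: list the on-disk files belonging to a table name
# anywhere under a MySQL data directory (e.g. /var/lib/mysql).
find_table_files() {
  find "$1" -name "$2.*"
}

# Hypothetical helper: run OPTIMIZE TABLE on one table through the mysql client.
# $1 is the database name, $2 is the table name.
optimize_table() {
  mysql "$1" -e "OPTIMIZE TABLE \`$2\`"
}

# Usage (not run here; replace <database> with the directory name that
# find_table_files reports):
#   find_table_files /var/lib/mysql File
#   optimize_table <database> File
```

The directory that the files turn up in is the database name, so the first helper tells you what to pass to the second.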
Will append more "how to recover the DB table" once I've done it.
This was a real stinker of a problem. Intermittent ones always are!
One thing which is a pain is that the "InnoDB" storage engine, which is used for the pluto DBs and tables, does not have the ability to recover by doing a self repair, unlike the MyISAM tables which are used by MythTV. It'd be interesting to know why the InnoDB engine was used instead of the MyISAM engine (MySQL gives you a choice).
OK, I was wrong about InnoDB. InnoDB has a lot more functionality, such as row locking and ACID transactions, so it is very good to use this instead of MyISAM, though you don't get the self-repair facilities of MyISAM.
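If you want to see which engine a given table actually uses, a minimal sketch is to ask information_schema (available in MySQL 5.0 and later). The helper name table_engine is made up for illustration, and it assumes the mysql client can log in without prompting. As a side effect it also shows which database the table lives in.

```shell
# Hypothetical helper: print database, table name and storage engine for every
# table with the given name, by querying information_schema.TABLES.
# Assumption: the mysql client can log in without prompting.
table_engine() {
  mysql -N -B -e "SELECT TABLE_SCHEMA, TABLE_NAME, ENGINE FROM information_schema.TABLES WHERE TABLE_NAME = '$1'"
}

# Usage (not run here): table_engine File
```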