Here are my questions and Arash's responses. 

Back-up and Recovery

Q: Is the database backed up, and if so, how frequently? In particular, if we suspect the db has been corrupted, will we be able to recover by retreating to an earlier saved state? Are transactions being written to a binary log?

A: Backups are performed nightly. We have a master-master, active-passive replication setup. Replication relies on the binary logs, so binary logging is necessarily turned on, and all transactions are recorded in the binary logs.
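Since nightly backups and binary logs are both available, a point-in-time recovery would typically restore the most recent good backup and then replay the binary logs up to just before the suspected corruption. The following is a minimal sketch, not the DBA team's actual procedure: it only assembles a `mysqlbinlog` command line using its `--start-datetime` and `--stop-datetime` options. The helper name, the binlog file name, and the timestamps are hypothetical.

```python
import shlex
from datetime import datetime


def build_binlog_replay_command(binlog_files, backup_time, stop_time):
    """Build a mysqlbinlog argument list for the window after a backup.

    binlog_files: binary log file names to read (hypothetical here).
    backup_time:  when the restored nightly backup was taken.
    stop_time:    a moment just before the suspected corruption.
    """
    if stop_time <= backup_time:
        raise ValueError("stop_time must be later than backup_time")
    fmt = "%Y-%m-%d %H:%M:%S"
    return [
        "mysqlbinlog",
        "--start-datetime=" + backup_time.strftime(fmt),
        "--stop-datetime=" + stop_time.strftime(fmt),
        *binlog_files,
    ]


# Assumed values for illustration only.
cmd = build_binlog_replay_command(
    ["mysql-bin.000042"],
    datetime(2024, 1, 10, 2, 0, 0),
    datetime(2024, 1, 10, 14, 30, 0),
)
print(shlex.join(cmd))
```

The output of `mysqlbinlog` is normally piped into the `mysql` client to apply the events to the restored database. The exact file names and time window would have to come from the DBA.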

Monitoring

Q: Are resources and performance being monitored? If so, are we able to see relevant statistics, plots, etc.?

A: No, we don't have database performance monitoring. Nagios monitors the server only; for the database, it reports only whether it is up or down. Nagios may be able to monitor some database performance parameters, but I need to talk to unix-admin to see how that can be enabled for MySQL databases. However, the MySQL slow query log is turned on, and it can be analyzed to find slow queries.
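As an illustration of what analyzing the slow query log could look like, here is a rough sketch that pulls the `Query_time` value and the statement text out of each entry and lists the slowest ones. It assumes the standard file-based slow log layout (comment lines starting with `#`, followed by `SET timestamp=...;` and the statement). The sample log, table names, and function name are made up for the example.

```python
import re

# Matches the "# Query_time: <seconds>" header of a slow log entry.
QUERY_TIME_RE = re.compile(r"^# Query_time: (\d+(?:\.\d+)?)")


def slowest_queries(log_text, top_n=5):
    """Return up to top_n (query_time_seconds, sql) pairs, slowest first."""
    entries = []
    current_time = None
    sql_lines = []
    for line in log_text.splitlines():
        match = QUERY_TIME_RE.match(line)
        if match:
            # A new entry begins: store the previous one, if any.
            if current_time is not None and sql_lines:
                entries.append((current_time, " ".join(sql_lines)))
            current_time = float(match.group(1))
            sql_lines = []
        elif line.startswith("#") or current_time is None:
            continue  # other header lines, or text before the first entry
        elif line.startswith("SET timestamp=") or line.lower().startswith("use "):
            continue  # bookkeeping statements written by the server
        else:
            sql_lines.append(line.strip())
    if current_time is not None and sql_lines:
        entries.append((current_time, " ".join(sql_lines)))
    return sorted(entries, key=lambda entry: entry[0], reverse=True)[:top_n]


# Hypothetical sample in the usual slow log layout.
SAMPLE_LOG = """\
# Time: 2024-01-10T02:00:01.000000Z
# User@Host: app[app] @ localhost []
# Query_time: 2.500000  Lock_time: 0.000100 Rows_sent: 1  Rows_examined: 100000
SET timestamp=1704852001;
SELECT * FROM orders WHERE status = 'open';
# Time: 2024-01-10T02:05:09.000000Z
# User@Host: app[app] @ localhost []
# Query_time: 7.250000  Lock_time: 0.000200 Rows_sent: 1  Rows_examined: 900000
use shop;
SET timestamp=1704852309;
SELECT COUNT(*) FROM events;
"""

for seconds, sql in slowest_queries(SAMPLE_LOG):
    print(f"{seconds:8.3f}s  {sql}")
```

MySQL also ships with the `mysqldumpslow` utility, which summarizes the slow query log and may be sufficient without any custom parsing.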

Configuration

Q: What sort of configuration limits might there be? We ran into a problem some weeks ago on mysql-dev01 (see INC0046808) with InnoDB space. Is this quantity being monitored? Would it be possible to make it auto-extending? Are there other configurable resource limits we should know about?

A: I do monitor InnoDB space on production databases, so you should not come across that problem on a production database (the problem you are referring to was on mysql-dev01, which is a dev database). 

InnoDB data files are not auto-extending, because we want to prevent those files getting too large.

The configuration limit we do have is the number of concurrent connections to the database, which is limited to 600.
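For applications that want to keep an eye on the 600-connection limit themselves, one option is to compare the server's `Threads_connected` status value (from `SHOW GLOBAL STATUS`) against the limit. The sketch below covers only the arithmetic. The function name and the 80% warning threshold are assumptions, not settings used on these servers.

```python
def connection_headroom(threads_connected, max_connections=600, warn_ratio=0.8):
    """Summarize how close the server is to its connection limit.

    threads_connected: current value of the Threads_connected status variable.
    max_connections:   the configured limit (600 per the answer above).
    warn_ratio:        assumed fraction of the limit at which to warn.
    """
    if max_connections <= 0:
        raise ValueError("max_connections must be positive")
    used_ratio = threads_connected / max_connections
    return {
        "remaining": max_connections - threads_connected,
        "used_ratio": used_ratio,
        "warn": used_ratio >= warn_ratio,
    }


# Hypothetical reading of 540 open connections.
status = connection_headroom(540)
print(status["remaining"], status["warn"])
```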

Outage Warning

Q: Since users in various U.S. time zones and in Europe will depend heavily on this database's availability, we ask that we be given ample warning of any planned outages.

A: Scheduled outages will be communicated to owners of databases in advance. Outages are usually not urgent, except when there is an OS patch for a security vulnerability that security wants us to deploy urgently. In that case, users may get only one or two days' advance notice.
