AWS Thinkbox Discussion Forums

launcher / monitor hanging

Something happened over the weekend, and the launcher / monitor applications hung on most machines. These messages repeat endlessly:

Launcher log:

2014-01-13 09:14:00: Error occurred while updating network settings: An error occurred while trying to connect to the Database ). It is possible that the Mongo Database server is incorrectly configured, currently offline, blocked by a firewall, or experiencing network issues.
2014-01-13 09:14:00: Full error: Unable to connect to a member of the replica set matching the read preference Primary (FranticX.Database.DatabaseConnectionException)
2014-01-13 09:14:00: Error occurred while updating Repository options: An unexpected error occurred while interacting with the database ):
2014-01-13 09:14:00: An error occurred while trying to connect to the Database ). It is possible that the Mongo Database server is incorrectly configured, currently offline, blocked by a firewall, or experiencing network issues.
2014-01-13 09:14:00: Full error: Unable to connect to a member of the replica set matching the read preference Primary (FranticX.Database.DatabaseConnectionException)

Monitor log:
2014-01-13 00:20:52: Error occurred while reloading network settings: An error occurred while trying to connect to the Database ). It is possible that the Mongo Database server is incorrectly configured, currently offline, blocked by a firewall, or experiencing network issues.
2014-01-13 00:20:52: Full error: Unable to connect to a member of the replica set matching the read preference Primary (FranticX.Database.DatabaseConnectionException)

You mentioned in another post that you guys had a full power outage over the weekend. It looks like these machines are unable to connect to the database, and that’s why they keep repeating those messages. Ideally though, the Monitor and Launcher shouldn’t lock up in this case, so I’ve logged that as a bug.

Cheers,
Ryan
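
For anyone hitting the same messages, a quick way to tell "the database host is unreachable" apart from "the application is stuck" is to probe the database port from an affected machine. The following is only a minimal sketch and not part of Deadline: `can_reach` is a hypothetical helper, and the host name in the usage comment is a placeholder. 27017 is MongoDB's default port, so adjust it if your database listens elsewhere.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example usage (placeholder host name, replace with your database server):
#   can_reach("your-db-host", 27017)
```

If the probe succeeds while the Launcher and Monitor keep logging the same error, the network path is fine and the applications are probably just not retrying. Note that an open port only shows the server is listening; it does not show that the replica set currently has a primary.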

The power failure lasted a bit, but not 3 days :) These machines had been hanging since the power failure and were never able to reconnect.

I could only fix it by restarting both the launcher and the slave.

Ah, okay. Interesting that they could never reconnect…

I’ve amended the bug we logged to include this information.

Cheers,
Ryan
