AWS Thinkbox Discussion Forums

Deadline slave auto-restarting if it crashes?

Should the slave auto-restart if it crashes? It seems to be quite sensitive right now: we usually see 10% of our slaves crash within a 2-hour period. I would really rather not go all Backburner on Deadline and set up a batch file that keeps starting the slave :frowning:

There currently isn’t a system to do this (and there actually has never been one in the past either). In theory, it could be as simple as having a setting in the Launcher (similar to Launch Slave at Startup) that would fire up the slave process if it’s no longer running. You would turn this on for slaves, but off for workstations.

The other way to make this work would be to only restart the slave if it shows up as stalled in the repository. That way, if someone closes the slave properly, it won’t keep starting up again.
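To make the two ideas above concrete, here is a rough Python sketch of what such a watchdog could look like. It is not part of Deadline or the Launcher; the `watch` function, its parameters, and the rule that a nonzero exit code means "crashed" are all assumptions made for illustration. The `should_restart` callback is where the "only restart if it was not closed properly" decision would live. A check against the repository's stalled state could be plugged in there instead, but no such call is shown because it would depend on how your setup exposes that information.

```python
import subprocess
import time
from typing import Callable, Optional, Sequence


def watch(command: Sequence[str],
          should_restart: Callable[[int], bool],
          poll_seconds: float = 5.0,
          max_restarts: Optional[int] = None,
          sleep: Callable[[float], None] = time.sleep) -> int:
    """Run `command` and relaunch it each time it exits, as long as
    should_restart(exit_code) returns True.

    Returns the number of restarts performed. All names here are
    hypothetical; this is not a Deadline API.
    """
    restarts = 0
    while True:
        proc = subprocess.Popen(list(command))
        # Poll until the child process has exited.
        while proc.poll() is None:
            sleep(poll_seconds)
        exit_code = proc.returncode
        # A clean shutdown (as judged by the callback) ends the watchdog,
        # so a slave that was closed on purpose is not started again.
        if not should_restart(exit_code):
            return restarts
        # Optional cap so a slave that crashes immediately does not loop forever.
        if max_restarts is not None and restarts >= max_restarts:
            return restarts
        restarts += 1


def crashed(exit_code: int) -> bool:
    # Assumption: a nonzero exit code means the process did not close cleanly.
    return exit_code != 0
```

You would call it with whatever command line starts the slave on your render nodes, for example `watch(slave_command, crashed)`, where `slave_command` is a placeholder for your own launch command. On workstations you simply would not run it.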

Thoughts? We can definitely add this to the wish list, although it might be a post-6.0 thing. We’d rather spend resources on making the slave stable first. :slight_smile:

I saw your other thread about the crashing slave on a post-job task. Is that generally when your slaves crash, or are there other scenarios?

I just remembered that for now, you could use the Slave Scheduler in power management to keep the slaves up and running. Just pick a group for all your render nodes, and then set the times so that they’re always running.

Thanks for the reminder about power management, yep, that should work!

Most of today’s cases were due to that other report (and the slave not starting up after restarting the machine). I agree, focusing on stability is probably the better way to spend the energy.
