AWS Thinkbox Discussion Forums

Auto disable failing render nodes?

Most render managers I’ve worked with will automatically disable a failing render node on the current job, either by default or after a user-set number of failures, to prevent the node from continuing to pick up frames it can’t process, e.g. when it is missing plugins or software needed to render successfully.

Can Deadline do this?

My IT people don’t seem to know how to set it up like this, and are currently relying on email notifications of failures. But this does not automatically take the bad node out of the job, and it can result in a completely failed job if the number of failures reaches the job’s failure threshold.

I had a quick look around but don’t see such a feature, and I don’t have time to dig further. Can this be done, and can someone point me to it so I can pass the info along?

Sure, see here in your repo config:
docs.thinkboxsoftware.com/produc … -detection

Specifically:
Slave Failure Detection -> “Mark a Slave as bad after it has generated this many errors for a job in a row”

I would highly recommend taking the time to consider some of the other settings too, so that you deploy both “JOB”-centric and “SLAVE”-centric failure detection, since you can get bad jobs, bad Slaves, or both.

Personally, back in production, I liked the Bill Clinton classic: “3 strikes and you are out!” catchphrase, referring to the thought: “if a Slave can’t render one or more of these tasks from the same job after 3 attempts, chances are it’s never gonna be able to, so time to move on”. (Incidentally, that bill (pun possibly intended) was a terrible idea by the then POTUS…however we shall swiftly glance over that point, to remain on topic here) :wink:
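To make the "errors in a row" idea concrete, here is a rough Python sketch of how a per-job, per-Slave strike counter could behave. This is not Deadline's code or its API; the class and method names (`BadWorkerTracker`, `report_error`, `report_success`) are made up for illustration, and I'm assuming a successful task resets the streak, which is how I read "in a row".

```python
from collections import defaultdict


class BadWorkerTracker:
    """Hypothetical illustration of 'N errors in a row' per (job, worker)."""

    def __init__(self, max_consecutive_errors=3):
        # Threshold is the "strikes" setting; 3 here, IT may prefer 10.
        self.max_consecutive_errors = max_consecutive_errors
        self._streak = defaultdict(int)  # (job_id, worker) -> current error streak
        self._bad = set()                # (job_id, worker) pairs marked bad

    def report_error(self, job_id, worker):
        """Record one error; return True if the worker is now bad for this job."""
        key = (job_id, worker)
        self._streak[key] += 1
        if self._streak[key] >= self.max_consecutive_errors:
            self._bad.add(key)
        return key in self._bad

    def report_success(self, job_id, worker):
        """Assumption: a completed task resets the consecutive-error streak."""
        self._streak[(job_id, worker)] = 0

    def is_bad(self, job_id, worker):
        """A worker marked bad for one job can still pick up other jobs."""
        return (job_id, worker) in self._bad
```

The point of keying on the (job, Slave) pair is that a node missing a plugin gets pulled off the job that needs it, but keeps rendering everything else.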

Thanks Mike, we discovered these settings. I agree with the three strikes and you’re out philosophy – unfortunately IT thinks it should be 10 :imp:
