It doesn’t look like the auto-update detects slaves that are no longer responding (i.e. turned off), so they’re still listed as “Idle” even after they’re powered down.
That would be the case if the slaves aren’t shut down gracefully. When a slave is closed, one of the last things it does is update its state in the database to indicate that it is offline. When this doesn’t happen, it will eventually get marked as stalled.
This behavior is consistent with previous versions of Deadline, so the question is why the slaves aren’t shutting down gracefully. The next time this happens, can you grab the most recent slave log from the slave machine that didn’t update its state to offline?
Thanks!
- Ryan
They didn’t shut down gracefully because I pushed their power buttons haha. The log will say “Power gone”.
Still, if Deadline doesn’t get a response from a slave, or can’t ping it, the slave should at least go to ‘stalled’, no?
Deadline doesn’t ping the slaves. It determines that a slave is stalled if it hasn’t updated its state in a certain number of minutes (I believe 10 or 20 is the default, but I can’t confirm because I’m not in the office). If Pulse is running, stalled slave detection happens at regular intervals (during the repository cleanup step). If Pulse isn’t running, the slaves check each other at random intervals, so the time it takes to detect a stalled slave could be greater than 10 minutes.
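To make the idea concrete, here is a rough sketch of that kind of timeout check. It is not Deadline’s actual code: the function name `find_stalled`, the `last_updates` mapping, the slave names, and the 10-minute threshold are all made up for illustration. The only thing taken from the description above is the rule itself: a slave counts as stalled when its last state update is older than some number of minutes.

```python
from datetime import datetime, timedelta

# Hypothetical threshold; the real default is a guess (10 or 20 minutes).
STALL_THRESHOLD = timedelta(minutes=10)


def find_stalled(last_updates, now, threshold=STALL_THRESHOLD):
    """Return the names of slaves whose last state update is older than threshold.

    last_updates maps a slave name to the time it last wrote its state.
    Nothing is pinged; only the stored timestamps are compared against now.
    """
    return sorted(
        name
        for name, last_seen in last_updates.items()
        if now - last_seen > threshold
    )


# Illustrative data only: one slave still reporting, one powered off a day ago.
now = datetime(2013, 1, 1, 12, 0)
last_updates = {
    "render01": now - timedelta(minutes=2),
    "render02": now - timedelta(hours=25),
}

stalled = find_stalled(last_updates, now)
print(stalled)  # ['render02']
```

Note that a check like this only produces a result when something actually calls it. If no process runs the comparison, a stale timestamp just sits there and the slave keeps whatever state it last wrote.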
Maybe there is a better way we could be handling this, and we’re definitely open to the discussion, but odds are we won’t be changing how this works for 6.
Well, unless the random interval was more than 24 hours, the check never ran. But maybe that’s because none of the slaves were running, so there was no peer-to-peer checking. Seems like the Monitor should also do its own check.
That’s exactly why. If Pulse or the slaves aren’t running, nothing is doing the checking.
The Monitor has a manual way of checking. It’s in the Tools menu while in super user mode (I’m pretty sure that’s been added to the v6 Monitor).
Maybe the Monitor could do random checking like the slaves do…