Job Priorities not working?

Hi,

We’re seeing a few weird things with our jobs at the moment. Basically, slaves are getting stuck on lower priority jobs when a new higher priority job gets submitted. The jobs differ only in priority: they have the same group, pool and plugin, so the new job should take all the machines. We run our farm using pool-priority-balanced. If I suspend the lower priority job, the machines jump onto the higher priority one and stay there even after I’ve unsuspended the lower priority job. Any thoughts on why this might be happening? We’ve noticed it since we installed the latest version 8.09, but I can’t be certain it wasn’t happening before.

Thanks

Nick

I believe this has been happening on and off for a while now, but tracking it down has been quite the challenge.

One user recently had this issue because ‘sequential rendering’ was enabled on the lower priority job. That forced the Slave to stick to its existing job, so when a higher-priority job came along, the Slave stayed on the previous one. The sequential mode makes the most sense for simulation jobs, where it would be a waste to move a Slave over.
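To illustrate why a sequential job can hold on to a Slave, here is a rough sketch of that kind of dequeue step. This is not Deadline's actual code: the `Job` class, its fields, and `pick_next_job` are all hypothetical names, and it assumes a higher number means a higher priority.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Job:
    """Hypothetical job record, for illustration only."""
    name: str
    priority: int                  # assumption: higher number = more important
    sequential: bool = False
    has_queued_tasks: bool = True


def pick_next_job(current: Optional[Job], queue: List[Job]) -> Optional[Job]:
    """Hypothetical dequeue step; not Deadline's real implementation."""
    # A sequential job keeps its Slave for as long as it still has tasks,
    # regardless of what else is waiting in the queue.
    if current is not None and current.sequential and current.has_queued_tasks:
        return current

    # Otherwise choose the highest priority job that still has work.
    candidates = [job for job in queue if job.has_queued_tasks]
    if not candidates:
        return None
    return max(candidates, key=lambda job: job.priority)
```

In this sketch, a Slave on a sequential priority 50 job stays there even when a priority 90 job is queued; with `sequential` set to `False`, the same call returns the priority 90 job.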

Is that lower priority job sequential (check the ‘general section’ in the job properties)? Also, did the Slave pick up that lower priority job when there were higher priority jobs it should have started instead?

Hi Edwin,

I’ve checked the job and it’s not set to sequential. The plugin is 3DS Max, if that makes any difference. I’m pretty sure this is a problem with the job not releasing the slaves, rather than the lower priority job being picked when there are higher priority jobs. I also think the jobs that are holding the machines were submitted before I updated to the latest version. I don’t know if something could have changed between versions? It doesn’t seem to be happening on the newer jobs; they’re behaving as expected at the moment.

As a bit of extra info, we’ve got a task buffer of 3 and I’ve also got the “Enhanced Balanced Logic” turned on at the moment, but I think the problem was still happening when this was off. I know this is for same priority jobs, but I don’t know if that’s having an effect as well.

I can keep an eye on it, it’s happened a few times recently. Is there anything I can check when it happens to see what the logic is or why it might be deciding not to move over?

Nick

Thanks for the clarification there. I don’t think any of the settings you mentioned can affect the job dequeue logic.

The scheduler thread inside of the Slave is the one responsible for finding and working on tasks. Really, what we’re looking for are the verbose Slave logs from the machines that aren’t working on the correct priority jobs. I’m hoping there’s something obvious there, but short of limits, blacklists, or the Slave being marked bad, it’s going to be really hard to check the logic here.
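To make that checklist concrete, here is a minimal sketch of the three things worth ruling out for a stuck Slave. It does not use the Deadline API: `JobInfo`, its fields, and `reasons_to_skip` are hypothetical names, and the values would have to be read by hand from the job properties in the Monitor.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class JobInfo:
    """Hypothetical job record; field names are illustrative only."""
    name: str
    blacklisted_slaves: Set[str] = field(default_factory=set)
    bad_slaves: Set[str] = field(default_factory=set)
    limit_max: int = 0        # assumption: 0 means "no limit" in this sketch
    limit_in_use: int = 0


def reasons_to_skip(slave: str, job: JobInfo) -> List[str]:
    """List the reasons this Slave could not pick up the given job."""
    reasons = []
    if slave in job.blacklisted_slaves:
        reasons.append("blacklisted")
    if slave in job.bad_slaves:
        reasons.append("marked bad")
    if job.limit_max > 0 and job.limit_in_use >= job.limit_max:
        reasons.append("limit full")
    return reasons
```

If a check like this comes back empty for the higher priority job, the Slave should have been eligible, and the verbose log is the next place to look.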

I’ll keep an eye on it then. I couldn’t see anything in particular in the slave logs when I looked, but if it happens again I’ll let you know.

Nick

Also, make sure verbose logging is enabled if it’s not already. Otherwise, the logs won’t have enough detail to be useful.

docs.thinkboxsoftware.com/produc … ation-data