Pulse/Manager Problems?

Using the latest non-beta Deadline 5 release.

I'm not sure where the problem is, but Nuke jobs are sitting in the queue and not rendering after submission. The jobs aren't picked up by any slaves at all, even though pools, groups, limits, and priorities are all correct. Once Pulse is restarted everything kicks off, only to halt again an hour or so later. I THINK (still unconfirmed) that a certain user's jobs are causing it, but the only jobs that user has in the queue are completed ones, so I'm still not sure that is the cause.

Strange problem that seemingly came out of nowhere. I am seeing a lot of the errors below in the Pulse list. Checking for orphaned limit stubs returns nothing (not sure that is what I should be checking, but yeah).

Scheduler - unable to dequeue task: Could not secure stub for 000_080_004_44c2750e limit.

Joe

Based on that scheduler message, and the behavior you described, it sounds like the job’s machine limit is playing a role. You’ll notice that the name of that limit is actually the job’s ID (which you can see in the Job ID column in the slave list in the Monitor). You can use that to see if these messages match the job(s) that aren’t getting picked up for rendering.
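If it helps with the matching, here is a rough Python sketch that tallies these messages per limit name, so the IDs can be compared against the jobs that are stuck. It assumes the Pulse output has been saved as plain text lines, and the pattern is based only on the single message quoted above, so the wording may need adjusting. The function name and the sample lines are made up for illustration and are not part of Deadline.

```python
import re
from collections import Counter

# Pattern taken from the one message quoted in this thread (an assumption:
# other messages or Deadline versions may be worded differently).
STUB_ERROR = re.compile(r"Could not secure stub for (\S+) limit")


def count_stub_failures(log_lines):
    """Count 'Could not secure stub' messages per limit name (the job ID)."""
    counts = Counter()
    for line in log_lines:
        match = STUB_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample: the first line is the message from this thread,
# repeated once, plus an unrelated line that should be ignored.
sample = [
    "Scheduler - unable to dequeue task: Could not secure stub for 000_080_004_44c2750e limit.",
    "Scheduler - unable to dequeue task: Could not secure stub for 000_080_004_44c2750e limit.",
    "Some unrelated Pulse message",
]

for limit_name, hits in count_stub_failures(sample).most_common():
    print(limit_name, hits)
```

Any limit name that shows up repeatedly is a job ID worth looking up in the Monitor to see which job and user it belongs to.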

Also, I know you mentioned that you checked the job’s Limits, but did you also check its Machine Limit?

Finally, is this only a problem with Nuke jobs? Do any other job types have this problem? Have you had a chance to confirm if it’s only jobs that belong to a specific user?

Thanks!

Ryan

Back from vacation!

Let me get back into this one; I will have a talk with the Nuke folks today. A couple of Nuke jobs just ran through fine, so I'm definitely thinking it is a specific user at this point.

Checked all the limits, both Job and Machine, and they are all appropriate. The problem is only with Nuke jobs, but it seems to cause a lot of instability on the farm, as it pretty much semi-crashes the Pulse machine, which makes the Monitor on the workstations even more unstable.

I will hopefully be able to find the user / problem today.

J

Sooooo. I’ve been poking my head in with the artists and keeping an eye on the Monitor lately: zero crashes… Sigh. At least for the time being, the problem has disappeared.