Hi there,
Occasionally, jobs submitted to the farm seem to ignore their machine limit of 20. It happens with jobs from different plugins. They aren’t using a limit because they don’t need to. Looking at the .job file, the limit is set to 20, and I can’t see anything else that could be causing the problem.
Looking at the limitgroup folder for the job, the stubs are in there, and the .limitgroup limit is set to 20.
Which version of Deadline are you using? There were race condition issues that could cause this behavior, but these should be addressed in the current version.
Cheers,
- Ryan
5.1 at the moment, hanging out for 6
Just upgraded to 5.2 today, hoping it might solve this issue but we are getting the same thing.
We noticed that slaves are rendering jobs, but not creating a .stub in the limitgroup folder for the job.
Is there anything we can do to force stub creation?
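One way to confirm which slaves are rendering without a stub is to list the .stub files in the job's limitgroup folder and compare them with the slaves known to be working on the job. The sketch below is only an illustration: the folder path is whatever your repository uses, the list of rendering slaves has to be supplied by hand, and the assumption that each stub is named after its slave (for example render01.stub) is a guess, not something confirmed for Deadline.

```python
from pathlib import Path


def list_stubs(limitgroup_dir):
    """Return the sorted names of the .stub files directly inside the folder."""
    folder = Path(limitgroup_dir)
    return sorted(p.name for p in folder.glob("*.stub") if p.is_file())


def slaves_without_stub(limitgroup_dir, rendering_slaves):
    """Return the rendering slaves that have no matching stub file.

    ASSUMPTION: a stub is named "<slave name>.stub". Adjust the matching
    if the stubs in your repository are named differently.
    """
    stub_owners = {Path(name).stem for name in list_stubs(limitgroup_dir)}
    return sorted(s for s in rendering_slaves if s not in stub_owners)
```

Calling slaves_without_stub with the job's limitgroup folder and the slave names shown as rendering in the Monitor would return the machines that picked up a task without leaving a stub behind.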
Same trouble here; generally it takes 3 machines maximum.
That’s strange. Do any stubs get created? If some slaves are creating stubs and some are not, do you know if it’s the same group of machines that are unable to create their stub?
Last night I noticed that the stubs that are created are from the machines that have most recently picked up a task for the job. It looks like stubs are being deleted at a certain percentage of rendering, even though that setting isn’t turned on.
I will investigate more when it happens again.
It’s basically any slave, it doesn’t seem to be slave specific.
Hmm, I’m not sure what would be causing this behavior.
In Deadline 6, we are no longer using stub files, and this is now taken care of in the database. In theory, that should make the system more robust. The beta is still open, so if you’re interested in testing with the beta, just shoot an email to beta@thinkboxsoftware.com.
Cheers,
- Ryan
Thanks Ryan, we might do that.
Since this thread already exists, I’ll piggyback on it. I’m having a similar issue with Deadline 6. It seems that some of the jobs are obeying their limits, but one job just won’t listen. It seems like it took the other jobs some time to actually start obeying their limits as well. Hopefully this last one will catch on, but I wanted to put the issue out there in case there is a solution.
James
There was a bug in the limit group code in Deadline 6 where a slave might acquire a limit group “stub” (which is used to update the “in use” count) and fail to return it. It’s something that’s already been fixed in the 6.1 beta. Beta 3 of 6.1 includes this fix, as well as fixes for other stability issues, so if you don’t already have beta access and want to test, shoot an email to beta@thinkboxsoftware.com.
Cheers,
- Ryan
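For readers wondering how an acquired stub that is never returned can throw off the "in use" count, here is a toy model. It is not Deadline's code: the LimitGroup class and its methods are hypothetical names invented for this sketch, and it keeps the count in memory rather than in a database. It only shows the general pattern of pairing every acquire with a guaranteed release.

```python
import threading
from contextlib import contextmanager


class LimitGroup:
    """Toy in-memory limit group (hypothetical, not Deadline's API)."""

    def __init__(self, limit):
        self.limit = limit
        self._holders = set()          # slaves currently holding a stub
        self._lock = threading.Lock()  # guards _holders against races

    @property
    def in_use(self):
        with self._lock:
            return len(self._holders)

    def acquire(self, slave):
        """Try to take a stub; return False if the limit is reached."""
        with self._lock:
            if slave in self._holders:
                return True
            if len(self._holders) >= self.limit:
                return False
            self._holders.add(slave)
            return True

    def release(self, slave):
        """Return the stub; harmless if the slave holds none."""
        with self._lock:
            self._holders.discard(slave)

    @contextmanager
    def hold(self, slave):
        """Hold a stub for the duration of a task, releasing it even on error."""
        if not self.acquire(slave):
            raise RuntimeError("limit reached for " + slave)
        try:
            yield
        finally:
            self.release(slave)
```

If a slave calls acquire and then crashes before calling release, the count stays one too high until something cleans it up. Wrapping the task in hold avoids that in this toy model, because the release runs even when the task raises.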