We have a job that started rendering on 100 machines, was then cancelled, reconfigured with its machine limit set to 1, and resumed.
After all this, slaves were not picking it up at all, saying the limit was exhausted for the job, even though nothing was rendering it:
2015-04-27 12:10:08: Scheduler - Preliminary check: The 553e873b7a3a9e28e43018c1 limit is maxed out.
It took about 5-10 minutes for a machine to finally pick it up. Any ideas?
It’s likely that a limit stub was orphaned at some point, which kept the “in use” count above zero until the repository repair operation detected this and fixed it (which is why it took 5-10 minutes for it to get picked up again).
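Conceptually it works something like the sketch below (hypothetical names, just an illustration, not the actual Deadline code): the scheduler's preliminary check refuses the job while the recorded stub count looks maxed out, and it's the periodic housecleaning/repair pass that returns the orphaned stub and frees the slot.

# Minimal conceptual sketch (hypothetical names, not the actual Deadline
# implementation) of how an orphaned limit stub keeps a limit "maxed out"
# until a periodic repair pass reconciles the in-use count.

class Limit:
    def __init__(self, name, maximum):
        self.name = name
        self.maximum = maximum
        self.stubs = set()  # slave names currently holding a stub

    def try_acquire(self, slave):
        """Scheduler's preliminary check: refuse if the limit looks maxed out."""
        if len(self.stubs) >= self.maximum:
            print(f"Preliminary check: The {self.name} limit is maxed out.")
            return False
        self.stubs.add(slave)
        return True

    def release(self, slave):
        self.stubs.discard(slave)

    def repair(self, actively_rendering):
        """Housecleaning pass: drop stubs held by slaves that are no longer
        rendering the job (e.g. a stub left orphaned by a cancel/requeue)."""
        orphaned = self.stubs - set(actively_rendering)
        for slave in orphaned:
            print(f"Repair: returning orphaned stub held by {slave}")
        self.stubs -= orphaned


# Example: limit lowered to 1 while a stale stub is still recorded.
limit = Limit("553e873b7a3a9e28e43018c1", maximum=1)
limit.stubs.add("old-slave-42")        # stub left over from the cancelled pass

limit.try_acquire("idle-slave-01")     # refused: limit looks maxed out
limit.repair(actively_rendering=[])    # periodic repair returns the stub
limit.try_acquire("idle-slave-01")     # now succeeds

In the sketch, repair() plays the role of the repository repair operation mentioned above; until it runs, the single slot stays occupied by the stale stub, which is why nothing picks the job up in the meantime.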
Cheers,
Ryan
Hi,
We just had a similar issue. I'm not 100% sure it's the same, though, so I thought I'd post to confirm.
The highlighted Nuke job in the Monitor screenshot was only using 1 Slave despite other candidates being online and idle. In the Pulse screenshot you'll see that the machines that could have joined the job are the ones that were presumably still holding the limit stubs, even though they were idle. I've also attached the Pulse log for reference.
If the issue here is the same as the one Laszlo described, is this normal/intended behaviour? Could this be optimized in a future version?
Cheers,
Holger
deadlinepulse-cell-temp-01-2015-08-20-0001.zip (554 KB)