AWS Thinkbox Discussion Forums

[7.1.0.17] job not picking up

We have a job that started rendering on 100 machines, was then cancelled and reconfigured with the machine limit set to 1 machine, and then resumed.
After all this, slaves were not picking it up at all, saying the limit is exhausted for the job, even though nothing is rendering it:

2015-04-27 12:10:08: Scheduler - Preliminary check: The 553e873b7a3a9e28e43018c1 limit is maxed out.

It took about 5-10 minutes for a machine to finally pick it up. Any ideas?

It’s likely that a limit stub was orphaned at some point, which kept the “in use” count above zero until the repository repair operation detected this and fixed it (which is why it took 5-10 minutes for it to get picked up again).
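
To make the mechanism concrete, here is a toy model of it in Python. This is not Deadline's code or API: the class names, the heartbeat field and the `stale_after` threshold are all made up for illustration, and the real repair operation may detect orphaned stubs differently. It only shows how one stale stub can keep the "in use" count at the maximum until a repair pass removes it.

```python
from dataclasses import dataclass, field


@dataclass
class LimitStub:
    """Toy record saying that one worker holds one slot of a limit."""
    holder: str
    last_heartbeat: float  # hypothetical: when the holder last checked in


@dataclass
class Limit:
    """Toy limit whose "in use" count is simply the number of stubs."""
    maximum: int
    stubs: dict = field(default_factory=dict)

    def in_use(self) -> int:
        return len(self.stubs)

    def try_acquire(self, worker: str, now: float) -> bool:
        # A worker is refused while the limit looks maxed out, even if
        # the stub being counted belongs to a worker that is now idle.
        if self.in_use() >= self.maximum:
            return False
        self.stubs[worker] = LimitStub(worker, now)
        return True

    def release(self, worker: str) -> None:
        self.stubs.pop(worker, None)

    def repair(self, now: float, stale_after: float) -> list:
        """Remove stubs whose holder has not checked in for stale_after seconds."""
        orphaned = [name for name, stub in self.stubs.items()
                    if now - stub.last_heartbeat > stale_after]
        for name in orphaned:
            del self.stubs[name]
        return orphaned


# Machine limit of 1: worker-a takes the slot, then stops rendering
# without its stub ever being released (the orphaned stub).
limit = Limit(maximum=1)
limit.try_acquire("worker-a", now=0.0)

blocked = limit.try_acquire("worker-b", now=60.0)      # limit still looks maxed out
removed = limit.repair(now=600.0, stale_after=300.0)   # repair pass drops the stale stub
picked_up = limit.try_acquire("worker-b", now=600.0)   # slot is free again
```

In this sketch `worker-b` is refused until `repair` runs, which mirrors the delay before the job was picked up again.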

Cheers,
Ryan

Hi,

We just had a similar issue. I’m not 100% sure it is the same one, though, so I thought I’d post to confirm.
The highlighted Nuke job in the Monitor screenshot was only using 1 Slave despite some other candidates being online/idle. In the Pulse screenshot you’ll see that the machines that could’ve joined the job are the ones that were presumably still holding the limit stubs although they were idle. I also attached the Pulse log for reference.

If the issue here is the same as the one Laszlo described, is this normal/intended behaviour? Could this maybe be optimized in a future version?

Cheers,
Holger
deadlinepulse-cell-temp-01-2015-08-20-0001.zip (554 KB)

