Selective interruptibility?

Hello,

we’d like to make the most of our farm and are trying to use the interruptible options.

One of the things we would like to do is allow all our users to send jobs to most of the nodes via a secondary pool.
On top of that, a job would interrupt tasks running on nodes outside its primary pool.

Let’s say our farm looks like this:
node01 to node05 are part of pool_1 and pool_all
node06 to node10 are part of pool_2 and pool_all

job number 100 is launched with pool_1 as primary pool and pool_all as secondary pool. It uses all nodes as long as it is the only job running.

Then job number 101 is launched with pool_2 as primary pool and pool_all as secondary pool.
The aim is that job 101 will interrupt all of job 100’s tasks running on pool_2 nodes.

Could we make all jobs interruptible by default on their secondary pool but uninterruptible on their primary pool?
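To make the rule concrete, here’s a minimal sketch of the behaviour we’re after. This is plain Python, not Deadline’s API; all the names are ours, and the farm layout is the one described above:

```python
# Hypothetical model of the selective-interruptibility rule:
# a task is interruptible only when its node lies OUTSIDE the job's primary pool.

# Farm layout from the example: node01-node05 in pool_1, node06-node10 in pool_2,
# everything in pool_all.
NODE_POOLS = {
    **{f"node{n:02d}": {"pool_1", "pool_all"} for n in range(1, 6)},
    **{f"node{n:02d}": {"pool_2", "pool_all"} for n in range(6, 11)},
}

def is_interruptible(job_primary_pool, node):
    """A running task may be preempted only on nodes outside the job's primary pool."""
    return job_primary_pool not in NODE_POOLS[node]

# Job 100 (primary pool_1) has spread across the whole farm; job 101
# (primary pool_2) should be able to preempt it on node06-node10 only.
preemptable = [n for n in sorted(NODE_POOLS) if is_interruptible("pool_1", n)]
print(preemptable)  # node06 through node10
```

With this rule, job 100 keeps its tasks on node01–node05 untouched, while job 101 reclaims its own primary nodes.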

Cheers,
Pierre

Hi Pierre,

I see what you are getting at. I cannot think of a clean way to accomplish this at the moment. Others may chime in with some ideas, though.

I’ve made a note of this as a use case.

Hi James,

Since we usually have very different jobs running simultaneously (for instance, 30-second Nuke tasks vs. 6-hour Maya tasks), I can’t find another way to maximize usage of our farm except interruptibility, but I could have missed something.

I’d be delighted to have another lead, though :blush:

Thanks,
Pierre

Interrupting tasks is generally a bad idea, as it always involves lost processing time.
For the case you described (several hours rendering vs. several seconds of compositing), the one type of job is heavy on CPU usage while the other is mostly I/O-bound. Tests have shown that running a render or simulation job in parallel to a compositing job on the same physical machine saturates the network resources much better than running each type of job on dedicated individual machines.

So we advise running two Slaves on each machine. In Deadline 7 and 8, multiple Slaves on the same OS instance use only one Deadline license, so you can launch as many Slaves as your hardware allows, but 2 should be enough in your case. You can then add the first Slave of each machine to a “Maya” group and the second Slave to a “Nuke” group. You can also assign them to Pools as needed to juggle the priorities of multiple projects.

When a Nuke job is submitted and set to the Nuke group, it will only run on the second Slaves in parallel to the Maya-rendering first Slaves of every machine. Obviously the Maya render jobs will get a slight performance hit while a Nuke job is sharing CPU resources with them, and the Nuke jobs might take a minute instead of 30 seconds while sharing resources with the Maya rendering job, but nothing will ever get interrupted and you will saturate your bandwidth and CPU resources much better in the long run.

At least that’s the theory. Let us know if you have any concerns…