AWS Thinkbox Discussion Forums

Prioritising Slaves

Hey all. In theory this seems basic enough that maybe I’ve missed it, or it’s just not how other people run their render farms.

Is there a way to prioritise individual slaves so that our more powerful machines get preference when taking jobs? We currently use pools and groups to manage the priorities of jobs and the order they’re picked up in. However, the way we’ve arranged things, we still want all the slaves to have these pools (in different arrangements) so that they can all take care of any job. Ideally, the weaker machines would defer to the stronger ones.

Is there a way to do this, or is it counter to how the slaves pick up jobs?

I feel like this could best be accomplished using the “Pool, Priority, First-in First-out” scheduling order.
You make a pool of all your super machines as pool (A), and then all the other machines as pool (B).

Then for your jobs, you make pool (A) the primary and pool (B) the secondary.
This means that no matter what the priority, a job will always pull from the super machines first and will only use the other machines if there are no super machines left. You can still use groups to define particular machine criteria, e.g. if you want a job to only go to the super machines that use AMD processors.

The order of the pools on a machine also makes a difference, so if your super machines are added to pool (A), it should be the first entry in their pool list.
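To illustrate why the position of a pool in a machine's pool list matters, here is a toy Python model of a "pool, then priority, then first-in first-out" ordering. It is not Deadline code and does not call any Deadline API: `Job`, `dequeue_order`, the pool names and the job names are all made up for the example, and secondary pools are left out to keep it short.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    pool: str        # pool the job was submitted to
    priority: int    # higher number = more urgent (assumption of this sketch)
    submitted: int   # smaller number = submitted earlier


def dequeue_order(worker_pools, jobs):
    """Order the jobs one worker may take: pool rank, then priority, then FIFO."""
    # Position in the worker's pool list decides which pool it serves first.
    rank = {pool: i for i, pool in enumerate(worker_pools)}
    eligible = [job for job in jobs if job.pool in rank]
    return sorted(eligible, key=lambda j: (rank[j.pool], -j.priority, j.submitted))


jobs = [
    Job("beauty", "A", 50, 1),
    Job("preview", "B", 90, 2),
    Job("fx", "A", 50, 0),
]

# A fast machine lists pool A first, so it drains pool A before touching pool B,
# even though the pool B job has a higher priority.
fast_order = [j.name for j in dequeue_order(["A", "B"], jobs)]

# A slow machine that is only in pool B never sees the pool A jobs.
slow_order = [j.name for j in dequeue_order(["B"], jobs)]
```

In this model the fast machine works through "fx", "beauty" and then "preview", while the slow machine only ever sees "preview".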

Does this help?

Cheers
Kym

It’s actually a really hard problem to solve when your dequeue logic is built on a distributed shared-nothing architecture, as Deadline’s is.

The problem is a Slave doesn’t have any concept that it is faster or slower than another, and will essentially look for work at all times (with the benefit that it’s hard to break the farm). Pools allow you to say a job should hit the faster machines first, but if you only care that fast machines pick up before the slow ones at all times, that’s not going to work well. It’s usually a noticeable issue when idle farms pick up a new job.

The only workaround I’ve found is to make use of power management and define a specific startup order. As machines go idle they’ll shut off, and when machines are needed again, the fastest ones start first. That’s not entirely ideal, since a slow machine that is still running may take a job while a fast machine remains offline.
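A rough sketch of that idea and its caveat, in plain Python. This is not how Deadline's power management is implemented; `machines_to_wake`, the machine names and the speed scores are invented for the example, and it assumes one task per machine.

```python
def machines_to_wake(pending_tasks, running, offline):
    """Return the offline machines to start, fastest first.

    running: names of machines that are already on and idle.
    offline: dict of machine name -> speed score (higher = faster).
    """
    # Machines that are already running absorb work first, whatever their speed.
    shortfall = max(0, pending_tasks - len(running))
    fastest_first = sorted(offline, key=lambda name: offline[name], reverse=True)
    return fastest_first[:shortfall]


offline = {"node-fast-01": 3.5, "node-fast-02": 3.2, "ws-02": 0.9}

# Nothing is running: the two fastest nodes are woken for two tasks.
cold_start = machines_to_wake(2, [], offline)

# A slow workstation is still on: it takes one task, so only one fast node wakes
# and the other fast node stays offline.
with_slow_running = machines_to_wake(2, ["ws-01"], offline)
```

The second call shows the downside described above: the slow machine that never shut off still gets work ahead of a faster one.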

It’s a hard problem without a master controller to hold up the slower machines, and it’s come up many times in the past. It’s also tricky to decide when enough fast machines have picked up so slow machines can go. What if the fast machines were removed from the rack for example?

Here are some other threads:
forums.thinkboxsoftware.com/vie … 11&t=14592
forums.thinkboxsoftware.com/vie … 11&t=13784
forums.thinkboxsoftware.com/vie … 11&t=13567
forums.thinkboxsoftware.com/vie … 11&t=12228
forums.thinkboxsoftware.com/vie … 11&t=15959

That’s what I was concerned about, it seems simple but that doesn’t mean it’s simple to implement.

I’ll look through those threads, thanks! I had forgotten that secondary pools exist, so I’ll read up on them in the thread eamsler linked. Thanks kwatts!

Also, I should say that for now my way of mitigating this has been to limit the concurrent tasks on my slower machines, so that even when they do pick up jobs first, they take on less work at a time and leave the rest for the more powerful ones, which take big chunks of work at a time.
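As a back-of-the-envelope illustration of that mitigation (a toy model only, with made-up machine names and limits, not anything taken from Deadline itself):

```python
def assign_round(task_count, limits):
    """Hand out tasks once, in the order the workers appear in `limits`.

    limits: dict of worker name -> concurrent task limit.
    Returns (tasks taken per worker, tasks still unassigned).
    """
    remaining = task_count
    taken = {}
    for worker, limit in limits.items():
        grab = min(limit, remaining)
        taken[worker] = grab
        remaining -= grab
    return taken, remaining


# Worst case for this sketch: the slow workstations reach the queue first.
limits = {
    "ws-slow-01": 1,
    "ws-slow-02": 1,
    "node-fast-01": 4,
    "node-fast-02": 4,
}
taken, left_over = assign_round(8, limits)
```

Even with the slow machines going first, they only hold 2 of the 8 tasks here; the fast nodes end up with the other 6.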

It would be great if prioritizing slaves was implemented in Deadline! My company would really love to get this for our Redshift rendering, so we could use all GPUs but prioritize the computers with multiple GPUs for concurrent frame rendering. Thanks!

I’ll add a +1 for prioritizing slaves.

We have both renderNodes and workstations in our rendering pool, but the workstations tend to take 3 times longer to render than the renderNodes.
When the queue is not full, it’s a bit disturbing to have workstations taking jobs while renderNodes stay idle :)

Thanks for the +1 pingus!

One interesting workaround for when you have more nodes than work to do: the power management tool in Deadline does have a wake-up order. If you’re able to make use of power management, it works fairly well. It won’t work for pingus’ workstations, but it should both save power and make sure the fast machines are the first to be woken up to start working.
