Non-homogeneous hardware

Is there a way to set up my farm so that the fastest machines pick up jobs first?
Perhaps, all other things being equal, send jobs to the nodes with the lowest normalized render time first?

Hello,

There is no way to do that in Deadline, as we have no functionality for holding a slave back (for instance, one of your slow machines) in favor of your fast machines. We can definitely have internal conversations about this, but I don’t know whether it is feasible or, if it is, when such a feature would make it in.

I am sure it would be more complex than this, but I would think the logic could work something like this:

The job scheduling order creates an “eligible node list” based on the user settings in the job scheduling order. Deadline then takes the nodes in that list, sorts them by normalized render time constant, and sends jobs to the machines with the lowest normalized render time first.
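
A rough sketch of that logic in Python, purely as an illustration: `Node`, `normalized_render_time`, and `dispatch_order` are names made up for this post, not part of Deadline's API, and the numbers are invented.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical record; Deadline does not expose a class like this.
    name: str
    normalized_render_time: float  # lower means a faster machine
    eligible: bool = True          # outcome of the normal scheduling rules


def dispatch_order(nodes):
    """Return the eligible nodes, lowest normalized render time first."""
    eligible = [n for n in nodes if n.eligible]
    return sorted(eligible, key=lambda n: n.normalized_render_time)


farm = [
    Node("mid-01", 1.6),
    Node("fast-01", 1.0),
    Node("slow-01", 3.2, eligible=False),  # filtered out by the job settings
    Node("fast-02", 1.1),
]
print([n.name for n in dispatch_order(farm)])
```

The eligibility filter stands in for whatever the existing scheduling order already decides; the only new step is the sort.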

thanks,

Hi,
This is indeed walking into a much bigger discussion, one that involves the entire Deadline scheduling architecture! Essentially, the current render-time/load-time multiplier exists specifically to allow a job's tasks to be timed out automatically when they run longer than the counterpart tasks either side of them, taking into account individual machine specs (via the slave multiplier values). Trying to hack something together on top of this will be a slippery slope…

An alternative bit of hackery might be to submit your jobs to a “pool” containing only the “fast” machines and then, after a period of time, edit the job so that it is assigned to another “pool” covering a wider range of machine specs. This could potentially be automated via an event plugin, a script, or deadlinecommand + cron. Alternatively, our “primary” and “secondary” pool system could be used as a “first wave” followed by a “second wave” of job de-queuing, although it’s not quite the same thing :slight_smile:
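
To make the pool-switching idea a little more concrete, here is a minimal sketch of just the decision a cron-driven script would make. Everything in it is an assumption for illustration: the pool names, the 30-minute grace period, and the `pool_for_job` helper are invented, and the step that actually edits the job in Deadline is deliberately left out.

```python
from datetime import datetime, timedelta

FAST_POOL = "fast"       # hypothetical pool holding only the fast machines
WIDE_POOL = "everyone"   # hypothetical pool covering the whole farm
GRACE = timedelta(minutes=30)  # assumed head start given to the fast machines


def pool_for_job(submitted_at, now, grace=GRACE):
    """Return the pool a job should be assigned to at time `now`."""
    if now - submitted_at < grace:
        return FAST_POOL
    return WIDE_POOL


submitted = datetime(2014, 1, 1, 9, 0)
print(pool_for_job(submitted, datetime(2014, 1, 1, 9, 10)))  # inside the grace period
print(pool_for_job(submitted, datetime(2014, 1, 1, 9, 45)))  # grace period is over
```

A script run every few minutes could apply this check to each queued job and move any job whose pool no longer matches.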

Looking at the bigger picture, this is something we want to ‘solve’ in Deadline v8.0 as we want to consider machines with different specs, different amounts of RAM or different spec GPU cards installed as just a few examples. Unfortunately, this is not a trivial problem to solve. However, we have had some brilliant feedback/input/ideas of late from other studios and we most warmly welcome any other ideas/concepts to be considered for further down the road!

Cheers,
Mike

This isn't anything that is killing us. I was just looking at the farm and it popped into my head that it would be nice to prioritize our best hardware.

I am excited you guys are thinking about this stuff, and I am double excited you're open enough to share this info.

thanks!

Full disclosure - Deadline newbie here, about to purchase, and have been using the trial for a month. So take my thoughts as less expert but certainly fresh to the product.

This feature is one that as a newbie, you cannot believe is absent in a mature product. Backburner has it, but it is a brain-dead automatic solution that supposedly sends frames to the fastest machine first. This fails because the performance index is easily and almost always confused, and you can’t manually override.

Do we have to overthink it? The last thing I want is a “smart” solution, which will muck up and need handholding. I just want a MANUAL machine priority list for my slaves. I only have a handful, and I render large/long Archviz still images, not animations. So frame times vary wildly, and there is a large penalty for the slowest box picking up the last frame while faster machines sit idle. All I want is for the dual-Xeons to have first crack at any frame. The single-socket hexcores should only be rendering if there are no idle dual-socket machines.

Right now, I can only do this with pools, but then I have used up that feature and cannot use it to balance jobs.

Would a manual slave priority list be hard to implement?
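
To be concrete about the rule I mean, here is a small Python sketch. It assumes something with a view of the whole farm knows which machines are idle; the machine names and the `may_pick_up` function are made up for illustration and are not Deadline features.

```python
# Manual priority list, best hardware first (hypothetical machine names).
PRIORITY = ["dualxeon-01", "dualxeon-02", "hexcore-01", "hexcore-02"]


def may_pick_up(node, idle_nodes, priority=PRIORITY):
    """A node may take a frame only if no higher-priority node is idle."""
    rank = priority.index(node)
    higher = priority[:rank]
    return not any(other in idle_nodes for other in higher)


idle = {"dualxeon-02", "hexcore-01"}
print(may_pick_up("dualxeon-02", idle))  # nothing ranked above it is idle
print(may_pick_up("hexcore-01", idle))   # dualxeon-02 is idle, so hold back
```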

Hi,
Thanks for the feedback.

The main issue here is that slaves work independently, and aren’t aware of what the others are doing. So one slave has no knowledge if other slaves are looking for jobs, and therefore they can’t coordinate if one slave should skip a task because there is currently a “faster” slave that is already looking for work.

The fact that the slaves look for jobs asynchronously also plays a role. For example, all the slaves are rendering, and the slow slave finishes its task. It then grabs the last frame of the job because it’s the only slave currently looking for work. A fast slave then finishes a minute later and has nothing to do. What should the behaviour be here? Should the slow slave never have picked up the task in the first place? It couldn’t know when the fast slave would finish. Should the slow slave give up its task? What if it has already been rendering for 30 minutes?
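
The trade-off in that example can be put into numbers. The sketch below is only a back-of-the-envelope model with invented figures (a last frame that takes 30 minutes on the fast slave, and a slow slave that is twice as slow); `last_frame_finish_times` is a hypothetical helper, not anything in Deadline.

```python
def last_frame_finish_times(slow_free_at, fast_free_at, fast_frame_minutes,
                            slow_multiplier):
    """Return (slow_grabs_now, wait_for_fast) finish times in minutes."""
    # Option 1: the slow slave takes the last frame as soon as it is free.
    slow_grabs_now = slow_free_at + fast_frame_minutes * slow_multiplier
    # Option 2: the frame is held back until the fast slave becomes free.
    wait_for_fast = fast_free_at + fast_frame_minutes
    return slow_grabs_now, wait_for_fast


# Slow slave is free at minute 0; the fast slave frees up one minute later.
greedy, patient = last_frame_finish_times(0, 1, 30, 2.0)
print(greedy, patient)
```

With these made-up numbers, holding the frame back would finish the job roughly half an hour sooner, but at minute 0 the slow slave has no way of knowing that the fast slave will be free at minute 1.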

It’s not a “simple” problem given Deadline’s architecture, and that’s why we haven’t implemented it. However, taking into account what I said in my previous reply, there are plans to tackle this and the performance index feature in a future version. Are you interested in joining the v7.0 private beta program? I think it might be an eye-opener for you on the other features we are currently working on :wink: (Send an email request to beta [@] thinkboxsoftware.com for the NDA to be signed, etc.)

Regards,
Mike