multiple slaves - dynamic startup

LaszloSebo · October 4, 2013, 4:40pm

We would like to start a discussion about making the ‘multiple slaves on one box’ feature a bit more intelligent. What we want is to utilize our farm as much as possible. Most render jobs spend a large percentage of their time opening Max, syncing plugins / textures, prepping the scene, browsing textures etc., and only then do the actual render. In these semi-idle periods, a second slave could work through smaller tasks, such as quicker Nuke renders, increasing our utilization.

However, there is no way for us to tell the second slave to wait until the machine’s CPU has been sufficiently idle for a certain amount of time, and only pick up jobs then.

We are thinking about writing a tool that gets triggered every couple of minutes, looks at CPU usage, and starts a second slave if usage is below a threshold; the second slave would then somehow shut itself down once the first slave is making the machine ‘busy enough’. But we would rather avoid hacking something like this in ourselves when Deadline could do it much better.
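To make the idea concrete, here is a rough sketch of just the decision step such a tool might use. Everything in it is an assumption for illustration: the function name `decide`, the two threshold values, and the idea of averaging a few recent CPU readings. How the readings are collected and how the second slave is actually started or stopped is left out on purpose. Note that while the second slave is running, total CPU usage includes its own work, so the stop threshold would need tuning.

```python
from statistics import mean


def decide(cpu_samples, second_running, start_below=40.0, stop_above=85.0):
    """Return 'start', 'stop', or 'hold' for the second slave.

    cpu_samples    -- recent total CPU usage readings in percent (hypothetical input)
    second_running -- whether the second slave is currently running
    start_below    -- start the second slave when average usage is under this
    stop_above     -- stop the second slave when average usage is over this

    The gap between the two thresholds is deliberate: it keeps the
    second slave from being started and stopped on every check.
    """
    if not cpu_samples:
        # No data yet, so do not change anything.
        return "hold"

    avg = mean(cpu_samples)

    if not second_running and avg < start_below:
        return "start"
    if second_running and avg > stop_above:
        return "stop"
    return "hold"


if __name__ == "__main__":
    # Machine mostly idle while the first slave loads its scene.
    print(decide([12.0, 18.0, 25.0], second_running=False))
    # First slave is now rendering flat out.
    print(decide([92.0, 97.0, 95.0], second_running=True))
```

Averaging several readings rather than acting on a single one is meant to avoid reacting to short spikes or dips.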

Any ideas?

rrussell · October 4, 2013, 6:31pm

Starting/stopping slaves based on system resource usage is an interesting idea, but I’m curious whether running a Nuke job (for example) while a Max job is syncing plugins or loading textures would be ideal. When Max is in this low-CPU state, it’s probably doing a lot of IO over the network, and aren’t Nuke jobs typically more IO intensive than CPU intensive? If that’s the case, running the Nuke job could slow down Max’s ability to pull data from the network, and thus slow down the Max job overall. I could be wrong here, but I’m just curious if you guys have run benchmark tests to see if this is the case.

Would using affinity be an option here? For example, what if the Nuke slave’s affinity was set to 1, and the affinity for the 3dsmax slave was set to N or N-1, where N is the number of cores the machine has. If you set it to N-1, the Max job would lose a core for rendering, but then the Nuke jobs would always have 1 core that they could be processed with.
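As a small illustration of the N-1 split, the sketch below works out which core indices each slave would get and the matching bitmask. The helper names `split_affinity` and `to_mask` are made up for this example, and assigning the lowest-numbered core to the Nuke slave is just an assumption; it does not touch any Deadline setting.

```python
def split_affinity(n_cores, nuke_cores=1):
    """Split core indices between a Nuke slave and a 3dsmax slave.

    Returns (nuke, max_) as two lists of core indices that do not
    overlap. With the default, the Nuke slave gets 1 core and the
    3dsmax slave gets the remaining N-1.
    """
    if n_cores < 2:
        raise ValueError("need at least 2 cores to split")
    if not 0 < nuke_cores < n_cores:
        raise ValueError("nuke_cores must leave at least one core for Max")

    nuke = list(range(nuke_cores))            # lowest-numbered cores
    max_ = list(range(nuke_cores, n_cores))   # everything else
    return nuke, max_


def to_mask(cores):
    """Turn a list of core indices into an affinity bitmask (bit i = core i)."""
    mask = 0
    for core in cores:
        mask |= 1 << core
    return mask


if __name__ == "__main__":
    nuke, max_ = split_affinity(8)
    print("Nuke slave cores:", nuke, "mask:", hex(to_mask(nuke)))
    print("Max slave cores: ", max_, "mask:", hex(to_mask(max_)))
```

For the other variant mentioned above, where the 3dsmax slave keeps all N cores, its list would simply be every core index and the two sets would overlap on the Nuke core.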

cbond · October 4, 2013, 6:45pm

Laszlo - it’s our goal over Deadline 7 and 8 to allow the fine control you want in Deadline … but some of that requires a concerted effort across the team. it’s on our roadmap!

otoh, we did a lot of testing on multicore machines and the benefit is a lot less than you think. more machines trumps trying to do more on the same machines every time.
of course, i’m not saying that there isn’t SOME benefit - it’s mostly an IO and RAM issue. if you start swapping RAM between processes to disk - it’s way worse. even swapping RAM between processors [and not to disk] slows you down.

anyway, it’s a big research project of ours to figure out and optimise these things, and give [more] flexibility in how the farm works.

cb