Frost V-Ray instancing issue?

celluloidvfx · September 17, 2016, 2:42pm

Hi guys,

we’re just setting up a scene with Frost’s new V-Ray instancing option.
It’s giving errors on the farm for some frames/machines.
Not sure if this has anything to do with Frost at all, but I thought I’d rather post it sooner than later, as you’re approaching the official release.

Attached is the Deadline archive of the job.

Cheers,
Holger
christiansm__3dsmax__rnd_rats_pflow_frost_ornatrix_v005_csm_random_anim__57dbf0b458504722bc3280d8.zip (1.84 MB)

paul · September 19, 2016, 5:38pm

Thank you for your report!

All of the errors I looked at seem to be caused by running out of memory. It’s not obvious to me why this is happening, so we’ll need to look into this.

Would it be possible for you to please send us a copy of your point cache (rat_run_cycle.pc2)?

celluloidvfx · September 19, 2016, 5:54pm

Hi Paul,

sure (attached).

We actually figured that it’s a memory issue as well. And it’s most likely neither Frost’s nor Deadline’s fault.
It’s almost certainly because we’re offsetting the instances, so it’s not just one geo being instanced but quite a few different ones.
Fortunately, we apparently have some machines which are able to handle the job so we’ll limit it to those.
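To illustrate why offsetting the instances can multiply memory use, here is a rough sketch. It assumes the point cache loops over a fixed number of frames and that each instance samples it at its own integer frame offset; the function names, the looping behaviour, and the per-mesh size parameter are all hypothetical and say nothing about how Frost or V-Ray actually store instances internally.

```python
def unique_cache_frames(render_frame, offsets, cycle_length):
    """Return the set of distinct cache frames needed at one render frame.

    Assumption: instance i samples cache frame (render_frame + offset_i),
    wrapped around a looping cache of `cycle_length` frames.
    """
    return {(render_frame + offset) % cycle_length for offset in offsets}


def estimated_geometry_bytes(unique_count, bytes_per_mesh):
    """Very rough estimate: one full mesh copy per distinct cache frame."""
    return unique_count * bytes_per_mesh


# No offsets: every instance shares the same mesh.
shared = unique_cache_frames(10, [0] * 1000, cycle_length=40)

# One instance per possible offset: every frame of the cycle is needed at once.
spread = unique_cache_frames(10, range(40), cycle_length=40)

print(len(shared), len(spread))
```

Under these assumptions, the geometry held in memory scales with the number of distinct offsets in use rather than with the number of instances, which would fit the observation that this is "not just one geo being instanced".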

Apart from that, I’m a bit surprised that one of the machines (cell-ws-27) was able to render it despite having only 32GB of RAM. The machines that failed also have 32GB, and the other ones that rendered successfully have 64GB. Does this depend on the number of cores/threads? If so, would it be possible to add an option to Frost for reducing the thread count, and thus the RAM usage, so we could render those scenes on the other machines as well, even if that would probably be much slower?

Cheers,
Holger
rat_run_cycle.zip (2.94 MB)

paul · September 19, 2016, 11:22pm

Thanks!

I imagine this is a rather common case, so it would be worthwhile for us to try to make it more efficient. I’ll add that to our wish list.

Yes, that’s surprising to me too. I wonder if that workstation (cell-ws-27) might be using a different version of V-Ray or Frost?

I think that part of the instancing code is single-threaded, but it’s possible that V-Ray is multi-threading something.

celluloidvfx · September 20, 2016, 2:04pm

For us, it definitely is fairly common. So if you guys can do something on Frost’s end to improve this that’d be great.

I just double-checked and the versions are all the same. The hardware is of course different.
The machines which rendered the scene successfully are dual-Xeon with 8 cores (16 threads hyperthreading) on each CPU → 32 threads and 64GB RAM (cell-rs-17 to cell-rs-22).
The machines which failed are:
a) single-CPU i7-2600K (4 cores / 8 threads) with 32 GB (cell-rs-09 to cell-rs-16)
b) single-CPU i7-5820K (6 cores / 12 threads) with 32 GB (cell-rs-23 to cell-rs-30)
The exception is cell-ws-27, with an i7-5960X (8 cores / 16 threads) and 32 GB, which rendered the job successfully.
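As a quick sanity check on the thread-count idea, the figures quoted in this thread can be turned into RAM per hardware thread. This is only arithmetic on the listed specs (with 32 GB for cell-ws-27, as stated earlier in the thread), not a measurement of what the renderer actually allocates; the dictionary layout and function name are just for illustration.

```python
# RAM and hardware-thread counts as quoted in this thread.
machines = {
    "cell-rs-17..22 (dual Xeon)": {"ram_gb": 64, "threads": 32, "rendered": True},
    "cell-rs-09..16 (i7-2600K)": {"ram_gb": 32, "threads": 8, "rendered": False},
    "cell-rs-23..30 (i7-5820K)": {"ram_gb": 32, "threads": 12, "rendered": False},
    "cell-ws-27 (i7-5960X)": {"ram_gb": 32, "threads": 16, "rendered": True},
}


def ram_per_thread(spec):
    """GB of installed RAM available per hardware thread."""
    return spec["ram_gb"] / spec["threads"]


for name, spec in machines.items():
    status = "rendered" if spec["rendered"] else "failed"
    print(f"{name}: {ram_per_thread(spec):.2f} GB/thread, {status}")

# Compare the two groups directly.
failed = [ram_per_thread(s) for s in machines.values() if not s["rendered"]]
succeeded = [ram_per_thread(s) for s in machines.values() if s["rendered"]]
print(min(failed) > max(succeeded))
```

If memory use grew with the number of render threads, the machines with the least RAM per thread would be expected to fail first. With these numbers it is the other way round: the failing groups have more RAM per thread than the ones that rendered, so thread count alone does not obviously explain the pattern.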

Not sure what to conclude here…

V-Ray is definitely multi-threading. Might make sense to test the job rendering only on e.g. 4 threads to see if that makes any difference. Is there a way to submit it to Deadline with the option of using 4 threads max?

Cheers,
Holger

paul · September 20, 2016, 5:55pm

I think you’ll need to set the Deadline Slave’s CPU Affinity.

Note, if you are using a version of Deadline older than 8.0.8.0 or 8.1.4.1: apparently setting the CPU Affinity didn’t work in all cases. If it doesn’t work for you, please see this thread. The last post in that thread includes a script that fixes the problem.

celluloidvfx · September 20, 2016, 6:06pm

We’re indeed still on Deadline 7.2. I’ll check that post to work around the issue then.
So do you also think it makes sense to test this and see whether it’ll make any difference?

paul · September 21, 2016, 1:27am

Frankly, I’d be surprised if it makes a difference, but I’ve been surprised many times before. I tested this locally, and the relevant code was running single-threaded. It’s conceivable that V-Ray is multi-threading something else, or multi-threading the same code in some cases, but that doesn’t seem to explain the pattern that you saw.

Looking at the RamPeakPer entries in the log, I’m wondering if the job just barely fit into memory on cell-ws-27. Maybe it has some different background programs running, giving it that extra little bit of memory it needs to work?

celluloidvfx · September 21, 2016, 9:52am

I’m testing this right now with the machines that were failing before. I set those Slaves to only use 2 of the available 12 threads. Let’s see, maybe you’ll be surprised once more

The machine ws-27 is definitely no different from the other machines in terms of OS, processes, programs, and software version(s). We keep the machines in the render farm as homogeneous as possible, so I can rule that out with almost 100% certainty.

celluloidvfx · September 21, 2016, 6:06pm

OK, no surprises for you this time!
Same issue even with just 2 threads.
I’m still curious why that one machine is able to render the job…