We’re just setting up a scene with Frost’s new V-Ray instancing option.
It’s giving errors on the farm for some frames/machines.
I’m not sure this has anything to do with Frost at all, but I thought I’d rather post it sooner than later, as you’re approaching the official release.
All of the errors I looked at seem to be caused by running out of memory. It’s not obvious to me why this is happening, so we’ll need to look into this.
Would it be possible for you to please send us a copy of your point cache (rat_run_cycle.pc2)?
We actually figured that it’s a memory issue as well. And it’s most likely neither Frost’s nor Deadline’s fault.
It’s almost certainly coming from the fact that we’re offsetting the instances, so it’s not just one geo being instanced but quite a few more.
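To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes (this is a guess, not something confirmed for Frost or V-Ray) that every distinct animation offset needs its own copy of the mesh in memory. The function name, the 50 MiB mesh size and the offset counts are all made up for illustration.

```python
def estimate_instance_geometry_bytes(bytes_per_mesh, instance_offsets):
    """Rough estimate, assuming one mesh copy per distinct animation offset."""
    unique_offsets = set(instance_offsets)
    return bytes_per_mesh * len(unique_offsets)


# Hypothetical numbers, for illustration only.
mesh_bytes = 50 * 1024**2  # assumed: 50 MiB per evaluated mesh

# 1000 instances that all share the same offset: one mesh copy.
no_offset = estimate_instance_geometry_bytes(mesh_bytes, [0] * 1000)

# 1000 instances spread over 40 distinct offsets: 40 mesh copies.
with_offsets = estimate_instance_geometry_bytes(
    mesh_bytes, [i % 40 for i in range(1000)]
)

print(no_offset // 1024**2, "MiB without offsets")
print(with_offsets // 1024**2, "MiB with 40 distinct offsets")
```

Under that assumption the geometry footprint grows with the number of distinct offsets rather than with the number of instances, which would be consistent with the job only fitting on some machines.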
Fortunately, we apparently have some machines which are able to handle the job so we’ll limit it to those.
Apart from that, I’m a bit surprised that one of the machines (cell-ws-27) was able to render it despite having only 32 GB of RAM. The machines that failed also have 32 GB, and the other ones that rendered successfully have 64 GB. Does this depend on the number of cores/threads? If so, would it be possible to add an option to Frost for reducing the thread count, and thus the RAM usage, so we could render those scenes on the other machines as well, even if that would probably be much slower?
For us, it definitely is fairly common. So if you guys can do something on Frost’s end to improve this that’d be great.
I just double-checked and the versions are all the same. The hardware is of course different.
The machines which rendered the scene successfully are dual-Xeon with 8 cores (16 threads with Hyper-Threading) on each CPU, so 32 threads in total, and 64 GB RAM (cell-rs-17 to cell-rs-22).
The machines which failed are:
a) single-CPU i7-2600K (4 cores / 8 threads) with 32 GB (cell-rs-09 to cell-rs-16)
b) single-CPU i7-5820K (6 cores / 12 threads) with 32 GB (cell-rs-23 to cell-rs-30)
The exception is cell-ws-27, with an i7-5960X (8 cores / 16 threads) and 32 GB, which rendered successfully.
Not sure what to conclude here…
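One quick way to look at those numbers is RAM per hardware thread. The small Python sketch below uses only the thread counts and RAM sizes listed above; the list layout and function name are just for illustration.

```python
# (machines, hardware threads, RAM in GB, outcome) as listed in this thread.
machines = [
    ("cell-rs-17 to cell-rs-22", 32, 64, "rendered"),
    ("cell-rs-09 to cell-rs-16", 8, 32, "failed"),
    ("cell-rs-23 to cell-rs-30", 12, 32, "failed"),
    ("cell-ws-27", 16, 32, "rendered"),
]


def ram_per_thread_gb(threads, ram_gb):
    """RAM available per hardware thread, in GB."""
    return ram_gb / threads


for name, threads, ram_gb, outcome in machines:
    per_thread = ram_per_thread_gb(threads, ram_gb)
    print(f"{name}: {per_thread:.2f} GB/thread, {outcome}")
```

The machines that failed actually have the most RAM per thread (4.00 and 2.67 GB versus 2.00 GB on the ones that rendered), so memory use growing with the thread count would not, on its own, explain which machines failed.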
V-Ray is definitely multi-threading. Might make sense to test the job rendering only on e.g. 4 threads to see if that makes any difference. Is there a way to submit it to Deadline with the option of using 4 threads max?
Note, if you are using a version of Deadline older than 8.0.8.0 or 8.1.4.1: apparently setting the CPU Affinity didn’t work in all cases. If it doesn’t work for you, please see this thread. The last post in that thread includes a script that fixes the problem.
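For reference, a CPU affinity restriction is commonly expressed as a bitmask with one bit per logical processor. The Python sketch below only illustrates what "the first 4 of 12 threads" looks like as such a mask. It is not Deadline's API, and the function name is hypothetical.

```python
def affinity_mask(max_threads, total_threads):
    """Bitmask selecting the first `max_threads` logical processors."""
    if not 1 <= max_threads <= total_threads:
        raise ValueError("max_threads must be between 1 and total_threads")
    return (1 << max_threads) - 1


# Example: allow only 4 of the 12 threads on an i7-5820K.
mask = affinity_mask(4, 12)
allowed_cpus = [i for i in range(12) if (mask >> i) & 1]

print(bin(mask))
print(allowed_cpus)
```

Restricting a render this way would only lower memory use if the memory-hungry part of the job really does scale with the number of threads, which is exactly what the test is meant to find out.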
We’re indeed still on Deadline 7.2. I’ll check that post to work around the issue then.
So do you also think it makes sense to test this and see whether it’ll make any difference?
Frankly, I’d be surprised if it makes a difference, but I’ve been surprised many times before. I tested this locally, and the relevant code was running single-threaded. It’s conceivable that V-Ray is multi-threading something else, or multi-threading the same code in some cases, but that doesn’t seem to explain the pattern that you saw.
Looking at the RamPeakPer entries in the log, I’m wondering if the job just barely fit into memory on cell-ws-27. Maybe it has some different background programs running, giving it that extra little bit of memory it needs to work?
I’m testing this right now with the machines that were failing before. I set those Slaves to use only 2 of the available 12 threads. Let’s see, maybe you’ll be surprised once more.
The machine cell-ws-27 is definitely no different from the other machines in terms of OS, processes, programs, or software versions. We keep the machines in the render farm as homogeneous as possible, so I can rule that out with almost 100% certainty.