AWS Thinkbox Discussion Forums

HOUDINI + REDSHIFT keeps crashing with GPU affinity

Hi, I am using a remote render node service called iRender.

I can basically rent a computer with multiple GPUs and set it up like a workstation (with Deadline and all that).

+COMPUTER SPEC
CPU : AMD Ryzen Threadripper PRO 3955WX
RAM : 120GB
GPU : RTX3090 x 4

+RENDER SETUP
4 concurrent tasks, each with GPU affinity set to 1 GPU, so I have 4 render tasks running on one GPU each.

+PROBLEM
The render tasks launch and reach the frame-rendering phase. They render a few frames and then error out with the messages below.
This happens somewhat randomly: some tasks error out immediately, others render a few frames first and then error out.

If I submit a task with a bit of geometry, it will likely crash immediately.

When I check the log, each task is using a single but different GPU, as expected (task 0 + GPU 0, task 1 + GPU 1, etc.).

I have attached the error message as a screenshot:

Failed to allocate mem (152176896 bytes)

[RaiseException] c:\Windows\system32\KERNELBASE.dll
[CxxThrowException] C:\programData\redshift\plugins\Houdini\18.5.499\dso\redshifthoudini.dll
Failed to allocate mem (152176896 bytes)
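For scale, here is a quick conversion of the failed allocation size from the log above. This is plain arithmetic, nothing Redshift-specific, and the helper name `bytes_to_mib` is just made up for illustration:

```python
# Size of the allocation that failed, copied from the log line above.
failed_bytes = 152176896

def bytes_to_mib(n):
    """Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return n / (1024 * 1024)

print(f"{bytes_to_mib(failed_bytes):.1f} MiB")  # prints "145.1 MiB"
```

So the request that fails is only about 145 MiB, which is small next to the 24 GB of VRAM on an RTX 3090. The log does not say whether it was GPU memory or host memory that could not be allocated.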

I have been trying to fix this for days, while paying the rental fee.

Any help would be really appreciated!


Adding more information on my tests.

To rule out a version-specific issue (with Houdini, Redshift, or Deadline), I uninstalled all of them, installed the latest versions (Houdini + Redshift + Deadline), and ran the same test. It crashes in exactly the same way.

In summary:

  1. Open Houdini, select all 4 RTX 3090s, and render locally: all 4 GPUs are utilized and it renders fine.

  2. From Houdini, submit to Deadline with 4 tasks and GPU affinity set to 1 GPU each: each task launches the renderer but crashes, either immediately (if the scene is heavy) or after a while.

  3. Tested with 2 tasks, 2 GPUs each: it still crashes.

  4. When checking the logs, Deadline seems to allocate the right GPU to each task, and each GPU does get utilized (I can see the CUDA usage jump as the render begins, but it crashes almost immediately with the memory error).
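To check item 4 more precisely, one option would be to log per-GPU memory on the render node while the tasks start up. Below is a minimal sketch, assuming `nvidia-smi` is on the PATH and reports plain integer MiB values for every GPU. The function names (`parse_gpu_memory`, `read_gpu_memory`, `watch_gpu_memory`) are hypothetical helpers, not part of Deadline or Redshift:

```python
import subprocess
import time

# Fields requested from nvidia-smi: GPU index, used memory, total memory.
QUERY = "index,memory.used,memory.total"

def parse_gpu_memory(csv_text):
    """Parse nvidia-smi CSV output into {gpu_index: (used_mib, total_mib)}."""
    usage = {}
    for line in csv_text.strip().splitlines():
        index, used, total = (field.strip() for field in line.split(","))
        usage[int(index)] = (int(used), int(total))
    return usage

def read_gpu_memory():
    """Run nvidia-smi once and return per-GPU memory usage in MiB."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_memory(result.stdout)

def watch_gpu_memory(samples=60, interval=1.0):
    """Print memory usage for every GPU once per interval, `samples` times."""
    for _ in range(samples):
        for idx, (used, total) in sorted(read_gpu_memory().items()):
            print(f"GPU {idx}: {used} / {total} MiB")
        time.sleep(interval)

# Example: start this just before the Deadline tasks pick up the job.
# watch_gpu_memory(samples=120, interval=0.5)
```

If one GPU's used memory is already near its total when a task errors out, that would point at VRAM. If all four stay well below their totals, host RAM or the page file might be worth checking instead.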

The error log is attached in the previous post.

Any insight will be much appreciated!
