Houdini + RS / Redshift SA: Performance tests and doubts

Hello!

I ran some performance tests comparing rendering a scene directly with Houdini + Redshift against Redshift Standalone loading the .rs files directly through Deadline. My main doubt right now is the following: is there any way or setting to get the same performance in Deadline RS SA as in Houdini + RS? I’m going to share my results; maybe somebody can clarify what is causing the time difference in each render:

Basic scene - 10frames:
RS Ref Path_v01.zip (60.0 KB)

PC specs:

x2 RTX 2080ti.
Houdini 18.0.566.
Redshift 3.0.28.
Windows 10.


Results:

  • Houdini + Redshift direct:

    • Total render time: 4m 38s.
    • Frame render time: ~21s.
  • Deadline + RS SA - 01:

    This method submitted the .rs sequence without changing anything.

    • Total render time: 8m 5s.
    • RS Log frame render time: ~29.4s.
    • Deadline frame render time column: ~42s.

  • Deadline + RS SA - 02:

    This one used both GPUs but only one task, with 10 frames per task.

    • Total render time: 6m 57s.
    • RS Log frame render time: ~29.6s.

  • Deadline + RS SA - 03:

    This last one used GPU affinity with 1 GPU per task: two tasks of 5 frames each.

    • Total render time: 5m 31s
    • RS Log frame render time: ~56.2s.


As you can see, the third test with Deadline RS SA was the closest to the Houdini render time, but there is still almost 1 minute of difference. As I said in another post, I’m not the most experienced Deadline user, but based on this, I have the following doubts:

  • Can this be improved even more?
  • Do I need to change other settings to reduce the render time?
  • Is using GPUs per task recommended?
  • Why is the performance different from rendering in Houdini directly? I read about the time it takes to open a render scene for each task, but I assumed test 2 would be closer to Houdini because it has only one task for 10 frames.

Hope someone can help me to understand and improve this.

Thanks a lot and cheers!

Could you please run one more test - take the command line from the Deadline Worker log, and run Redshift SA with the same .rs file OUTSIDE of Deadline. Post the times of that test. Let’s try to understand how much of the times you are seeing are Deadline-related and what is the pure Redshift SA time.

Hi @Bobo!

Sorry for the late response, I was looking for a script to render an .rs proxy sequence but didn’t find one. So I could only run the test with a single frame, and the average time was between ~28s and ~30s.

Curiously, the time is similar to the time in the Deadline log but still different from the time inside Houdini. If I multiply this time by 10, the total render time should be around 5 minutes (I can’t verify this without a proper script to render a sequence with rscmd), which is also different from the quickest Deadline test (the 3rd test).

What do you think? Let me know, thanks!

So from what we know so far,

  • A single GPU renders a frame from an .rs file in about 56 seconds.
  • Two GPUs render a frame in about 29 seconds. So it is nearly twice as fast as one GPU, which we would hope, but is not always true for Redshift.
  • The above is true for Redshift Standalone with or without Deadline, so it is what Redshift Standalone does.
  • Houdini renders the same frame in 21 seconds on 2 GPUs. So there is a difference of about 8 seconds between stand-alone rendering and rendering in the interactive session.
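
As a quick sanity check on the scaling claim, here is a minimal Python sketch that uses only the two log times quoted in this thread (56.2s on one GPU, 29.4s on two). The function name is made up for illustration.

```python
def scaling(one_gpu_s, n_gpu_s, n_gpus):
    """Return (speedup, efficiency) of an n-GPU render relative to a 1-GPU render."""
    speedup = one_gpu_s / n_gpu_s
    efficiency = speedup / n_gpus  # 1.0 would be perfect linear scaling
    return speedup, efficiency

# Times taken from the RS log values reported in tests 03 and 01.
speedup, efficiency = scaling(56.2, 29.4, 2)
print(f"speedup: {speedup:.2f}x, efficiency: {efficiency:.0%}")
```

That works out to roughly 1.91x, so the two cards are scaling close to linearly on this scene.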

You should post the actual log from the Deadline tests as a text file instead of a screenshot, so we can look at the exact timing of every single line. We need to find out where the 8 seconds are spent. My hunch is that it is time spent by Redshift Standalone to start, get a license, and load the RS file. But I could be wrong.

Here are the .txt logs from each Deadline test and also from an rscmdline render:

Render Logs.zip (48.6 KB)

Just to add one more point to your list:

  • In Test 01, the task log said that the render time was 29.4s, but the Render Time column of the task said ~41s / ~43s.

Here is a shortened version of the first log. Let’s see what we can learn from it:

=======================================================
Log
=======================================================
2020-09-14 12:59:25:  0: Loading Job's Plugin timeout is Disabled
2020-09-14 12:59:27:  0: Executing plugin command of type 'Sync Files for Job'
2020-09-14 12:59:27:  0: Plugin rendering frame(s): 1
2020-09-14 12:59:28:  0: Executing plugin command of type 'Render Task'
2020-09-14 12:59:28:  0: INFO: Process is now running
2020-09-14 12:59:30:  0: STDOUT: Redshift Command-Line Renderer (version 3.0.28 - API: 3023)
2020-09-14 12:59:31:  0: STDOUT: Initializing GPUComputing module (CUDA). Ordinal 0
2020-09-14 12:59:31:  0: STDOUT: Initializing GPUComputing module (CUDA). Ordinal 1
2020-09-14 12:59:32:  0: STDOUT: OptiX denoiser init...
2020-09-14 12:59:32:  0: STDOUT: Loading Redshift procedural extensions...
2020-09-14 12:59:32:  0: STDOUT: Loading: C:\Users\Agata\Desktop\render\RS Ref Path_v01.RS\RS Ref Path_v01.001.rs
2020-09-14 12:59:32:  0: STDOUT: License acquired
2020-09-14 12:59:32:  0: STDOUT: HID rehost=f96bfd190a59202ea8994d4b4029428f2f5ebf8b.0
2020-09-14 12:59:33:  0: STDOUT: 	Time to process all materials and shaders: 1.059111 seconds
2020-09-14 12:59:33:  0: STDOUT: Allocating GPU mem...(device 0)
2020-09-14 12:59:34:  0: STDOUT: 	Done (Allocator size: 6268 MB. CUDA reported free mem before: 8616 MB, after: 1079 MB)
2020-09-14 12:59:34:  0: STDOUT: Allocating GPU mem...(device 1)
2020-09-14 12:59:34:  0: STDOUT: Failed to leave 878 MB free (remainder: 223 MB). Trying again
2020-09-14 12:59:35:  0: STDOUT: 	Done (Allocator size: 7362 MB. CUDA reported free mem before: 8628 MB, after: 1044 MB)
2020-09-14 12:59:35:  0: STDOUT: Allocating GPU mem for ray tracing hierarchy processing
2020-09-14 12:59:35:  0: STDOUT: Irradiance point cloud...
2020-09-14 12:59:36:  0: STDOUT: 	Total irradiance point cloud construction time 1.1s
2020-09-14 12:59:36:  0: STDOUT: Rendering blocks... (resolution: 1920x1080, block size: 128, unified minmax: [16,256])
2020-09-14 13:00:01:  0: STDOUT: 	Processing blocks...
2020-09-14 13:00:01:  0: STDOUT: 	Time to render 135 blocks: 25.7s
2020-09-14 13:00:01:  0: STDOUT: Rendering time: 29.4s (2 GPU(s) used)
2020-09-14 13:00:02:  0: STDOUT: Saving: RS Ref Path_v01\RS Ref Path_v01.001.exr
2020-09-14 13:00:02:  0: STDOUT: Saving: C:\Users\Agata\Desktop\render\RS Ref Path_v01\RS Ref Path_v01.P.001.exr
2020-09-14 13:00:02:  0: STDOUT: Saving: C:\Users\Agata\Desktop\render\RS Ref Path_v01\RS Ref Path_v01.Z.001.exr
2020-09-14 13:00:02:  0: STDOUT: Saving: RS Ref Path_v01\RS Ref Path_v01.001.exr
2020-09-14 13:00:06:  0: STDOUT: License returned 
2020-09-14 13:00:07:  0: STDOUT: Saving: C:\Users\Agata\Desktop\render\RS Ref Path_v01\RS Ref Path_v01.proxy.001.exr
2020-09-14 13:00:07:  0: STDOUT: Shutdown Rendering Sub-Systems...
2020-09-14 13:00:08:  0: STDOUT: 	Finished Shutting down Rendering Sub-Systems
2020-09-14 13:00:08:  0: INFO: Process exit code: 0
2020-09-14 13:00:08:  0: Done executing plugin command of type 'Render Task'

The task starts at timestamp 12:59:25. The actual plugin job synchronization starts 2 seconds later. The render task starts at 59:28. The Redshift executable is launched at 59:30. So it takes about 5 seconds before Redshift Standalone is called. This is more or less Deadline Worker overhead.

Redshift then starts initializing, and at 59:32 it loads the scene. It takes another second to evaluate the shaders at 59:33. At 59:35 it starts the Irradiance Point Cloud calculations, which take 1.1s. So we have spent about 10 seconds so far before the actual raytracing begins.

At 59:36 it starts the rendering of 135 blocks, which takes 25.7 seconds on 2 GPUs and finishes at 13:00:01.

From 00:02 to 00:06 it saves the output, then returns the license. There is a proxy EXR save that takes another second before the Rendering Sub-System shutdown starts, which ends another second later at 00:08. At that same time, it finishes the task. So the task ends 7 seconds after Redshift reported 29.4 seconds of rendering.

This means about 10 seconds before rendering the blocks, 25.7 seconds rendering blocks, and about 7 seconds after rendering to save images, shut down and finish. This comes to 42.7, or roughly 43 seconds, which is in line with the ~42s shown in Deadline’s Render Time column.

The 29.4 seconds seems to be counted from the moment the scene is loaded (59:32) to the moment the blocks have been rendered (00:01). So the combined overhead of running Deadline and Redshift Standalone is about 13 seconds (42.7 minus 29.4).
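
If you want to repeat this breakdown on the other logs without reading them line by line, something like the following Python sketch could do it. It is only an illustration: the phase names and marker strings are my own choices based on the lines in the log above, the embedded excerpt is trimmed to the five lines that mark phase boundaries, and the timestamps only have one-second resolution, so each figure can be off by a second.

```python
from datetime import datetime

# Boundary lines copied from the Deadline task log quoted above.
LOG = """\
2020-09-14 12:59:25:  0: Loading Job's Plugin timeout is Disabled
2020-09-14 12:59:30:  0: STDOUT: Redshift Command-Line Renderer (version 3.0.28 - API: 3023)
2020-09-14 12:59:36:  0: STDOUT: Rendering blocks... (resolution: 1920x1080, block size: 128, unified minmax: [16,256])
2020-09-14 13:00:01:  0: STDOUT: Time to render 135 blocks: 25.7s
2020-09-14 13:00:08:  0: Done executing plugin command of type 'Render Task'
"""

def parse(log_text):
    """Return [(timestamp, message)] for every line that starts with a timestamp."""
    entries = []
    for line in log_text.splitlines():
        try:
            stamp = datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # header lines, separators, blank lines
        entries.append((stamp, line[20:].strip()))
    return entries

def first_time(entries, marker):
    """Timestamp of the first log line whose message contains `marker`."""
    for stamp, message in entries:
        if marker in message:
            return stamp
    raise ValueError(f"marker not found: {marker!r}")

def phase_seconds(log_text):
    """Split the task into phases (names are illustrative) and return whole seconds."""
    entries = parse(log_text)
    task_start = entries[0][0]
    launch = first_time(entries, "Redshift Command-Line Renderer")
    blocks_start = first_time(entries, "Rendering blocks...")
    blocks_end = first_time(entries, "Time to render")
    task_end = entries[-1][0]

    def secs(a, b):
        return int((b - a).total_seconds())

    return {
        "deadline_before_launch": secs(task_start, launch),
        "redshift_startup": secs(launch, blocks_start),
        "block_rendering": secs(blocks_start, blocks_end),
        "after_rendering": secs(blocks_end, task_end),
        "task_total": secs(task_start, task_end),
    }

for name, seconds in phase_seconds(LOG).items():
    print(f"{name}: {seconds}s")
```

Passing the full text of a task log to `phase_seconds` instead of the excerpt should work the same way, as long as the log contains the same marker lines.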

The problem is that the Redshift Standalone overhead still exists when you run from the command line without Deadline, but it goes unreported. There are no time stamps between the moment you call the command line and the moment the scene is loaded, and the 29.4 seconds are only counted from there until the end of the block rendering. Redshift still spends a few seconds loading the executable, allocating memory, releasing licenses, and shutting down the renderer, but that overhead is not captured in numbers. At least the Deadline log gives you explicit time stamps for every single step.

In other words, we can assume that of the 13 seconds combined overhead in the Deadline Task, about 5 seconds are due to Deadline copying data around, and about 8 seconds are loading/unloading the renderer. You can try to use a stopwatch when running the manual command line to see if this is true. I would expect the actual process from calling the command to the prompt reappearing to be around 40 seconds…
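
Instead of a stopwatch, a small Python wrapper can measure the wall-clock time from launching the command to the process exiting, and it can loop over several .rs files so a whole sequence gets timed. This is just a sketch under a few assumptions: you pass in the path to the Redshift command-line executable yourself, the only argument given to it is the .rs file (add whatever options appear in your Deadline Worker log), and the function names are made up.

```python
import subprocess
import sys
import time

def time_sequence(command_prefix, rs_files):
    """Run `command_prefix + [rs_file]` once per file and return [(rs_file, wall_seconds)].

    The wall time includes process start-up and shutdown, which the
    renderer's own "Rendering time" line does not report.
    """
    results = []
    for rs_file in rs_files:
        start = time.perf_counter()
        subprocess.run(list(command_prefix) + [str(rs_file)], check=True)
        results.append((str(rs_file), time.perf_counter() - start))
    return results

def main(argv=None):
    """argv[0] is the renderer executable, the remaining items are .rs files."""
    argv = sys.argv[1:] if argv is None else argv
    timings = time_sequence([argv[0]], argv[1:])
    for name, seconds in timings:
        print(f"{name}: {seconds:.1f}s")
    total = sum(seconds for _, seconds in timings)
    print(f"total: {total:.1f}s")
    return total
```

Calling `main([renderer_path, "RS Ref Path_v01.001.rs", ...])` with your own executable path prints one wall-clock time per frame plus the total, which you can then compare with the 29.4s that Redshift reports for itself.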

I cannot answer why the render time inside of Houdini is only 21 seconds. If you take 4m 38s and turn it into seconds, it is 4*60+38=278 seconds. Divided by 10 it would mean 27.8 seconds per frame. This is still faster than the 29.4 we got in some of the Standalone tests, so for some reason Redshift runs faster inside of Houdini… It is possible that since some libraries are pre-loaded, rendering inside it does not have any overhead. Remember that the pure rendering time (Irradiance point cloud, Rendering blocks) is actually around 26 seconds in Redshift Standalone. So it is plausible that Houdini just has everything needed in memory already, and all it does is the pure rendering process without any other overhead. Where did you get the 21s figure from?
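
To put all four tests on the same footing, the same division can be applied to every total from the first post. A minimal Python sketch, using only the totals reported there (the helper name is arbitrary):

```python
def to_seconds(minutes, seconds):
    """Convert an 'Xm Ys' total into seconds."""
    return minutes * 60 + seconds

FRAMES = 10  # every test rendered the same 10-frame sequence

totals = {
    "Houdini + Redshift direct": to_seconds(4, 38),
    "Deadline + RS SA - 01": to_seconds(8, 5),
    "Deadline + RS SA - 02": to_seconds(6, 57),
    "Deadline + RS SA - 03": to_seconds(5, 31),
}

# Effective wall-clock cost per frame, including all overhead.
per_frame = {name: total / FRAMES for name, total in totals.items()}

for name, seconds in per_frame.items():
    print(f"{name}: {seconds:.1f}s per frame")
```

This gives 27.8s, 48.5s, 41.7s and 33.1s per frame respectively. Note that test 03 ran two tasks in parallel, so its 33.1s is the effective cost per frame, not the time a single frame took.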

There is of course something we have not discussed yet. Your test is very short. Most real-world production scenes take many minutes and even hours to render a single frame. However, the overhead of Deadline and Redshift typically remains constant. So having an overhead of 13 seconds when the rendering takes 26 seconds is significant. But 13 seconds of overhead when a frame takes 3600+ seconds is negligible.
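
As a rough illustration of that point, here is the overhead expressed as a share of the total task time, using the 13 second overhead and the two render times mentioned above. The function name is just for the example.

```python
def overhead_share(overhead_s, render_s):
    """Fraction of the total task time that is spent on fixed overhead."""
    return overhead_s / (overhead_s + render_s)

short_frame = overhead_share(13, 26)    # this test scene
long_frame = overhead_share(13, 3600)   # a one-hour production frame

print(f"short frame: {short_frame:.1%} overhead")
print(f"long frame:  {long_frame:.2%} overhead")
```

So roughly a third of each task is overhead in this test, versus well under half a percent on a one-hour frame.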

Just wanted to put things in perspective.
