Slow renders through Deadline

I’m working in a studio that uses Deadline, and I’ve noticed that render times are drastically slower through Deadline. I’m looking for suggestions about what could be causing the slowdown.

For example, a scene that takes 1.2 seconds to render on my local machine takes 50 seconds to render through Deadline on the same machine.

When I used Deadline in the past, the first frame a computer picked up would be slower because the machine needed to open the scene, but all successive frames would render at the same speed as rendering locally on that machine. I also remember that Deadline used to queue multiple frame tasks to a machine at a time, but now it only seems to queue a new task to a machine once it has finished its current task.

I suspect that the software is restarting for each frame, but any possible reason and solution would be appreciated.


It depends on whether the software we’re controlling has a ‘batch mode’ option. If it does, we only need to load things once and can ask the software to move on to the next frame quickly.

What’s the app you’re submitting jobs for? Maybe we just need to set a checkbox.

Hello, sorry to resurrect this thread, but I am facing the same issue with local renders = fast, Deadline renders = slow.

Edwin, you mentioned something about a checkbox that could solve this, so I’ve attached a grab of our typical Maya submission parameters.


I really do hope that it’s as simple as marking a check-box!

Maya 2015 x64
Vray 3.1
Deadline 7.2.4.0

Please let me know if there is any more information needed.

Thanks!

The magic “use mayabatch” checkbox is ticked, so that’s not going to be it here… :frowning:

Have you tried opening the scene on the render node and rendering it there? I wonder if those machines might somehow render slower. Another possibility is that your file server is being bogged down at render time because the machines all ask for the same files at once (render farms’ file servers are basically under a never-ending barrage of distributed denial-of-service attacks all day).

For the speed on this scene, if it does render faster when you load it manually and click ‘render’ on the render node, try limiting the number of machines that are allowed to render at once with the “Machine Limit” box. Set it to three and see if that helps at all. That would lighten the load on the file server. There are more tricks if that’s the case, and we outlined most of them here:

docs.thinkboxsoftware.com/produc … mance.html

Edwin,

Thanks very much for your tips. Our network infrastructure is on a 10-gigabit-per-second backbone, with blades being a mix of i7 4900 and 5800 CPUs (with one Xeon E5-2650). Theoretically, the renders should run quite a bit faster on the blades than on our staff’s PCs… We are seeing the slower Deadline renders even when the farm is completely devoid of jobs, meaning that there is minimal disk and network I/O. I will try opening a scene and rendering it locally on one of the blades.

I’ve also done as you suggested and auto-adjusted the repository performance settings.

Thanks again, Edwin. I will update this thread with more info.

Can you post a few of the job log reports? I want to compare the timestamps per line of StdOut to see where the bottleneck is and where all the time is being spent. Perhaps it’s as simple as a single line causing the slowdown here. Note that the first task picked up by each Slave will be ‘slower’ than the subsequent tasks it renders for the same job, as it needs to ‘load’ the Maya scene file, which may well be GBs in size, from the file server (assuming you have the “Use MayaBatch Plugin” checkbox enabled, which you should).
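If it helps anyone doing this kind of timestamp comparison themselves, here’s a quick sketch (not a Deadline tool, just a hypothetical helper) that scans a job log for Deadline-style `YYYY-MM-DD HH:MM:SS:` timestamps and reports the biggest gaps between consecutive lines:

```python
import re
from datetime import datetime

# Matches the leading timestamp on Deadline job log lines,
# e.g. "2016-10-05 12:38:47:  0: INFO: Process is now running"
TS = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")

def slowest_gaps(log_text, top=3):
    """Return (seconds, line) pairs for the log lines preceded by the
    longest pauses, largest first, to show where the time is spent."""
    prev = None
    gaps = []
    for line in log_text.splitlines():
        m = TS.match(line)
        if not m:
            continue  # skip lines without a timestamp
        t = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        if prev is not None:
            gaps.append(((t - prev).total_seconds(), line))
        prev = t
    return sorted(gaps, key=lambda g: g[0], reverse=True)[:top]
```

Feed it the text of a job report and the top entry points straight at the slowest step.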

I’m testing out Deadline and I’m experiencing the same thing.
As I’m just testing I’ve got the repository, database and client all on the same machine which is a bit odd I know.

So I’m submitting jobs from Nuke to Deadline (I’ve installed the submitter). The Nuke script I’m rendering from is local, and the job is rendering to the same directory structure. When I run the render in Nuke it’s pretty quick, but it’s a LOT slower when I do it through Deadline. I’ve tried doing that auto-adjust thing mentioned above, as well as adjusting concurrent task settings etc., but no luck. Any thoughts?

I would say the same answer as I gave before. Can we see the job log reports to identify where the slowdown is?

Mike,

Seems this issue might be intermittent, so I’ve instructed the render team here to let me know if it does occur again. I will send the log file here at that point.

Thanks very much, Mike.

Edit: I did some testing as per Edwin’s suggestions (did a timed test of rendering one frame through Deadline, locally on a blade, and locally on a user PC) and did not come across any major render-time differences :neutral_face: . Again, if we see this happening again during real production rendering, I will post the log here. Thanks!

I did a test of a single frame (again Deadline took a lot longer as before) and I’ve attached the job log.
joblog.txt (6.88 KB)

Well, here’s the breakdown of the important parts of the log:

### Deadline starts nuke
2016-10-05 12:38:47:  0: INFO: Process is now running
### Nuke has finished loading itself
2016-10-05 12:38:53:  0: STDOUT: Loading C:/Program Files/Nuke9.0v1/plugins/dpxWriter.dll
2016-10-05 12:38:53:  0: STDOUT: [12:38:53 GMT Daylight Time] Read nuke script: C:/Users/adam/AppData/Local/Thinkbox/Deadline8/slave/atomic-3/jobsData/57f4e6436e5fd10d1cd34bac/thread0_temp3EXB10/IDistort_test.nk
### Writing frames starts here
2016-10-05 12:39:34:  0: STDOUT: Writing D:/Temp/IDistort_test/idistort_test_v02.001.dpx .5
### Finished here:
2016-10-05 12:39:34:  0: STDOUT: READY FOR INPUT
2016-10-05 12:39:34:  0: Done executing plugin command of type 'Render Task'

So, it looks like about six seconds to load Nuke itself, but you’re right that frame rendering is taking far longer than it should. I think there used to be an environment variable controlling how memory was allocated that helped speed things up… I’ll need to look into that and see if I can find something for Nuke 9 on Windows.

Great, thanks.
I’ve found out that we have licenses so I’m going to log in with the company account and put in a proper support call.

It should be noted that local rendering in the Nuke GUI takes advantage of any locally cached data / proxy data, whilst the Nuke command-line rendering exe does not look at the local cache, so streaming any large footage from a slow network file server will take time.

That’s a good point, Mike. I could try clearing the Nuke cache and then comparing the renders.

So I copied the Nuke script onto the network, cleared the Nuke cache, and ran the test again.
All the Deadline software (client, repository, etc.) is still on the local machine.
When I rendered 50 frames from the write node in the script to the same network location as the script it took 1 minute.
I then used the Deadline menu in Nuke to submit the same 50 frames and it took 10 minutes.

I’ve attached the log for one of the tasks.
joblog02.txt (12.7 KB)

Hi Adam,

Thanks for the log. A few things I noticed:

  1. It looks like the Nuke script file was created in Nuke 9.0v7, but you are trying to render it using 9.0v1 via Deadline, as per this log warning line:

You could update the Nuke exe path via “Configure Plugins…” → “Nuke” in the Deadline Monitor to point at the Nuke.exe for the correct ‘revision’ build of Nuke, as I assume it is present on the machine you are conducting all your testing on.

  2. I notice you are enabling the use of the GPU for rendering, via the “--gpu” CLI flag:

Just checking, but does this machine have a GPU available?

  3. So, it looks like a custom script here is the root cause of your troubles. Nuke reports that each frame takes between 2.48 and 3.86 seconds to render. HOWEVER, there is a script called “sb_saveRenderBackup” which, I’m now guessing, is carrying out a Python-based file ‘stat’ command and, for some reason, hitting a major slowdown. After this script completes, Nuke reports via its StdOut a total frame render time of 1 minute, 7 seconds, which would explain the 10-minute render time you are seeing:

Do you need to run this script during network rendering? That is, I’m guessing again, but if the script does some ‘smart’ timed backup of the Nuke script, that’s unnecessary for offloaded network rendering of ‘throwaway’ Nuke script files which are already saved on your network file server. Perhaps add a check in the Python script to only run it when in Nuke GUI mode?

Hi Mike, thanks for the reply.

As per the documentation, I added the 9.0v7 .exe into the path under the 9.0v1 entry. Would that show up in the log?
This is the entry in the Configure plugins under Nuke 9 render executables:
(the last line is the bit I added)

C:\Program Files\Nuke9.0v1\Nuke9.0.exe
/Applications/Nuke9.0v1/Nuke9.0v1.app/Contents/MacOS/Nuke9.0v1
/usr/local/Nuke9.0v1/Nuke9.0
C:\Program Files\Nuke9.0v7\Nuke9.0.exe

Yes, it has an NVIDIA GeForce GTX 470, which has a bunch of CUDA GPU cores.

That’s really interesting, thanks for pointing that out. I think that script is causing some other occasional network slowdown instances I’ve noticed as well. Great, I can take that to the powers that be and point the finger now. :slight_smile:

That being said, the other network render still ran fast, and that’s presumably running the same script? I’m not sure; I’ll have to check it out.

Thanks again,
Adam.

Coolio.

On the Nuke exe issue:

The Deadline Slave will use the FIRST valid Nuke.exe path it finds in the list you give it via the plugin config. If you want to ensure it ‘finds’ Nuke 9.0v7 first, either make sure it’s listed first or just remove the other entries.
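The lookup described above amounts to “first path that exists wins”. A minimal sketch of that behaviour (a hypothetical helper, not Deadline’s actual code):

```python
import os

def pick_executable(paths):
    """Return the first path in the configured list that exists on this
    machine, mirroring the first-valid-path behaviour described above."""
    for p in paths:
        if os.path.isfile(p):
            return p
    return None  # no configured executable found on this machine
```

So if `C:\Program Files\Nuke9.0v1\Nuke9.0.exe` exists and sits above the 9.0v7 entry, 9.0v1 is what gets launched, which is why reordering (or pruning) the list matters.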

Ah right, cool. I’ll change that. Thanks.

So, it seems that script was the culprit. We removed it from the menus, re-ran the Deadline render, and it went through in 1 minute.

Problem solved. Thanks Mike! :mrgreen:
