AWS Thinkbox Discussion Forums

DEADLINE takes a long time to start the job, stuck at a certain phase for about 10 min

Hi everyone, sharing some findings to see if it might help with the diagnostics.

SETUP -----------------------------------------

ALL sim files are cached out, ALL rendering nodes are REDSHIFT PROXIES.
in the HOUDINI GUI, once the file is opened (opening time 5sec), rendering starts almost instantaneously.

I have a single HIP file, with about 20 nodes that do all the geometry processing (opening multiple 500mb files, processing materials etc).

i also have many nodes that are used to lay out the scene and render. They are NOT referencing the PROCESSING nodes. they are all composed of REDSHIFT proxies. in fact i went through all the nodes and removed any references to the PROCESSING NODES, even those used only for positioning.

i have about 50+ ROP outputs, all REDSHIFT ROPS.

this STARTING POINT was 65mb, and when sent to DEADLINE, it takes about 20 min just to launch.
and when rendering locally (open Houdini, open file, press render button), no load time, other than

TESTING 1 -------------------------

deleted all the ROPs except the one i need to test render (this helped last time).
file size 59mb, DEADLINE load time 15min+ (i gave up after 15min)

deleted other rendering layout nodes, most of them are just redshift PROXIES
file size 58mb, DEADLINE load time 15min+ (i gave up after 15min)

deleted most of the rendering layout nodes
file size 34mb, DEADLINE load time 15min+ (i gave up after 15min)

deleted half of the PROCESSING nodes (nodes that take a long time to process, BUT are not referenced or needed for the actual render)
file size 28mb, DEADLINE load time 10min.

deleted all PROCESSING nodes
filesize 21mb, DEADLINE load time under 1min.

TESTING 2 ----------------------------------------

I also ran a test where i deleted only the PROCESSING NODES, so the file still had all the 50+ ROPs and all the rendering nodes (most of them redshift proxies).
DEADLINE load time 20min+

once i deleted all ROPS as well,
FILESIZE 51mb, DEADLINE load time under 1 min.

MY SUMMARY ----------------------------------------------

it seems it's not just a simple file size issue. i have managed to render the same scene at about 50mb just fine through DEADLINE.

also, when i deleted all the PROCESSING NODES, it was still taking a while. it was only when i deleted the ROPs as well that it went down to under 1 min.

Hi @antoinedurr, yes, all my sim files are cached out.

sorry i just posted a long thread about my findings and my current setup.

but my actual render nodes (the ones i need to render) all consist of REDSHIFT PROXIES and just some simple keyframe animation, no SIM, and i made sure there was no reference to any other nodes.

So if i delete all the nodes that are not required to render, i can get it rendering in under 1min.

which is confusing, as it seems perhaps DEADLINE is literally processing the entire scene (and all its inactive nodes).

this is supported by my testing: when i deleted half of my PROCESSING NODES (that are not required to render), i did manage to cut the DEADLINE load time in half.

I'm just about to get started with debugging RS on our Windows farm. I wonder if the environment, i.e. env vars, is different and/or not set properly when launching on the farm. For example, if I set the DL ROP to write out .rs files and then render them as a separate job, they all fail indicating no GPU (on the same hosts that normally render RS jobs just fine). So definitely something not square there. Licensing oddities? Just throwing things out there.
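
One way to check the env var theory is to dump the environment once from a local shell and once from inside a farm task, then diff the two dumps. The following is only a rough sketch: the function names are made up for illustration, and how you get `snapshot_env` to run on the farm (for example from a small test job) is left open.

```python
import json
import os


def snapshot_env(path):
    """Write the current process environment to a JSON file."""
    with open(path, "w") as handle:
        json.dump(dict(os.environ), handle, indent=2, sort_keys=True)


def diff_env(local_env, farm_env):
    """Return variables that are missing on one side or differ in value.

    The result maps each variable name to (local_value, farm_value),
    with None standing in for "not set".
    """
    report = {}
    for name in sorted(set(local_env) | set(farm_env)):
        local_value = local_env.get(name)
        farm_value = farm_env.get(name)
        if local_value != farm_value:
            report[name] = (local_value, farm_value)
    return report


def diff_env_files(local_path, farm_path):
    """Load two snapshots written by snapshot_env and diff them."""
    with open(local_path) as handle:
        local_env = json.load(handle)
    with open(farm_path) as handle:
        farm_env = json.load(handle)
    return diff_env(local_env, farm_env)
```

Anything Houdini or Redshift related that shows up on only one side of the diff would be the first thing to look at.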

Hi Bobo, thank you for your help so far, i have tried your suggestion, but unfortunately, it did not help.

I have also done some testing and posted it on the forums. any chance you could have a look into this issue further?

Thanks!

Hi everyone, update on some findings.

just to get the project "renderable" i took a single project file of 64mb, which included all my modelling, processing setup + rendering, and divided it into PROCESSING / RENDERING.

this is obviously not ideal at all, since if i need to edit anything, i need to keep opening and closing PROCESSING file to export updated proxy or geo.

however, by doing this, i have managed to bring down the DEADLINE launching HOUDINI time to about 3 min. again, still quite an overhead, but far better than the 20min i was getting previously.

so the file i am submitting to deadline right now literally only has geo nodes with redshift proxies (not even bgeo files, but all proxy files) and REDSHIFT ROPs just to render them out.

so there is nothing to process or load, if i were to open this file in houdini , the file will open within 5 seconds, and render will launch in under 5 seconds as well (as everything is redshift proxy already)

and HOUDINI takes about 20 seconds to launch from cold.

thus my confusion about why it's taking so long to launch the rendering process.

it seems --------------------------

submit a job to deadline,
deadline will launch HYTHON almost immediately.

HYTHON process will then be on about 1.5% - 2% CPU usage (hardly any disk read or network access) for about 3 min.

this is where the DEADLINE log will be stuck for about 3min.

and then the RENDERING will begin, and HYTHON CPU, GPU, RAM usage will peak.


honestly any help will be appreciated.

i am having to really bend my workflow just to work around this issue, adding unnecessary hours just jumping back and forth between files, and even then, it seems there is still weird overhead on each task.

Thank you in advance for any help!

According to the posted logs, and looking at the hrender_dl.py, the 10 minutes wait occurs here:

# Print out load warnings, but continue on a successful load.
try:
    hou.hipFile.load( inputFile )
except hou.LoadWarning as e:
    print(e)

In other words, the Deadline integration script calls hou.hipFile.load( inputFile ) and then sits there for 10 minutes waiting for that function to return. So there is nothing we can tell you about what is happening inside Houdini in that time.

You should try calling exactly the same function hou.hipFile.load( inputFile ) from your own HYTHON script outside of Deadline, and measure how long it takes to load. If it takes 10 minutes, you need to have a word with SideFX about why it takes this long vs. loading in an interactive session.
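
As a starting point, a timing wrapper along these lines could be run with hython outside of Deadline. This is a minimal sketch, not part of the Deadline integration: the script name `time_load.py` and the helper names are made up, and only the `hou.hipFile.load` / `hou.LoadWarning` part is taken from the snippet quoted above. It uses `time.time()` so it should also run on Python 2 builds of Houdini.

```python
import sys
import time


def timed(label, func, *args):
    """Run func(*args), print the wall-clock time, return (result, seconds)."""
    start = time.time()
    result = func(*args)
    elapsed = time.time() - start
    print("{}: {:.2f} s".format(label, elapsed))
    return result, elapsed


def load_hip(input_file):
    """Same load call as the snippet quoted from hrender_dl.py."""
    import hou  # only importable inside hython or a Houdini session

    # Print out load warnings, but continue on a successful load.
    try:
        hou.hipFile.load(input_file)
    except hou.LoadWarning as e:
        print(e)


def main(argv):
    if len(argv) != 2:
        print("usage: hython time_load.py /path/to/scene.hip")
        return 1
    try:
        timed("hou.hipFile.load", load_hip, argv[1])
    except ImportError:
        print("The hou module is unavailable; run this with hython.")
        return 1
    return 0


if __name__ == "__main__":
    main(sys.argv)
```

Running it on a render node, as the same user the Deadline Worker runs as, gives a number that can be compared directly with the pause seen in the Deadline log.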

Unfortunately i can't agree with you bobo, i found a guy on the redshift forum who took the script from deadline 10.0.26 and he said there are no delays, so it's not a problem with houdini...

as i already wrote in my previous post, i ran houdini from cmd to render the scene, and as i said there were no delays in loading or starting the render, the only issue i had was with deadline,

and btw the scene was very very simple, 1 mesh and 1 dome light...

Does the old script trick work for you?

when I start the Houdini env and then use hbatch, it loads fine, and at the point where it sits for 20-30 minutes in deadline, here it just finishes loading and is ready to start rendering. There is no 20-30 minute pause anywhere. I load the scene with hbatch, then use Redshift_setGPU to set the GPUs used, then render -f frame range ROP, and everything gets to work. No delays like when the same scene is submitted from Deadline, where this scene has a 30 minute wait before starting to render.
So it's the same scene, the only difference is whether it is submitted to deadline or rendered from the command line

Hi bobo, i don't have that build, :frowning: i asked the guy to share the hrender_dl.py, but no sign of him yet. do you think you can dig up the build and share just the .py so we can give it a try?

I've turned off path mapping to test, no luck, the issue is still there

hrender_dl_10_0_26.zip (4.3 KB)

that was fast bobo, checking in next 30min and let you know asap :wink:

haha, started in a split second :slight_smile: i can confirm its working, @mirko plz try on your side

Awesome, I will file a ticket with the dev. team to start looking into what changed.

i would check what has changed but i'm way too busy with some projects here... so it would be great to know what went wrong :slight_smile:

Could anyone donate a simple non-production scene file that reproduces the problem so I can attach it to the ticket for the dev. team to play with?

I think i might share the scene with a few cache files, let me wrap everything up and send it to you...

2021-04-29 19:05:13: 0: STDOUT: Unknown command: }
2021-04-29 19:05:15: 0: STDOUT: ROP type: Redshift_ROP

Seems to be working here now as well!
Usually, right after the Unknown command: } line, there was a 30 minute pause here for this project.
I can check with the client if I can share this scene. It is a bit bigger as well, but if it can help... It was such a radical pause that it makes a good example: if this one is working, then it is working! But I have to check if I can share it...
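
To pin down where a log stalls without eyeballing it, something like the following could scan a saved task log for the longest gap between consecutive timestamps. It is a sketch that assumes every relevant line starts with a `YYYY-MM-DD HH:MM:SS` stamp like the two lines quoted above; the function names are made up, and lines without such a stamp are simply skipped.

```python
from datetime import datetime

STAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_stamp(line):
    """Return the leading timestamp of a log line, or None if there is none."""
    try:
        return datetime.strptime(line[:19], STAMP_FORMAT)
    except ValueError:
        return None


def largest_gap(lines):
    """Return (seconds, line_before, line_after) for the longest silent stretch."""
    best = (0.0, None, None)
    previous_time, previous_line = None, None
    for line in lines:
        stamp = parse_stamp(line)
        if stamp is None:
            continue
        if previous_time is not None:
            gap = (stamp - previous_time).total_seconds()
            if gap > best[0]:
                best = (gap, previous_line, line)
        previous_time, previous_line = stamp, line
    return best


# The two log lines from the post above, used as sample input.
sample = [
    "2021-04-29 19:05:13: 0: STDOUT: Unknown command: }",
    "2021-04-29 19:05:15: 0: STDOUT: ROP type: Redshift_ROP",
]
seconds, before, after = largest_gap(sample)
print("{:.0f} s between:\n  {}\n  {}".format(seconds, before, after))
```

For a real log, pass the lines of the saved log file to `largest_gap` instead of the sample list.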

scene

a small note, so the devs can dive into the files

  • there are 2 files inside, v3 and v4

i wanted to clean the garbage nodes out of the scene, so i started deleting them and ended up with the v4 scene, with only 2 nodes, a geo node and 1 light node

so i submitted the v4 scene to deadline and noticed that the render is working fine??? surprise :slight_smile: so i have also uploaded the original v3 scene, the one that 'does not' work...
