AWS Thinkbox Discussion Forums

Submitting to Deadline with Houdini randomly causes errors

Hi,

I have recently been experiencing errors when submitting to Deadline from Houdini and Redshift.

I have been using Prism to submit the jobs to deadline.

Every now and then the jobs will throw up the following error:

=======================================================
Error
=======================================================
FailRenderException : Error: Caught exception: The attempted operation failed.
   at Deadline.Plugins.DeadlinePlugin.FailRender(String message) (Python.Runtime.PythonException)
  File "C:\ProgramData\Thinkbox\Deadline10\workers\eidyAMDWorkstation\plugins\66773f9c034303160f99abe1\Houdini.py", line 438, in HandleStdoutError
    self.FailRender(self.GetRegexMatch(1))
   at Python.Runtime.Dispatcher.Dispatch(ArrayList args)
   at __FranticX_Processes_ManagedProcess_StdoutHandlerDelegateDispatcher.Invoke()
   at FranticX.Processes.ManagedProcess.RegexHandlerCallback.CallFunction()
   at FranticX.Processes.ManagedProcess.e(String dl, Boolean dm)
   at FranticX.Processes.ManagedProcess.Execute(Boolean waitForExit)
   at Deadline.Plugins.DeadlinePlugin.DoRenderTasks()
   at Deadline.Plugins.PluginWrapper.RenderTasks(Task task, String& outMessage, AbortLevel& abortLevel)
   at Deadline.Plugins.PluginWrapper.RenderTasks(Task task, String& outMessage, AbortLevel& abortLevel)

=======================================================
Type
=======================================================
RenderPluginException

=======================================================
Stack Trace
=======================================================
   at Deadline.Plugins.SandboxedPlugin.d(DeadlineMessage bgq, CancellationToken bgr)
   at Deadline.Plugins.SandboxedPlugin.RenderTask(Task task, CancellationToken cancellationToken)
   at Deadline.Slaves.SlaveRenderThread.c(TaskLogWriter ajv, CancellationToken ajw)

If I resubmit the job it will almost always work fine. It’s annoying because I have to get one frame rendering on each job before I can be sure it won’t fail, which makes it difficult to submit lots of jobs at once since I have to test each of them.

Do you have any idea what this might be? If the job can make it past the preprocessing stage and the GPU fires up then I know the job will work fine.

I’m on Windows 10, Deadline 10.2.1.0, Redshift 3.5.23, Houdini 19.5.805.

Thanks!

What GPU does it show in use at the end of the log?

It’s showing the GPU in the third slot in this particular case, but it’s not always the same.

Spitballing, but you don’t happen to have the stdout log for the failures by any chance? There was a time (in a previous life) when I had to get a regex filter amended because it was picking up an “Error: file not found in cache, copying it now” message. It was intermittent, since it depended on what else was going on and when the cache was being flushed.

I’ve lost the full stdout log, but I do remember there was a message like “Error: Circle01 missing…” or something like that, so it sounds like a similar issue. How did you manage to fix it?

If I read the trace correctly, it’s failing in HandleStdoutError in the worker’s copy of Houdini.py (line 438 in your log). Theoretically, somewhere in the render failure it should also be telling you what message it tripped against.

If you check that file, which I’m guessing is a copy of the Houdini.py plugin, it has a bunch of regexes, and there’s this one in particular that would match:

self.AddStdoutHandlerCallback( "(Error: .*)" ).HandleCallback += self.HandleStdoutError

So in our case, we just patched the handler to be a bit more discerning and let certain messages pass through.
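As a rough sketch (not our exact patch, and the pattern list here is just an illustration), you could make HandleStdoutError in that plugin copy check the matched message before failing, something like:

# Sketch of a more discerning HandleStdoutError for the Houdini.py plugin.
# The benign patterns below are placeholders (assumptions) -- swap in the
# exact harmless "Error: ..." lines that show up in your own stdout logs.
def HandleStdoutError( self ):
    message = self.GetRegexMatch( 1 )

    benignPatterns = [
        "file not found in cache, copying it now",  # cache-copy chatter
        "Circle01 missing",                         # example from this thread
    ]

    # Let known-harmless lines through instead of failing the whole task.
    for pattern in benignPatterns:
        if pattern.lower() in message.lower():
            self.LogWarning( "Ignoring non-fatal stdout line: " + message )
            return

    self.FailRender( message )

That way the "(Error: .*)" regex can stay broad, and only the messages you’ve confirmed are harmless get skipped.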


Thanks so much for your help. I’ll wait until I run into the error again and then patch against that specific message to see if it fixes it. It’s a really hard bug to test.
