Render slaves failing in mr standalone

After 12 hours of successful rendering, our slaves suddenly failed, and several days of troubleshooting have not turned up the cause. I’ve had problems in the past getting Deadline and mental ray standalone 3.7.53 to work well together. When I posted here previously, the response was that Deadline was only relaying an error message from mental ray. To check that, I added the error code to the list of codes Deadline should ignore, but that doesn’t seem to have any effect. What seems to be happening is that mental ray reads the host file, tries to verify that all the hosts are present, and fails because it cannot connect to every single host. I need Deadline to continue rendering even if some slaves are unavailable.

This is a recurring problem with our setup. Because our render farm is linked to a much larger corporate network, we get regular software installs, and afterwards I usually have to go through the annoying process of checking the mr port numbers, firewall settings, etc. to make sure the updates haven’t changed anything, then getting all the slaves to see each other and the license server, and then getting Deadline to work with them. Here are some specifics:

Using Deadline 3.1 on 64-bit Windows XP machines with 32-bit mr standalone installed.

I set up the render from the command line and it worked. The same render fails within Deadline.

Constructor: MentalRay
0: Task timeout is disabled.
0: Loaded job: Untitled (999_050_999_3f35e7ed)
0: INFO: StartJob: initializing script plugin MentalRay
0: INFO: Handling stdout that matches regex “'ERROR :."
0: INFO: Handling stdout that matches regex "\s
\S+\s+\S+\s+progr:\s+([\d.]+)%."
0: INFO: Handling stdout that matches regex "\S+\s+\S+\s+error\s+([0-9]+)(.
)”
0: INFO: Handling stdout that matches regex “\s*[e,E]+rror[:](.)”
0: INFO: Handling stdout that matches regex “.Can’t connect to any SPM license server.
0: INFO: About: Mental Ray Plugin for Deadline
0: Plugin rendering frame(s): 66
0: INFO: Starting Mental Ray Render
0: INFO: Stdout Handling Enabled: True
0: INFO: Popup Handling Enabled: False
0: INFO: Using Process Tree: True
0: INFO: Hiding DOS Window: True
0: INFO: Creating New Console: False
0: INFO: Enforcing 32 bit build of Mental Ray
0: INFO: Render Executable: “c:\program files (x86)\autodesk\mrstand3.7.53-maya2010\bin\ray.bat”
0: INFO: Rendering to network drive…
0: INFO: Render Argument: -verbose 5 -file_dir “V:\deadline_test” -texture_path “\**\avvr\Projects\Pueblo_Revisited\Maya\sourceimages” -threads 10 -render 66 66 “V:\Projects\Pueblo_Revisited\Maya\scenes\pueblo_ext_02.mi”
0: INFO: Startup Directory: “c:\program files (x86)\autodesk\mrstand3.7.53-maya2010\bin”
0: INFO: Process Priority: BelowNormal
0: INFO: Process is now running
0: STDOUT: MSG .0 info : mental ray, version 3.7.53.5
0: STDOUT: MSG .0 info : use -copyright option to view copyright and terms of use.
0: STDOUT: MSG .0 progr: using 1 sharable license
0: STDOUT: MSG 0.0 info : version 3.7.53.5, Jun 17 2009, revision 88216
0: STDOUT: MI 0.0 progr: reading startup file “C:/Program Files (x86)/Autodesk/mrstand3.7.53-maya2010/rayrc”
0: STDOUT: MI 0.0 progr: parsing file C:/Program Files (x86)/Autodesk/mrstand3.7.53-maya2010/rayrc
0: STDOUT: MSG 0.0 info : reading hosts file …rayhosts
0: STDOUT: MSG 0.0 info : connecting host sfbd05221:7410
0: STDOUT: MSG 0.0 info : connecting host sfbd05219:7410
0: STDOUT: MSG 0.0 info : connecting host sfbd05218:7410
0: STDOUT: MSG 0.0 info : connecting host sfbd05220:7410
0: STDOUT: MSG 0.0 info : connecting host sfbd05222:7410
0: STDOUT: MSG 0.0 info : connecting host sfbd05216:7410
0: STDOUT: MSG 0.0 info : connecting host sfbd05217:7410
0: STDOUT: MSG 0.0 info : connecting host sfbd05223:7410
0: STDOUT: MSG 0.0 info : connecting host sfbd05224:7410
0: STDOUT: JOB 0.0 info : started threads 0,1,2,3 on sfbd05221:7410 now known as host 1
0: STDOUT: JOB 0.0 info : started threads 0,1,2,3 on sfbd05219:7410 now known as host 2
0: STDOUT: JOB 0.0 info : started threads 0,1,2,3 on sfbd05218:7410 now known as host 3
0: STDOUT: JOB 0.0 info : started threads 0,1,2,3 on sfbd05220:7410 now known as host 4
0: STDOUT: JOB 0.0 info : started threads 0,1,2,3 on sfbd05222:7410 now known as host 5
0: STDOUT: MSG 0.0 fatal 011512: SLAVE 6 DIED
0: STDOUT: MSG 0.0 fatal 011512: SLAVE 6 DIED
0: STDOUT: MSG 0.0 info : cleaning up memory mapped frame buffers
0: INFO: Process exit code: 1
Scheduler Thread - Render Thread 0 threw an error:
Scheduler Thread - Exception during render: An error occurred in RenderTasks(): Error in CheckExitCode(): Renderer returned non-zero error code, 1 (FranticX.Processes.ManagedProcessAbort) (Deadline.Plugins.RenderPluginException)
at Deadline.Plugins.ScriptPlugin.RenderTasks(Int32 startFrame, Int32 endFrame, String& outMessage)

Thanks for any info you can provide. I’m tapped on ideas.

Dan

Hi Dan,

It looks like mental ray is exiting prematurely, with an exit code of 1, which indicates that there was a problem. There really isn’t a way for Deadline to ignore this, because even if Deadline were to ignore this exit code, that still doesn’t help the fact that mr is exiting prematurely (Deadline isn’t killing the render in response to an error message in this case). I should note that the error numbers you can configure to be ignored are those printed out in mr’s stdout, not the exit code of ray.bat (sorry for the confusion there).
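To illustrate the difference between the two kinds of "error", here is a minimal Python sketch. It is not Deadline's actual plugin code; the function name `classify_render` and the `ignored_codes` parameter are made up for this example, and the regex is only modelled on the `fatal 011512: SLAVE 6 DIED` line in the log above.

```python
import re

# Matches mental ray message lines such as
# "MSG 0.0 fatal 011512: SLAVE 6 DIED" (format copied from the log above).
MR_MESSAGE = re.compile(r"\b(error|fatal)\s+(\d+):\s*(.*)")


def classify_render(stdout_lines, exit_code, ignored_codes=()):
    """Return (ok, reasons) for a finished render (hypothetical helper).

    ignored_codes only filters message numbers found in stdout.
    A non-zero exit code always counts as a failure.
    """
    reasons = []
    for line in stdout_lines:
        match = MR_MESSAGE.search(line)
        if match and match.group(2) not in ignored_codes:
            kind, code, text = match.groups()
            reasons.append(f"stdout {kind} {code}: {text}")
    if exit_code != 0:
        reasons.append(f"exit code {exit_code}")
    return (not reasons, reasons)


log = ["MSG 0.0 fatal 011512: SLAVE 6 DIED"]
print(classify_render(log, 1, ignored_codes={"011512"}))
# prints (False, ['exit code 1'])
```

Even with 011512 on the ignore list, the render still counts as failed, because the process itself exited with code 1.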

You mention that running the render from the command line works. Just to confirm, are you running the following on the render node that produced the error in Deadline?

"c:\program files (x86)\autodesk\mrstand3.7.53-maya2010\bin\ray.bat" -verbose 5 -file_dir "V:\deadline_test" -texture_path "\\*********\*********\avvr\Projects\Pueblo_Revisited\Maya\sourceimages" -threads 10 -render 66 66 "V:\Projects\Pueblo_Revisited\Maya\scenes\pueblo_ext_02.mi"

If not, please run this on the same machine, and post the output that you get.

Cheers,

- Ryan

Hi Ryan,

The render syntax I use from the command line is slightly different from the one in that excerpt of the Deadline output. It goes as follows:

C:\Program Files\Autodesk\mrstand3.7.53-maya2010\bin>mentalrayrender -v 5 -render 80…220 -file_dir V:\deadline_test -texture_path “\sfosb066\BSII_RD\avvr\Projects\Pueblo_Revisited\Maya\sourceimages” “\sfosb066\BSII_RD\avvr\Projects\Pueblo_Revisited\Maya\scenes\pueblo_ext_02.mi”

From the command line all of our mental ray render nodes participate in calculating final gather etc… and rendering the frame, though the render consistently fails after an hour and has to be restarted.

Dan

Hi Ryan,

Just a little more clarification: the command line render syntax I just posted is what I enter on my 32-bit machine to send to the 64-bit render nodes. The path to the mental ray standalone plug-ins that’s been set up in Deadline is slightly different, to reflect the folder structure on those 64-bit machines. It’s also pointing to the ray.exe file.

C:\Program Files (x86)\Autodesk\mrstand3.7.53-maya2010\bin\ray.exe

I need you to run the same command that Deadline is using on one of the render nodes that produces this error. What we’re trying to determine is whether the error itself is related to Deadline. The command you’re using renders with mental ray in a different way than Deadline does.

In fact, that might be the root of the problem. If I’m not mistaken, your command starts up a satellite render across the machines that are configured in the host file. Deadline just performs a straight command line render, and the only machine that should be participating is the one the command is executed on. Maybe the extra machine names in the host file on each render node are screwing things up…
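If the host file is the suspect, one way to narrow it down is to test each host from the failing node before starting a render. The Python sketch below is only an illustration: the helper names are invented, the `name:port` form and the port 7410 are taken from the "connecting host" lines in the log, and a successful TCP connect does not prove that the mental ray service on that host is healthy.

```python
import socket


def parse_host_entry(entry, default_port=7410):
    """Split 'name:port' into (name, port); the port falls back to default_port."""
    name, _, port = entry.strip().partition(":")
    return name, int(port) if port else default_port


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def unreachable_hosts(entries, timeout=2.0):
    """Return the entries that cannot be reached over TCP (hypothetical helper)."""
    return [
        entry
        for entry in entries
        if not can_connect(*parse_host_entry(entry), timeout=timeout)
    ]
```

For example, `unreachable_hosts(["sfbd05221:7410", "sfbd05224:7410"])` would list whichever of those two machines refuses or times out on the connection.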

Here’s the output from a problem render node:

C:\Program Files (x86)\Autodesk\mrstand3.7.53-maya2010\bin>ray.bat -v 5 -file_dir V:\deadline_test -texture_path “\sfosb066\BSII_RD\avvr\Projects\Pueblo_Revisited\Maya\sourceimages” “\sfosb066\BSII_RD\avvr\Projects\Pueblo_Revisited\Maya\scenes\pueblo_ext_02.mi”
MSG .0 info : mental ray, version 3.7.53.5
MSG .0 info : use -copyright option to view copyright and terms of use.
MSG .0 progr: using 1 sharable license
MSG 0.0 info : version 3.7.53.5, Jun 17 2009, revision 88216
MI 0.0 progr: reading startup file “C:/Program Files (x86)/Autodesk/mrstand3.7.53-maya2010/rayrc”
MI 0.0 progr: parsing file C:/Program Files (x86)/Autodesk/mrstand3.7.53-maya2010/rayrc
MSG 0.0 info : reading hosts file ..rayhosts
MSG 0.0 info : connecting host sfbd05221:7410
MSG 0.0 info : connecting host sfbd05219:7410
MSG 0.0 info : connecting host sfbd05218:7410
MSG 0.0 info : connecting host sfbd05220:7410
MSG 0.0 info : connecting host sfbd05222:7410
MSG 0.0 info : connecting host sfbd05213:7410
MSG 0.0 info : connecting host sfbd05216:7410
MSG 0.0 info : connecting host sfbd05217:7410
MSG 0.0 info : connecting host sfbd05224:7410
JOB 0.0 info : started threads 0,1,2,3 on sfbd05221:7410 now known as host 1
JOB 0.0 info : started threads 0,1,2,3 on sfbd05219:7410 now known as host 2
JOB 0.0 info : started threads 0,1,2,3 on sfbd05218:7410 now known as host 3
JOB 0.0 info : started threads 0,1,2,3 on sfbd05220:7410 now known as host 4
JOB 0.0 info : started threads 0,1,2,3 on sfbd05222:7410 now known as host 5
JOB 0.0 info : started threads 0,1,2,3 on sfbd05213:7410 now known as host 6
JOB 0.0 info : started threads 0,1,2,3 on sfbd05216:7410 now known as host 7
JOB 0.0 info : started threads 0,1,2,3 on sfbd05217:7410 now known as host 8
MSG 0.0 fatal 011512: SLAVE 9 DIED
MSG 0.0 fatal 011512: SLAVE 9 DIED
MSG 0.0 info : cleaning up memory mapped frame buffers

C:\Program Files (x86)\Autodesk\mrstand3.7.53-maya2010\bin>

Here it is with exact syntax:

C:\Program Files (x86)\Autodesk\mrstand3.7.53-maya2010\bin>ray.bat -verbose 5 -file_dir “V:\deadline_test” -texture_path “\sfosb066\BSII_RD\avvr\Projects\Pueblo_Revisited\Maya\sourceimages” -threads 10 -render 66 66 “V:\Projects\Pueblo_Revisited\Maya\scenes\pueblo_ext_02.mi”
MSG .0 info : mental ray, version 3.7.53.5
MSG .0 info : use -copyright option to view copyright and terms of use.
MSG .0 progr: using 1 sharable license
MSG 0.0 info : version 3.7.53.5, Jun 17 2009, revision 88216
MI 0.0 progr: reading startup file “C:/Program Files (x86)/Autodesk/mrstand3.7.53-maya2010/rayrc”
MI 0.0 progr: parsing file C:/Program Files (x86)/Autodesk/mrstand3.7.53-maya2010/rayrc
MSG 0.0 info : reading hosts file ..rayhosts
MSG 0.0 info : connecting host sfbd05221:7410
MSG 0.0 info : connecting host sfbd05219:7410
MSG 0.0 info : connecting host sfbd05218:7410
MSG 0.0 info : connecting host sfbd05220:7410
MSG 0.0 info : connecting host sfbd05222:7410
MSG 0.0 info : connecting host sfbd05213:7410
MSG 0.0 info : connecting host sfbd05216:7410
MSG 0.0 info : connecting host sfbd05217:7410
MSG 0.0 info : connecting host sfbd05224:7410
JOB 0.0 info : started threads 0,1,2,3 on sfbd05221:7410 now known as host 1
JOB 0.0 info : started threads 0,1,2,3 on sfbd05219:7410 now known as host 2
JOB 0.0 info : started threads 0,1,2,3 on sfbd05218:7410 now known as host 3
JOB 0.0 info : started threads 0,1,2,3 on sfbd05220:7410 now known as host 4
JOB 0.0 info : started threads 0,1,2,3 on sfbd05222:7410 now known as host 5
JOB 0.0 info : started threads 0,1,2,3 on sfbd05213:7410 now known as host 6
JOB 0.0 info : started threads 0,1,2,3 on sfbd05216:7410 now known as host 7
JOB 0.0 info : started threads 0,1,2,3 on sfbd05217:7410 now known as host 8
MSG 0.0 fatal 011512: SLAVE 9 DIED
MSG 0.0 fatal 011512: SLAVE 9 DIED
MSG 0.0 info : cleaning up memory mapped frame buffers

C:\Program Files (x86)\Autodesk\mrstand3.7.53-maya2010\bin>

OK, so this basically confirms that the problem isn’t directly related to Deadline.

So the question now is: why does rendering this way result in this error? That might be a question for Autodesk. However, one thing you could try first is to temporarily remove the rayhosts file from this machine (just move it somewhere else, like your desktop) to see if that makes a difference.
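If you would rather script the move so it is easy to undo, something like the following Python sketch would do. The helper name `move_aside` is made up, and both paths are left as parameters because the exact location of rayhosts on your nodes isn't shown in the log.

```python
import shutil
from pathlib import Path


def move_aside(path, backup_dir):
    """Move a file into backup_dir and return its new location.

    Refuses to overwrite an existing file of the same name.
    """
    source = Path(path)
    backup = Path(backup_dir)
    backup.mkdir(parents=True, exist_ok=True)
    target = backup / source.name
    if target.exists():
        raise FileExistsError(f"{target} already exists; not overwriting")
    shutil.move(str(source), str(target))
    return target
```

Calling `move_aside` again with the returned path and the original folder puts the file back once the test render is done.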

Cheers,

- Ryan

Thanks Ryan, I’ll give that a try. I realize this isn’t really a Deadline issue but is somehow related to mental ray failing. I just went onto the render node that failed (05224), restarted mental ray from the Services window, and the render started from the command line.

A little embarrassing. During the last software upgrade, the mapped drive containing the scene and sourceimages files was removed on all render nodes. I should’ve realized that the reason something isn’t working is usually the simplest one, not the most complicated.
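For anyone who hits the same thing, a quick pre-flight check along these lines would probably have caught it. This is only a Python sketch: `missing_paths` is a made-up helper, and the two paths are the ones from the Deadline render arguments earlier in the thread. It should be run on the render node itself, ideally under the same account the render runs as, since mapped drives can differ between accounts.

```python
import os


def missing_paths(paths):
    """Return the paths that do not exist from this machine's point of view."""
    return [p for p in paths if not os.path.exists(p)]


# Paths copied from the Deadline render arguments in the log.
required = [
    r"V:\deadline_test",
    r"V:\Projects\Pueblo_Revisited\Maya\scenes\pueblo_ext_02.mi",
]

for path in missing_paths(required):
    print(f"MISSING: {path}")
```

An empty result means every path was visible from that node at the time of the check.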

Thanks for your help.

Dan