RunManagedProcess fails with WindowsError: [Errno 22]

I’m running an advanced-type plugin on CentOS slaves (Deadline 5.0.0.44528).
RunManagedProcess fails with the following error:

WindowsError: [Errno 22] ApplicationName='/Apps/Maya/linux64/2011_SAP/bin/Render', CommandLine=' -r file -mr:art -s 495 -e 504 -proj //hal/Projects/SomeProj/Maya //hal/Projects/SomeProj/Maya/scenes/someScene.mb', CurrentDirectory='//hal/Projects/SomeProj/Maya/scenes'

The construction of the PluginProcess object and the call to RunManagedProcess() are enclosed in a try/except block, in case that is relevant:

	try:
		proc = mfMaya3PluginProcess()
		# Pick the platform-specific name of the Maya command-line renderer.
		if IsRunningOnWindows():
			proc.cmdline = SearchPath("render.exe")
		else:
			proc.cmdline = SearchPath("Render")
		proc.parline = mayaCmdParameters
		proc.startupdirectory = os.path.dirname(self.mayaSceneFile)

		LogInfo("Render " + mayaCmdParameters)
		RunManagedProcess(proc)

	except Exception:
		# Log the full traceback, run the post-task cleanup, then fail the task.
		LogInfo("Exception in mf_RenderTasks - RunManagedProcess()")
		errStr = traceback.format_exc()
		LogInfo(errStr)
		self.mf_postTask()
		FailRender(errStr)

The failure is intermittent, but when it occurs it hits all render nodes at once, and I have to reboot them.
The one non-standard thing I am doing is setting environment variables on a per-job basis (including PATH), but that seems to work fine.

What is WindowsError: [Errno 22]?
Any thoughts on that?

thanks

We think this problem is a result of the Deadline Slave process running out of process handles. We recently discovered how this problem could occur, and it should be addressed in the upcoming 5.1 release (which is currently in beta).

We believe the problem occurs when the machine’s resources are maxed out (for example, during a heavy render). Deadline 5.0 uses external processes like “top” to gather resource usage information, and normally these processes run quickly and get cleaned up. However, if the system is getting hammered, these processes may not return in a timely fashion and are left running. Eventually, the problem snowballs to the point where Deadline simply has too many external processes running, and errors like this start to occur.
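If you want to check whether lingering child processes are piling up on a node, something like the following could be run on the render node itself. This is only a rough diagnostic sketch in standalone Python 3, not part of the Deadline API: the `count_children` helper is a hypothetical name, it assumes a Linux-style /proc filesystem (as on CentOS), and you would have to look up the slave's PID yourself.

```python
import os

def count_children(parent_pid, proc_root="/proc"):
    """Count processes whose parent is parent_pid.

    Returns None when no /proc filesystem is available (assumption:
    Linux-style /proc, as found on CentOS).
    """
    if not os.path.isdir(proc_root):
        return None
    count = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat"), errors="replace") as f:
                stat = f.read()
        except (IOError, OSError):
            continue  # the process exited while we were scanning
        # Format is "pid (comm) state ppid ..."; comm may contain spaces or
        # parentheses, so split on the last ")" before reading the fields.
        fields = stat.rsplit(")", 1)[-1].split()
        if len(fields) >= 2 and fields[1] == str(parent_pid):
            count += 1
    return count
```

A count that keeps growing between checks while the node is under heavy load would be consistent with the explanation above, although it would not prove it.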

In 5.1, there are two improvements that should theoretically fix this problem:

  1. Many of these external process calls have been replaced by system library calls. This greatly reduces the number of external processes the slave starts in the first place.
  2. If the remaining processes don’t return in a timely fashion, they are killed.
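To illustrate the idea behind the second point, here is a minimal sketch of running an external command with a time limit and killing it if it does not return. It is standalone Python 3 using the standard `subprocess` module, not how Deadline 5.1 actually implements this; the `run_with_timeout` helper and the timeout value are assumptions made for the example.

```python
import subprocess

def run_with_timeout(args, timeout_seconds):
    """Run an external command, killing it if it exceeds timeout_seconds.

    Returns (returncode, stdout_bytes), or (None, b"") if the command
    had to be killed because it did not finish in time.
    """
    try:
        completed = subprocess.run(
            args,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child and waits for it before raising,
        # so no stray process is left behind at this point.
        return None, b""
    return completed.returncode, completed.stdout
```

For example, `run_with_timeout(["top", "-b", "-n", "1"], 5)` would take one batch-mode sample from top and give up after five seconds instead of leaving the process running.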

If you would like to join the 5.1 beta, see here for more info:
viewtopic.php?f=10&t=5919

Cheers,

- Ryan

Hello Russel,
I’m too busy right now to roll out the 5.1 beta, so I’d rather wait for the final release.
Thanks for the clarification and the quick answer.

All the best :)