Hello,
I have an issue with Deadline and Arnold when rendering on the GPU.
The render randomly fails with this error:
2021-05-31 12:23:32:  2: STDOUT: 00:00:13   428MB ERROR   | [gpu] no compatible NVIDIA GPUs found (GPU-rendering)
My server has 8 GPUs (8x RTX 2080 Ti) and runs CentOS 7 with Deadline 10.1 (the latest version at the time of writing).
Any idea what could cause this?
Here is the full log:
=======================================================
Error
FailRenderException : ERROR : RenderBox.LDTArnoldRender.task : Render aborted
at Deadline.Plugins.DeadlinePlugin.FailRender(String message) (Python.Runtime.PythonException)
File “/var/lib/Thinkbox/Deadline10/workers/hawkins/plugins/60b4b7cded2ed4683cc20d25/Gaffer.py”, line 189, in HandleGafferError
self.FailRender(self.GetRegexMatch(0))
at Python.Runtime.Dispatcher.Dispatch(ArrayList args)
at __FranticX_Processes_ManagedProcess_StdoutHandlerDelegateDispatcher.Invoke()
at FranticX.Processes.ManagedProcess.RegexHandlerCallback.CallFunction()
at FranticX.Processes.ManagedProcess.e(String cj, Boolean ck)
at FranticX.Processes.ManagedProcess.Execute(Boolean waitForExit)
at Deadline.Plugins.DeadlinePlugin.DoRenderTasks()
at Deadline.Plugins.PluginWrapper.RenderTasks(Task task, String& outMessage, AbortLevel& abortLevel)
at Deadline.Plugins.PluginWrapper.RenderTasks(Task task, String& outMessage, AbortLevel& abortLevel)
=======================================================
Type
RenderPluginException
=======================================================
Stack Trace
at Deadline.Plugins.SandboxedPlugin.d(DeadlineMessage bep, CancellationToken beq)
at Deadline.Plugins.SandboxedPlugin.RenderTask(Task task, CancellationToken cancellationToken)
at Deadline.Slaves.SlaveRenderThread.c(TaskLogWriter ajr, CancellationToken ajs)
=======================================================
Log
2021-05-31 12:23:13:  2: Loading Job’s Plugin timeout is Disabled
2021-05-31 12:23:13:  2: WARNING: Python version for ‘Gaffer’ plugin is not specified! Defaulting to Py2.
2021-05-31 12:23:13:  2: SandboxedPlugin: Render Job As User disabled, running as current user ‘william’
2021-05-31 12:23:16:  2: Secrets Management feature is enabled.
2021-05-31 12:23:16:  2: Executing plugin command of type ‘Initialize Plugin’
2021-05-31 12:23:16:  2: INFO: Executing plugin script ‘/var/lib/Thinkbox/Deadline10/workers/hawkins/plugins/60b4b7cded2ed4683cc20d25/Gaffer.py’
2021-05-31 12:23:16:  2: PYTHON: selectGPUDevices :
2021-05-31 12:23:16:  2: PYTHON: startingGPU : 2
2021-05-31 12:23:16:  2: PYTHON: startIndex : 2
2021-05-31 12:23:16:  2: PYTHON: endIndex : 3
2021-05-31 12:23:16:  2: PYTHON: resultGPUs 1 : [‘2’]
2021-05-31 12:23:16:  2: INFO: The Worker is overriding the GPUs to render, so the following GPUs will be used: 2
2021-05-31 12:23:16:  2: PYTHON: startIndex 2
2021-05-31 12:23:16:  2: PYTHON: used GPUs : [‘2’, ‘2’]
2021-05-31 12:23:16:  2: INFO: GPUs per task is greater than 0, so the following GPUs will be used: 2,2
2021-05-31 12:23:16:  2: PYTHON: GPUs return : [‘2’, ‘2’]
2021-05-31 12:23:16:  2: INFO: About: Gaffer for Deadline
2021-05-31 12:23:16:  2: INFO: The job’s environment will be merged with the current environment before rendering
2021-05-31 12:23:16:  2: Done executing plugin command of type ‘Initialize Plugin’
2021-05-31 12:23:16:  2: Start Job timeout is disabled.
2021-05-31 12:23:16:  2: Task timeout is disabled.
2021-05-31 12:23:16:  2: Loaded job: RenderBox.LDTArnoldRender (60b4b7cded2ed4683cc20d25)
2021-05-31 12:23:16:  2: Executing plugin command of type ‘Start Job’
2021-05-31 12:23:16:  2: DEBUG: S3BackedCache Client is not installed.
2021-05-31 12:23:16:  2: INFO: Executing global asset transfer preload script ‘/var/lib/Thinkbox/Deadline10/workers/hawkins/plugins/60b4b7cded2ed4683cc20d25/GlobalAssetTransferPreLoad.py’
2021-05-31 12:23:16:  2: INFO: Looking for legacy (pre-10.0.26) AWS Portal File Transfer…
2021-05-31 12:23:16:  2: INFO: Looking for legacy (pre-10.0.26) File Transfer controller in /opt/Thinkbox/S3BackedCache/bin/task.py…
2021-05-31 12:23:16:  2: INFO: Could not find legacy (pre-10.0.26) AWS Portal File Transfer.
2021-05-31 12:23:16:  2: INFO: Legacy (pre-10.0.26) AWS Portal File Transfer is not installed on the system.
2021-05-31 12:23:16:  2: Done executing plugin command of type ‘Start Job’
2021-05-31 12:23:16:  2: Plugin rendering frame(s): 11-15
2021-05-31 12:23:16:  2: Executing plugin command of type ‘Render Task’
2021-05-31 12:23:16:  2: INFO: Performing path mapping
2021-05-31 12:23:17:  2: CheckPathMapping: Swapped "__children[“SCENE_MANAGEMENT_TOOLS”][“Instancer”][“Backdrop1”][“description”].setValue( “Here is another way to set up the instancer when you only have a single prototype. You don’t need MeshToPoints. If you were wanting to use the prototypeRoots primvar, you also need to an int prototypeIndex vertex primvar too. See:\n\nhttp://www.gafferhq.org/documentation/0.57.0.0/Reference/NodeReference/GafferScene/Instancer.html#prototypemode” )
2021-05-31 12:23:17:  2: " with "__children[“SCENE_MANAGEMENT_TOOLS”][“Instancer”][“Backdrop1”][“description”].setValue( “Here is another way to set up the instancer when you only have a single prototype. You don’t need MeshToPoints. If you were wanting to use the prototypeRoots primvar, you also need to an int prototypeIndex vertex primvar too. See:\n\nhtt/mnt/Projects_f//www.gafferhq.org/documentation/0.57.0.0/Reference/NodeReference/GafferScene/Instancer.html#prototypemode” )
2021-05-31 12:23:17:  2: "
2021-05-31 12:23:17:  2: INFO: Stdout Redirection Enabled: True
2021-05-31 12:23:17:  2: INFO: Asynchronous Stdout Enabled: False
2021-05-31 12:23:17:  2: INFO: Stdout Handling Enabled: True
2021-05-31 12:23:17:  2: INFO: Popup Handling Enabled: False
2021-05-31 12:23:17:  2: INFO: Using Process Tree: True
2021-05-31 12:23:17:  2: INFO: Hiding DOS Window: True
2021-05-31 12:23:17:  2: INFO: Creating New Console: False
2021-05-31 12:23:17:  2: INFO: Running as user: william
2021-05-31 12:23:17:  2: INFO: Executable: “/opt/gaffer-0.59.8.0-linux-python2/bin/gaffer”
2021-05-31 12:23:17:  2: INFO: Argument: execute -script “/var/lib/Thinkbox/Deadline10/workers/hawkins/jobsData/60b4b7cded2ed4683cc20d25/thread2_temp8gFaz0/projectLighting_v001.gfr” -nodes RenderBox.LDTArnoldRender -frames 11-15 -context “-shot:layer” “‘beauty’” “-dispatcher:jobDirectory” “’/mnt/Projects_b/0079_VisonR_GERLAIN_2021-05-03_Kaiz3r/3D/dispatcher/deadline/projectLighting_v001/000046’” “-dispatcher:scriptFileName” “’/mnt/Projects_b/0079_VisonR_GERLAIN_2021-05-03_Kaiz3r/3D/dispatcher/deadline/projectLighting_v001/000046/projectLighting_v001.gfr’”
2021-05-31 12:23:17:  2: INFO: Full Command: “/opt/gaffer-0.59.8.0-linux-python2/bin/gaffer” execute -script “/var/lib/Thinkbox/Deadline10/workers/hawkins/jobsData/60b4b7cded2ed4683cc20d25/thread2_temp8gFaz0/projectLighting_v001.gfr” -nodes RenderBox.LDTArnoldRender -frames 11-15 -context “-shot:layer” “‘beauty’” “-dispatcher:jobDirectory” “’/mnt/Projects_b/0079_VisonR_GERLAIN_2021-05-03_Kaiz3r/3D/dispatcher/deadline/projectLighting_v001/000046’” “-dispatcher:scriptFileName” “’/mnt/Projects_b/0079_VisonR_GERLAIN_2021-05-03_Kaiz3r/3D/dispatcher/deadline/projectLighting_v001/000046/projectLighting_v001.gfr’”
2021-05-31 12:23:17:  2: INFO: Startup Directory: “/opt/gaffer-0.59.8.0-linux-python2/bin”
2021-05-31 12:23:17:  2: INFO: Process Priority: BelowNormal
2021-05-31 12:23:17:  2: INFO: Process Affinity: default
2021-05-31 12:23:17:  2: INFO: Process is now running
2021-05-31 12:23:22:  2: STDOUT: WARNING : RendererAlgo::CameraOutput : Camera missing for location “/SQ002/cameras/SQ002_005_CAM” at frame 11
2021-05-31 12:23:22:  2: STDOUT: WARNING : RendererAlgo::CameraOutput : Camera missing for location “/SQ002/cameras/SQ002_001_CAM” at frame 11
2021-05-31 12:23:22:  2: STDOUT: WARNING : RendererAlgo::CameraOutput : Camera missing for location “/SQ002/cameras/SQ002_003_CAM” at frame 11
2021-05-31 12:23:22:  2: STDOUT: WARNING : RendererAlgo::CameraOutput : Camera missing for location “/SQ002/cameras/SQ002_004_CAM” at frame 11
2021-05-31 12:23:22:  2: STDOUT: WARNING : RendererAlgo::CameraOutput : Camera missing for location “/SQ002/cameras/SQ002_002_CAM” at frame 11
2021-05-31 12:23:32:  2: STDOUT: 00:00:13   428MB WARNING | rendering with watermarks because of failed authorization:
2021-05-31 12:23:32:  2: STDOUT: 00:00:13   428MB         |  [clm.v2] timeout before callback was called
2021-05-31 12:23:32:  2: STDOUT: 00:00:13   428MB         |  environment variables:
2021-05-31 12:23:32:  2: STDOUT: 00:00:13   428MB         |         ARNOLD_LICENSE_ORDER   = (not set)
2021-05-31 12:23:32:  2: STDOUT: 00:00:13   428MB         |         ARNOLD_LICENSE_MANAGER = (not set)
2021-05-31 12:23:32:  2: STDOUT: 00:00:13   428MB         |  [rlm]  solidangle_LICENSE     = (not set)
2021-05-31 12:23:32:  2: STDOUT: 00:00:13   428MB         |  [rlm]  RLM_LICENSE            = (not set)
2021-05-31 12:23:32:  2: STDOUT: 00:00:13   428MB         |  [clm]  ADSKFLEX_LICENSE_FILE  = (not set)
2021-05-31 12:23:32:  2: STDOUT: 00:00:13   428MB         |  [clm]  LM_LICENSE_FILE        = (not set)
2021-05-31 12:23:32:  2: STDOUT: 00:00:13   428MB ERROR   | [gpu] no compatible NVIDIA GPUs found (GPU-rendering)
2021-05-31 12:23:32:  2: STDOUT: 00:00:13   443MB WARNING | Aborted by user:  received abort signal
2021-05-31 12:23:33:  2: Sending kill command to process gaffer with id: 118162
2021-05-31 12:23:33:  2: Done executing plugin command of type ‘Render Task’
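
One detail that stands out in the log above: the plugin reports "used GPUs : [‘2’, ‘2’]" and "the following GPUs will be used: 2,2", so the same device index is listed twice. I don't know whether that is related to the "no compatible NVIDIA GPUs found" error, but it may be worth ruling out. Below is a minimal sketch of how the GPU list could be de-duplicated before it is handed to the render process. The function names are hypothetical and not taken from Gaffer.py, and exporting the result through CUDA_VISIBLE_DEVICES is an assumption about how the devices get restricted, not something the log confirms.

```python
import os


def unique_gpu_ids(gpu_ids):
    """Return the GPU ids with duplicates removed, keeping first-seen order.

    Hypothetical helper: not part of the Deadline or Gaffer plugin API.
    """
    seen = set()
    result = []
    for gpu_id in gpu_ids:
        gpu_id = str(gpu_id).strip()
        if gpu_id and gpu_id not in seen:
            seen.add(gpu_id)
            result.append(gpu_id)
    return result


def visible_devices_value(gpu_ids):
    """Build a comma-separated device list, e.g. for CUDA_VISIBLE_DEVICES."""
    return ",".join(unique_gpu_ids(gpu_ids))


if __name__ == "__main__":
    # Values copied from the "used GPUs" line in the log above.
    used_gpus = ["2", "2"]

    # Assumption: the render process is restricted to these devices through
    # the CUDA_VISIBLE_DEVICES environment variable. The log does not show
    # how the plugin actually passes the selection on.
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = visible_devices_value(used_gpus)
    print(env["CUDA_VISIBLE_DEVICES"])
```

With the values from the log, the list collapses to a single entry for GPU 2. If the error still appears at random with a clean list, the duplicate index is probably not the cause.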