We have a set of workers shutting themselves down after what seems to be 12 to 24 hours of running. This is new behavior for them; they have been working fine for the last 1-2 years. We run Deadline as a service, and the service itself is getting a stop command, but I don't know where it is coming from. As far as I know, we are not running Pulse, and we do not have power options turned on to shut the slaves down if they go idle for a period of time.
It's about 8 of our 22 machines. Is there a setting I can check? Would it be a Worker setting or a repository setting? Since it's only a small number of machines, I don't suspect a global setting, or our entire farm would be affected.
Are they Linux or Windows? Do they have any OS-level controls set for shutdown?
Are they possibly overheating and shutting down? Power issues?
Do users have the option to shut down the machines? Can you check the Worker logs around shutdown time? (Rough sketch for that below.)
Is it always the same 8 nodes, or any 8 of the 22?
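If it helps, here is a rough Python sketch for pulling shutdown-related lines out of the Launcher logs on one of the affected nodes. The log folder path and the "deadlinelauncher*" file name pattern are assumptions based on a default Deadline 10 install on Windows, so adjust them if your setup differs:

# Rough sketch: scan the local Deadline Launcher logs for shutdown commands.
# LOG_DIR and the log file name pattern are assumptions (default Deadline 10
# install on Windows); adjust if your logs live somewhere else.
from pathlib import Path

LOG_DIR = Path(r"C:\ProgramData\Thinkbox\Deadline10\logs")
KEYWORDS = ("ForceStopSlave", "StopSlave", "StopLauncher", "Shutting down")

for log_file in sorted(LOG_DIR.glob("deadlinelauncher*.log")):
    for line in log_file.read_text(errors="ignore").splitlines():
        if any(keyword in line for keyword in KEYWORDS):
            print(f"{log_file.name}: {line}")

Running that on one good node and one bad node and comparing the timestamps of the stop commands should show whether the shutdowns line up with anything scheduled.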
Yes, it is always the same machines. I don't suspect power issues, as the machines stay on; it's just the service that is shutting down. These machines have been solid for the last year plus, so this behavior is recent.
They are all Windows machines. I will check the logs to see if I can see anything.
Here is the most recent log entry. Is there a setting I can check to see what is sending this shutdown command? I feel like I've been all through the Monitor settings but can't find anything that could be sending it. The same command shows up on all 8 of the machines that shut down.
2022-03-09 23:08:42: Deadline Launcher 10.1 [v10.1.17.4 Release (d3559fe75)]
2022-03-09 23:08:42: Launcher Thread - ::1 has connected
2022-03-09 23:08:42: Launcher Thread - Received command: ForceStopSlave
2022-03-09 23:08:42: Sending command to Worker on port 63119: StopSlave
2022-03-09 23:08:42: Got reply: Connection Accepted.
2022-03-09 23:08:48: Launcher Thread - Responded with: Success
2022-03-09 23:08:48: Launcher Thread - ::1 has connected
2022-03-09 23:08:48: Launcher Thread - Received command: StopPulse
2022-03-09 23:08:48: No Pulse to shutdown
2022-03-09 23:08:48: Launcher Thread - Responded with: Success
2022-03-09 23:08:48: Launcher Thread - ::1 has connected
2022-03-09 23:08:48: Launcher Thread - Received command: StopBalancer
2022-03-09 23:08:48: No Balancer to shutdown
2022-03-09 23:08:48: Launcher Thread - Responded with: Success
2022-03-09 23:08:48: Launcher Thread - ::1 has connected
2022-03-09 23:08:48: Launcher Thread - Received command: StopMonitor
2022-03-09 23:08:48: No Monitor to shutdown
2022-03-09 23:08:48: Launcher Thread - Responded with: Success
2022-03-09 23:08:48: Launcher Thread - ::1 has connected
2022-03-09 23:08:48: Launcher Thread - Received command: StopRemoteConnectionServer
2022-03-09 23:08:48: No Remote Connection Server to shutdown
2022-03-09 23:08:48: Launcher Thread - Responded with: Success
2022-03-09 23:08:49: Launcher Thread - ::1 has connected
2022-03-09 23:08:49: Launcher Thread - Received command: StopLicenseForwarder
2022-03-09 23:08:49: No License Forwarder to shutdown
2022-03-09 23:08:49: Launcher Thread - Responded with: Success
2022-03-09 23:08:49: Launcher Thread - ::1 has connected
2022-03-09 23:08:49: Launcher Thread - Received command: StopWebService
2022-03-09 23:08:49: No Web Service to shutdown
2022-03-09 23:08:49: Launcher Thread - Responded with: Success
2022-03-09 23:08:49: Launcher Thread - ::1 has connected
2022-03-09 23:08:49: Launcher Thread - Received command: StopLauncher
2022-03-09 23:08:49: Shutting down
2022-03-09 23:08:49: Launcher Thread - OnConnect: Listener Socket has been closed.
2022-03-09 23:08:49: Launcher Thread - OnConnect: Listener Socket has been closed.
Here is the slave log as well; the above is the service log.
2022-03-09 23:08:42: Info Thread - requesting Worker info thread quit.
2022-03-09 23:08:42: 0: RenderThread CancelCurrentTask called, will transition from state None to None
2022-03-09 23:08:42: 0: In the process of canceling current task: ignoring exception thrown by PluginLoader
2022-03-09 23:08:42: 0: Executing plugin command of type 'Cancel Task'
2022-03-09 23:08:42: 0: Done executing plugin command of type 'Cancel Task'
2022-03-09 23:08:42: 0: RenderThread CancelCurrentTask called, will transition from state None to None
2022-03-09 23:08:42: 0: Executing plugin command of type 'Cancel Task'
2022-03-09 23:08:42: 0: Done executing plugin command of type 'Cancel Task'
2022-03-09 23:08:43: Info Thread - shutdown complete
2022-03-09 23:08:44: 0: Done executing plugin command of type 'Render Task'
2022-03-09 23:08:44: 0: Error in EndJob: Error: FailRenderException : Received cancel task command from Deadline.
2022-03-09 23:08:44: at Deadline.Plugins.DeadlinePlugin.FailRender(String message) (Python.Runtime.PythonException)
2022-03-09 23:08:44: File "C:\ProgramData\Thinkbox\Deadline10\workers\XXXXXXXX\plugins\621e91aa4065f20ebc270045\VraySpawner.py", line 144, in RenderTasks
2022-03-09 23:08:44: self.FailRender( "Received cancel task command from Deadline." )
2022-03-09 23:08:44: at Python.Runtime.Dispatcher.Dispatch(ArrayList args)
2022-03-09 23:08:44: at __FranticX_GenericDelegate0Dispatcher.Invoke()
2022-03-09 23:08:44: at Deadline.Plugins.DeadlinePlugin.RenderTasks()
2022-03-09 23:08:44: at Deadline.Plugins.DeadlinePlugin.DoRenderTasks()
2022-03-09 23:08:44: at Deadline.Plugins.PluginWrapper.RenderTasks(Task task, String& outMessage, AbortLevel& abortLevel)
2022-03-09 23:08:44: at Deadline.Plugins.PluginWrapper.RenderTasks(Task task, String& outMessage, AbortLevel& abortLevel)
2022-03-09 23:08:46: 0: Error occurred when checking sandbox stdout: Cannot check if stdout is available for a process that has not been launched.
2022-03-09 23:08:46: Scheduler Thread - shutdown complete
2022-03-09 23:08:47: Error occurred when checking sandbox stdout: Cannot check if stdout is available for a process that has not been launched.
2022-03-09 23:08:48: Worker - Final cleanup
2022-03-09 23:08:48: Worker - Shutdown complete
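One thing that stands out in the Launcher log: the ForceStopSlave command comes in from ::1, i.e. localhost, so whatever is issuing the stop is running on the machine itself (a scheduled task, another Deadline component, or a remote-admin tool triggering the Launcher locally) rather than reaching in over the network. If you want to catch it in the act, here is a minimal Python sketch that polls local TCP connections to the Launcher's listening port and prints the owning process. It assumes psutil is installed (pip install psutil), and LAUNCHER_PORT is an assumption; check your Launcher configuration and adjust it to whatever port your Launcher actually listens on.

# Minimal watcher: log which local process connects to the Deadline Launcher
# port. LAUNCHER_PORT is an assumption; set it to your Launcher's listening
# port. Requires psutil (pip install psutil).
import time
import psutil

LAUNCHER_PORT = 17000  # adjust to your Launcher listening port

seen = set()
while True:
    for conn in psutil.net_connections(kind="tcp"):
        # Only connections *to* the Launcher port with a resolvable owner PID.
        if conn.raddr and conn.raddr.port == LAUNCHER_PORT and conn.pid:
            key = (conn.pid, conn.laddr.port)
            if key not in seen:
                seen.add(key)
                try:
                    proc_name = psutil.Process(conn.pid).name()
                except psutil.NoSuchProcess:
                    proc_name = "<exited>"
                print(time.strftime("%Y-%m-%d %H:%M:%S"), conn.pid, proc_name)
    time.sleep(1)

The connection in your log is very short-lived, so a one-second poll may miss it; tighten the sleep or use something like TCPView if you can leave it running around the 23:00 window. Even catching it once on one of the 8 machines should narrow down what is sending the stop.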