Scheduler Thread - Task “23_1046-1046” could not be found because task has been modified:
current status = Rendering, new status = Queued
current slave = LAPRO0408, new slave =
current frames = 1046-1046, new frames = 1046-1046
Scheduler Thread - Cancelling task…
Scheduler Thread - Task “23_1046-1046” could not be found because task has been modified:
current status = Rendering, new status = Queued
current slave = LAPRO0408, new slave =
current frames = 1046-1046, new frames = 1046-1046
Scheduler Thread - Cancelling task…
Scheduler Thread - Task “23_1046-1046” could not be found because task has been modified:
current status = Rendering, new status = Queued
current slave = LAPRO0408, new slave =
current frames = 1046-1046, new frames = 1046-1046
Scheduler Thread - Cancelling task…
---- 2013/12/03 17:26 ----
Scheduler Thread - Task “23_1046-1046” could not be found because task has been modified:
current status = Rendering, new status = Queued
current slave = LAPRO0408, new slave =
current frames = 1046-1046, new frames = 1046-1046
Scheduler Thread - Cancelling task…
Scheduler Thread - Task “23_1046-1046” could not be found because task has been modified:
current status = Rendering, new status = Queued
current slave = LAPRO0408, new slave =
current frames = 1046-1046, new frames = 1046-1046
Scheduler Thread - Cancelling task…
Connecting to slave log: LAPRO0408
2014-08-19 07:09:00: 0: PYTHON: Success!
2014-08-19 07:09:00: 0: INFO: JobPreLoad.main version: 1.57
2014-08-19 07:09:00: 0: INFO: RUN SCENEJANITOR
2014-08-19 07:49:04: 0: Slave timed out – canceling current task…
2014-08-19 09:38:20: Scheduler Thread - Task “92_1072-1072” could not be found because task has been modified:
2014-08-19 09:38:20: current status = Rendering, new status = Suspended
2014-08-19 09:38:20: current slave = LAPRO0874, new slave =
2014-08-19 09:38:20: current frames = 1072-1072, new frames = 1072-1072
2014-08-19 09:38:20: Scheduler Thread - Cancelling task…
2014-08-19 09:38:41: Scheduler Thread - Task “92_1072-1072” could not be found because task has been modified:
2014-08-19 09:38:41: current status = Rendering, new status = Suspended
2014-08-19 09:38:41: current slave = LAPRO0874, new slave =
2014-08-19 09:38:41: current frames = 1072-1072, new frames = 1072-1072
2014-08-19 09:38:41: Scheduler Thread - Cancelling task…
2014-08-19 09:39:05: Scheduler Thread - Task “92_1072-1072” could not be found because task has been modified:
2014-08-19 09:39:05: current status = Rendering, new status = Suspended
2014-08-19 09:39:05: current slave = LAPRO0874, new slave =
A subprocess hung up and Deadline is unable to handle that. The machine has been hanging for a day now.
In theory, this is something we would be able to detect in version 8 when we sandbox the python environment, since the python code will now be running as a separate process. In fact, this will likely just be caught by the task timeout functionality.
For now though, rather than run os.system, could you use subprocess instead? Then you could give the process a certain amount of time to run before killing it and moving on.
Yeah, I'm refactoring the code now. It probably isn't sufficient to just have a wait; I should probably spawn a subprocess and do my own timeout handling, correct?
Adding a wait might just block the code at a later point?
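The suggested approach might look something like this. This is a minimal sketch in modern Python 3, where `wait()` accepts a timeout directly; on the Python 2 interpreters common at the time, you would instead poll the process in a loop and kill it yourself once the deadline passed. The function name and timeout value are placeholders, not Deadline API:

```python
import subprocess

def run_with_timeout(cmd, timeout_seconds):
    """Run cmd as a child process; kill it if it exceeds the timeout.

    Returns the exit code, or None if the child had to be killed.
    """
    proc = subprocess.Popen(cmd)
    try:
        # Python 3.3+: wait() takes a timeout and raises TimeoutExpired.
        return proc.wait(timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()   # give up on the hung child instead of blocking forever
        proc.wait()   # reap it so no zombie process is left behind
        return None

# Example: a child that outlives the timeout gets killed.
result = run_with_timeout(['sleep', '10'], timeout_seconds=1)
print(result)  # None -> the child was killed after the timeout
```

The key difference from `os.system` is that the parent keeps control: a hung child costs at most `timeout_seconds`, after which the wrapper can log the failure and move on.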
This is happening very often, by the way, on various jobs (Nuke, Maya, Max, etc.). Any subprocess we start can basically hang Deadline indefinitely, ultimately making Deadline unreliable as a render manager, since it can't really manage the jobs it's starting.
That encapsulated Python environment is more important, I think, than you guys realize.
Anyway, this wasn't a subprocess; all of this happened within Nuke:
2014-08-20 16:11:11: 0: STDOUT: Wed Aug 20 16:11:10 2014 (+0ms) :(localizeReads): Nuke localizer version: 0.52 : Localizing nodes...
2014-08-20 16:11:11: 0: STDOUT: removing localizePathFilter filter
2014-08-20 16:11:11: 0: STDOUT: Wed Aug 20 16:11:10 2014 (+0ms) :(localizeReads): Frame Range: 1060-1063, 1
2014-08-20 16:11:11: 0: STDOUT: Wed Aug 20 16:11:10 2014 (+0ms) :(localizeReads): Root Frame: 1067
2014-08-20 16:11:11: 0: STDOUT: Wed Aug 20 16:11:10 2014 (+0ms) :(localizeReads): Nuke localizer version: calling localizeNodes()
2014-08-20 16:11:11: 0: STDOUT: WritePath found: //inferno2/projects/gold/scenes/SHR_shr_rsrc/2d/slapcomps/SHR_shr_rsrc_2d_slapcomps_vehbrigeaglea_v0005/linear_2112x1188x1/SHR_shr_rsrc_2d_slapcomps_vehbrigeaglea_v0005.1060.exr
2014-08-20 16:11:11: 0: STDOUT: WritePath found: //inferno2/projects/gold/scenes/SHR_shr_rsrc/2d/slapcomps/SHR_shr_rsrc_2d_slapcomps_vehbrigeaglea_v0005/linear_2112x1188x1/SHR_shr_rsrc_2d_slapcomps_vehbrigeaglea_v0005.1061.exr
2014-08-20 16:11:11: 0: STDOUT: WritePath found: //inferno2/projects/gold/scenes/SHR_shr_rsrc/2d/slapcomps/SHR_shr_rsrc_2d_slapcomps_vehbrigeaglea_v0005/linear_2112x1188x1/SHR_shr_rsrc_2d_slapcomps_vehbrigeaglea_v0005.1062.exr
2014-08-20 16:11:11: 0: STDOUT: WritePath found: //inferno2/projects/gold/scenes/SHR_shr_rsrc/2d/slapcomps/SHR_shr_rsrc_2d_slapcomps_vehbrigeaglea_v0005/linear_2112x1188x1/SHR_shr_rsrc_2d_slapcomps_vehbrigeaglea_v0005.1063.exr
2014-08-20 16:26:59: 0: Slave timed out -- canceling current task...
2014-08-20 16:47:05: Scheduler Thread - Task "15_1060-1063" could not be found because task has been modified:
2014-08-20 16:47:05: current status = Rendering, new status = Failed
2014-08-20 16:47:05: current slave = LAPRO1344-secondary, new slave =
2014-08-20 16:47:05: current frames = 1060-1063, new frames = 1060-1063
2014-08-20 16:47:05: Scheduler Thread - Cancelling task...
2014-08-20 16:47:29: Scheduler Thread - Task "15_1060-1063" could not be found because task has been modified:
2014-08-20 16:47:29: current status = Rendering, new status = Failed
2014-08-20 16:47:29: current slave = LAPRO1344-secondary, new slave =
2014-08-20 16:47:29: current frames = 1060-1063, new frames = 1060-1063
2014-08-20 16:47:29: Scheduler Thread - Cancelling task...
2014-08-20 16:47:53: Scheduler Thread - Task "15_1060-1063" could not be found because task has been modified:
2014-08-20 16:47:53: current status = Rendering, new status = Failed
2014-08-20 16:47:53: current slave = LAPRO1344-secondary, new slave =
2014-08-20 16:47:53: current frames = 1060-1063, new frames = 1060-1063
2014-08-20 16:47:53: Scheduler Thread - Cancelling task...
2014-08-20 16:48:15: Scheduler Thread - Task "15_1060-1063" could not be found because task has been modified:
2014-08-20 16:48:15: current status = Rendering, new status = Failed
2014-08-20 16:48:15: current slave = LAPRO1344-secondary, new slave =
2014-08-20 16:48:15: current frames = 1060-1063, new frames = 1060-1063
The machine had been hanging for hours before someone noticed.
It’s our top priority for version 8, and we’ve already started the design phase for it. Unfortunately, this is a MAJOR change, so it’s not something we can quickly squeeze into v7. It even has us considering a refactoring of the Deadline plugin architecture.
We completely understand that this is a major inconvenience, but we have to implement a change like this properly, otherwise we risk destabilizing things even more.
Sadly, we now have to have scripts like this (to reload modules constantly):
import scl.pipeline.scanPath as scanPath
reload(scanPath)
import scl.nuke.publishUtils as publishUtils
reload(publishUtils)
import scl.shotgun.createVersion as createVersion
reload(createVersion)
import scl.nuke.makeProxyQuicktime as makeProxyQuicktime
reload(makeProxyQuicktime)
from scl.pipeline import scLog
reload(scLog)
import scanline.Paths as Paths
reload(Paths)
import scl.nuke.getSequence as getSequence
reload(getSequence)
import scl.shotgun.getsettings as getsettings
reload(getsettings)
import shootBotLib2
reload(shootBotLib2)
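The repeated import/reload pairs above could at least be collapsed into a small helper. This is a hypothetical Python 3 sketch (`importlib.reload`; in Python 2, `reload()` is a builtin), and `force_reload` is an invented name, not part of the pipeline above:

```python
import importlib

def force_reload(module_names):
    """Import each named module and reload it, discarding any stale
    state left over from a previous job in the same long-lived
    interpreter session."""
    for name in module_names:
        mod = importlib.import_module(name)
        importlib.reload(mod)

# The pipeline modules from the script above would be listed here, e.g.:
# force_reload(['scl.pipeline.scanPath', 'scl.nuke.publishUtils', ...])
```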
And scripts like this (to clear leftover env variables from previous jobs):
import os
import re

def unsetEnvironmentVariables(deadlinePlugin):
    deadlinePlugin.LogInfo('JobPreLoad.unsetEnvironmentVariables')
    oSclConfigRegex = re.compile(r'^(scl_config_)(.*?)$', re.IGNORECASE)
    # Copy the keys so we can delete from os.environ while iterating.
    for key in list(os.environ.keys()):
        if oSclConfigRegex.match(key):
            deadlinePlugin.LogInfo('Delete: %s' % key)
            del os.environ[key]
    if 'PYTHONPATH' in os.environ:
        del os.environ['PYTHONPATH']
Band-aids all over the place…
Imagine you used 3ds Max, but you couldn't shut it down. You opened it on Monday and kept it running for two or three weeks straight, while working on various shows, scenes, and versions in the same session. And you couldn't reset your scene; instead you would just delete all the objects and then merge in the next scene. It would leave the material editor entries there, global variables, script configs, etc. So much sadness.