AWS Thinkbox Discussion Forums

job changed, but not really

We have an odd situation happening, where a slave picks up a job, then keeps working on its tasks, but sometimes thinks that the job changed, so it restarts max.

First frame:

2016-05-28 11:35:07:  Scanline: OnSlaveStartingJob sSlaveName LAPRO0889
2016-05-28 11:35:07:  Scanline: OnSlaveStartingJob sJobID 5749e4bb0294998b088060be
2016-05-28 11:35:07:  Scanline: RemoveJobFolderContents event called LAPRO0889
2016-05-28 11:35:07:  Scanline: cleaning sJobsDataFolder: C:\Users\scanlinevfx\AppData\Local\Thinkbox\Deadline8\slave\LAPRO0889\jobsData
2016-05-28 11:35:07:  Scanline: removing : C:\Users\scanlinevfx\AppData\Local\Thinkbox\Deadline8\slave\LAPRO0889\jobsData\5748e40efdc3803dc028c4d7
2016-05-28 11:35:08:  Scheduler Thread - Synchronization time for job files: 78.000 ms
2016-05-28 11:35:08:  Syncronizing Plugin 3dsmax from \\inferno2\deadline\repository8\plugins\3dsmax
2016-05-28 11:35:09:  Syncronization complete. Took: 1 seconds
2016-05-28 11:35:11:  0: Got task!
2016-05-28 11:35:11:  0: Plugin will be reloaded because a new job has been loaded, or one of the job files or plugin files has been modified
2016-05-28 11:35:11:  0: Loaded plugin 3dsmax (\\inferno2\deadline\repository8\plugins\3dsmax)
2016-05-28 11:35:11:  0: INFO: Executing plugin script 'C:\Users\scanlinevfx\AppData\Local\Thinkbox\Deadline8\slave\lapro0889\plugins\5749e4bb0294998b088060be\3dsmax.py'
2016-05-28 11:35:12:  0: INFO: About: 3dsmax Plugin for Deadline
2016-05-28 11:35:12:  0: INFO: The job's environment will be merged with the current environment before rendering
2016-05-28 11:35:12:  0: Start Job timeout is disabled.
2016-05-28 11:35:12:  0: Task timeout is 21600 seconds (Regular Task Timeout)
2016-05-28 11:35:12:  0: Loaded job: [SHALLOWSLA] BF_124_0180_v0079_str_Tryagain_cache_flowline_fluid_0 (5749e4bb0294998b088060be)

End of first frame, dequeuing the second frame:

2016-05-28 11:37:11:  0: Render time for frame(s): 1.997 m
2016-05-28 11:37:11:  0: Total time for task: 2.136 m
2016-05-28 11:37:11:  0: Saving task log...
2016-05-28 11:37:13:  Scheduler Thread - Render Thread 0 completed its task
2016-05-28 11:37:13:  Scheduler Thread - Seconds before next job scan: 2
2016-05-28 11:37:15:  Scheduler - Performing Job scan on Primary Pools with scheduling order Pool, Weighted, Balanced
2016-05-28 11:37:15:  Scheduler - Using enhanced scheduler balancing
2016-05-28 11:37:15:  Scheduler - Successfully dequeued 1 task(s).  Returning.
2016-05-28 11:37:15:  Scheduler - Returning limit stubs not in use.
2016-05-28 11:37:15:  Scanline: OnSlaveStartingJob sSlaveName LAPRO0889
2016-05-28 11:37:15:  Scanline: OnSlaveStartingJob sJobID 5749e4bb0294998b088060be
2016-05-28 11:37:15:  Scanline: RemoveJobFolderContents event called LAPRO0889
2016-05-28 11:37:15:  Scanline: cleaning sJobsDataFolder: C:\Users\scanlinevfx\AppData\Local\Thinkbox\Deadline8\slave\LAPRO0889\jobsData
2016-05-28 11:37:15:  Scanline: ignoring current jobid: 5749e4bb0294998b088060be
2016-05-28 11:37:15:  Scheduler Thread - All job files are already synchronized
2016-05-28 11:37:16:  0: Got task!
2016-05-28 11:37:16:  0: Plugin already loaded: 3dsmax

End of second frame, dequeing the third frame, but note that it now restarts max, because “Plugin will be reloaded because a new job has been loaded, or one of the job files or plugin files has been modified”…

2016-05-28 11:37:56:  0: Render time for frame(s): 40.261 s
2016-05-28 11:37:56:  0: Total time for task: 41.462 s
2016-05-28 11:37:56:  0: Saving task log...
2016-05-28 11:37:57:  Scheduler Thread - Render Thread 0 completed its task
2016-05-28 11:37:57:  Scheduler Thread - Seconds before next job scan: 2
2016-05-28 11:38:00:  Scheduler - Performing Job scan on Primary Pools with scheduling order Pool, Weighted, Balanced
2016-05-28 11:38:00:  Scheduler - Using enhanced scheduler balancing
2016-05-28 11:38:00:  Scheduler - Successfully dequeued 1 task(s).  Returning.
2016-05-28 11:38:00:  Scheduler - Returning limit stubs not in use.
2016-05-28 11:38:00:  Scanline: OnSlaveStartingJob sSlaveName LAPRO0889
2016-05-28 11:38:00:  Scanline: OnSlaveStartingJob sJobID 5749e4bb0294998b088060be
2016-05-28 11:38:00:  Scanline: RemoveJobFolderContents event called LAPRO0889
2016-05-28 11:38:00:  Scanline: cleaning sJobsDataFolder: C:\Users\scanlinevfx\AppData\Local\Thinkbox\Deadline8\slave\LAPRO0889\jobsData
2016-05-28 11:38:00:  Scanline: ignoring current jobid: 5749e4bb0294998b088060be
2016-05-28 11:38:00:  Scheduler Thread - Synchronization time for job files: 40.000 ms
2016-05-28 11:38:00:  Syncronizing Plugin 3dsmax from \\inferno2\deadline\repository8\plugins\3dsmax
2016-05-28 11:38:02:  Syncronization complete. Took: 2 seconds
2016-05-28 11:38:03:  0: Got task!
2016-05-28 11:38:03:  0: Plugin will be reloaded because a new job has been loaded, or one of the job files or plugin files has been modified
2016-05-28 11:38:03:  0: Unloading plugin: 3dsmax
2016-05-28 11:38:03:  0: INFO: End Job called - shutting down 3dsmax plugin

Any way to add more verbose logging there to figure out why deadline thinks there was a change? The job did not change in any way on the server, aux files are still the same etc. We do modify the local copy of the max file with our path-localizer system, but i dont think that matters. Also, if did, deadline would have been restarting max for frame 2 as well, not just frame 3.

Any clues?

This is still an ongoing problem, i didnt see any related fixes in the 8.0.2.1 release notes

I’ll have a look and see if I can dig up anything, it might be that our Max plugin is modifying itself in some weird way.

I don’t think that modifying the max file itself would trigger this, I’m not sure that we’re even checking aux files for changes. Just for reference though, at which step of the job dequeuing/rendering process are you guys making the modifications? (e.g., in a JobPreload.py, or event plugin, InitializePlugin, etc, etc).

At the very least, I can get some print statements in there for next build (would be next week likely), to help debug. That build should also have some of the sandbox fixes that you guys have been waiting on, as well.

Cheers,
Jon

The modifications are done when max has already started up, and then attempts to load the max file. It first checks if the max file is on the network (which it always is), then it gets copied locally, its contents localized, and then loaded. This is all handled via maxscripts by max post startup:

2016-06-07 15:27:23:  0: INFO: Argument:  -p "S:\TMP\dl.ini" -q -s "S:\deadlinedata\lapro1417\plugins\57574a28047df23f18faa041\deadlineStartupMax2015.max"
2016-06-07 15:27:23:  0: INFO: Startup Directory: "C:\3ds Max 2015"
2016-06-07 15:27:23:  0: INFO: Process Priority: BelowNormal
2016-06-07 15:27:23:  0: INFO: Process Affinity: default
2016-06-07 15:27:23:  0: INFO: Process is now running
2016-06-07 15:27:23:  0: INFO: Waiting for connection from 3dsmax
......
2016-06-07 15:27:59:  0: INFO: Connected to 3dsmax plugin: Lightning Jan  5 2015 12:30:13 R
2016-06-07 15:27:59:  0: INFO: Scene file to render: "\\inferno2\projects\common\pipeline\submission_queue\2016_06_07\CHR_CYB_0000_v0153_mur_Sand_near_images_render3d_FX-X_152638196\CHR_CYB_0000_v0153_mur_Sand_ne.max"
2016-06-07 15:27:59:  0: INFO: Directory name: "\\inferno2\projects\common\pipeline\submission_queue\2016_06_07\CHR_CYB_0000_v0153_mur_Sand_near_images_render3d_FX-X_152638196"
2016-06-07 15:27:59:  0: INFO: Uh oh, this is on the network! Lets copy it local to: "S:\deadlinedata\lapro1417\jobsData\57574a28047df23f18faa041"
2016-06-07 15:27:59:  0: INFO: Getting value of SCL_LOCALIZATION_DISABLE_NFS
2016-06-07 15:27:59:  0: INFO: Value of SCL_LOCALIZATION_DISABLE_NFS = 
2016-06-07 15:27:59:  0: INFO: Using NFS path: I:\
2016-06-07 15:27:59:  0: INFO: Copying I:\projects\common\pipeline\submission_queue\2016_06_07\CHR_CYB_0000_v0153_mur_Sand_near_images_render3d_FX-CyborgThurusters-Sand-near_152638196\CHR_CYB_0000_v0153_mur_Sand_ne.max to S:\deadlinedata\lapro1417\jobsData\57574a28047df23f18faa041\CHR_CYB_0000_v0153_mur_Sand_ne.max
2016-06-07 15:28:00:  0: INFO: Camera: "CHR_CYB_0000_camMaster"
2016-06-07 15:28:00:  0: INFO: Restarting renderer after each frame: 1
2016-06-07 15:28:00:  0: INFO: Showing frame buffer: 1
2016-06-07 15:28:00:  0: INFO: Overriding render output: "\\inferno2\projects\drift\scenes\CHR_CYB_0000\images\render3d\FX-CyborgThurusters-Sand-near\v0153_mur_Sand_near\beauty_1920x1080x1_linear\CHR_CYB_0000_FX-X_beauty_v0153..exr"
2016-06-07 15:28:00:  0: INFO: Overriding save file option: True
2016-06-07 15:28:00:  0: INFO: Frame number base: 0
2016-06-07 15:28:00:  0: INFO: Remove padding from output filename: 0
2016-06-07 15:28:00:  0: INFO: Enable gamma correction: 0
2016-06-07 15:28:00:  0: INFO: Ignore missing external file errors: 0
2016-06-07 15:28:00:  0: INFO: Ignore missing UVW errors: 1
2016-06-07 15:28:00:  0: INFO: Ignore missing XREF errors: 1
2016-06-07 15:28:00:  0: INFO: Ignore missing DLL errors: 0
2016-06-07 15:28:00:  0: INFO: Disabling Multipass: 0
2016-06-07 15:28:00:  0: INFO: Post load script: "S:\deadlinedata\lapro1417\jobsData\57574a28047df23f18faa041\PostLoadScript_0.ms"
2016-06-07 15:28:00:  0: INFO: Loading 3dsmax scene file
2016-06-07 15:28:00:  0: STDOUT: - 15:27:58.281 INFO: Begin resolve assembly 'Autodesk.Max.StateSets, Version=17.3.374.0, Culture=neutral, PublicKeyToken=null- 15:27:58.281 INFO: Resolved assembly
2016-06-07 15:28:00:  0: STDOUT: - 15:27:58.297 INFO: Begin resolve assembly 'Autodesk.Max.StateSets.XmlSerializers, Version=17.3.374.0, Culture=neutral, PublicKeyToken=null- 15:27:58.297 INFO: Cannot resolve assembly
2016-06-07 15:28:00:  0: STDOUT: - 15:27:58.297 INFO: Begin resolve assembly 'Autodesk.Max.StateSets.XmlSerializers- 15:27:58.297 INFO: Cannot resolve assembly
2016-06-07 15:28:00:  0: STDOUT: Tue Jun 07 15:28:00 2016 (+1662ms)  :(3dsmax): debug : SC_RenderSlaveScripts.LocalizeDependencies :: Starting File Localization ...
2016-06-07 15:28:00:  0: STDOUT: Tue Jun 07 15:28:00 2016 (+153ms)   :(3dsmax): debug : // SC_PathLocalizer 2.72 :: Deadline scene path to localize: S:\deadlinedata\lapro1417\jobsData\57574a28047df23f18faa041\
2016-06-07 15:28:00:  0: STDOUT: Tue Jun 07 15:28:00 2016 (+207ms)   :(3dsmax): Render slave and localize is enabled, localizing scene.
....
2016-06-07 15:28:20:  0: INFO: scLog: debug // SC_PathLocalizer 2.72 ::  Full cycle: 641 t2: 1 t3: 186 t4: 446
2016-06-07 15:28:21:  0: INFO: scLog: startmarker SC_PathLocalizer.localizeMaxSceneDependencies setMaxFileAssets2
2016-06-07 15:28:21:  0: INFO: scLog: endmarker SC_PathLocalizer.localizeMaxSceneDependencies setMaxFileAssets2
2016-06-07 15:28:22:  0: INFO: scLog: debug // SC_PathLocalizer 2.72 :: Localization completed (1552 files processed
2016-06-07 15:28:22:  0: INFO: scLog: endmarker SC_PathLocalizer.localizeMaxSceneDependencies
2016-06-07 15:28:23:  0: INFO: scLog: debug SC_RenderSlaveScripts.LocalizeDependencies :: Done.
2016-06-07 15:28:23:  0: INFO: scLog: debug MaxThinkingImporterContainer.fnUpdatePaths
2016-06-07 15:28:33:  0: INFO: scLog: info versionizeAPI.fnPostFileOpen()
2016-06-07 15:28:33:  0: STDOUT: Tue Jun 07 15:28:33 2016 (+24203ms) :(3dsmax): versionizeAPI.fnPostFileOpen()

Here is another one from a little while ago:

2016-06-07 11:38:46:  0: Render time for frame(s): 8.192 s
2016-06-07 11:38:46:  0: Total time for task: 9.711 s
2016-06-07 11:38:46:  0: Saving task log...
2016-06-07 11:38:47:  Scheduler Thread - Render Thread 0 completed its task
2016-06-07 11:38:47:  Scheduler Thread - Seconds before next job scan: 2
2016-06-07 11:38:50:  Scheduler - Performing Job scan on Primary Pools with scheduling order Pool, Weighted, Balanced
2016-06-07 11:38:50:  Scheduler - Using enhanced scheduler balancing
2016-06-07 11:38:50:  Scheduler - Successfully dequeued 1 task(s).  Returning.
2016-06-07 11:38:50:  Scheduler - Returning limit stubs not in use.
2016-06-07 11:38:50:  Scanline: OnSlaveStartingJob sSlaveName LAPRO0440
2016-06-07 11:38:50:  Scanline: OnSlaveStartingJob sJobID 575714260294992518d6abaa
2016-06-07 11:38:50:  Scanline: RemoveJobFolderContents event called LAPRO0440
2016-06-07 11:38:50:  Scanline: cleaning sJobsDataFolder: C:\Users\scanlinevfx\AppData\Local\Thinkbox\Deadline8\slave\LAPRO0440\jobsData
2016-06-07 11:38:50:  Scanline: ignoring current jobid: 575714260294992518d6abaa
2016-06-07 11:38:50:  Scheduler Thread - Synchronization time for job files: 53.989 ms
2016-06-07 11:38:50:  Syncronizing Plugin 3dsmax from \\inferno2\deadline\repository8\plugins\3dsmax
2016-06-07 11:38:52:  Syncronization complete. Took: 2 seconds
2016-06-07 11:38:53:  0: Got task!
2016-06-07 11:38:53:  0: Plugin will be reloaded because a new job has been loaded, or one of the job files or plugin files has been modified
2016-06-07 11:38:53:  0: Unloading plugin: 3dsmax
2016-06-07 11:38:53:  0: INFO: End Job called - shutting down 3dsmax plugin

and the same job on another machine a few frames later:

2016-06-07 11:47:41:  0: Render time for frame(s): 1.789 m
2016-06-07 11:47:41:  0: Total time for task: 1.900 m
2016-06-07 11:47:41:  0: Saving task log...
2016-06-07 11:47:42:  Scheduler Thread - Render Thread 0 completed its task
2016-06-07 11:47:42:  Scheduler Thread - Seconds before next job scan: 2
2016-06-07 11:47:44:  Scheduler - Performing Job scan on Primary Pools with scheduling order Pool, Weighted, Balanced
2016-06-07 11:47:44:  Scheduler - Using enhanced scheduler balancing
2016-06-07 11:47:44:  Scheduler - Successfully dequeued 1 task(s).  Returning.
2016-06-07 11:47:44:  Scheduler - Returning limit stubs not in use.
2016-06-07 11:47:44:  Scanline: OnSlaveStartingJob sSlaveName LAPRO1875
2016-06-07 11:47:44:  Scanline: OnSlaveStartingJob sJobID 575714260294992518d6abaa
2016-06-07 11:47:44:  Scanline: RemoveJobFolderContents event called LAPRO1875
2016-06-07 11:47:44:  Scanline: sJobsDataFolder: C:\Users\scanlinevfx\AppData\Local\Thinkbox\Deadline8\slave\LAPRO1875\jobsData does not exist!
2016-06-07 11:47:44:  Scheduler Thread - Synchronization time for job files: 39.000 ms
2016-06-07 11:47:44:  Syncronizing Plugin 3dsmax from \\inferno2\deadline\repository8\plugins\3dsmax
2016-06-07 11:47:46:  Syncronization complete. Took: 2 seconds
2016-06-07 11:47:47:  0: Got task!
2016-06-07 11:47:47:  0: Plugin will be reloaded because a new job has been loaded, or one of the job files or plugin files has been modified
2016-06-07 11:47:47:  0: Unloading plugin: 3dsmax
2016-06-07 11:47:47:  0: INFO: End Job called - shutting down 3dsmax plugin

One from yesterday:

2016-06-06 10:29:21:  0: Render time for frame(s): 1.131 m
2016-06-06 10:29:21:  0: Total time for task: 1.149 m
2016-06-06 10:29:21:  0: Saving task log...
2016-06-06 10:29:22:  Scheduler Thread - Render Thread 0 completed its task
2016-06-06 10:29:22:  Scheduler Thread - Seconds before next job scan: 2
2016-06-06 10:29:24:  Scheduler - Performing Job scan on Primary Pools with scheduling order Pool, Weighted, Balanced
2016-06-06 10:29:24:  Scheduler - Using enhanced scheduler balancing
2016-06-06 10:29:24:  Scheduler - Successfully dequeued 1 task(s).  Returning.
2016-06-06 10:29:24:  Scheduler - Returning limit stubs not in use.
2016-06-06 10:29:24:  Scanline: OnSlaveStartingJob sSlaveName LAPRO0715
2016-06-06 10:29:24:  Scanline: OnSlaveStartingJob sJobID 5755b223394f4d69c06b8037
2016-06-06 10:29:24:  Scanline: RemoveJobFolderContents event called LAPRO0715
2016-06-06 10:29:24:  Scanline: cleaning sJobsDataFolder: C:\Users\scanlinevfx\AppData\Local\Thinkbox\Deadline8\slave\LAPRO0715\jobsData
2016-06-06 10:29:24:  Scanline: ignoring current jobid: 5755b223394f4d69c06b8037
2016-06-06 10:29:25:  Scheduler Thread - Synchronization time for job files: 51.000 ms
2016-06-06 10:29:25:  Syncronizing Plugin 3dsmax from \\inferno2\deadline\repository8\plugins\3dsmax
2016-06-06 10:29:27:  Syncronization complete. Took: 2 seconds
2016-06-06 10:29:28:  0: Got task!
2016-06-06 10:29:28:  0: Plugin will be reloaded because a new job has been loaded, or one of the job files or plugin files has been modified
2016-06-06 10:29:28:  0: Unloading plugin: 3dsmax
2016-06-06 10:29:28:  0: INFO: End Job called - shutting down 3dsmax plugin
Privacy | Site terms | Cookie preferences