Strange timeout error

We recently re-imaged our render node and keep getting this errors.

We turned on Disable Progress Update option.
But, no luck.

I even change renderer to scanline, but the error log still shows that it stopped at VRay.

=======================================================
Error

Error: Timed out waiting for the next progress update.
2016/10/14 17:59:00 INF: [06764] [01796] SYSTEM: Production renderer is changed to Scanline Renderer. Previous messages are cleared.
2016/10/14 17:59:12 DBG: [06764] [01796] Starting network
2016/10/14 17:59:12 DBG: [06764] [01796] Calling NetRenderPreLoad
2016/10/14 17:59:12 DBG: [06764] [01796] in NetWorkerPreLoad. jobFile: ; jobname: C:\Users\render\AppData\Local\Thinkbox\Deadline7\slave\fla-rnj164\plugins\58017d3fe5d3a2128c411ac7\deadlineStartupMax2017.max
2016/10/14 17:59:12 DBG: [06764] [01796] in Init. jobFile: ; jobname: C:\Users\render\AppData\Local\Thinkbox\Deadline7\slave\fla-rnj164\plugins\58017d3fe5d3a2128c411ac7\deadlineStartupMax2017.max
2016/10/14 17:59:13 DBG: [06764] [01796] in Init. calling nrGetIface. jobToSend: C:\Users\render\AppData\Local\Thinkbox\Deadline7\slave\fla-rnj164\plugins\58017d3fe5d3a2128c411ac7\deadlineStartupMax2017.max
2016/10/14 17:59:13 DBG: [06764] [01796] in NetWorkerPreLoad. curJobname: C:\Users\render\AppData\Local\Thinkbox\Deadline7\slave\fla-rnj164\plugins\58017d3fe5d3a2128c411ac7\deadlineStartupMax2017.max; init: 0
2016/10/14 17:59:13 DBG: [06764] [01796] in NetWorkerPreLoad. calling PostInitMessageSystem()
2016/10/14 17:59:13 DBG: [06764] [01796] in NetWorkerPreLoad. srv_pid: 0
2016/10/14 17:59:13 DBG: [06764] [01796] leaving NetWorkerPreLoad. LoadLib()
2016/10/14 17:59:13 DBG: [06764] [01796] NetRenderPreLoad passed
2016/10/14 17:59:19 DBG: [06764] [01796] Interface8::LoadFromFile(C:\Users\render\AppData\Local\Thinkbox\Deadline7\slave\fla-rnj164\plugins\58017d3fe5d3a2128c411ac7\deadlineStartupMax2017.max)
2016/10/14 17:59:19 INF: [06764] [01796] Starting to load file: C:\Users\render\AppData\Local\Thinkbox\Deadline7\slave\fla-rnj164\plugins\58017d3fe5d3a2128c411ac7\deadlineStartupMax2017.max
2016/10/14 17:59:19 WRN: [06764] [01796] MAXScript Callback script Exception: – Runtime error: No method found which matched argument list
2016/10/14 17:59:19 INF: [06764] [01796] Done loading file: C:\Users\render\AppData\Local\Thinkbox\Deadline7\slave\fla-rnj164\plugins\58017d3fe5d3a2128c411ac7\deadlineStartupMax2017.max
2016/10/14 17:59:19 INF: [06764] [01796] SYSTEM: Production renderer is changed to NA. Previous messages are cleared.
2016/10/14 17:59:20 DBG: [06764] [01796] Interface8::LoadFromFile(C:/Users/render/AppData/Local/Thinkbox/Deadline7/slave/fla-rnj164/jobsData/58017d3fe5d3a2128c411ac7/shd405_046_020_fx-jailfire-jfSIM_001.max)
2016/10/14 17:59:20 INF: [06764] [01796] Starting to load file: C:/Users/render/AppData/Local/Thinkbox/Deadline7/slave/fla-rnj164/jobsData/58017d3fe5d3a2128c411ac7/shd405_046_020_fx-jailfire-jfSIM_001.max
2016/10/14 17:59:20 WRN: [06764] [01796] MAXScript Callback script Exception: – Runtime error: No method found which matched argument list
2016/10/14 17:59:20 INF: [06764] [01796] SYSTEM: Production renderer is changed to Scanline Renderer. Previous messages are cleared.
2016/10/14 17:59:21 INF: [06764] [01796] Done loading file: C:/Users/render/AppData/Local/Thinkbox/Deadline7/slave/fla-rnj164/jobsData/58017d3fe5d3a2128c411ac7/shd405_046_020_fx-jailfire-jfSIM_001.max
2016/10/14 17:59:22 INF: [06764] [01796] [V-Ray] ============================================================
2016/10/14 17:59:22 INF: [06764] [01796] [V-Ray] Console created, V-Ray A for x64 from Aug 29 2016, 21:47:41
2016/10/14 17:59:22 INF: [06764] [01796] [V-Ray] ============================================================
2016/10/14 17:59:22 INF: [06764] [01796] [V-Ray] Compiled with Intel C++ compiler, version 15
2016/10/14 17:59:22 INF: [06764] [01796] [V-Ray] Host is 3ds Max, version 19
2016/10/14 17:59:22 INF: [06764] [01796] [V-Ray] V-Ray DLL version is 3.25.01

at Deadline.Plugins.ScriptPlugin.StartJob(Job job, String& outMessage, AbortLevel& abortLevel)

=======================================================
Type

RenderPluginException

=======================================================
Stack Trace

at Deadline.Plugins.Plugin.StartJob(Job job)
at Deadline.Slaves.SlaveRenderThread.a(TaskLogWriter A_0)

=======================================================
Log

2016-10-14 17:58:49: BEGIN - FLA-RNJ164\render
2016-10-14 17:58:49: 0: Loaded plugin 3dsmax (R:\DeadlineRepository7\plugins\3dsmax)
2016-10-14 17:58:49: 0: Start Job timeout is disabled.
2016-10-14 17:58:49: 0: Task timeout is disabled.
2016-10-14 17:58:49: 0: Loaded job: shd405_046_020_fx-jailfire-jfSIM_001_sim_FFX_JailFire_002_Default (58017d3fe5d3a2128c411ac7)
2016-10-14 17:58:49: 0: Skipping drive mapping because they have already been mapped for this job
2016-10-14 17:58:50: 0: INFO: Executing plugin script C:\Users\render\AppData\Local\Thinkbox\Deadline7\slave\fla-rnj164\plugins\58017d3fe5d3a2128c411ac7\3dsmax.py
2016-10-14 17:58:50: 0: INFO: Slave Running as Service: False
2016-10-14 17:58:50: 0: INFO: About: 3dsmax Plugin for Deadline
2016-10-14 17:58:50: 0: INFO: The job’s environment will be merged with the current environment before rendering
2016-10-14 17:58:50: 0: INFO: Start Job called - starting up 3dsmax plugin
2016-10-14 17:58:50: 0: INFO: Sys Env Var PATH: C:\Program Files\Thinkbox\Deadline7\bin;C:\Python27;C:\Python27\Scripts;C:\ProgramData\Oracle\Java\javapath;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0;C:\Program Files\Common Files\Autodesk Shared;C:\Program Files (x86)\QuickTime\QTSystem;C:\Ruby187\bin;C:\Program Files (x86)\Autodesk\Backburner;C:\Python27
2016-10-14 17:58:50: 0: INFO: Sys Env Var PATH length: 370
2016-10-14 17:58:50: 0: INFO: Backburner Path(s) Found in PATH: ‘C:\Program Files (x86)\Autodesk\Backburner’
2016-10-14 17:58:50: 0: INFO: Backburner server.exe version: 2017.0.0.2221
2016-10-14 17:58:50: 0: INFO: Rendering with 3dsmax version: 2017
2016-10-14 17:58:50: 0: INFO: Not forcing a build of 3dsmax because version 2014 and later is 64 bit only
2016-10-14 17:58:50: 0: INFO: Rendering with executable: C:\Program Files\Autodesk\3ds Max 2017\3dsmax.exe
2016-10-14 17:58:50: 0: INFO: Slave 3dsmax.exe version: 19.2.472.0
2016-10-14 17:58:50: 0: INFO: Submitted from 3dsmax.exe version: 19.2.472.0
2016-10-14 17:58:50: 0: INFO: Checking registry for 3dsmax language code
2016-10-14 17:58:50: 0: INFO: Language code string: ENU
2016-10-14 17:58:50: 0: INFO: Language sub directory: en-US
2016-10-14 17:58:50: 0: INFO: Fail on existing 3dsmax process: 0
2016-10-14 17:58:50: 0: INFO: Load 3dsmax timeout: 4000 seconds
2016-10-14 17:58:50: 0: INFO: Start job timeout: 4000 seconds
2016-10-14 17:58:50: 0: INFO: Progress update timeout: 8000 seconds
2016-10-14 17:58:50: 0: INFO: Progress update timeout disabled: 1
2016-10-14 17:58:50: 0: INFO: Slave mode enabled: 1
2016-10-14 17:58:50: 0: INFO: Silent mode enabled: 0
2016-10-14 17:58:50: 0: INFO: Local rendering enabled: 1
2016-10-14 17:58:50: 0: INFO: Running render sanity check using 3dsmaxcmd.exe
2016-10-14 17:58:51: 0: INFO: 3dsmaxcmd.exe exit code: 3
2016-10-14 17:58:51: 0: INFO: 3dsmaxcmd.exe returned:

Support for 3dsMax 2017 was only added in Deadline 8.0.0.69. Are you running an older version of Deadline?

Yes… but we are getting the same error when we send 3dsMax 2016 job.

ok, thanks for confirming. Do you get exactly the same error in 3dsMax 206 or is it similar?

Have you restarted the machine since installing 3dsMax and re-imaging? Are there any outstanding Windows updates required here?

Can you post the full error log report by saving from Deadline Monitor. I would like to see the Slave Info / Props at the bottom of your report for reference. Do you have Anti-Virus, Firewall, UAC enabled?

Please open up a command prompt on this machine and execute:

C:\Program Files\Autodesk\3ds Max 2017\3dsmaxcmd.exe

What happens?

Yes, exactly same error.
I’ll post the full report.

As of now, only work around is rebooting slaves.

Here is the saved log file.
Job_2016-10-16_20-37-05_58044762dde34616e48b8a82.txt (48.9 KB)

When I run

C:\Program Files\Autodesk\3ds Max 2017\3dsmaxcmd.exe

It shows 3dsmaxcmd options list.

Also, I see that the test to see if Max can start successfully returned an invalid exit code:

2016-10-14 17:58:51: 0: INFO: 3dsmaxcmd.exe exit code: 3

What happens when you run 3dsmaxcmd.exe from the command line (no arguments or anything)? Also, as per Mike’s question, which version are you on at the moment?

We are using 3dsMax 2016 with Deadline 7.
When I run 3dsmaxcmd.exe, I got option list.

You can try disabling the 3dsmaxcmd sanity check if this is failing out for you, but in fact is actually ok. I doubt this will work, but it may reveal the true error in the 3dsmax configuration.
docs.thinkboxsoftware.com/produc … figuration

Disable the option: “Run Render Sanity Check” by setting it to False

then trying running the job again.

I should note that your latest job log report is not failing at this point but is rather failing much later on in the 3dsMax process when it tries to execute the customize.ms file. Typically, this error is due to something scene file specific.

“this error is due to something scene file specific.”
I don’t think so.
Now it is starting to fail on more and more jobs.
Also… if we blacklist the one which errored out, other node render without problem.

Here are more reports.
Timeout_Error_Reports.zip (43.1 KB)

As you can see, the job is aborted after switch Production Renderer to VRay.

any answers?
This is getting worse.

Hi,

Thanks for the log reports. If this is an urgent issue, I would highly recommend our support team working CST timezone hours for direct support:
support.thinkboxsoftware.com/
(I’m a developer based in UK)

So, I’m not certain what the issue is so far, although I noticed a few things in your logs. So, more questions/notes from me.

  1. What, if anything has changed recently in your pipeline between the time period of 3dsmax jobs working fine in your Deadline queue and the current situation. Please do list out all and any IT changes, network changes, plugin installs, security/admin changes, pipeline configuration changes that may have taken place during this period. Perhaps, something will standout here. Maybe somebody just deployed a new maxscript or something to all systems, which has an issue?

EDIT: Have you updated the version of any 3rd party plugins OR installed a new 3rd party plugin recently?

  1. I’m regularly seeing this error occur in your 3dsMax logs, which seems related to the “skin” modifier in 3dsMax. Doing a job that suddenly requires a large use of the “skin” modifier in your 3dsMax scene files recently? Perhaps, this is the ‘change’ in your studio workflow?

It seems this is a known Autodesk 3dsMax error and they provide this solution. Can you try it?
knowledge.autodesk.com/support/ … tries.html

  1. Although not related, I did just want to note in passing that you are running Deadline 7.2.1.0 and the latest and last, available version is 7.2.4.0:
    docs.thinkboxsoftware.com/produc … notes.html
    which does contain a couple of potentially useful fixes for you, albeit not directly related to today’s issue at hand.

  2. Unknown Log Message:

I’m wondering if you have any 3rd party or custom scripts which might have recently changed to cause an issue here during network rendering?

  1. Out of the logs you provided, the machines all have Backburner “2017” installed which will help them render with 3dsMax 2017 and indeed, backwards compatible. However, I note 2 machines where that is not the case:
  1. Additionally, this machine is suffering from a version mismatch:
  1. So, finally, please see attached an updated customize.ms script file, which provides a new thing we added in Deadline 8, to try and capture the exception if this maxscript file fails to execute, which seems common in your log reports. Unzip and update what should be the same file, with the exception of the added “try()catch()” block around everything. Can you deploy and try running both 3dsMax 2016 and 2017 jobs and see if we get some juicy error log reports with more info in them?

Install to:
“<your_repo>/plugins/3dsmax/customize.ms”

customize.ms.zip (32.1 KB)

  1. I wish I could know…

  2. That happens because higher version of 3dsMax unregister dll for skin for lower version. We already have fix for this. This is not related to this error. If this is the problem, we can just turn on ignore dll error.

  3. We turn off FindMissingMapsOnSceneLoad for faster loading.

    1. I don’ t think this would be the issue.
  4. I’ll try.

BUT, my fundamental question is…
What is progress update error?
Why is this happening even after I disable progress update error?
Why does it happen even the job doesn’t reach the progress update time threshold?

Here is a error log after replacing customize.ms
Job_2016-10-18_12-08-58_580673480d2c9c1d78873ef3.txt (50.6 KB)

It looks like the job is dying while running customize.ms.

Hi,

Thanks for your replies and testing notes and most recent log. Can we try something else? In “Configure Plugins…” -> 3dsmax -> can you enable this setting to True: “Kill ADSK Comms Center Process”. I would re-enable the “Run Render Sanity Check” setting if you did disable it previously, as you do really want to catch the odd rendernode which has a bad install/config of Backburner DLL’s as that will cause other issues further down the line when using Deadline. Hopefully, killing this annoying ADSK background process regularly will do the trick here.

Kill ADSK Comms Center Process: If enabled, Deadline will kill Autodesk Communications Center process if it’s running during network rendering.

docs.thinkboxsoftware.com/produc … figuration

Also, to answer your fundamental question. The execution of the “customize.ms” script has it’s own hard-coded timeout of 60 seconds, which is not connnected or related to the Progress Update Timeout, which is designed to look for a lack of progress coming from the rendered % progress. Modifying the Progress Update Timeout will make no difference here and I encourage you to keep this setting on it’s default value. For your reference, there are a number of times, where we need to single-fire execute a script file from within the depths of the 3dsmax.py plugin. In these cases, renderer or otherwise, Progress Update Timeout would not be applicable, and so a standard 60s timeout is applied as that is a reasonable amount of time. The exception here, is the execution of a MAXScript job, where it might be controlling the rendering or a long running process and so a 60s timeout would be bad.

OK… i’ll try.
Is customize.ms must-have thing?
Can I skip for test?

You can potentially coment out the customize.ms file in the 3dsmax.py plugin file, but if the issue really is this background hanging Autodesk process, then the 2 things are actually unrelated. Let me explain. Autodesk fire up in the background another process, whose name slightly changes depending on the version of 3dsMax, but generally speaking it is called: “WSCommCntr.exe” or something similar, like with a number in the fileName. There are a number of ‘milestone’ moments during the starting up and loading of 3dsMax and it’s scene file, where this background process is restarted if it is not already running (this is why we must automate it, as 3dsMax tries to restart this app). It fires at these following MAXScript callback points (we also fire it at several key moments from our Python plugin layer):

callbacks.addScript #filePreOpen
callbacks.addScript #filePostOpen
callbacks.addScript #preRender
callbacks.addScript #PostSystemStartup
callbacks.addScript #PostSystemShutdown

In summary, this ensure that this background process is killed so it does not cause 3dsMax to hang.

So…why does it hang when the “customize.ms” file is executed and not anywhere else? Good question. It actually does fire, it’s just that the “customize.ms” file can take literally a second or maybe 2 or worst, 3 seconds to execute (it applies lots of nice things to your 3dsMax scene file, which without it firing, users would not have those features - load state sets, re-load current scene selection for V-Ray render selection masks, all 3rd party renderer maxscript propeties, etc). So, disabling the “customize.ms” file IMHO…is a really bad idea. Better to just fix the root cause here…

Anyway, back on topic…the “customize.ms” file executes AFTER the 3dsMax scene file has LOADED, ie: this is why the “filePostOpen” maxscript callback is in place, as this is one of the moments that Autodesk try to spawn this background process again. This then causes the hang.

What the hell does this process do? Another good question. According to the EULA you accepted when you installed 3dsMax, it’s sending data to ADSK’s Amazon web servers. Like usage data. You can kinda tell if there is communication or indeed, if you have a local client proxy block / firewall block, by opening up the “infocenter.log” file in the 3dsMax directory under “…/Users/%username%/3dsMax…etc”. Either way, for network ‘headless’ rendering, it’s a PITA and blocks. As a side note, it tends to be worse, whenever ADSK has released a service pack and it want’s to tell all your machines the great news via the “InfoCenter” or “Communication Center” menu system in 3dsMax. Again, not much point if your a headless render machine.

So, I would say, let’s try this plugin fix of simply enabling this plugin config setting: “Kill ADSK Comms Center Process” and see if it helps?

OK… Kill ADSK Comms Center Process worked.
Thanks.

It is also good to know there is another timeout setting.
Please add that to the document.

Thanks!