Customize.ms

We’re still seeing a lot of customize.ms stalls on random tasks.

0: INFO: Executing script: C:\Users\renderadmin\AppData\Local\Thinkbox\Deadline6\slave\RENDER-I7-06\plugins\customize.ms

It’ll just sit there until it fails.

What’s the slave application’s memory usage like when it gets stuck like this? We’re aware of a memory issue with the 3dsmax plugin, and it’s been fixed in 6.1 beta 3. I’m curious if what you’re seeing here is at all related…

Not sure; I’ll check next time. It’s not a huge scene, though, and we have 16GB on the machine quoted.

[code]50% RAM usage (4084Total\1942Cached\694free\1.79Used)

Processes:
System Idle 75%
WSCommCntr4.exe 25% “Autodesk InfoCenter”
3dsMax.exe 0%[/code]

Ending the WSCommCntr4.exe process seems to unstick Max and let it continue. Either that, or it crashed Max and it just happened to work the second time.

Here is a minidump of WSCommCntr4.exe:
mediafire.com/?xq48znc2asue8pv

ADSK InfoCenter / CommunicationCenter typically starts causing issues some time after a new version of 3dsMax has been installed on a machine. It tries to call home and check for updates / subscription-based stuff / RSS feeds (of which Shane’s and Ken’s blogs are somewhat irrelevant now!). I found it randomly caused issues for network rendering and was very hard to debug. IIRC, one of the issues was that “customize.ms” would throw a wobbly. At first, I thought it was a bug in this code, but nah, it works fine. It’s just that the instance of 3dsMax is up and running by this stage in the Deadline network rendering process, and 3dsMax has had time to think about checking in over its built-in call-home web service, a.k.a. InfoCenter, and thought to itself, now how shall I crash out and annoy Gavin… :slight_smile:

Personally, I always stop the exe from running at all on our systems. We don’t need it, and these kinds of issues disappeared after I disabled it. (This might not be your issue, but I think it would be best to eliminate it from the equation.)

IIRC, you need to ensure that “C:\Program Files\Common Files\Autodesk Shared\WSCommCntr4\lib\WSCommCntr4.exe” is allowed through the local Windows Firewall on your machines.
Also, you can trace any issue with it via its log: “c:\max_root\InfoCenter.log”

Any change in the log messages once you manually allow the “WSCommCntr4.exe” through the firewall?

Do you have any software/networking which would actively be blocking this exe from communicating over HTTP on ports 80/8080/443? Are your slaves running with local admin rights? If not, can you temporarily test this?

This is good (disregard the max path below, I do custom installs):

9/10/2013 5:07:16 PM: Configuration Read from: C:\3dsMax2013x64\infocenter.xml
9/10/2013 5:07:27 PM: Polling:5 threads started
9/10/2013 5:07:27 PM: SubStatus: Polling...
9/10/2013 5:49:28 PM: Configuration Read from: C:\3dsMax2013x64\infocenter.xml
9/10/2013 5:49:38 PM: Polling:5 threads started
9/10/2013 5:49:38 PM: SubStatus: Polling...
9/10/2013 5:50:00 PM: Configuration Read from: C:\3dsMax2013x64\infocenter.xml
9/10/2013 5:50:10 PM: Polling:5 threads started

Also, make sure the XML file listed above under your “max_root” isn’t corrupt and permissions are OK for access by your slaves.

This is bad:

9/10/2013 6:07:58 PM: System.Runtime.InteropServices.COMException (0x80010105): The server threw an exception. (Exception from HRESULT: 0x80010105 (RPC_E_SERVERFAULT))
   at Autodesk.Private.InfoCenterLib.ICommCntrController.GetWebServiceManager(Int32 ProcessId)
   at Autodesk.Private.InfoCenter.RssSource.Retrieve(String Url, Boolean onlyHeader)
9/10/2013 6:07:58 PM: System.Runtime.InteropServices.COMException (0x80010105): The server threw an exception. (Exception from HRESULT: 0x80010105 (RPC_E_SERVERFAULT))
   at Autodesk.Private.InfoCenterLib.ICommCntrController.GetWebServiceManager(Int32 ProcessId)
   at Autodesk.Private.InfoCenter.RssSource.Retrieve(String Url, Boolean onlyHeader)
9/10/2013 6:07:58 PM: System.Runtime.InteropServices.COMException (0x80010105): The server threw an exception. (Exception from HRESULT: 0x80010105 (RPC_E_SERVERFAULT))
   at Autodesk.Private.InfoCenterLib.ICommCntrController.GetWebServiceManager(Int32 ProcessId)
   at Autodesk.Private.InfoCenter.RssSource.Retrieve(String Url, Boolean onlyHeader)
9/10/2013 6:07:58 PM: System.Runtime.InteropServices.COMException (0x80010105): The server threw an exception. (Exception from HRESULT: 0x80010105 (RPC_E_SERVERFAULT))
   at Autodesk.Private.InfoCenterLib.ICommCntrController.GetWebServiceManager(Int32 ProcessId)
   at Autodesk.Private.InfoCenter.RssSource.Retrieve(String Url, Boolean onlyHeader)
9/10/2013 6:07:58 PM: System.Runtime.InteropServices.COMException (0x80010105): The server threw an exception. (Exception from HRESULT: 0x80010105 (RPC_E_SERVERFAULT))
   at Autodesk.Private.InfoCenterLib.ICommCntrController.GetWebServiceManager(Int32 ProcessId)

How’s your log looking on several of your machines?
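If you need to check the log on a lot of machines, something like the following might save some time. It is only a rough sketch in Python, not anything shipped with Deadline or 3dsMax: it counts the "RPC_E_SERVERFAULT" entries from the bad excerpt and the "threads started" entries from the good one. The function names and the assumption that the log is readable as UTF-8 text are mine.

```python
# Markers copied from the two log excerpts in this thread.
FAULT_MARKER = "RPC_E_SERVERFAULT"   # appears once per COMException entry
POLLING_MARKER = "threads started"   # appears in the healthy polling entries


def summarize_infocenter_log(text):
    """Count fault entries and polling starts in raw InfoCenter.log text."""
    return {
        "faults": text.count(FAULT_MARKER),
        "polling_starts": text.count(POLLING_MARKER),
    }


def looks_unhealthy(text):
    """True if the log text contains at least one RPC_E_SERVERFAULT entry."""
    return summarize_infocenter_log(text)["faults"] > 0


def check_log_file(path):
    """Read a log file and return its summary.

    Assumption: the file decodes as UTF-8; undecodable bytes are replaced
    rather than raising, since the real encoding is not known here.
    """
    with open(path, encoding="utf-8", errors="replace") as handle:
        return summarize_infocenter_log(handle.read())
```

You would point check_log_file at the InfoCenter.log under each machine's own Max install folder. A non-zero "faults" count only tells you the log matches the bad pattern above; it doesn't prove that InfoCenter is what is stalling the render.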

If your firewall is completely OFF, the slaves are running as local admin, and your logs are clean, then you could try making sure the InfoCenter settings are OK. See the picture below. Click on the settings button and, for the setting “how often to check in to Autodesk to updates” (or whatever the setting is called), select “never” in the drop-down list on a couple of your slaves, and then let’s see if the issue continues during network rendering.

GUID-9EC30ABB-A478-49AB-BCA1-7E0F137C6A55-low.png

Do you just delete the exe? How do I kill it completely? That seems like the most sure-fire solution. :smiley:

Deleted. I’ll see if that stops the hung slaves going forward and hopefully put the “customize.ms” hangs far behind me. :smiley:

Let us know how that deletion impacts this problem :slight_smile:

Deleted and just saw it occur again. Automatic time-out detection caught it eventually but it looked like 3ds max was just idling at 1-2% CPU.

This is almost always on fast-rendering jobs <1 min per frame.

False alarm… that slave somehow escaped the delete purge… or there is a backup location for the file.

Can you share the infocenter log and job log report?

Glad to hear that may have fixed it, except for the one anomaly.

@Gavin - it would be good to get some feedback in a couple of weeks’ time to verify this has fixed the issue. :slight_smile:

Well, I can verify that the process doesn’t seem to be on any of the render nodes any more. So the good news is that once you delete it, it does in fact appear to go away for good. It doesn’t appear to zombie back. The bad news is that one render node post-purge now won’t render half the Max jobs and stalls out with one core maxed out on startup. So deleting it in that instance might have borked its installation, or else it’s just a coincidence. I imagine coincidence is the most likely explanation, since every other render node seems to be unscathed.

Oh for the sake of posterity here is the .bat file I ran on the farm:

taskkill /F /IM WSCommCntr4.exe
del "C:\Program Files\Common Files\Autodesk Shared\WSCommCntr4\lib\WSCommCntr4.exe"

Fair enough if you decided to delete the EXE, although to resolve this issue you should only need to ensure it’s poked through the Windows client firewall. You can confirm whether there is still an InfoCenter issue on this one outstanding node by looking at its local log:

“c:\max_installed_location\InfoCenter.log”

DISCLAIMER: I can’t guarantee that deleting the EXE won’t cause some other issue further down the line for you, OR that a future SP/PU won’t re-install the EXE!

Thank you guys, this post just helped me a lot!

I just want to add that there is more than one of these WSCommCntr#.exe files, depending on the Max version you have installed. We had 1, 3 & 4 on our render machines. I deleted them all via a batch file and the execute command in Deadline; so far no problems.

Cheers
Joachim
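For anyone wanting to catch every numbered variant in one pass, here is a minimal Python sketch. It assumes the other versions live under the same "Autodesk Shared" folder as the WSCommCntr4 path quoted earlier in this thread, which has not been confirmed for every Max release. The function names and the dry_run flag are made up for illustration.

```python
import re
from pathlib import Path

# Parent folder taken from the WSCommCntr4 path quoted in this thread.
# Assumption: other numbered variants are installed somewhere below it.
SHARED_ROOT = Path(r"C:\Program Files\Common Files\Autodesk Shared")

# Matches WSCommCntr1.exe, WSCommCntr3.exe, WSCommCntr4.exe, and so on.
NAME_PATTERN = re.compile(r"^WSCommCntr\d+\.exe$", re.IGNORECASE)


def find_commcntr_exes(root=SHARED_ROOT):
    """Return a sorted list of every WSCommCntr<N>.exe found below root."""
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(p for p in root.rglob("*.exe") if NAME_PATTERN.match(p.name))


def remove_commcntr_exes(root=SHARED_ROOT, dry_run=True):
    """Report each match; delete it only when dry_run is False."""
    found = find_commcntr_exes(root)
    for exe in found:
        if dry_run:
            print(f"would delete {exe}")
        else:
            exe.unlink()
            print(f"deleted {exe}")
    return found
```

Calling remove_commcntr_exes() with the default dry_run=True only lists what it finds, so you can see which versions a node has before removing anything. The delete will likely fail while the process is still running, so it would need to be ended first, as the taskkill line in the .bat file does. The earlier disclaimer about deleting the EXE applies here too.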

Good stuff.
Could you tell me which version of 3dsMax had which version of the EXE?
I think it’s probably time that the code in Deadline was enhanced to help stop this issue.

I will take a closer look into that. What I can say is that our machines had WSCommCntr3 and WSCommCntr4 installed. We are currently using 3ds Max 2012 and 2014.

I’d like to add that this is not purely a Deadline problem. We recently switched from RenderPal to Deadline 6. We had the same problem with RenderPal, but to a lesser degree because we used bigger task sizes.