Lingering 3dsmax.exe

So, I have a question I wanted to throw out to the group. I have noticed that many of our larger jobs will error out until 3dsmax.exe is killed. If I submit a fairly large job, it will render OK, but the next job that loads up, whether it be large or small, will error out and fail. It appears 3dsmax.exe is not always quitting. My solution has been to submit a batch command to kill the 3dsmax.exe process on all the render nodes. Once this is done, I can restart the next job and it runs just fine.

Now, I realize this is an Autodesk issue to address, because we used a similar process with Backburner, but I was wondering if any of you have experienced this same problem and wondered what mechanisms you might have in place to automate this process. The batch process is simple, but requires me to initiate the process manually. Any thoughts?

Thanks,

Hey Mark,

Can we get some more info on this? Could you perhaps share a log file or two with us so we can take a look at what’s happening on the slave side? Thanks.

Hey Mark,

We’re having the same issues over here as well, with the same “solution”, except that we want to be able to render multiple jobs per slave at a time, so killing the 3dsmax process also kills the second job that is running. At the moment we don’t have a better solution, so either we sacrifice rendering two jobs at once or we manually kill problematic jobs. They’re a little tricky to catch though, especially on a farm of 300 machines. If we find a better fix I will update here. If you could do the same, that would be great!

Thanks
Courtenay
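
One way around the "killing 3dsmax.exe also kills the second job" problem is to kill by PID rather than by image name. What follows is only a rough sketch in Python, not something from Deadline itself: it parses the CSV output of `tasklist /FO CSV /NH` and builds `taskkill /PID` commands for every 3dsmax.exe process that is not in a set of PIDs known to belong to live renders. The function names and the `active_pids` set are hypothetical; how you find out which PIDs are still in use is left to the caller.

```python
import csv
import io


def parse_tasklist_csv(output):
    """Parse `tasklist /FO CSV /NH` output into (image_name, pid) tuples."""
    procs = []
    for row in csv.reader(io.StringIO(output)):
        # Skip blank lines and the "INFO: No tasks are running..." message.
        if len(row) < 2 or not row[1].isdigit():
            continue
        procs.append((row[0], int(row[1])))
    return procs


def stale_pids(output, active_pids, image_name="3dsmax.exe"):
    """Return PIDs of image_name that are NOT in active_pids.

    active_pids is an assumption: a set of PIDs the caller knows are
    still rendering and must be left alone.
    """
    wanted = image_name.lower()
    return [
        pid
        for name, pid in parse_tasklist_csv(output)
        if name.lower() == wanted and pid not in active_pids
    ]


def kill_commands(pids):
    """Build one taskkill command per PID (/T also kills child processes)."""
    return [["taskkill", "/F", "/T", "/PID", str(pid)] for pid in pids]


if __name__ == "__main__":
    # Made-up sample output, for illustration only.
    sample = (
        '"3dsmax.exe","4120","Services","0","2,310,448 K"\n'
        '"3dsmax.exe","5988","Services","0","1,024,100 K"\n'
    )
    for cmd in kill_commands(stale_pids(sample, active_pids={4120})):
        print(" ".join(cmd))
```

On a real node, the `output` string would come from running `tasklist /FO CSV /NH /FI "IMAGENAME eq 3dsmax.exe"`, and each command list could be passed to `subprocess.run`.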

Yeah, that is my worry, Courtenay. I could apply a script to “flush” the render nodes upon completion of a job, but this will affect other machines rendering whatever. I am including 2 log files from 2 completely different jobs. You will notice the first error is common between both:

Error in StartJob: ExecuteScript: Timed out waiting for the lightning 3dsmax plugin to acknowledge the ExecuteScript command.

Again, the way to solve this is to flush the render nodes using the following batch script:

taskkill /F /IM 3dsmax.exe
rd "C:\Users\USERNAME\AppData\Local\Autodesk\3dsMaxDesign\2014 - 64bit" /s /q

This kills 3dsmax.exe and then deletes the user preference folder. Usually, just killing 3dsmax.exe works, but sometimes I also remove the user pref folder to do a clean sweep. Perhaps I am missing something else.
Job_2014-03-06_17-34-08_53191400d7a84f06dc2698c0.txt (22.7 KB)
Job_2014-03-07_08-24-24_5319e4a82f4921032cdbf6c1.txt (23 KB)
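
For anyone who wants to run the same flush from a script instead of a hand-started batch file, here is a minimal Python sketch of the two steps. It is an illustration only: `flush_node`, `build_kill_command`, and `DEFAULT_PREFS` are made-up names, the prefs path is the one from the batch script resolved for the current user, and how the script gets triggered on each node is not covered. It defaults to a dry run so nothing is killed or deleted by accident.

```python
import shutil
import subprocess
from pathlib import Path

# Assumption: the preference folder from the batch script, resolved for the
# current user instead of a hard-coded USERNAME.
DEFAULT_PREFS = (
    Path.home() / "AppData" / "Local" / "Autodesk" / "3dsMaxDesign" / "2014 - 64bit"
)


def build_kill_command(image_name="3dsmax.exe"):
    """Return the taskkill arguments that force-kill every image_name process."""
    return ["taskkill", "/F", "/IM", image_name]


def flush_node(prefs_dir=DEFAULT_PREFS, remove_prefs=False, dry_run=True):
    """Kill lingering 3dsmax.exe processes and optionally delete the prefs folder.

    Returns a list describing the actions taken (or planned, in a dry run).
    """
    actions = []
    cmd = build_kill_command()
    actions.append("run: " + " ".join(cmd))
    if not dry_run:
        # taskkill exits non-zero when nothing matches; not an error here.
        subprocess.run(cmd, check=False)
    if remove_prefs:
        actions.append(f"remove: {prefs_dir}")
        if not dry_run:
            shutil.rmtree(prefs_dir, ignore_errors=True)
    return actions


if __name__ == "__main__":
    # Dry run: only print what would happen.
    for line in flush_node(remove_prefs=False, dry_run=True):
        print(line)
```

Note that this still kills every 3dsmax.exe on the node, so it has the same drawback as the batch file when a machine is rendering more than one job.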

You know, we are finding that even killing the EXE does not work all the time. It seems a reboot is the most surefire way to avoid the errors. There has got to be something I am doing wrong or missing.

Hi Mark,
Couple of things here:

  1. Machine “W-AMDEN-S-DES23” is running SP4, but you’re submitting from an SP3 machine. I honestly don’t think this is the cause of your main issue, but I always recommend making sure ALL machines are running EXACTLY the same version of 3dsMax. Could you please downgrade back to SP3 while ADSK fixes SP4? They recently pulled the SP4 download due to another unrelated issue in their software.
  2. “Missing dll: Built-in - AnchorHelperObject”: I don’t think you and I need to have another conversation on this little ADSK gem. However, it got me thinking that it would be really interesting if you could take one or more of your machines out of NT service mode and run them for a while in normal GUI/desktop mode in Windows, with Deadline Slave in GUI mode as well, just to see whether running 3dsMax that way changes or improves anything. Any difference in the error/log reports? If so, please do post them.
  3. Now onto my main suggestion: could you please go to “Configure plugins > 3dsmax” and ENABLE the “Kill ADSK Comms Center Process” setting? Once set, send (or re-queue) a load of Max jobs to your farm and see if it helps at all. I think this is going to fix your issue, but if you don’t see any change, then please DISABLE this setting again.

@Courtenay - Please could you provide some example logs like Mark has so we can confirm you are experiencing exactly the same issue.

Regards,
Mike

Hello Mark and Courtenay,

Can I ask each of you to verify whether Deadline is running as a service on your render nodes? Thanks.

Cheers,

Aha! I missed that configuration dialog (ADSK Comms). I’ll give it a try and report back. Thanks.