I know the 8.06 release notes said that a bug with CPU affinity and 3DS Max had been fixed, but I’m not sure it has (we’re running 8.07 now). I’ve got CPU Affinity set to use only half the cores on a machine, but it still maxes out the CPU on all cores at 100% when using 3DS Max. I can send an After Effects job to the machine and it’ll run on the specified cores.
I have also been having this issue. To work around it, I started hosting my servers in a virtual environment and setting the CPU affinity this way instead. I also find this method gives me greater control, as I can reset the VM, install updates, etc. without affecting my normal workstation… I know this isn’t the answer you’re looking for, but it is a workaround for now.
Thanks for the tip, although I think this is more involved than we want to get at the moment! I was just doing some tests to see if there’s any difference in performance when running two slaves on a dual cpu machine split across the different chips as opposed to 1 with all resources.
The fix you mention was specifically related to using the “1 CPU per task” option in our submitter not respecting CPU Affinity Overrides. As you can see, this issue came up previously but we were unable to identify the cause and couldn’t reproduce on our end…
Would you be willing to post logs from the slaves and/or jobs that aren’t respecting CPU Affinity? Can you reproduce this with a simple test scene for us (no assets/plugins)? Which version of 3DS Max are you using? Plugin versions? Which renderer is being used? All of these would be greatly appreciated, and should help identify where the problem lies.
I’ve attached a file that will do it. We’re using Deadline 8.0.7.3, 3DS Max 2016 (no service packs) and Vray 3.40.02 on Windows 7. The machine the slave was running on is an HP Z640 with dual Xeon E5-2630v3 processors. I split the cores into 0-15 and 16-31, as well as the GPUs into 0-7 and 8-15. If you need any more information let me know.
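For readers following along, the 0-15 / 16-31 split can be written as two affinity bitmasks, which is how Windows itself represents process affinity (one bit per logical processor). The following is only an illustrative sketch: the helper name affinity_mask is hypothetical, and nothing here is taken from Deadline's own code.

```python
def affinity_mask(first_core, last_core):
    """Return a bitmask with bits first_core..last_core (inclusive) set.

    Hypothetical helper for illustration; bit N stands for logical processor N.
    """
    if first_core < 0 or last_core < first_core:
        raise ValueError("expected 0 <= first_core <= last_core")
    width = last_core - first_core + 1
    return ((1 << width) - 1) << first_core


# The split described in the post: one slave per physical CPU.
slave_a = affinity_mask(0, 15)
slave_b = affinity_mask(16, 31)

print(hex(slave_a))  # 0xffff
print(hex(slave_b))  # 0xffff0000

# The two masks should not share any cores.
print(slave_a & slave_b == 0)  # True
```

A process whose mask covers both ranges would show 0xffffffff, which is the "all 32 cores" situation reported in this thread.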
Unfortunately, we’re having issues reproducing this on our end… Even on our own dual CPU machine (with the same OS, same 3dsmax, same vray, same Deadline). Although we don’t have quite the same number of logical processors as you… So maybe there’s something there…
Got a few more questions for you though, if you’re up for it!
Could you send us the logs for the 3dsmax job with verbose logging for slaves turned on (setting found here)? Does the slave running the job have the correct CPU affinity? Do you have plugin sandboxes enabled or disabled (sandbox setting found here)?
We have two types of plugins, “Advanced” and “Simple”. 3dsmax is advanced and After Effects is simple. I wonder if there’s any difference for you based on which type of plugin is running… What if you submit this job to the 3dscommand plugin? Alternatively, if you already use the software, could you try running a job for one of these other advanced plugins (maya batch, c4d batch, modo, nuke, or microstation are most of the other advanced plugins) and see if the CPU affinity is correct for those?
Is it just this machine that experiences these issues? Or machines similar to this one? Is it a NUMA system or is it SMP (disregard if unfamiliar with those)?
There’s a lot here, so let me know whenever you get the chance! Your help is greatly appreciated.
I’ve set it to verbose and run the job again, as well as Mayabatch, 3DS cmd and 3DS Max with Mental Ray jobs. Maya respects the CPU affinity and only uses the specified cores, 3DS cmd doesn’t and uses the whole CPU, and 3DS Max with Mental Ray also respects the CPU Affinity. I realise there’s a lot of extra information in these logs, so can I email them to you rather than post them here?
I don’t know if the machine is set up as NUMA or SMP. It looks like it might be the dual Xeon machines: I tested on a dual Xeon x5650 and it doesn’t respect the settings for 3DS Max either, but on an i7-4790k it used only the selected cores (both for 3DS Max and Vray jobs).
Hopefully that gives you a bit more information to work with…
Thanks Nick! I really appreciate the amount of help you’re giving us!
Those are indeed some interesting results… We’ve got some ideas floating around here that might address this. On a dual Xeon machine, could you confirm if the slave running the job has the proper affinity set? Specifically, I’d like to confirm the deadlineslave.exe process is showing the proper affinity, even though the 3dsmax.exe process has the wrong affinity.
Both deadlineslave.exe’s are showing CPU affinity correctly and both 3dsmax.exe’s are showing it incorrectly. I’ve also emailed the logs through to you.
If it’s the combination of Xeon CPU(s), 3dsmax, and VRay together, then we should have a machine up and ready tomorrow that’ll let us test this (knock on wood). In the meantime however, could you test something for me? I’d like to verify that vray threads aren’t a part of the issue.
I’ve attached your original test scene with the modified value. I just need you to submit this scene as a job for your slave to render and to then check the affinity of the 3dsmax process.
Also, unrelated to the test above and no promises, but I believe I can get the affinity to work for max/vray with a modified plugin script… Should have something for you to test today or tomorrow. Anyways, thanks again for your co-operation!
Alrighty, hopefully this fixes the issue for you with 3dsmax (3dsmax command will still be broken, but I’d like to test this first).
Make a backup of yourDeadlineRepository/plugins/3dsmax/3dsmax.py
You can either rename your current plugin file, or move it to a new location so it’s not overwritten.
Unzip the attached file, and copy the included file (3dsmax.py) into yourDeadlineRepository/plugins/3dsmax/ folder
Run the job with a problematic slave
Check the affinity, hopefully it matches what you specify now. Otherwise, we’ll probably have to wait until the next release of Deadline 8.0 to try a different fix for it.
To get your repo back to normal, just place your backed up 3dsmax.py into its original folder/back to its original name.
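The backup and restore steps above can also be scripted. This is a minimal sketch, assuming the backup is kept next to the plugin file with a ".bak" suffix; the function names are hypothetical, and the demonstration runs in a throwaway directory rather than a real repository.

```python
import shutil
import tempfile
from pathlib import Path


def backup_path(plugin_file, suffix=".bak"):
    """Return the backup location next to the plugin file (assumed naming)."""
    plugin_file = Path(plugin_file)
    return plugin_file.with_name(plugin_file.name + suffix)


def backup_plugin(plugin_file):
    """Copy the current plugin file aside, refusing to clobber an older backup."""
    backup = backup_path(plugin_file)
    if backup.exists():
        raise FileExistsError(f"backup already exists: {backup}")
    shutil.copy2(plugin_file, backup)
    return backup


def restore_plugin(plugin_file):
    """Copy the backed-up file back over the plugin file."""
    shutil.copy2(backup_path(plugin_file), plugin_file)
    return Path(plugin_file)


# Demonstration in a temporary folder standing in for the plugin folder.
with tempfile.TemporaryDirectory() as folder:
    plugin = Path(folder) / "3dsmax.py"
    plugin.write_text("original plugin\n")
    backup_plugin(plugin)               # back up the current file
    plugin.write_text("test plugin\n")  # drop in the test file
    restore_plugin(plugin)              # get back to normal afterwards
    restored_text = plugin.read_text()

print(restored_text)
```

Refusing to overwrite an existing backup is a deliberate choice here, so that running the backup twice cannot replace the original file with the test version.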
Thanks for the update. I’ve tested this out on our machine and it almost works… It’s very strange. So I ran the slave and it picked up a 3dsmax job. I watched the performance in the task manager and 3dsmax.exe showed the correct affinity of 16 cores as it loaded up 3dsmax. I think V-Ray then kicked in, and all of a sudden the affinity for 3dsmax.exe went up to 32 and it was using the CPU at 100%…
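The jump from 16 to 32 cores can be expressed as a comparison between the mask that was requested and the mask that was observed. A small sketch, assuming affinity is read as an integer bitmask with one bit per logical processor; the function names are hypothetical and the two masks are example values matching the numbers in the post, not captured from the machine.

```python
def cores_in_mask(mask):
    """List the logical processor indices whose bits are set in mask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def affinity_report(expected_mask, observed_mask):
    """Summarise how an observed affinity mask differs from the expected one."""
    extra = observed_mask & ~expected_mask  # cores in use that were not allowed
    return {
        "expected_count": len(cores_in_mask(expected_mask)),
        "observed_count": len(cores_in_mask(observed_mask)),
        "unexpected_cores": cores_in_mask(extra),
    }


# Example values: cores 0-15 requested, all 32 cores observed.
report = affinity_report(0xFFFF, 0xFFFFFFFF)
print(report["expected_count"])    # 16
print(report["observed_count"])    # 32
print(report["unexpected_cores"])  # the indices 16 through 31
```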
So we think we’ve narrowed it down slightly to this section of the job:
This seems to be the point at which the render actually starts; up until this point it is converting the scene for VRay. We also have Embree enabled, so I’m not sure if this would have an impact?
A little background information, there’s only one instance where we manually set the CPU affinity for a specific program (aside from that test script I sent you) which is with the “1 CPU per Task” option in 3dsmax. How we normally do CPU affinity is we set it at the slave level and any processes spawned by that slave will then inherit its CPU affinity (ie. Slave > sandbox > 3dsmax).
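The inheritance described here (a child process starting with its parent's affinity) is general operating system behaviour and can be observed outside Deadline. The sketch below is an analogy only: it uses Python's os.sched_setaffinity, which exists on Linux but not on Windows, so on the Windows machines in this thread it simply returns None. The function name is hypothetical.

```python
import json
import os
import subprocess
import sys


def child_affinity_after_restricting(cores):
    """Restrict this process to cores, spawn a child, and return the child's affinity.

    Returns None where os.sched_setaffinity is unavailable (e.g. Windows, macOS).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    original = os.sched_getaffinity(0)
    child_code = "import json, os; print(json.dumps(sorted(os.sched_getaffinity(0))))"
    try:
        os.sched_setaffinity(0, cores)  # the "slave" restricts itself
        output = subprocess.check_output([sys.executable, "-c", child_code], text=True)
    finally:
        os.sched_setaffinity(0, original)  # undo the restriction afterwards
    return set(json.loads(output))


if hasattr(os, "sched_getaffinity"):
    one_core = {min(os.sched_getaffinity(0))}
    # The child reports the same single core it inherited from the parent.
    print(child_affinity_after_restricting(one_core) == one_core)
```

Inheritance only sets the starting point. A child process is still free to change its own affinity later, which would be consistent with what is being reported for 3dsmax.exe once rendering starts.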
The script I had you test explicitly sets the CPU affinity for 3dsmax, and it seems that for whatever reason V-Ray is overriding the CPU affinity (which is something we haven’t been able to reproduce with our dual xeon machine, though mind you it has quite a few less logical processors …).
As for your second post, I’m not familiar with embree, but I’ll take a look into it. Does disabling it give you the proper CPU affinity? It seems that it’s on by default (V-Ray Tech Overview), but it doesn’t affect CPU affinity for us. In the 3dsmax render settings dialog, what happens if you turn off “Dynamic bucket splitting” under Settings in the System rollout?
Given all the results that you’ve been experiencing, it’s looking more and more like a V-Ray/Chaos Group issue. Will probably try a few more things before getting in contact with Chaos Group.
Very interesting, I agree that it looks more like a V-Ray thing. I turned both embree and dynamic splitting off and got the exact same result: it stays at 50% until it starts rendering, then it changes the affinity and jumps to 100%. There are definitely a lot of cores in this machine. I’ve been demoing it to see what the performance difference is like, which is why I’ve been testing the affinity. We get a marked improvement from running two slaves (even with the affinity not working) over running one.
Anyways, the gist of it is that they’re always explicitly setting CPU usage to all (WHY DOESN’T IT WORK FOR ME?!). So hopefully the following is the final test for this.
Create an environment variable on that machine with the following key-value pair:
VRAY_USE_THREAD_AFFINITY=0
This should disable VRay overriding the CPU affinity, but it won’t run on more than 64 cores. If that works for you, I’ll make some modifications to the script to handle all this logic automatically, so no one will have to do this manually.
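Setting the variable per process rather than machine-wide could look roughly like the following. This is a sketch under assumptions: it uses plain subprocess rather than any Deadline API, the function names are hypothetical, and a Python child process stands in for the renderer. It only shows that the child sees the variable, not that V-Ray acts on it.

```python
import os
import subprocess
import sys


def render_environment(base_env=None):
    """Return a copy of the environment with VRAY_USE_THREAD_AFFINITY set to 0."""
    env = dict(os.environ if base_env is None else base_env)
    env["VRAY_USE_THREAD_AFFINITY"] = "0"
    return env


def child_sees_variable():
    """Launch a stand-in child process and report the value it sees."""
    child_code = "import os; print(os.environ.get('VRAY_USE_THREAD_AFFINITY'))"
    output = subprocess.check_output(
        [sys.executable, "-c", child_code],
        env=render_environment(),
        text=True,
    )
    return output.strip()


print(child_sees_variable())  # 0
```

Passing a copied environment leaves the parent process and the machine-wide settings untouched, so the change applies only to jobs launched this way.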
I’ve tried that as a system variable and a user variable, still no luck, it still jumps up to 100% CPU as soon as it gets into the render. Vray just doesn’t want to be limited!