I’m not sure where to post this, so apologies if it’s in the wrong spot.
Running the latest Deadline on an M1 Mac in Rosetta mode causes a Worker crash after a short while (10-15 minutes, maybe). This happens 100% of the time on an M1 Mac mini and an M1 MacBook Air. Both are running Big Sur 11.3. Is there any word on whether a native Apple Silicon port is in the works? Regardless, Rosetta should be adequate in the short term, if only it were stable.
Deadline is very stable on Intel Macs, as well as on Windows machines.
I just bought an M2 Mac mini to see if it works as a nice compact render machine (mainly for Nuke jobs). I seem to be experiencing the same problem, possibly with even shorter periods between crashes. Any updates on this? It seems like an ideal small setup for the job, but if this doesn’t work at all, I might have to rethink my approach…
Thanks for reaching out. Can you please share the crash report from your Mac? We have seen this reported before, and after reviewing the crash reports customers shared, our engineers found similar crashes reported online. The solution for those folks was this:
I re-installed Rosetta 2 following these steps. Unfortunately nothing changed, and both of our identical MacBook Pro M1 Max 64GB machines still stall after some time: sometimes after 10 minutes, sometimes after 20 minutes or an hour, and up to 1h30 so far.
They both run Deadline client 10.2.1.1 and are connected to an RCS 10.2.1.0. I couldn’t find the 10.2.1.0 macOS client version online anymore.
It looks like I’m having the same issue on my farm. I just re-installed Rosetta 2 per your suggestion; however, the Worker still hangs, sometimes after a few minutes and sometimes after an hour or so. The issue is specific to my M1 Mac Studios; the Intel Macs and PCs work as expected.
Mac Studio (2022) M1 Ultra 128GB
macOS 12.6.6
Deadline Client v10.2.1.1 with RCS v10.2.1.1
Please let me know if you need any additional information, thanks!
Can you send the crash report so we can investigate why the Deadline client application is crashing? Spindump, a built-in macOS tool, can provide insight by capturing thread activity and stack traces at the time of the crash.
Run spindump while the Deadline Worker is running and let it crash, so that spindump collects the required data, then post the report on this thread for investigation. Please refer to this blog post, which outlines the steps for running spindump to collect the stack trace of the Deadline Worker.
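For anyone who wants to script the capture, here is a minimal Python sketch of the spindump step described above. It assumes you already know the Worker's process ID (for example from Activity Monitor). The function names and the default output path are hypothetical, and the positional arguments (target, duration in seconds, sampling interval in milliseconds) and the `-file` option are as I understand them from the spindump man page, so please check `man spindump` on your macOS version before relying on it.

```python
import subprocess
from typing import List


def build_spindump_command(
    pid: int,
    duration_s: int = 30,
    interval_ms: int = 10,
    out_path: str = "/tmp/deadline_worker_spindump.txt",  # hypothetical default
) -> List[str]:
    """Build a spindump invocation that samples one process and writes a text report."""
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    # spindump requires root, hence sudo.
    # Positional arguments: target pid, duration (seconds), interval (milliseconds).
    return [
        "sudo", "spindump",
        str(pid), str(duration_s), str(interval_ms),
        "-file", out_path,
    ]


def capture_spindump(pid: int, **kwargs) -> str:
    """Run spindump against the given pid and return the report path."""
    cmd = build_spindump_command(pid, **kwargs)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return cmd[-1]


# Example (on the affected Mac, with the Worker's real pid):
#   report = capture_spindump(1234, duration_s=60)
```

Since the hang is intermittent, it is probably most useful to run this once the Worker has already stopped responding, so the samples show where its threads are stuck.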
@DaveJacobson84 would you be able to share the spindumps with me? I would like to take a look at them.
Also, if the Worker is not crashing and is just stalling for a while, it should write logs to the application log folder. Feel free to share the Worker logs as well.
If they contain sensitive data, you can also open a ticket with Thinkbox Support.
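If it helps to pin down which log covers a stall, and roughly when the Worker stopped writing, something like the following Python sketch could be used. It is only an illustration: the log folder is passed in as an argument because its location depends on your install, the `*.log` pattern is an assumption, and the function names are hypothetical. A quiet log is also only a hint, since an idle Worker may simply have little to write.

```python
import time
from pathlib import Path
from typing import Optional, Tuple


def newest_log(log_dir: str, pattern: str = "*.log") -> Optional[Tuple[Path, float]]:
    """Return (path, seconds since last write) for the most recently modified log.

    Returns None if no file in log_dir matches the pattern.
    """
    files = [p for p in Path(log_dir).glob(pattern) if p.is_file()]
    if not files:
        return None
    latest = max(files, key=lambda p: p.stat().st_mtime)
    age_s = max(0.0, time.time() - latest.stat().st_mtime)
    return latest, age_s


def looks_stalled(log_dir: str, quiet_after_s: float = 300.0,
                  pattern: str = "*.log") -> bool:
    """Heuristic: True if the newest log has not been written to for quiet_after_s seconds."""
    found = newest_log(log_dir, pattern)
    return found is not None and found[1] > quiet_after_s


# Example (replace the path with your actual Deadline log folder):
#   print(newest_log("/path/to/deadline/logs"))
```

The file returned by `newest_log` is likely the one worth attaching, and its last-write time gives a rough timestamp for when the stall began.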
Hi @karpreet, just seeing if you’ve managed to find any useful information in the spindumps. Please let me know if there’s anything else I can do to help move this along. Thanks!
I have managed to reduce the frequency of the constant hanging/crashing by increasing the interval between housecleaning operations (now set to 3600). I am also seeing this happen to the RCS about once a day.
Seeing exactly the same issue with two different versions of Deadline on new M2 Ultra Studios running the latest Ventura.
The machines do not crash, but the Deadline Worker seems to hang and cause a spinning wheel of death on each machine.
The length of time before they seize up seems to be random, not consistent.
The machines then show as stalled in Deadline Monitor.
Machine power settings are all set to always on, no screensaver, etc.
I have a second set of nodes, 2019 Mac Pros running Monterey, and do not get any stalls on those.
Hey Thinkbox, any luck reproducing this issue on your end? We’re currently unable to utilize any of our M1 Macs for Deadline rendering. Can we expect to see Deadline running natively on Apple Silicon any time soon? Thanks!
Consistent failures on both M1 and M2.
Tried multiple options: setting cleanup to 3600,
enabling "Prevent App Nap" for both Monitor and Slave,
system power settings all set to always on, Rosetta installed several times,
csrutil disabled, security settings set to low, Thinkbox and C4D given Full Disk Access.
Yet all 14 M2 Studio nodes still randomly stall, and only in the Deadline apps, both Slave and Monitor.
Monitor crashing consistently on M1.
My second farm, which is 15 Mac Pros (2019 Intels), does not have any issue!
This is really unacceptable for a product many businesses rely on.
We’ve not had any luck reliably reproducing this issue in order to root-cause it; it is as intermittent for us as it is for you folks. We’ve got an engineering issue rolling, but due to Amazon policy we can’t share any details about when it would be resolved or when a release with a fix would happen.
Qt seems to be the common denominator in the spindumps we’ve gotten in this thread and in the thread "Worker randomly crashing on MacOS M2 Ultra". We only use Qt to create the UI, which means running the Worker with -nogui should resolve the issue, but that’s not what we’ve been seeing.
If possible, could we get a spindump from a crash when running the Worker with the -nogui flag? It could be that there are two causes of a hang and crash, and the second only gets to show up with Qt removed from the equation.
Hey Justin, thanks so much for the response. I’ve just sent you a DM with a -nogui spindump. I had previously attempted running it with the -nogui flag, but it still stalls/hangs. It also might be worth mentioning I’m having no issues whatsoever running the Monitor on my M1 systems, it’s just the Worker that’s problematic. Thanks!