AWS Thinkbox Discussion Forums

Mac Slave crashing with Deadline 6.1

We are experiencing frequent crashing of Deadline Slave on all our Mac OS nodes in 6.1.

We are currently on 6.1.52824 (Beta 6) and plan to test a move to 6.1.53328 (Beta 8) this week.

At the moment we have a script that checks every 2 minutes whether the Slave is running and starts it if it is not. Most of the time only the Slave dies, but sometimes the Launcher does too. Sometimes mono crashes as well; I’m not sure whether that is related, but I’ve attached one of those crash reports to this thread. We are seeing less frequent mono crashes on our Linux nodes (we have a very small number of Mac nodes versus hundreds of Linux nodes), and I’m attaching an example stack trace from those too.
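For anyone wanting a similar stopgap, here is a minimal sketch of that kind of watchdog check in Python. It is not our actual script: the process pattern and the start command are placeholders you would have to fill in for your own install, and it assumes `pgrep` is available (it is on stock OS X and most Linux distributions).

```python
import subprocess


def is_running(pattern, run=subprocess.run):
    """Return True if pgrep finds a process whose command line matches pattern."""
    # pgrep exits with status 0 when at least one process matches, 1 when none do.
    result = run(
        ["pgrep", "-f", pattern],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def ensure_running(pattern, start_cmd, run=subprocess.run, spawn=subprocess.Popen):
    """Launch start_cmd if nothing matches pattern.

    Returns True if a start was attempted, False if the process was already up.
    """
    if is_running(pattern, run):
        return False
    spawn(list(start_cmd))  # fire and forget; the watchdog does not wait on it
    return True


# Example call (both arguments are hypothetical placeholders):
# ensure_running("<slave process pattern>", ["<path to slave start command>"])
```

Scheduling one `ensure_running` call every 2 minutes from cron or launchd gives the behaviour described above. The `run` and `spawn` parameters exist only so the check can be exercised without touching real processes.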

Our nodes all have identical hardware and software. Let me know what other info would be useful to provide; I will report back after testing Beta 8.

Model Name: Mac Pro
Model Identifier: MacPro5,1
Processor Name: 6-Core Intel Xeon
Processor Speed: 2.93 GHz
Number of Processors: 2
Total Number of Cores: 12
L2 Cache (per Core): 256 KB
L3 Cache (per Processor): 12 MB
Memory: 24 GB

System Software Overview:
System Version: OS X 10.8.4 (12E55)
Kernel Version: Darwin 12.4.0
Boot Volume: boot
Boot Mode: Normal
User Name: System Administrator (root)

Mono JIT compiler version 2.6.7 (tarball Tue Aug 24 16:33:27 MDT 2010)
Copyright © 2002-2010 Novell, Inc and Contributors. mono-project.com
TLS: normal
GC: Included Boehm (with typed GC)
SIGSEGV: normal
Notification: Thread + polling
Architecture: x86
Disabled: none
mono-stacktrace.log (13.2 KB)
mono_2013-10-22-172358_localhost.crash.log (79 KB)

The launcher had a memory leak in beta 6, so it could be that it ends up eating all the system memory before the crashes occur. This was fixed in beta 7 (and therefore beta 8) so upgrading is definitely recommended.
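If you want to check whether memory growth is what precedes the crashes, one rough approach is to sample the process's resident memory over time and see whether it only ever climbs. The sketch below is an illustration under assumptions, not a Deadline tool: it shells out to `ps -o rss= -p <pid>`, which reports resident set size in kilobytes on both OS X and Linux, and the helper names and the growth threshold are made up for the example.

```python
import subprocess


def rss_kb(pid, run=subprocess.run):
    """Return the resident set size of pid in kilobytes, or None if it is gone."""
    # "rss=" asks ps for the RSS column with the header suppressed.
    result = run(
        ["ps", "-o", "rss=", "-p", str(pid)],
        capture_output=True,
        text=True,
    )
    text = result.stdout.strip()
    return int(text) if text else None


def grew_steadily(samples, min_growth_kb):
    """True if the samples never decrease and grow by at least min_growth_kb overall."""
    if len(samples) < 2:
        return False
    never_shrinks = all(later >= earlier for earlier, later in zip(samples, samples[1:]))
    return never_shrinks and samples[-1] - samples[0] >= min_growth_kb
```

Collecting `rss_kb` readings for the Launcher's PID every few minutes and passing them to `grew_steadily` gives a crude signal; a process that plateaus or drops back after garbage collection would not trip it.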

We also discovered a potential issue where we weren’t acquiring the Python GIL in some cases when building up parameters to pass to certain scripts (pre/post job or task scripts, dependency scripts, etc.), so if you are using any of those types of scripts, that could explain the slave crash as well. This will be fixed in beta 9.

Let us know if you see any improvements after upgrading to beta 8. You might just want to restart the Macs after upgrading as well, just to “clean things up”.

Cheers,

- Ryan

Thanks Ryan! We still haven’t pulled the trigger, but we will update ASAP this week. Do you have an ETA for beta 9?

We’re hoping to release it tomorrow, or Wednesday at the latest. This is assuming nothing catastrophic happens during the build process tomorrow morning… :)

We have been running beta 9 for a few days now. We haven’t had many Mac jobs to put it through its paces yet; so far there has been just 1 slave crash, with an associated mono crash. Will report back as we continue testing.
