[8.0.12.4] crash - hang

A slave threw this error while tearing down after a render. Note the first line, which is pretty odd to begin with: the slave is cancelling its current job because it found a higher priority job with the same ID. It's been hanging ever since:

2017-03-01 17:50:26:  Scheduler Thread - Cancelling current job 58b77570c2efd02b686b8a28 because it is interruptible and higher priority job 58b77570c2efd02b686b8a28 has been found.
2017-03-01 17:50:26:  0: In the process of canceling current task: ignoring exception thrown by PluginLoader
2017-03-01 17:50:26:  0: Unloading plugin: 3dsmax
2017-03-01 17:50:26:  0: INFO: End Job called - shutting down 3dsmax plugin
2017-03-01 17:50:32:  0: WARNING: Timed out waiting for the renderer to close.
2017-03-01 17:50:32:  0: WARNING: Did not receive a success message in response to EndJob: 
2017-03-01 17:50:32:  0: INFO: Disconnecting socket connection to 3dsmax
2017-03-01 17:50:32:  0: INFO: Waiting for 3dsmax to shut down
2017-03-01 17:50:32:  0: INFO: 3dsmax has shut down
2017-03-01 17:50:33:  Scheduler Thread - In the process of canceling current tasks: ignoring exception thrown by render thread 0
2017-03-01 17:50:33:  Scheduler Thread - Seconds before next job scan: 2
2017-03-01 17:50:36:  Scanline_NukeCacheCleanup: Scanline_NukeCacheCleanupEventListener.OnSlaveIdle sSlaveName LAPRO0496
2017-03-01 17:50:36:  Scanline_NukeCacheCleanup: Calling cleanupSaveCache
2017-03-01 17:50:36:  Scanline_NukeCacheCleanup: cleanupSaveCache called
2017-03-01 17:50:36:  Scanline_NukeCacheCleanup: Info file name: S:/ns/_info
2017-03-01 17:50:36:  Scanline_NukeCacheCleanup: Info file found
2017-03-01 17:50:36:  Scanline_NukeCacheCleanup: Skipping save cache check, waitin -14.9134163889 hours...
2017-03-01 17:50:36:  Scheduler - Performing Job scan on Primary Pools with scheduling order Pool, Weighted, Balanced
2017-03-01 17:50:36:  Scheduler - Using enhanced scheduler balancing
2017-03-01 17:50:36:  Scheduler - Preliminary check: Slave is not whitelisted for 58b6a2ff13ead4145aa86b2c limit.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b6a2f013ead413f40bca80 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The ingest2d limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The r3d_convert limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b772976861fc1650fa6fef limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b77530e087f02f3099ea85 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b7764ff0035847a4a07217 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b7769fcdb7408258f2b415 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b77712997d663878d37a4d limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b777ec2d173f23d06f2637 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b777f83178be29bc85aacd limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b7789362235c2bb018d9e3 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b778d4b495f21608830751 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b77903cd1ff22d589f62ad limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b77925b2729b120c5280ef limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b7792678d6425d90eff346 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b7713b89379c11f01373ce limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b774e06487a523f887dfd9 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Returning 58b76f5e373bca3af4988678
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b76fdae389b824c064f42a limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b771aec659b005c8909557 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b771d277c7e428e89f6b7f limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b778561426bc224008a78d limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b6f3831dbf03a5b804602a limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: Slave is not whitelisted for mx_fumefx3 limit.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b70eea5c371e1320095414 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b70f435c371e256c19e8d0 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Returning 58b70fe35c371e0ef4fd16bb
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b727be92eb31499805749e limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b727c192eb311058e683c9 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b727c492eb3118a83e13bc limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b727c792eb313808d22c6a limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b727ca92eb3131b40cc142 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b727cc92eb311ecc51a3ed limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b727cf92eb311c989ee60d limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b727d292eb312704ec282e limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b73dfcb25c5b046cae648c limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b73eb0b25c5b4a60e04940 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b745adbc7b1a13743254ed limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b745b0bc7b1a53b421e1a5 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b745b3bc7b1a47843d3a06 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b745b6bc7b1a640441b564 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b748fbeb60810810ccb63f limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b74937eb608110d86a46ed limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b7493aeb608127bc27e9e5 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b7498aeb608125a467971b limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b7498deb608124043c1716 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b74a53eb608127f4676228 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b74a56eb60812f0cffaa91 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b74b2ceb608122bc1fdcf4 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b74b2feb60812480cf0260 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b74b7ceb60812a80891474 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b74b80eb608124600a7fda limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b7583db25c5b42f030d3db limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b75be5b25c5b57e43e9243 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b76601b25c5b0308321706 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b76639b25c5b5b901824aa limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b76db9b25c5b0afc619120 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b76dbc92eb3121bc7e9ed1 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b76eaf92eb31511c9106a9 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b76f72bc7b1a28a87b8d9c limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b76f75bc7b1a3390862711 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b76f78bc7b1a3ce48672b4 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b76f7abc7b1a2a28828e9d limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b7704ab25c5b59e448f3cf limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b770c1bc7b1a15642803e2 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b770c4bc7b1a0f3404e353 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b770c7bc7b1a448c11be26 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b770cabc7b1a3cb03b9414 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b77186eb6081213c0dd2de limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b7718feb60812a7c67a257 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b77192eb60810588151f82 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b77282eb608121a059274f limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b7728aeb60812d488f8a95 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b7728eeb608128ec0f4dbc limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b772d6eb6081161c7e0fbb limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b772deeb60812bf8b6b288 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b772e2eb608105b027a8cf limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b7732ceb60812ca0541177 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b77335eb60812548e061f6 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b77339eb608110f848aefc limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b6ff245c371e23a4b0f1f1 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b710525c371e1c5492c936 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b72e6f92eb312edc56e00e limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b72e7492eb313b08d65791 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b748ffeb608118243744a3 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b76eb292eb314ee4b6acdc limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b69dd371302e2ad4a20cc3 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b700b65c371e0b405d71a0 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b72e6c92eb31422056946b limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b72e7192eb312c38ae7000 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b70c506d6b041628f28c6f limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: Slave is not whitelisted for mx_128gb limit.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58af6364b25c5b3558f1c81f limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b76ea4bc7b1a5a981d75e0 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b76ea6bc7b1a50cc02b4c6 limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b6f12ed1a83a76c09d438e limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Preliminary check: Slave is not whitelisted for 58b5cd74b453550794df7d04 limit.
2017-03-01 17:50:36:  Scheduler - Preliminary check: Slave is not whitelisted for 58b77988a3f32f3b587c60b0 limit.
2017-03-01 17:50:36:  Scheduler - Preliminary check: The 58b77a198e592f52e414b2ee limit is maxed out.
2017-03-01 17:50:36:  Scheduler - Returning 58b72dabd26a751f94265a79
2017-03-01 17:50:36:  Scheduler - Successfully dequeued 1 task(s).  Returning.
2017-03-01 17:50:36:  Scheduler - Returning limit stubs not in use.
2017-03-01 17:50:36:  Scheduler -   returning mx_flowlinesim
2017-03-01 17:50:36:  Scheduler -   returning rendernodes
2017-03-01 17:50:36:  Scheduler -   returning quicktime_confirmed_working
2017-03-01 17:50:36:  Scheduler -   returning nuke9
2017-03-01 17:50:36:  Scheduler -   returning nuke
2017-03-01 17:50:37:  System.Threading.ThreadStartException: Thread failed to start. ---> System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
2017-03-01 17:50:37:     --- End of inner exception stack trace ---
2017-03-01 17:50:37:     at System.Threading.Thread.StartInternal(IPrincipal principal, StackCrawlMark& stackMark)
2017-03-01 17:50:37:     at System.Threading.Thread.Start(StackCrawlMark& stackMark)
2017-03-01 17:50:37:     at System.Threading.Thread.Start(Object parameter)
2017-03-01 17:50:37:     at Deadline.Slaves.CommandListener.b()
2017-03-01 17:50:37:     at Deadline.Slaves.CommandListener.c()
2017-03-01 17:50:37:     at Deadline.Slaves.CommandListener..ctor(Int32 commandPort)
2017-03-01 17:50:37:     at a.a(String[] A_0)

It's been sitting there ever since, still holding on to the task in its GUI:
Capture.PNG

There is no sign of any other processes or high ram usage.

Slave name is lapro0496 (for future ref)

Hmm, interesting. If I were to guess, I’d say the two issues are unrelated; it seems to have succeeded in its ‘requeue’ of the Job.

Was this with Plugin Sandboxing turned back on? That OOM error is definitely coming from somewhere inside a Deadline Sandbox (unless something REALLY weird is going on). It’s possible that it was the Event Plugin sandbox (which doesn’t get turned off currently), so that might explain it if you do have it turned off still.

Is the process still running (either the Slave or the Sandbox)? I imagine you’ve probably restarted it at this point, but if so, a memory dump would be helpful in determining what it’s hanging on.

Also, if you check the log, were there any Event Plugin timeouts that occurred earlier on? Are you guys using event plugin timeouts? Now that I’m looking at some of this code, there’s definitely some weird behaviour around timeouts that might result in this kind of hanging; we’ll definitely look at getting that fixed either way.

We actually only re-enabled plugin sandboxing on the 2nd of March (yesterday), so this crash seems to have happened before that.

The process has been restarted since. I didn't find any obvious event errors in the logs, nor anything in the Windows event logs.

I don't think we are using any event timeouts... Is that a feature I'm unaware of? :)

What do you think about the first line of the log, where the job is cancelled for a higher priority job with the same ID?

Yeah, I’m not sure what’s going on there, that’s very strange. Glancing at the code, there is a check to try and make sure that the detected job isn’t the current one, but there might be something else mutating the state in another thread.

To confirm, it was actually rendering that Job ID? Wondering if this is a display issue somehow… Has it only been this one instance so far, or has this been happening repeatedly?

Yeah, that's the job it was rendering!
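
To check whether this has happened more than once, the slave logs could be scanned for cancellation lines where both job IDs match. The following is only a rough sketch: the regular expression is built from the single log line quoted above and assumes other occurrences use the same wording, and `find_self_cancellations` is a made-up helper name, not part of Deadline.

```python
import re

# Pattern derived from the one log line quoted in this thread (assumption:
# other occurrences use the same wording and 24-character hex job IDs).
CANCEL_RE = re.compile(
    r"Cancelling current job (?P<current>[0-9a-f]{24}) because it is "
    r"interruptible and higher priority job (?P<found>[0-9a-f]{24}) has been found"
)


def find_self_cancellations(lines):
    """Return (line_number, job_id) pairs where a job was cancelled for itself."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = CANCEL_RE.search(line)
        if match and match.group("current") == match.group("found"):
            hits.append((number, match.group("current")))
    return hits


sample = [
    "2017-03-01 17:50:26:  Scheduler Thread - Cancelling current job "
    "58b77570c2efd02b686b8a28 because it is interruptible and higher "
    "priority job 58b77570c2efd02b686b8a28 has been found.",
    "2017-03-01 17:50:26:  0: Unloading plugin: 3dsmax",
]
print(find_self_cancellations(sample))
```

For a real log, the same function can be fed an open file object, since it only iterates over lines.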