AWS Thinkbox Discussion Forums

Various Issues with slaves

And again…
This time it’s cell-rs-18. Stalled around 22:09.
Logs attached.

Cheers,
Holger
Deadline_logs_2014-11-01_2.zip (986 KB)

Thanks! This is actually a case where the slave didn’t shut down cleanly, but we should be able to prevent this specific problem from occurring in the future. We’ll get this fixed for beta 7.

Cheers,
Ryan

We’ve been running beta7 since yesterday but it seems the problem is still there.
Just happened around 45 mins ago. Pulse and Slave logs attached.

Cheers,
Holger
Deadline_logs_2014-11-07.zip (187 KB)

Strange. We’re not able to reproduce it any more on our end…

Just to confirm, was Pulse also updated to beta 7?

Yes, Pulse is also on b7.
Would it help to send you any Windows logs?

Not sure if that would help. The slave log isn’t showing a crash, so I wouldn’t expect anything to show up in the Windows event logs.

In beta 8, we’re going to be adding additional logging to the slave right before it exits. Hopefully it helps narrow this down further.

Is the problem happening as often with beta 7 as it was before?

Also, are you running the slaves as services or as normal applications? We tested both cases here and couldn’t reproduce, but it would still be good to know.

Thanks!
Ryan

Right now, we’re running them as normal applications, but we might switch to running them as services at some point.
I can’t tell yet whether it’s happening as often as before. We’ve only been running beta7 since yesterday, and the rendering demand has decreased quite a bit since we just finished a project. I’ll keep an eye on it for sure.

Cheers,
Holger

Brief update: it’s definitely still happening. Probably not as often as before, but it’s still hard to tell. It won’t help to send you any more logs before the additional logging info in beta8, will it?

We’re still struggling to reproduce here. When you upgraded Pulse to beta 7, was it by manually running the client installer, or by auto-upgrading? Regardless, can you try running the client installer on the pulse machine again to see if it helps?

Thanks!
Ryan

It was with the manual installer.
Should I uninstall first or just run it again?

Cheers,
Holger

Just run it again.

Thanks!

I did re-install Pulse but it’s still happening.
I don’t have any scientific numbers, but the renderfarm hasn’t been as busy over the last 7 days or so and it’s still happening fairly frequently, so I guess the issue is almost as severe as before.
I’ll wait for the next beta before I send any more logs.

Cheers,
Holger

Thanks for confirming it’s still an issue. We’ll keep trying to reproduce here, and hopefully the Slave logs in beta 8 might help a bit.

Cheers,
Ryan

I assume beta8 has now become RC1, so is the extended Slave logging included in 7.0.0.46?

Cheers,
Holger

And here we go again… :wink:
It happened around 12:59, that’s when the notification mail came in.
Slave and Pulse logs attached. The Slave was quit manually in this case.

Cheers,
Holger
Deadline_logs_2014-11-17.zip (569 KB)

Hey Holger,

Thanks! We were finally able to reproduce this problem on Friday last week, and we think we have a solution for it. We’ll be including the fix in RC2.

Thanks again for your help (and patience) with this one. :slight_smile:

Cheers,
Ryan

Hey Ryan,

Great news! Looking forward to RC2 then!
You’re welcome, glad I could help.

Cheers,
Holger

Hey Ryan,

I’m afraid I have bad news…
I just upgraded all machines to RC2, but it happened again with one machine when it was shut down by Pulse.
It must have happened around 23:07; the machine was marked as ‘Stalled’ at 23:21.
Slave and Pulse logs attached.

Cheers,
Holger
Deadline_logs_2014-11-23.zip (48.1 KB)

And another one right after. This machine was shut down manually by shutting down Windows, without closing Deadline Slave first.
Logs attached.
Deadline_logs_2014-11-23_2.zip (68.9 KB)

Thanks for the logs! Neither of them is showing the new logging we added right before the slave closes, so the slaves must be getting killed before they reach that point. We’ll look into it.

Thanks!
Ryan
