When Deadline performs an automatic shutdown through Power Management, exactly one slave produces a “stalled slave report”.
However, that PC hasn’t actually stalled; it definitely shut down, so the report is just a spurious error message.
The problem doesn’t happen every time, but it often occurs when I use many slaves.
It also only happens on Deadline 6.1; when I used 6.0, I never saw this problem.
thanks
This means the slave didn’t exit cleanly and had to be killed so that the machine could shut down.
It would be helpful if we could see the slave log from the session prior to the machine shutting down. It should show what the slave was doing when it was told to shut down. You can find the log folder from the Slave application by selecting Help -> Explore Log Folder, or by right-clicking on the Launcher icon and selecting Explore Log Folder. Just make sure to grab a log from the session BEFORE the machine was shut down.
Thanks!
Ryan
Hmm, please let me know where the default log folder is.
Those PCs run in service mode (Windows PCs); I don’t use the Slave application on them.
And strangely, I can’t see the logs of any of my service-mode PCs from the Monitor. I just get a connection error.
But I can see the logs of the other PCs (workstations where I start the Slave application manually) from the Monitor.
thanks
Yup, on Windows, that folder is:
%PROGRAMDATA%\Thinkbox\Deadline6\logs
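If you'd rather grab the newest slave logs with a script than sort the folder by hand, something like the sketch below works. Note the `deadlineslave` filename prefix is an assumption on my part; adjust the filter to match the actual log file names on your machines.

```python
import os
from pathlib import Path

def newest_slave_logs(log_dir, count=5):
    """Return the most recently modified slave logs in log_dir.

    The 'deadlineslave' filename prefix is an assumption; adjust it
    to match the actual log file names on your machine.
    """
    logs = [p for p in Path(log_dir).glob("*.log")
            if p.name.lower().startswith("deadlineslave")]
    return sorted(logs, key=lambda p: p.stat().st_mtime, reverse=True)[:count]

# %PROGRAMDATA% normally expands to C:\ProgramData on Windows.
log_dir = os.path.expandvars(r"%PROGRAMDATA%\Thinkbox\Deadline6\logs")
if os.path.isdir(log_dir):
    for log in newest_slave_logs(log_dir):
        print(log)
```

The most recent file is usually the current session, so the one you want (the session before the shutdown) will typically be the second entry.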
Cheers,
Ryan
I’ve attached the logs and slave reports from when the error happened.
Could the Repository sometimes be missing the shutdown command?
thanks
error-log.zip (11.6 KB)
Thanks for the logs! I took a look, and it looks like for whatever reason, one of the slave’s threads fails to exit. It’s the thread that’s responsible for updating the slave’s state to Offline when it shuts down. You mentioned that this only happens on one machine, right? Have you tried reinstalling Deadline 6.1 on this machine to see if that helps?
Cheers,
Ryan
It happens on one machine at a time, and not a specific PC.
The PC that produces the stall error is different every time, but it is always exactly one.
When I use 5-10 PCs, I don’t see this; it happens when I use over 50 slaves, though not necessarily every time.
And this problem does not occur on Deadline 6.0.
Ah, okay, thanks for clearing that up for me.
I know you guys are running Pulse, since you’re using Power Management, but do you have the Pulse host name (or IP address) configured in your Repository Options?
thinkboxsoftware.com/deadlin … e_Settings
I notice in the slave log that the slave is doing some housecleaning operations, which it shouldn’t be doing if Pulse is running. There are two reasons a slave would do this when Pulse is running:
(1) The Pulse host name (or IP address) isn’t set (see above).
(2) The host or IP is set, but the slave is unable to connect to Pulse using it.
If it’s (2), it could be due to a firewall on the Pulse machine, or a DNS issue that prevents the slave from connecting to it.
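If you want to test case (2) directly from a slave, a quick DNS-plus-TCP check against the Pulse machine can rule out name-resolution and firewall problems. This is just a sketch; the port number is an assumption on my part, so use whatever port Pulse is actually configured to listen on in your setup.

```python
import socket

def check_pulse(host, port, timeout=5.0):
    """Try to resolve the Pulse host and open a TCP connection to it.

    Returns (ok, detail). Run this on the slave machine, passing the
    same host name or IP that is configured in the Repository Options.
    The port is an assumption -- use the port Pulse actually listens on.
    """
    try:
        ip = socket.gethostbyname(host)  # DNS / hosts-file lookup
    except socket.gaierror as e:
        return False, f"DNS lookup failed: {e}"
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True, f"connected to {ip}:{port}"
    except OSError as e:
        return False, f"TCP connect to {ip}:{port} failed: {e}"
```

A DNS failure points at name resolution on the slave; a TCP failure with a correct IP points at a firewall on the Pulse machine.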
I’m wondering if a slave is getting stuck on a long housecleaning operation when it stalls like this.
Cheers,
Ryan
hmmm,
I’ve already set the IP address of Pulse in the Repository Options.
And I’ve added deadlinepulse.exe to the firewall’s allow list. Actually, the Pulse server is the same machine as the Repository.
There are no other problems on the network.
strange…
How can I check the connection to Pulse from a slave PC?
I can see it in the Monitor’s “Connected To Pulse” column. Of course, it says “Yes” for all active machines, and when a machine shuts down, it says “No” for that one.
Interesting… maybe it was just a coincidence that this slave happened to be doing some housecleaning checks. I double-checked that log, and the housecleaning process did finish before the slave shut down, so it probably wasn’t related anyway.
The last thing I’ll ask you to do is enable Slave verbose logging in the Repository Options:
thinkboxsoftware.com/deadlin … on_Logging
Then the next time it happens, send us the slave log again. If that doesn’t help, we’ll simply have to try to reproduce it on our end.
Thanks!
Ryan
OK, I’ll enable verbose logging and send the log when it happens again.
thanks!