You can see from the attached file that Deadline thinks the frame is still rendering, but the slave listed as active isn't actually doing much of anything. This happens constantly on our farm and prevents jobs from finishing unless the hanging tasks are manually re-queued.
Hey there,
Could you post a sample Slave Log from when this happened, pointing out the particular task for which this occurred? A Pulse log might also be useful, since I see you guys are running Pulse. There should be something in there that will help me figure out what’s going on.
Until we pin down the root cause, you might want to look into setting up the Auto Job Timeout feature. This will automatically fail and re-queue tasks that are taking longer than they should, so at least you won't have to do it manually all the time. You can find this feature in the Repository Options (see “Auto Timeout Settings” at http://www.thinkboxsoftware.com/deadline-5-repositoryoptions#Jobs for more details).
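If it helps to picture what that setting does, here's a rough sketch of the general idea in Python. This is only an illustration of the timeout mechanism, not Deadline's actual implementation; `get_rendering_tasks` and `requeue_task` are hypothetical hooks standing in for whatever the queue really exposes.

```python
import time

TIMEOUT_SECONDS = 4 * 60 * 60  # re-queue any task that has been rendering over 4 hours

def check_for_hung_tasks(get_rendering_tasks, requeue_task, timeout=TIMEOUT_SECONDS):
    """Re-queue tasks whose elapsed render time exceeds the timeout.

    get_rendering_tasks() -> iterable of (task_id, start_time) pairs
    requeue_task(task_id) -> fail the task and put it back in the queue
    Both callables are hypothetical hooks, not part of Deadline's API.
    """
    now = time.time()
    for task_id, start_time in get_rendering_tasks():
        if now - start_time > timeout:
            # The slave may be hung or silently dead; hand the task back
            # to the queue instead of waiting on it forever.
            requeue_task(task_id)

if __name__ == "__main__":
    # Tiny demo with stubbed hooks: task "b" started 5 hours ago, so it gets re-queued.
    tasks = {"a": time.time() - 60, "b": time.time() - 5 * 60 * 60}
    check_for_hung_tasks(tasks.items, lambda tid: print("re-queueing", tid))
```

The actual feature runs this kind of check for you inside the queue, so there's nothing to script; it's all configured in the Repository Options.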
Cheers,
- Jon
We will have to wait for the behavior to surface again to get the log files. I can't correlate it to an event any more; the artist re-queued the task and I lost it…
I will try the timeout setting and see what that does.
Thanks
Alright, sounds good! Let us know if/when it pops up again and we’ll figure this out.
Cheers,
- Jon
This is now happening on my farm… For me, Pulse is the problem: if I set the Host IP to blank, the machines release the tasks correctly. If I set the IP to the Pulse server correctly, I get problems where frames are completed by the slaves, but the Deadline Monitor still shows them as rendering indefinitely.
It is worth noting that I am running Pulse on a non-server version of Windows with around 20 slaves.
Cheers, Adrian
Hi Adrian,
This “task hanging” issue has been resolved in Deadline 5.1, which should be released tomorrow.
Note though that running Pulse on a non-server version of Windows is not recommended or supported, due to the 10-connection limit on those editions. With around 20 slaves connecting to Pulse, you can easily blow past that limit:
thinkboxsoftware.com/deadlin … ine_Client
We’re actually changing the wording of the Note in 5.1’s documentation because the current wording implies that you will only run into problems with 100 nodes or more. That’s not necessarily the case.
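If you want to confirm whether you're bumping into that connection ceiling, counting established connections on the Pulse machine is a quick check. The sketch below just wraps netstat; the port number is a placeholder, so substitute whatever port your Pulse is actually configured to listen on.

```python
import subprocess

PULSE_PORT = "17000"  # placeholder: use the port your Pulse is configured to listen on

def count_pulse_connections():
    """Count established TCP connections to the Pulse port via netstat."""
    output = subprocess.check_output(["netstat", "-an"]).decode(errors="replace")
    return sum(
        1
        for line in output.splitlines()
        if "ESTABLISHED" in line and ":" + PULSE_PORT in line
    )

if __name__ == "__main__":
    print("Established Pulse connections:", count_pulse_connections())
```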
Cheers,
- Ryan