Tasks Hanging

We occasionally run into tasks that appear to complete but then sit hung on the farm until we requeue them.
Any idea where we can look for a solution to this?

This is an issue we’ve focused on in 5.1. We discovered there were two causes for this:

  1. If slaves were getting their tasks from Pulse, it was possible that a slave wouldn’t “see” the task file assigned to it right away. The slave would then think the task had been requeued and would move on. We’ve improved this in 5.1 by introducing a Task Confirmation system to ensure the slaves don’t start rendering their task until they can confirm they can “see” it.

  2. During rendering, a network issue could cause the slave to no longer see its task file (even though it can still connect to the repository and still see the job folder). Again, this would make the slave think its task was requeued and it would move on. We’ve improved this by having the slave confirm that the number of task files it can see matches the expected task count for the job. This should prevent network hiccups from causing the slave to abandon tasks like this.
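To make the two checks above more concrete, here is a rough Python sketch of the idea. This is not the actual 5.1 code: the function names, the ".task" extension, the polling interval and timeout, and the assumption that a requeued task still has a file in the job folder under a different name are all hypothetical and only there for illustration.

```python
import os
import time


def confirm_task_visible(task_path, timeout=5.0, interval=0.5,
                         exists=os.path.exists, sleep=time.sleep):
    """Poll until the task file is visible or the (hypothetical) timeout elapses."""
    waited = 0.0
    while True:
        if exists(task_path):
            return True   # safe to start rendering
        if waited >= timeout:
            return False  # never saw the file, so do not start the task
        sleep(interval)
        waited += interval


def task_was_requeued(job_dir, task_name, expected_count, listdir=os.listdir):
    """Treat a missing task file as a requeue only if the folder listing looks complete.

    Assumption: task files end in ".task" and a requeue renames the file
    rather than deleting it, so a complete listing still has expected_count files.
    """
    try:
        visible = [name for name in listdir(job_dir) if name.endswith(".task")]
    except OSError:
        return False  # folder unreadable: likely a network hiccup, keep the task
    if task_name in visible:
        return False  # our task file is still there
    # Only trust the "missing" result when the file count matches the job's task count.
    return len(visible) == expected_count
```

With something like this, a slave would call confirm_task_visible before it starts rendering, and would only abandon a task mid-render when task_was_requeued returns True. A short listing (fewer files than the job's task count) is treated as a network problem rather than a requeue.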

Have you guys moved to the 5.1 beta yet? If not, then when you have the chance to upgrade, let us know if you still see this problem.

Cheers,

  • Ryan

We were waiting to install the beta until the build with the code to work around our submission problem gets pushed out.

Thanks for the update.

Ah, that’s right. Beta 4 should be out next week.

Cheers,

  • Ryan