Slaves idle, not picking up Jobs

Everything looks correct, and none of the slave logs show any access or permission errors.
The Repository sits on a NAS share and Jobs are submitted OK, but the slaves stay idle and do not pick up jobs (status stays at “checking repository integrity”).
The slaves show up as able to render the Jobs, so it is not a submission/pool/group issue. The slaves are running Windows 7 64-bit.

When I move the Repository to any of the slaves, with the same Jobs, settings and everything else, the slaves pick up the jobs and render normally.
So it looks like a permissions issue, but as I said, the strange thing is that all logs are clear of errors. I also checked the NAS logs for access errors and found none.
I would be very grateful for any help in resolving this.

thanks in advance

Another client just mentioned that the JobRepositoryScan.lock file (in the jobs folder) caused him grief because Slaves (or in his case Pulse) couldn’t get a write lock on that file.

If you delete it, do any of the Slaves pick anything up?
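
One way to check write access to the jobs folder from a slave machine, independently of Deadline, is a small probe script. The following is only a minimal sketch and not part of Deadline: `probe_share` and the probe file names are made up for illustration, and the folder you pass in is whatever your jobs folder is. It creates a file, writes to it, renames it and deletes it, and reports which step fails. It does not test file locking itself, only the basic operations a lock would depend on.

```python
import os
import sys
import tempfile
import uuid


def probe_share(directory):
    """Create, write, rename and delete a probe file in `directory`.

    Returns a dict with one boolean per step, plus an "error" entry
    holding the OS error message if any step failed.
    """
    results = {"create": False, "write": False, "rename": False, "delete": False}
    name = "probe_" + uuid.uuid4().hex
    src = os.path.join(directory, name + ".first.tmp")
    dst = os.path.join(directory, name + ".second.tmp")
    leftover = None  # path that still needs cleaning up if a step fails
    try:
        fd = os.open(src, os.O_CREAT | os.O_EXCL | os.O_RDWR)
        leftover = src
        results["create"] = True
        try:
            os.write(fd, b"probe")
            os.fsync(fd)  # push the data out to the share
            results["write"] = True
        finally:
            os.close(fd)
        os.rename(src, dst)
        leftover = dst
        results["rename"] = True
        os.remove(dst)
        leftover = None
        results["delete"] = True
    except OSError as exc:
        results["error"] = str(exc)
    finally:
        if leftover and os.path.exists(leftover):
            try:
                os.remove(leftover)
            except OSError:
                pass
    return results


if __name__ == "__main__":
    # Pass the jobs folder as the first argument; without one, the local
    # temp directory is probed so the script can be tried out safely.
    target = sys.argv[1] if len(sys.argv) > 1 else tempfile.gettempdir()
    print(target, probe_share(target))
```

Running it on a slave against the NAS share, and again against a Repository copied to a local disk, should show whether one of the four steps behaves differently on the NAS.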

Thanks for the quick response. Yes, I forgot to mention that I tried that too, but the Slaves still do not pick anything up.

If you open the Monitor on one of the slave machines, are you able to see the jobs in the queue?

Also, we should try enabling verbose logging for the slaves, which can be done from the Monitor while in Super User Mode. Select Tools -> Configure Repository Options, and then enable Slave Verbose Logging in the Application Logging section. After committing the change, restart your slave applications so that they recognize the change immediately. Then watch the slave to see if it’s printing out any warnings or errors.

Finally, are you running Pulse?

Hopefully this helps us narrow down the problem a bit.

Cheers,

- Ryan

Yes, the jobs are visible in the Monitor on the slaves.
Verbose logging shows no errors, only Scheduler reports (“Job chooser found no jobs”).
Pulse is not running, but everything was working fine without it when I used Deadline on the same NAS/slaves configuration some time ago.
Also, when I move the Repository to a slave now, everything works OK without Pulse running.

I have now noticed that all job tasks are marked as “Deleted” in the Monitor upon submission.
However, in the job/tasks directory in the repository, they still appear with the .Queued.task extension!
The tasks cannot be marked as Complete/Failed because the app can’t find the .Deleted.task file.
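
To compare what is on disk with what the Monitor reports, the task files can be tallied by the state embedded in their names. This is a minimal sketch, assuming the file names end in `.<State>.task` as in the two extensions mentioned above; `count_task_states` is a hypothetical helper written for this check, not a Deadline tool, and the directory is passed in by hand.

```python
import os
import re
import sys
from collections import Counter

# Matches names ending in ".<State>.task", e.g. ".Queued.task".
TASK_NAME = re.compile(r"\.(?P<state>[^.]+)\.task$", re.IGNORECASE)


def count_task_states(tasks_dir):
    """Count the task files in `tasks_dir` by the state in their file name."""
    counts = Counter()
    for entry in os.listdir(tasks_dir):
        match = TASK_NAME.search(entry)
        if match:
            counts[match.group("state")] += 1
    return counts


if __name__ == "__main__":
    # Usage: python count_tasks.py <path to one job's tasks directory>
    if len(sys.argv) > 1:
        for state, number in sorted(count_task_states(sys.argv[1]).items()):
            print(state, number)
    else:
        print("usage: count_tasks.py <tasks directory>")
```

If every file is counted under Queued while the Monitor shows the same tasks as Deleted, the files on disk and the Monitor's view of them disagree, which narrows the problem down to how the task files are being read rather than whether they were written.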

All permissions are set to 777 in the smb.conf file on the NAS.

Very strange!

Is it possible that you have a mismatch of Deadline versions installed? If you have Deadline 5.x submitting the jobs and 4.x on your slaves (or vice versa), the tasks will actually show up as deleted because we changed the way we stored jobs in 5.x.