Has anyone else been having issues with Deadline slaves stalling within an hour of starting since Friday last week? Sophos had a cock-up with their virus defs and I think it might have screwed up Deadline somehow. It's the same on every OS: the slave stalls and stops responding, and the only way out is to go to the machine and kill the process locally.
Re-installing the repository now, but I don't hold out much hope! Will have to start re-installing the slaves next.
If the reinstall doesn’t work, can you send us a log from a few different slaves after it hangs like this? You can find the log folder from the Slave application by selecting Help -> Explore Log Folder, or you can do the same from the Launcher’s right-click menu. I’m curious to see if the slaves get stuck in the same place, or if there is a common error message printed at some point before they hang.
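If you want to compare where several slaves stop, a small script can pull the last few lines of the newest log on each machine. This is only a rough sketch: `tail_of_newest_log` is a made-up helper, and the log folder path is whatever Explore Log Folder opens on your machine, so the path in the usage comment is a placeholder.

```python
from collections import deque
from pathlib import Path


def tail_of_newest_log(log_dir, line_count=30):
    """Return the last `line_count` lines of the most recently modified file in log_dir.

    Hypothetical helper for illustration; it assumes the folder holds plain-text logs.
    """
    files = [p for p in Path(log_dir).iterdir() if p.is_file()]
    if not files:
        return []
    newest = max(files, key=lambda p: p.stat().st_mtime)
    # errors="replace" keeps the script going if a log contains odd bytes
    with newest.open("r", errors="replace") as handle:
        return [line.rstrip("\n") for line in deque(handle, maxlen=line_count)]


# Usage (placeholder path, substitute the folder that Explore Log Folder opens):
# for line in tail_of_newest_log(r"C:\path\to\slave\logs"):
#     print(line)
```

Running this on a few hung slaves and putting the output side by side should make it easier to see whether they all stop at the same point.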
I don't think the re-install has worked on the XP machines. Stupid Sophos!!
The Windows crash error for the slave said something about Kernel32.dll being the cause.
The Windows 7 machines have been up for over an hour now, so maybe Sophos has done something to Windows XP. I’m surprised none of your other Deadline users use Sophos; it's a widely used one.
… And as I press submit, the win 7 machine crashes too!!
I’m taking this off the forums. I'll send you the two logs from that machine right now, as well as the Windows 7 crash report; it's KernelBase.dll that has failed.
Thanks for the log! Go to \your\repository\temp and try manually removing everything in that folder. We have seen cases where the temporary time check files we create in this folder aren’t cleaned up, and eventually the folder fills up with files to the point where it crashes the slaves when they try to purge them.
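If the folder has too many files to clear comfortably by hand, something along these lines could do the same job. It is a minimal sketch, not an official Deadline tool: `clear_folder` is a hypothetical helper, the path in the usage comment is the same placeholder as above, and it defaults to a dry run so you can check the count before anything is deleted.

```python
import shutil
from pathlib import Path


def clear_folder(folder, dry_run=True):
    """Remove every entry directly inside `folder` and return how many were found.

    Hypothetical helper for illustration. With dry_run=True nothing is deleted;
    the return value is just the number of entries that would be removed.
    """
    entries = list(Path(folder).iterdir())  # snapshot before deleting anything
    for entry in entries:
        if dry_run:
            continue
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
    return len(entries)


# Usage (placeholder path, point it at the temp folder in your own repository):
# print(clear_folder(r"\your\repository\temp"))           # dry run, only counts
# print(clear_folder(r"\your\repository\temp", False))    # actually deletes
```

The folder itself is left in place; only its contents are removed.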
We have taken steps in the 5.2 beta release we uploaded last week to try to make the cleanup process more robust. If you have access to the Deadline beta boards, you can upgrade and see if the problem occurs again over time.