I'm a bit fuzzy on how this works now. In Deadline 5, when a job accumulated 100 errors (or whatever your limit was), it would get failed. To resume it, I would normally clear its errors, then resume.
However, now if I clear the error reports (which takes a LOOONG time btw…) and then resume the job, it gets failed again right away.
So how am I supposed to do this without upping the error limit? Once the job error reports are cleared, the Error counter in the job list shows 0, yet the Task error timers are showing values in the 30-40s (with no corresponding reports…)
I can’t reproduce this behavior. However, I did discover a bug in RC2 that prevented job failure detection from working at all.
I’m guessing you guys are still on RC1. The bug I ran into will be fixed in RC3, so you might want to wait until that’s available (which should be next week). If you still see this behavior in RC3, let us know!
As for clearing the error reports being slow, I think that’s something we’ll have to look into post-6.0. I’m sure the slowness is because it’s deleting the physical log files off the repository. We’d have to change this so that a deleted flag gets set on the reports, and the background housecleaning process can then remove them from disk later. We’ll definitely make this a priority for 6.1 though.
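To illustrate the deleted-flag idea, here is a rough Python sketch. It is not Deadline code: `Report`, `ReportStore`, `clear_reports` and `housecleaning` are hypothetical names made up for this example, and the real repository layout is not modeled. It only shows why flagging is quick while the actual file removal can be pushed to a background pass.

```python
import os
from dataclasses import dataclass, field
from typing import List


@dataclass
class Report:
    """Hypothetical error report record pointing at a log file on disk."""
    path: str
    deleted: bool = False


@dataclass
class ReportStore:
    """Hypothetical in-memory index of a job's error reports."""
    reports: List[Report] = field(default_factory=list)

    def add(self, path: str) -> None:
        self.reports.append(Report(path))

    def clear_reports(self) -> None:
        # Fast path: only set the flag, no file I/O happens here.
        for report in self.reports:
            report.deleted = True

    def visible_count(self) -> int:
        # What an error counter would show: flagged reports are hidden.
        return sum(1 for report in self.reports if not report.deleted)

    def housecleaning(self) -> int:
        # Slow path, meant to run in the background: remove the flagged
        # log files from disk and drop their records from the index.
        removed = 0
        remaining = []
        for report in self.reports:
            if report.deleted:
                try:
                    os.remove(report.path)
                except FileNotFoundError:
                    pass  # file already gone, nothing left to delete
                removed += 1
            else:
                remaining.append(report)
        self.reports = remaining
        return removed
```

With something like this, clearing reports returns as soon as the flags are set, and the counter drops to 0 immediately even though the log files stay on disk until the next housecleaning pass.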