Yeah, it’s the reports that cause this slowdown. In beta 12, deleting a job will no longer include deleting the reports or auxiliary files. Those will eventually be cleaned up by the housecleaning code.
If you are running Pulse, housecleaning is performed pretty regularly. If Pulse isn’t running, the slaves will eventually get to them between tasks.
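To illustrate the idea of deferring the expensive part of a delete, here is a minimal sketch. It is not Deadline's actual code: `Job`, `Repository`, `delete_job` and `housecleaning` are hypothetical names, and the repository is just an in-memory dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    job_id: str
    reports: list = field(default_factory=list)  # stand-in for reports/auxiliary files
    marked_for_removal: bool = False


class Repository:
    """Hypothetical in-memory stand-in for the job repository."""

    def __init__(self):
        self.jobs = {}

    def add(self, job):
        self.jobs[job.job_id] = job

    def delete_job(self, job_id):
        # Fast path: only flag the job; its reports are left in place for now.
        self.jobs[job_id].marked_for_removal = True

    def visible_jobs(self):
        # Flagged jobs disappear from the user's view immediately.
        return [j.job_id for j in self.jobs.values() if not j.marked_for_removal]

    def housecleaning(self):
        # Slow path, run later (by Pulse or an idle slave in the text above):
        # purge every flagged job together with its reports.
        doomed = [jid for jid, j in self.jobs.items() if j.marked_for_removal]
        for jid in doomed:
            self.jobs[jid].reports.clear()
            del self.jobs[jid]
        return len(doomed)
```

The point of the split is that the user-facing delete only flips a flag, while the cost of removing reports is paid later by whichever process runs housecleaning.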
Pulse is no longer needed for the performance boost. In fact, it’s not used as a proxy at all anymore. Here’s all it does now:
Perform regular cleanup operations (release job dependencies, delete jobs marked for removal, etc). Note that the slaves will do this themselves if Pulse isn’t running, but it’s more random.
Power management. Redundancy for temperature checking is built into the slaves, but the other areas of Power Management require Pulse to be running.
Statistics gathering. Job statistics are handled by the slaves, but stats for the slaves and the repository in general require Pulse.
Slave Throttling (so that only a certain number of slaves can load a job at once, which can help network bandwidth).
So it’s still nice to have running, but Deadline’s performance is no longer dependent on it.
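As a rough illustration of the throttling idea above (only a certain number of slaves loading a job at once), here is a sketch built on a counting semaphore. `LoadThrottle` and its methods are hypothetical names, not part of Deadline or Pulse, and a real setup would coordinate over the network rather than inside one process.

```python
import threading


class LoadThrottle:
    """Hypothetical throttle: at most `limit` slaves may load a job at once."""

    def __init__(self, limit):
        self._slots = threading.BoundedSemaphore(limit)

    def try_acquire(self):
        # Non-blocking: a slave that gets False would wait and retry later.
        return self._slots.acquire(blocking=False)

    def release(self):
        # Called once the slave has finished loading the job.
        self._slots.release()
```

With a limit of 2, a third slave asking for a slot is refused until one of the first two calls `release()`.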
Using beta 17, deleting jobs still seems to be fairly slow.
I am doing a regular cleanup and trying to delete ~4600 jobs. It's been going for about 3 hours 40 minutes so far. There is nothing in the logs, and I don't see any progress bar, so I'm not sure how long it will take, but I would expect it to be much faster, in the range of a couple of seconds.
We’ll have another look at this to see if we can replicate it, though at this point we suspect it might be related to your other issue of Mongo using up a ton of CPU. At the very least, we could probably do the deletion in the background, so that the Monitor isn’t locked up while you wait for the jobs to delete.
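For what "in the background" could look like, here is a minimal sketch using a worker thread. This is not the Monitor's implementation: `delete_jobs_in_background`, `delete_one` and `on_progress` are hypothetical names, and the callbacks are supplied by the caller.

```python
import threading


def delete_jobs_in_background(job_ids, delete_one, on_progress=None):
    """Run delete_one(job_id) for each job on a worker thread.

    delete_one and on_progress are hypothetical caller-supplied callbacks.
    """
    job_ids = list(job_ids)  # snapshot, so later changes by the caller don't matter

    def worker():
        total = len(job_ids)
        for done, job_id in enumerate(job_ids, start=1):
            delete_one(job_id)
            if on_progress is not None:
                on_progress(done, total)  # e.g. to drive a progress bar

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread  # the caller can join() it or poll is_alive()
```

The calling code returns immediately and stays responsive, and the progress callback addresses the "no progress bar" complaint by reporting how many jobs are done out of the total.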
After looking into this a bit more, I was able to replicate this with a large number of jobs. I don’t think it was quite as pronounced as what you’re seeing, but it locked up my Monitor for ~20-30 mins for 5,000 jobs. I suspect that it’s related to Event Plugins at this point, so I’ll have a go at improving the way we call/check the event plugins.
FYI…
I think Ryan put a new event plugin in for “onJobDeleted” when I mentioned it the other day…which will be really useful for me, but not if it kills performance!
I’ve made improvements to how it loads the Event Plugins in this case, but that doesn’t seem to have accounted for all of the delay I’m seeing when deleting a large number of jobs. I’ll do some more tweaking.
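One general way plugin loading can add up during a bulk delete is if the plugins are loaded again for every job. The sketch below loads them once per batch instead. It is only an assumption about the kind of change being described, not the actual fix: `delete_jobs`, `load_plugins` and `delete_one` are hypothetical names, and each plugin is modelled as a plain callable.

```python
def delete_jobs(job_ids, load_plugins, delete_one):
    """Delete a batch of jobs, loading event plugins once for the whole batch.

    load_plugins returns a list of callables; each is a hypothetical
    "job deleted" handler that takes a job id.
    """
    plugins = load_plugins()  # one load per batch, not one per job
    for job_id in job_ids:
        delete_one(job_id)
        for plugin in plugins:
            plugin(job_id)
```

For 5,000 jobs this turns 5,000 plugin loads into one, although, as noted above, loading may not be the only source of the delay.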