AWS Thinkbox Discussion Forums

Not performing housekeeping, another house cleaning process already in progress

I cannot perform house cleaning; it tells me there’s another house cleaning process already in progress.

If I run it straight from the server, the error I get is the following:

A previous house cleaning process has expired, cleaning up…
Performing house cleaning
Performing Job Cleanup Scan…
Job Cleanup Scan - Loading completed jobs
Job Cleanup Scan - Loaded 4927 completed and 22 active/pending jobs in 1.998 s
Job Cleanup Scan - Scanning completed jobs
Process timed out after 300 seconds without an update.
Error running housecleaning process: Process timed out after 300 seconds without an update. (FranticX.Processes.ManagedProcessAbort)

I tried increasing the timeout to a bigger value, 3000, but it still times out. Any ideas?

Version 8.0.17.1

It may be that what it is cleaning up is quite large. The timeout is to protect Pulse or the Slave from locking up which could happen if a bad house cleaning event script was written.

You can force a house cleaning from Deadline Command on any machine by running the following in a command prompt or terminal window:

cd /path/to/deadlines/bin
./deadlinecommand DoHouseCleaning False True 

That should do a verbose house cleaning even if the safe amount of time since the last one hasn’t passed. I believe that we’ll still stop things from happening if another house cleaning event is firing, so you can increase the timespan between cleanings in the Repository configuration so it’s less likely to conflict.

I’d just let the ‘deadlinecommand’ call run for awhile and see if it ever completes. The timeout of 3,000 only works out to 50 minutes, so if it’s still not done in a few hours, we’ll have to think up something more creative to continue troubleshooting.
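While it runs, it can help to know whether output is still arriving or whether the process has gone silent, since the error above is about 300 seconds "without an update" rather than total run time. Here is a minimal sketch in Python, assuming `deadlinecommand` is reachable at the relative path used above; `longest_gap` and `run_and_time` are made-up helper names, not part of Deadline.

```python
import subprocess
import time

# Command taken from the post above; adjust the executable path for your install.
COMMAND = ["./deadlinecommand", "DoHouseCleaning", "False", "True"]


def longest_gap(timestamps):
    """Return the largest interval (in seconds) between consecutive timestamps."""
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    return max(gaps, default=0.0)


def run_and_time(command=COMMAND):
    """Run the command, echo each output line with the elapsed time,
    and return (exit_code, longest_silent_gap_in_seconds)."""
    stamps = [time.monotonic()]
    proc = subprocess.Popen(
        command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    for line in proc.stdout:
        stamps.append(time.monotonic())
        print(f"{stamps[-1] - stamps[0]:8.1f}s  {line.rstrip()}")
    exit_code = proc.wait()
    stamps.append(time.monotonic())  # count the silence before exit as well
    return exit_code, longest_gap(stamps)
```

Calling `run_and_time()` from the Deadline bin folder and comparing the returned gap with the configured timeout should show whether a larger value has any chance of helping.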

Thanks, will let it run for a while and see how it goes. Thanks!

Edit: That was fast haha, it didn’t go anywhere.

Events plugin <plugin_name> could not be loaded from the repository because: Error executing event plugin script “”: WindowsError : [Error 3] The system cannot find the path specified: ‘x:\’ (Deadline.Events.DeadlineEventPluginException)
Deadline Command 8.0 [v8.0.14.1 Release (534e1f067)]
Purging old House Cleaning logs.
Another house cleaning process is already in progress

And it stops right there and won’t go beyond that. I don’t think it has to do with that Events plugin error, does it?

Well, that’s fun. I wonder if it has anything to do with the double slash mentioned in that path. Normally multiple slashes just compress down into a single one, but maybe whatever we’re calling in the .net framework isn’t having any of that.

What’s your network root set to? Mainly, what does your “Change Repository” dialog show? If it’s “X:\”, can you try “X:”?


If that’s not it, we should probably take a look at that plugin you might have and see if it has that double slash.
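For checking the plugin, this is roughly what "multiple slashes compress down into a single one" looks like in Python. It is only an illustration using the standard `ntpath` module (which applies Windows path rules on any OS), not what Deadline or the .NET framework does internally; `normalize_windows_path` is a made-up name.

```python
import ntpath


def normalize_windows_path(path):
    """Collapse repeated separators in a Windows-style path.

    ntpath applies Windows rules regardless of the OS running the script.
    """
    return ntpath.normpath(path)
```

If the plugin builds its paths by string concatenation, running them through something like this before use is one way to rule the doubled separator out.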

The Change Repository line reads fine, I think:

\\<machine_name>\repository8

That one’s a UNC path, and I am guessing it is correct.

Let me check what that plugin says, and check with the guys here who wrote it whether I can change it to just X:\

Edit: Just realized that it is telling me about that X: error because there’s no X: mounted on the machine I am running the house cleaning job from. So it shouldn’t be an issue. I can still mount it so we can eliminate that possibility if you want, but we might need to check whether it’s some other thing. What do you think?
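As an aside, a plugin that depends on a mapped drive could check for it up front and report which path is missing, instead of failing on the first access. A minimal sketch in Python, assuming the plugin knows which paths it needs; `missing_roots` is a hypothetical helper, not something from the plugin in question or from Deadline.

```python
import os


def missing_roots(required_paths):
    """Return the subset of required_paths that is not an existing directory."""
    return [path for path in required_paths if not os.path.isdir(path)]
```

On a machine without the drive mapped, `missing_roots(["X:\\"])` would hand back the unmapped path, which the plugin could log before skipping its work.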

Edit2: After mounting X just to get rid of that error, I get the following when running housekeeping from other machines:

2017-04-19 09:46:44: Startup Directory: “C:\Program Files\Thinkbox\Deadline8\bin”
2017-04-19 09:46:44: Process Priority: BelowNormal
2017-04-19 09:46:44: Process Affinity: default
2017-04-19 09:46:44: Process is now running
2017-04-19 09:46:50: Deadline Command 8.0 [v8.0.14.1 Release (534e1f067)]
2017-04-19 09:46:50: Purging old House Cleaning logs.
2017-04-19 09:46:50: Another house cleaning process is already in progress
2017-04-19 09:46:51: Process exit code: 0

So it’s not that.
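For anyone collecting these logs from several machines, here is a small Python sketch that pulls out how long the run took and whether it was turned away. It assumes every line has the "date time: message" shape shown above; `parse_log_line` and `summarize` are made-up names.

```python
from datetime import datetime

# Message copied from the log above.
BUSY_MESSAGE = "Another house cleaning process is already in progress"


def parse_log_line(line):
    """Split 'YYYY-MM-DD HH:MM:SS: message' into (datetime, message)."""
    stamp, _, message = line.partition(": ")
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"), message


def summarize(lines):
    """Return (elapsed_seconds, blocked) for a non-empty list of log lines."""
    entries = [parse_log_line(line) for line in lines if line.strip()]
    elapsed = (entries[-1][0] - entries[0][0]).total_seconds()
    blocked = any(message.strip() == BUSY_MESSAGE for _, message in entries)
    return elapsed, blocked
```

A run that ends within a few seconds with `blocked` set to true, like the one above, never started cleaning at all, which is a different situation from the 300-second timeout.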

A little bit more info about this one: it doesn’t seem to be about a house cleaning process already in progress, because I’ve been able to run it now. It just never finishes. I increased the timeout to 1 hour and it didn’t finish. I am going to increase it a little more and test, but it seems it gets stuck there and then times out.

It seems there was a machine doing house cleaning that was stuck or something like that; either that, or setting a higher timeout value did the trick. I have been able to run and complete it now. Sorry for the trouble.

S’all good. The house cleaning shouldn’t get stuck permanently. I think we have the same sort of concept as we do for stalled Slaves in that if the house cleaning status hasn’t been updated for awhile we’ll force it from another machine.
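The general idea of taking over from a holder that has stopped reporting can be sketched like this in Python. To be clear, this is only an illustration of the concept, not Deadline's actual implementation; `CleaningStatus`, `may_take_over`, and the threshold are all invented for the example.

```python
from dataclasses import dataclass


@dataclass
class CleaningStatus:
    owner: str          # machine that claimed the house cleaning (hypothetical field)
    last_update: float  # time of its last status update, in seconds


def may_take_over(status, now, stale_after):
    """True if nobody holds the claim, or the holder has gone
    longer than stale_after seconds without a status update."""
    if status is None:
        return True
    return now - status.last_update > stale_after
```

With this kind of check, a crashed machine only blocks the others until its last update is older than the threshold.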

We should have a verbose mode for that as well, so if you ever want to find out what it’s busy doing you can run Pulse, set the house cleaning to happen within the Pulse thread, and turn on Pulse verbose logging. If I recall correctly, it’s pretty wordy.
