Currently, Deadline slaves seem to cache all plugins in the repository at launch, and reuse them until the slave process is restarted. This is a fairly major concern for us, as we need the ability to make changes to plugins without having to restart the entire farm every time.
Rather than implementing something hacky like having every task blow away the slave user’s plugin cache in $HOME after completing, I’m wondering if you have any thoughts on a good way to allow slaves to automatically re-synchronize plugins. This could obviously be a repository-wide option. As for how it could be done, some initial thoughts are:
Re-synchronize all plugins every time a task is loaded.
Re-synchronize the plugin required by the task that is about to be executed.
Store some kind of hash or update timestamp in the DB and have the slaves hit that to determine if they need to re-synchronize plugins.
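To make the third idea a bit more concrete, here is a minimal sketch of a hash comparison. It is not Deadline's implementation: `plugin_fingerprint` and `needs_resync` are hypothetical names, and the sketch assumes the repository side publishes a fingerprint string (for example in the DB) that the slave can read and compare against its local cached copy.

```python
import hashlib
from pathlib import Path


def plugin_fingerprint(plugin_dir):
    """Return a SHA-256 hex digest covering every file under plugin_dir.

    Hypothetical helper: relative paths are hashed along with file contents,
    so renames and added or removed files change the digest too.
    """
    root = Path(plugin_dir)
    digest = hashlib.sha256()
    # Sort so the digest does not depend on directory listing order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def needs_resync(cached_plugin_dir, published_fingerprint):
    """True if the slave's cached copy differs from the published fingerprint.

    Assumption: published_fingerprint was computed with plugin_fingerprint()
    on the repository copy and stored somewhere the slave can query.
    """
    return plugin_fingerprint(cached_plugin_dir) != published_fingerprint
```

With something like this, a slave would only need to fetch one short string per plugin before each task, and would copy files only when the comparison fails.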
Also, does the current caching behavior apply to event plugins as well?
The slaves don’t cache all the plugins at launch. Whenever a slave picks up a new job, it clears out the previously cached plugin and copies over the new one. So newly submitted jobs are always guaranteed to use the updated plugin. We’ve confirmed this behavior still works as expected with Deadline 6.
Event plugins work a bit differently. They will always pick up the new changes when they are fired.
Ah, that makes more sense. I guess my next question is, is there a way to force a slave to re-synchronize a given plugin in the middle of a job?
I’m thinking of a situation where a slave has been plowing through frames and erroring on each one (due to a plugin issue). I then apply a fix to the plugin it’s using, push the code change into the Deadline repository, and re-queue the errored tasks, but until that particular slave switches to another job or I restart it, it won’t see my fixes.
Yeah, that’s a valid point. Maybe the slave could just cache the last write times of the plugin’s .py and .dlinit files, and resync them if the time changes. This is what we do for event plugins.
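The write-time check could look roughly like the sketch below. This is only an illustration under stated assumptions: `snapshot_write_times` and `plugin_changed` are hypothetical helper names rather than anything in the Deadline API, and it assumes the plugin's .py and .dlinit files sit directly in a single plugin folder.

```python
from pathlib import Path

# File types named in the post; anything else in the folder is ignored.
WATCHED_SUFFIXES = {".py", ".dlinit"}


def snapshot_write_times(plugin_dir):
    """Map each watched file name to its last write time in nanoseconds."""
    snapshot = {}
    for path in Path(plugin_dir).iterdir():
        if path.is_file() and path.suffix.lower() in WATCHED_SUFFIXES:
            snapshot[path.name] = path.stat().st_mtime_ns
    return snapshot


def plugin_changed(plugin_dir, cached_snapshot):
    """True if any watched file was added, removed, or rewritten.

    cached_snapshot is the dict the slave saved when it last synced the plugin.
    """
    return snapshot_write_times(plugin_dir) != cached_snapshot
```

A slave would take a snapshot when it syncs the plugin, then call `plugin_changed` against the repository folder before starting each task and resync on `True`. Comparing whole snapshots rather than a single newest timestamp also catches files that were deleted or restored from an older copy.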
Yeah, I think something like that is pretty necessary. Things could get really weird if one slave picks up a job, starts running tasks, the plugin changes, then another slave picks up tasks on the same job, now with a different version of the same plugin.