Some repo settings seem to keep resetting when we have a mixed beta9 and beta10 farm:
- scheduling weights
- house cleaning in a separate thread
If I open the repository settings from a beta 10 monitor and set these values, they go back to their default settings in a beta9 monitor. If I then fix them there, they reset to their defaults in the beta10 monitor…
So there is no way to get both beta9 and beta10 clients to use the proper settings.
Nope, there isn’t. This is because beta 9 still uses integers and beta 10 uses floats. We had to change the property name in the repository options to avoid casting issues when pulling the settings from the database.
The problem is that the two sets of property names never exist at the same time. If you commit settings from beta 10, the integer settings are gone, so the beta 9 slaves fall back to their defaults. And vice versa when committing the settings from beta 9.
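A minimal sketch of the behavior described above. The property names and defaults here are invented for illustration (the real repository option names aren't given in the thread); the point is that each beta reads and writes only its own key, and committing rewrites the options, dropping the other version's key:

```python
# Hypothetical property names -- each client version only knows its own key.
DEFAULT_INT_WEIGHT = 1      # assumed beta 9 default (integer property)
DEFAULT_FLOAT_WEIGHT = 1.0  # assumed beta 10 default (float property)

def beta9_read(options: dict) -> int:
    # beta 9 reads the old integer property; missing key -> built-in default
    return options.get("SchedulingWeight", DEFAULT_INT_WEIGHT)

def beta10_read(options: dict) -> float:
    # beta 10 renamed the property to avoid int/float casting issues
    return options.get("SchedulingWeightFloat", DEFAULT_FLOAT_WEIGHT)

def beta9_commit(weight: int) -> dict:
    # committing from beta 9 writes only the integer key
    return {"SchedulingWeight": weight}

def beta10_commit(weight: float) -> dict:
    # committing from beta 10 writes only the float key,
    # so beta 9 slaves no longer find theirs
    return {"SchedulingWeightFloat": weight}

# Commit from a beta 10 monitor: beta 9 slaves revert to their default.
options = beta10_commit(3.5)
assert beta10_read(options) == 3.5
assert beta9_read(options) == DEFAULT_INT_WEIGHT

# Commit from a beta 9 monitor: now beta 10 slaves revert instead.
options = beta9_commit(3)
assert beta9_read(options) == 3
assert beta10_read(options) == DEFAULT_FLOAT_WEIGHT
```

Either way you commit, exactly one generation of clients ends up on defaults, which is why there's no setting that satisfies both at once.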
That makes updating a pretty big PITA.
Basically, it becomes an ‘all or nothing’ update, meaning that if anything goes wrong (like with beta10), we might be down for an extended period of time.
Would you suggest updating only pulse to beta11, setting all its options from a beta11 monitor (separate threads etc.), then going back to a beta9 monitor and resetting the weight settings? Or would that also revert pulse’s separate-thread setting?
We are currently holding off on further updates for 1-2 weeks due to deliveries and the problems beta10 caused.
1. Tell all the slaves to restart after they finish their current task.
2. From a beta 11 Monitor, set the repository options.
Because the beta 9 slaves will restart after their current task, you won’t have to worry about them pulling the wrong weight settings when they look for their next job.
We have tasks that take 5-10 hours to render, so when everything goes fine a full update takes about a day, sometimes two. I’ve also found that about a third of the slaves fail to self-update and have to be done manually (either the launcher is down, or there is a popup on the machine, or the slave hangs, etc.).
I usually start by updating pulse first, since we see within 5-10 minutes whether pulse is misbehaving. But to prep pulse, I needed to tweak the repo settings (the default is to NOT run house cleaning in a separate thread), which got picked up by the old slaves, and they would essentially go into FIFO mode right away.
If we start with the slaves instead and pulse ends up being crashy, we need another couple of hours to roll back :\