so where i’m currently working, we have our deadline queue set up as first-in, first-out with priority as a total override. to keep people from grabbing “too much” of the farm, we enforce “reasonable” limits on the number of machines a single job can take. of course, “too much” and “reasonable” are pretty subjective…
it works okay, i suppose, but the farm is often wide open unless a lot of people are rendering a lot of jobs (max machines is typically in the 20-50 range on a farm of 500-600 slaves). what would help is a soft-max setting that would grab additional machines if they’re available. as resources get short, tasks would be culled based on render time (newest renders first) until the job is back down to its max machines. so you’d set your “max machines” to 20 and your “soft max” to 200, for example.
think of it as an “i’ll take it if it’s available, but if you need it, kill it” kind of approach.
well, that would actually probably help with some of the pool management stuff (which is used here as a priority mechanism) but i don’t see how it would help with allocating additional resources when available and then killing them when somebody else needs them.
i guess i should have been more clear: when i say “culled” i mean actively killed by deadline. so i would like to tell deadline not to allocate more than 20 machines (max machines = 20), but if there are open farm machines, go ahead and take them as well, up to 100 (soft limit = 100). those extra 80 tasks might get killed at any time to make room for tasks that are unable to find available resources to allocate up to their max machines.
What Gavin said. The Job is Interruptible flag might do what you need. The one downside is that higher-priority Jobs will always interrupt lower-priority Jobs, even if the interrupted Job is on a Slave within its primary pool.
There is, of course, always the script option. With some effort, a script could be written to periodically scan for Slaves that could be working on a more appropriate Job and instruct them to drop their current Job.
the interruptible job thing is interesting, but the idea isn’t that only my one job would be set up to back off the farm when other stuff comes along. the idea is that all jobs would be set up this way, so the question is: what prevents a higher priority job from then grabbing way too many slaves?
where can i find information about a script approach? can a script control scheduling or can it only kill jobs?
I would say that, by definition, high priority jobs are meant to come first. The issue with interruptible Jobs is lost productivity due to partially-completed tasks on the interrupted Jobs. This can be minimized by setting the Interruptible % setting: “A task for this job will only be interrupted if the task progress is less than or equal to this value.”
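The rule behind Interruptible % is simple enough to state as a predicate. The function name here is illustrative only, not part of Deadline's actual API:

```python
def can_interrupt(task_progress_pct, interruptible_pct):
    """A Task is only interrupted if its progress is at or below the threshold."""
    return task_progress_pct <= interruptible_pct

# With Interruptible % = 50, a Task that is 30% done may be interrupted,
# but one that is 80% done is left to run to completion:
print(can_interrupt(30, 50))   # True
print(can_interrupt(80, 50))   # False
```

Raising the threshold trades more responsiveness to high-priority work against more wasted partial renders.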
While it’s possible to greatly influence scheduling through scripting by poking a Job’s priority, pool, machine limit, and white list settings, I wouldn’t recommend it. It’s best to let one of Deadline’s built-in scheduling options do most of the heavy lifting. You might look at the weighted options, for example.
For your scenario, I might use an OnJobSubmitted event to validate/influence Job settings up front. If the Interruptible Job / Interruptible % settings are not sophisticated enough, use a cron job (Scheduled Task in Windows) script to periodically review what Slaves are working on and boot Jobs off of them based on whatever criteria you have in mind. The normal Slave scheduling will then identify and pick up the best Job.
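The periodic review could take a shape like this. To be clear, this is only the culling logic with made-up data structures; the actual queries for Jobs, Tasks, and Slaves (and the requeue call itself) would come from whatever scripting interface your Deadline version provides:

```python
def cull_excess_tasks(jobs, waiting_jobs_exist):
    """Return (job_id, task_id) pairs to requeue, kicking any Job that is
    over its hard machine limit back down, newest renders first."""
    if not waiting_jobs_exist:
        return []  # farm is wide open; let everyone stay greedy
    to_requeue = []
    for job in jobs:
        excess = len(job["tasks"]) - job["hard_max"]
        if excess <= 0:
            continue
        # youngest (most recently started) tasks die first
        newest_first = sorted(job["tasks"],
                              key=lambda t: t["start_time"],
                              reverse=True)
        to_requeue.extend((job["id"], t["id"]) for t in newest_first[:excess])
    return to_requeue

# A Job holding 3 Slaves with a hard max of 1 gives up its two newest Tasks,
# but only when something else is actually waiting:
job = {"id": "jobA", "hard_max": 1,
       "tasks": [{"id": 0, "start_time": 10},
                 {"id": 1, "start_time": 20},
                 {"id": 2, "start_time": 15}]}
print(cull_excess_tasks([job], waiting_jobs_exist=True))   # [('jobA', 1), ('jobA', 2)]
print(cull_excess_tasks([job], waiting_jobs_exist=False))  # []
```

The freed Slaves then fall back to normal scheduling and pick up the starved Jobs on their own.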
regarding priority, keep in mind this topic is about a “greedy” kind of setting for when nobody else is on the farm. so if a higher priority job were to simply be greedy and kick everybody off a farm that was already booked, other users would be unhappy. if greedy settings could only be countered by even greedier settings, i would fear fist-fights.
i’m not really in charge of the farm where i work. i tend to prefer a balanced scheduler so this max machines thing is a non-issue, but it’s not really my call.
anyway, since the insistence is to use max machines to “be friendly” to other artists, my thought was to add a really simple piece of logic to the scheduler: when determining whether a job has reached its machine limit, check whether there are free slaves available. if there are, test against the “soft” limit instead of the hard limit. additionally (and perhaps optionally), when a new task can’t run even though its job is below its hard max, kill tasks on slaves running jobs that are over their hard max (youngest first). that second part, i could see being a bit more complex.
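that first check is really just one line. sketching it in python with made-up names (not anything deadline actually exposes):

```python
def effective_limit(hard_max, soft_max, free_slave_count):
    """use the soft limit while idle slaves exist; otherwise the hard limit."""
    return soft_max if free_slave_count > 0 else hard_max

# with max machines = 20 and soft max = 200: while the farm has idle slaves
# the job may grow toward 200, but once the farm is full the job is judged
# (and culled back) against 20.
print(effective_limit(20, 200, free_slave_count=35))  # 200
print(effective_limit(20, 200, free_slave_count=0))   # 20
```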
i’ll read up on the scripting stuff and see if there’s something that’d work.
Yes, but keep in mind that if the scheduling is set to [Pool, Priority, First-In First-Out], a Job’s Pool has greater influence than its Priority. So it would work out like this:
Assume a single, high-Priority-valued Job is the only Job active on the farm. Of course all Slaves whose Pools match the Primary Pool of the Job will pick up the Job. If the Job also has a Secondary Pool assigned, then other Slaves that are doing nothing will check whether the Job’s Secondary Pool matches one of their Pools. If so, they too will pick up the Job.
Now assume a mid-Priority-valued Job with a different Primary Pool than the first Job is submitted to the queue. Slaves that have this second Job’s Primary Pool, and that were working on the first Job only on the basis of its Secondary Pool setting, will pick up the second Job immediately if Interruptible Job is set, or as soon as they finish their current Task if not. This happens despite the new Job’s lower Priority, because a Job’s Primary Pool takes precedence.
That means that the “My Priority is greater than your Priority” game cannot be waged, except within the scope of Jobs with the same Primary Pool.
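As a toy illustration of why the Priority game is contained: the field names below are invented for the example, and this is a simplification of how Slaves actually evaluate the queue, but it captures the ordering:

```python
def job_rank(job, slave_pools):
    """Rank a Job from one Slave's point of view under
    [Pool, Priority, First-In First-Out]. Lower tuples sort first:
    a Primary Pool match beats any Priority value; Priority and
    submission order only break ties within the same pool tier."""
    if job["primary_pool"] in slave_pools:
        pool_tier = 0
    elif job.get("secondary_pool") in slave_pools:
        pool_tier = 1
    else:
        pool_tier = 2
    return (pool_tier, -job["priority"], job["submitted"])

jobs = [
    {"name": "high-pri", "primary_pool": "comp", "secondary_pool": "3d",
     "priority": 90, "submitted": 1},
    {"name": "mid-pri", "primary_pool": "3d",
     "priority": 50, "submitted": 2},
]
# A Slave in the "3d" pool prefers the mid-Priority Job, because it matches
# the Slave's pool as a Primary Pool while the 90-Priority Job matches only
# via its Secondary Pool:
best = min(jobs, key=lambda j: job_rank(j, {"3d"}))
print(best["name"])   # mid-pri
```

So a Priority war between Jobs in different Primary Pools never even starts; the pool assignment settles it first.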