(preamble: I’m running Deadline 3.1.0.35390; I have not changed the default Job Scheduling Order in the Repository, so it’s still “Pool_Priority_Date”).
I wanted the jobs for a certain project to receive priority on about 1/3 of the slaves. On all other slaves, I wanted things to continue as normal, meaning jobs are prioritized based on priority and submission time, and pool is irrelevant. So I created a pool called “hot” for that project, and I put about 1/3 of the slaves into that pool.
For jobs submitted to the default “none” pool, everything worked as expected: jobs were picked up in order by priority and then date, across all slaves.
For the slaves in the “hot” pool, everything also went as expected: they preferentially picked up jobs submitted to the “hot” pool, but if there were none, they would pick up jobs from the “none” pool as well.
The point that surprised me (and that seems to make pools unusable for the above use case) is that the machines in the “none” pool didn’t pick up jobs from the “hot” pool, even when there were no jobs in the “none” pool. This seems to contradict the general statement in the documentation (under “What Are Groups/Pools?” in “How Deadline Works”) that pools only manage priorities; it’s more like what would happen if I were using groups. That said, this behavior is documented in the detailed description under “How A Job Is Selected For A Slave” in “How Deadline Works”.
This has the effect of making jobs submitted to the “hot” pool actually worse off than jobs submitted to the default “none” pool: yes, jobs submitted to the “hot” pool get priority on 1/3 of the machines; but unfortunately they’re locked out from the other 2/3 of the machines completely, even if those machines are idle.
Slaves always seem to behave as though they were assigned to the “none” pool in addition to whatever pool(s) they were explicitly assigned to. However, jobs do not seem to have that same implicit behavior: if a job is assigned to an explicit pool, it is not also considered eligible for the “none” pool, which unfortunately means it cannot run on any slaves other than the ones in its pool. So in that sense, pools do affect more than just priority management.
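To make the asymmetry concrete, here is a toy Python model of the rule as I observed it. This is only a sketch of my understanding, not Deadline's actual code; the names `slave_pools` and `is_eligible` are made up for illustration.

```python
# Toy model of the observed behavior; NOT Deadline's real implementation.
NONE_POOL = "none"

def slave_pools(explicit_pools):
    """Pools a slave draws from: its explicit pools plus the implicit "none"."""
    return list(explicit_pools) + [NONE_POOL]

def is_eligible(job_pool, explicit_pools):
    """A job belongs to exactly one pool; the slave must list that pool."""
    return job_pool in slave_pools(explicit_pools)

hot_slave = ["hot"]   # slave explicitly assigned to the "hot" pool
plain_slave = []      # slave with no explicit pool assignment

print(is_eligible("hot", hot_slave))     # True
print(is_eligible("none", hot_slave))    # True
print(is_eligible("none", plain_slave))  # True
print(is_eligible("hot", plain_slave))   # False: the lockout described above
```

The last line is the case that hurts: a slave with no explicit pools never sees a “hot” job, no matter how idle it is.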
I thought about creating another pool called “general” or “default”, and then placing all Slaves into that pool – for the Slaves also in the “hot” pool, “hot” would be at a higher priority than “default”. However, that doesn’t change the fact that the machines not assigned to the “hot” pool will refuse to pick up jobs assigned to “hot”.
Department A / Department B
If I had exactly two departments, each with their own pool of machines, pools would work just fine. Dept A’s machines could put the Dept A pool above the Dept B pool; Dept B’s machines could do the opposite; and jobs would be submitted to the department’s pool as expected. However, I actually have something on the order of 15 projects on any given day, and only one of them has machines on which its jobs should be prioritized by pool. So I really need a way to have one set of slaves that prioritizes jobs by pool, and another set that just does Priority/Date based sorting.
In Summary
I haven’t been able to see how to implement this use case with the current Deadline behavior. Any suggestions would be welcome!
If there’s no way to do this currently, it seems like this use case could be resolved without affecting other use cases by adding the implicit behavior that all jobs are considered to be eligible for the “none” pool in addition to any explicit pool they were assigned to, similar to the way slaves consider themselves to always be in the “none” pool in addition to any pool they were explicitly assigned to. But that’s a code change – right now jobs are considered to be in one and only one pool.
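A rough Python sketch of what I mean by that proposed rule, purely hypothetical: `pick_job` and the job fields are invented for illustration, and I’m assuming a higher priority number wins and ties go to the earlier submission.

```python
# Hypothetical sketch of the PROPOSED rule; not how Deadline works today.

def pick_job(jobs, explicit_pools):
    """Return the job a slave would take if every job were also in "none".

    jobs: list of dicts with "name", "pool", "priority", "submitted".
    explicit_pools: the slave's explicit pools, in order of preference.
    """
    def sort_key(job):
        # Jobs in one of the slave's explicit pools rank by pool order;
        # every other job falls into the implicit "none" tier after them.
        if job["pool"] in explicit_pools:
            tier = explicit_pools.index(job["pool"])
        else:
            tier = len(explicit_pools)
        # Assumption: higher priority first, then earlier submission.
        return (tier, -job["priority"], job["submitted"])

    return min(jobs, key=sort_key) if jobs else None

jobs = [
    {"name": "hot_job", "pool": "hot", "priority": 50, "submitted": 2},
    {"name": "other_job", "pool": "none", "priority": 70, "submitted": 1},
]

print(pick_job(jobs, ["hot"])["name"])  # hot_job: pool wins on "hot" slaves
print(pick_job(jobs, [])["name"])       # other_job: plain Priority/Date sort
print(pick_job(jobs[:1], [])["name"])   # hot_job: idle slaves are not wasted
```

With this rule, slaves in the “hot” pool still prefer “hot” jobs, while every other slave ignores pools and sorts by priority and date, which is the behavior I was after.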