"Priority" considered before "Pool" v2.7

Hi,

We are a 3d department of 17 animators and have been using Deadline v2.7 for over a week now, so we’ve started to get used to it.

We have a lot of render tasks running through Deadline at the moment. Our render farm is split into 6 pools: Fast_a, Fast_b, Fast_c, Slow_a, slow_b, slowest. The Fast_a/b/c pools are set to exclude the ‘slowest’ pool, and this is the only pool exclusion we have. Some queued render tasks are set to render on their assigned pool only!! Others are not, so that they can pick up render nodes that are not being used by other pools.

We have the Tools->ConfigureRepositoryOptions->SchedulingSettings set to “Pool_Priority_Date”. However, this doesn’t seem to be working properly. Currently there is a render task with Pool: Fast_A, Priority: 80 that is taking nodes from Pool: Slow_a, even though there are 6 render tasks queued up to render on Slow_a only!! My understanding is that a render task assigned to Fast_a should only take render nodes from other pools if they are not being used to render other tasks.

Anyone have any clues as to why Pool is being ignored? Is this normal Deadline behaviour?

Cheers,
Olly : )

Hi Olly,

Pools are priority based, and the order by which slaves select jobs is affected by the order in which pools are assigned to the slaves. For example, if a slave is assigned these pools in the following order:

Fast_a, Slow_a

Then the slave will always give preference to Fast_a jobs, regardless of the number of jobs in the Slow_a pool that are queued up. The slave will only move on to Slow_a jobs once all Fast_a jobs are complete. If while rendering a Slow_a job a new Fast_a job gets submitted, the slave will again give preference to the Fast_a job.

Note that the Use Machines In Pool Only option only affects the priority of a job for a slave if that slave hasn’t been assigned that pool. So if a job is submitted to the Fast_a pool without the Use Machines In Pool Only option enabled, it will still be preferred by the slave above, even if there are jobs in the Slow_a pool that have the Use Machines In Pool only option enabled.
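To make the ordering above concrete, here is a minimal Python sketch of how a slave might rank queued jobs. It is only an illustration of the rules described in this post, not Deadline's actual code: the `Job` class, the `pick_order` function, and the exact sort key (pool order first, then higher priority, then earlier submission) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Job:
    # Hypothetical job record; field names are illustrative only.
    name: str
    pool: str
    priority: int
    submitted: int           # smaller value = submitted earlier
    pool_only: bool = False  # stands in for "Use Machines In Pool Only"


def pick_order(slave_pools, jobs):
    """Return the jobs a slave could render, in the order it would prefer them.

    slave_pools is the slave's ordered pool list, e.g. ["Fast_a", "Slow_a"].
    """
    def eligible(job):
        # A pool-only job is skipped by slaves that lack its pool.
        return job.pool in slave_pools or not job.pool_only

    def key(job):
        # Jobs in an assigned pool rank by that pool's position in the list;
        # jobs from any other pool rank after all assigned pools.
        if job.pool in slave_pools:
            rank = slave_pools.index(job.pool)
        else:
            rank = len(slave_pools)
        return (rank, -job.priority, job.submitted)

    return sorted((j for j in jobs if eligible(j)), key=key)


jobs = [
    Job("slow_job", "Slow_a", priority=50, submitted=1, pool_only=True),
    Job("fast_job", "Fast_a", priority=80, submitted=2, pool_only=False),
]

# Slave assigned Fast_a then Slow_a: the Fast_a job comes first.
both = [j.name for j in pick_order(["Fast_a", "Slow_a"], jobs)]

# Slave assigned Slow_a only: its own pool comes first, and the Fast_a job
# is picked up afterwards because it is not pool-only.
slow_only = [j.name for j in pick_order(["Slow_a"], jobs)]

print(both, slow_only)
```

In this sketch the pool-only flag never changes the ranking of a job whose pool the slave has; it only removes jobs from slaves that lack the pool.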

I’m guessing there is a little bit of confusion in how you’re expecting the pool system to work, so hopefully this clears things up. :)

If you have any more questions, let us know!

Cheers,

- Ryan

Hey Ryan,

Thanks for your reply. The thing is that each of our render slaves is only assigned to one pool, so it makes even less sense now. From what you’re saying, with the way we have our slaves set up, a task submitted to Fast_a should only ever render using the fast_a slaves, surely?

From what you’re saying, with the way we have our slaves set up, a task submitted to
Fast_a should only ever render using the fast_a slaves, surely?

This is correct ONLY if you have the Use Pool Machines Only option enabled for the job. If the Use Pool Machines Only option is disabled, then the job can potentially run on any machine. However, a slave with the Fast_b pool assigned will only render the Fast_a job once all Fast_b jobs have completed.

Cheers,

- Ryan

However, a slave with the Fast_b pool assigned will only render the Fast_a job once all Fast_b jobs have completed.

Yes, that’s how I thought it should behave but it’s not been occurring like this. Our Slow_a slaves have been rendering tasks submitted to Fast_a pool, even though there are 6 tasks queued up for Slow_a. In addition, I’ve just noticed that jobs assigned to Fast_a pool (which has ‘Slowest’ as an excluded pool) have been rendered on Slowest slaves. So it appears that the Pools are not being obeyed properly at all. Have you heard of such a thing before?

Cheers,
Olly : )

Our Slow_a slaves have been rendering tasks submitted to Fast_a pool,
even though there are 6 tasks queued up for Slow_a.

This is a little strange. There is probably a good explanation for this, but it’s hard to make a guess without knowing more about how your farm is configured. Can you post the following:

  1. Find a Fast_a job in the Monitor that is rendering ahead of a Slow_a job. Right-click and select Repository Directory to open up the job folder. Post the *.job files.

  2. Find a Slow_a job in the Monitor that should be rendering ahead of the Fast_a job. Right-click and select Repository Directory to open up the job folder. Post the *.job files.

  3. Let us know if you’re using the Machine Limit or Limit Groups settings for any of your jobs. Those can impact the order of jobs as well.

  4. Finally, have any errors been reported for the Slow_a jobs? Let us know!

In addition, I’ve just noticed that jobs assigned to Fast_a pool (which has
‘Slowest’ as an excluded pool) have been rendered on Slowest slaves.

Does the Fast_a job have the Use Pool Machines Only option disabled? If it does, it will render on any machine that doesn’t have the Fast_a pool excluded. So if the slaves that are assigned the Slowest pool don’t have the Fast_a pool excluded, then there is the potential for these slaves to render a Fast_a job that doesn’t have Use Pool Machines Only enabled.
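The eligibility rule described here can be sketched in a few lines of Python. This is a hedged illustration only: `can_render` and its parameters are hypothetical names, and the sketch assumes an exclusion always wins if a pool were somehow both assigned and excluded on the same slave, which this thread does not confirm.

```python
def can_render(assigned, excluded, job_pool, pool_only):
    """Decide whether a slave may pick up a job, per the rule sketched above.

    assigned  -- pools assigned to the slave
    excluded  -- pools the slave is set to exclude
    job_pool  -- pool the job was submitted to
    pool_only -- True if the job has Use Pool Machines Only enabled
    """
    if job_pool in excluded:
        # Assumption: the slave never takes jobs from a pool it excludes.
        return False
    if job_pool in assigned:
        return True
    # Slave does not have the job's pool: only allowed if the job
    # is not restricted to its own pool's machines.
    return not pool_only


# A Slowest slave with no exclusions can take a Fast_a job that is not pool-only.
open_slave = can_render({"Slowest"}, set(), "Fast_a", pool_only=False)

# The same slave refuses once Fast_a is excluded on the slave itself.
excluding_slave = can_render({"Slowest"}, {"Fast_a"}, "Fast_a", pool_only=False)

# With the option enabled, the job stays off slaves that lack Fast_a.
restricted_job = can_render({"Slowest"}, set(), "Fast_a", pool_only=True)

print(open_slave, excluding_slave, restricted_job)
```

Note that in this sketch the exclusion lives on the slave, not on the job's pool, so keeping Fast_a jobs off the Slowest machines means excluding Fast_a on those slaves.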

Cheers,

- Ryan

Hey Ryan,

  1 & 2. Attached are the *.job files for a Slow_a task that has not been able to render. Also attached are the *.job files for a Fast_b task which was rendering using the “Slowest” pool, even though “Slowest” is an excluded pool.

  3. ‘Limit Groups’ has not been used at all. Originally the offending “fast_b” job had no machine limit and also was not set to render from its pool only. Coz of me whining, my fellow animator added a machine limit and changed the setting to ‘Use Pool Machines Only’ before I could get to the Job Repository files for you, so I don’t know if the *.job files will show the history properly.

  4. No errors reported on Slow_a jobs.

Cheers,
Olly : )
slow_a_repository.rar (1.42 KB)
fast_b_repository.rar (1.49 KB)

Thanks. I noticed that both jobs have the Use Machines In Pool only option enabled, so the Slow_a job should only be rendering on machines that have been assigned the Slow_a pool, and the Fast_b job should only be rendering on machines that have been assigned the Fast_b pool. Is this not the case?

Now that we have switched all the tasks to Use Machines in Pool only, they are rendering with the correct slaves. However, this is not an ideal solution in the long term…we want to be able to submit tasks to Fast pool and submit tasks to Slow pool with Use Machines in Pool UNchecked so that any task can pick up the slaves from the other pool once there are no tasks waiting for that pool.

We’ll try and recreate the error today so that we can send you an accurate *.job file.

Cheers : )

Hi Ryan,
I think we have sorted out the pools issue… we were just thinking in reverse…

Thanks for the info…

Robin

Hi Robin,

Glad to hear it! If you have any more questions, let us know!

Cheers,

- Ryan