Hello all,
I manage a small render farm of ~16 machines (23 at night with workstations), and I am struggling to understand some of the behavior I’m seeing in our render pools when the scheduling order for the repo is set to “Pool, weighted, first-in-first-out”.
The weight values I’m using are set to:
- Priority Weight: 30
- Submission Time Weight: 2
- Rendering Task Weight: -1
- Error Weight: -5
- Rendering Task Buffer: 1
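To make my mental model concrete, here is a minimal sketch of how I assume the weight is calculated from these settings. The `Job` class, the `job_weight` function, the use of seconds for the submission age, and the way the Rendering Task Buffer is subtracted from the rendering task count are all my own guesses for illustration, not something I have confirmed against the actual scheduler.

```python
from dataclasses import dataclass

# Weight settings from my repository options
PRIORITY_WEIGHT = 30
SUBMISSION_TIME_WEIGHT = 2
RENDERING_TASK_WEIGHT = -1
ERROR_WEIGHT = -5
RENDERING_TASK_BUFFER = 1


@dataclass
class Job:
    """Hypothetical stand-in for a job, not a real Deadline class."""
    name: str
    priority: int
    submitted_at: int  # seconds on an arbitrary clock
    rendering_tasks: int = 0
    errors: int = 0


def job_weight(job: Job, now: int) -> int:
    """Assumed formula: each term is a job property times its weight.

    Assumptions: job age is measured in seconds, and the buffer is
    simply subtracted from the number of rendering tasks.
    """
    return (
        job.priority * PRIORITY_WEIGHT
        + (now - job.submitted_at) * SUBMISSION_TIME_WEIGHT
        + (job.rendering_tasks - RENDERING_TASK_BUFFER) * RENDERING_TASK_WEIGHT
        + job.errors * ERROR_WEIGHT
    )


# Two queued jobs with equal priority, submitted 10 minutes apart
older = Job("older", priority=50, submitted_at=0)
newer = Job("newer", priority=50, submitted_at=600)

print(job_weight(older, now=1800))  # 5101
print(job_weight(newer, now=1800))  # 3901
```

If the calculation works anything like this, then with all else being equal the older job should always come out ahead, as long as both weights are evaluated at the same moment.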
When a number of jobs are submitted one after another (say some illustrators send 6 to 8 jobs over the course of 30 minutes), the very first job submitted gets workers immediately. That is expected, and all is well.
The problem happens when that first job finishes. The workers are released from it, but instead of picking up the next job in the list (based on submission time), they choose a seemingly arbitrary job, sometimes even the most recent one. This happens whether the job is a 3ds Max DBR job or an animation: in either case, the workers choose newer jobs over older ones. It’s as if the “weight” values the workers use are not all being updated at the same time, so an older job can appear to have a smaller weight than a newer one, even though the older job should have the higher weight and therefore be chosen first.
This doesn’t seem right to me - and I’ve played around with all the weight values to try to understand what’s happening under the hood, but I haven’t had any luck whatsoever. The results are confusing and I haven’t been able to figure out a pattern.
The only clue I have is that when I double-click a job to open its properties and then close the properties window, the weight value listed in the Monitor immediately updates to reflect the submission time. It’s entirely unclear to me whether the workers see the updated value or are using a cached value from the repository.
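Here is a small sketch of what I suspect might be going on. It assumes the submission-time part of the weight grows with the job's age in seconds, and that a weight can be cached at the moment it was last evaluated. The timeline, the `age_term` and `pick_job` helpers, and the caching behavior itself are hypothetical; I don't know that the workers actually behave this way.

```python
SUBMISSION_TIME_WEIGHT = 2  # my Submission Time Weight setting


def age_term(submitted_at: int, evaluated_at: int) -> int:
    """Submission-time part of the weight (assumed: age in seconds times weight)."""
    return (evaluated_at - submitted_at) * SUBMISSION_TIME_WEIGHT


def pick_job(weights: dict) -> str:
    """Return the name of the job with the highest weight."""
    return max(weights, key=weights.get)


# Hypothetical timeline: job A is submitted at t=0, job B at t=600 (seconds).
# Suppose A's weight was last evaluated at t=300 and never refreshed,
# while B's weight was refreshed at t=1800, when a worker becomes free.
stale_a = age_term(submitted_at=0, evaluated_at=300)
fresh_b = age_term(submitted_at=600, evaluated_at=1800)
fresh_a = age_term(submitted_at=0, evaluated_at=1800)

# With a stale value for A, the newer job B wins.
print(pick_job({"A": stale_a, "B": fresh_b}))  # B

# With both weights evaluated at the same time, the older job A wins.
print(pick_job({"A": fresh_a, "B": fresh_b}))  # A
```

If something like this is happening, it would explain why an older job can lose to a newer one until its weight is refreshed, for example by opening and closing its properties.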
I have Pulse running on the main repository machine, but I don’t think Pulse is updating the weight values very often, if at all.
With a weighted system like this, how does each worker actually evaluate all the jobs and their weight values? And why would a worker choose a newer job over an older job with all else being equal?
Thanks for any help anyone can provide - I’m just tearing my hair out trying to understand this weighted system so that I can more efficiently allocate our small render farm across the jobs! Let me know if there is any more information I can provide to make this easier to diagnose and understand. Thanks!