I have a specific type of render job I can run 30 of at the same time before maxing out my system resources. This was tied to specific beefy systems dedicated to this task. I created 30 workers on each of these systems and they would pick up the jobs as they came in (each job represents what could be considered a frame) without issue, never quite maxing out resources or RAM.
I now need to use these systems for other jobs as well at the same time. If any of these workers, while running their 30 tasks, pick up a job of this different type, they’ll max out system resources and bring everything to a halt.
I can do 30 of one type of job, or just one of the other (it will use almost all available resources).
Is there a way to prevent certain job types from starting if any jobs of another type are currently rendering on a select group of Workers? I’m thinking Resource Limits could be the way.
I thought I could convert my 30 jobs into a single job, set to 30 concurrent tasks at a time, and limit that Worker to one job at a time. But it seems the maximum number of concurrent tasks is limited to 16. Is there a way to set this higher?
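To make the constraint concrete, here is a rough sketch of the rule I’m after. It is plain Python, not anything from Deadline’s API; `CAPACITY`, `COST`, and `can_start` are made-up names, and the cost of 30 for the big job is just my estimate that it uses almost the whole machine.

```python
# Hypothetical model of one render machine: 30 "slots" of capacity.
CAPACITY = 30

# Assumed cost per job type: a small task takes 1 slot,
# the big job takes (nearly) the whole machine.
COST = {"small": 1, "big": 30}


def can_start(job_type, running):
    """Return True if a job of job_type fits next to the running jobs."""
    used = sum(COST[j] for j in running)
    return used + COST[job_type] <= CAPACITY
```

With this rule, a 30th small task is allowed next to 29 others, but the big job is refused as soon as a single small task is running, and the reverse holds too.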
If you didn’t need more than 16, I’d say concurrent tasks on your tiny-task job would take care of your issues. As it stands, I’m not quite sure how to square this particular circle without getting excessively janky. Dynamically starting and stopping Workers tends to create issues.
If possible, I’d look into doubling the resources used by the tiny tasks, so the concurrent task limit of 16 works for you. If that’s not possible I’ll come back if I’ve got a better idea!
Unfortunately, with the way the job works, I can’t double the resources: each task is limited to a single “frame”. I could potentially start creating small-task jobs for each set of 16 tasks… but then, with the now-single Worker limited to one job at a time, I’d lose roughly half of the concurrency (16 tasks running instead of 30).
There’s no way to override this arbitrary 16 concurrent task limit?
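As a back-of-the-envelope check on what the cap costs, here is a small sketch. It assumes every frame takes about the same time and that the machine can really hold 30 tasks at once; `rounds_needed` and the 240-frame figure are hypothetical, for illustration only.

```python
import math

MACHINE_CAPACITY = 30      # small tasks the machine can run at once (from above)
MAX_CONCURRENT_TASKS = 16  # the concurrent task cap discussed here


def rounds_needed(frames, concurrent):
    """Number of back-to-back batches needed to render all frames,
    assuming each frame takes one unit of time."""
    return math.ceil(frames / concurrent)


# Fraction of the machine kept busy when only 16 of 30 slots are used.
utilization = MAX_CONCURRENT_TASKS / MACHINE_CAPACITY

frames = 240  # hypothetical shot length
rounds_at_capacity = rounds_needed(frames, MACHINE_CAPACITY)
rounds_at_cap = rounds_needed(frames, MAX_CONCURRENT_TASKS)
```

Under those assumptions, the machine sits at about 53% of its usable concurrency, and a 240-frame job takes 15 batches instead of 8.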
Sorry, 16 is the hardcoded maximum for concurrent tasks; no override is possible.
I don’t suppose two Workers on the machine would be okay? Or does the ‘big’ job really choke out the whole system?
Given that concurrent tasks aren’t enough, Deadline doesn’t have tools that handle this sort of setup well. It is possible to enable and disable Workers, but if that happens while a task is in progress, it’ll create an error on the job.