OnJobSubmitted Race Condition

I have a custom event that handles special cases when a job is submitted to Deadline. For example, if it is a tile job, the job group is automatically changed to tile, because tile jobs require a ton of RAM and we have specific renderers dedicated to them. However, idle renderers pick up the job before the OnJobSubmitted event finishes, and the job ends up taking 2-3 times as long to render. Is there a way to force render slaves to wait until the OnJobSubmitted event finishes before they start rendering?

Hello,

I think the best two options in this situation are either to use machine limits on submission, to ensure only those specific machines render the job, or to have the job submitted as suspended, so that the entire thing can be finished before you queue the job up. Hope this helps.

Cheers,

Dwight

Unfortunately, neither of those solutions is automatic, which defeats the purpose of using the OnJobSubmitted event to handle things automatically.

For the first solution, my example of the tile job is one of many conditions handled in the OnJobSubmitted event, and having the artists manually enter machine limits each time is unacceptable.

As for the second, submitting as suspended would require manually queueing each job in the Deadline Monitor after the job has been submitted. Because we submit dozens of layers at a time, this would be a very time-consuming process.

Do you have any solutions that don’t involve manual processes from those submitting the jobs?

Thanks for your help.

OnJobSubmitted is executed by the local machine which is doing the job submission. Essentially, the OnJobSubmitted event should be working exactly the way you want it to. There was a bug in v6.0 which stopped the OnJobStarted event from firing, but you're not using that event. Are you guys running v6.1?
Could you share your code so I can see what you're doing?

Yes, we are using 6.1. Here is an example of what I’m doing:

from Deadline.Events import *
from Deadline.Scripting import *
from Deadline.Jobs import *
from Deadline.Slaves import *

def GetDeadlineEventListener():
    return CustomJobEvent()

class CustomJobEvent(DeadlineEventListener):
    def __init__(self):
        self.OnJobSubmittedCallback += self.OnJobSubmitted

    def OnJobSubmitted(self, job):
        if job.TileJob:
            job.JobGroup = 'tile'
            RepositoryUtils.SaveJob(job)

But here is what’s happening:

  1. Job is submitted to Deadline
  2. OnJobSubmitted is called
  3. A render slave picks up a task (for example, a workstation that does not include the ‘tile’ group)
  4. The job group is changed to ‘tile’

Steps 2 and 3 might be swapped, but it doesn’t matter. In this case the workstation would soon run out of RAM and lock up. If the event is not guaranteed to finish before render slaves start picking up tasks, what other method is there for handling special cases like this?

This is a bit of a hack, but if you were to modify the scripts that create the tile jobs, you could have them submit the job as suspended. The “OnJobSubmitted()” call could then set the job to “pending” or “queued”, depending on whether the tile job is dependent on an initial render.
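A rough sketch of that flow, written as a plain-Python model rather than against the Deadline API. The `Job` class, `on_job_submitted` and `can_pick_up` are hypothetical stand-ins, and the default group name "none" is only a placeholder. It just shows why a job that arrives suspended cannot be grabbed before its group is fixed:

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical stand-in for a submitted job (not the Deadline Job class)."""
    name: str
    tile_job: bool = False
    group: str = "none"            # placeholder default group
    status: str = "Suspended"      # assumption: the submitter sends it in suspended
    has_dependencies: bool = False


def on_job_submitted(job):
    """Apply the special cases while the job is suspended, then release it."""
    if job.tile_job:
        job.group = "tile"
    # Release only after every change has been made.
    job.status = "Pending" if job.has_dependencies else "Queued"
    return job


def can_pick_up(job, slave_groups):
    """A slave in this model only takes queued jobs whose group it belongs to."""
    return job.status == "Queued" and job.group in slave_groups
```

In this model a workstation that is only in the default group never sees the tile job as available, either before or after the handler runs.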

Ryan tells me we just changed this in 6.2 beta this morning so that OnJobSubmitted() is fired before the job is put into the database. I can get you on the beta if you want to give that a try.
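To illustrate why the ordering matters, here is a small plain-Python model of the two orderings. Nothing in it is Deadline code: `submit`, `regroup_tiles` and the dict-based job are made up for the example, and the "database" is just a list that an idle slave is assumed to poll the moment the job appears in it.

```python
def regroup_tiles(job):
    """Hypothetical handler: move tile jobs into the 'tile' group."""
    if job.get("tile"):
        job["group"] = "tile"


def submit(job, database, handler, event_before_insert):
    """Return the group an idle slave would see at the instant the job becomes visible."""
    if event_before_insert:
        handler(job)                  # ordering described for the 6.2 beta: event first
        database.append(job)          # job becomes visible only after the handler ran
        return database[-1]["group"]
    database.append(job)              # older ordering: job is visible immediately
    seen_by_slave = database[-1]["group"]
    handler(job)                      # group changes too late for that slave
    return seen_by_slave
```

With the event fired first, the earliest state a slave can observe already has the corrected group, so the race in the numbered steps above cannot occur in this model.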

Here’s a modified script with two major changes:

  1. Clean up the event plugin, otherwise you are going to have memory issues.
  2. Suspend the job, make the job group change, and then resume the job again.

[code]from Deadline.Events import *
from Deadline.Scripting import *
from Deadline.Jobs import *
from Deadline.Slaves import *

def GetDeadlineEventListener():
    return CustomJobEvent()

def CleanupDeadlineEventListener( eventListener ):
    # Called by Deadline when the plugin is unloaded.
    eventListener.Cleanup()

class CustomJobEvent(DeadlineEventListener):
    def __init__(self):
        self.OnJobSubmittedCallback += self.OnJobSubmitted

    def Cleanup( self ):
        # Release the callback so the listener does not leak memory.
        del self.OnJobSubmittedCallback

    def OnJobSubmitted(self, job):
        if job.TileJob:
            # Suspend first so no slave can pick up a task mid-change.
            if job.JobStatus != "Suspended":
                RepositoryUtils.SuspendJob(job)

            job.JobGroup = 'tile'
            RepositoryUtils.SaveJob(job)

            # Release the job now that the group is correct.
            RepositoryUtils.ResumeJob(job)[/code]

Not ideal, but as Edwin said, it’s now fixed internally and will be available in the next v6.2 beta release.

Regards,
Mike