AWS Thinkbox Discussion Forums

Getting up-to-date job status?

We have a custom API that we’re using to notify users via Slack when tasks have errored, and when a job has completed with errored tasks still unrendered.

The problem is, with my current implementation the OnJobError callback has a cached version of the job. I’m checking the number of failed tasks in the Job (which is cached, not live), and sending the message. If 17 slaves grab the job before the first failure, and then all 17 fail, they ALL send a Slack message, and that generally all happens at once. :laughing: Knowing my artists, they will be less than thrilled about such a thing.

What’s the best way to work around this? Is there a way to check and see if the task is the first task to have failed?

Hey there,

It will be difficult to ensure that only one slave sends the message, since they all operate independently of each other. If you changed the event callback to OnJobFailed and had jobs fail on a single error, you would ensure that the message is sent when an error occurs, and only once. If you don’t want your jobs failing, you could use the extra info property on the job to record whether a Slack message has been sent yet, and save that to the db once it has. The second option is no guarantee, though: multiple slaves could still send the Slack message if they all failed a task at the same time.
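To make the second option a bit more concrete, here is a rough sketch of a "notify once" guard. Treat everything in it as an assumption to check against the scripting reference for your Deadline version: the accessor names GetJobExtraInfoKeyValue and SetJobExtraInfoKeyValue are what I would expect on the Job object, the key name SlackErrorNotified is made up, and save_job and send_message are hypothetical callables you would supply yourself. FakeJob only exists so the example runs outside Deadline.

```python
# Sketch only: a notify-once guard built on a job's extra info key/value store.
# FakeJob stands in for Deadline's Job object so the example runs anywhere.

FLAG_KEY = "SlackErrorNotified"  # hypothetical key name


class FakeJob:
    """Stand-in exposing the two extra info accessors this sketch assumes."""

    def __init__(self, job_id):
        self.JobId = job_id
        self._extra = {}

    def GetJobExtraInfoKeyValue(self, key):
        return self._extra.get(key, "")

    def SetJobExtraInfoKeyValue(self, key, value):
        self._extra[key] = value


def notify_once(job, save_job, send_message):
    """Send one message per job, using an extra info flag as the marker.

    Returns True if this call sent the message, False if the flag was
    already set on the job object it was given.
    """
    if job.GetJobExtraInfoKeyValue(FLAG_KEY) == "True":
        return False
    # Set and save the flag before sending, to keep the race window small.
    job.SetJobExtraInfoKeyValue(FLAG_KEY, "True")
    save_job(job)
    send_message("Job %s has task errors" % job.JobId)
    return True


if __name__ == "__main__":
    sent = []
    job = FakeJob("abc123")
    notify_once(job, save_job=lambda j: None, send_message=sent.append)
    notify_once(job, save_job=lambda j: None, send_message=sent.append)
    print(sent)
```

This is a check-then-set, not an atomic operation, so it narrows the window rather than closing it: two slaves that read the flag before either has saved it will both send.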

Do either of these options work for you?

Also, if you want the most recent Job data, you can pull the Job from the db directly in your event. Note that this makes every slave that errors hit the db an additional time, which will increase the load on the database machine. The function you want to call is job = RepositoryUtils.GetJobs( [job.JobId], True )

The True flag tells Deadline to pull it straight from the database, and the function expects the job IDs to be in a list, so just make a list and put the one job ID in it :slight_smile:
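For the "is this the first failure?" check from the original question, something along these lines might work once you have the fresh job. It is only a sketch: get_jobs is a hypothetical callable with the same shape as the call above (a list of IDs plus the flag), I am assuming the call hands back a collection rather than a single job, and I am assuming the Job object exposes a JobFailedTasks count. The fake lookup at the bottom is there purely so the example runs without Deadline.

```python
from types import SimpleNamespace


def fetch_fresh_job(get_jobs, job_id):
    """Reload one job, bypassing the cache, and return it (or None)."""
    # Assumption: the lookup takes a list of IDs and returns a collection.
    jobs = list(get_jobs([job_id], True))
    return jobs[0] if jobs else None


def is_first_failure(fresh_job):
    """True when the reloaded job reports exactly one failed task."""
    # Assumption: the job object exposes a JobFailedTasks integer.
    return fresh_job is not None and fresh_job.JobFailedTasks == 1


if __name__ == "__main__":
    # Fake lookup standing in for the repository call.
    store = {"abc123": SimpleNamespace(JobId="abc123", JobFailedTasks=1)}

    def fake_get_jobs(ids, invalidate_cache):
        return [store[i] for i in ids if i in store]

    job = fetch_fresh_job(fake_get_jobs, "abc123")
    print(is_first_failure(job))
```

Keep in mind that several slaves can still read the same count if they fail within moments of each other, so this reduces duplicate messages rather than ruling them out.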
