On Task Timeout - 'Complete' instead of 'Error/Notify'

I am trying to make a job that will do a distributed render using ‘vrayspawner.exe’. I am submitting it as a ‘commandScript’ job. In the job properties I have set a maximum render time. I would like a user to be able to reserve a render machine for x amount of time and then have the job complete after that time. Is it possible to set “On Task Timeout” to ‘Complete’ the job instead of ‘Error’ or ‘Notify’?

Is there a better way of going about this?

Unfortunately, marking a task as complete when it times out isn’t an option.

Maybe you could create a new submitter (or modify the existing commandscript one) to override the job error limit to be 1. Then when a task errors out due to a timeout, that single error will cause the job to fail. This can be done by adding the following options to the job info file (the file that contains general job properties like priority, pool, etc):

OverrideJobFailureDetection=True
FailureDetectionJobErrors=1
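
For anyone modifying a submitter, here is a minimal Python sketch of how those two options could be merged into the job info text before submission. The helper name set_job_info_options and the example job info contents are made up for illustration; only the two option names come from the lines above.

```python
def set_job_info_options(job_info_text, options):
    """Return job info text with the given Key=Value options set or replaced."""
    remaining = dict(options)
    out = []
    for line in job_info_text.splitlines():
        key = line.split("=", 1)[0].strip()
        if "=" in line and key in remaining:
            # Replace an existing value for this key.
            out.append(f"{key}={remaining.pop(key)}")
        else:
            out.append(line)
    # Append any options that were not already present.
    out.extend(f"{k}={v}" for k, v in remaining.items())
    return "\n".join(out) + "\n"


# The two options quoted above: fail the job on its first error.
FAIL_ON_FIRST_ERROR = {
    "OverrideJobFailureDetection": "True",
    "FailureDetectionJobErrors": "1",
}

# Illustrative job info contents (assumed, not taken from a real submitter).
example = "Name=vrayspawner reservation\nFailureDetectionJobErrors=5\n"
print(set_job_info_options(example, FAIL_ON_FIRST_ERROR), end="")
```

The existing FailureDetectionJobErrors line is replaced in place and the missing override line is appended, so the file never ends up with duplicate keys.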

Cheers,

  • Ryan

That is what I have right now. :smiley:
The issue is that it produces a sea of red on Monitor. Is it possible to auto-delete that specific job when it is done/failed?

Heh, auto delete only works for completed jobs. :confused:

Maybe we need an option to mark as complete on timeout. I guess it could be useful in other cases (i.e. a sim job that caches its data on the fly).

Cheers,

  • Ryan

That would be really useful for us to have as an extra feature. :slight_smile:

We’re also looking at getting a different state out of a hard Timeout setting, but we would like to end up with a “Suspended” state so we could automate a “Transfer Job” to move the job to another Repository and let some other group of machines (which aren’t in our Slave pool) work on it. It seems like we could do this if we could include a Resubmit with “Suspend on submission” (“InitialStatus=Suspended”) in an OnJobFailed Event Plug-in, and also set a JobExtraInfo value that an “OnJobSubmitted” Event Plug-in could check and, if present, start a Job Transfer. Would that work, and is it the best approach?

Some of these pieces seem to be available. It’s possible, at least manually, to Resubmit a failed job with “Suspend on submission” or to script a new job submission with “InitialStatus=Suspended”, so there’s some process by which a job or a Failed job can become a Suspended job; the Transfer Job script is also available for use. It’s not clear, though, whether it’s straightforward to script a “Resubmit” with “Suspend on submission” (is this visible somewhere or do you need to build it using the commandline submission process?) as part of an “OnJobFailed” Event Plug-in; it just seems like it’s functionally doable since it’s available in Deadline Monitor.

With a little bit of scripting, this should be pretty straightforward. Timeouts, by default, accumulate errors for a job, so as long as you have your job failure detection settings configured so that a job can fail, you’re on the right path:
thinkboxsoftware.com/deadlin … detection/

Now that jobs can fail, you can write an event plugin to handle when jobs fail (using the OnJobFailed event):
thinkboxsoftware.com/deadlin … teventsdk/

In the OnJobFailed function, you should be able to use the majority of the code from the Transfer Job script in \your\repository\scripts\Jobs\JobTransferSubmission\JobTransferSubmission.py. To suspend the transferred job, just include this in the job info file (where you specify properties like priority, pool, plugin, etc):

InitialStatus=Suspended

Based on your post, it sounds like you’re familiar with all these concepts, but if you have any additional questions, just let me know!

Cheers,

  • Ryan

Thanks. If we can get into a Transfer Job from an “OnJobFailed” Event plug-in, without going through a “Suspended” state first, that will work. We can edit the “Suspended” check out of the Transfer Job script if this doesn’t have any functional meaning for our purpose.

The only reason we enforced the suspended state for job transfers was so that the job wasn’t actively being rendered while being transferred somewhere else. Since the failed state is a non-active state, transferring failed jobs is fine, so you can just edit that in the transfer job script.

Now that I think about it, we should probably edit the submitter that ships with Deadline to allow the transfer of failed and completed jobs.
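
As a rough sketch of that edit, the state check could be relaxed from "suspended only" to "any non-active state". The function name and the status strings below are assumptions for illustration; the actual check in JobTransferSubmission.py may use different identifiers.

```python
# Hypothetical status names; the real script may spell these differently.
NON_ACTIVE_STATES = {"Suspended", "Failed", "Completed"}


def can_transfer(job_status, allowed=NON_ACTIVE_STATES):
    """Return True if a job in this state is not being rendered and is safe to transfer."""
    return job_status in allowed


for status in ("Suspended", "Failed", "Completed", "Active"):
    verdict = "ok to transfer" if can_transfer(status) else "blocked"
    print(f"{status}: {verdict}")
```

The point of keeping an allow-list rather than removing the check entirely is that actively rendering jobs are still blocked.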

Cheers,

  • Ryan