Hi,
We have a few CommandScript and Post Job Scripts that execute command lines that only take a few seconds to complete. The problem is that these jobs seem to get picked up by multiple machines: usually only 2, but I’ve seen up to 4. It all happens within 10-30 seconds. I can see in the logs that the PostScript was successfully executed multiple times for a job that was never requeued.
I have a feeling this has something to do with the fact that the command line finishes so fast. Perhaps files aren’t written quickly enough into the Rendering and Completed job folders?
Thanks for any help,
Paul
Hi Paul,
This sounds like a race condition between slaves trying to dequeue the
same task at the same time, and somehow each getting a “lock” on it. I
tried to reproduce this here, but without any luck. I even ran some
tests with a special plugin we use for testing that simply sleeps
for a specified period of time (which simulates the process of
rendering). I set the sleep time to 10 milliseconds and even 0
milliseconds, but couldn’t reproduce the problem. I did all my tests
with the latest version of Deadline 2.7 that is on our website.
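To make the suspected race concrete, here is a minimal Python sketch of several workers trying to claim the same task at the same moment. It is not Deadline's actual dequeue code: the lock-file scheme, `try_claim`, `race`, and the `slaveN` names are all hypothetical, chosen only to show why a claim has to be a single atomic step. If the "does a lock exist?" check and the "create the lock" step were separate, two workers could both pass the check before either one created the file.

```python
import os
import tempfile
import threading


def try_claim(task_dir, task_id, worker):
    """Try to claim a task by atomically creating its lock file.

    Hypothetical scheme for illustration only. Returns True for the
    single worker whose create succeeds, False for everyone else.
    """
    lock_path = os.path.join(task_dir, f"{task_id}.lock")
    try:
        # O_CREAT | O_EXCL fails if the file already exists, so the
        # existence check and the creation happen as one operation.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        f.write(worker)  # record who owns the task
    return True


def race(num_workers=4, task_id="task_0001"):
    """Start several workers that all try to claim one task at once."""
    winners = []
    winners_lock = threading.Lock()
    barrier = threading.Barrier(num_workers)

    with tempfile.TemporaryDirectory() as task_dir:

        def worker(name):
            barrier.wait()  # release all workers together
            if try_claim(task_dir, task_id, name):
                with winners_lock:
                    winners.append(name)

        threads = [
            threading.Thread(target=worker, args=(f"slave{i}",))
            for i in range(num_workers)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

    return winners


if __name__ == "__main__":
    print(race())  # a list containing exactly one worker name
```

On a local disk the exclusive create lets only one worker win. On a shared network folder the same guarantee depends on the file system and protocol in use, so treat this as an illustration of the idea rather than a recipe.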
We’ve reworked the post job script code a bit for the 3.0 release, so
we’ll have to keep an eye on this one during the beta period (which
should be starting soon). Unfortunately, I can’t think of anything to
work around this problem for now.
Cheers,
- Ryan
–
Ryan Russell
Frantic Films Software
http://www.franticfilms.com/software/
204-949-0070
Two questions:
1.) Did you test with Pulse activated?
2.) It looks like it is supposed to be there, but what exactly does “Sending cancel task command to Plugin” mean in the Slave’s log?
As always, Thanks for all your help!
-Paul
Hey Paul,
Ah, I think I’ve discovered the problem. There is a bug with Pulse that
occasionally causes a task to be requeued seconds after it is
successfully dequeued. As far as we know, the problem doesn’t happen
very often, but either way we believe we have addressed it for the
next release. The fact that these particular tasks of yours only take
seconds to complete would explain why they are executed multiple times.
Cheers,
- Ryan