Error 'finger-pointer' script

Hello all,

Firstly, I’m a complete newbie at Python (or indeed, any form of programming whatsoever), but I’m not crazy; I know that what I’m proposing will be a pretty big undertaking for someone with no experience. I’ve been studying on Codecademy for a few weeks, so I’ve got a handle on the basics of Python at least.
But nonetheless, there’s a relatively simple script/addon I’d love to be able to have with Deadline, and I feel that it’s simple enough that I could do it myself, as long as I can identify solutions to one or two key issues before I start.

I’d like to write a job plugin to be run on active or finished/failed jobs. The script would look up any error reports attached to the job, run a keyword/keyphrase search over the logs for that job, rank the search results by a predetermined priority, and finally present a list of the issues ordered by severity (with weighting added for the number of times a given error message appears).
With enough interpretation of the various error messages, the script could be used to quickly point a finger at the most likely cause of rendering issues or failures.

E.g. 500 mentions of some mentalray node discrepancies would register as minor compared to 1 mention of mayabatch crashing due to running out of RAM, so the script would report that the job has failed due to a lack of RAM (but still report the mentalray errors, just as a lower priority), as well as the affected machine(s).

Another example would be that any instance of “0: STDOUT: mel: Error: Cannot load scene. Please check the scene name.” would have the script flag the issue as “Resave scene in new project” as the most likely resolution to the issue, and so on and so forth.
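
The ranking idea described above could be sketched roughly as follows. This is only an illustration, not anything Deadline provides: the rule names, the regular expressions, the severity numbers and the scoring formula are all assumptions made up for the example, and the log lines would come from whatever source the error reports end up being read from.

```python
import math
import re
from collections import namedtuple

# Hypothetical rule table: every pattern, severity value and advice string
# here is an assumption for illustration, to be tuned against real logs.
Rule = namedtuple("Rule", "name pattern severity advice")

RULES = [
    Rule("out_of_memory", re.compile(r"out of memory|not enough ram", re.I), 10,
         "Job likely failed due to a lack of RAM"),
    Rule("scene_load", re.compile(r"Cannot load scene", re.I), 8,
         "Resave scene in new project"),
    Rule("mentalray_node", re.compile(r"mental ?ray", re.I), 1,
         "Minor mentalray node discrepancies"),
]


def rank_issues(log_lines, rules=RULES):
    """Return (score, rule name, hit count, advice) tuples, worst first."""
    # Count how many log lines each rule matches.
    counts = {}
    for line in log_lines:
        for rule in rules:
            if rule.pattern.search(line):
                counts[rule.name] = counts.get(rule.name, 0) + 1

    results = []
    for rule in rules:
        hits = counts.get(rule.name, 0)
        if hits:
            # Severity dominates; repeats add weight only logarithmically,
            # so many minor messages cannot outrank one serious failure.
            score = rule.severity * (1 + math.log10(hits))
            results.append((score, rule.name, hits, rule.advice))

    results.sort(key=lambda item: item[0], reverse=True)
    return results
```

With these made-up numbers, a single out-of-memory line scores 10, while 500 mentalray lines score roughly 3.7, so the RAM problem is listed first and the mentalray messages are still reported underneath it.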

I have two main issues before I begin working on this:

  1. We have Deadline 5.1 currently, but should be upgraded to Deadline 6 within a couple of weeks… Should I even bother beginning work on this script before we get Deadline 6? Or would it be safe to work on something this simple for Deadline 5.1 and then carry on once we’re upgraded?

  2. I’ve not been able to identify any way of obtaining the actual error report text through deadlinecommand or via Python hooks in Deadline. I’ve only been able to find out how to query the number of errors generated, not their actual content.
    As a fallback, I guess I could have the script ask an artist to enter the Job ID listed in Monitor, and then have the script independently load the .errorreport XML file directly from the repository. It’s just that it would be nice to have the Python script itself right there in Deadline Monitor, so all you need is a right click to select and run it on the highlighted job.
    Is there some way of accessing error report text through Deadline/deadlinecommand itself? Or will I need to build this script for use outside of Deadline?
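
For the fallback route (reading the .errorreport files straight from the repository), a minimal sketch might look like the one below. It makes two loud assumptions: that the report files for a job can be found by looking for the Job ID somewhere in their path, and that treating each file as plain text is good enough for keyword searching. The real repository layout and the XML structure of the reports are not described in this thread, so `reports_dir` and both helper functions are hypothetical.

```python
import os


def find_error_reports(reports_dir, job_id, extension=".errorreport"):
    """Return sorted paths of report files whose path mentions job_id.

    Assumption: the job ID appears in the folder or file name. Check this
    against the actual repository before relying on it.
    """
    matches = []
    for root, _dirs, files in os.walk(reports_dir):
        for name in files:
            path = os.path.join(root, name)
            if name.endswith(extension) and job_id in path:
                matches.append(path)
    return sorted(matches)


def read_report_lines(path):
    """Read a report as plain text lines, without parsing the XML."""
    # errors="replace" keeps odd bytes in a log from aborting the read.
    with open(path, "r", errors="replace") as handle:
        return handle.read().splitlines()
```

Reading the files as plain text sidesteps the unknown XML schema; once the structure of a real report has been inspected, a proper XML parser would be the cleaner choice.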

Finally, I’m quite happy to make the script publicly available for other Deadline users once I’ve gotten it started and working. I’d absolutely love it if others were able to chip in their thoughts and findings on the accuracy of the suggestions the script gives to users, and recommendations on adjusting the weighting it applies to error results, since it would help make it better for everyone.

Hi cgkomodo,

This is an interesting idea and should be possible with Deadline 5.1. That said, I always recommend updating to the latest version of Deadline since there are many advantages.

I would use an Event plugin to inspect jobs as they are completed or failed:
DL 5: thinkboxsoftware.com/deadlin … eventssdk/
DL 6: thinkboxsoftware.com/deadlin … eventssdk/

For accessing the Report info, check out the Deadline.Reports namespace in the Deadline Scripting Reference. There are several applicable functions there.

I am curious, how will you be consuming the output of your script? Will it write text files to a review folder, send e-mails, or perhaps write the results to a database that can be viewed through a web page or custom app?

Hi James,

The company has already purchased Deadline 6 actually, there’s just a snafu involving sorting out the server used for (ALL!) software licensing so we don’t have the licenses installed for us to use yet.

I looked at those two reference pages, but they only seem to cover -scripts- adding their own error information into Deadline’s error log. They don’t seem to say anything about reading error log information that already exists FROM a Deadline job, or about a “deadline.reports” namespace?

I had intended on the script just popping up a Python command window and printing the resulting interpreted error message(s), though now you’ve said it, saving something to a text file in a central error directory will probably also be a good idea (in case an artist closes the window by accident or wants to refer back to the error). Unfortunately email support is a bit beyond me (mostly because that kind of access to the email system is about 3 paygrades above what I’m allowed to touch here).
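
Saving the interpreted result to a central error directory could be as small as the sketch below. The directory, the file naming scheme (Job ID plus a timestamp) and the function name are all assumptions for illustration.

```python
import os
import time


def save_summary(error_dir, job_id, summary_lines):
    """Write the summary lines to a per-job text file and return its path."""
    # Create the central folder on first use.
    if not os.path.isdir(error_dir):
        os.makedirs(error_dir)
    # Hypothetical naming scheme: <job id>_<timestamp>.txt
    stamp = time.strftime("%Y%m%d-%H%M%S")
    path = os.path.join(error_dir, "%s_%s.txt" % (job_id, stamp))
    with open(path, "w") as handle:
        handle.write("\n".join(summary_lines) + "\n")
    return path
```

Including the Job ID in the file name means an artist can find the summary again from the ID shown in Monitor.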

Hi cgkomodo,

The links I provided were only to describe how to write an Event plugin. For documentation on the reporting functions, you’ll need to refer to the Deadline Scripting Reference which is made available as a PDF document from the same place where your IT personnel downloaded the Deadline installers. If you’re having trouble getting your hands on it via your IT department, just send me a private message.

In terms of consuming the output, a pop-up window may not be a good choice. The Event plugin is executed by whichever slave triggers the event, so the odds of actually seeing it would be low since the pop-up would appear on a random slave on the farm. One of the other options I mentioned would be better. Sending an e-mail is not terribly difficult (well, depending on the server config).
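
As a rough idea of what the e-mail option involves, here is a sketch using only Python's standard library (Python 3 syntax). Nothing in it is Deadline-specific: the addresses, the subject format and the assumption that an SMTP relay accepts unauthenticated mail from the farm machines are all hypothetical and depend on the server config.

```python
import smtplib
from email.message import EmailMessage


def build_report_email(sender, recipient, job_id, summary_lines):
    """Build a plain-text message containing the error summary."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Deadline job %s: error summary" % job_id
    msg.set_content("\n".join(summary_lines))
    return msg


def send_report_email(msg, host, port=25):
    """Hand the message to an SMTP server.

    Assumption: the server at host:port relays mail without a login.
    Many servers will need TLS and credentials instead.
    """
    with smtplib.SMTP(host, port) as server:
        server.send_message(msg)
```

Keeping message construction separate from sending makes it possible to check the content without touching the mail system at all.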

Hi James,

Ahh, I wasn’t aware of that document, and I’ve tried to get a copy from post pro but I think they’ve lost it or didn’t download it, so I’ll PM you shortly. Thanks for that.

Do you reckon this idea would not work if executed from within Monitor, as a job script? Or is it going to be better as an event plugin that saves a text file in a central error folder?

Cheers
Paul,

Either approach would work, but I’m generally a fan of “first-chance” style processing, meaning doing as much work as possible up front (where it makes sense). Saving a few seconds here and there can really add up over time, and it’s well established that having computers do rote work is more cost effective than troubling people with it. So, I would prefer the Event plugin approach for this.

Hi James,

Been a really busy week for me and I’ve not had a lot of chance to get a good start on my script.
Though unfortunately I’ve come across a pretty obstructive issue already :S

I can’t seem to find the Deadline Python modules. I’ve searched the Python directories on the machine, including the Python installation within the Deadline folders themselves.
I’ve also tried importing the names of any Deadline modules I can find, but I always get errors saying that the modules aren’t found.

Surely I must be doing something wrong since deadline itself wouldn’t function if these modules weren’t installed properly?

Copied straight from our standalone python API docs, available via link on our website or via your official Deadline v6.1 download link as provided to you by Sales.

Set-up
In order to use the Standalone Python API you must have Python 2.6 or later installed. Copy the “Deadline” Folder containing the Standalone Python API from \your\repository\api\python to the site-packages folder of your Python installation and the API is ready to use.

Using the API
First, a DeadlineCon object must be created, which is used to communicate with Pulse to send and receive requests. First “import Deadline.DeadlineConnect as Connect”, then create your connection object “connectionObject = Connect.DeadlineCon('theHostAddress', thePortNumber)”. The “connectionObject” variable can now be used to communicate requests to Pulse.

Example: Getting group names and suspending a job

import Deadline.DeadlineConnect as Connect

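# Connect to Pulse (host and port as in the original example)
connectionObject = Connect.DeadlineCon('localhost', 8080)
print(connectionObject.Groups.GetGroupNames())
# ["group1", "group2", "group3"]
jobId = "<valid job ID>"  # placeholder: substitute a real job ID
print(connectionObject.Jobs.SuspendJob(jobId))
# 'Success'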