Is the encapsulated python session per job/task on the roadmap for 7?
We still fight with the fact that sometimes the Deadline slave is started before some of the network drives are accessible, and it then initializes a ‘bad’ python environment…
The only fix we have is to go machine by machine, check the logs, and restart the slave… :\
Unfortunately, it’s not on the roadmap for version 7, but it is on the roadmap for version 8. The amount of refactoring involved for this was simply too much to squeeze into version 7.
Cheers,
Ryan
My +1 for an encapsulated session. The fact that the python instance is shared between jobs does lead to confusing bugs!
i.e. if a python script imports a module and that module later changes, the change will be ignored until the slave restarts, because python caches module imports.
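To make the caching behaviour concrete, here is a minimal sketch (assuming a shared module called studio_tools somewhere on sys.path; the name is just for illustration):

```python
# Minimal sketch of the caching behaviour; "studio_tools" is a hypothetical
# shared module sitting somewhere on sys.path.
import importlib
import sys

import studio_tools            # first import: runs the module and caches it in sys.modules

# ... studio_tools.py gets edited on the network share ...

import studio_tools            # no-op: Python hands back the cached entry
print("studio_tools" in sys.modules)   # True - the old code is still what runs

importlib.reload(studio_tools)         # the edit is only picked up if something reloads explicitly
```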
Yeah, this is a pretty major issue for us to be honest. It’s a robustness failure.
Our scripts have to go through all sorts of cleanup hoops, reloading modules constantly, etc. (something like the snippet below). I was hoping this was just a short-term problem with the 6.0 migration… :\ We can’t litter all our shared python modules with reload() calls.
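For what it’s worth, the kind of hoop we end up jumping through looks roughly like this (a hedged sketch, not our exact code; “studio” is a made-up package prefix):

```python
# Brute-force workaround: reload every already-imported module from our shared
# package so edits on the network share get picked up. Fragile, since the
# reload order between dependent modules isn't guaranteed here.
import importlib
import sys

for name, module in list(sys.modules.items()):
    if name.startswith("studio") and module is not None:
        importlib.reload(module)
```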
This isn’t a problem that can be solved with a quick fix. Python doesn’t have a “reset” option, and there is no standard way of “rolling back” changes that have been made to the python environment. Any application that supports python scripting that I’ve used has the same problem, so it’s not something that’s unique to Deadline. It’s just a bigger problem in Deadline due to the long-running nature of the Deadline applications.
The way we envision solving this would be to run all python sessions in separate processes (except for the main user interface). So when a slave goes to load a job, instead of doing so in a separate thread like it does now, it does so in a completely separate process. That process would then have a clean python environment at the start of the job, and when it’s done with the job, the process just shuts down. Things like event plugins, and pre/post job scripts, would also have to run in separate processes. This is a pretty significant change, because we need to develop the communication protocol with the new process and ensure we can share data between the new process and the main one.
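To illustrate the general idea (this is just a sketch under assumptions, not how Deadline will actually implement it; run_task, studio_tools and render are made-up names), a freshly spawned process gives each job a clean interpreter, and shutting the process down throws the whole python state away:

```python
# Sketch of per-job process isolation using the standard library; all names
# below (run_task, studio_tools.render) are hypothetical stand-ins.
import multiprocessing as mp

def run_task(task_args, result_queue):
    # Runs in a freshly spawned interpreter, so sys.modules starts clean
    # and nothing from a previous job can leak in.
    import studio_tools                       # hypothetical shared module, re-imported per job
    result_queue.put(studio_tools.render(task_args))

if __name__ == "__main__":
    ctx = mp.get_context("spawn")             # "spawn" guarantees a brand-new interpreter
    results = ctx.Queue()
    worker = ctx.Process(target=run_task, args=({"frame": 1}, results))
    worker.start()
    print(results.get())                      # wait for the job's result
    worker.join()                             # process exits, taking its python state with it
```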
In Deadline 7, we’ve focused heavily on the database backend and the UI, and that simply didn’t leave enough resources for this major change. But it’s now on the Deadline 8 roadmap as one of our top priorities.
Cheers,
Ryan
Sounds like a great plan, Ryan.
The main reason this is not as much of a problem in other applications is that they mostly run one task per session (most of the time, anyway). For example, we can’t even run the same maya / max session for different shows, as they all have their own configurations; you open a new maya for another shot, etc. Nuke works the same way: even opening a different version of the same script starts a whole new nuke session.
The Deadline slave’s main purpose is handling various different tasks in succession which are all (or mostly) independent of each other. So I think isolation is even more critical here than it is for nuke/max/maya, where you will likely be working in the same environment if you’re working on the same show.