Hi,
we’ve just finished the transition from Deadline 7 to Deadline 10.
We’ve done a clean install of 10, and not updated on top of 7, so it’s a new repository and database on new hardware.
We’re now constantly having problems with V-Ray DBR jobs from Max, which worked with no problems on the old setup.
For some reason the Max session isn’t being closed on the slaves if a task fails or the job is finished.
So the next time we submit a DBR job, we get the “a process ‘vrayspawner201x’ with pid ‘5604’ exists” error.
If I then go to the slave and kill Max, it will quite often just start up again, even though the job is no longer active.
The Deadline slave also often says that it is still working on the job although there is no longer a job to work on.
Like I said, DBR worked perfectly a couple of days ago on DL 7, so is it just a setting in the plugin configuration I’ve missed?
Or is it something else?
If anyone has any ideas at all, I’d love to hear about it.
Cheers,
Dave
Does no-one else have this problem?
I’m guessing I’ve configured something wrong, but the only posts I found with a vrayspawner problem are from 2012 and not the same as my problem.
Dave
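As a stopgap for the "process exists" error described above, here is a minimal sketch of a cleanup script that could be run on a slave before resubmitting. It is not part of Deadline. The process-name prefixes are assumptions taken from the error message and from the usual 3ds Max image name, and the sample line is made up to match the PID quoted above. It relies only on the standard Windows `tasklist` and `taskkill` commands.

```python
import csv
import io
import subprocess

# Assumed image-name prefixes; adjust to what Task Manager shows on your slaves.
STALE_PREFIXES = ("vrayspawner", "3dsmax")


def find_stale(tasklist_csv, prefixes=STALE_PREFIXES):
    """Return (image_name, pid) pairs parsed from `tasklist /FO CSV /NH` output."""
    matches = []
    for row in csv.reader(io.StringIO(tasklist_csv)):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        name, pid = row[0], row[1]
        if name.lower().startswith(prefixes) and pid.isdigit():
            matches.append((name, int(pid)))
    return matches


def kill_stale():
    """Windows only: list running processes and force-kill the stale ones."""
    listing = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    ).stdout
    for name, pid in find_stale(listing):
        # /T also ends child processes, /F forces termination.
        subprocess.run(["taskkill", "/PID", str(pid), "/T", "/F"], check=False)
        print(f"killed {name} (pid {pid})")


if __name__ == "__main__":
    # Hypothetical sample line, so the parser can be tried without touching anything.
    sample = '"vrayspawner201x.exe","5604","Services","0","12,345 K"\n'
    print(find_stale(sample))
```

Only call `kill_stale()` when the slave is idle, since it would also end a Max session that belongs to a job that is still rendering.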
That’s new to me… We’ve seen Nuke detach in the past without Deadline catching it, so there may very well be something on the Deadline side. Usually upgrading Nuke fixes the not-closing problem, but Deadline should still be forceful about shutting the application down.
Are you willing to run a test job and post a Slave log? The job report may not be verbose enough, so the Slave log covering the time from when the job starts to when Max just sits around is best. The test job is mostly so there isn’t project-specific data in the log, so feel free to send whatever you’re comfortable with.
Docs on grabbing a Slave log:
docs.thinkboxsoftware.com/produ … /logs.html
Hi Edwin,
I’ll set up a test job tomorrow and post the log.
Cheers,
Dave
Any luck? I think we’ve seen this happen with another app and the job report is going to be especially helpful.
Sorry Edwin, work got in the way.
I’ll try and sort out log files for you.
Cheers,
Dave
Sure thing. It’s not urgent but it will be appreciated.
I must have changed something somewhere, because I can’t reproduce it any more.
It was happening all the time when I originally posted.
Now we are having other problems, but I’ll post them in a separate thread.
Dave
No worries man. We did some digging on our side, and it looks like if you’re running Windows 7 or thereabouts, there is a limitation on how we group processes together that might prevent us from closing applications. You’d also need to run the Slave through something that sets the Windows Job group (not related to Deadline’s jobs), so it’s unlikely you hit that one.
Thanks Edwin.
We’re running Windows 10, and last week this started happening again.
Unfortunately I was on vacation at the time and no-one thought about grabbing log files.
When I’m back in the office next week I’ll start digging around for anything that might be useful to you.
Dave
I think that’s alright. We found a few places where the Sandbox’s Python process wasn’t being disposed of in all cases, and we have some code to fix it. In testing, the fix cleans things up on Windows (which covers your case) but not yet on Linux.
That code hasn’t been merged in yet, so I’m not sure when it will be publicly available.
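To illustrate what "disposed of in all cases" means in general terms, here is a rough sketch of the pattern. It is not Deadline's code, and `managed_child` is a hypothetical helper name. The idea is simply that the child process is ended whether the work finishes normally or fails with an exception.

```python
import subprocess
import sys
from contextlib import contextmanager


@contextmanager
def managed_child(args, grace_seconds=5):
    """Start a child process and make sure it is gone when the block exits."""
    proc = subprocess.Popen(args)
    try:
        yield proc
    finally:
        if proc.poll() is None:  # child is still running
            proc.terminate()     # ask it to stop first
            try:
                proc.wait(timeout=grace_seconds)
            except subprocess.TimeoutExpired:
                proc.kill()      # force it if it ignores the request
                proc.wait()


if __name__ == "__main__":
    # A stand-in for a long-running render process.
    sleeper = [sys.executable, "-c", "import time; time.sleep(60)"]
    try:
        with managed_child(sleeper) as child:
            raise RuntimeError("task failed")  # simulate a failed task
    except RuntimeError:
        pass
    print("child still running:", child.poll() is None)
```

The `finally` clause is what covers the failed-task path, which is the case where Max was being left behind in this thread.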