We are using a custom Nuke-to-Deadline submitter based on the SubmitNukeToDeadline.py script
that came with the Deadline 8.1 repository (repo/submission/Nuke/Main/SubmitNukeToDeadline.py).
Our script works nicely most of the time, but sometimes it gets stuck (and makes Nuke unresponsive).
We narrowed it down to the CallDeadlineCommand function, which is exactly the same as in the original script.
Timestamps printed at the start and end of one call show a lag of more than a minute:
STARTING CallDeadlineCommand 16:04:20.247962
FINISHING CallDeadlineCommand 16:05:36.172592
Our render farm is very simple:
two Macs (OS X 10.8 and OS X 10.10) connected via a 10GbE switch, one of which is connected via Thunderbolt to the RAID that holds the repository.
For such a simple configuration I don’t think there should be this much delay between calls, and I’m worried it will get much slower if we expand the farm.
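For anyone who wants to collect the same kind of timing without editing the function itself, here is a minimal sketch of a helper that times a single call. The name `time_call` is made up for this post and is not part of the Deadline scripts.

```python
import time


def time_call(func, *args, **kwargs):
    """Call func(*args, **kwargs) and return (result, elapsed_seconds).

    Hypothetical helper for illustration only; it is not part of the
    Deadline submission scripts.
    """
    start = time.time()
    result = func(*args, **kwargs)
    elapsed = time.time() - start
    return result, elapsed


if __name__ == "__main__":
    # Stand-in workload; in the submitter this would be the function
    # whose duration is being measured.
    result, elapsed = time_call(sorted, [3, 1, 2])
    print("call took %.6f s" % elapsed)
```

In our submitter the call would look something like `time_call(CallDeadlineCommand, list(args))`, passing a copy of the argument list because CallDeadlineCommand inserts the executable path into the list it receives.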
What was the exact command that was taking this long? You mentioned it was the same as in the original script, but I’m not sure which command exactly you are referring to.
Was it for getting pools/groups, the repository path, or dependencies?
We have found and fixed a few issues with the ‘getrepositorypath’ directive for DeadlineCommand, and those fixes should be in the next build that we post. If that’s not what was being called, let me know and we’ll have a look at it!
import os
import subprocess

def CallDeadlineCommand( arguments, hideWindow=True ):
    # On OSX, we look for the DEADLINE_PATH file. On other platforms, we use the environment variable.
    # print "STARTING CallDeadlineCommand " + str ( datetime.datetime.now().time() )
    if os.path.exists( "/Users/Shared/Thinkbox/DEADLINE_PATH" ):
        with open( "/Users/Shared/Thinkbox/DEADLINE_PATH" ) as f: deadlineBin = f.read().strip()
        deadlineCommand = deadlineBin + "/deadlinecommand"
    else:
        deadlineBin = os.environ['DEADLINE_PATH']
        if os.name == 'nt':
            deadlineCommand = deadlineBin + "\\deadlinecommand.exe"
        else:
            deadlineCommand = deadlineBin + "/deadlinecommand"

    startupinfo = None
    if hideWindow and os.name == 'nt':
        # Python 2.6 has subprocess.STARTF_USESHOWWINDOW, and Python 2.7 has subprocess._subprocess.STARTF_USESHOWWINDOW, so check for both.
        if hasattr( subprocess, '_subprocess' ) and hasattr( subprocess._subprocess, 'STARTF_USESHOWWINDOW' ):
            startupinfo = subprocess.STARTUPINFO()
            startupinfo.dwFlags |= subprocess._subprocess.STARTF_USESHOWWINDOW
        elif hasattr( subprocess, 'STARTF_USESHOWWINDOW' ):
            startupinfo = subprocess.STARTUPINFO()
            startupinfo.dwFlags |= subprocess.STARTF_USESHOWWINDOW

    # Copy the current environment so it can be passed to the child process.
    environment = {}
    for key in os.environ.keys():
        environment[key] = str(os.environ[key])

    # Need to set the PATH, because Windows seems to load DLLs from the PATH earlier than cwd....
    if os.name == 'nt':
        environment['PATH'] = str(deadlineBin + os.pathsep + os.environ['PATH'])

    arguments.insert( 0, deadlineCommand)

    # Specifying PIPE for all handles to workaround a Python bug on Windows. The unused handles are then closed immediately afterwards.
    proc = subprocess.Popen(arguments, cwd=deadlineBin, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, startupinfo=startupinfo, env=environment)
    proc.stdin.close()
    proc.stderr.close()

    # Blocks until deadlinecommand closes its stdout, i.e. until it exits.
    output = proc.stdout.read()
    # print "FINISHING CallDeadlineCommand " + str ( datetime.datetime.now().time() )
    return output
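Since `proc.stdout.read()` blocks until deadlinecommand exits, a stalled command freezes Nuke for as long as the stall lasts. One way to bound that wait is to read the output in a worker thread and kill the process after a deadline. This is only a sketch of the idea, not something from the shipped script: `run_with_timeout` is a hypothetical name, and it leaves out the Windows `startupinfo` and environment handling shown above.

```python
import subprocess
import threading


def run_with_timeout(command, timeout_seconds):
    """Run command and return (stdout_bytes, timed_out).

    Hypothetical helper: if the process has not finished within
    timeout_seconds it is killed and timed_out is True.
    """
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    result = {}

    def reader():
        # communicate() drains stdout and stderr together, so a full
        # stderr buffer cannot block the child process.
        result["stdout"], result["stderr"] = proc.communicate()

    worker = threading.Thread(target=reader)
    worker.daemon = True
    worker.start()
    worker.join(timeout_seconds)

    timed_out = worker.is_alive()
    if timed_out:
        # Killing the process closes its pipes, which lets communicate() return.
        proc.kill()
        worker.join()
    return result.get("stdout", b""), timed_out
```

The caller still waits for up to `timeout_seconds`, so this only limits how long Nuke can hang. It does not make the underlying call any faster.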
Gotcha. One thing we have seen in the past on the Mac is that if a Database hostname is configured that the Mac cannot resolve, DeadlineCommand gets stuck trying to connect to the Mongo database until the request times out.
The easiest way to test this is to check the “settings/connection.ini” file in your Deadline Repository. The Hostname entry probably looks something like this:
Hostname=mobile-034;10.10.1.125
If all your machines can reach your database machine using the same IP, I would remove anything that’s not that IP. If not, you can try adding any hostname that the Mac can’t resolve to the Mac’s local hosts file, to see if this is at least the problem I think it is.
We’re still trying to find a resolution for this, but as far as we can tell it is due to a Mono bug, and we have yet to find a way to work around it other than ensuring that every hostname in the list is reachable from the Mac. We might have to look at some kind of local override for the DB settings, in order to work around this kind of issue in a cleaner way.
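To check quickly which entries the Mac can resolve, something along these lines could be run on that machine. It is a rough sketch rather than a Deadline tool: `parse_hostnames` and `resolve_all` are made-up names, and the parsing assumes only what the example above shows, namely a `Hostname=` line whose values are separated by semicolons.

```python
import socket


def parse_hostnames(ini_text):
    """Return the entries of the first 'Hostname=' line, split on ';'.

    Assumes a plain key=value line like the example in this thread;
    the rest of the file is ignored.
    """
    for line in ini_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "Hostname":
            return [h.strip() for h in value.split(";") if h.strip()]
    return []


def resolve_all(hostnames, resolver=socket.gethostbyname):
    """Map each hostname to its resolved address, or None if lookup fails."""
    results = {}
    for host in hostnames:
        try:
            results[host] = resolver(host)
        except socket.error:
            # Covers socket.gaierror, raised for names that cannot be resolved.
            results[host] = None
    return results


if __name__ == "__main__":
    sample = "Hostname=mobile-034;10.10.1.125"
    for host, address in resolve_all(parse_hostnames(sample)).items():
        print("%s -> %s" % (host, address or "NOT RESOLVABLE"))
```

Any entry reported as not resolvable is a candidate for removal from the Hostname list or for an entry in the hosts file. Note that the lookup itself can take a while to fail for an unresolvable name.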