AWS Thinkbox Discussion Forums

Permission denied error during job start up

Hi,
we are struggling with the following problem.
Whenever we start a new job, it generates this error:

=======================================================
Error
=======================================================
An error occurred trying to start process '/var/lib/Thinkbox/Deadline10/workers/fx003/plugins/65ae3c71a8399b5ddaa8cefb/pipeline_launch.sh' with working directory '/upp/upptools/workgroups/houdini/sw/linux/hfs20.0.506/bin'. Permission denied (System.ComponentModel.Win32Exception)
   at System.Diagnostics.Process.ForkAndExecProcess(ProcessStartInfo startInfo, String resolvedFilename, String[] argv, String[] envp, String cwd, Boolean setCredentials, UInt32 userId, UInt32 groupId, UInt32[] groups, Int32& stdinFd, Int32& stdoutFd, Int32& stderrFd, Boolean usesTerminal, Boolean throwOnNoExec)
   at System.Diagnostics.Process.StartCore(ProcessStartInfo startInfo)
   at System.Diagnostics.Process.Start()
   at FranticX.Processes.ChildProcess.i(String cc, String cd, String ce)
   at FranticX.Processes.ChildProcess.Launch(String executable, String arguments, String startupDirectory)
   at FranticX.Processes.ManagedProcess.Execute(Boolean waitForExit)
   at Deadline.Plugins.DeadlinePlugin.DoRenderTasks()
   at Deadline.Plugins.PluginWrapper.RenderTasks(Task task, String& outMessage, AbortLevel& abortLevel)
   at Deadline.Plugins.PluginWrapper.RenderTasks(Task task, String& outMessage, AbortLevel& abortLevel)

=======================================================
Type
=======================================================
RenderPluginException

=======================================================
Stack Trace
=======================================================
   at Deadline.Plugins.SandboxedPlugin.d(DeadlineMessage bgt, CancellationToken bgu)
   at Deadline.Plugins.SandboxedPlugin.RenderTask(Task task, CancellationToken cancellationToken)
   at Deadline.Slaves.SlaveRenderThread.c(TaskLogWriter ajy, CancellationToken ajz)

=======================================================
Log
=======================================================
We’re using the AlterCommandLine technique, which runs a pipeline_launch.sh shell script that sets the environment variables needed for rendering and then runs the actual plugin’s render script. We have tracked the error down to a missing executable flag on pipeline_launch.sh, which somehow gets set later (we are not sure what actually does this). As a result, several slaves report this error until the permission is set, after which rendering continues fine. So the question is: how can we delay the start of pipeline_launch.sh until after it is synced to the slave’s local disk and its permissions are set? We’ve been working around this by setting the executable flag ourselves.
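Our workaround looks roughly like this in JobPreload.py (a sketch; `ensure_executable` is just an illustrative helper name, not part of the Deadline API):

```python
import os
import stat

def ensure_executable(path):
    """Add the user/group execute bits to a file if they are missing.

    Hedged workaround sketch: called on pipeline_launch.sh from
    JobPreload.py before the render process is launched.
    """
    mode = os.stat(path).st_mode
    if not mode & stat.S_IXUSR:
        os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP)
```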

Debug from JobPreload.py

2024-01-23 13:51:49:  0: PYTHON: Executable Exists: pipeline_launch.sh True
2024-01-23 13:51:49:  0: PYTHON: File Flags: 33188
2024-01-23 13:51:49:  0: PYTHON: Is Readable: True
2024-01-23 13:51:49:  0: PYTHON: Is Writable: True
2024-01-23 13:51:49:  0: PYTHON: Is Executable: False

and then when the job is being rendered:

2024-01-23 13:51:56:  0: PYTHON: -------------------------------
2024-01-23 13:51:56:  0: PYTHON: Executable Exists: pipeline_launch.sh True
2024-01-23 13:51:56:  0: PYTHON: File Flags: 33216
2024-01-23 13:51:56:  0: PYTHON: Is Readable: True
2024-01-23 13:51:56:  0: PYTHON: Is Writable: True
2024-01-23 13:51:56:  0: PYTHON: Is Executable: True
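For reference, the checks above can be reproduced with something like this (a sketch of the debug code, not the exact script; note the flag values are plain `st_mode` integers, so 33188 is octal 100644, i.e. rw-r--r--, and 33216 is octal 100700, i.e. rwx------):

```python
import os

def log_exec_state(path):
    """Print the same file checks shown in the JobPreload.py debug output."""
    st = os.stat(path)
    print("Executable Exists: %s %s" % (os.path.basename(path), os.path.exists(path)))
    print("File Flags: %d (octal %o)" % (st.st_mode, st.st_mode))
    print("Is Readable: %s" % os.access(path, os.R_OK))
    print("Is Writable: %s" % os.access(path, os.W_OK))
    print("Is Executable: %s" % os.access(path, os.X_OK))
```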

Thanks

Which version of Deadline is this? The file copy should be completed before the AlterCommandLine hook happens, and I’d expect the file properties to be retained, but maybe the file copy we’re doing isn’t quite right?

It is 10.3.0.13 (7883f0093).

Silly question - does that file have the executable flag set wherever it’s stored? From what I’m seeing it should be preserving it.

I was going to suggest correcting the flag with os.chmod, but if the user running the Worker had permission to set that flag it likely also would have permission to execute it.

The thing is that our repository is on Windows storage, whereas the slaves are on Linux. The repository is mounted on the slaves via NFS, so I’m not totally sure how the permission mapping works in this case. But what I’m seeing on the slaves is that the file has group rwx attributes set, and the deadline user belongs to that group.

Hmm. I do think Justin’s onto something with the permissions. I’m surprised he didn’t give a test case, but you should be able to stop the Worker (or disable it from the Monitor) before it moves onto another job that would clear the temporary location.

Then just log onto that machine, switch to the Worker’s user with sudo su <username>, and try running it:

/var/lib/Thinkbox/Deadline10/workers/fx003/plugins/65ae3c71a8399b5ddaa8cefb/pipeline_launch.sh

Does pipeline_launch.sh change anything depending on the project or job? If not, you could drop it in “/etc/profile.d/” and chmod +x it. Alternatively, if you want more context on the job, you could implement it in Python via path mapping:

https://docs.thinkboxsoftware.com/products/deadline/10.3/1_User%20Manual/manual/environment.html#path-mapping

Because you can grab the job object there, it can do more magic. If you legitimately need to modify the render arguments, then you’ll have to have the script change its own permissions, as Justin mentioned.

Hope that helps!

Thanks for the advice, I’ll give that a try. The purpose of pipeline_launch.sh is to set all the environment variables related to a specific version of the DCC app and renderer. We found this method more versatile than setting the job’s environment at submission time.

Ah, makes sense! Then permissions are going to be tricky… The path mapping example might be best: it grabs the job object in Python, and then you can use GetJobPluginInfoKeyValue with (“Version”) as its parameter. You can get the plugin name with GetJobInfoKeyValue and (“Plugin”) as its parameter.

It feels a bit weird having both a Python solution and a shell script version, but having the script add its own execute bit can also do it if you’re up for that.
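If you go the Python route, a minimal sketch of a GlobalJobPreLoad.py hook could look like this (hedged: the plugin/version-to-path mapping in `build_env` is purely illustrative, with the Houdini install path assumed from the error message earlier in the thread; `GetJob`, `GetJobInfoKeyValue`, `GetJobPluginInfoKeyValue`, and `SetProcessEnvironmentVariable` are real Deadline scripting calls):

```python
def build_env(plugin_name, version):
    """Map a plugin name and version to environment variables.

    Illustrative only: the HFS path below mirrors the install location
    seen in the error message; adapt it to your own pipeline layout.
    """
    env = {}
    if plugin_name == "Houdini":
        env["HFS"] = "/upp/upptools/workgroups/houdini/sw/linux/hfs%s" % version
    return env

def __main__(deadlinePlugin):
    # Deadline calls this hook before the render process starts.
    job = deadlinePlugin.GetJob()
    plugin_name = job.GetJobInfoKeyValue("Plugin")
    version = job.GetJobPluginInfoKeyValue("Version")
    for key, value in build_env(plugin_name, version).items():
        deadlinePlugin.SetProcessEnvironmentVariable(key, value)
```

This keeps the version-specific environment logic in one pure function, so it is easy to inspect without a running Worker.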
