We updated our whole Deadline setup (server and clients) to 10.3.2.1, and after that Deadline Monitor started to crash randomly. The interesting part is that this only happens to clients connecting to the server via an RCS. Clients connecting directly to the server are not affected. This is the Monitor output right before a crash:
Time to initialize: 65.000 ms
Auto Configuration: No auto configuration for Repository Path could be detected, using local configuration
Connecting to Deadline RCS 10.3 [v10.3.2.1 Release (1a66fe40f)]
Auto Configuration: Picking configuration based on: xx-xxx-xxxxxx / 10.100.148.247
Auto Configuration: No auto configuration could be detected, using local configuration
Time to connect to Repository: 1.107 s
Time to check user account: 10.000 ms
Time to purge old logs and temp files: 3.000 ms
Time to synchronize plugin icons: 292.000 ms
Time to initialize main window: 397.000 ms
Main Window shown
Python 3.10.13 | packaged by Thinkbox Software | (main, Mar 9 2024, 00:36:53) [GCC 4.8.5 20150623 (Red Hat 4.8.5-44)]
Time to show main window: 39.000 ms
malloc(): unaligned tcache chunk detected
I found a similar issue happening to someone, but that was on Deadline 10.2 and there’s not much info in that thread.
In our case, we are running Rocky 9.4. The build that works is exactly the same as the one that’s failing. The only difference is how Monitor is connecting to the server.
There are some known issues with Rocky 8.10 / 9.4:
https://issues.redhat.com/browse/RHEL-39415
Houdini and Arnold have fixes out for this. Not sure if this is the same error, but I’ve noticed flaky connection issues with 9.4 too: lots of error messages, but jobs still get processed.
Are you running RCS on 9.4 too?
Hey Anthony, thanks for your suggestions but the issue is also happening on 9.3…
We had issues with Houdini + Arnold, but on 19.5, not 20, and those are fixed now.
Our RCS is running on 9.3. I see a bunch of the same error in the logs:
2024-07-01 10:07:01: Startup Directory: "/opt/Thinkbox/Deadline10/bin"
2024-07-01 10:07:01: Process Priority: BelowNormal
2024-07-01 10:07:01: Process Affinity: default
2024-07-01 10:07:01: Process is now running
2024-07-01 10:07:04: System.TypeInitializationException: The type initializer for 'Delegates' threw an exception.
2024-07-01 10:07:04: ---> System.DllNotFoundException: Could not load libpython3.10.so with flags RTLD_NOW | RTLD_GLOBAL: libpython3.10.so: cannot open shared object file: No such file or directory
2024-07-01 10:07:04: at Python.Runtime.Platform.PosixLoader.Load(String dllToLoad) in C:\thinkbox-conda\conda-bld\dotnet_pythonnet_1709944764012\work\src\runtime\Native\LibraryLoader.cs:line 61
2024-07-01 10:07:04: at Python.Runtime.Runtime.Delegates.GetUnmanagedDll(String libraryName) in C:\thinkbox-conda\conda-bld\dotnet_pythonnet_1709944764012\work\src\runtime\Runtime.Delegates.cs:line 290
2024-07-01 10:07:04: at Python.Runtime.Runtime.Delegates..cctor() in C:\thinkbox-conda\conda-bld\dotnet_pythonnet_1709944764012\work\src\runtime\Runtime.Delegates.cs:line 16
2024-07-01 10:07:04: --- End of inner exception stack trace ---
2024-07-01 10:07:04: at Python.Runtime.Runtime.Delegates.get_Py_GetVersion() in C:\thinkbox-conda\conda-bld\dotnet_pythonnet_1709944764012\work\src\runtime\Runtime.Delegates.cs:line 341
2024-07-01 10:07:04: at Python.Runtime.Runtime.Py_GetVersion() in C:\thinkbox-conda\conda-bld\dotnet_pythonnet_1709944764012\work\src\runtime\Runtime.cs:line 826
2024-07-01 10:07:04: at Python.Runtime.PythonEngine.get_Version() in C:\thinkbox-conda\conda-bld\dotnet_pythonnet_1709944764012\work\src\runtime\PythonEngine.cs:line 143
2024-07-01 10:07:04: at FranticX.Scripting.PythonNetScriptEngine.Initialize(Boolean setUnbufferedStdioFlag, String home, String programName)
2024-07-01 10:07:04: Skipping pending job scan because it is not required at this time
2024-07-01 10:07:05: Process exit code: 0
What version of Mongo did it/you install? It seems to not like anything over 5, which isn’t supported on 9.x.
The last install I did, I ended up wiping and going back to 8.10 with Mongo 5.
Or is this something else, where it doesn’t clear out the old Python/cache files from the temp directory? Did it reboot after the update? Weird that it’s looking for C:\ on Linux.
Did you submit a ticket to support@?
Setting LD_LIBRARY_PATH to "/opt/Thinkbox/Deadline10/lib/python3/lib/" removed that error message.
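For anyone else hitting the libpython3.10.so error, here is a rough sketch of that workaround. The directory is the one from this post; the variable name DEADLINE_PYLIB is just made up for the example, and where you persist the setting (shell profile, service definition, launcher script) depends on how your Deadline processes are started, so treat this as a starting point rather than the official fix.

```shell
# Directory from the post above; adjust if your install lives elsewhere.
# DEADLINE_PYLIB is a hypothetical helper variable, not something Deadline reads.
DEADLINE_PYLIB="/opt/Thinkbox/Deadline10/lib/python3/lib"

# Prepend the directory while keeping any entries already in LD_LIBRARY_PATH.
export LD_LIBRARY_PATH="${DEADLINE_PYLIB}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"

# Optional sanity check: warn if the library named in the error is not there.
if [ ! -e "${DEADLINE_PYLIB}/libpython3.10.so" ]; then
    echo "warning: libpython3.10.so not found in ${DEADLINE_PYLIB}" >&2
fi
```

The variable only affects processes started from the environment where it is exported, so it has to be set wherever the affected Deadline process is actually launched.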
Our whole setup was upgraded in place from 10.2.xxx to 10.3.2.1.
Maybe I should try a fresh install of everything…
It’s a weird issue, since it only happens to Monitor clients connected through the RCS. It’s as if the connection drops and, when that happens, the Monitor window closes. I think that’s how Monitor behaves if the RCS shuts down, for instance. Or could it be some Qt issue? I don’t know, just guessing here.
And yes, the server was rebooted after the update. MongoDB was first installed with 10.2. I’m not sure whether 10.3 also updated it, but the current DB version is v4.2.12.
As for the C:\ mentions, I believe they just reflect where the application was compiled.
Stumbled across this today: after a CentOS 7 10.2 > 10.3 upgrade, Pulse was no longer launching. The env var above fixed the issue. Thanks!
Ant