We have just started setting up Deadline and are running into the errors below when trying to run a command-line test job.
2023-03-06 21:50:44: 0: Failed to properly create Deadline Worker data folder 'Thinkbox\Deadline10\workers' because: The SlaveDataRoot path in the deadline.ini file isn't a rooted path. (Deadline.Configuration.DeadlineConfigException)
and
2023-03-06 21:50:44: 0: ERROR: DataController threw an unexpected exception during initialization: FranticX.Database.DatabaseConnectionException: Could not connect to any of the specified Mongo DB servers defined in the "Hostname" parameter of the "settings\connection.ini" file in the root of the Repository.
Mongo DB "settings\connection.ini" info: Hostname=fpdeadline01;192.168.0.107
We are able to telnet to fpdeadline01 27100 from the same worker machine that is giving the above errors. Please find the Mongo DB config file below for your reference:
# MongoDB config file
systemLog:
  destination: file
  # Mongo DB's output will be logged here.
  path: C:\DeadlineDatabase10\mongo\data\logs\log.txt
  # Default to quiet mode to limit log output size. Set to 'false' when debugging.
  quiet: true
  # Increase verbosity level for more debug messages (max: 5)
  verbosity: 0
net:
  # Port MongoDB will listen on for incoming connections
  port: 27100
  ipv6: true
  ssl:
    # SSL/TLS options
    mode: disabled
    # If enabling TLS, the below options need to be set:
    #PEMKeyFile:
    #CAFile:
  # By default mongo will only use localhost, this will allow us to use the IP Address
  bindIpAll: true
storage:
  # Database files will be stored here
  dbPath: C:\DeadlineDatabase10\mongo\data
  engine: wiredTiger
security:
  authorization: disabled
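For reference, the equivalent of the telnet check mentioned above, run from PowerShell on the worker, looks roughly like this and should report TcpTestSucceeded : True:
Test-NetConnection fpdeadline01 -Port 27100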
Any thoughts or pointers to fix these will be really helpful.
Hey, I'm very new to Deadline too, but I'll take a guess at your issue regarding the SlaveDataRoot. It sounds like the SlaveDataRoot inside the deadline.ini on the given worker doesn't have a root. For example, if the worker is a Windows computer, I believe the path has to start with a drive letter, or with a double backslash if it's a network drive, like C:\Thinkbox\Deadline10 or \\networkDrive\Thinkbox\Deadline10.
Thanks @Mads_Hangaard, while your point is correct, the deadline.ini (C:\ProgramData\Thinkbox\Deadline10) for our Deadline client machines looks like this:
where SlaveDataRoot= is empty by default, and we would therefore expect the defaults to work correctly, or at least to get an error message that is more specific about which default location the write attempt is made to and what fixes would get this working.
Does that make sense to you? Looking forward to your thoughts.
I see, then I don't know why the error is happening. But I can tell you that when we set up Deadline, the default was C:\Users\USER\AppData\Local\Thinkbox\Deadline10. I'm sorry I couldn't be of better assistance.
Not a problem @Mads_Hangaard, we also tried running this job with the below change in deadline.ini:
SlaveDataRoot=%AppData%\Thinkbox\Deadline10
and then restarted the Deadline 10 Launcher Service and re-ran the Command Line test task; however, that again resulted in the same errors as below:
2023-03-08 18:46:55: 0: Failed to properly create Deadline Worker data folder 'Thinkbox\Deadline10\workers' because: The SlaveDataRoot path in the deadline.ini file isn't a rooted path. (Deadline.Configuration.DeadlineConfigException)
2023-03-08 18:46:55: 0: ERROR: DataController threw an unexpected exception during initialization: FranticX.Database.DatabaseConnectionException: Could not connect to any of the specified Mongo DB servers defined in the "Hostname" parameter of the "settings\connection.ini" file in the root of the Repository.
To anyone else on these forums: would you recommend any other tests/checks to further narrow down this issue?
Additionally, we also tried again with the C:\ProgramData\Thinkbox\Deadline10\deadline.ini modified with:
SlaveDataRoot=C:\LocalSlaveData
where C:\LocalSlaveData has permissions for Everyone to Modify, Read & Execute, List folder contents, Read and Write, and the contents of C:\LocalSlaveData after the failed job execution are:
Does your Deadline repo's settings/connection.ini file (e.g. EnableSSL=False and Authenticate=False) match the mongo config.conf?
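For reference, I would expect the relevant entries in settings\connection.ini to look something like this to match that mongo config (hostname taken from your earlier post; treat this as a sketch, not the full file):
Hostname=fpdeadline01;192.168.0.107
EnableSSL=False
Authenticate=False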
Not sure if this affects anything, but assuming you're on Windows, can you re-write the NetworkRoot and NetworkRoot0 with backslashes (not forward slashes), e.g.: \\fpdeadline01\DeadlineRepository10
I believe the settings\connection.ini file is getting its root path from the NetworkRoot
From the docs, the Windows path location for the worker is: %PROGRAMDATA%\Thinkbox\Deadline[VERSION]\workers\[WORKERNAME]
Does %PROGRAMDATA% expand/resolve correctly on your render node? e.g.:
PS C:\Users\deadline> dir env:programdata
Name                           Value
----                           -----
ProgramData                    C:\ProgramData
For your test of SlaveDataRoot=C:\LocalSlaveData, I think the permissions need to be Full Control (this folder, subfolders, and files), as the worker is creating and deleting the jobsData and plugins folders and files under workers\[WORKERNAME].
Hey, I wanted to add to my response. %APPDATA%, or any path that starts with an environment variable like that, cannot be expanded by the Worker; only Windows expands those variables, the Worker does not.
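If you want to keep using that location, one option is to resolve the variable yourself and put the literal, rooted path into deadline.ini. A rough sketch from PowerShell (the resulting path depends on which user account the Worker actually runs as):
PS C:\Users\deadline> [Environment]::ExpandEnvironmentVariables('%AppData%\Thinkbox\Deadline10')
C:\Users\deadline\AppData\Roaming\Thinkbox\Deadline10
You would then set SlaveDataRoot to that expanded path.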
Additionally, I am trying to run deadlinecommand to get the DatabaseSettings, which seems to work fine for the local (LAN) Deadline Repository. However, the same isn't working for the remote Deadline Repository, which we are able to connect to over the RCS through the Deadline Monitor GUI.
Please refer to the example output below for more details and reference:
"%DEADLINE_PATH%"\deadlinecommand RunCommandForRepository Remote sgtdlrepo:4433;"D:/RCS Certs/RCS Certs/Deadline10RemoteClient.pfx" -GetDatabaseSettings
Warning: This command does not support "RunCommandForRepository" and that option will be ignored.
An error occurred while updating the database settings:
Index was outside the bounds of the array. (System.IndexOutOfRangeException)
GetDatabaseSettings doesn't work when pointed at the RCS. It's reading out what's set in DeadlineRepository10\settings\connection.ini. The command should be smart enough to tell you this in a human-friendly way. I'll get a dev ticket in to improve that.
If you want to confirm database settings, you'll have to run that command on the RCS machine itself, as it'll be using a direct connection to the database.
To test the connection to the database you could instead use something like deadlinecommand -getpools, which will prove both the connection and pulling data from the database.
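For example, using the same quoting style as your earlier command (and, if the RCS connection supports it, the RunCommandForRepository syntax you already used):
"%DEADLINE_PATH%"\deadlinecommand -getpools
"%DEADLINE_PATH%"\deadlinecommand RunCommandForRepository Remote sgtdlrepo:4433;"D:/RCS Certs/RCS Certs/Deadline10RemoteClient.pfx" -getpools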
Sure @Justin_B, got that, and that's very useful info.
My apologies if I should be posting this under a different topic or thread, but since I am new both to Deadline itself and to these forums, a few quick questions:
For already submitted jobs, how do we edit their "Job Info Parameters" as well as their "Plugin Info Parameters"?
How can we edit an already submitted Gaffer job to add the "-threads 16" parameter to it? And how can we get the job's execution to pick up and respect that flag in the actual command-line execution?
Can we have the job's "Concurrent Tasks" be a dynamic or programmatically calculated value? For example, for the Deadline-Gaffer job we want to drive this from the job's "-threads <int>" submission parameter, and we are hoping to implement it along the lines below,
where the goal is to send concurrent tasks to Deadline workers with heterogeneous core counts (say a mix of 16, 32, 72 and 128 cores), so that the job's requested rendering thread count determines how many concurrent tasks each worker runs (a rough sketch of this mapping follows the list). With
-threads 16
on a 16-core worker it should assign/run one task
on a 32-core worker it should assign/run two concurrent tasks
on a 72-core worker it should assign/run four concurrent tasks
on a 128-core worker it should assign/run eight concurrent tasks
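To illustrate the arithmetic only, here is a rough sketch of that mapping (not Deadline API code): floor(worker cores / requested threads), clamped to a minimum of one.
$threads = 16
foreach ($cores in 16, 32, 72, 128) {
    # floor(cores / threads), but never fewer than one concurrent task
    $tasks = [math]::Max(1, [math]::Floor($cores / $threads))
    "$cores cores -> $tasks concurrent task(s)"
}
# prints: 16 -> 1, 32 -> 2, 72 -> 4, 128 -> 8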
Can you please post your deadline.ini file? I think in the other thread/post you have an entry for ConnectionType=Repository, but you also have ProxyRoot=fpdeadline01:8080.
Maybe someone from Thinkbox can answer, but I think this may be where your worker is getting confused. If the ConnectionType is Repository or Direct, it will use NetworkRoot. But if the ConnectionType is Remote, it will use ProxyRoot (and the other Proxy* keys). Not sure if having both types of entries is the issue, since one would expect it to ignore the Proxy keys if the ConnectionType is set to Repository.
For the next test, I suggest that you only have one or the other (i.e. Repository or Remote type entries).
Can you try removing the Proxy* and ClientSSLAuthentication entries?
e.g.:
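Something along these lines, reusing the paths you have already posted (just a sketch of the relevant keys; keep whatever else you need):
ConnectionType=Repository
NetworkRoot=\\fpdeadline01\DeadlineRepository10
NetworkRoot0=\\fpdeadline01\DeadlineRepository10
SlaveDataRoot=C:\LocalSlaveData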
And just to double-check, your worker can reach the repo using the path \\fpdeadline01\DeadlineRepository10?
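For example, from a PowerShell prompt on the worker (a quick reachability check against that share path):
PS C:\Users\deadline> Test-Path \\fpdeadline01\DeadlineRepository10\settings\connection.ini
True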
For the SlaveDataRoot, you did not show the Full Control permission checkbox, which is just above Modify. Can you click on Advanced and double-check that the permission entries are:
Type: Allow
Principal: Everyone
Access: Full Control
Inherited from: None
Applies to: This folder, subfolders, and files
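If it is easier to set from the command line, something like this from an elevated prompt grants an equivalent ACL (a sketch; please review it against your own security policy first):
icacls C:\LocalSlaveData /grant "Everyone:(OI)(CI)F" /T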
Sure @jarak, please note that we are now working with two Deadline Repositories/Servers: one is local/LAN and the other is remote over RCS. Attaching the latest deadline.ini from C:\ProgramData\Thinkbox\Deadline10.
Yes, you are right, but from what I can tell, we have set the default repository from the Deadline Monitor GUI as:
Also, we will be working with both the DirectConnection (local) and RCS (remote) repositories, that is, on-premises and cloud Deadline repositories, so it is necessary for us to get things working with both. Any thoughts on fixing or setting things up for our use case would be very helpful.
Additionally, I am noticing the below connections from the Deadline Worker to the Deadline Repository ports:
netstat -a | find "fprdsk113"
TCP 192.168.0.117:445 fprdsk113:50214 ESTABLISHED
TCP 192.168.0.117:3389 fprdsk113:59640 ESTABLISHED
TCP 192.168.0.117:27100 fprdsk113:49739 ESTABLISHED
TCP 192.168.0.117:27100 fprdsk113:49740 ESTABLISHED
TCP 192.168.0.117:27100 fprdsk113:49741 ESTABLISHED
TCP 192.168.0.117:27100 fprdsk113:49742 ESTABLISHED
TCP 192.168.0.117:27100 fprdsk113:49796 ESTABLISHED
TCP 192.168.0.117:27100 fprdsk113:49797 ESTABLISHED
TCP 192.168.0.117:27100 fprdsk113:49798 ESTABLISHED
TCP 192.168.0.117:27100 fprdsk113:49799 ESTABLISHED
TCP 192.168.0.117:27100 fprdsk113:50441 ESTABLISHED
TCP 192.168.0.117:27100 fprdsk113:50442 ESTABLISHED
TCP 192.168.0.117:27100 fprdsk113:50443 ESTABLISHED
TCP 192.168.0.117:27100 fprdsk113:50444 ESTABLISHED
TCP 192.168.0.117:27100 fprdsk113:50448 ESTABLISHED
TCP 192.168.0.117:27100 fprdsk113:50449 ESTABLISHED
TCP 192.168.0.117:27100 fprdsk113:50450 ESTABLISHED
TCP 192.168.0.117:27100 fprdsk113:50451 ESTABLISHED
TCP 192.168.0.117:27100 fprdsk113:57177 ESTABLISHED
TCP 192.168.0.117:27100 fprdsk113:57178 ESTABLISHED
TCP 192.168.0.117:27100 fprdsk113:57179 ESTABLISHED
TCP 192.168.0.117:27100 fprdsk113:57180 ESTABLISHED
TCP 192.168.0.117:27100 fprdsk113:58965 ESTABLISHED
TCP 192.168.0.117:27100 fprdsk113:58983 ESTABLISHED
TCP 192.168.0.117:27100 fprdsk113:59261 ESTABLISHED
TCP 192.168.0.117:27100 fprdsk113:59380 ESTABLISHED
TCP 192.168.0.117:27100 fprdsk113:59417 ESTABLISHED
Sure, thanks for pointing that out @jarak. I have applied the permission fixes, rebooted the Deadline worker node, and tried running the job again, but no luck! I am sure something is going wrong on our side, but I am not sure how to narrow it down and fix it.