Deadline 6.0 repository on Nexenta and MongoDB on Linux

Hi!

Took a bold step and started updating from Deadline 5.2 to 6.0.

The previous install had the whole repository on a Nexenta server and the license server on Debian Linux. The render nodes also run Debian Linux. Workstations are Windows 7.

As I understand it, because of the new database structure, MongoDB needs to be installed on a machine with a regular operating system (?), so I installed MongoDB on my license server (since it’s always up and running) and the repository on the Nexenta file server.

For starters I’ve got everything installed on one Linux node, but its slave doesn’t show up in my workstation’s Deadline Monitor. If I fire up a slave on my workstation, that one does show up in the Monitor.

“deadlineslave” shows up frequently as a running process on the node, and the license server seems to be working as well.

I just don’t know where to start looking for a problem or a log to troubleshoot.
I noticed a “Sharing the repository folder” section in the documentation, but since the Nexenta share is mounted on all of the nodes, that step shouldn’t be necessary, right?

Thanks for all of your help. I have a deadline :) getting nearer and really need to get Maya 2014 working on the farm soon.

Teemu

Hello Teemu,

I just wanted to verify that the machine hosting the database has port 27017 open, as the database needs that for connections. If that is fine, a great start would be to send over the slave log from that slave machine so we can take a look at what it is reporting. Often this can be found by right-clicking the launcher (on Windows) and choosing ‘explore log folder’. Hope that helps.

Cheers,
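
For anyone following along, one quick way to verify that port from a render node or workstation is to try opening a TCP connection to it. The following is a minimal Python sketch under that assumption, not something shipped with Deadline. The function name port_is_open is made up for illustration, and the host you pass in is whatever address your database machine has.

```python
import socket

def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False

if __name__ == "__main__":
    # "localhost" is only a placeholder; use the database machine's address instead.
    print("27017 reachable:", port_is_open("localhost", 27017))
```

A result of False only says the connection attempt failed from the machine where the script ran. It does not tell you whether a firewall, a stopped mongod, or a wrong address is to blame.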

Hi, got it working, but it was actually a much simpler problem than the port.

I didn’t know whether the port was open, so I ran these anyway on the Debian machine where MongoDB is installed:

iptables -I INPUT -p tcp --dport 27017 --syn -j ACCEPT
iptables-save
reboot

iptables -I INPUT -p udp --dport 12345 -j ACCEPT
iptables-save
reboot

and then I figured out how to run netstat -pln:

Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:60269           0.0.0.0:*               LISTEN      1766/thinkbox
tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN      973/portmap
tcp        0      0 0.0.0.0:60880           0.0.0.0:*               LISTEN      1764/frantic
tcp        0      0 0.0.0.0:28017           0.0.0.0:*               LISTEN      1151/mongod
tcp        0      0 0.0.0.0:7412            0.0.0.0:*               LISTEN      1758/xinetd
tcp        0      0 0.0.0.0:7413            0.0.0.0:*               LISTEN      1758/xinetd
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      1728/sshd
tcp        0      0 0.0.0.0:27000           0.0.0.0:*               LISTEN      1578/lmgrd.foundry
tcp        0      0 0.0.0.0:27001           0.0.0.0:*               LISTEN      1762/lmgrd
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      1534/exim4
tcp        0      0 0.0.0.0:47487           0.0.0.0:*               LISTEN      1709/rlm.foundry
tcp        0      0 127.0.0.1:959           0.0.0.0:*               LISTEN      1207/famd
tcp        0      0 0.0.0.0:2080            0.0.0.0:*               LISTEN      1765/adskflex
tcp        0      0 0.0.0.0:59043           0.0.0.0:*               LISTEN      1580/foundry
tcp        0      0 0.0.0.0:4101            0.0.0.0:*               LISTEN      1649/rlm.foundry
tcp        0      0 0.0.0.0:4102            0.0.0.0:*               LISTEN      1649/rlm.foundry
tcp        0      0 0.0.0.0:27017           0.0.0.0:*               LISTEN      1151/mongod
tcp6       0      0 :::22                   :::*                    LISTEN      1728/sshd
tcp6       0      0 ::1:25                  :::*                    LISTEN      1534/exim4
tcp6       0      0 :::44410                :::*                    LISTEN      1836/mono
tcp6       0      0 :::445                  :::*                    LISTEN      1246/smbd
tcp6       0      0 :::139                  :::*                    LISTEN      1246/smbd
udp        0      0 0.0.0.0:111             0.0.0.0:*                           973/portmap
udp        0      0 0.0.0.0:27000           0.0.0.0:*                           1766/thinkbox
udp        0      0 10.123.123.255:137      0.0.0.0:*                           1213/nmbd
udp        0      0 10.123.123.201:137      0.0.0.0:*                           1213/nmbd
udp        0      0 0.0.0.0:137             0.0.0.0:*                           1213/nmbd
udp        0      0 10.123.123.255:138      0.0.0.0:*                           1213/nmbd
udp        0      0 10.123.123.201:138      0.0.0.0:*                           1213/nmbd
udp        0      0 0.0.0.0:138             0.0.0.0:*                           1213/nmbd
Active UNIX domain sockets (only servers)
Proto RefCnt Flags       Type       State         I-Node   PID/Program name    Path
unix  2      [ ACC ]     STREAM     LISTENING     6172     1151/mongod         /tmp/mongodb-27017.sock
unix  2      [ ACC ]     STREAM     LISTENING     4530     1222/acpid          /var/run/acpid.socket
unix  2      [ ACC ]     STREAM     LISTENING     4649     1280/dbus-daemon    /var/run/dbus/system_bus_socket
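
A listing this long is easy to misread, so here is a small Python sketch that pulls the TCP listening ports for one program out of netstat -pln output. It assumes the column layout shown above (protocol, two queue counters, local address, foreign address, state, PID/program). The function name listening_ports is hypothetical, and the sample rows are copied from the output above.

```python
def listening_ports(netstat_output, program):
    """Return the sorted TCP ports on which `program` is in the LISTEN state."""
    ports = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: proto, recv-q, send-q, local, foreign, state, pid/program
        if len(fields) < 7 or not fields[0].startswith("tcp") or fields[5] != "LISTEN":
            continue
        # "1151/mongod" -> "mongod"
        if fields[6].split("/", 1)[-1] != program:
            continue
        # "0.0.0.0:27017" or ":::22" -> the part after the last colon is the port
        ports.add(int(fields[3].rsplit(":", 1)[-1]))
    return sorted(ports)

sample = """\
tcp        0      0 0.0.0.0:28017           0.0.0.0:*               LISTEN      1151/mongod
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      1728/sshd
tcp        0      0 0.0.0.0:27017           0.0.0.0:*               LISTEN      1151/mongod
tcp6       0      0 :::22                   :::*                    LISTEN      1728/sshd
"""

print(listening_ports(sample, "mongod"))  # [27017, 28017]
```

For the listing above this confirms that mongod is listening on 27017 on all interfaces (0.0.0.0), at least at the moment netstat was run.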

But still the slave didn’t appear on the Monitor.

So I went to look for the slave’s logs, and to my surprise there were none!
What had happened was that I thought the installer would overwrite the previous Deadline 5.2 slave, but of course it didn’t :)
So I uninstalled that and ran the slave from the Deadline6/bin folder:
deadlineslave -nogui

  • added a line to /etc/rc.local

Works now! :)

Ok, one problem still.

First of all, I used to fire up the Deadline slaves during boot from /etc/rc.local, but in this version that didn’t work. I installed the clients on the render nodes as daemons instead (was this possible in previous versions?).

So all the nodes render at the moment except one.
The render node that also has MongoDB installed does not start deadlineslave as a daemon.

It only works if I run /opt/Thinkbox/Deadline6/bin/deadlineslave -nogui manually, and it does not come back after I reboot the machine.

If I run top on that machine, I can see it trying to start deadlineLauncher every few seconds, but deadlineslave doesn’t flash by there like it does on the other nodes.

Is there a log somewhere where I could see what goes wrong in the deadlineLauncher application? Could it be interfering with MongoDB since it’s on the same machine?


Also, there is some interesting behavior when I try to view a Task report. It gives me this:
Error occurred while writing report log: Destination directory not found: /mnt/nopia/farm/reports/jobs/53/b (System.IO.DirectoryNotFoundException)

Hello Teemu,

So there seem to be a few issues here. First, if you had a 5.2 slave connecting to a 6.0 repo, this would explain your missing log files. As 6.0 saves files differently than 5.2, a 5.2 slave’s cleanup would have wiped out all the 6.0 log folders. This is why we say you should install your repo and slaves to an entirely new location in 6.0. Reinstalling the repository will allow your repo to regain the proper folder structure.

As for running the slave as a Daemon, that is something specified on install. If you reinstall the slave there should be an option to launch as daemon, and if you select that it will set the slave up to launch automatically. I do believe this is a new option for 6.0. Can you send over a deadlinelauncher log file and we can see if there is anything useful there? Thanks.

Cheers,

Thanks a lot Dwight,

I actually figured out the 5.2 to 6.0 jump from another post somewhere here, and had already reinstalled everything, with the repo in a different folder.

Now everything works (after a lot of fighting with Maya-related problems, though), except for that one node which has MongoDB on it and which still has to be fired up manually.

  • Where exactly could I find a deadlinelauncher log on that node? I’ve been searching for it, but I’m lost in an endless black Linux terminal.

Hello,

Looking at the virtual Ubuntu setup I have here, it looks like it is in /var/log/Thinkbox/Deadline6/. If you have a GUI, you can also open the Monitor, choose help, then explore log folder to find it.

Cheers,

Hi, I found it.

Here are the contents :

/var/log/Thinkbox/Deadline6# cat deadlinelauncher-merry-2013-11-15-0001.log
2013-11-15 09:43:10: BEGIN - merry\root
2013-11-15 09:43:10: Deadline Launcher 6.0 [v6.0.0.51561 R]
2013-11-15 09:43:11: Error occurred while updating network settings: An error occurred while trying to connect to the Database (10.123.123.201:27017). It is possible that the Mongo Database server is incorrectly configured, currently offline, or experiencing network issues.
2013-11-15 09:43:11: Full error: Unable to connect to server 10.123.123.201:27017: Connection refused. (FranticX.Database.DatabaseConnectionException)
2013-11-15 09:43:11: Launching Slave:
2013-11-15 09:43:11: Launcher Thread - Launcher thread initializing...
2013-11-15 09:43:11: Error occurred while updating network settings: An error occurred while trying to connect to the Database (10.123.123.201:27017). It is possible that the Mongo Database server is incorrectly configured, currently offline, or experiencing network issues.
2013-11-15 09:43:11: Full error: Unable to connect to server 10.123.123.201:27017: Connection refused. (FranticX.Database.DatabaseConnectionException)
2013-11-15 09:43:11: Launcher Thread - Remote administration is disabled
2013-11-15 09:43:11: Launcher Thread - Launcher thread listening on port 17060
2013-11-15 09:43:16: Error occurred while updating network settings: An error occurred while trying to connect to the Database (10.123.123.201:27017). It is possible that the Mongo Database server is incorrectly configured, currently offline, or experiencing network issues.
2013-11-15 09:43:16: Full error: Unable to connect to server 10.123.123.201:27017: Connection refused. (FranticX.Database.DatabaseConnectionException)
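
Since the same error repeats several times in this log, a short Python sketch like the one below can condense it. It assumes every entry starts with a "YYYY-MM-DD HH:MM:SS: " timestamp, as in the log above, and it also copes with a paste where the line breaks were lost. The names parse_entries and count_errors are made up for this example, and the sample text is taken from the log above.

```python
import re
from collections import Counter

# Matches the "YYYY-MM-DD HH:MM:SS: " prefix that starts every launcher log entry.
TIMESTAMP = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): ")

def parse_entries(text):
    """Split log text into (timestamp, message) pairs, even if line breaks were lost."""
    # With one capture group, split() yields [prefix, ts1, msg1, ts2, msg2, ...].
    parts = TIMESTAMP.split(text)
    return [(parts[i], parts[i + 1].strip()) for i in range(1, len(parts) - 1, 2)]

def count_errors(text):
    """Count how often each message that mentions 'error' occurs."""
    return Counter(msg for _, msg in parse_entries(text) if "error" in msg.lower())

sample = (
    r"2013-11-15 09:43:10: BEGIN - merry\root "
    "2013-11-15 09:43:10: Deadline Launcher 6.0 [v6.0.0.51561 R] "
    "2013-11-15 09:43:11: Full error: Unable to connect to server 10.123.123.201:27017: "
    "Connection refused. (FranticX.Database.DatabaseConnectionException) "
    "2013-11-15 09:43:11: Launching Slave: "
    "2013-11-15 09:43:16: Full error: Unable to connect to server 10.123.123.201:27017: "
    "Connection refused. (FranticX.Database.DatabaseConnectionException)"
)

for message, count in count_errors(sample).items():
    print(count, message)
```

To run it against the real file, read /var/log/Thinkbox/Deadline6/deadlinelauncher-merry-2013-11-15-0001.log into a string and pass that to count_errors instead of the sample.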

So I guess the problem is that the Debian machine can’t connect to server 10.123.123.201, since that IP is this machine itself.

How could I set up the deadlinelauncher so that it uses the database on the same machine it was launched on?

Thanks!
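
“Connection refused” generally means that nothing was listening on that address and port at the moment of the attempt. One possible explanation, which is only an assumption and not confirmed anywhere in this thread, is that the launcher starts during boot before mongod is accepting connections. Under that assumption, a boot script could wait for port 27017 before starting the slave, roughly like this Python sketch. The helper names wait_for_port and start_slave_when_db_is_up are hypothetical; the slave path and the -nogui flag are the ones used manually earlier in the thread.

```python
import socket
import subprocess
import time

def wait_for_port(host, port, timeout=120.0, interval=2.0):
    """Poll host:port until a TCP connection succeeds or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Nothing is listening yet (or the host is unreachable); retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)

def start_slave_when_db_is_up(host="127.0.0.1", port=27017):
    """Hypothetical helper: start the slave only once MongoDB accepts connections."""
    if not wait_for_port(host, port):
        raise RuntimeError(f"nothing listening on {host}:{port}; not starting the slave")
    # Path and flag are the ones used manually earlier in this thread.
    return subprocess.Popen(["/opt/Thinkbox/Deadline6/bin/deadlineslave", "-nogui"])
```

Calling start_slave_when_db_is_up() from a script run at boot only checks that mongod is up locally. The slave itself still connects to whatever host the repository settings name, so this does not help if the address in those settings is the real problem.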

Hello Teemu,

Just to clarify, that IP belongs to the machine this log came from, right? If so, it is a bit confusing as to why it cannot connect to its own IP address. That is a totally different issue from the lack of a launch as daemon. I am curious about something you said earlier, though. You said that it shows the deadlinelauncher trying to load, but failing. Usually the launch of the slave is connected to the launcher, so that might be the cause of the issue. I suppose if it had these connection issues, that could prevent it from launching, but then the curious thing is why the slave can launch. Can you try to ping the machine by its IP? That might shed some light. As another option, you can add localhost to the hostname line of the dbConnect.xml file, which would theoretically help this machine look for 27017 on itself. That file is found in “[repo]/settings/”. Let me know how that goes.

Cheers,