AWS Thinkbox Discussion Forums

Connection issues: Abruptly loses connection to target machine

Hello,
We have been experiencing a connection issue with our Repository.

Our current setup involves a Virtual Machine that runs the Repository within our server/network infrastructure. Each Client is on the same network and connects to the Repository through a Remote Connection on port 4433. It normally works perfectly fine, but at odd times the Clients fail to connect to the target computer.

We’ve followed this troubleshooting guide. On the Repository computer, MongoDB isn’t listed under “Details” in Task Manager, and the “Deadline 10 Database Service” isn’t running.

We’ve tried disabling the firewall and attempting a Direct Connection, but unfortunately, these measures did not help.

The fix has always been to restart the VM, after which everything seems to work again. The trouble is that the problem occurs abruptly, often during overnight or weekend rendering sessions, which is really inconvenient.

One possibility we are considering is moving the Repository to a dedicated computer instead of relying on a Virtual Machine. Would you recommend this? Any help is appreciated. Thank you.

We run a repo on a VM and it seems to work OK.
Did you check the RCS log? It’s in /var/log/Thinkbox/Deadline10
The mongo log is under /opt/Thinkbox/DeadlineDatabase10/mongo/data/logs
It might be that some resource (like file descriptors) gets exhausted.
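If it helps, here is a minimal Python sketch for pulling the tail of the newest file in each of those two log directories. The helper names `newest_log` and `tail` are just illustrative, the 50-line count is arbitrary, and the paths are the Linux ones quoted above, so adjust them for your own install.

```python
from collections import deque
from pathlib import Path


def newest_log(log_dir):
    """Return the most recently modified regular file in log_dir, or None."""
    files = [p for p in Path(log_dir).iterdir() if p.is_file()]
    return max(files, key=lambda p: p.stat().st_mtime, default=None)


def tail(path, count=50):
    """Return the last `count` lines of a text file, without newlines."""
    with open(path, "r", errors="replace") as handle:
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]


if __name__ == "__main__":
    # Directories quoted above; they may differ on your machine.
    log_dirs = (
        "/var/log/Thinkbox/Deadline10",
        "/opt/Thinkbox/DeadlineDatabase10/mongo/data/logs",
    )
    for log_dir in log_dirs:
        if not Path(log_dir).is_dir():
            print(f"{log_dir}: not found on this machine")
            continue
        latest = newest_log(log_dir)
        if latest is None:
            print(f"{log_dir}: no log files")
            continue
        print(f"--- {latest} ---")
        print("\n".join(tail(latest, 50)))
```

The last lines written before the service stopped are usually the interesting ones, so it is worth running this before restarting the VM.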


I’ll second both checking the MongoDB log and the idea that the VM is likely running out of some resource.

If the database service is going down, that would indeed cause a failure, and the log should have some clues as to why the database is crashing.


We run Deadline (repo and DB) on a VM. Our RCS runs on a different host, since we seem to only use it for AWS/cloud renders.

Adding a little more info to what was already posted, in case it is an open file issue. Hopefully the logs will have something useful.

I believe MongoDB has a separate ulimit you can tweak in its startup script. However, most of the docs suggest a ulimit of 200000 for the Deadline processes, so in your case I’d suggest looking at tweaking the MongoDB and RCS ulimits. I think RCS will spit out a warning (not an error) if it is set too low.
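As a rough way to see where a limit currently stands, here is a small Python sketch that reads the open-file limit with the standard `resource` module (Unix only) and compares it against the 200000 figure above. It reports the limit of the process that runs the script, so it only reflects MongoDB or RCS if you run it as the same user and in the same environment those services start in. The name `limit_is_sufficient` is made up for this example.

```python
import resource  # Unix only; not available on Windows

RECOMMENDED_NOFILE = 200_000  # figure mentioned in the post above


def limit_is_sufficient(soft_limit, required=RECOMMENDED_NOFILE):
    """Return True if the soft open-file limit meets the required value."""
    if soft_limit == resource.RLIM_INFINITY:
        return True  # unlimited
    return soft_limit >= required


if __name__ == "__main__":
    # Soft limit is what the process is held to; hard limit is the ceiling.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"open files: soft={soft} hard={hard}")
    if not limit_is_sufficient(soft):
        print(f"soft limit is below {RECOMMENDED_NOFILE}")
```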


Thank you everybody for your help!

We looked through the logs, and they show that the machine was running out of storage. We’ve now extended the storage, and it seems to work fine.
We’ve tested it over the weekends without any crashes. Hopefully that was the answer to our problems.
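For anyone who lands here with the same symptom, a minimal Python sketch of a free-space check that could run on a schedule on the Repository machine. The 10% threshold and the path are assumptions for illustration; point it at whichever volume holds your database and logs.

```python
import shutil


def free_fraction(path):
    """Return the fraction of the volume containing `path` that is free."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return usage.free / usage.total


def is_low_on_space(path, min_free_fraction=0.10):
    """Return True if free space is below the given fraction of the volume."""
    return free_fraction(path) < min_free_fraction


if __name__ == "__main__":
    volume = "."  # placeholder; use the volume holding the database
    print(f"{volume}: {free_fraction(volume):.1%} free")
    if is_low_on_space(volume):
        print("warning: volume is running low on space")
```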
