Connection Server error and RCS crash on longer rendering jobs

Hi there,

I am running into an issue: when a client is rendering a longer job (a single frame taking more than an hour or so), I suddenly start getting a lot of Connection Server errors, and neither the Monitor nor the Worker can connect to the RCS. On top of that, the RCS itself seems to crash or freeze, and it has to be restarted to get things working again.

This only happens on longer jobs. If a task takes less than, say, 10 minutes, everything is fine and renders complete without a hitch. When the Connection Server error comes up I can still ping the machine from the client, so the network path is there; the client just can't reach the RCS. There are no other errors or warnings in any of the logs while everything is working.
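Since a ping only shows that the host answers ICMP, a TCP-level check against the RCS port can help separate "machine unreachable" from "RCS not accepting connections". Below is a minimal sketch in Python, assuming it is run from a client machine; the function name `probe_tcp` and the 5-second timeout are my own choices and not part of Deadline.

```python
import socket
import time
from typing import Tuple


def probe_tcp(host: str, port: int, timeout: float = 5.0) -> Tuple[bool, str]:
    """Try to open a plain TCP connection and report what happened.

    This does not speak TLS or the RCS protocol; it only checks whether
    something accepts connections on the port.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            elapsed = time.monotonic() - start
            return True, f"connected in {elapsed:.2f}s"
    except socket.timeout:
        # No answer at all within the timeout (the same symptom as error 10060).
        return False, f"timed out after {timeout:.0f}s"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return False, "connection refused"
    except OSError as exc:
        return False, f"socket error: {exc}"
```

Calling `probe_tcp("100.109.55.111", 4433)` while the errors are occurring would show whether the port times out (host reachable but the listener not responding) or is refused (nothing listening on that port at all).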

Another somewhat strange thing is that when this happens, the Monitor on the server changes some of its colors but otherwise runs fine (the odd background, which is normally dark grey in the default color profile, changes to white). Restarting the Monitor clears this. There is no error supporting this in either the Monitor or the RCS log.

The RCS and database are running on a Mac Mini M4 with macOS Sequoia 15.3.1. The render Worker is a Windows 11 Pro machine. The clients connect to the RCS via Tailscale VPN.

Thanks so much

2025-10-08 08:08:21: ---------- Inner Stack Trace (System.Net.Sockets.SocketException) ----------
2025-10-08 08:08:21: at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
2025-10-08 08:08:21: at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token)
2025-10-08 08:08:21: at System.Net.Sockets.Socket.<ConnectAsync>g__WaitForConnectWithCancellation|285_0(AwaitableSocketAsyncEventArgs saea, ValueTask connectTask, CancellationToken cancellationToken)
2025-10-08 08:08:21: at System.Net.Http.HttpConnectionPool.ConnectToTcpHostAsync(String host, Int32 port, HttpRequestMessage initialRequest, Boolean async, CancellationToken cancellationToken)
2025-10-08 08:08:53: ERROR: UpdateClient.MaybeSendRequestNow caught an exception: POST https://100.109.55.111:4433/rcs/v1/update returned “One or more errors occurred. (A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond. (100.109.55.111:4433))” (Deadline.Net.Clients.Http.DeadlineHttpRequestException)
2025-10-08 08:08:59: ERROR: UpdateClient.MaybeSendRequestNow caught an exception: POST https://100.109.55.111:4433/rcs/v1/update returned “One or more errors occurred. (A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond. (100.109.55.111:4433))” (Deadline.Net.Clients.Http.DeadlineHttpRequestException)
2025-10-08 08:09:29: Error occurred while updating Deadline AWS Resource Tracker Status label: Connection Server error: GET https://100.109.55.111:4433/db/dash/dashFleet/health?region=af-south-1 returned “One or more errors occurred. (A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond. (100.109.55.111:4433))” (System.Net.WebException)
2025-10-08 08:09:29: at Deadline.StorageDB.Proxy.Utils.ProxyUtils.HandleException(Exception e, NetworkManager manager, String server, Int32 port, String certificatePath)
2025-10-08 08:09:29: at Deadline.StorageDB.Proxy.ProxyDashStorage.GetFleetHealthSummary(String awsRegion, String& fleetsHealthReport)
2025-10-08 08:09:29: at Deadline.StorageDB.DashStorage.UpdateResourceTrackerStatusLabel(Object o)

Just bumping this up, any help please? The RCS keeps crashing/freezing on macOS, and sooner or later it takes the rendering jobs down with it. It's hard to tell what's happening to the RCS process: it's still there, but its CPU usage either drops to 0% and stays there, or sits well above 50-60%.
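To pin down exactly when the RCS stops answering relative to the start of a long render, one option is to poll the port and record only the state changes. This is a rough sketch under my own assumptions: `port_is_open` and `watch_port` are hypothetical helper names, and the check is a plain TCP connect rather than anything Deadline-specific.

```python
import socket
import time
from datetime import datetime
from typing import Callable, List, Optional


def port_is_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def watch_port(check: Callable[[], bool], interval: float, checks: int,
               sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Call `check` repeatedly and record a timestamped line on each state change."""
    events: List[str] = []
    previous: Optional[bool] = None
    for i in range(checks):
        state = check()
        if state != previous:
            stamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
            events.append(f"{stamp}: port is {'UP' if state else 'DOWN'}")
            previous = state
        if i < checks - 1:
            sleep(interval)  # wait before the next check
    return events
```

For example, `watch_port(lambda: port_is_open("100.109.55.111", 4433), interval=30, checks=240)` would cover two hours at 30-second intervals, and the resulting timestamps could be compared against the job start time and the RCS log.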

The error I get from the Monitor when trying to connect is below. If the RCS process is restarted on the server, everything works fine, at least for a while. Is there anything I can do to fix this? I can ping the server fine, the firewall is OK, and everything else should be enabled/shared as per the manual, unless I missed something.

The Monitor was unable to connect to the specified server (100.109.55.111:4433 (Deadline10RemoteClient.pfx)).

Failed to establish connection to 100.109.55.111:4433 due to a communication error.
 ---> System.Net.Sockets.SocketException (10060): A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token)
   at System.Net.Sockets.Socket.<ConnectAsync>g__WaitForConnectWithCancellation|285_0(AwaitableSocketAsyncEventArgs saea, ValueTask connectTask, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.ConnectToTcpHostAsync(String host, Int32 port, HttpRequestMessage initialRequest, Boolean async, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at Deadline.Controllers.RemoteDataController..ctor(RepositoryConnectionSettings settings)
   at Deadline.Applications.DeadlineApplicationManager.CreateDataController(RepositoryConnectionSettings connSettings)
   at Deadline.Applications.DeadlineApplicationManager.Connect(RepositoryConnectionSettings connSettings, Boolean updateScriptManager)
   at Deadline.Monitor.MonitorManager.Connect(RepositoryConnectionSettings connectionSettings, Boolean updateScriptManager)
   at InvokeStub_MonitorManager.Connect(Object, Span`1)
   at System.Reflection.MethodBaseInvoker.InvokeWithFewArgs(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)