Hello, I just tried the external Nuke Submitter for the first time, submitting a simple test job to a worker on our farm. We have RCS configured (for the AWS Portal, for which we were able to get infrastructure running), and as far as I can tell it should be working.
The first task hangs on "Waiting to Start" for a couple of minutes before dumping this to the console:
2024-01-17 09:15:42: Running script NukeSubmission (/home/sdugaro/Thinkbox/Deadline10/cache/q3UNeMyV11ydp4KDU3ru0w0QmY/scripts/Submission/NukeSubmission.py)
2024-01-17 09:18:44: Error occurred while reloading network settings: Connection Server error: GET https://192.168.X.XX:4433/db/settings/network?invalidateCache=true returned "One or more errors occurred. (Connection refused (192.168.X.XX:4433))" (System.Net.WebException)
2024-01-17 09:18:44: Error occurred while updating Worker cache: Connection Server error: POST https://192.168.X.XX:4433/db/slaves/modified?transactionID=65dad542-3fca-4c43-82bc-36d3d4597770 returned "One or more errors occurred. (Connection refused (192.168.X.XX:4433))" (System.Net.WebException)
2024-01-17 09:18:44: Error occurred while updating licenseForwarder cache: Connection Server error: POST https://192.168.X.XX:4433/db/licenseforwarders/modified returned "One or more errors occurred. (An error occurred while sending the request.)" (System.Net.WebException)
2024-01-17 09:18:44: at Deadline.StorageDB.Proxy.Utils.ProxyUtils.HandleException(Exception e, NetworkManager manager, String server, Int32 port, String certificatePath)
2024-01-17 09:18:44: at Deadline.StorageDB.Proxy.ProxyLicenseForwarderStorage.GetModifiedLicenseForwarders(LicenseForwarderInfoSettings[]& modifiedLicenseForwarders, String[]& deletedLicenseForwarderIds, Nullable`1 lastSettingsAutoUpdate, Nullable`1 lastInfoAutoUpdate, Nullable`1 lastDeletionAutoUpdate)
2024-01-17 09:18:44: at Deadline.StorageDB.LicenseForwarderStorage.a(Object lw)
2024-01-17 09:18:44: ---------- Inner Stack Trace (System.Net.Sockets.SocketException) ----------
2024-01-17 09:18:44: Error occurred while updating job cache: Connection Server error: POST https://192.168.X.XX:4433/db/jobs/modified?transactionID=00000000-0000-0000-0000-000000000000&invalidateCache=true returned "One or more errors occurred. (An error occurred while sending the request.)" (System.Net.WebException)
2024-01-17 09:18:44: at Deadline.StorageDB.Proxy.Utils.ProxyUtils.HandleException(Exception e, NetworkManager manager, String server, Int32 port, String certificatePath)
2024-01-17 09:18:44: at Deadline.StorageDB.Proxy.ProxyJobStorage.GetModifiedJobs(Job[]& modifiedJobs, String[]& deletedJobIds, Boolean& hasMore, String& transactionID, Nullable`1 batchQueryTime, Nullable`1 deleteQueryTime, Nullable`1 firstBatchFlag)
2024-01-17 09:18:44: at Deadline.StorageDB.JobStorage.b(Object kp)
2024-01-17 09:18:44: ---------- Inner Stack Trace (System.Net.Sockets.SocketException) ----------
2024-01-17 09:18:44: Error occurred while updating RCS cache: Connection Server error: POST https://192.168.X.XX:4433/db/proxyservers/modified returned "One or more errors occurred. (An error occurred while sending the request.)" (System.Net.WebException)
2024-01-17 09:18:44: at Deadline.StorageDB.Proxy.Utils.ProxyUtils.HandleException(Exception e, NetworkManager manager, String server, Int32 port, String certificatePath)
2024-01-17 09:18:44: at Deadline.StorageDB.Proxy.ProxyProxyServerStorage.GetModifiedProxyServers(ProxyServerInfoSettings[]& modifiedProxyServers, String[]& deletedProxyServerIds, Nullable`1 lastSettingsAutoUpdate, Nullable`1 lastInfoAutoUpdate, Nullable`1 lastDeletionAutoUpdate)
2024-01-17 09:18:44: at Deadline.StorageDB.ProxyServerStorage.a(Object mv)
2024-01-17 09:18:44: ---------- Inner Stack Trace (System.Net.Sockets.SocketException) ----------
2024-01-17 09:18:44: Error occurred while updating balancer cache: Connection Server error: POST https://192.168.X.XX:4433/db/balancers/modified returned "One or more errors occurred. (Connection refused (192.168.X.XX:4433))" (System.Net.WebException)
2024-01-17 09:18:44: at Deadline.StorageDB.Proxy.Utils.ProxyUtils.HandleException(Exception e, NetworkManager manager, String server, Int32 port, String certificatePath)
2024-01-17 09:18:44: at Deadline.StorageDB.Proxy.ProxyBalancerStorage.GetModifiedBalancers(BalancerInfoSettings[]& modifiedBalancers, String[]& deletedBalancerIds, Nullable`1 lastSettingsAutoUpdate, Nullable`1 lastInfoAutoUpdate, Nullable`1 lastDeletionAutoUpdate)
2024-01-17 09:18:44: at Deadline.StorageDB.BalancerStorage.a(Object ji)
...
2024-01-17 09:20:00: ERROR: UpdateClient.MaybeSendRequestNow caught an exception: POST https://192.168.X.XX:4433/rcs/v1/update returned "One or more errors occurred. (Connection refused (192.168.X.XX:4433))" (Deadline.Net.Clients.Http.DeadlineHttpRequestException)
2024-01-17 09:20:02: ERROR: UpdateClient.MaybeSendRequestNow caught an exception: POST https://192.168.X.XX:4433/rcs/v1/update returned "One or more errors occurred. (Connection refused (192.168.X.XX:4433))"
...
2024-01-17 09:20:31: The batch request reach the limit of 2 retries
2024-01-17 09:20:38: The batch request reach the limit of 2 retries
2024-01-17 09:20:38: Error occurred while reloading network settings: Connection Server error: GET https://192.168.X.XX:4433/db/settings/network?invalidateCache=true returned "One or more errors occurred. (Connection refused (192.168.X.XX:4433))" (System.Net.WebException)
2024-01-17 09:20:38: The batch request reach the limit of 2 retries
2024-01-17 09:20:44: Error occurred while updating Deadline AWS Resource Tracker Status label: Connection Server error: GET https://192.168.X.XX:4433/db/dash/dashFleet/health?region=af-south-1 returned "One or more errors occurred. (Connection refused (192.168.X.XX:4433))" (System.Net.WebException)
2024-01-17 09:20:44: at Deadline.StorageDB.Proxy.Utils.ProxyUtils.HandleException(Exception e, NetworkManager manager, String server, Int32 port, String certificatePath)
2024-01-17 09:20:44: at Deadline.StorageDB.Proxy.ProxyDashStorage.GetFleetHealthSummary(String awsRegion, String& fleetsHealthReport)
2024-01-17 09:20:44: at Deadline.StorageDB.DashStorage.UpdateResourceTrackerStatusLabel(Object o)
2024-01-17 09:20:44: ---------- Inner Stack Trace (System.Net.Sockets.SocketException) ----------
2024-01-17 09:20:44: at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
2024-01-17 09:20:44: at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token)
2024-01-17 09:20:44: at System.Net.Sockets.Socket.<ConnectAsync>g__WaitForConnectWithCancellation|277_0(AwaitableSocketAsyncEventArgs saea, ValueTask connectTask, CancellationToken cancellationToken)
2024-01-17 09:20:44: at System.Net.Http.HttpConnectionPool.ConnectToTcpHostAsync(String host, Int32 port, HttpRequestMessage initialRequest, Boolean async, CancellationToken cancellationToken)
2024-01-17 09:20:44: The batch request reach the limit of 2 retries
The first task (1001) has no task report.
Subsequent tasks (1002, 1003) get picked up, and their task reports complain about a licensing error, which is expected.
I can use Modify Job Properties to add the license environment variable, and then I try to resubmit the job.
It hangs for another two minutes and then pops up a resubmit error dialog, which looks related, since a similar message appears in the console:
2024-01-17 09:53:20: An unexpected error occurred while Resubmitting Job:
2024-01-17 09:53:20: POST https://192.168.X.XX:4433/jobs/65a80b61df00ec740d421340/resubmit returned "One or more errors occurred. (A task was canceled.)" (Deadline.Net.Clients.Http.DeadlineHttpRequestException)
2024-01-17 09:53:20: at Deadline.Net.Clients.Http.HttpClient.GetResponse(HttpRequestMessage request)
2024-01-17 09:53:20: at Deadline.Net.Clients.Http.HttpClient.b(HttpRequestMessage blz)
2024-01-17 09:53:20: at Deadline.Net.Clients.Http.HttpClient.SendRequestForStream(String method, String uri, String contentType, Dictionary`2 headers, HttpContent httpContent)
2024-01-17 09:53:20: at Deadline.Net.Clients.Http.HttpClient.SendRequest(String method, String uri, String contentType, Dictionary`2 headers, HttpContent httpContent)
2024-01-17 09:53:20: at Deadline.Net.Clients.Http.HttpClient.Post(String uri, Object body, String contentType, Dictionary`2 headers)
2024-01-17 09:53:20: at Deadline.Net.Clients.Http.HttpClient.Post[TRequest,TResponse](String uri, TRequest body, String contentType, Dictionary`2 headers)
2024-01-17 09:53:20: at Deadline.Controllers.RemoteDataController.ResubmitJob(Job job, String jobName, String frameList, Int32 chunkSize, Boolean submitSuspended, Boolean maintenanceJob, Int32 maintenanceJobStartFrame, Int32 maintenanceJobEndFrame)
2024-01-17 09:53:20: at Deadline.Monitor.WorkItems.ResubmitJobWI.InternalDoWork()
2024-01-17 09:53:20: at Deadline.Monitor.MonitorWorkItem.DoWork()
2024-01-17 09:53:20: ---------- Inner Stack Trace (System.Threading.Tasks.TaskCanceledException) ----------
2024-01-17 09:53:20: at System.Threading.Tasks.Task.GetExceptions(Boolean includeTaskCanceledExceptions)
2024-01-17 09:53:20: at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
2024-01-17 09:53:20: at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)
2024-01-17 09:53:20: at Deadline.Net.Clients.Http.HttpClient.GetResponse(HttpRequestMessage request)
2024-01-17 09:53:20: at Deadline.Net.Clients.Http.HttpClient.b(HttpRequestMessage blz)
2024-01-17 09:53:20: at Deadline.Net.Clients.Http.HttpClient.SendRequestForStream(String method, String uri, String contentType, Dictionary`2 headers, HttpContent httpContent)
2024-01-17 09:53:20: at Deadline.Net.Clients.Http.HttpClient.SendRequest(String method, String uri, String contentType, Dictionary`2 headers, HttpContent httpContent)
2024-01-17 09:53:20: at Deadline.Net.Clients.Http.HttpClient.Post(String uri, Object body, String contentType, Dictionary`2 headers)
2024-01-17 09:53:20: at Deadline.Net.Clients.Http.HttpClient.Post[TRequest,TResponse](String uri, TRequest body, String contentType, Dictionary`2 headers)
2024-01-17 09:53:20: at Deadline.Controllers.RemoteDataController.ResubmitJob(Job job, String jobName, String frameList, Int32 chunkSize, Boolean submitSuspended, Boolean maintenanceJob, Int32 maintenanceJobStartFrame, Int32 maintenanceJobEndFrame)
2024-01-17 09:53:20: at Deadline.Monitor.WorkItems.ResubmitJobWI.InternalDoWork()
2024-01-17 09:53:20: at Deadline.Monitor.MonitorWorkItem.DoWork()
2024-01-17 09:53:20: at Deadline.Monitor.MonitorThreadPool.a.f()
2024-01-17 09:53:20: --- End of stack trace from previous location ---
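
Since most of the errors above are "Connection refused" on port 4433, one thing that can be checked from the submitting machine is whether anything is accepting TCP connections on the RCS address at all. Below is a minimal sketch of such a check, assuming Python 3 is available. The function name `port_is_open` is just an illustrative helper, not part of Deadline, and the host has to be filled in with the real RCS address (it is redacted in the logs above). It only tests the TCP connection; it does not test TLS or the client certificate.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper for illustration only. A refused connection, a timeout,
    or an unresolvable host all count as "not open".
    """
    try:
        # create_connection performs the TCP handshake and raises OSError
        # (e.g. ConnectionRefusedError, socket.timeout) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example usage (replace the placeholder with the actual RCS address):
# print(port_is_open("<rcs-host>", 4433))
```

If this returns False while the RCS is supposed to be running, the problem is likely below Deadline itself (service not listening, firewall, or wrong address) rather than in the Nuke submission script.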