Hey team!
I’m trying to track down an intermittent issue that is very difficult to reproduce and produces seemingly random errors and behaviors.
Without going into the specifics, I have a few general questions that may help me rule out some factors.
To give a quick summary of the environment, we have a Mac database server and six (6) Mac workers, all running Apple Silicon. The repository is stored on a Linux-based NAS, and all the computers access it over SMB.
So, here are some of my general questions:
- Are there any recommended SMB optimizations, or SMB configurations to avoid?
- What is the likelihood that multiple workers rendering a C4D job and all trying to read the same asset file (say, an animated GIF) over SMB could cause a problem?
- Typically, how many TCP connections should each machine make to the database server? Would five (5) connections per worker seem excessive if the worker is only running DeadlineWorker10.app? How many TCP connections to the database server would one expect from DeadlineMonitor10.app?
- With a render farm this size (one database server, six workers), is it unlikely that we’d run up against macOS’s resource limits?
One of the few consistent things about the issue is that it tends to be triggered when multiple workers are rendering the same C4D Job. I had a C4D Job rendering with two workers for several hours without issue, but when I added a third worker, the whole Job failed after about an hour. If I have all six workers rendering the same C4D Job, it starts failing almost immediately.
I wanted to mention that in case it affects how my questions are considered.
Thanks for any consideration or input!