
Cinema 4D Batch Plugin Failure and Other Issues

Greetings,

I’m looking for any insight that could help me solve some of the issues I’m still facing. I’m running Cinema 4D 2024.5 with Redshift on Deadline 10.3.2.1. All machines are Windows, and the repository is on Windows Server.

I’ll try to keep this as organized as possible as there are a few issues:

1. Submitting with Cinema 4D Batch plugin:

Around 25% of renders submitted with the Cinema 4D Batch plugin “complete” almost instantly with no output. The log below points at the issue, but I don’t know where to start debugging it (one possible safety net is sketched after the log):

Context:

2024-10-25 15:41:55:  0: INFO: Done Path Mapping
2024-10-25 15:41:55:  0: Done executing plugin command of type 'Start Job'
2024-10-25 15:41:55:  0: Plugin rendering frame(s): 0
2024-10-25 15:41:56:  0: Executing plugin command of type 'Render Task'
2024-10-25 15:41:56:  0: INFO: Render Tasks called
2024-10-25 15:41:56:  0: INFO: Pre Build Script
2024-10-25 15:41:56:  0: INFO: Validating the path: 'L:\04_CGI\~Users\schen\UA075\test\'
2024-10-25 15:41:56:  0: INFO: Rendering main output to network drive
2024-10-25 15:41:56:  0: INFO: 
2024-10-25 15:41:56:  0: STDOUT: Running Script: C:\ProgramData\Thinkbox\Deadline10\workers\bos-sv-mgprors-renderbox_gpu_1\jobsData\671bf3d52b25db680d4f74fb\thread0_temp1dry60\c4d_Batch_Script.py
2024-10-25 15:41:56:  0: STDOUT: CRITICAL: Stop [ge_container.h(556)]
2024-10-25 15:41:56:  0: STDOUT: CRITICAL: Stop [ge_container.h(556)]
2024-10-25 15:41:56:  0: STDOUT: (null)PendingDeprecationWarning: Since 2024.0 BaseList2D.GetData is deprecated and will be removed in 2025.0. Use BaseList2D.GetDataInstance instead.
2024-10-25 15:41:56:  0: STDOUT: Illegal state: Condition src._systemImpl->GetNodeSystem().IsFinalized() not fulfilled.  [nodesgraph_impl.cpp(3388)]
2024-10-25 15:41:56:  0: INFO: Script Ran Successfully
2024-10-25 15:41:57:  0: INFO: Finished Cinema 4D Task
2024-10-25 15:41:57:  0: Done executing plugin command of type 'Render Task'
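
Since the task still logs “Script Ran Successfully” even when nothing is written, one possible safety net is a post-task script that fails the task when the expected output file is missing. A minimal sketch, assuming the standard Deadline post-task script entry point; the JobOutputDirectories/JobOutputFileNames properties and the #### padding handling are assumptions to verify against your Deadline version:

# Hypothetical post-task script: fail the task if its first frame was never written.
import os

def __main__(*args):
    deadlinePlugin = args[0]                      # Deadline passes the plugin object in
    job = deadlinePlugin.GetJob()
    startFrame = deadlinePlugin.GetStartFrame()   # first frame of the current task

    for directory, fileName in zip(job.JobOutputDirectories, job.JobOutputFileNames):
        # Replace the #### padding with the zero-padded frame number.
        padding = fileName.count("#")
        if padding > 0:
            fileName = fileName.replace("#" * padding, str(startFrame).zfill(padding))
        outputPath = os.path.join(directory, fileName)

        if not os.path.exists(outputPath) or os.path.getsize(outputPath) == 0:
            deadlinePlugin.FailRender("Expected output missing or empty: %s" % outputPath)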

2. Application Crashed Error:

This problem started occurring a while ago: renders finish with an “application crashed” error after apparently rendering correctly. Since the output is still written, I added an exception for it in the cinema4d.py plugin (a rough sketch of the idea follows the log). This happens on about 50% of jobs.

2024-10-25 16:50:57:  0: STDOUT: Rendering successful: 216.938 sec.
2024-10-25 16:50:57:  0: STDOUT: Warning: Unknown arguments: true
2024-10-25 16:50:57:  0: STDOUT: Redshift Debug: PreviewScheduler: End
2024-10-25 16:50:57:  0: STDOUT: Redshift Debug: PreviewScheduler: Flush/Clear Context
2024-10-25 16:50:57:  0: STDOUT: Redshift Debug: PreviewScheduler: Flush completed
2024-10-25 16:50:59:  0: STDOUT: Error: application crashed
2024-10-25 16:50:59:  0: STDOUT: C4DUnhandledExceptionFilter: writing exception info
2024-10-25 16:50:59:  0: STDOUT: C4DUnhandledExceptionFilter: writing call stacks
2024-10-25 16:50:59:  0: INFO: Process exit code: 1
2024-10-25 16:50:59:  0: INFO: Ignoring Code 1
2024-10-25 16:50:59:  0: INFO: Finished Cinema 4D Task
2024-10-25 16:50:59:  0: Done executing plugin command of type 'Render Task'
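
For context, the exception could look roughly like the sketch below. This is hypothetical rather than the exact change I made to cinema4d.py, and the hook and attribute names vary between Deadline plugin versions; the idea is to tolerate exit code 1 only once a “Rendering successful” line has already been seen:

# Hypothetical sketch of the exit-code exception (not the stock cinema4d.py code).
# Registration would sit wherever the plugin wires up its stdout handlers, e.g.:
#   self.AddStdoutHandlerCallback(".*Rendering successful.*").HandleCallback += self.HandleRenderSuccess

def HandleRenderSuccess(self):
    # Remember that the render itself reported success before any later crash.
    self.renderSucceeded = True

def CheckExitCode(self, exitCode):
    if exitCode == 1 and getattr(self, "renderSucceeded", False):
        self.LogInfo("Ignoring exit code 1: render already reported success")
        return
    if exitCode != 0:
        self.FailRender("Cinema 4D returned non-zero exit code %d" % exitCode)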

3. Jobs will not “complete” after completing

This problem is the worst of all. The job renders and the plugin appears to shut down, but Deadline never marks the task complete and won’t move on to the next job. Workers can sit idle for hours, if not days. (A possible stopgap is sketched after the log.)

More info:

2024-07-19 12:51:31: 0: STDOUT: Redshift Debug: Freeing GPU mem…(device 0)
2024-07-19 12:51:31: 0: STDOUT: Redshift Debug: Done (CUDA reported free mem before: 4275 MB, after: 43373 MB)
2024-07-19 12:52:35: 0: STDOUT: Redshift Debug: Summary: Profile: Update:00:02.258 (00:02.164/00:00.000/00:00.094/00:00.000) Render:10:20.284 (00:11.759/00:05.526) Output:01:08.565 Total:11:31.108
2024-07-19 12:53:28: 0: STDOUT: Redshift Debug: Context: Unlocked:Render
2024-07-19 12:53:28: 0: STDOUT: Progress: 100%
2024-07-19 12:53:28: 0: STDOUT: Rendering successful: 901.203 sec.
2024-07-19 12:53:28: 0: STDOUT: Warning: Unknown arguments: true
2024-07-19 12:53:28: 0: STDOUT: Redshift Debug: PreviewScheduler: Flush/Clear Context
2024-07-19 12:53:28: 0: STDOUT: Redshift Debug: PreviewScheduler: End
2024-07-19 12:53:28: 0: STDOUT: Redshift Debug: PreviewScheduler: Flush completed
2024-07-19 12:53:29: 0: STDOUT: Redshift Debug: Shutdown mem management thread…
2024-07-19 12:53:29: 0: STDOUT: Redshift Debug: Shut down ok
2024-07-19 12:53:29: 0: STDOUT: Redshift Debug: PostFX: Shut down
2024-07-19 12:53:29: 0: STDOUT: Redshift Debug: Shutdown GPU Devices…
2024-07-19 12:53:29: 0: STDOUT: Redshift Debug: Devices shut down ok
2024-07-19 12:53:29: 0: STDOUT: Redshift Debug: Shutdown Rendering Sub-Systems…
2024-07-19 12:53:29: 0: STDOUT: Redshift Info: License returned
2024-07-19 12:53:29: 0: STDOUT: Redshift Debug: Finished Shutting down Rendering Sub-Systems
2024-07-19 12:53:29: 0: STDOUT: MaxonEnd: 07/19/24 at 12:53:29
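
As a stopgap while the root cause is unknown, a hard task timeout at least stops a hung task from pinning a worker for days; the same values should also be settable per job from the Monitor’s job properties. A minimal sketch of the job-info keys involved (the key names are assumptions worth double-checking against the Deadline documentation):

# Hypothetical: append timeout keys to a submission job info file so hung tasks
# error out and requeue instead of sitting on the worker indefinitely.
job_info_extras = {
    "TaskTimeoutMinutes": "120",   # hard ceiling per task; pick something above your longest frame
    "OnTaskTimeout": "Error",      # treat the timeout as an error so the task requeues
}

with open("c4d_job_info.job", "a") as f:   # placeholder filename for the job info file
    for key, value in job_info_extras.items():
        f.write("%s=%s\n" % (key, value))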

Things I’ve tried

  • Our systems engineer helped check for licensing issues; none were found.

  • Tried new GPU drivers and updated Maxon One and Deadline.

  • The team purchased a new worker machine. It worked great for a few weeks, and then some of these issues started cropping up again.

  • Reached out to Maxon. Their support team wasn’t able to find issues with Maxon One or C4D in the logs.

I appreciate anyone who takes the time to read this!

  1. Can you confirm the 25% of failures are spread across machines that have otherwise rendered frames successfully, or does it always fail on the same machine? (Possibly different release versions of C4D?)

2/3. Are you set to render locally first and then copy to the server? Does the issue still happen when not using Batch?

I can confirm this happens on every worker. We have two machines with two Worker instances installed on each (I’ve also tested with a single Worker instance and still hit the issue). Problem 2 has occurred with and without Batch enabled. I don’t have “Enable Local Rendering” checked in the submitter.

Do the GPUs have their affinity set? Is it possible that running multiple instances is crashing the cards?

Yeah, each machine has two GPUs. Worker 1 has only GPU 0 affinity and worker 2 has only GPU 1 affinity.
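
For what it’s worth, a quick way to double-check that each Worker instance really is only touching its assigned card is to list the compute processes per GPU while a render is running. A small sketch, assuming nvidia-smi is on the PATH:

# List which processes are using which GPU (matched by GPU UUID).
import subprocess

result = subprocess.run(
    ["nvidia-smi",
     "--query-compute-apps=gpu_uuid,pid,process_name,used_memory",
     "--format=csv,noheader"],
    capture_output=True, text=True, check=True)
print(result.stdout)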


Do you have the latest Maxon App? I’ve seen elsewhere that it can report floating licenses as already in use; I’m not sure how it manages multiple check-ins on floating licenses.

Yup, I have the latest Maxon App on both machines. I’ve also tried their scripted install, but that didn’t improve anything.

Do you get the same error if you run a Redshift RS export job?

I haven’t tested that yet, but I can. However, I should’ve mentioned earlier that if I run Commandline.exe by itself (going into Program Files and double-clicking it), it will crash or hang:

2024-10-25 16:50:59:  0: STDOUT: Error: application crashed
2024-10-25 16:50:59:  0: STDOUT: C4DUnhandledExceptionFilter: writing exception info
2024-10-25 16:50:59:  0: STDOUT: C4DUnhandledExceptionFilter: writing call stacks

We don’t have any plugins installed other than Redshift, so I’m not sure what configuration issue it could be.
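
Since it misbehaves even outside Deadline, it might help to capture exactly what Commandline.exe prints when launched with an explicit scene and frame rather than double-clicked with no arguments. A rough reproduction sketch; the install path, scene path and flags are placeholders to verify against the Cinema 4D command-line documentation:

# Launch Commandline.exe directly and capture its output and exit code.
import subprocess

C4D_COMMANDLINE = r"C:\Program Files\Maxon Cinema 4D 2024\Commandline.exe"  # placeholder install path
SCENE = r"L:\04_CGI\some_test_scene.c4d"                                    # placeholder scene path

proc = subprocess.run(
    [C4D_COMMANDLINE, "-render", SCENE, "-frame", "0", "0"],
    capture_output=True, text=True)
print("exit code:", proc.returncode)
print(proc.stdout)
print(proc.stderr)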

Sounds like the issue is outside of Deadline, so it may be worth opening a ticket with Maxon.

You’ve got some nice cards in there; do they have NVLink attached?

I found an issue with two of these linked together (2 × 48 GB VRAM = 96 GB) where the machine didn’t have enough RAM. If I remember correctly, it had 96 GB of RAM to match the VRAM but didn’t account for the OS’s own requirements, so Redshift couldn’t allocate enough host RAM to match the VRAM and the jobs failed.
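
As a back-of-the-envelope check (the overhead allowance here is only a rough assumption):

# Rough rule of thumb: system RAM should exceed total VRAM plus OS/application overhead.
total_vram_gb = 2 * 48        # two 48 GB cards pooled over NVLink
os_overhead_gb = 16           # rough allowance for Windows, C4D and the Worker (assumption)
recommended_ram_gb = total_vram_gb + os_overhead_gb
print("Recommended system RAM: at least %d GB" % recommended_ram_gb)   # 112 GB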

I’d also try removing all plugins and running a test, then re-introducing the plugins to see if they cause any issues.
