Frames Randomly Render with Clay Material

This is sitting maddeningly on the border between a C4D/RS issue and a C4DBatch issue.

Background before symptoms:

  • We have four render nodes, each with two GPUs (3090x2 or 4090x2). GPU affinity is set so that each GPU renders its own task.
  • All with Nvidia 576.52.
  • Two boxes with Windows 10, two with Windows 11 (the issue has appeared on both OSes)
  • Occurs with RS 2025 and higher, and therefore with every version of C4D 2025, as well as C4D 2024.4 setups running RS 2025+.

Symptoms with C4DBatch enabled:
A job starts and all 8 GPUs are assigned to it. One or more GPUs will begin rendering the scene with a specific object in the scene showing a clay material. Every frame rendered by that GPU for that scene will render the same way. The other GPU in the same box won’t have the issue.


If the same scene is submitted again, that particular GPU will probably render it fine, and this time a different GPU on a different box will have the same issue (usually with the same object in the scene; there are usually 1-3 specific objects that can get hit).

Other times the same scene will be submitted, and all frames will render fine.

Since the glitch never begins at a frame halfway through the sequence (it always begins at the start of the sequence), I believe the issue is related to the initialization process, which only happens during the first few frames when the Batch plugin is enabled. It seems as though something triggered during this initialization is messing with the materials and then persists throughout the render session.

However, since initialization occurs only once per GPU per scene with the Batch plugin, each scene has only ~8 opportunities to trigger the glitch on our rack. One would then think that with the Batch plugin disabled we’d have many more occurrences; however…

Symptoms with C4DBatch DISABLED:

Occurrences are reduced dramatically, but they still happen. Instead of the issue first appearing at the beginning of a frame sequence and persisting through the rest of it, the bad frames can occur anywhere in the sequence… but they occur extremely rarely. For example, one scene (with a series of different file names) was submitted 10 times, with ~60 frames rendered each time. Out of 600 frames, there were only two bad frames.

However, since Batch render was disabled, all of these frames took notably longer to render, so this isn’t a true long-term solution.

I’ve shared this with Maxon via the developer forums, but they can’t reproduce it. I can’t figure out how to reproduce it outside of Deadline, and even within Deadline it’s difficult to recreate on demand. Last week I submitted a scene 9 times and it only happened on the 9th; yet coworkers will submit scenes for work and it’ll happen the first time through.

I found a few other threads on here from people who may have had the same issue, and asked them for more details, but I wasn’t able to get much. Based on the dates, all of them appear to be experiencing it with C4D 2025.

Any thoughts?

Thanks,
Luke

Export to RS and render in Redshift Standalone; see if that still has the issue.

I’m not sure whether I’ll be able to make that work based on the sporadic nature of this issue.

As mentioned by gutster in the “Deadline loses Standard Material” thread, there appears to be a permissions issue at play. My logs show the same error messages as his, and they appear only when two instances of Command Line try to access the same file at the same time, during the “STDOUT: Loading Scene: [ scene path ]” step.

Can confirm this is not a Deadline issue; here’s a Royal Render user with the same problem:
https://groups.google.com/g/rrKnights/c/88vAp9pwsBA

I’ve seen plenty of issues with CLR, especially once you add X-Particles or other things into the mix. Try the export to RS and see if that removes the issue of missing textures.

I’ll do that just so we can cross it off the list.

And just to clarify - this isn’t a missing texture issue: it’s material based. Here’s an example with some glass.


If you run the job outside of Deadline using Commandline.exe, does it also do this? Just to see whether this is C4D CLR or Deadline.

Does running the jobs as C4D / C4D Batch also make a difference?

Batch render experience is described throughout the original post.

It seems to be triggered when two instances of CL are trying to access a series of Redshift 2025+ files stored in preferences at the exact same fraction of a second, with one accessing it successfully, and the other getting locked out (creating a permissions issue).

The timecodes in the logs show 6-7 steps being completed within that specific second. Some successful logs show the series of tasks completing on both GPUs/instances of CL during the exact same second, but on others it fails.

This seems to imply that the two instances of CL need to be started at exactly the same moment (within milliseconds) so that this step occurs at exactly the same time on both instances… something that Deadline (or another render manager) does quite well.
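To illustrate the kind of lock-out involved, here is a minimal sketch (this is not Deadline or C4D code, the path is a made-up stand-in, and it uses two handles in one process rather than two real CL instances): when a second handle tries to lock the same bytes of a file that another handle already holds, Windows returns exactly this kind of "permission denied" error.

    # Windows-only demo of a byte-range lock collision between two file handles.
    import msvcrt

    PATH = r"C:\temp\fake_redshift_pref.bin"  # hypothetical stand-in, not a real RS file

    with open(PATH, "wb") as f:
        f.write(b"\0")  # the file needs at least one byte to lock

    first = open(PATH, "r+b")
    msvcrt.locking(first.fileno(), msvcrt.LK_NBLCK, 1)   # "instance one" locks byte 0

    second = open(PATH, "r+b")
    try:
        msvcrt.locking(second.fileno(), msvcrt.LK_NBLCK, 1)  # "instance two" collides
    except OSError as err:
        print("Second handle locked out:", err)
    finally:
        msvcrt.locking(first.fileno(), msvcrt.LK_UNLCK, 1)
        first.close()
        second.close()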

What I essentially need to do is create a script that will launch both of these for me outside of DL, and hopefully get it to trigger.

Alternatively, if there were a way to adjust DL so that jobs with GPU affinity enabled didn’t launch tasks on the same machine at the same second, I think we’d be ok.

Location where the permissions-related files occur:

C:/Users/<user>/AppData/Roaming/Maxon/Maxon Cinema 4D <C4D Version>_x/dc/builtinrepository/

The files appearing in these error messages are the following; but again, note that the materials that are failing don’t always have Triplanar or MaxonNoise in them:

com.redshift3d.redshift4c4d.nodes.core.maxonnoise/net.maxon.asset.previewimageurl.derived
com.redshift3d.redshift4c4d.nodes.core.triplanar/net.maxon.asset.previewimageurl.derived

Found a way to trigger it some of the time outside of Deadline using this PowerShell script. Adjust the paths as needed and run it directly on a dual-GPU node. It will launch two windows of CL rendering different sets of frames.

Letellier_PowerShell_C4DCommandLine_Launcher.zip (590 Bytes)
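For anyone who can’t open the zip, here is a rough Python equivalent of the idea, purely as an illustration; the executable path, scene path, frame ranges, and Commandline.exe arguments below are placeholders, so match them to your own setup and to what your Deadline logs show for the real invocation.

    # Launch two Commandline.exe renders as close to the same instant as possible,
    # each on a different frame range, to try to force the simultaneous-access collision.
    import subprocess

    C4D_CMD = r"C:\Program Files\Maxon Cinema 4D 2025\Commandline.exe"  # adjust to your install
    SCENE = r"\\server\projects\test_scene.c4d"                         # adjust to your scene

    jobs = [
        [C4D_CMD, "-render", SCENE, "-frame", "0", "29"],   # first instance / first GPU
        [C4D_CMD, "-render", SCENE, "-frame", "30", "59"],  # second instance / second GPU
    ]

    # Start both without waiting, so they launch within milliseconds of each other.
    procs = [subprocess.Popen(args) for args in jobs]
    for p in procs:
        p.wait()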

  • Rebooting the machine beforehand seems to help some of the time.
  • Attempting to render a different scene file after a failed attempt seems to help some of the time.
  • It feels as if there’s some type of ‘cool down’ factor needed between attempts.
  • If the early parts of the two CL logs are out of sync by even a fraction of a second (enough to be visibly noticed), the issue won’t be triggered.

Overall about a 30% success rate using these techniques.

Hey Luke

I keep seeing your struggles appear in my notifications, but I couldn’t log on to reply until now.

I wish I could help you, but it seems like you’re spending a lot of time trying to solve an issue that we also spent so much time and energy trying to solve - have you tried the Royal Render demo yet? I think you can use a few machines to test the system.

I was just using it again this morning, thinking about all the terrible times I had with Deadline and all the wonderful things I can do with my time now that I don’t have to struggle with Deadline anymore.

Best
Alexis

Hey Alexis,

It’s not a Deadline issue - I chatted for a week with Tony Bexley, render wrangler at Xvivo, as they were also getting this issue while using Royal Render (see his tech support post on the RR website here).

The good news is that I think I’ve found a solution by adjusting a Deadline file. This hasn’t been fully stress tested, so it might not be a solution, but early tests are promising:

In Cinema4DBatch.py, I made the following adjustment; this only works if you’re using the Batch plugin, but I’m assuming a similar adjustment could be made to the other script. In theory, it adds a delay of 0-3 seconds to each instance of CommandLine, ensuring that they stay out of sync and don’t access the same files at the same moment.
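A minimal sketch of what that random-delay version could look like (this assumes the same spot in Cinema4DBatch.py as the GPU-ordinal snippet shown further down; the exact code used isn’t reproduced here, and the imports would normally go at the top of the file):

    # Stagger startup by a random 0-3 second delay to avoid Redshift file access conflicts.
    import random  # (better placed at the top of the file)
    import time    # (better placed at the top of the file)

    delay = random.uniform( 0, 3 )  # random delay, in seconds, per CommandLine instance
    self.Plugin.LogInfo( "Delaying startup by %.2f seconds to avoid file access conflicts." % delay )
    time.sleep( delay )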


The first technique above has a flaw: if both processes on a machine generate random numbers that are close enough together, they’ll still error out (as happened in further testing). This second technique avoids that by basing the delay on the GPU ordinal:

Essentially: Delay = GPU-Ordinal * 2 seconds
GPU0 → no delay
GPU1 → 2 Second delay
GPU2 → 4 Second delay
GPU3 → 6 Second delay

The 4-8 GPU folks will probably want to tweak that and see how low they can make the multiplier and still get away with it, but on a dual-GPU box no one will miss the 2 seconds.

I’ve sent a test scene through the farm 18 times and so far so good.

To make the adjustment, go into Cinema4DBatch.py and adjust as follows:

    # Stagger startup based on GPU ID to avoid Redshift file access conflicts.
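    # Note: time.sleep() below requires "import time" at the top of Cinema4DBatch.py, if it isn't already imported there.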
    gpusPerTask = self.Plugin.GetIntegerPluginInfoEntryWithDefault( "GPUsPerTask", 0 )
    if gpusPerTask > 0:
        # When GPUsPerTask is used, GetThreadNumber() corresponds to the task ID, which we can use
        # as a stand-in for the GPU ordinal to calculate a unique delay.
        gpuOrdinal = self.Plugin.GetThreadNumber()
        delay = gpuOrdinal * 2  # 2-second delay increment per GPU.
        
        if delay > 0:
            self.Plugin.LogInfo( "Luke is Delaying startup by %d seconds for GPU ordinal %d to avoid file access conflicts." % (delay, gpuOrdinal) )
            time.sleep( delay )

Hi Luke

Thanks for sharing the notes above!

Does this technique only work with Batch render? I normally render with the settings shown in the screen grab.

I updated Cinema4DBatch.py but nothing has changed with regard to the missing textures on some frames. When I enable the Batch plugin, all renders fail.
What am I missing here?

Peter

Hey Peter,

Yes, my adjustment only works with Cinema4DBatch.py; I hadn’t adjusted the non-batch setup yet, as we don’t use it (and when I did use it, this glitch rarely occurred).

I asked my coding expert Mr. Gemini what tweaks would be needed to adjust Cinema4D.py, and this was the result (please note: this is AI generated and has not been tested yet; however, it is from the same chat thread that generated my original result above):





“Add import time at the top of the Cinema4D.py file, for example, at line 5.”

from __future__ import absolute_import
import os
import tempfile
import time  # <<< ADD THIS LINE

from Deadline.Plugins import DeadlinePlugin, PluginType
#...

Place the delay code inside the PreRenderTasks method, right after the initial log message. Based on the file you provided, this would be at line 120.

Code Snippet:

        # Stagger startup based on GPU ID to avoid Redshift file access conflicts.
        gpusPerTask = self.GetIntegerPluginInfoEntryWithDefault( "GPUsPerTask", 0 )
        if gpusPerTask > 0:
            # When GPUsPerTask is used, GetThreadNumber() corresponds to the task ID, which we can use
            # as a stand-in for the GPU ordinal to calculate a unique delay.
            gpuOrdinal = self.GetThreadNumber()
            delay = gpuOrdinal * 2  # 2-second delay increment per GPU.
            
            if delay > 0:
                self.LogInfo( "Delaying startup by %d seconds for GPU ordinal %d to avoid file access conflicts." % (delay, gpuOrdinal) )
                time.sleep( delay )

Place it here:

# ... inside the Cinema4DPlugin class
def PreRenderTasks( self ):
    self.LogInfo("Starting Cinema 4D Task")

    # <<< INSERT THE SNIPPET HERE (at line ~120)

    self.FinishedFrameCount = 0

def RenderExecutable( self ):
#...

Hi Luke

Thank you so much for sharing this.

This helps a lot. I did some rendering over the past few days and there are way fewer missing frames. I’m using two machines, one with 4 GPUs and one with 2. Do you think I should just increase the delay number, or is there another factor since there are two machines?

Glad it’s helping! Try increasing the delay to 3 seconds and see what happens. Most likely there’s already a delay between the processes due to CPU speed and such, and at those times our artificial delay worked against us and put them in sync instead of out of it. :smiley:
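For clarity, the only change would be the multiplier in the snippet from my earlier post, e.g.:

    delay = gpuOrdinal * 3  # was * 2; 3-second delay increment per GPU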

The only negative is that the fourth GPU in that particular box (which would be GPU3) currently has a 6 second delay every task, and now would have a 9 second delay.