Redshift Error: ASSERT FAILED, CFile::Close() ... failed to close file handle. Error: 5

Hey all!

We’re running into an occasional issue when rendering C4D jobs with Redshift. Frames usually render fine, until at some point the job just starts spitting out a black .tif file for every frame.

When checking the logs for those jobs, we see an error that says:

2024-02-22 19:32:49:  0: STDOUT: Redshift Error: ======================================================================================================
2024-02-22 19:32:49:  0: STDOUT: ASSERT FAILED
2024-02-22 19:32:49:  0: STDOUT: File /Users/redshift/redshiftP4/branches/35_release/SourceCode/BaseLib/Common/File.cpp
2024-02-22 19:32:49:  0: STDOUT: Line 2164
2024-02-22 19:32:49:  0: STDOUT: CFile::Close() for file '/Volumes/xxxxxxxxx/xxxxxxxx/xxxxxxxxxxxxx/xxxxxxxxxxx/xxxxxxxxxxxxxxxxxxx/xxx/xxxxxxxxxxxxxxxxxxxxx.jpg' failed to close file handle. Error: 5
2024-02-22 19:32:49:  0: STDOUT: ======================================================================================================
2024-02-22 19:32:49:  0: STDOUT: Redshift Debug: Call stack from /Users/redshift/redshiftP4/branches/35_release/SourceCode/BaseLib/Common/File.cpp:2164:
2024-02-22 19:32:49:  0: STDOUT: Redshift Debug:     1   libredshift-core.dylib              0x000000014907878f _Z22RS_TextRenderer_DeleteP14RSTextRenderer + 594063
2024-02-22 19:32:49:  0: STDOUT: Redshift Debug:     2   libredshift-core.dylib              0x000000014907b79a _Z22RS_TextRenderer_DeleteP14RSTextRenderer + 606362
2024-02-22 19:32:49:  0: STDOUT: Redshift Debug:     3   libredshift-core.dylib              0x0000000148a3324a _Z15RS_Denoise_OIDNjjiPfiPKfiS1_iS1_P18RSProgressReporter + 38922
2024-02-22 19:32:49:  0: STDOUT: Redshift Debug:     4   libredshift-core.dylib              0x0000000148a34ade _Z15RS_Denoise_OIDNjjiPfiPKfiS1_iS1_P18RSProgressReporter + 45214
2024-02-22 19:32:49:  0: STDOUT: Redshift Debug:     5   libredshift-core.dylib              0x000000014908623c _Z22RS_TextRenderer_DeleteP14RSTextRenderer + 650044
2024-02-22 19:32:49:  0: STDOUT: Redshift Debug:     6   libsystem_pthread.dylib             0x00007ff811db01d3 _pthread_start + 125
2024-02-22 19:32:49:  0: STDOUT: Redshift Debug:     7   libsystem_pthread.dylib             0x00007ff811dabbd3 thread_start + 15
2024-02-22 19:32:49:  0: STDOUT: libc++abi: terminating due to uncaught exception of type char*
2024-02-22 19:32:49:  0: STDOUT: ======================================================================================================
2024-02-22 19:32:49:  0: STDOUT: ASSERT FAILED
2024-02-22 19:32:49:  0: STDOUT: File /Users/redshift/redshiftP4/branches/35_release/SourceCode/BaseLib/Common/File.cpp
2024-02-22 19:32:49:  0: STDOUT: Line 2164
2024-02-22 19:32:49:  0: STDOUT: CFile::Close() for file '/Volumes/xxxxxxxxx/xxxxxxxx/xxxxxxxxxxxxx/xxxxxxxxxxx/xxxxxxxxxxxxxxxxxxx/xxx/xxxxxxxxxxxxxxxxxxxxx.jpg' failed to close file handle. Error: 5
2024-02-22 19:32:49:  0: STDOUT: ======================================================================================================
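One thing I noticed: the "Error: 5" at the end of the assert looks like it could just be a POSIX errno value (that's only my assumption, the log doesn't actually say). Decoding it on one of the Mac workers gives "Input/output error", which would point at something going wrong at the filesystem/network level rather than in the scene itself:

```python
# Assuming "Error: 5" is a plain POSIX errno (not confirmed by the log).
import errno
import os

print(errno.errorcode[5])  # EIO
print(os.strerror(5))      # Input/output error
```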

I’m running through some tests right now to try and glean more information, but I was wondering if anyone has ideas based on the error and the behavior so far.

Also, if there’s anything that would be helpful for me to check or share, I’m definitely open to ideas. :slightly_smiling_face:

Given you’re getting a black frame back, maybe it’s having issues reading textures from the file server? Are there any issues on the file server around the time these reads fail?

If you can trigger the error with the troubleshooting guide, you could start isolating it from Deadline, but given it’s random you may need Deadline to help out in testing.
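If you want a quick way to rule the file server in or out, something rough like this run straight on a worker (the root path is a placeholder for your texture/output share) would at least show whether plain reads ever fail outside of Redshift:

```python
# Rough sketch: stream-read every file under a share from a worker and
# report any I/O errors. The root path below is a placeholder.
import os

root = "/Volumes/your_file_server/project"  # placeholder path

for dirpath, _, filenames in os.walk(root):
    for name in filenames:
        path = os.path.join(dirpath, name)
        try:
            with open(path, "rb") as f:
                while f.read(1024 * 1024):  # read in 1 MiB chunks
                    pass
        except OSError as e:
            print(f"Read failed for {path}: errno {e.errno} ({e.strerror})")
```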

Hey Justin!

Given you’re getting a black frame back, maybe it’s having issues reading textures from the file server? Are there any issues on the file server around the time these reads fail?

No apparent issues. The file server has been under a relatively low load, so it’s not like it was getting slammed or anything.

I’ve been trying to find a way to reliably get this error to trigger again, but of course it doesn’t seem very straightforward. I let the job render on two workers overnight and there were no issues whatsoever.

However, when I added the rest of the workers this afternoon (only six workers total), I got another ASSERT FAILED error. It’s different from the first one, and it happened towards the end of the render task.

2024-02-23 14:05:26:  0: STDOUT: Redshift Error: ======================================================================================================
2024-02-23 14:05:26:  0: STDOUT: ASSERT FAILED
2024-02-23 14:05:26:  0: STDOUT: File /Users/redshift/redshiftP4/branches/35_release/SourceCode/Renderer/LargeMemPool.cpp
2024-02-23 14:05:26:  0: STDOUT: Line 246
2024-02-23 14:05:26:  0: STDOUT: Error while getting alloc length. PageID out of range (was 16383, m_numPages: 1)
2024-02-23 14:05:26:  0: STDOUT: ======================================================================================================
2024-02-23 14:05:26:  0: STDOUT: Redshift Debug: Call stack from /Users/redshift/redshiftP4/branches/35_release/SourceCode/Renderer/LargeMemPool.cpp:246:
2024-02-23 14:05:26:  0: STDOUT: Redshift Debug:     1   libredshift-core.dylib              0x00000001525bb0f3 _Z15RS_Denoise_OIDNjjiPfiPKfiS1_iS1_P18RSProgressReporter + 9907
2024-02-23 14:05:26:  0: STDOUT: Redshift Debug:     2   libredshift-core.dylib              0x00000001527f62d4 _Z35RS_Denoise_OptiXNeedsARenderRestartv + 1208484
2024-02-23 14:05:26:  0: STDOUT: Redshift Debug:     3   libredshift-core.dylib              0x000000015283c21a _Z35RS_Denoise_OptiXNeedsARenderRestartv + 1495018
2024-02-23 14:05:26:  0: STDOUT: Redshift Debug:     4   libredshift-core.dylib              0x00000001528dae5e _Z35RS_Denoise_OptiXNeedsARenderRestartv + 2145326
2024-02-23 14:05:26:  0: STDOUT: Redshift Debug:     5   libredshift-core.dylib              0x0000000152573667 _Z25RS_Denoise_Altus_DualPassjjiPfiS_S_iS_S_iS_S_iS_S_ffffb + 286167
2024-02-23 14:05:26:  0: STDOUT: Redshift Debug:     6   libredshift-core.dylib              0x0000000152575eb9 _Z25RS_Denoise_Altus_DualPassjjiPfiS_S_iS_S_iS_S_iS_S_ffffb + 296489
2024-02-23 14:05:26:  0: STDOUT: Redshift Debug:     7   libredshift-core.dylib              0x00000001526156a1 _Z15RS_Denoise_OIDNjjiPfiPKfiS1_iS1_P18RSProgressReporter + 380001
2024-02-23 14:05:26:  0: STDOUT: Redshift Debug:     8   libredshift-core.dylib              0x000000015262ddd6 _Z15RS_Denoise_OIDNjjiPfiPKfiS1_iS1_P18RSProgressReporter + 480150
2024-02-23 14:05:26:  0: STDOUT: Redshift Debug:     9   libredshift-core.dylib              0x00000001528b35f3 _Z35RS_Denoise_OptiXNeedsARenderRestartv + 1983427
2024-02-23 14:05:26:  0: STDOUT: Redshift Debug:     10  libredshift-core.dylib              0x00000001528b7fca _Z35RS_Denoise_OptiXNeedsARenderRestartv + 2002330
2024-02-23 14:05:26:  0: STDOUT: Redshift Debug:     11  libredshift-core.dylib              0x00000001528eae58 _Z38RS_Renderer_DisableCUDAShutdownAssertsv + 10024
2024-02-23 14:05:26:  0: STDOUT: Redshift Debug:     12  libredshift-core.dylib              0x0000000152922130 _Z41RS_Renderer_GetShaderDebuggingModeEnabledv + 104112
2024-02-23 14:05:26:  0: STDOUT: Redshift Debug:     13  libredshift-core.dylib              0x0000000152c1523c _Z22RS_TextRenderer_DeleteP14RSTextRenderer + 650044
2024-02-23 14:05:26:  0: STDOUT: Redshift Debug:     14  libsystem_pthread.dylib             0x00007ff80658f1d3 _pthread_start + 125
2024-02-23 14:05:26:  0: STDOUT: Redshift Debug:     15  libsystem_pthread.dylib             0x00007ff80658abd3 thread_start + 15
2024-02-23 14:05:26:  0: STDOUT: libc++abi: terminating due to uncaught exception of type char*
2024-02-23 14:05:26:  0: STDOUT: ======================================================================================================
2024-02-23 14:05:26:  0: STDOUT: ASSERT FAILED
2024-02-23 14:05:26:  0: STDOUT: File /Users/redshift/redshiftP4/branches/35_release/SourceCode/Renderer/LargeMemPool.cpp
2024-02-23 14:05:26:  0: STDOUT: Line 246
2024-02-23 14:05:26:  0: STDOUT: Error while getting alloc length. PageID out of range (was 16383, m_numPages: 1)
2024-02-23 14:05:26:  0: STDOUT: ======================================================================================================

You can see it was rendering up to frame 231/240 when it cacked. :sob:

It’s weird that the error messages mention /Users/redshift, because that’s not a user or folder that exists on our end. It feels like that’s a relic of the Redshift code or something?

And this particular error makes it sound like there was an issue related to memory allocation or something. Could that result in ASSERT errors?
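The way I’m picturing it (just my rough mental model, not the actual Redshift code) is that the assert is basically a hard bounds check that fires when the memory pool’s bookkeeping no longer adds up, something like:

```python
# Rough mental model only -- not Redshift's actual code.
def get_alloc_length(pages, page_id):
    # If the page ID points past what the pool actually allocated, there is
    # no sane length to return, so the renderer asserts and bails out.
    assert 0 <= page_id < len(pages), (
        f"Error while getting alloc length. PageID out of range "
        f"(was {page_id}, m_numPages: {len(pages)})"
    )
    return len(pages[page_id])
```

If that’s roughly the shape of it, then a memory bookkeeping problem showing up as an ASSERT (rather than a normal render error) would make sense.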

So we have inconsistent errors, and inconsistent repeatability. Our favorite kinds of problems to tackle, right? :grin:


I think you’re right; this does look like the path to the file Redshift was built against. I’d see what Redshift support thinks as well!

At the bottom of task reports we show what the peak RAM usage was; maybe that’s what’s happening? That is, the .jpg is being shoved into RAM and there’s no room, but that only happens after multiple renders where RAM isn’t released in time, resulting in random failures?

All guesses, but if we can figure out the pattern we can figure the rest out.
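If it helps with spotting the pattern, a rough sketch like this (assuming psutil is available on the workers; the command is a placeholder for the actual render) would log RAM use while a frame renders, so you can compare it against the peak RAM in the task reports and see whether memory creeps up across consecutive frames:

```python
# Rough sketch: poll system RAM while a render command runs, so peak usage
# can be compared against the Deadline task report. Assumes psutil is
# installed; the command below is a placeholder for the real render.
import subprocess
import time

import psutil

proc = subprocess.Popen(["sleep", "30"])  # placeholder render command
peak_used = 0
while proc.poll() is None:
    mem = psutil.virtual_memory()
    peak_used = max(peak_used, mem.used)
    print(f"used={mem.used / 2**30:.1f} GiB  available={mem.available / 2**30:.1f} GiB")
    time.sleep(5)

print(f"peak used: {peak_used / 2**30:.1f} GiB")
```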