Softimage 2012 dropped textures

Before I detail the specifics of the problem, here’s a basic overview of our setup:

Storage - xsan volume reshared over smb via a 10.6 server
Clients - windows 7 64 pro
Nodes - windows 7 64 pro
Softimage 2012
Deadline 5.1

Instead of using mapped drives (e.g. Z:), we use UNC paths (\\server\share) for both nodes and workstations.

All machines are authenticated onto that server using the same credentials, and for good measure I have chmod’d the job folder in question.

I’ll also preface this by saying that pretty much all of our maya and nuke jobs go through fine. There are some occasions where maya drops textures, mostly tiffs; it’s not too much of a problem, though we would like to sort it out at some point.

This morning we tried to submit a scene. All our nodes picked the job up, started a task, loaded xsibatch and the scene, but failed on reading some textures. It complained that the texture wasn’t in the specified place (it is). Some nodes rendered their tasks fine, but the majority errored (ERROR: 2000) and re-queued themselves.

After verifying that the textures were there and had the correct permissions, I set the error checking to false in the repository / plugins section. The tasks again submitted fine and didn’t error, but dropped loads of textures. Some textures were ok in some frames and missing in others, with nothing consistent.
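For anyone repeating the "are the textures really there" check from a render node, here is a minimal sketch that opens each file and reads a small chunk instead of only listing the folder. A file can appear in a directory listing and still fail to open, so an actual read is a stricter test. The function name and the probe size are illustrative assumptions, not part of Softimage or Deadline.

```python
def check_textures(paths, probe_bytes=4096):
    """Try to open each texture and read a small chunk.

    Returns a dict mapping each path that failed to a short error string.
    An empty dict means every file could be opened and read.
    """
    failures = {}
    for path in paths:
        try:
            # Opening in binary mode and reading forces a real file access,
            # which a directory listing or existence check does not.
            with open(path, "rb") as handle:
                handle.read(probe_bytes)
        except OSError as exc:
            failures[path] = f"{type(exc).__name__}: {exc}"
    return failures
```

Running this on several nodes at the same time, against the same list of texture paths, would show whether the failures follow particular files or particular machines.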

We fiddled around with loads of settings in soft and deadline, image handling type, scripting language prefs, submit scene option in deadline submission, but none made any difference. I’ve tried submitting the scene via deadline monitor also, same deal.

We merged the soft scene into a new project, same thing. Also checked the max file length for windows; we weren’t even close to it.

As a separate test, I submitted the scene to the nodes using the xsibatch commandline. This seemed a bit more stable, but again dropped frames often.

We also created a new scene, new project, with all the same textures applied to separate cubes, again it dropped textures.

Submitted scene as suspended, closed soft to get rid of the lock file and resumed it in deadline, no difference.

As a final test, I submitted the job, but limited the number of machines it could render on to 2. Although slow, it didn’t drop any frames at all. We thought we had hit on the problem, perhaps some weird network glitch, but remember that we can submit maya and nuke jobs to all 35 nodes without issue.

I’m sure there are other things that we have tried that I cannot remember, but at this point I’m kinda out of ideas, so any suggestions would be most welcome!!

Maybe it’s a network issue that only Softimage is sensitive to. You did say that you see it on occasion with Maya, so maybe Maya is just better at handling the situation…

Do you have an extra server that you could try uploading some textures? You could then create identical test scenes (one that uses textures from your xsan, and the other from the test server) to see if both exhibit the same behavior when rendering on the farm. Maybe it’s a server issue, and this test might help confirm/deny that.

Also, are you able to upload a log from a job where the textures were dropped? You can find the log by right-clicking on the job in the Monitor and selecting Job Reports → View Log Reports. Maybe it will contain some information that could help explain the problem.

Thanks!

  • Ryan

additionally, what kind of textures are they? tif, exr, jpg etc or a mix?

i was just with a client who has a recurring problem with a number of applications using tif - it appears that some tiff library that [some] developers use times out on the network, or that was the gist of their problem more or less. just curious if it could be something like that. if this is the case, then you might want to use staggered loading in deadline
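To illustrate the idea behind staggered loading, here is a minimal sketch: each node waits a little longer than the previous one before it starts reading from the server, so the texture reads don't all arrive at once. This is not Deadline's own implementation; the function name, the interval and the cap are assumptions made up for the example.

```python
def stagger_delay(node_index, interval_seconds=10.0, max_delay_seconds=300.0):
    """Return how many seconds node number `node_index` (0-based) should wait.

    Node 0 starts immediately, node 1 after one interval, and so on,
    capped at `max_delay_seconds` so late nodes are not held back forever.
    """
    if node_index < 0:
        raise ValueError("node_index must be non-negative")
    return min(node_index * interval_seconds, max_delay_seconds)


# Delays for a farm of 35 nodes using the illustrative defaults above.
delays = [stagger_delay(i) for i in range(35)]
```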

cb

Yeah, we have had problems with tiffs in the past. In this case we have a mix of jpeg, tif, png, exr and hdr. Tiffs aren’t the only textures that soft seems to be dropping though.

I’ve also set the pulse settings>Throttling to only copy the job to 2 nodes at a time to try and reduce the strain on the network, but there was no difference.

Attached are the logs from the most recent job submission.

I wanted to connect everything via nfs, but it seems that windows 7 pro doesn’t include an nfs option, only server, enterprise and ultimate. I’m just in the middle of setting up a windows 2008 r2 server which I’ll connect via nfs to the san and use that to share out over smb.
metro_logs.zip (279 KB)

Thanks for the logs. Yeah, there really isn’t any additional info in them that’s useful. I googled that error message, and it looks like a standard mental ray error. I couldn’t find any specifics regarding Softimage.

Let us know how the Windows 2008 server tests go!

Cheers,

  • Ryan

Just a quick update on this thread. It seems that the problem was due to oplocks on various textures. To cut a long story short, we had a series of power cuts, which seemed to have put false oplocks on texture files that were open on workstations that were connected via smb at the time of the power cut.

After disabling oplocks and strict locking on the xserve sharing out via smb and restarting the service, the locks disappeared and we are now able to render the scene across all the farm.
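For anyone wanting to check the same settings on a Samba-based server, here is a minimal sketch that scans smb.conf-style text and reports where the locking parameters are set. `oplocks`, `level2 oplocks` and `strict locking` are standard Samba parameter names; whether a given server exposes them through a hand-edited config file or only through an admin tool is an assumption to verify, and the share name and path in the sample are hypothetical.

```python
LOCK_KEYS = ("oplocks", "level2 oplocks", "strict locking")


def find_lock_settings(conf_text):
    """Return {section: {key: value}} for locking parameters in smb.conf-style text."""
    found = {}
    section = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments (smb.conf allows both '#' and ';').
        if not line or line[0] in "#;":
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        if section is None or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Normalise case and inner whitespace before comparing key names.
        key = " ".join(key.lower().split())
        if key in LOCK_KEYS:
            found.setdefault(section, {})[key] = value.strip()
    return found


# Hypothetical sample; the share name and path are made up for illustration.
sample = """
[global]
   oplocks = no
   level2 oplocks = no
[jobs]
   ; hypothetical share
   strict locking = no
   path = /Volumes/xsan/jobs
"""
settings = find_lock_settings(sample)
```

A parameter that is absent from the result is simply not set in the text, which means the server falls back to its default for that parameter.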

So, after a week or two of pain, my advice is to not use oplocks, especially if you are accessing those files via other protocols.