AWS Thinkbox Discussion Forums

Deadline and Network Gurus - 3dsmax2012 crashing on the rend

I’ll get straight to the point.

3dsmax 2012 Sp1 with both hotfixes on workstation and on farm
deadline 5
fumefx.2.1 and fumefx2.1sl on farm

I am trying to render a large 3dsmax scene file. When I submit it to the farm, the render goes fine until it starts loading the FumeFX atmospherics, which are about 800 MB each. Then 3dsmax crashes, and that frame fails. Now here’s where the confusing part comes in. When I resubmit the job to just one machine, it renders out okay, but when it’s across the 20 render machines it errors about 500 times. Also, when I resubmitted the job without atmospherics enabled, the render went fine. I have tried disabling the Deadline setting where it closes 3dsmax between frames; that didn’t help.

I cannot determine where the problem is, and rendering such a large file on just one machine isn’t an option I want to proceed with because it takes all day.

Our network is built to handle this kind of workload. We have an Isilon cluster which holds all of the files that are being referenced, with all the machines networked through 2 ProCurve switches with 9k jumbo frames enabled. I had a local company, GPL, help me set everything up, and so far everything has been very fast without any downtime.

Maybe it’s a problem with deadline? A few months ago I was having another 3dsmax plugin problem that was resolved with a repository amendment.

If you have any questions or suggestions, please feel free to add your input. My guess is that Deadline 5 has a bug with this particular situation. Or simply, something might be wrong in the network, so that when all the machines try to reference this 800 MB FumeFX file, 3dsmax crashes.

UPDATE: So I duplicated the job 7 times and assigned each copy to a separate machine on a different frame range. After submitting, the machines once again errored out, which leads me to believe it has something to do with accessing that FumeFX file.

Interesting. Could you post a sample error log, so we can see what the error looks like? Also, when you tried splitting the job, did you try to stagger the start of each one, so that each slave wasn’t trying to load the resources all at the same time? The only thing I could think of being an issue from the Deadline side of things is multiple slaves trying to load the same large resource at once; though having it sitting on an Isilon cluster with the proper network setup really should be preventing that from being a problem…

Cheers,

  • Jon
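
To make the staggering idea concrete, here is a minimal sketch of how start offsets could be worked out so that each slave begins loading the sim data at a different time. The machine names and the 60-second gap are placeholders, and this does not use any Deadline setting or API; it only shows the arithmetic.

```python
def staggered_delays(machines, gap_seconds):
    """Map each machine name to a start delay so no two begin loading together."""
    return {name: index * gap_seconds for index, name in enumerate(machines)}


if __name__ == "__main__":
    # Hypothetical slave names and gap; substitute the real farm values.
    delays = staggered_delays(["render01", "render02", "render03"], 60)
    for name, delay in delays.items():
        print(f"{name}: start after {delay}s")
```

If the errors disappear once the starts are spread out, that would point at contention on the shared file rather than at the scene itself.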

Hi,
Do you have a backup of the FumeFX sim files which you could compare against, to check that the sim files are still the same? I’m thinking that when 3dsMax went down it may have corrupted the sim file(s) at the same time.
Just checking, but you do have the FumeFX licensing correctly set up to support simulating on a farm? (This is a different license from network rendering with MR, etc.)
HTH,
Mike
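
One way to do the backup comparison is to hash each file in both locations and list the ones that differ. This is only a sketch: the directory paths and the `*.fxd` pattern in the usage note are assumptions, not something taken from this thread, and hashing 800 MB files over the network will take a while.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mismatched_files(live_dir, backup_dir, pattern="*"):
    """List names of backup files that are missing from or differ in live_dir."""
    mismatched = []
    for backup_file in sorted(Path(backup_dir).glob(pattern)):
        if not backup_file.is_file():
            continue
        live_file = Path(live_dir) / backup_file.name
        if not live_file.is_file() or file_digest(live_file) != file_digest(backup_file):
            mismatched.append(backup_file.name)
    return mismatched
```

Usage would be something like `mismatched_files(r"S:\sims\shot01", r"B:\backup\shot01", "*.fxd")`, with both paths being hypothetical. An empty list means every backup file has an identical copy in the live folder.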

Here’s a link to the Deadline Error. pastebin.com/XwxmeHnH

I have just tried running just the fumefx grid lines across all the machines and they are rendering fine. I am also in the process of copying over the sim data to our old fiber raid to see if that makes any difference.

@Mike we have 4 machines dedicated as our sim machines with proper sim licenses like you were noting. The sim has already been created though.

It has something to do with it possibly interacting with the rest of the scene.

I figured it out.

It was a permissions issue. Apparently, the /render roaming profile accounts didn’t have write privileges to the sims drive folders. After I changed that, everything is going smoothly. What I cannot figure out is why it worked when I did one machine at a time.

thanks for the input guys.

Cheers,
Dax
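
For anyone chasing a similar permissions problem, a quick check is to try creating a file in the sims folder while logged in as the render account on each machine. The sketch below assumes nothing about Deadline or FumeFX; the folder path in the usage note is hypothetical.

```python
import os
import tempfile


def can_write(directory):
    """Return True if the current account can create and delete a file in directory."""
    try:
        # mkstemp raises an OSError subclass if the folder is missing or not writable.
        handle, path = tempfile.mkstemp(prefix="write_test_", dir=directory)
    except OSError:
        return False
    os.close(handle)
    os.remove(path)
    return True
```

Running `can_write(r"S:\sims")` (a made-up path) on each slave under the account the slave runs as should return `True` everywhere; a `False` on some machines would match the kind of partial failure described above.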

Good to hear you got it working!

Cheers,

  • Jon

Cool. Pleased to hear it’s all sorted.
We stopped using roaming profiles as they were quite unreliable. Perhaps this was the original cause of the issue?
Regards,
Mike

In what regards have you found them to be unreliable? (Is there something I should watch out for?)

I am enjoying the centralization of Active Directory a lot, but these types of permission problems seem to be happening more often than I would like. So in the context of being problematic, I would agree. The roaming profiles also seem to have syncing problems when it comes to Mac OS X.

Okay guys, so I submitted the job again last night and the same error has come back. I am lost for ideas at this point.

Hi,
When you updated the /render user profile permissions for the sims directory, did you force any machines that were logged in under this profile to restart, to ensure they all received the latest credentials? Syncing roaming profile privileges, especially across different OS platforms, is the kind of thing where I used to see strange things happen. Also, I remember my guys wanted to do silly things like dump 20GB to the roaming profile desktop and then wonder why it was taking ages to log in to another machine with this roaming profile! I understand their usefulness, but I experienced nothing but pain with them. It might be worth disabling this roaming profile on your network to see if your setup stabilises.
HTH,
Mike

It sounds a lot like a file is being written to and others can’t access it.

Running the job through 3DS Command via the Monitor will allow you to find out what arguments Deadline’s using so you can run it yourself. Then view the log to pull the executable and arguments.

If you run a job through, do ANY of the machines actually get a job done (in other words, do the job report numbers equal the error report numbers)? If it’s a file in use, the first renderer will get there and at least get something done.

I have narrowed the problem down now to afterburn. It’s a plugin from the same company that makes fumefx, but with afterburn there are no alternate versions.

Hmm… With roaming profiles, where is the temp folder located? Normally it would be under the current user’s profile. Might they be sharing the same network location? I guess the fastest way to find out would be to type %temp% into the run box.
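
The same temp folder check can be scripted so it is easy to run on every slave. This is a rough sketch: it only spots UNC-style paths (starting with two slashes), so a temp folder on a mapped drive letter would not be flagged.

```python
import tempfile


def looks_like_unc(path):
    """Return True if path starts like a UNC network path (\\\\server\\share)."""
    return path.startswith("\\\\") or path.startswith("//")


if __name__ == "__main__":
    temp_dir = tempfile.gettempdir()  # resolves TMPDIR/TEMP/TMP for the current account
    print(f"temp folder: {temp_dir}")
    print(f"looks like a network share: {looks_like_unc(temp_dir)}")
```

If two slaves report the same network location here, they could be overwriting each other’s temporary files.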

I think most of them restarted last night, but I will make sure to do that. I could try and disable roaming profiles on a couple.

C:\Users\RENDER~1.FUS\AppData\Local\Temp is what I get. I believe that’s a local temp folder for the account; each of the machines is configured the same way.

What’s weird is I can render Afterburn and FumeFX together without the rest of the scene in place just fine. It’s when I combine it all together with geometry that it starts erroring.

Hi,
Could you confirm the exact version of both FumeFX and Afterburn you have in place for Max 2012?
ie: v2.1c for FumeFX
Also, I don’t see anywhere that you have mentioned what flavour of Windows you are running. If it’s Windows 7, have you by any chance recently deployed Windows 7 SP1? This causes issues with slightly older versions of FumeFX, making the MR component of the FumeFX dll crash 3dsMax. It might also be causing a similar issue for Afterburn (I don’t use Afterburn, so this is just a guess). Finally, have you posted to the Fume/Afterburn support forum? They may be able to narrow this issue down.
Mike

Hey Mike,

FumeFX is FXSL21R2012 on the farm and on the workstations it’s FumeFX21R2012 not SL.

I am running Windows 7 Service Pack 1 on the machines. I remember running into that problem where FumeFX wouldn’t work with SP1, but I thought this new version of FumeFX fixed it.

Afterburn is Afterburn40dR2012.
