Max 2012 submission error with jobs over 100 frames

We are getting a strange permissions error when our artists try to submit a job with more than 100 frames of animation.

Error: Access to the path '\\ttuc-3d01\deadlinerepository\temp\000_070_999_61f52e37' is denied. (System.IO.IOException)

This only happens if the animation is over 100 frames long and the Time Output in the render settings is set to Range.
If Active Time Segment is selected and set to the desired frames, the job will submit, but only up to 600 frames; beyond that it fails. The only workaround we have found is to submit the job manually through Deadline.
We are running Deadline 5 with Max 2012 SP2. Any help is appreciated, thanks.

It seems the problem affects manual submissions as well. Starting with a fresh scene, it submits successfully, but if the output path is changed and the scene is resubmitted, it fails with the same error.
This is only reliably repeatable on one user's station, and intermittent on the other animators' workstations.
We do not think this is related to read/write permissions, as the repository machine allows everyone access to everything.
We created a new repository from scratch to replace the old one and the issue persists.
We checked to make sure the destination drives are not full.
His log file is attached.
SubmitMaxToDeadline - [ZTURA049274] - 10-3-2011-0000.log (433 KB)

Which OS do you have the repository installed on? It sounds like a connectivity problem with the repository…

Thanks!

  • Ryan

We have Win7 64-bit workstations and the repository is on Windows Server 2008.

The user can consistently browse to the share on the repository; he has full admin rights to all the folders on the repository and can access them all from the workstation.

The failures seem to be related to the Range vs. Active Time Segment selection and to frame counts above and below 400.
I am documenting submission tests to give you the success and failure conditions related to these two variables; I will send this as soon as the user has run through the permutations, to create a clearer picture of the situation.

We are thoroughly stumped.

Here are the conditions. Only the Active Time Segment / Range option and the frame counts were changed; everything else was kept consistent.

New Max scene
Default startup scene

active time 0-800: works
range 0-100: works
single frame: works
active time 0-800: fails
range 0-100: works
active time 0-800: fails
active time 0-200: works
active time 0-500: works
range 0-800: works
active time 0-600: fails
single frame: works
range 0-600: fails
range 0-300: fails
single frame: works
range 0-100: works
range 0-200: works
range 0-300: fails
range 0-299: works
active time 0-100: works
active time 0-200: works
active time 0-300: fails
active time 0-200: works
active time 0-299: fails

Thanks for the info. We just wanted to make sure you weren’t hitting the Windows connection limitation issue, and since you’re running Server 2008, that wouldn’t be the case.

Based on your test results, I think it’s safe to assume the problem is pretty random. The way the job is submitted is the same regardless of how the frames are specified, so it’s probably a more general issue. Is this something that recently started occurring, or have you been experiencing this problem for a while? Have you tried rebooting the repository machine to see if that helps?

The problem started last Thursday. As far as I know, nothing was done to the render farm or the workstation other than everyday use. The repository machine has been rebooted since then. We have uninstalled and reinstalled Max and all its associated plug-ins and service packs, as well as Deadline on the workstation. Nothing seems to help. I am not sure what we can try next. Thanks for your help.

Thanks for the additional info. Here are a few things you can check:

  1. Were any Windows updates applied to the Repository machine, or the workstations?
  2. What does the disk space look like on the Repository machine? We’ve seen strange things happen when it gets to capacity.

Also, if you have another server available, it might be worth setting up a test repository and connecting a few slaves/workstations to it to see if the problem persists. Maybe the issue is hardware related; if things work fine on a different server, that would help confirm it.

As far as we know, there were no updates that corresponded with the issue. The repository machine was at capacity; we cleaned it out and rebuilt the repository, but the issue persists. There seems to be a correlation between the number of frames and the failures. Is there any reason a larger frame count should fail more frequently? Are jobs with larger frame counts written and sent differently? Any help is much appreciated, thanks.

There is no difference in how the jobs are submitted, other than that a job with more frames needs to create more task files. When a job is being submitted, a temp folder is created in the repository and the job XML file and other files are written there; the task files are also created during this process. When finished, the job is then moved to the main jobs folder in the repository. So a job with more task files needs access to that temp folder for longer, because it has to create more files.

So it would appear that at some point during the task file creation, access to that temp folder gets denied. Jobs with larger frame counts would be more susceptible to the problem simply because there is a larger window for the issue to occur.
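To make that concrete, here is a rough sketch of that flow in C#. This is not the actual Deadline source; the repository path, folder names, and frame count below are made up purely for illustration:

using System.IO;

class SubmissionSketch
{
    static void Main()
    {
        string repo = @"\\server\deadlinerepository";  // hypothetical repository root
        string jobId = "000_070_999_xxxxxxxx";         // hypothetical temp folder name
        string tempDir = Path.Combine(repo, "temp", jobId);
        string jobsDir = Path.Combine(repo, "jobs", jobId);

        // 1. The temp folder is created and the job file and other submission files
        //    are written into it.
        Directory.CreateDirectory(tempDir);
        File.WriteAllText(Path.Combine(tempDir, "job.xml"), "<job/>");

        // 2. One task file is created per task, so a job with more frames keeps this
        //    folder busy for longer than a single-frame job does.
        for (int task = 0; task < 1000; task++)
            File.WriteAllText(Path.Combine(tempDir, task.ToString("0000") + ".task"), "");

        // 3. When everything is written, the folder is moved into the main jobs folder
        //    (created here only so the sketch runs; it already exists in a real
        //    repository). If access to the temp folder is interrupted during step 2
        //    or step 3, the result is an IOException like "Access to the path '...'
        //    is denied."
        Directory.CreateDirectory(Path.Combine(repo, "jobs"));
        Directory.Move(tempDir, jobsDir);
    }
}

The point is simply that steps 1 and 2 grow with the frame count, so a bigger job spends more time in the temp folder before the final move.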

So the question is, what could be causing this interruption? Do you have any processes running on your server (like firewall or antivirus) that walk the repository and check files?

We just got off a conference call with our “extensive” IT experts trying to diagnose this IO error.
The networking and OS guys are saying it's an application issue where a temp file is being written to and read from simultaneously. The log files on the server do not correlate with the error the submitting machine is reporting; the server does not see this as an error.

They would like to set up a conference call with someone who is more of an expert than I am to discuss how to solve this.

My desk phone is 520-794-2518 and my email is troy_wuelfing@raytheon.com
my cell is 520-425-1768

Please get in touch with me and let us know how to set up a time to call.
Also, if we have to pay for this support, I am sure that would be fine; just let me know what we need to do.

More information:

We built a new repository on a different machine we had lying around; it's running Windows Server 2003.
It's a different OS and only has two slaves connected (we have no license file for this one), but we don't think that matters.

We immediately got the exact same error on the new hardware:

Error: Access to the path '\\ttuc-rc2\deadlinerepository\temp\000_070_999_XXXXXXXXX' is denied. (System.IO.IOException)

Our guys don't think it's a network issue, and I don't know enough to tell them what could be wrong with the network to cause this.
It seems reasonable that we have eliminated the OS as the problem, since we reproduced the error on two different machines.
The only other constants are the software and the network. Nothing was changed in the software, and I have no idea what the network infrastructure looks like, as this company is huge (20,000 people) and IT deals with all of that for us.

Is there perhaps a beta of 5.1 we can try so we can generate more evidence?
Can you propose what Deadline may not like about our network, so I can have them look into it?
Also, they still want a conference call.

Thanks,

Here's what we'll do then. We'll try to build a custom deadlinecommand app from our 5.1 beta branch that prints out a bunch of debug messages while submitting the job, and we'll also include a batch file that can be used to invoke a Max submission. Hopefully this will help us pinpoint the exact location where the error occurs. If it consistently occurs in the same spot, then I can't imagine it would be difficult to find a workaround in time for the 5.1 release.

We’ll also get our tech guy to give you a call today.

Cheers,

  • Ryan

Attached is a custom deadlinecommand.exe bundle that you can use to try to reproduce the problem. Just unzip the attached file to the desktop on a workstation where the problem usually occurs. Then open the Release folder that gets extracted and run the _submit.bat file. This will submit a Max 2012 job with 1000 frames to Deadline, so I would expect it to reproduce the problem on a pretty regular basis.

This build of deadlinecommand.exe will include the stack trace if an error occurs, so take a screen shot of the console window and post it when it does.

Cheers,

  • Ryan

Ryan,
Here is the screen shot.

Thanks for testing that. Now it would be interesting to see if any files in that folder are open. You can do this from the server machine. From the Start menu, right-click on Computer and select Manage. In the panel on the left, expand System Tools and then expand Shared Folders. If you click on Open Files, it should show if any files are open (see screenshot).

If anything in that particular temp folder is still open, let us know which file(s) specifically.

Thanks!

  • Ryan

Here is a text version of a subsequent error:

C:\Users\vaa3622\Desktop\Release>".\deadlinecommand.exe" "max_submit_info.job" "max_job_info.job" "mental_ray_test.max"
Deadline Command 5.1 [v5.1.0.45496 R]

Submitting to Repository: \\ttuc-3d01\deadlinerepository

Submission Contains the Following:

  1. Auxiliary File #1 ("max_job_info.job")
  2. Auxiliary File #2 ("mental_ray_test.max")

JOB WARNINGS:
Contact settings have not been set for user ryan. This can be done in Deadline Monitor options.

Error: Access to the path '\\ttuc-3d01\deadlinerepository\temp\999_050_999_3b2654f8' is denied. (System.IO.IOException)
   at System.IO.Directory.Move(String sourceDirName, String destDirName)
   at Deadline.Storage.JobStorage.SubmitJob(Job job, Task[] tasks, String[] auxiliarySubmissionFilenamesArray, String alternativeJobDirectory, String alternativeAuxiliaryDirectory) in C:\Users\Ryan\Development\DeadlineProject\DeadlineProject\Deadline\Storage\JobStorage.cs:line 511
   at Deadline.Controllers.DeadlineController.SubmitJob(Job job, Task[] tasks, String[] auxiliarySubmissionFileNamesArray, String alternativeJobDirectory) in C:\Users\Ryan\Development\DeadlineProject\DeadlineProject\Deadline\Controllers\DeadlineController.cs:line 4031
   at Deadline.Submission.SubmissionUtils.SubmitNewJob(Hashtable htInfo, StringCollection2 auxillarySubmissionFileNames, String jobRepositoryDirectory) in C:\Users\Ryan\Development\DeadlineProject\DeadlineProject\Deadline\Submission\SubmissionUtils.cs:line 200
   at Deadline.Submission.SubmissionUtils.SubmitNewJob(String[] args) in C:\Users\Ryan\Development\DeadlineProject\DeadlineProject\Deadline\Submission\SubmissionUtils.cs:line 87
   at Deadline.Submission.Submit.Perform(String[] args) in C:\Users\Ryan\Development\DeadlineProject\DeadlineProject\Deadline\Submission\Submit.cs:line 82

C:\Users\vaa3622\Desktop\Release>pause
Press any key to continue . . .

It succeeded in submitting a few times in between.
Thanks again for all your help.

Did you check for any open files from the server end?
viewtopic.php?f=11&t=6303&start=10#p25410

I just want to make sure you didn’t miss my previous response when you originally posted the screen shot.

Cheers,

  • Ryan

There are a bunch open; I took screen shots.


I don't see anything from the temp folder. If a locked file were preventing the move, you'd think it would show up in this list. I wonder if it makes sense to try the move multiple times; if the failure is random, maybe that could help work around the randomness. We'll try to think of ideas here and see if we can get them into a new deadlinecommand build for testing.
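If we go down that road, the retry would look roughly like this (just a sketch of the idea, not code that exists in Deadline today; the class name, attempt count, and delays are made up):

using System.IO;
using System.Threading;

static class RetrySketch
{
    // Try the directory move a few times before giving up. If the access-denied
    // error is transient (e.g. something briefly holding a handle on the temp
    // folder), waiting and retrying may get past it.
    public static void MoveWithRetries(string source, string dest, int maxAttempts)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                Directory.Move(source, dest);
                return;
            }
            catch (IOException)
            {
                if (attempt >= maxAttempts)
                    throw;                     // give up and surface the original error
                Thread.Sleep(1000 * attempt);  // wait a little longer before each retry
            }
        }
    }
}

Something like RetrySketch.MoveWithRetries(tempDir, jobsDir, 5) in place of the single move would at least tell us whether the denial clears on its own after a second or two.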

Cheers,

  • Ryan