slave grabbing all queued frames, rendering none

I’m having a problem with the slave on one of our render machines. Every time the slave is launched from this particular machine, it picks up a job from the stack and grabs all remaining frames for that job at once. That is odd, since it usually grabs only one frame (or one group of frames, depending on the job). It then just sits on those tasks and does not spit out so much as a black frame. The slave client claims it’s purging old jobs, but the Monitor swears it’s rendering these frames, with absurdly high render times: other machines render frames from the same job at 3m/frame, while this one shows, say, 18h/frame. No other machines on the farm seem to have this problem. The slave also shows itself jumping from job to job, most of them older jobs that were completed weeks ago, while the Monitor claims it has picked up a current job in the queue.

Running Mac OS 10.6.4, 2x 2.66 Dual-Core Intel / Deadline 4.0 / Maya 2011 hotfix 1

Thanks

Ross

Hi Ross,

I noticed you also emailed our support mailing list with this question, but I guess we can continue the conversation here. :)

This problem usually indicates a permission problem. Is the user this particular machine is logged in as different from the one on your other, working machines? If so, perhaps that particular user doesn’t have full read/write permissions for the repository.
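
One rough way to test this from the affected machine is to try creating, reading back, and deleting a small file in the repository while logged in as the account the slave runs under. The sketch below assumes a POSIX shell; `check_repo_access` and the path in the usage comment are made-up names for illustration, not part of Deadline.

```shell
# Hypothetical helper: check that the current user can create, read back,
# and delete a file in the given directory. Returns 0 on success, 1 otherwise.
check_repo_access() {
    dir="$1"
    probe="$dir/.access_probe_$$"

    # Try to create a small probe file; hide the shell's own error message.
    if ! { printf 'probe\n' > "$probe"; } 2>/dev/null; then
        echo "cannot write to $dir as $(id -un)"
        return 1
    fi

    # Read the file back and compare its contents.
    if [ "$(cat "$probe" 2>/dev/null)" != "probe" ]; then
        echo "cannot read back from $dir as $(id -un)"
        rm -f "$probe"
        return 1
    fi

    # Deleting can fail separately from writing on some shares.
    if ! rm "$probe" 2>/dev/null; then
        echo "cannot delete files in $dir as $(id -un)"
        return 1
    fi

    echo "read/write OK in $dir as $(id -un)"
}

# Usage (the path is a placeholder for wherever the repository is mounted):
#   check_repo_access /path/to/DeadlineRepository
```

If this fails for the slave’s user but succeeds for an admin account, that points at the share or folder permissions rather than at the slave itself. It only checks the one folder you give it, so it may be worth running against a job folder as well.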

Which OS do you have the repository installed on?

Cheers,

- Ryan

Hey Ryan,

Thanks for the speedy response! Looks like you’re right. We’re on a pretty locked down network here and IT is off-site, so I logged in under an admin acct and it works fine now. Thanks again!

Ross

Hi,

we have a very similar issue, so I didn’t want to start a new thread. We are testing Deadline on Fedora 10, and after submitting from Nuke, the slave does not start rendering: the Monitor shows all the chunks rendering on the same machine at once, but the slave does not pick up the job. If the slave is started as root, it works, but then the rendered frames are owned by root too, and we don’t want that. The repository was a simple Windows share, but today we tried it as a Samba share, as mentioned in this thread: http://support.na.primefocusworld.com/viewtopic.php?f=11&t=3699&p=14431&hilit=fedora#p14431 and the permissions were also set to nogroup/nobody for the whole repository. Still no luck. As root it works; in user mode (there is only one “user” account created for all the artist machines) it does not. What else should be set to get permission to the repository? Is there a way to ping the slave from the Monitor or to test the accessibility of the repo? Any suggestions? After having bad luck with other net renderers on Linux I would really like to use this software, as we had a good experience with it earlier…

Thank you very much
Gabor

Hi Gabor,

Have you tried mounting with CIFS? It seems that resolved the problem in the thread you referred to.

Cheers,

- Ryan

Hi Ryan,

I’m sorry, I’m not a Linux specialist, so maybe I used the wrong terms. It is mounted with CIFS like this: “mount -t cifs… … uid=nobody,gid=nogroup”
Does it look ok?

Thanks,
Gabor

Maybe try removing the uid and gid overrides in your mount command. The settings in your smb.conf file on the repository machine should be sufficient to deal with that stuff. When we mount here, we just use the username and password options to specify the credentials. Our repository is hosted on an openSUSE machine and our smb.conf entry looks like this:

[DeadlineRepository]
path = /usr/local/Prime_Focus/DeadlineRepository
writeable = Yes
guest ok = Yes
create mask = 0777
force create mode = 0777
force directory mode = 0777
unix extensions = No

Hi Russel,

thanks for the info, now it’s working!

Cheers,
Gabor

I’m digging up this old thread because I ran into the same error, but on Fedora.

/proc/mounts is giving me:

192.168.10.26:/es1/repository /repository nfs rw,nosuid,noatime,nodiratime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.10.26,mountvers=3,mountport=811,mountproto=udp,addr=192.168.10.26 0 0

This is the mount command in our “rclocal” file:

mount -o nosuid,noatime,nodiratime 192.168.10.26:/es1/repository /repository

Samba is configured as it should be, except for guest; we do have “unix extensions” set to “no” globally.

The permissions for the repository are all 777.
Owner: deadline
Group: artists

I launched two slaves, and one of them grabbed all tasks from two jobs. When I entered the directory to check the permissions of the files (one of the tasks), it returned:
777
Owner: 65534
Group: 65534

And well… all files of that job, even the job’s directory, have these permissions. The slaves are owned by the user that first started them.
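
On many Linux systems 65534 is the ID of the `nobody` account, and it is typically also the default ID an NFS server assigns to squashed users, so it can help to see how widespread that ownership is. A rough sketch for counting numeric owner:group pairs under a folder follows; `owner_summary` is a made-up helper name, and the path in the usage comment is only the mount point from the post above.

```shell
# Hypothetical helper: count entries per numeric owner:group under a directory.
# ls -n prints numeric IDs, so unmapped owners such as 65534 show up as-is.
owner_summary() {
    find "$1" -exec ls -ldn {} + \
        | awk '{ print $3 ":" $4 }' \
        | sort \
        | uniq -c \
        | sort -rn
}

# Usage (placeholder path):
#   owner_summary /repository
```

Each output line is a count followed by `uid:gid`. If only old jobs show 65534:65534 and freshly submitted ones show the expected owner, the stale jobs are the likely culprit rather than the current mount.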

A little hint, please? :)

EDIT: OK, I just realised I’m doing it on old “test” jobs. I’ll remove them and create new ones from Windows and Linux. I’ll post info later.

EDIT2: Looks like everything is working correctly now :)