Slave won't pick up job

Hi,

I’m evaluating Deadline for 3ds Max 2009.

I’ve submitted a test job (just a single teapot).

The monitor sees the slaves but the job won’t become active.

Hi Shaun,

There could be a few reasons why a slave won’t pick up a job:

  1. The job was submitted to a pool or group that the slave hasn’t been assigned to.
  2. The job has a machine limit, and that limit is currently maxed out.
  3. The job has been assigned limit groups that the slave is blacklisted for.
  4. The job has run on the slave machine, but produced errors.
  5. The job has been submitted in the Suspended state.

It’s likely (1). If it’s (4), then you can view the errors by right-clicking on the job and selecting Job Reports -> View Error Reports. If you don’t know what to make of the error, feel free to post it here.
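
The five checks above can be sketched as a small checklist function. This is only an illustration: the `Job` and `Slave` classes and all of their field names are hypothetical, not part of Deadline's API, and the sketch assumes that a pool or group of "none" is open to every slave and that a machine limit of 0 means no limit.

```python
from dataclasses import dataclass, field


@dataclass
class Slave:
    # Hypothetical stand-in for a slave's settings.
    name: str
    pools: set = field(default_factory=set)
    groups: set = field(default_factory=set)


@dataclass
class Job:
    # Hypothetical stand-in for a job's settings.
    pool: str = "none"               # assumption: "none" is open to all slaves
    group: str = "none"
    machine_limit: int = 0           # assumption: 0 means no limit
    machines_rendering: int = 0
    blacklisted_slaves: set = field(default_factory=set)  # via limit groups
    slaves_with_errors: set = field(default_factory=set)
    suspended: bool = False


def reasons_not_picked_up(job, slave):
    """Return the numbered reasons (1-5) that apply to this job/slave pair."""
    reasons = []
    pool_mismatch = job.pool != "none" and job.pool not in slave.pools
    group_mismatch = job.group != "none" and job.group not in slave.groups
    if pool_mismatch or group_mismatch:
        reasons.append("1: slave is not assigned to the job's pool or group")
    if job.machine_limit > 0 and job.machines_rendering >= job.machine_limit:
        reasons.append("2: machine limit is maxed out")
    if slave.name in job.blacklisted_slaves:
        reasons.append("3: slave is blacklisted for a limit group")
    if slave.name in job.slaves_with_errors:
        reasons.append("4: job produced errors on this slave")
    if job.suspended:
        reasons.append("5: job is suspended")
    return reasons
```

An empty list means none of the five reasons applies, and the cause has to be looked for elsewhere (for example in the slave log).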

Let us know if this helps!

Cheers,

  • Ryan

Thanks for the quick response.

The job was assigned a pool and group of None. The machine limit is 2, and I only have 2 slaves. I don’t have any slaves blacklisted.
I can’t see that any errors have occurred. The job’s status is Queued and I can see the job files in the Repository. I flipped it to Suspended and back to Resume just to double check.

Could there be something wrong with the slaves? The monitor lists them as Started.

Thanks again.

Hmm, that’s strange. First, we should enable slave verbose logging, which can be done from the monitor while in super user mode by selecting Tools -> Configure Repository Options. You can find the slave verbose logging under the Logging section. Then restart the slave and watch it to see if it prints out any error or warning messages while it is searching for a job. Normally, if a slave can’t find a job to render, it will print out something like this:


Scheduler - Pulse has not been configured. This can be done from the Repository Options in the Monitor.
Scheduler - Job chooser found no jobs.
starting between task wait - seconds: 12
Scheduler Thread - performing house cleaning…
Scheduler - Pulse has not been configured. This can be done from the Repository Options in the Monitor.
Scheduler - Job chooser found no jobs.
starting between task wait - seconds: 11

Let us know what you see!

OK. I tried to enable the slave verbose logging from the monitor, but when I did it crashed. I shut the slaves down, relaunched the monitor and tried again but I got another crash.

Also, when the slave machines were turned off, the monitor still thought they were up.

Can you send us the Monitor log after it crashes? On XP, you can find the logs here:
C:\Documents and Settings\All Users\Application Data\Frantic Films\Deadline\logs

If the slaves are crashing too, then they don’t have a chance to update their status when shutting down, so that would explain why they’re still appearing as online in the Monitor.

I’m starting to wonder if there are permission issues with accessing the Repository. The Monitor log will likely give us a better idea, but you might want to double check the permissions on the repository share. You may have to set the permissions under the Sharing and Security tabs so that Everyone has read/write access.
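
A quick way to test read/write access from a given machine is to create, read back, and delete a scratch file in the share. The sketch below uses only the Python standard library; the function name `can_read_write` is made up for this example, and it says nothing about which repository subfolders Deadline itself needs to write to.

```python
import os
import tempfile


def can_read_write(directory):
    """Return True if a file can be created, read back, and deleted in directory."""
    try:
        # The temporary file is removed automatically when the block exits.
        with tempfile.NamedTemporaryFile(dir=directory) as probe:
            probe.write(b"probe")
            probe.flush()
            probe.seek(0)
            return probe.read() == b"probe"
    except OSError:
        # Covers a missing path as well as denied read/write permissions.
        return False
```

Run it on each slave machine, as the user the slave runs under, passing the path of your repository share.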

Cheers,

  • Ryan

I’m on Vista, but I think I found it. Thanks.

2008-12-23 08:13:14: BEGIN - DESIGNWS1\sswanson
2008-12-23 08:13:14: Start-up
2008-12-23 08:13:14: Deadline Monitor 3.0 [v3.0.33353 R]
2008-12-23 08:13:14: 2008-12-23 08:13:14
2008-12-23 08:13:17: Repository time: 12/23/2008 08:13:17
2008-12-23 08:13:27: Enqueing: &Refresh Slave
2008-12-23 08:13:27: Dequeued: &Refresh Slave
2008-12-23 08:13:34: Enqueing: Super User Mode
2008-12-23 08:13:34: Dequeued: Super User Mode
2008-12-23 08:17:35: Enqueing: &Refresh Slave
2008-12-23 08:17:35: Dequeued: &Refresh Slave
2008-12-23 08:18:37: Enqueing: Configure &Repository Options…
2008-12-23 08:18:37: Dequeued: Configure &Repository Options…
2008-12-23 08:19:02: Caught unhandled exception: Cannot access a disposed object.
Object name: ‘MainWindow’. (System.ObjectDisposedException)
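
For longer logs, the exception line can be pulled out with a short script. This is a rough sketch that assumes every entry has the `YYYY-MM-DD HH:MM:SS: message` shape seen in the log above; the function name `find_exceptions` is made up for this example.

```python
import re

# Matches lines such as "2008-12-23 08:19:02: Caught unhandled exception: ..."
ENTRY = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): (.*)$")


def find_exceptions(log_text):
    """Return (timestamp, message) pairs for entries reporting an unhandled exception."""
    hits = []
    for line in log_text.splitlines():
        match = ENTRY.match(line)
        if match and "unhandled exception" in match.group(2).lower():
            hits.append((match.group(1), match.group(2)))
    return hits
```

Continuation lines without a timestamp, like the "Object name" line, are skipped by this sketch.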

Hmm, that’s not a permission-related error. There have been some known issues with Vista, but they only affected the slave application (the workaround for now is to start the slave from a DOS prompt - this problem will be fixed in the next release). This is the first we’ve heard of this particular problem.

With the changes we’ve made for Vista in the 3.1 beta, it would be interesting to see if the problem persists in the latest beta release. If you would like to join the 3.1 beta to try it out, let us know!

We got it!

We had to set more permissions for the repository. It’s humming along now.

Thanks for the fast help!

Great! Glad to hear it!

Cheers,

  • Ryan