Can't find job!

Hello. Please help me with a strange problem. I submit my job to Deadline, and our render farm begins rendering. Then, in the middle of a render task, a slave stops rendering and prints this message (see attachment). What is going on? :slight_smile: The repository server is OK.

It sounds like there is a latency issue between the slave machine and the repository. A couple of questions to start:

  1. Which OS is the repository machine running (if it’s Windows, is it Windows Server or a non-Server edition)?
  2. Are you running Pulse?
  3. If you are running Pulse, do you have Task Confirmation enabled? (thinkboxsoftware.com/deadlin … e_Settings)

Just an FYI that we are working on solving this latency issue in Deadline 6, so this particular issue should be a thing of the past.

Cheers,

  • Ryan

rrussell, thanks for the reply.

  1. It's Windows 7 Ultimate, a non-Server edition.
  2. Pulse is off. (There are just 17 slaves on our network, and I thought Pulse was not necessary in this case?)

This problem occurs only on the render farm (5 slaves, Windows 7 Ultimate). On the other PCs on the network (12 more PCs), there is no such problem.

Pulse shouldn’t be necessary in this case, I was just asking because the latency problem seems to happen more often when Pulse is enabled.

I am a bit concerned that it’s a non-server edition of Windows. How many non-slave machines would you say you have connecting to the repository machine? Non-server editions of Windows have a 10 connection limit, and it is possible for a machine to have more than 1 connection open at a time. If that connection limit is reached, others can become disconnected without warning, and that can cause all sorts of problems.

Cheers,

  • Ryan

I thought Windows 7 had a limit of 20 connections. Am I wrong? We have 10 PCs with the Deadline slave tool on them, and one render farm with 5 nodes. In other words, we have 15 Deadline slaves on the network. And the "can't find job" problem occurs only on these 5 nodes (one Deadline slave on each of them). The repository is installed on one of the PCs running Windows 7.

re: windows with 20 - i can’t find the right answer. it seems that SMB connections are 20, but via file/print sharing. TCP connections are another thing - i spent a few minutes searching and was not able to clarify.

whatever the case, i would recommend windows server for >10 machines, or a linux variant if you don't want to pay for licensing.
we have many clients successfully using linux machines as their repository

cb

Well, we have found our problem, but it is not solved yet. The problem is that on our render farm every job starts with a task time of 1 hr (see attachment; the job on this screen has been running for only a few seconds). And we don't know why. The other PCs show a normal Task Time. We worked around this temporarily by increasing the value of “Number of minutes a Slave must be unresponsive before it is considered Stalled” to 100.

Our admin says that Windows 7 Ultimate has no connection limit.

And one more question. Please tell me why the slave does not work when we install Deadline with the “Install Launcher As Service” checkbox enabled. :unamused:

Thanks.

Is the date the same on all machines?

Cb

Yeah, it sounds like a date/time sync problem. It looks like there are false positives for stalled slave detection, which can happen if there are one or more machines that are out of sync. This is a problem we are addressing in Deadline 6.

As for the slave not working when installed as a service, there could be a few reasons. First, which account is the service using? If it’s the local system account, that would be your first problem. Try switching it to the account you log in as when you run the slave in interactive mode. If you are already using a network account, let us know and we’ll go from there.

Cheers,

  • Ryan

cbond, rrussell, thanks for your help.

Was that it? Time or date?

We don't know yet. The date on the render farm is OK; it was the first thing we checked. We keep searching. We will post here if we find something.

do you want edwin or something to call you, do a gotomeeting?

cb

cbond, thanks, but we have found the cause. It is different time zone settings. The clock time on the PCs and on the render farm is the same. But the host machine (Windows 7 Prof) is at UTC+4:00 and the render farm (Windows 7 Ultimate) is at UTC+3:00, with the same time zone selected (see attachment). Do you know if it is possible to change it? :slight_smile:

Maybe we can set the time zone on the render farm to Kuwait :slight_smile:

It's OK now. Abu Dhabi!
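
For anyone who hits the same thing, here is a rough Python sketch of why an offset mismatch can show up as a 1 hr task time. It assumes (this is not confirmed anywhere in this thread) that timestamps end up being compared in UTC. The function names, the example date, and the 10-minute threshold are made up for illustration; only the UTC+3/UTC+4 offsets and the 100-minute workaround come from the posts above.

```python
from datetime import datetime, timedelta, timezone


def utc_skew(wall_clock, farm_offset_hours, host_offset_hours):
    """Return how far apart two machines are in UTC when both show the
    same wall-clock time but use different UTC offsets (hypothetical helper)."""
    farm = wall_clock.replace(tzinfo=timezone(timedelta(hours=farm_offset_hours)))
    host = wall_clock.replace(tzinfo=timezone(timedelta(hours=host_offset_hours)))
    # Subtracting aware datetimes compares them in UTC.
    return farm - host


def looks_stalled(skew, threshold_minutes):
    """Assumed check: a slave looks stalled once the apparent gap
    exceeds the configured number of minutes."""
    return abs(skew) > timedelta(minutes=threshold_minutes)


# Same clock reading on both machines (arbitrary example date and time).
wall_clock = datetime(2012, 1, 1, 12, 0, 0)

# Render farm at UTC+3, host machine at UTC+4, as described above.
skew = utc_skew(wall_clock, farm_offset_hours=3, host_offset_hours=4)
print(skew)                      # 1:00:00

# With a hypothetical 10-minute threshold the slave looks stalled;
# with the 100-minute workaround it does not.
print(looks_stalled(skew, 10))   # True
print(looks_stalled(skew, 100))  # False
```

This would also explain why raising the stalled threshold to 100 minutes hid the symptom: 100 minutes is longer than the 60-minute skew. Matching the UTC offsets on all machines removes the skew itself.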

The most frequent cause of slaves running as services not starting up is the username/password not being entered correctly for some reason. Windows can be really picky about exactly how it’s entered with domains. We often have to manually enter the login info using the services interface instead of through the slave installer. Something to try if you’re on a domain and the service isn’t starting. Although if you’re running Windows 7 Ultimate, that’s probably not the case.