Slaves stick on 'Initializing' 3DS Max 2012 & DL 6.0

Hi Folks,

I’ve recently upgraded our render farm to Deadline 6 and am unfortunately encountering an error when trying to render 3DS Max scenes. It doesn’t happen all the time and isn’t tied to specific slaves; it just seems to be random.

When a job is submitted to Deadline, all seems to be going well and all slaves report their status as ‘Rendering’. However, when I remotely log in to slaves which seem to be taking far too long to render a frame, I see this:

Trying to control the slave remotely from Deadline Monitor doesn’t work. The only solutions I’ve found are to either log in to the slave machine, kill Deadline Launcher and then restart the Slave, or to restart the machine.

I couldn’t find any information in the slave reports, as Deadline seems to think it’s working fine. However, when connecting to the slave log on one of the affected slaves, it gave me this message:

 could not resolve host name name (i7FARM077) to IP address. The machine may not exist on the network.

Once I killed Deadline Launcher and restarted Deadline Slave, all was well again. This has only cropped up since the upgrade to Deadline 6 and wasn’t a problem previously.

All slaves are Windows 7 x64 machines running Deadline 6 and rendering 3DS Max 2012 scenes. Please let me know if there’s any more information I can give which might help resolve this issue.

Thanks for your time and all the best,

Andy

Hi,
Sounds like you have a network DNS issue at your office as Deadline is struggling to find the IP address of machine “i7FARM077”.
What happens when you open a command prompt and type in “ping i7FARM077”? Does it resolve and respond?
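
If it's easier to check a batch of machines than to ping them one by one, the same name lookup can be sketched in Python. This is only a rough illustration using the standard library's socket.gethostbyname; the helper name resolve_host is made up for the example and is not part of Deadline.

```python
import socket


def resolve_host(name):
    """Return the IPv4 address that `name` resolves to, or None on failure.

    Hypothetical helper for illustration only; it is not a Deadline API.
    """
    try:
        # Standard-library lookup: uses the machine's normal name resolution.
        return socket.gethostbyname(name)
    except OSError:
        # socket.gaierror (a subclass of OSError) is raised when the
        # name cannot be resolved.
        return None
```

Calling resolve_host("i7FARM077") from the machine that reports the error should return an address if the lookup works, and None if it doesn't.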
Mike

Hi Mike,

Thanks for the reply.

Pinging that particular Slave does get a response as seen below:

[code]Microsoft Windows [Version 6.1.7601]
Copyright © 2009 Microsoft Corporation. All rights reserved.

C:\Users\Andy>ping i7farm077

Pinging i7farm077 [192.168.2.128] with 32 bytes of data:
Reply from 192.168.2.128: bytes=32 time<1ms TTL=127
Reply from 192.168.2.128: bytes=32 time<1ms TTL=127
Reply from 192.168.2.128: bytes=32 time<1ms TTL=127
Reply from 192.168.2.128: bytes=32 time<1ms TTL=127

Ping statistics for 192.168.2.128:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 0ms, Maximum = 0ms, Average = 0ms

C:\Users\Andy>[/code]

I’ll try it next time one of the slaves is playing up and see what happens.

Thanks for your help :slight_smile: ,

Andy

Just to update on the above:

Once again I found 20-odd slaves stuck at the initializing stage overnight, claiming to have been ‘Rendering’ for 16 hours! I tried pinging a couple whilst they were still in this state but didn’t have any trouble connecting. Going into super user mode and restarting the machines via remote control worked, but it’s far from the best solution.

Do you think it’s still most likely to be a network DNS issue rather than a Deadline one? As an artist, I know very little about our network setup, so I’m wondering if I should suggest we get someone in to take a look?

Thanks for your time,

Andy

The next time a machine gets stuck like this, can you go to C:\ProgramData\Thinkbox\Deadline6 on the machine, zip up the logs folder, and post it? We can check to see what the slave is doing by looking at its logs.

Thanks!

- Ryan

Hi Ryan,

Well, I think we’ve found the issue. I’ve configured Idle Shutdown Mode on the slaves so they’ll suspend if they’ve not received a job after 60 minutes. A work colleague suggested that it might be that Deadline Slave is launching before the machine itself has got its IP address. We tested his theory on one machine by assigning it a static IP address, suspending it via Monitor and then sending a job to it. Sure enough, it launched the slave without an issue.

I guess I hadn’t run into this problem before because Deadline 5 slaves don’t launch as quickly as Deadline 6 ones?

If this is indeed the problem, I was wondering if there’s a way of delaying the launch of Deadline Slave on a machine to give it a chance to obtain an IP address? I don’t relish the thought of having to log on to each individual slave node and manually assign it a static IP address! :confused:

All the best,

Andy

Hi Andy,

Glad you figured out the issue, and that’s definitely an interesting problem. We’ve logged this, and perhaps we can find a way to avoid this. Maybe it IS as simple as having a delayed startup option…
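
In the meantime, a rough workaround sketch in Python, assuming the slave is started from a script rather than directly: wait until the machine's own host name resolves to a usable address, then launch. Everything named here is an assumption for illustration (the function names, the timeout values, and whatever slave executable path gets passed in); none of it is a built-in Deadline option, and checking for a non-loopback, non-169.254.x.x address is only a heuristic for "the machine has its IP address".

```python
import socket
import subprocess
import time


def wait_for_own_address(timeout=120.0, interval=2.0,
                         resolve=socket.gethostbyname, sleep=time.sleep):
    """Poll until this machine's host name resolves to a usable IPv4 address.

    Returns the address, or None if `timeout` seconds pass first.
    `resolve` and `sleep` are parameters only so the loop is easy to test.
    """
    hostname = socket.gethostname()
    deadline = time.monotonic() + timeout
    while True:
        try:
            ip = resolve(hostname)
            # Heuristic: loopback (127.x) and self-assigned (169.254.x)
            # addresses suggest the network is not ready yet.
            if not ip.startswith(("127.", "169.254.")):
                return ip
        except OSError:
            pass  # Name not resolvable yet; keep waiting.
        if time.monotonic() >= deadline:
            return None
        sleep(interval)


def launch_slave_when_ready(slave_exe, timeout=120.0):
    """Start the slave only once an address is available (hypothetical helper).

    `slave_exe` is whatever path the slave executable has on the machine;
    it is deliberately not hard-coded here.
    """
    if wait_for_own_address(timeout=timeout) is None:
        return None  # Gave up; caller decides what to do.
    return subprocess.Popen([slave_exe])
```

The idea would be to run something like this after a wake from suspend or at logon in place of starting the slave directly, which would need testing against how Launcher starts the slave on your setup.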

Cheers,

- Ryan