Problem report with Remote Control command

Hi Russell and all,

We’re soon planning on buying a few more dedicated render farm nodes, so we’re trying to pay more attention to some operational issues we’ve glossed over in the past. I’ll post each one in a separate thread.

We have 30 Render Farm hosts running WinXP x64 SP2. They auto-login to a special domain user, ‘deadline’, which runs Launcher and then Slave. Typically, no one logs into these machines; they’re 1U rack-mount servers with no graphics card or monitor. I don’t know if it’s important for this bug report, but we run Pulse on a separate machine under Windows Server 2003 (32-bit).

This thread concerns a case where we can’t contact the Slave on a host. In this case, from my local machine (also WinXP x64 SP2), I ran Monitor, selected the 30 hosts, and chose Remote Control > Stop Slave. (FYI, I’ve seen the exact same phenomenon in the past with Remote Control > Restart Slave and Remote Control > Restart Slave After Next Job Completion.) Of course I said to “wait for response from remote machines”.

Most of the machines respond normally. However, 3 machines consistently failed the remote command:
UnableToWrite.png

Ordinarily, that would make me think something is up with the network - but I can ping the machines normally, and in fact can Remote Desktop into them no problem. At that point I see something interesting: after logging in, here’s what the desktop looks like:
UnableToWrite.png

In particular, in the task bar you can see that no applications are open, but in the System Tray the icon for Deadline Slave is still visible. That’s a very odd state. Normally, in my experience, if Slave is running, its window is one of the windows I can switch to in the task bar. Even more interestingly, when a Slave machine is in this state, if I move my mouse over the Slave icon in the System Tray, that icon instantly disappears.

Once we log into the machine, we can usually restart Slave from Deadline Launcher normally (sometimes we have to kill Slave). However, needless to say, what we really want is to be able to reliably administer our growing pool of machines using the Remote Control features. Right now, in actual use, the Remote Control features only have an effect on most, rather than all, of the machines in the farm. Today’s 10% failure rate (3 of 30 machines) was actually pretty good; usually it’s a little higher than that.

Any ideas about things to check or why we might be seeing this failure rate on Remote Control commands would be most welcome.

Leo

Hi Leo,

Remote Control is done via a socket connection between your Monitor and the Launcher running on the remote machines. So the first thing to check is that the Launcher is running on the remote machines.

The socket communication, by default, is done over port 5042 (which you may remember could be modified when running the client installer). All Launchers must be listening on the same port, so double check this in the deadline.ini file on these 3 machines. In your case, this file should be found here:

C:\Documents and Settings\deadline\Local Settings\Application Data\Frantic Films\Deadline\deadline.ini

If you need to change the port, just edit the ini file, but make sure to restart the Launcher after making the change. Once you’ve confirmed that the Launcher is running, and is listening on the correct port, make sure that no antivirus/firewall software is blocking that port.
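As a quick way to run the port check above across the whole farm, here is a minimal Python sketch. It assumes the Launcher socket is plain TCP on the default port 5042, and the host names in the trailing comment are made up; substitute your own. A successful connection only shows that something is listening on that port and that no firewall is dropping the connection. It does not prove that the listener is the Launcher or that it will act on a command.

```python
import socket


def port_is_open(host, port=5042, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Assumption: the Launcher listens on plain TCP; 5042 is the default
    port mentioned in this thread.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def unreachable_hosts(hosts, port=5042, timeout=3.0):
    """Return the hosts on which the port could not be reached."""
    return [host for host in hosts if not port_is_open(host, port, timeout)]


# Example usage (hypothetical host names):
# for host in unreachable_hosts(["render01", "render02", "render03"]):
#     print(host, "is not accepting connections on port 5042")
```

If a machine that fails Remote Control commands still accepts the connection here, the port and firewall are probably fine, and the Launcher logs are the next place to look.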

If remote communication continues to fail on specific machines (as it does in your case), check the Launcher logs on the remote machines to see if they contain any errors. These can be found here on XP machines:

C:\Documents and Settings\All Users\Application Data\Frantic Films\Deadline\logs

Feel free to post the logs here and we’ll take a look.
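If the logs are long, a small script can pull out the suspicious lines before you post them. This is only a rough sketch: the `*.log` file pattern and the keywords it searches for are assumptions, not something the Deadline log format guarantees, so adjust both to match what you actually see in the folder.

```python
from pathlib import Path


def find_error_lines(log_dir, keywords=("error", "exception"), pattern="*.log"):
    """Return (file name, line number, text) for lines containing a keyword.

    Assumptions: the log files match `pattern` and problems are flagged
    with one of `keywords` (matched case-insensitively).
    """
    hits = []
    for path in sorted(Path(log_dir).glob(pattern)):
        # errors="replace" keeps the scan going if a log has odd bytes in it.
        with open(path, encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                lowered = line.lower()
                if any(keyword in lowered for keyword in keywords):
                    hits.append((path.name, number, line.rstrip()))
    return hits


# Example usage, with the XP log folder from this thread:
# log_dir = r"C:\Documents and Settings\All Users\Application Data\Frantic Films\Deadline\logs"
# for name, number, text in find_error_lines(log_dir):
#     print(f"{name}:{number}: {text}")
```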

Finally, socket connections aren’t 100% reliable, so it’s always possible to get one-off failures. In the next release of Deadline, we will be vastly improving the Remote Command feature, and part of this will make it easy to see which commands were received successfully and which failed.

Cheers,

- Ryan