I’ve got machines for our render farm on three different subnets. We’ve found that Power Management can’t keep our machines scheduled the way we want because of the varying subnets (machines won’t suspend, wake up, or respond to any of the remote commands).
We have:
Pulse and repository on server subnet
Some slaves on building subnet
Some slaves on lab subnet
We have found that if a machine on the lab subnet sends a command to another machine on the lab subnet, that machine can be controlled remotely. The same holds for the other subnets. We can only control machines on the same subnet as the machine issuing the command.
I thought maybe adding a second NIC to the Pulse machine and having it grab an IP from the building subnet (so it has a server IP and a building IP) might allow us to control those slaves, but no luck.
Just wondering if you guys have any ideas or have heard this before?
Basically, the mechanism we use for Wake on LAN is broadcast packets. It’s never occurred to me before, but in situations like yours where the farm is across subnets, these broadcasts never make it through.
The reason for this is that the routers which pass data between these subnets do not forward broadcast packets by default.
The way to fix this is to either enable broadcast forwarding on the routers, or have Deadline send an extra packet directly to the host. I just talked to Ryan, and he’s put that functionality in (one line of code), so now WOL should make it across those subnets.
Enabling broadcasts in your routers isn’t a good idea. It will increase useless traffic with little benefit other than Deadline’s Wake on LAN functionality.
Would you be willing to get on our beta program to see if that fix actually works? Otherwise, it’ll be in Deadline 5.1 once it’s released.
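For anyone curious what the two delivery paths look like on the wire, here is a rough Python sketch. It is not Deadline's actual code; the function names, the UDP port (9 is just a common convention for WOL) and the default broadcast address are all assumptions for illustration. It builds a standard magic packet and sends it once as a broadcast and once directly to the host's IP.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Return a Wake-on-LAN magic packet: 6 bytes of 0xFF, then the MAC 16 times."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    return b"\xff" * 6 + bytes.fromhex(hex_digits) * 16


def send_wol(mac: str, host_ip: str,
             broadcast_ip: str = "255.255.255.255", port: int = 9) -> None:
    """Send the packet as a broadcast and, additionally, straight to the host.

    Hypothetical helper: port 9 and the default broadcast address are
    assumptions, not values taken from Deadline.
    """
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # Broadcast sends require this socket option to be enabled.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast_ip, port))  # reaches the local subnet only
        sock.sendto(packet, (host_ip, port))       # routed copy for other subnets
```

Calling something like send_wol("00:11:22:33:44:55", "172.16.5.42") would send both copies. The directly addressed copy only helps if the last router on the path can still deliver to the sleeping machine.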
I recently ran into a similar issue when trying to get Power Management working in our studio. My issue is that our render farm uses IPMI for controlling the power, which is on a different subnet.
My solution was the following:
Something needs to route the packets from the one subnet to the other, so I have a simple server running XP which is connected to both subnets 172.16.5.xxx (Render Farm and Workstations) and 10.0.0.xxx (IPMI for Render Farm). The 172.16.5.xxx ethernet is set for sharing.
I run Pulse on this system as well since Pulse will be executing the commands, which ultimately need to reach the 10.0.0.xxx subnet.
I set the Power Management “Machine Startup” command to: c:\ipmiutil\Deadline_power_Boxx_on.bat {SLAVE_IP}. The {SLAVE_IP} is a variable that Deadline passes through based on the “Machines In Group” you’ve established. So I pass this variable into my batch file.
Here’s the batch file:
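@ECHO OFF
REM Usage: Deadline_power_Boxx_on.bat {SLAVE_IP}
REM Start from a clean log so only this run's address is processed.
IF EXIST c:\ipmiutil\DL_PowerManage_Log.txt del c:\ipmiutil\DL_PowerManage_Log.txt
SET IP=%~1
REM Replace the first three octets (e.g. 172.16.5.) with the IPMI subnet prefix. Requires sed on the PATH.
ECHO %IP%| sed "s/[0-9]*[.][0-9]*[.][0-9]*[.]/10.0.0./" >> c:\ipmiutil\DL_PowerManage_Log.txt
SET inputFile=c:\ipmiutil\DL_PowerManage_Log.txt
REM Power up each remapped address via IPMI.
FOR /F "tokens=*" %%I IN (%inputFile%) DO (
ipmiutil reset -u -N %%I -U **** -P ****
)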
Essentially, I’m grabbing each SLAVE_IP as Deadline runs the script, replacing the first three octets of the network address, and storing the result in a simple text file. I then read the contents of that file, which is the new IP address(es), and loop over them, executing the IPMI utility with the correct IPMI-based IP addresses.
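The same remapping can be sketched in Python, which avoids the temporary file. This is only an illustration under the assumptions already stated in this post: the IPMI address shares its last octet with the slave's address and lives on 10.0.0.xxx. The function names are made up for the example; the ipmiutil flags are the ones used in the batch file.

```python
def remap_to_ipmi(slave_ip: str, ipmi_prefix: str = "10.0.0") -> str:
    """Swap the first three octets of slave_ip for the IPMI subnet prefix."""
    octets = slave_ip.strip().split(".")
    if len(octets) != 4 or not all(o.isdigit() for o in octets):
        raise ValueError(f"not an IPv4 address: {slave_ip!r}")
    return f"{ipmi_prefix}.{octets[3]}"


def build_power_on_command(slave_ip: str, user: str, password: str) -> list:
    """Build the same ipmiutil call as the batch file, as an argument list."""
    return ["ipmiutil", "reset", "-u", "-N", remap_to_ipmi(slave_ip),
            "-U", user, "-P", password]
```

The resulting list could be handed to subprocess.run(cmd, check=True) from a small script that Deadline calls with {SLAVE_IP} as its argument.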
That’s pretty solid. Only issue is you need a minimalist GNU tool set to make that work (no sed in Windows by default).
I wrote something similar for a client who needed to map MAC addresses from one adapter to another. I should throw that up somewhere when I’ve got the chance…
I also hit a eureka moment the other day in regard to this exact problem (the one the thread is about). Typically global broadcasting is disabled, but local broadcasts ought to work via the routers. An example network would be:
Net 1: 10.0.1.0/24
Net 2: 10.0.2.0/24
Net 3: 10.1.0.0/16
Net 4: 10.2.0.0/30
We can send broadcasts to configurable network segments, and that should also allow cross-subnet waking. In theory, the directed IP address trick will only work if the route is somehow preserved. With this system, you’d specify wake segments using each segment’s broadcast address, like so:
10.0.1.255
10.0.2.255
10.1.255.255
10.2.0.3
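The broadcast address of each segment follows from its CIDR notation, so the list doesn't need to be worked out by hand. A minimal sketch using Python's standard ipaddress module (the wake_targets name is just for illustration, and whether the routers actually forward these directed broadcasts still depends on how they are configured):

```python
import ipaddress


def wake_targets(cidrs):
    """Return the broadcast address of each network given in CIDR notation."""
    return [str(ipaddress.ip_network(cidr).broadcast_address) for cidr in cidrs]


segments = ["10.0.1.0/24", "10.0.2.0/24", "10.1.0.0/16", "10.2.0.0/30"]
for target in wake_targets(segments):
    print(target)
```

For the /30 segment the broadcast address is 10.2.0.3, since that network only spans 10.2.0.0 through 10.2.0.3.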