AWS Thinkbox Discussion Forums

Slave not starting

Hi,

I'm having trouble getting Deadline 8 up and running.

The problem is that the slave does not start when the machine wakes up. After a fresh reboot the slave starts as expected.
The strange thing is that other apps that were running when the machine was sent to sleep are up and running after waking up, even the Deadline Launcher. Only the Slave is missing.

I've tried several things: a fresh install of the client and repository, installing the client as a service, moving Pulse from a Linux to a Windows server, and changing the power management settings, all with no luck. There was no problem running Deadline 7.

I was thinking about using IPMI instead of WOL, because with IPMI I can shut down and restart the machines, but to be honest I don't know how to get that running from inside Deadline.

Clients (Supermicro) are running Windows 10 Pro; the Repository is running on Red Hat 6.2.

Any idea?

Kind regards,
André

Do you just need to set Slave to auto start when Launcher starts? If so, when you right-click Launcher in system tray, you need to enable: “Launch Slave At Startup” or this can be done via “Auto-Config” as well:
docs.thinkboxsoftware.com/produc … -ref-label

Also, what do your settings here look like?
docs.thinkboxsoftware.com/produc … ne-startup

Have you followed all the notes in the above docs page as well?

Hi Mike,

thanks for the quick reply.

I double-checked all settings, and they all say “Launch Slave at Start Up”. When I install the client (normally and as a service) I tick the option that the slave should start when the launcher starts, the auto configuration says “Launch Slave at Start Up = True”, and right-clicking on the launcher also shows “Launch Slave at Start Up” in green. The thing is that the slave starts when the machine is booting or restarting, but not when waking up, although the machine itself starts.

Kind Regards,
André

Interesting. Can you zip up and send me your logs?
docs.thinkboxsoftware.com/produc … ation-logs

Hi Mike,

attached you'll find the log files of the Linux Repository and 3 different Win 10 clients (V8 on a fresh Win 10 install, V8 on a once-working V7 Win 10 install, and a working V7 client).

Regards,
André

Sorry, I don’t see any attachments in your last post?

next try…
logfiles.zip (1.12 MB)

The logs show that the Slave in question (the upgrade from 7) is being told to suspend or shut down very quickly after starting up. The commands always come from “::ffff:192.168.2.202”, so I suspect you have aggressive Power Management settings (set to run every ~1 min, when it should typically be more like ~5 min), possibly combined with “Slave Scheduling” settings that force this slave to shut down when no jobs are in the queue, even if it has only recently started up. Can you show us your PM (Machine Startup & Idle Shutdown for the slave in question) and Slave Scheduling settings?

Hi Mike,

I already tried different settings in the power management “Idle Shutdown” tab. Besides Idle Shutdown and Machine Startup there is no further scheduling for the slaves.

The last settings in power management were:

  • Idle Shutdown after 1 min (now turned back to 3 min)
  • Shutdown Type: “Suspend (Windows only)”; I also tried “Shutdown” (with Hybrid Shutdown on/off), which worked fine with Windows 7. Since Win10, only “Suspend” lets the machine wake up when there's a job in the render queue.

Do you also need screenshots?

Screenshots would be good, just so we can validate all your PM settings for both Shutdown and Startup and confirm they apply to the slave that is having trouble here.

Please note the following limitation when using Hybrid Shutdown with WOL for machine startup from our docs:

docs.thinkboxsoftware.com/produc … -ref-label

Hybrid Shutdown: If enabled, the machine is prepared for a fast startup. This option only works on Windows 8 machines and later. Note, Machine Startup via WOL does not support waking a Windows 8 or later machine which has entered the Hybrid Shutdown state (S4).

I’m not sure how you were able to get “Hybrid Shutdown” to work correctly on Win 7 as it is only supported in Win 8 and later.

Hi Mike,

attached are the screenshots from the current power management settings.

I read about the limitations in the manual, but it doesn't make a difference whether I turn it on or off. On Win7 I used just the regular shutdown; now on Win10 only suspend works. For some reason I haven't figured out yet, the machines can't start with WOL if they're off.

powermanagement.zip (5.83 MB)

Your settings all look good here. So, to confirm, you have 2 issues here?

  1. One slave (Win 10 only) will not start when the machine is woken via WOL by Pulse, although the machine itself is started up by WOL OK. (I’m assuming your Pulse machine is running all the time and has been upgraded to the same version of Deadline as the rest of your farm?)

  2. Sometimes your slaves fail to start as a result of WOL not working?

I just want to clarify the exact issues here, as it sounds like configuration. Did it help to follow our PM docs here on all the things you need to check to ensure WOL works for you?
docs.thinkboxsoftware.com/produc … -ref-label

Outside of Deadline, did you use a 3rd party WOL app to confirm you can successfully wake up your machine across your network? (We cover how to do this in our above PM docs -> Troubleshooting WOL section).
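
For the WOL check outside of Deadline, a minimal Python sketch of sending a magic packet by hand might look like the following. It is not part of Deadline or its docs; the function names, the example MAC address, the broadcast address and UDP port 9 are assumptions for illustration, and the target NIC and BIOS still have to be configured to accept WOL.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Build a WOL magic packet: 6 bytes of 0xFF, then the MAC repeated 16 times."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"expected a 6-byte MAC address, got {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Send the magic packet as a UDP broadcast (address and port are assumptions)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))


# Hypothetical usage; replace with the MAC address of the render node:
# send_magic_packet("00:25:90:AA:BB:CC")
```

If the machine wakes from this but not when Pulse sends the packet, the problem is more likely in the Deadline configuration than in the network or BIOS.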

Finally, do you have different generations of SuperMicro blades/render nodes here? The more modern generation doesn’t support WOL on the motherboard and normally only has IPMI available. I would double-check your motherboard manual to verify whether this is the case.

Hi Mike,

the issue is on our complete render farm (33 clients on Supermicro, plus a bunch of HP workstations (Z800/Z820)).

Everything was fine with Deadline 7.2 (and before) on Win7, and with some issues on Win10 (turning on a client that is turned off; I solved that by changing power management to suspend).
So now on Win10 and Deadline 8 (current release) everything works as before: suspended machines wake up when there is a new job in the queue and go to sleep after a set amount of idle time. The “only” difference now is that the Deadline Slave does not start after waking up, and the clients appear offline / not available in the Deadline Monitor. When the client machine is turned off and on again, or after a restart, the Deadline Slave starts fine.

I did not try any of the 3rd party tools to check WOL, since it was working before (the only difference being Deadline 7 vs 8), but I certainly can try that. Or I have to find someone who can help me set up a script that uses IPMI for starting and stopping the machines (all Supermicro). I tried the script from the manual, but without success.
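
As a rough illustration of the IPMI idea, a standalone Python sketch that wraps the `ipmitool` command line could look like this. It is not the script from the Deadline manual, and how to hook it into Deadline's Power Management is not covered here. The function names, host, user name and password are hypothetical, and it assumes `ipmitool` is installed and the BMC is reachable over the `lanplus` interface.

```python
import subprocess

# Power actions accepted by "ipmitool chassis power"
VALID_ACTIONS = {"status", "on", "off", "soft", "cycle", "reset"}


def ipmi_power_command(host: str, user: str, password: str, action: str) -> list:
    """Build the ipmitool argument list for a chassis power action."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unsupported power action: {action!r}")
    return [
        "ipmitool", "-I", "lanplus",
        "-H", host, "-U", user, "-P", password,
        "chassis", "power", action,
    ]


def run_ipmi_power(host: str, user: str, password: str, action: str, timeout: int = 30) -> str:
    """Run the command and return ipmitool's output; raises if ipmitool fails."""
    cmd = ipmi_power_command(host, user, password, action)
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout, check=True)
    return result.stdout.strip()


# Hypothetical usage against one node's BMC address:
# print(run_ipmi_power("192.168.2.50", "ADMIN", "secret", "status"))
```

Note that `soft` asks the OS for an ACPI shutdown while `off` cuts power immediately, and that passing the password with `-P` exposes it in the process list, so a real setup would probably want a safer way to supply credentials.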

I think we need a remote debug session here. Can you log a ticket via support.thinkboxsoftware.com/ so someone from the support team can take a look? No need to test WOL if it’s already working. Also, no need for IPMI if all your machines support WOL. WOL is far simpler to get working than IPMI anyway.
