AWS Thinkbox Discussion Forums

Start Slave Remote reports success but doesn't start

I’m back to testing the power management suspend mode, and I’m noticing that slaves don’t seem to unsuspend. Or if they do (which I can’t tell, because remoting in does unsuspend them), the slave doesn’t successfully launch and start jobs. So I’m wondering what is reporting success back to my Monitor if the events don’t actually seem to happen. I hate those false positives.

B.

Deadline 5.0, all Windows, a mix of XP64 and 7-64.

Hey Ben,

We’ve had a few other clients report issues with the suspend feature, and we’ve been digging into the problem some more over the past few weeks. Unfortunately, we haven’t been able to reproduce any of the reported issues yet, but it seems like a major issue is that the Launcher loses its ability to listen (the socket connection gets severed at some point while the machine is sleeping, and the Launcher is unable to listen again until the machine is rebooted). There really isn’t a way to prevent the connection from getting severed, so we’re a little stuck on what to do at this point. One idea was to gracefully close the listening socket when the machine sleeps, and then reopen it when it wakes up, but we haven’t determined if this is a viable solution yet.
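
To make the close-and-reopen idea a bit more concrete, here is a minimal Python sketch. It is not the Launcher's actual code: the class name ResumableListener, its methods, and the loopback address are all hypothetical, and hooking on_suspend/on_resume up to real OS power notifications is left out entirely. It only shows the listening socket being closed cleanly before sleep and re-bound to the same port on wake.

```python
import os
import socket


class ResumableListener:
    """Hypothetical listener that can be closed before sleep and reopened on wake."""

    def __init__(self, host="127.0.0.1", port=0):
        self.host = host
        self.port = port  # 0 lets the OS pick a free port on the first open
        self.sock = None

    def open(self):
        """Create, bind, and start listening; does nothing if already open."""
        if self.sock is not None:
            return
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        if os.name != "nt":
            # On POSIX this allows a quick re-bind of the same port.
            # Skipped on Windows, where the option has different semantics.
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((self.host, self.port))
        sock.listen(5)
        # Remember the actual port so a later reopen uses the same one.
        self.port = sock.getsockname()[1]
        self.sock = sock

    def close(self):
        """Close the listening socket if it is open."""
        if self.sock is not None:
            self.sock.close()
            self.sock = None

    def on_suspend(self):
        """Assumed to be called just before the machine sleeps."""
        self.close()

    def on_resume(self):
        """Assumed to be called right after the machine wakes up."""
        self.open()

    @property
    def is_listening(self):
        return self.sock is not None
```

The point of the design is that the process never holds a socket across the sleep, so there is nothing left in a severed state to recover from. Whether that actually helps in practice is exactly the open question here.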

Now I’m not sure if this is the problem you’re seeing, since you get “success” messages back from the Launcher.

To be honest, if we can’t make it work more consistently for 5.1, we’re considering either (a) removing this option for the time being, or (b) slapping a big “BETA” warning on it so that at least we can keep the feature available to those that it actually works for. We’re leaning towards (b) at the moment.

Cheers,

- Ryan