AWS Thinkbox Discussion Forums

Feature request - Power Management Machine Startup Timeout

Hi,

It would be great if Pulse and Power Management were aware that they had tried to wake up a machine but the machine did not come online within a given time frame. Let’s call this feature “Maximum number of minutes before a woken-up slave should come online”, with the tooltip “The interval, in minutes, to wait for a given slave to come online”.

How it currently works:
Assume we have one job with a single task and two slaves, A and B (one slave per machine; the startup order is A, then B). Both machines are offline.

  1. The job is submitted to DL
  2. Pulse checks that both slaves can render the task and selects machine A to be woken up.
  3. Machine A does not wake up.
  4. Wait for the “Next pending job scan” timeout
  5. Go to step 2

Effect:
Pulse keeps trying to wake up machine A, but that is never going to happen, so the job waits forever.

How the proposed feature should work:
Same assumptions as before, with the wake-up timeout set.

  1. The job is submitted to DL
  2. Pulse checks that both slaves can render the task and selects machine A to be woken up.
  3. Machine A does not wake up.
  4. Wait for the “Next pending job scan” timeout
  5. Check whether the wake-up timeout has passed; if not, go to step 2
  6. The wake-up timeout has passed, so Pulse decides to wake up machine B instead. Maybe machine A even gets marked as non-responsive (?)
  7. Machine B starts up and the job will soon be rendered
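
To make the steps above a bit more concrete, here is a rough Python sketch of the selection logic I have in mind. It is only an illustration, not how Pulse actually works: `Machine`, `pick_machine_to_wake` and `wake_timeout_minutes` are names I made up, and time is passed in as plain minutes so the example stays self-contained.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Machine:
    # Hypothetical record for one slave machine (not a Deadline type).
    name: str
    online: bool = False
    first_wake_attempt: Optional[float] = None  # minutes, set on first wake-up attempt
    non_responsive: bool = False


def pick_machine_to_wake(
    startup_order: List[Machine], now: float, wake_timeout_minutes: float
) -> Optional[Machine]:
    """Return the machine to wake on this pending job scan, or None if none is left."""
    for machine in startup_order:
        if machine.online or machine.non_responsive:
            continue  # already running, or already given up on
        if machine.first_wake_attempt is None:
            machine.first_wake_attempt = now  # first attempt starts the timeout clock
            return machine
        if now - machine.first_wake_attempt < wake_timeout_minutes:
            return machine  # still inside the timeout window, keep trying this one
        machine.non_responsive = True  # timeout passed, fall through to the next machine
    return None


# Example: A never comes online, so after 10 minutes B is chosen instead.
a, b = Machine("A"), Machine("B")
order = [a, b]
print(pick_machine_to_wake(order, now=0, wake_timeout_minutes=10).name)
print(pick_machine_to_wake(order, now=5, wake_timeout_minutes=10).name)
print(pick_machine_to_wake(order, now=10, wake_timeout_minutes=10).name)
```

Marking A as non-responsive is optional here; the important part is that the scan stops returning A once its timeout has passed.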

Hope this makes sense :slight_smile:

What do you think of having a special event fire when this is detected?

I suppose we already have an “Override startup order” feature. Is the problem just that we’re not taking into account machines that fail to start? I can see that as a pretty easy thing to implement if we are able to store the failed attempts in the DB.
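
Thinking out loud, the bookkeeping could look something like the Python sketch below. Everything in it is hypothetical: `WakeFailureLog` and its methods are invented for illustration, and a plain in-memory dict stands in for the DB. It just counts failed startup attempts per machine and fires a callback each time one is recorded.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class WakeFailureLog:
    """Hypothetical store of failed wake-up attempts; a dict stands in for the DB."""

    def __init__(self) -> None:
        self._failures: Dict[str, int] = defaultdict(int)
        self._handlers: List[Callable[[str, int], None]] = []

    def on_startup_failed(self, handler: Callable[[str, int], None]) -> None:
        # Register a callback to run whenever a failed startup is recorded.
        self._handlers.append(handler)

    def record_failure(self, machine_name: str) -> int:
        # Increment the counter, notify listeners, and return the new count.
        self._failures[machine_name] += 1
        count = self._failures[machine_name]
        for handler in self._handlers:
            handler(machine_name, count)
        return count

    def failures(self, machine_name: str) -> int:
        # Read-only lookup that does not create an entry for unknown machines.
        return self._failures.get(machine_name, 0)


log = WakeFailureLog()
log.on_startup_failed(lambda name, count: print(f"{name} failed to start ({count}x)"))
log.record_failure("A")
```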

Sounds good. More options, more flexibility

You mentioned the “Override startup order” window, which reminded me of another feature :slight_smile:

It would be great if you added a button that puts the machines in the list in random order. Currently you can arrange them manually, but I’m thinking of something that shuffles the list. It might sound silly, but we found that the machines at the top of the list are the most commonly used. From time to time I change their order manually, but it’s a tedious task and it would be great if there was a shuffle button.
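
In code terms the button would not need to do much. Here is a tiny Python sketch of the idea, using the standard `random` module; `shuffle_startup_order` and the optional `seed` argument are made up for the example and are not part of Deadline.

```python
import random
from typing import List, Optional


def shuffle_startup_order(machines: List[str], seed: Optional[int] = None) -> List[str]:
    """Return a randomly reordered copy of the startup list; the input is left untouched."""
    rng = random.Random(seed)  # a seed is only useful for reproducible tests
    shuffled = list(machines)
    rng.shuffle(shuffled)
    return shuffled


startup_order = ["render01", "render02", "render03", "render04"]
print(shuffle_startup_order(startup_order))
```

Returning a copy instead of shuffling in place would make it easy to preview the new order before saving it.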

I’ve used my "Professional MS Paint Skills"™ to illustrate what I’m thinking of :slight_smile:
mockup.png
