AWS Thinkbox Discussion Forums

Spot fleet plugin doesn't recognize when it needs to make a new spot fleet! Thinks one I deleted days ago is "unhealthy"

The spot fleet plugin's awareness of state is severely problematic. I deleted a spot fleet before I even launched all my AWS infrastructure from scratch, yet the spot fleet plugin is complaining that the fleet I deleted DAYS ago is in an unhealthy state.

That spot fleet has been cancelled, and there’s nothing I can do to shake the spot plugin out of its state, not even a fresh install. The plugin is gridlocked. This is a horrible bug.

@thinkbox hear me out, I’ve mentioned this needs to be fixed a few times:

The spot fleet plugin needs to do a few things that I am positive right now it doesn’t do:

  • Store the ID of the spot fleet resource it creates. If it didn’t create it, it has no right to acquire/manage it.
  • Store a hash of the config contents, including the spot JSON template and other settings. If that hash changes, destroy the fleet request and relaunch it. Even something as critical as an AMI ID change will not currently result in a relaunch of a spot fleet. You might think this is slightly unrelated, but the management of state in this plugin is painful, and it's why I have to cancel spot fleets in the first place. If I update my AMI build, and that JSON template in code, then the previous spot fleet must be deleted.
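
To make the two points above concrete, here is a rough sketch of the kind of state tracking I mean. This is not the plugin's actual code and it does not call AWS or Deadline. The names `FleetRecord`, `config_hash` and `plan_action` are made up for illustration, and the config dict stands in for whatever JSON template and settings the plugin really reads.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Optional


def config_hash(config: dict) -> str:
    """Hash the config in a canonical form so key order does not matter."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass
class FleetRecord:
    """Hypothetical record the plugin would persist for a fleet it created."""
    fleet_id: str      # e.g. the "sfr-..." ID returned when the request was made
    config_hash: str   # hash of the config the fleet was launched with


def plan_action(record: Optional[FleetRecord], config: dict) -> str:
    """Decide what to do with the fleet for this config.

    Returns "create" if the plugin has no fleet of its own on record,
    "replace" if the stored hash no longer matches the config
    (for example after an AMI ID change), and "keep" otherwise.
    """
    if record is None:
        return "create"
    if record.config_hash != config_hash(config):
        return "replace"
    return "keep"
```

With something like this, a fleet the plugin never created is never on record, so it can't block anything, and changing the AMI ID in the template changes the hash and forces a relaunch.
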
2022-06-11 14:43:08:  Spot: 	cloud_c64_engine: 0
2022-06-11 14:43:08:  Spot: 	cloud_c8_engine: 1
2022-06-11 14:43:08:  Spot: 		62a4a9f504664d5bf7e7d5db: 1
2022-06-11 14:43:08:  Spot: 	cloud_c64_mantra: 0
2022-06-11 14:43:08:  Spot: 	cloud_c8_mantra: 0
2022-06-11 14:43:08:  Spot: 	cloud_c8_shell: 0
2022-06-11 14:43:08:  Fleet Health Publisher: Published fleet health data for region ap-southeast-2
2022-06-11 14:43:08:  WARNING: Unable to create new Spot Fleet Request due to one or more of your AWS resources being in an unhealthy state.
2022-06-11 14:43:08:  "ap-southeast-2": ["sfr-0259fa7b-1afe-4fbb-9f8b-64800a027c9c"]
2022-06-11 14:43:08:  Fleet Health Publisher: Published fleet health data for region ap-southeast-2
2022-06-11 14:43:08:  WARNING: Unable to create new Spot Fleet Request due to one or more of your AWS resources being in an unhealthy state.
2022-06-11 14:43:08:  "ap-southeast-2": ["sfr-0259fa7b-1afe-4fbb-9f8b-64800a027c9c"]
2022-06-11 14:43:08:  Fleet Health Publisher: Published fleet health data for region ap-southeast-2
2022-06-11 14:43:08:  WARNING: Unable to create new Spot Fleet Request due to one or more of your AWS resources being in an unhealthy state.
2022-06-11 14:43:08:  "ap-southeast-2": ["sfr-0259fa7b-1afe-4fbb-9f8b-64800a027c9c"]
2022-06-11 14:43:08:  Fleet Health Publisher: Published fleet health data for region ap-southeast-2
2022-06-11 14:43:08:  WARNING: Unable to create new Spot Fleet Request due to one or more of your AWS resources being in an unhealthy state.
2022-06-11 14:43:08:  "ap-southeast-2": ["sfr-0259fa7b-1afe-4fbb-9f8b-64800a027c9c"]
2022-06-11 14:43:09:  Fleet Health Publisher: Published fleet health data for region ap-southeast-2
2022-06-11 14:43:09:  WARNING: Unable to create new Spot Fleet Request due to one or more of your AWS resources being in an unhealthy state.
2022-06-11 14:43:09:  "ap-southeast-2": ["sfr-0259fa7b-1afe-4fbb-9f8b-64800a027c9c"]
2022-06-11 14:43:09:  Spot: Starting Spot fleet status fetch
2022-06-11 14:43:09:  Fleet Health Publisher: Published fleet health data for region ap-southeast-2
2022-06-11 14:43:09:  Spot: Ending Spot fleet status fetch
2022-06-11 14:43:09:      Pending Job Events - No more job events to process
2022-06-11 14:43:09:      Pending Job Events - Done.

Is there any way I can hook into the database via a command and force it to refresh this resource ID, or kill the reference to the spot fleet request, so it goes and creates a new one?

@thinkbox

This page, which I have followed in the past to create the JSON template, must be incorrect or out of date:

https://docs.thinkboxsoftware.com/products/deadline/10.1/1_User%20Manual/manual/event-spot-fleet-configurations.html#event-spot-fleet-configurations-ref-label

… Because there is another page now mentioning the requirement to add a specific tag. None of the examples show this tag - 'DeadlineTrackedAWSResource':'SpotEventPlugin'.

https://docs.thinkboxsoftware.com/products/deadline/10.1/1_User%20Manual/manual/event-spot-config-utility.html#event-spot-config-utility-ref-label
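
For anyone else hitting this, below is a minimal sketch of patching that tag into an existing spot fleet config before pasting it into the plugin. The helper names are my own. Which resources the plugin actually checks for the tag is an assumption on my part, so this tags both the spot fleet request and the instances, using the `TagSpecifications` layout from the EC2 spot fleet request format.

```python
import copy

# Tag named on the config utility page linked above.
TRACKING_TAG = {"Key": "DeadlineTrackedAWSResource", "Value": "SpotEventPlugin"}


def _ensure_tag(tag_specs: list, resource_type: str) -> None:
    """Add or update the tracking tag for one resource type, in place."""
    for spec in tag_specs:
        if spec.get("ResourceType") == resource_type:
            tags = spec.setdefault("Tags", [])
            break
    else:
        # No entry for this resource type yet, so create one.
        tags = []
        tag_specs.append({"ResourceType": resource_type, "Tags": tags})
    for tag in tags:
        if tag.get("Key") == TRACKING_TAG["Key"]:
            tag["Value"] = TRACKING_TAG["Value"]
            return
    tags.append(dict(TRACKING_TAG))


def add_tracking_tag(fleet_config: dict) -> dict:
    """Return a copy of a spot fleet request config with the tag added.

    Assumption: tagging both the request and the launched instances is
    what the plugin wants; adjust if the docs say otherwise.
    """
    tagged = copy.deepcopy(fleet_config)
    _ensure_tag(tagged.setdefault("TagSpecifications", []), "spot-fleet-request")
    for launch_spec in tagged.get("LaunchSpecifications", []):
        _ensure_tag(launch_spec.setdefault("TagSpecifications", []), "instance")
    return tagged
```

Running it twice on the same config gives the same result, so it is safe to apply to templates that already carry the tag.
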
