AWS Thinkbox Discussion Forums

Spot Event Plugin - groups and slave list

I’m testing the Spot Event Plugin in Deadline 9. I’m able to have it create a spot fleet when I submit a job and cancel the spot fleet once the job has finished. That’s wonderful and works as advertised. However, only once have my new EC2 slaves been added to the group associated with the render job (and only 3 of the 10 instances at that). Also, the instances are not being removed from the slave list once the job is complete, the instances have been terminated, and the spot fleet request has been canceled. I don’t know if this is a bug or if there are further settings I could play with to get the failing functionality to work. I’m hoping someone will have some insight. My gut says this is a bug.

Hello,

Are you on Deadline 9.0.0.18? When the slaves start up with no group, are you able to log on to the machine and take a look at the slave log? It should say why it failed to add the group. Are you connecting to the Repo/DB via the Proxy or a file share? Currently you need to manually delete the remaining instances from the slave list. I’ll look into the progress on this.

Regards,

Charles

Yes, we are using Deadline 9.0.0.18. To make the renders work I’ve been manually adding the new slaves to the group and deleting them from the slave list when the render is done. We are connecting to the Repo as an SMB mount. I will bring up a new render and look at the slave logs to see why they are not being added to the group.

I will say, if you guys are able to get the slaves deleted from the slave list automatically, and I can figure out why they aren’t being added to their associated group, this will be a nearly perfect implementation of EC2 Spot Instances. Looking forward to using an ironed-out implementation.

I will post what I find with the slave logs.

I thought I’d try a new group name. At first there was a case mismatch between the Deadline group name and the spot fleet group name. Changing the case of the spot fleet group name to match the regular Deadline group name resulted in only intermittent assignment of slaves to their associated group. However, once I created a whole new Deadline group and spot fleet group, everything worked like clockwork. The slaves are still not being removed from the slave list, but as was pointed out, this feature is known not to work yet.
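For anyone who wants to rule out the case mismatch quickly, here is a rough sketch of a check. It is plain Python and does not call Deadline or AWS. The function name `find_case_mismatches` and the sample group names are made up for illustration, and you would need to supply the two lists of names yourself. Note that in my case fixing the case alone only helped intermittently, so treat this as a first diagnostic and not a guaranteed fix.

```python
def find_case_mismatches(deadline_groups, spot_fleet_groups):
    """Return (spot_fleet_name, deadline_name) pairs that differ only by case.

    Both arguments are plain lists of group names that you collect yourself;
    this helper is hypothetical and not part of Deadline.
    """
    exact = set(deadline_groups)

    # Map each lowercased Deadline group name to its original spelling.
    by_lower = {}
    for name in deadline_groups:
        by_lower.setdefault(name.lower(), name)

    mismatches = []
    for name in spot_fleet_groups:
        if name in exact:
            continue  # exact match, nothing to report
        candidate = by_lower.get(name.lower())
        if candidate is not None:
            mismatches.append((name, candidate))
    return mismatches


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(find_case_mismatches(["aws_render", "local"], ["AWS_Render", "local"]))
```

Names that have no counterpart at all (in any case) are not reported here; those would need a separate check.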

Just an update on the groups. We’ve got a plan that should squash that issue. I’m expecting it to be in the next (first?) service pack for 9.0. Thanks for reporting it!

We’re not 100% sure of the cause, but the solution is to have the Slave continually check to see if it’s in any groups, then query the Amazon API to figure out its mapping if not. I need to double check performance for this on non-cloud instances, but I trust Charles and the dev team to do a good job here.
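To make the idea concrete, here is a minimal sketch of that check-then-look-up loop. This is not the actual Deadline implementation: the function `ensure_group_membership` and its three callables (`get_groups`, `lookup_group`, `add_to_group`) are hypothetical placeholders for "ask Deadline which groups this Slave is in", "ask the Amazon API which group this instance maps to", and "add the Slave to that group". The sketch also caps the number of checks, whereas the plan above is to keep checking.

```python
import time


def ensure_group_membership(get_groups, lookup_group, add_to_group,
                            max_checks=5, interval_seconds=30.0,
                            sleep=time.sleep):
    """Poll until the worker is in at least one group.

    All three callables are assumptions supplied by the caller:
      get_groups()    -> list of group names the worker is currently in
      lookup_group()  -> group name from the cloud mapping, or None
      add_to_group(g) -> add the worker to group g

    Returns the number of the check on which the worker was found in a
    group, or None if it never ended up in one.
    """
    for check in range(1, max_checks + 1):
        # Only hit the (potentially slow) lookup when the worker has no groups.
        if not get_groups():
            group = lookup_group()
            if group is not None:
                add_to_group(group)
        if get_groups():
            return check
        if check < max_checks:
            sleep(interval_seconds)
    return None


if __name__ == "__main__":
    # In-memory stand-ins: the lookup fails once, then returns a group.
    groups = []
    answers = iter([None, "aws_render"])
    result = ensure_group_membership(
        lambda: list(groups), lambda: next(answers), groups.append,
        max_checks=3, interval_seconds=0.0)
    print(result, groups)
```

Because the lookup is skipped whenever the worker already has a group, a machine that is already assigned only pays for the local group check on each pass.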
