AWS Thinkbox Discussion Forums

AWS spot instance slave not picking up job

Hi all

I’m trying to submit my first small test 3ds Max + V-Ray scene to render in the cloud using the AWS Portal, but can’t get the slave to pick up the job.

Infrastructure is running and a single spot render node is online with a valid UBL license. I submit a job using the 3ds Max submitter and local slaves render it fine; however, the cloud node doesn’t pick it up.

To confirm there are no issues with groups etc., I’ve checked that the job can be assigned to this slave using this process:
Go to test job in monitor, right click, “Find render candidates”, select the cloud render node, click ok, resume job.

Looking at the slave log, all seems fine:

To submit jobs to render on AWS via the SMTD, do I need to do anything in particular?

Any pointers really appreciated, hoping to be up and running this weekend.

Deadline Client Version: 10.0.16.5 Release (c7fafb65b)
FranticX Client Version: 2.4.0.0 Release (a9285251c)
Repository Version: 10.0.16.5 (c7fafb65b)
Integration Version: 10.0.16.5 (c7fafb65b)
3PL Settings Version: 02/04/2018

Thanks

Just noticed an InvalidAccessKeyId error in the CloudWatch logs — could this be related? Does the job not initiate until all assets have been transferred successfully?

I’ve run out of time to tinker today, but any pointers on this and the InvalidAccessKeyId error would be very helpful.

1527959230.306424 2018-06-02 17:07:10,306 [/opt/Thinkbox/S3BackedCache/bin/precache_thread.py:run:53] [root] [3665] [Thread-2] [ERROR] Error in precache thread
Traceback (most recent call last):
  File "/opt/Thinkbox/S3BackedCache/bin/precache_thread.py", line 43, in run
    self.servicer.get_file(path, seq)
  File "/opt/Thinkbox/S3BackedCache/bin/central.py", line 842, in get_file
    sequence)
  File "/opt/Thinkbox/S3BackedCache/bin/central.py", line 522, in _deduplicate_request
    value = f(self, path, seq)
  File "/opt/Thinkbox/S3BackedCache/bin/central.py", line 788, in _get_file_impl
    info = self.get_file_onprem(path)
  File "/opt/Thinkbox/S3BackedCache/bin/central.py", line 438, in get_file_onprem
    return self.onprem.GetFile(params)
  File "/usr/local/lib64/python2.7/site-packages/grpc/_channel.py", line 500, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/usr/local/lib64/python2.7/site-packages/grpc/_channel.py", line 434, in _end_unary_response_blocking
    raise _Rendezvous(state, None, None, deadline)
_Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.PERMISSION_DENIED, Failed to upload //NAS/DeadlineRepository10/DS_temp/Scenes/CGSkies_samplescene_maxvray/maps/groundplane.jpg to aws-portal-cache-<id_removed>/<id_removed>: An error occurred (InvalidAccessKeyId) when calling the PutObject operation: The AWS Access Key Id you provided does not exist in our records.)>
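The useful detail is buried at the end of that _Rendezvous message, in the botocore-style "An error occurred (...) when calling the ... operation" phrasing. As a small stdlib-only sketch (the regex simply assumes that phrasing; the sample line below is an abridged, hypothetical copy of the log above), you can pull out the AWS error code and the failing S3 operation like this:

```python
import re

def parse_aws_error(log_line):
    """Extract (error_code, operation) from a botocore-style error message
    embedded in a gRPC failure line, e.g. the _Rendezvous text above."""
    m = re.search(r"An error occurred \((\w+)\) when calling the (\w+) operation",
                  log_line)
    return m.groups() if m else None

# Abridged sample of the log line quoted above (paths elided).
line = ("_Rendezvous: ... Failed to upload .../maps/groundplane.jpg to "
        "aws-portal-cache-<id_removed>/<id_removed>: An error occurred "
        "(InvalidAccessKeyId) when calling the PutObject operation: The AWS "
        "Access Key Id you provided does not exist in our records.")

print(parse_aws_error(line))
```

Here the pair is ('InvalidAccessKeyId', 'PutObject'): S3 is rejecting the asset server's stored access key outright on upload, which points at a credentials problem rather than a bucket-permissions one.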

That is exactly it.

I think you may need to re-install the AWS Portal components with a newly created IAM role’s access credentials and remove that old bucket.

This resolved my issue, thanks!

Welcome!
