Amazon and my Repository

Please excuse my ignorance in the matter, but I am struggling with the execution.

What is the best way to get Deadline up and running with Amazon?
I’m currently just testing it out, so I am not connecting the instances to our license server just yet.

Should the repository be on an Amazon node, or is it more sensible to have the instances connect back to my current infrastructure?
Does anyone know of a way to transmit data securely between an EC2 instance and a private work network?
Everything we are doing is on Windows with 3ds Max and V-Ray.

Any help would be most appreciated.
Thanks,
Alexander

I’m very interested in this as well. From what I’ve learned, you may be able to do this with a feature called AWS Direct Connect. Apparently it gives you a direct connection from your premises to your Amazon machines; however, I don’t know how to implement it.

We don’t really have any “best practices” to recommend at this point. We’re all kind of learning as we go here. :slight_smile:

The instances themselves need Deadline installed and licensed, and all applications you plan to use for rendering need to be installed and licensed as well (just like a typical render node).

The tricky part is figuring out how to deal with your assets and your rendered images, and this will all depend on your pipeline. For example, you’ll need to get your assets (textures, caches, etc) uploaded to the cloud somehow. This could be a manual process or a sync folder, but either way, your instances will have to be able to find them. This could mean making sure the asset paths your cloud instances see are identical to the paths on your local network.
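
To illustrate the “sync folder” idea, here’s a rough Python sketch that mirrors a local asset folder up to a cloud share while preserving the relative layout. The paths are placeholders, and in practice you might just reach for robocopy, rsync, or an S3 sync tool instead:

```python
import os
import shutil

# Hypothetical locations -- adjust to your own pipeline.
LOCAL_ASSETS = r"P:\Projects\MyShow\assets"                 # what the workstations see
CLOUD_ASSETS = r"\\ec2-fileserver\Projects\MyShow\assets"   # what the EC2 instances see

def mirror_assets(src, dst):
    """Copy any new or updated files from src to dst, preserving the
    relative folder layout so asset paths resolve the same on the cloud."""
    for root, _dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target_dir = os.path.join(dst, rel)
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            src_file = os.path.join(root, name)
            dst_file = os.path.join(target_dir, name)
            # Only copy if the file is missing or older on the destination.
            if (not os.path.exists(dst_file)
                    or os.path.getmtime(src_file) > os.path.getmtime(dst_file)):
                shutil.copy2(src_file, dst_file)

if __name__ == "__main__":
    mirror_assets(LOCAL_ASSETS, CLOUD_ASSETS)
```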

When it comes to rendered images, to save on transfer costs you probably don’t want to download everything that gets rendered. If you are rendering out large EXRs, you might want to convert them to JPEGs first for review, and only download the EXRs if you’re happy with the results. If you’re creating large sim caches in the cloud, you might just want to keep them there for the render that uses them, rather than transferring all that cached data back to the local network.
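
As a rough illustration of that review workflow, a post-render step on the instance could look something like this. It assumes OpenImageIO’s oiiotool is installed on the node, and the folder names are made up:

```python
import os
import subprocess

# Hypothetical folders on the cloud file server -- adjust as needed.
RENDER_DIR = r"\\ec2-fileserver\Projects\MyShow\renders\shot_010"
PROXY_DIR = os.path.join(RENDER_DIR, "jpeg_proxies")

os.makedirs(PROXY_DIR, exist_ok=True)

for name in os.listdir(RENDER_DIR):
    if not name.lower().endswith(".exr"):
        continue
    exr_path = os.path.join(RENDER_DIR, name)
    jpg_path = os.path.join(PROXY_DIR, os.path.splitext(name)[0] + ".jpg")
    # oiiotool (OpenImageIO) reads the EXR and writes a JPEG proxy for review.
    subprocess.check_call(["oiiotool", exr_path, "-o", jpg_path])
```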

When it comes to where to put the Repository, that also depends on your pipeline. Do you still plan on having local render nodes connect? If so, do you want one repository shared between your local nodes and your EC2 instances, or do you want separate repositories? If you want to share, then you’ll want to put the repo wherever both your local nodes and your EC2 instances can reach it reliably. If you want separate repositories, you could use the job right-click script in the Monitor to transfer jobs from the local repo to the cloud repo.

For secure transfers, AWS Direct Connect sounds like it might do the trick. We don’t have experience with it ourselves, though.
aws.amazon.com/directconnect/

So, yeah, not a lot of concrete answers there, but it’s one of those questions that doesn’t really have one “right” answer at this point.

Cheers,

- Ryan

Shucks,

I was hoping to get a peek into the inner workings of Thinkbox Software.
Well, I guess I will keep on trucking along in an attempt to get this going!

Thanks for the response.

The “best” way (in my humble opinion; I got it mostly worked out, although I never got it all perfectly straightened out myself) is to set up a subnet on Amazon with a single micro instance acting as a VPN gateway. Then it’s like you have a virtual network in Amazon. I would also give the VPN gateway machine a storage pool that you sync your project to. The network speed between Amazon compute nodes is very fast, so once the project is uploaded to your virtual file server, all of your render nodes can hit it at GigE+ speeds.

This is easy to set up if you have a mapped-drive workflow: just use your sync software to update the project folder, then map the Amazon instances to the Amazon file server using the same drive letter as your local file server. All of your maps and such should then resolve correctly.
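
As a rough sketch of that mapped-drive workflow (the server names, share paths, and drive letter below are all placeholders, and robocopy/net use are just one way to do the sync and the mapping):

```python
import subprocess

# Placeholder names -- substitute your own shares and drive letter.
LOCAL_PROJECT = r"\\local-fileserver\Projects\MyShow"
CLOUD_PROJECT = r"\\amazon-fileserver\Projects\MyShow"
CLOUD_SHARE   = r"\\amazon-fileserver\Projects"
DRIVE_LETTER  = "Z:"   # the same letter your local file server is mapped to

def push_project_to_cloud():
    """Run on a local machine (over the VPN): mirror new/changed project
    files up to the Amazon file server."""
    subprocess.call(["robocopy", LOCAL_PROJECT, CLOUD_PROJECT,
                     "/MIR", "/Z", "/R:2", "/W:5"])

def map_drive_on_instance():
    """Run once on each Amazon instance: map the cloud share to the same
    drive letter the local file server uses, so scene paths resolve unchanged."""
    subprocess.call(["net", "use", DRIVE_LETTER, CLOUD_SHARE, "/persistent:yes"])
```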

For rendered output, however, I would write directly back to the local file server (using UNC paths), since render nodes usually can’t saturate a 50 Mbps internet connection with renders, even at 4K. That skips a long transfer at the end of the render.

Your repository also stays on your local network, and the Amazon nodes connect to it through the VPN gateway.

Ultimately, though, I ran into trouble getting the VPN gateway set up correctly and decided the performance just wasn’t worth the cost; I’d rather spend the $$ on more local render nodes. I did run a couple of test jobs through it, and in theory it works. But Amazon systems are Slllllloooowwwwww and expensive.

So I gave this a whirl.
I have two repos on two very different machines.
I can manage them both in Monitor, but job transfer doesn’t seem to work.
Despite selecting a new repo to move the job to (see screenshot), I just get duplicates in my current repo.
I found this behavior in both repos; the actions were the same (duplicating in the current repo despite providing a transfer repo).

And even with the “Send email results” option enabled, I get no email about the transfer.
Notifications are correctly configured and work just fine!

Note that the way the transfer system works is to submit a new job to your CURRENT repo (Z:) that will transfer the existing job to the NEW repo (\\ARC-3DMAIN\Deadline_Repository). I noticed in the screenshot that your transfer jobs aren’t complete. When the “Transfer of Testing!!!” job finishes, you should see the original “Testing!!!” job in the new repo.

Cheers,
-Ryan

Ah, that makes sense.

Now I am getting an error:

This is a job that was created within Deadline using a script.

The actual path to the script is:

That looks like a bug in the way we’re building up the auxiliary file paths. It appears to only happen with jobs that have more than one auxiliary file.

Attached is an updated JobTransfer plugin file that should fix the problem. To install it, go to \your\repository\plugins\JobTransfer and back up the JobTransfer.py file, then unzip the attached file to the same folder. Now try running the transfer job again to see if it works.
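
If you’d rather script the swap than do it by hand, a minimal Python sketch along these lines would do it; the repository path and zip location below are placeholders you’d need to adjust:

```python
import shutil
import zipfile

# Hypothetical paths -- point these at your actual repository and download.
PLUGIN_DIR  = r"\\your\repository\plugins\JobTransfer"
PLUGIN_FILE = PLUGIN_DIR + r"\JobTransfer.py"
PATCH_ZIP   = r"C:\Downloads\JobTransfer_fix.zip"

# Back up the original plugin before overwriting it.
shutil.copy2(PLUGIN_FILE, PLUGIN_FILE + ".bak")

# Unpack the patched JobTransfer.py into the plugin folder.
with zipfile.ZipFile(PATCH_ZIP) as zf:
    zf.extractall(PLUGIN_DIR)
```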

Thanks!

That works far better!
Thanks Ryan. :smiley:

Thanks for confirming! We’ll get this fixed in the next release.