AWS Thinkbox Discussion Forums

Deadline 6.1/VM etc

hey all - some rambling thoughts and brainstorming

D6 is about to ship in the next few weeks barring some disaster [yay!]
D6.1 is already started/roadmapped, and additionally there is something focused on cloud/vm use with Deadline that I am considering as an add-on.

so some quick feedback:

95% of our users won’t use cloud this year, and I see that changing but slowly over time [and when I say cloud, I mean public or private vm control or rental] so I am considering our VM/Cloud efforts to be an add-on.

an add-on, something like a Deadline PRO, with some additional fee per license. nothing excessive - we are at $185 now, and for example it could be $45 more or $230 for the full whack. how does that initially sit with you? We would still have a rental model for the cloud side for VM’s you manage out there.

What might Pro be? lots of benefits here - hybrid advanced local cloud rendering, public cloud rendering and bare metal on the same farm. ability to manage compute resources in different granular approaches, manage different environments/builds in parallel [10 machines running nuke 6.x, 15 running 6.xx , 3 running 7] and changing on demand etc. Definitely for the power-user/facility.

why an add on? because it’s going to be an additional layer. I don’t want to spec out too much - it’s early days.

the reasoning behind the pricing difference is that I feel like with our dev team, and the efforts we are expending I want to raise the price. it seems more reasonable to hold the price, and charge more for the PRO edition with the advanced features - but that will directly hit some of you here. Note that some of the feature set we want to include in the coming versions would begin to compete with large compute management tools that cost 1000’s per CPU [!]

anyway, just spitballing here a little.
cb

Right now, our VM’s each have their own licenses of DL. So 1 physical machine eats 10 licenses so we can get more granularity for rendering. Would that mean that that one machine would cost $450 more?

Would you be able to mix/match DL and DLPro in the same farm? Meaning we leave our old “pizza boxes” on DL running their boring 3ds max rendering and we put DLPro on the new 40 core machines where it does the VM management according to the demands of the job pool?

hmm. interesting thoughts - no, the intent would /not/ be to have each machine eat n*45 where n= number of virtual machines.

I believe I would be amenable to a model where you had 100 Slaves and 45 Pros on the same license file…

let me have more internal discussions - but any feedback on this being a part of a subscription or the M&S fee like Draft is, rather than an owned license like the D6 Pro model above?

cb

Deadline Pro lic being part of a M&S fee would be easier for me to sell internally. We don’t need to buy any more licenses and so “upgrading” to Pro would be trickier to justify.
If the cost increase means more Thinkbox developers and faster delivery of cloud integration, then Yes!
If it also means more dev. time is thrown at solving outstanding workflow/efficiency improvements in Deadline std edition, then Yes! again.
Improvements such as:

  1. Need Tile Assembler or Draft or whatever is used, to support smarter bucket/tile assembly duties. We create many very high resolution images. I want Draft/TA to be a major part of this workflow. Assembling partial tiles/buckets, handling multi-channel on partial tiles, assembling into correct location, artists being able to “dial-in” the X,Y tile count on a certain area of a render; ie: BLOW-UP but really smart BLOW-UP. Together with MULTIPLE blow-up regions all being rendered in 1 Deadline job, across multiple slaves. And there’s more…

  2. DBR special plugin architecture. Deadline really needs to support MR & VRay DBR / VRay RT DBR systems. It’s mega important.

I have multiple artists asking for these things every week. Honest.
I just did some quick calculations based on our license count and your proposed cost increases. The difference in cost increase if we were to buy all our licenses again, is the same as the amount of money Deadline’s WoL/IPMI PM system saves us every Q!

re: your last paragraph: are you saying we are too cheap? ;-p

right now the deadline team is nearly double where it was a year ago - we actually have a dedicated team on the VM/Pro stuff separate from core 6.1, so there you go. Thus trying to justify additional revenue from it somehow. although honestly i’m torn - the more VMs you use, the more DL licenses you need/want, so it might work out.
honestly, looking for help here. how do we raise the value, without the raise in price becoming a barrier?

re: DBR/Draft - yes. DBR unfortunately not in the immediate picture, although the Draft tile stuff is in our upcoming roadmap - just not sure it would make the 6.1 cutoff or be more like a 6.2 time-frame. we all want to do it, that’s for sure. somehow it ended up being weighted at the edge of happening or not for 6.1. frankly, it depends on how long some of this stuff takes!
again, take this with a grain of salt without promises for the future - and under NDA!

cb

Hi ya,
Just to hammer home my previous comment…(not to cause offence)
This week I have had the same conversations twice more in the studio. I honestly don’t think I’m the only studio with the same feature requirements.
Could I suggest a poll type of thing at some point in the future to all existing Deadline / potential Deadline users on the following potential feature requests for development ASAP:

  1. Draft - enhanced Tile Assembler duties. A lot of users have a major part of their business generating “still” images via 3D packages and not animations.
  2. DBR - particularly VRay. Although MR would be good too. Is there any studio that doesn’t use VRay in some form these days?! DBR might not be the most stable thing in the world, but everyone has to iterate through Lighting’n’Rendering and speeding up this workflow helps all studios.

If these are deemed “Pro” features and require an additional Deadline Pro license, then fine if I get these features this year; let’s do it. :)

M

mike -

I hear yah. what renderer would you need for DBR?

cb

VRay / VRay RT

i think we would need to understand the scope of your requirement for DBR and tile rendering to be clear. would that be something we could have a discussion about to gain more detail?

cb

This has come up many times in the past. We need something state based, not task based.

The process of converting a slave from state to task and back will be the nuanced part. Otherwise we could just generate DBR jobs with tasks that never timeout.

It’s not just DBR, either. Like it could be running some sort of service. Like you could have a webserver job, or a bittorrent server job, or a folder-watching job. They wouldn’t ever “complete”, you just want the slave to have that state active. And you might want more than one state to be active and allow for states and tasks to be concurrent.
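A rough sketch of the state-vs-task idea, just to make it concrete. Everything here (the `Slave` class, its fields and methods) is hypothetical and is not part of Deadline's API; it only assumes that a slave can hold several open-ended states while also running at most one ordinary task.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Slave:
    """Hypothetical slave model: open-ended states plus one finite task."""
    name: str
    # States never "complete"; they are only switched on or off.
    active_states: set = field(default_factory=set)
    # An ordinary task-based job, which does finish.
    current_task: Optional[str] = None

    def enter_state(self, state: str) -> None:
        self.active_states.add(state)

    def leave_state(self, state: str) -> None:
        self.active_states.discard(state)

    def start_task(self, task: str) -> None:
        if self.current_task is not None:
            raise RuntimeError(f"{self.name} is already running {self.current_task}")
        self.current_task = task

    def finish_task(self) -> None:
        self.current_task = None


slave = Slave("render01")
slave.enter_state("webserver")
slave.enter_state("folder-watch")
slave.start_task("frame 12")   # a task runs while both states stay active
slave.finish_task()            # the task ends, the states remain
```

The point of keeping states and the task in separate fields is that finishing a task never touches the states, which is the concurrency being asked for.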

Discussion? - absolutely. I already have pretty good notes on the Tile/Draft stuff as I have chatted to Bobo about this in the past and he needed Draft v2 in place for the advanced TileAssembler duties. Chad has hit the nail on the head with his comments about DBR. 2 major issues there. 1 is the state based concept and 2 is the 3rd party support which in the past has been restrictive to make this work. However, I believe Ryan has talked to Vlado in the past about what he needs to make this work? Perhaps other renderers such as MR are easier to communicate with?

Curious, what do you need 3rd party support for? Better logging? To me it seems like we’d just be able to have an event like “Start MR DBR” which runs some series of commands and “End MR DBR” that runs some other series of commands. Then you have some rules about what can or can’t happen while in the “MR DBR” state.

There are a few parts to the DBR puzzle which we should discuss. The main ones are getting the DBR process running in the first place, as well as determining how DBR jobs fit in Deadline’s normal job scheduling system. As Chad mentioned, this system could work with any process that has an undefined running time, but we’ll still refer to it as DBR for the sake of this discussion.

Before I dive into the details, I just want to confirm that the DBR system is meant to be an “interactive” render where the image is assembled in a viewport on the artists machine. It’s not like tile assembly where it’s just pushed out to the farm and forgotten about. In other words, the DBR “master” will be an artist’s workstation, and not a slave on the farm. Assuming that’s the case, there are a couple ways we could do this:

1) Stop the Slave and start the DBR process, and don’t allow the Slave to start until the DBR process has finished.

Studios have already written Slave right-click scripts to do this in the past. The scripts stop the slave remotely, disable it, set the Slave’s comment to indicate why it’s disabled, and then fire up the DBR process remotely. When they’re done, they just run another script to reverse this. We could write our own and ship these with Deadline out of the box. In this case, Deadline’s job scheduling system doesn’t play a role.

Pros:

  • This process allows you to pick the slaves you want to use, based on visible info like current task progress (if the slave is already rendering).
  • This allows slaves to be converted to DBR on demand.
  • The slave is unable to do anything else while DBR is running.

Cons:

  • It’s a manual process to convert slaves.
  • Need to remember to switch DBR slaves back to regular slaves.
  • Admins will probably want to lock this down so that artists don’t take down the farm.
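The convert-and-restore steps of option (1) can be sketched roughly as below. This is an in-memory model only: `SlaveRecord`, `convert_to_dbr` and `restore_from_dbr` are made-up names, and the real right-click scripts would issue remote commands rather than flip fields on an object.

```python
from dataclasses import dataclass


@dataclass
class SlaveRecord:
    """Hypothetical record of what the scripts change on a slave."""
    name: str
    running: bool = True       # is the Slave application running?
    enabled: bool = True       # is the slave enabled in the farm?
    comment: str = ""          # free-text comment shown to other users
    dbr_running: bool = False  # is the DBR process running on the machine?


def convert_to_dbr(slave: SlaveRecord, user: str) -> None:
    """Stop and disable the slave, say why, then start the DBR process."""
    slave.running = False
    slave.enabled = False
    slave.comment = f"Disabled for DBR by {user}"
    slave.dbr_running = True


def restore_from_dbr(slave: SlaveRecord) -> None:
    """Reverse convert_to_dbr so the slave rejoins the farm."""
    slave.dbr_running = False
    slave.comment = ""
    slave.enabled = True
    slave.running = True


box = SlaveRecord("render07")
convert_to_dbr(box, user="mike")
restore_from_dbr(box)  # the step that is easy to forget in a manual workflow
```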

2) Have the Slave run the DBR process.

We would write a ReserveDBR (working title) plugin that just launches the DBR process like any other rendering process. A ReserveDBR job would then be submitted to the farm that would “reserve” slaves. For example, if I want 10 slaves for my DBR, I would set the ReserveDBR job’s task count to 10. Once the ReserveDBR job has been picked up by enough slaves and the DBR process is running on them, I can then start my render. When my render is finished, I can just delete or suspend the ReserveDBR job to allow the slaves to move on to something else. We’ve already prototyped this type of system in the past, and it worked. In this case, Deadline’s job scheduling system plays a role.

Pros:

  • This system uses Deadline’s existing job/plugin architecture.
  • Starting the DBR process is automatic.
  • The DBR process is managed by the slave, so errors and crashes should be detectable.

Cons:

  • Depending on current queue priorities, it may take a while for the ReserveDBR job to reserve enough slaves, and ones that are already reserved would be sitting idle.
  • If I start my render before the ReserveDBR job has reserved all slaves, any slaves that pick up the ReserveDBR job afterwards will not be able to participate in the render, and will sit idle.
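To illustrate the second con, here is a small sketch that splits reserved slaves into those that can take part in the render and those that will sit idle. It assumes a slave can only participate if it picked up the ReserveDBR job no later than the moment the render started; the function name and the time units are hypothetical.

```python
def split_reserved(pickup_times, render_start):
    """Split slaves by whether they were reserved before the render started.

    pickup_times maps slave name -> time it picked up the ReserveDBR job.
    Returns (participating, idle) as two lists of slave names.
    """
    participating = [name for name, t in pickup_times.items() if t <= render_start]
    idle = [name for name, t in pickup_times.items() if t > render_start]
    return participating, idle


# Three slaves pick up the job at minute 0, 5 and 20; the render starts at minute 10.
pickups = {"render01": 0, "render02": 5, "render03": 20}
participating, idle = split_reserved(pickups, render_start=10)
# participating == ["render01", "render02"], idle == ["render03"]
```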

3) Others?

If you guys have other ways you could see this working, it would be great to hear them. Both (1) and (2) above have their pros and cons, so maybe a solution is just to do both and let the studios choose the method they want. Either way, we can already do everything I’ve described above with Deadline’s current architecture. Or maybe there is an “outside-the-box” solution that we haven’t thought of yet that might require us to make changes to Deadline’s core…

i think its just a job-type that never completes.
cb

It should still respect priority, so if you have a DBR job at priority 40, and a normal task based job gets submitted at priority 80, the slaves should drop out of the DBR job, do the normal task job, then return to the DBR job.
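As a sketch of that rule: a slave simply takes the highest priority job that currently has work for it, whether that job is state based or task based. The tuple layout and `pick_job` are assumptions for illustration, and whether a given DBR renderer lets a slave rejoin afterwards is outside the scope of this sketch.

```python
def pick_job(jobs):
    """Return the name of the highest priority job that has work, or None.

    jobs is a list of (name, priority, has_work) tuples.
    """
    candidates = [job for job in jobs if job[2]]
    if not candidates:
        return None
    return max(candidates, key=lambda job: job[1])[0]


# A DBR state job at priority 40 and a normal job at priority 80.
print(pick_job([("dbr", 40, True), ("comp", 80, True)]))   # comp
# Once the normal job has no tasks left, the slave returns to the DBR job.
print(pick_job([("dbr", 40, True), ("comp", 80, False)]))  # dbr
```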

Depending on the type of DBR state job (even if we call it DBR, there’s different software that would use it, like vray, mental ray, maxwell, fusion, etc) it might be useful to have the master workstation get updates on the utilization. Periodic reminders, like a lending library, where you’d get a message on the workstation saying “Do you still need DBR job 01’s cluster to be reserved?” and if they don’t confirm within 5 min, the job is suspended and the slaves freed.
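The lending-library check could look something like this. It is only a sketch: the function name, the use of seconds, and the three return values are assumptions, and the 5 minute window is the figure from the paragraph above.

```python
CONFIRM_WINDOW_SECONDS = 5 * 60  # "if they don't confirm within 5 min"


def reservation_action(reminder_sent_at, confirmed_at, now):
    """Decide what to do with a reserved DBR cluster after a reminder.

    All arguments are times in seconds; confirmed_at is None if the
    artist has not answered the reminder.
    Returns "keep", "suspend" or "wait".
    """
    if confirmed_at is not None:
        delay = confirmed_at - reminder_sent_at
        if 0 <= delay <= CONFIRM_WINDOW_SECONDS:
            return "keep"      # confirmed in time, keep the slaves reserved
    if now - reminder_sent_at >= CONFIRM_WINDOW_SECONDS:
        return "suspend"       # window elapsed, suspend the job and free the slaves
    return "wait"              # still inside the window, keep waiting
```

For example, a reminder sent at t=0 with no answer gives "wait" at t=100 and "suspend" at t=300.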

And for jobs that are not super high demand (most DBR renders have huge idle periods if you aren’t doing some sort of interactive path tracing) you’d want to allow for a task based job to also occur at that same time. Like if we were running a folder watch application or a webserver or doing GPU renders, we might want to be able to allow a normal task render to happen at the same time and have Deadline manage the process priority for us.

You could achieve this behavior by making the DBR job interruptible, but an issue with DBR renderers (unless something has changed recently) is that once a DBR render has started, slaves will not be able to participate in that render. So if a slave drops out mid render, it won’t be able to join in again. This means that if all slaves drop a DBR job when a render is in progress, the render would be starved of resources. Because of this, I don’t think it would be desirable to have the DBR job be preempted by a higher priority job if the DBR job has already started.

Chris and I were talking, and an idea we already have on the wish list might be helpful here. The idea is to add an additional option for task timeouts where the task is marked complete instead of throwing an error. In the DBR submitter, the artist would choose a timeout for the job so that it gets marked as complete after a certain time. This way, if they forget about it, it will eventually complete itself instead of running indefinitely so that the slaves move on to something else.
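In sketch form, the extra timeout option only changes what status a task ends up with once the timeout is hit. The function and the status strings below are hypothetical and are not Deadline's actual values.

```python
def status_after_timeout(elapsed_minutes, timeout_minutes, complete_on_timeout):
    """Return the task status given how long it has been running.

    complete_on_timeout=False models the existing behaviour (timeout is an
    error); True models the proposed option (timeout marks the task complete).
    """
    if elapsed_minutes < timeout_minutes:
        return "rendering"
    return "completed" if complete_on_timeout else "failed"


# A forgotten DBR job with a 4 hour (240 min) timeout, checked after 5 hours.
print(status_after_timeout(300, 240, complete_on_timeout=True))   # completed
print(status_after_timeout(300, 240, complete_on_timeout=False))  # failed
```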

I know we all want this to just work, but it seems like building around bugs/limitations in current DBR renderers is goofy.

Wouldn’t the timeout idea violate the concern about the slaves being able to rejoin the render again? Like if the timeout is 10 min, wouldn’t we be forcing the entire DBR session to restart for all slaves and the workstation every 10 min? That seems annoying enough that people would up the timeout to 4 hours and still leave the farm idled.

Yeah, the timeout is really just a fail safe. At least 4 hours is better than never…

The heart of the issue is that DBR (with its current limitations) doesn’t fit the render queue model very well. It’s the primary reason why we haven’t done it yet, and it’s also part of the reason we added the tile rendering feature in the first place. The main hurdle is slaves not being able to join a DBR after it’s already started. Maybe we need to start a conversation with Chaos Group to see if this is something that could be added to VRay in the future…

But other distributed renderers do handle dynamic slave allocation gracefully. Eyeon Fusion, for instance, does. So it might be a case where a general case system is put into place, and you add specific oddball tweaks when you want to configure it for picky applications.

By “3rd party” I was referring to the different renderer packages out there. Better logging? Absolutely.

Based on Ryan’s DBR puzzle initial response, I was thinking along the lines of option 2. I want the slave to run the show, provide back the stdout,stderr,stack trace, etc. (Need to make sure only 1 slave on a physical machine tries to run up a total of 1 DBR process at any time). I believe the “options 2 cons” of slave reserving and their timing before a DBR process could start involving all slaves, could be left to the users to configure by smart use of “pools” and “groups”?

I think there are 2 components to DBR support in Deadline.
(1) The core Deadline ‘plugin’ support, which we are chatting about here!
(2) software specific UI/integration work to allow seamless workflow between Deadline and whatever renderer we are trying to play nice with. In the example of VRay in 3dsMax, this could be another tab in the SMTD called “DBR” or a series of standalone customisation scripts stored in the Deadline repository (just like ‘client setup’ / ‘scripts/submission’ for each Deadline supported app), which allows for a software / renderer specific UI to allow easy Deadline integration. The thing here is to help users interface with Deadline to prep slaves in the normal farm, so they can be made ready and used in a DBR render which will be executed from within the 3D application environment. The artist shouldn’t have to dive into Deadline monitor and back out again, to get the DBR process all set up? Or should they? My example is the current VRay DBR UI. You have to type in the DNS names or IP addresses of the slaves you want to use. This is just an *.ini file on the local machine that gets updated between sessions of 3dsMax. So, have a maxscript interface to edit and auto-update these entries once Deadline has reported back which machines have been reserved for DBR for that user.
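The auto-update step might be sketched like this, in Python rather than maxscript. Big assumption: the real layout of the VRay DBR settings file is not reproduced here, so the one-host-per-line output is purely a placeholder, and `render_host_list` is a made-up helper.

```python
def render_host_list(reserved_hosts):
    """Build the text for a host list file from the reserved machines.

    Duplicates are dropped and the original order is kept. The
    one-host-per-line layout is a placeholder, not the real VRay format.
    """
    seen = set()
    lines = []
    for host in reserved_hosts:
        if host not in seen:
            seen.add(host)
            lines.append(host)
    return "".join(line + "\n" for line in lines)


# Hosts as they might be reported back once the slaves are reserved.
text = render_host_list(["render01", "render02", "render01"])
print(text, end="")
```

Writing `text` to the local settings file would then replace the manual typing of DNS names or IP addresses.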

I believe MR via 3dsMax is the same as VRay and doesn’t support dynamic slave allocation either. :(

VRay v3 private beta has opened recently. I’m seeing on the beta forum features being suggested one day and being implemented by Vlado within a couple of days and available in the nightly builds…you guys should talk… :) (Vlado is at EUE this year. Couple of the guys from Burrows are going. I’m not.)

Mike
