Hello, my brothers/sisters/others in arms of the well-documented Deadline-API,
I am writing to you because I have a dream.
I am pretty new to Deadline, so I am looking for some ideas or hints on how to realize my idea:
A Job that dynamically changes the frames a task has to calculate when one node is done but others are still rendering.
E.g. I have 5 slaves (each performing one task) rendering with a chunk size of 5. Slaves 1, 2, 3 and 4 are done. Slave 5 is still rendering frame 21 because something in those frames makes them unusually slow.
I want slaves 1, 2, 3 and 4 to split the remaining 4 frames (22, 23, 24, 25) between them so the job finishes sooner. In effect, the chunk size gets reduced dynamically.
I am thankful for any answer that points me to workflows and/or scripting ideas.
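Roughly, the split I am picturing looks like the sketch below. It is plain Python with no Deadline API calls; the function name `split_remaining_frames` and the worker names are made up for illustration, and a round-robin split is only one possible strategy.

```python
def split_remaining_frames(remaining_frames, idle_workers):
    """Distribute frames round-robin across the idle workers.

    Returns a dict mapping each worker name to the list of frames it gets.
    """
    if not idle_workers:
        raise ValueError("need at least one idle worker")
    assignment = {worker: [] for worker in idle_workers}
    for index, frame in enumerate(remaining_frames):
        worker = idle_workers[index % len(idle_workers)]
        assignment[worker].append(frame)
    return assignment


# The example from above: slave 5 is stuck on frame 21, the rest are idle.
plan = split_remaining_frames(
    [22, 23, 24, 25], ["slave1", "slave2", "slave3", "slave4"]
)
```

With four idle slaves and four remaining frames, each slave ends up with exactly one frame.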
+1 for this idea; there are quite a few issues with it, though.
If you were rendering large frame ranges, you’d have to calculate the number of available slaves/workers, which of course could change while this is being calculated. 1000 frames divided by 10 free nodes could become 1000 frames divided by 100 free nodes five minutes later; do you keep recalculating?
Also, if the scene is very large, it could hurt render times: pulling the scene over the network and loading it on another node could take longer than letting the node that already has it loaded continue rendering.
Ideally you’ll be submitting the job as a batch render and doing 1 frame at a time, or using something like V-Ray offload distributed rendering, which does this kind of thing by reducing the bucket size as the image completes.
If I understand you correctly, your idea is to set the chunk size of the tasks to 1 and offer a limited number of slaves, so it automatically keeps track of it?
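To put numbers on the recalculation problem, here is a small sketch in plain Python. The helper `chunk_size_for` is hypothetical and not part of Deadline; it only shows how much the "ideal" chunk size swings when the number of free nodes changes.

```python
import math


def chunk_size_for(remaining_frames, free_workers):
    """Return the chunk size that spreads the frames evenly over the workers."""
    if free_workers < 1:
        raise ValueError("need at least one free worker")
    # Round up so every frame is covered; never go below one frame per chunk.
    return max(1, math.ceil(remaining_frames / free_workers))


now = chunk_size_for(1000, 10)     # 10 free nodes right now
later = chunk_size_for(1000, 100)  # 100 free nodes a few minutes later
```

Here `now` is 100 and `later` is 10, so any chunking decided at submission time can be far off a few minutes later.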
Yeeees and no… due to our internal workflow the chunk size has a reason and I can’t just change it to 1. As you said, it would add a lot of network traffic.
That’s why my idea is to send frames out dynamically, so traffic is only produced under specific circumstances.
Does the above statement mean that the task is hanging indefinitely? Or are these frames just taking longer because they are more complex than the previous frames? Is this an issue with specific plugins, and which ones are you using? If your plugin supports batch rendering, you can likely just submit one frame per task, because the scene is going to be held in memory.
Unfortunately, there is no event that we can trigger on when a task has run too long in order to do something like this. There are no task-based events in general. There would likely need to be a task timeout event where you could trigger a script.
What you could do is trigger on an event like House Cleaning, which runs every 60 seconds, or use an alternative method of your choosing to check all of the jobs for tasks that are taking too long. You would need to decide what determines whether a frame redistribution needs to happen. You would then suspend the current task and use the AppendJobFrameRange function to redistribute/append the task’s frame range as chunks of 1.
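The decision logic for such a check could look something like the sketch below. It is plain Python and deliberately makes no Deadline calls: `TaskInfo`, `needs_redistribution`, `frames_to_requeue` and the multiplier rule are all assumptions for illustration, and how you find out which frames of a task are already complete (for example by checking output files) is left open. The resulting frame list is what you would hand to AppendJobFrameRange with a chunk size of 1; check the scripting API reference for its exact arguments.

```python
from dataclasses import dataclass, field


@dataclass
class TaskInfo:
    """Hypothetical stand-in for task data; not a Deadline type."""
    task_id: int
    frames: list
    elapsed_seconds: float
    completed_frames: set = field(default_factory=set)


def needs_redistribution(task, average_task_seconds, multiplier=2.0):
    """Flag a task once it runs longer than multiplier x the average task time."""
    return task.elapsed_seconds > average_task_seconds * multiplier


def frames_to_requeue(task):
    """Frames of the task that have not been completed yet, in order."""
    return [f for f in task.frames if f not in task.completed_frames]


def as_frame_list(frames):
    """Format frames as a comma-separated frame list string."""
    return ",".join(str(f) for f in frames)


# A task covering frames 21-25 where only frame 21 has finished.
slow_task = TaskInfo(task_id=4, frames=[21, 22, 23, 24, 25],
                     elapsed_seconds=900.0, completed_frames={21})

if needs_redistribution(slow_task, average_task_seconds=300.0):
    frame_list = as_frame_list(frames_to_requeue(slow_task))
    # In a real event plugin: suspend the task, then append frame_list
    # to the job as chunks of 1 (assumed step, not shown here).
```

Keeping the decision logic free of Deadline calls makes it easy to test outside the farm before wiring it into the House Cleaning callback.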
I tested some stuff with the housecleaning event you talked about. Like here
I didn’t know about this event. It’s actually quite cool and offers a lot of possibilities! So it could lead me to the solution you suggested.
While testing I ran into one problem, though. How can I see a traceback or my own log output when working with this “House Cleaning” event? Here it says that job events and reports are stored in the reports of jobs or slaves. Also, I ticked “House Cleaning” -> “Run House Cleaning in a Separate Process”.
When I opened this separate House Cleaning log file, there was no entry about my script at all. “self.LogInfo("I like hazelnuts")” didn’t do anything either.
from Deadline.Events import *

import taskTimeout


def GetDeadlineEventListener():
    """
    This is the function that Deadline calls to get an instance of the
    main DeadlineEventListener class.
    :return:
    """
    return ScheduledEvent()


def CleanupDeadlineEventListener(deadlinePlugin):
    """
    This is the function that Deadline calls when the event plugin is
    no longer in use so that it can get cleaned up.
    :param deadlinePlugin:
    :return:
    """
    deadlinePlugin.Cleanup()


class ScheduledEvent(DeadlineEventListener):
    """
    This is the main DeadlineEventListener class for ScheduledEvent.
    """

    def __init__(self):
        # Set up the event callbacks here
        self.OnHouseCleaningCallback += self.OnHouseCleaning

    def Cleanup(self):
        del self.OnHouseCleaningCallback

    def OnHouseCleaning(self):
        # check which checks should be done
        timeout_active = self.GetConfigEntry("TimeOutActive")
        self.LogInfo("looking for hazelnuts ")  # <-- nothing to find in the house cleaning.log

        # Process TimeoutCheck
        # Note: if GetConfigEntry returns a string, "False" is still truthy here.
        if timeout_active:
            self.LogInfo("hazelnuts timeout")  # <-- nothing to find in the house cleaning.log
            time_out_multiplier = self.GetConfigEntry("TimeOutMultiplier")
            job_progress_before_check = self.GetConfigEntry("JobProgressBeforeCheck")
            min_frames_job = self.GetConfigEntry("MinFramesJob")
            start_time_offset_in_seconds = self.GetConfigEntry("startTimeOffset")
            # Each entry gets its own variable, so the start time offset is not overwritten.
            test_job_name = self.GetConfigEntry("TestJob")
            do_test = self.GetConfigEntry("DoTest")
            taskTimeout.checkTasks(time_out_multiplier, job_progress_before_check, min_frames_job,
                                   start_time_offset_in_seconds, test_job_name, do_test)
Yep, it’s running, and no, there is no entry in the Pulse log. I just realized that when I trigger House Cleaning manually via the Monitor, it works, but the standard house cleaning doesn’t.