
orphaned task not detected

We occasionally find a task here and there where the assigned machine appears to have been stuck on it for several days.

In this case the machine was actually rendering something else; in another, the slave was actually offline (not stalled).

For this particular job, the limitgroups document is:

{ "_id" : "5423794fd48f140d6446e37d", "LastWriteTime" : { "$date" : 1412120269606 }, "Props" : { "Limit" : 1, "RelPer" : -1, "Slaves" : [], "White" : false, "SlavesEx" : [] }, "Name" : "5423794fd48f140d6446e37d", "Stubs" : [], "StubCount" : 0, "StubLevel" : 0, "Type" : 1 }

The JobTasks document:

{ "_id" : "5423794fd48f140d6446e37d_0", "JobID" : "5423794fd48f140d6446e37d", "TaskID" : 0, "Frames" : "1-1", "Slave" : "LAPRO0415", "Stat" : 4, "Errs" : 0, "Start" : { "$date" : 1411612105970 }, "StartRen" : { "$date" : -62135596800000 }, "Comp" : { "$date" : -62135596800000 }, "WtgStrt" : true }

Job:

Where it gets odd:

"QueuedChunks" : 1
"RenderingChunks" : 0

The job reports one queued chunk and nothing rendering, even though the task above is still assigned to LAPRO0415 and flagged as waiting to start.

{ "_id" : "5423794fd48f140d6446e37d", "LastWriteTime" : { "$date" : 1411611288403 }, "Props" : { "Name" : "[KING] Upload Proxy Quicktime: KB_084_5040_avidref-Full_proxy_v0001", "User" : "ryan.valade", "Cmmt" : "", "CmmtTag" : "", "Dept" : "", "Frames" : "1", "Chunk" : 1, "Tasks" : 1, "Grp" : "python", "Pool" : "python", "SecPool" : "", "Pri" : 17, "ReqAss" : [], "ScrDep" : [], "Conc" : 1, "ConcLimt" : false, "AuxSync" : false, "Int" : false, "Seq" : false, "Reload" : false, "NoEvnt" : false, "OnComp" : 1, "AutoTime" : false, "TimeScrpt" : false, "MinTime" : 0, "MaxTime" : 72000, "Timeout" : 1, "StartTime" : 0, "Dep" : [ "5423794ad48f1419ac35dc9e" ], "DepFrame" : false, "DepFrameStart" : 0, "DepFrameEnd" : 0, "DepComp" : true, "DepDel" : true, "DepFail" : false, "DepPer" : -1, "NoBad" : false, "JobFailOvr" : false, "JobFailErr" : 0, "TskFailOvr" : false, "TskFailErr" : 0, "SndWarn" : true, "NotOvr" : true, "SndEmail" : false, "SndPopup" : false, "NotEmail" : [], "NotUser" : [ "ryan.valade" ], "NotNote" : "", "Limits" : [ "Upload" ], "ListedSlaves" : [], "White" : false, "MachLmt" : 1, "MachLmtProg" : -1, "PrJobScrp" : "", "PoJobScrp" : "", "PrTskScrp" : "", "PoTskScrp" : "", "Schd" : 0, "SchdDays" : 1, "SchdDate" : { "$date" : -62135596800000 }, "SchdDateRan" : { "$date" : -62135596800000 }, "PlugInfo" : { "LocalRendering" : "0", "StrictErrorChecking" : "1", "MaxProcessors" : "0", "Build" : "64bit", "ProjectPath" : "", "OutputFilePath" : "", "OutputFilePrefix" : "", "Camera" : "", "IgnoreError211" : "0", "ScriptJob" : "True", "Version" : "2.6", "Arguments" : "//inferno2/projects/king/scenes/KB_084_5040/images/reference/avidref-Full/v0001_rva_ingested/quicktime_hd/KB_084_5040_avidref-Full_proxy_v0001.mov" }, "Env" : { "SCL_DEVELOPER_MODE" : "False" }, "EnvOnly" : false, "PlugDir" : "", "Ex0" : "", "Ex1" : "", "Ex2" : "", "Ex3" : "", "Ex4" : "", "Ex5" : "", "Ex6" : "", "Ex7" : "", "Ex8" : "", "Ex9" : "", "ExDic" : {} }, "IsSub" : true, "Mach" : "LAPRO0618", "Date" : { "$date" : 1411610959152 }, "DateStart" : { "$date" : 1411612105970 }, "DateComp" : { "$date" : -62135596800000 }, "Plug" : "Python", "OutDir" : [], "OutFile" : [], "Main" : false, "MainStart" : 0, "MainEnd" : 0, "Tile" : false, "TileFrame" : 0, "TileCount" : 0, "TileX" : 0, "TileY" : 0, "Stat" : 1, "Aux" : [ "uploadQT.py" ], "Bad" : [], "CompletedChunks" : 0, "QueuedChunks" : 1, "SuspendedChunks" : 0, "RenderingChunks" : 0, "FailedChunks" : 0, "PendingChunks" : 0, "Errs" : 0, "DataSize" : 2244 }

Maybe just a corrupted job from our last week of madness?

Hey Laszlo,

In Deadline 7 beta 4, these orphaned tasks that get stuck in the Waiting To Start phase should now be cleaned up properly.
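
In the meantime, a task stuck like this can be requeued manually, either from the Monitor (right-click the task and requeue it) or by scripting it against deadlinecommand. A rough sketch is below; the "RequeueJobTasks" argument name is an assumption on my part, so check "deadlinecommand -Help" on your install before relying on it.

# Workaround sketch: requeue the stuck task so a slave can pick it up again.
# "RequeueJobTasks" is an assumed deadlinecommand argument name -- verify it
# against "deadlinecommand -Help" for your Deadline version.
import subprocess

job_id = "5423794fd48f140d6446e37d"    # the job from this thread
task_id = "0"                          # the stuck task's TaskID

subprocess.check_call(["deadlinecommand", "RequeueJobTasks", job_id, task_id])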

Cheers,
Ryan
