We have a problem with LightWave renders using Deadline (latest version).
The problem is that Deadline reports a job as completed even though it hasn’t rendered all the frames.
In the attachment I’ve listed the tasks for one job. You can see that some tasks render in 2 or 4 seconds, while they should normally take 50 seconds or more. When we check the destination folder on our storage server, the frames from the tasks that rendered too fast aren’t there. I’ve marked these fast, fake renders in red in the attached Excel file. Why does Deadline report these tasks as complete? As it stands, we have to verify every job manually, which takes a lot of time.
Sometimes some slaves give an error (but as you can see, these aren’t necessarily the ones with a too-fast render time). I’ve also placed one of these error logs on the second sheet of the same Excel file. We run LightWave from a network location, and the error says it can’t find the FPrime executable. However, when we log onto the machine, the path to the executable is reachable, and later on the machine renders the task anyway.
We could try setting a minimum render time, but that’s a difficult workaround, as we can’t always predict how long a task should take to render.
In other words, the main question is: why does Deadline say that certain tasks are done when the frames aren’t there or the slave rendered too fast?
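Until the cause is found, the manual check of the destination folder could be scripted. The following is only a rough sketch: the function name, the zero-padded numbering, and the file extension are assumptions for illustration, and would need to be adapted to how the scene actually names its output files.

```python
import os


def missing_frames(output_dir, prefix, first, last, padding=4, ext=".png"):
    """Return the frame numbers in [first, last] that have no usable file on disk.

    Assumes files are named <prefix><zero-padded frame><ext>; this naming
    scheme is hypothetical and must match your own output settings.
    """
    missing = []
    for frame in range(first, last + 1):
        name = f"{prefix}{frame:0{padding}d}{ext}"
        path = os.path.join(output_dir, name)
        # Treat an absent file and a zero-byte file the same way: not rendered.
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            missing.append(frame)
    return missing
```

Calling `missing_frames` with the job's output folder, file prefix, and frame range returns the list of frames that still need to be rendered, so only those tasks have to be requeued.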
Can you send us a log or two from a task that renders quickly and doesn’t produce a frame? Just right-click on the task in the Monitor and select Task Reports -> View Log Reports.
Also, when you submit the job, do you have the “ScreamerNet” option enabled? If you do, try disabling it for a few jobs to see if that improves things.
Finally, that error you’re getting is likely a result of using a shared Lightwave installation. In order for Deadline to check the bitness of a file, it must open it, and it’s possible that 2 or more machines could attempt this at the same time. The only workarounds are to not check for bitness (ie: set “None” as the build to force), or to move away from a shared installation setup.
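To illustrate why a bitness check has to open the executable: on Windows, the information lives in the file's PE header. The sketch below is not Deadline's actual code, just a minimal example of the general technique, and the function name is made up for illustration.

```python
import struct

# Machine field values from the PE/COFF header.
MACHINE_TYPES = {0x014C: "32bit", 0x8664: "64bit"}


def pe_bitness(path):
    """Return "32bit", "64bit", or None if the file is not a recognised PE file."""
    with open(path, "rb") as f:
        header = f.read(64)
        if len(header) < 64 or header[:2] != b"MZ":
            return None
        # Offset 0x3C of the DOS header holds the position of the PE signature.
        (pe_offset,) = struct.unpack_from("<I", header, 0x3C)
        f.seek(pe_offset)
        signature_and_machine = f.read(6)
    if len(signature_and_machine) < 6 or signature_and_machine[:4] != b"PE\0\0":
        return None
    # The 2-byte Machine field follows the 4-byte signature directly.
    (machine,) = struct.unpack_from("<H", signature_and_machine, 4)
    return MACHINE_TYPES.get(machine)
```

If the `open` call fails because the file on the share cannot be read at that moment, a check like this has no way of telling the bitness, which is consistent with the executable being reported as not found.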
The artist who owns these renders tells me he has already disabled ScreamerNet because he uses FPrime in his renders. Human error is always possible, but he checked his files. He’s gone for the day, but we can send you the logs tomorrow. We’re starting to think that this problem is related to rendering with FPrime through Deadline?
In the submitter, the artist can set the Build option to be “None”, “32bit”, or “64bit”. Setting it to “None” will remove the bitness check.
If the “Screamernet” option is disabled, then all Deadline is doing is running a command line render and waiting for it to complete, and since the problem is random, odds are that the problem is related to FPrime. The logs should help confirm or deny this though.
At the bottom of the Deadline forum page (viewforum.php?f=11) there should be a link that says “Subscribe forum”. Click this to receive emails for all new forum posts.
0: Task timeout is disabled.
0: Loaded job: Sh01810_Waterstaf_Snake_Bubbles_Vs010.lws (003_089_999_490b7db9)
0: INFO: StartJob: initializing script plugin Lightwave
0: INFO: About: Lightwave Plugin for Deadline
0: INFO: Using FPrime for rendering
0: INFO: Enforcing 64 bit build of FPrime
0: Plugin rendering frame(s): 20-24
0: INFO: Any popup windows with titles matching the regular expression “.Unable To Locate.” will be handled by pressing “OK”
0: INFO: Any stdout that matches the regular expression “Error:(.)" will be handled as appropriate
0: INFO: Any stdout that matches the regular expression "Can’t find "(.)?”…" will be handled as appropriate
0: INFO: Any stdout that matches the regular expression “Can’t open scene file.” will be handled as appropriate
0: INFO: Any stdout that matches the regular expression “.bad magic number.” will be handled as appropriate
0: INFO: Any stdout that matches the regular expression “.Unable to access the scene file.” will be handled as appropriate
0: INFO: Any stdout that matches the regular expression "(Rendering frame [0-9]+). pass ([0-9]+)[^0-9]+([0-9]+)." will be handled as appropriate
0: INFO: Any stdout that matches the regular expression “Frame completed” will be handled as appropriate
0: INFO: Stdout Handling Enabled: True
0: INFO: Popup Handling Enabled: True
0: INFO: Using Process Tree: True
0: INFO: Hiding DOS Window: True
0: INFO: Creating New Console: False
0: INFO: Render Executable: “S:\LW9.6_x64_TEST\Programs\WSN.exe”
0: INFO: Render Argument: -3 -c"S:/LW9.6_x64_TEST/Configs/_Common" -d"C:/Temp" “C:\Documents and Settings\Administrator\Local Settings\Temp\Sh01810_Waterstaf_Snake_Bubbles_Vs010_0.lws” 20 24
0: INFO: Startup Directory: “S:\LW9.6_x64_TEST\Programs”
0: INFO: Process Priority: BelowNormal
0: INFO: Process is now running
0: STDOUT: WSN Layout Launcher 1.01
0: STDOUT: Using Config dir "S:/LW9.6_x64_TEST/Configs/Common\LW9-64.cfg"
0: STDOUT: Using command-line Content dir “C:/Temp”
0: STDOUT: Scene file output file prefix found: I:\Hector_Raveleijn_10\Renders_3D\Afl12\Sh01810\Sh01810_Waterstaf_WaterSnake_Bubbles_Vs010
0: STDOUT: Cannot create temporary config directory S:/LW9.6_x64_TEST/Configs/_Common\tempCFG_f0
0: INFO: Process exit code: 0
Slave Machine = Rf123
Slave Version = v4.1.0.43205 R
Plugin Name = Lightwave
The report above is from the task that rendered for 3 seconds. This one is from the other task (there were 2 tasks), which rendered for about 40 minutes:
Log Message
0: Task timeout is disabled.
0: Loaded job: Sh01810_Waterstaf_Snake_Bubbles_Vs010.lws (003_089_999_490b7db9)
0: INFO: StartJob: initializing script plugin Lightwave
0: INFO: About: Lightwave Plugin for Deadline
0: INFO: Using FPrime for rendering
0: INFO: Enforcing 64 bit build of FPrime
0: Plugin rendering frame(s): 25
0: INFO: Any popup windows with titles matching the regular expression “.Unable To Locate.” will be handled by pressing “OK”
0: INFO: Any stdout that matches the regular expression “Error:(.)" will be handled as appropriate
0: INFO: Any stdout that matches the regular expression "Can’t find "(.)?”…" will be handled as appropriate
0: INFO: Any stdout that matches the regular expression “Can’t open scene file.” will be handled as appropriate
0: INFO: Any stdout that matches the regular expression “.bad magic number.” will be handled as appropriate
0: INFO: Any stdout that matches the regular expression “.Unable to access the scene file.” will be handled as appropriate
0: INFO: Any stdout that matches the regular expression "(Rendering frame [0-9]+). pass ([0-9]+)[^0-9]+([0-9]+)." will be handled as appropriate
0: INFO: Any stdout that matches the regular expression “Frame completed” will be handled as appropriate
0: INFO: Stdout Handling Enabled: True
0: INFO: Popup Handling Enabled: True
0: INFO: Using Process Tree: True
0: INFO: Hiding DOS Window: True
0: INFO: Creating New Console: False
0: INFO: Render Executable: “S:\LW9.6_x64_TEST\Programs\WSN.exe”
0: INFO: Render Argument: -3 -c"S:/LW9.6_x64_TEST/Configs/_Common" -d"C:/Temp" “C:\Documents and Settings\Administrator\Local Settings\Temp\Sh01810_Waterstaf_Snake_Bubbles_Vs010_0.lws” 25 25
0: INFO: Startup Directory: “S:\LW9.6_x64_TEST\Programs”
0: INFO: Process Priority: BelowNormal
0: INFO: Process is now running
0: STDOUT: WSN Layout Launcher 1.01
0: STDOUT: Using Config dir "S:/LW9.6_x64_TEST/Configs/Common\LW9-64.cfg"
0: STDOUT: Using command-line Content dir “C:/Temp”
0: STDOUT: Scene file output file prefix found: I:\Hector_Raveleijn_10\Renders_3D\Afl12\Sh01810\Sh01810_Waterstaf_WaterSnake_Bubbles_Vs010
0: STDOUT: Found WSNClient plugin in CFG : S:\LW9.6_x64_TEST\Plugins\3rdParty\fp3_64.p
0: STDOUT: Layout: S:\LW9.6_x64_TEST\Programs\Lightwav.exe
0: STDOUT: Cmdline= -0 -c"S:/LW9.6_x64_TEST/Configs/Common\tempCFG_fa4"
0: STDOUT: Layout launched and render commands sent.
0: STDOUT: Layout successfully started, client is running.
0: STDOUT: WSN now in relay mode.
0: STDOUT: Client: Layout Client Initializing
0: STDOUT: Client: Setting content directory to C:/Temp
0: STDOUT: Client: Loading scene C:\Documents and Settings\Administrator\Local Settings\Temp\Sh01810_Waterstaf_Snake_Bubbles_Vs010_0.lws
0: STDOUT: Client: Scene loaded.
0: STDOUT: Client: Output RGB=I:\Hector_Raveleijn_10\Renders_3D\Afl12\Sh01810\Sh01810_Waterstaf_WaterSnake_Bubbles_Vs010
0: STDOUT: Client: Render complete.
0: STDOUT: Layout exited, WSN now cleaning up.
0: STDOUT: Total rendering time: 2636.38s. Last frame took 2626.84s.
0: INFO: Process exit code: 0
Slave Machine = Rf129
Slave Version = v4.1.0.43205 R
Plugin Name = Lightwave
Example 2 (this one had 2 logs)
report 1
Log Message
0: Task timeout is disabled.
0: Loaded job: Sh01970_Waterstaf_Snake_bubbles_Vs010.lws (003_089_999_3dc57025)
0: INFO: StartJob: initializing script plugin Lightwave
0: INFO: About: Lightwave Plugin for Deadline
0: INFO: Using FPrime for rendering
0: INFO: Enforcing 64 bit build of FPrime
0: Plugin rendering frame(s): 161-162
0: INFO: Any popup windows with titles matching the regular expression “.Unable To Locate.” will be handled by pressing “OK”
0: INFO: Any stdout that matches the regular expression “Error:(.)" will be handled as appropriate
0: INFO: Any stdout that matches the regular expression "Can’t find "(.)?”…" will be handled as appropriate
0: INFO: Any stdout that matches the regular expression “Can’t open scene file.” will be handled as appropriate
0: INFO: Any stdout that matches the regular expression “.bad magic number.” will be handled as appropriate
0: INFO: Any stdout that matches the regular expression “.Unable to access the scene file.” will be handled as appropriate
0: INFO: Any stdout that matches the regular expression "(Rendering frame [0-9]+). pass ([0-9]+)[^0-9]+([0-9]+)." will be handled as appropriate
0: INFO: Any stdout that matches the regular expression “Frame completed” will be handled as appropriate
0: INFO: Stdout Handling Enabled: True
0: INFO: Popup Handling Enabled: True
0: INFO: Using Process Tree: True
0: INFO: Hiding DOS Window: True
0: INFO: Creating New Console: False
0: INFO: Render Executable: “S:\LW9.6_x64_TEST\Programs\WSN.exe”
0: INFO: Render Argument: -3 -c"S:/LW9.6_x64_TEST/Configs/_Common" -d"C:/Temp" “C:\Documents and Settings\Administrator\Local Settings\Temp\Sh01970_Waterstaf_Snake_bubbles_Vs010_0.lws” 161 162
0: INFO: Startup Directory: “S:\LW9.6_x64_TEST\Programs”
0: INFO: Process Priority: BelowNormal
0: INFO: Process is now running
0: STDOUT: WSN Layout Launcher 1.01
0: STDOUT: Using Config dir "S:/LW9.6_x64_TEST/Configs/Common\LW9-64.cfg"
0: STDOUT: Using command-line Content dir “C:/Temp”
0: STDOUT: Scene file output file prefix found: I:\Hector_Raveleijn_10\Renders_3D\Afl12\Sh01970\Waterstaf\Sh01970_Waterstaf_WaterSnake_Bubbles_Vs010
0: STDOUT: Cannot create temporary config directory S:/LW9.6_x64_TEST/Configs/_Common\tempCFG_fec
0: INFO: Process exit code: 0
Slave Machine = Rf123
Slave Version = v4.1.0.43205 R
Plugin Name = Lightwave
report 2
Log Message
0: Task timeout is disabled.
0: Loaded job: Sh01970_Waterstaf_Snake_bubbles_Vs010.lws (003_089_999_3dc57025)
0: INFO: StartJob: initializing script plugin Lightwave
0: INFO: About: Lightwave Plugin for Deadline
0: INFO: Using FPrime for rendering
0: INFO: Enforcing 64 bit build of FPrime
0: Plugin rendering frame(s): 161-162
0: INFO: Any popup windows with titles matching the regular expression “.Unable To Locate.” will be handled by pressing “OK”
0: INFO: Any stdout that matches the regular expression “Error:(.)" will be handled as appropriate
0: INFO: Any stdout that matches the regular expression "Can’t find "(.)?”…" will be handled as appropriate
0: INFO: Any stdout that matches the regular expression “Can’t open scene file.” will be handled as appropriate
0: INFO: Any stdout that matches the regular expression “.bad magic number.” will be handled as appropriate
0: INFO: Any stdout that matches the regular expression “.Unable to access the scene file.” will be handled as appropriate
0: INFO: Any stdout that matches the regular expression "(Rendering frame [0-9]+). pass ([0-9]+)[^0-9]+([0-9]+)." will be handled as appropriate
0: INFO: Any stdout that matches the regular expression “Frame completed” will be handled as appropriate
0: INFO: Stdout Handling Enabled: True
0: INFO: Popup Handling Enabled: True
0: INFO: Using Process Tree: True
0: INFO: Hiding DOS Window: True
0: INFO: Creating New Console: False
0: INFO: Render Executable: “S:\LW9.6_x64_TEST\Programs\WSN.exe”
0: INFO: Render Argument: -3 -c"S:/LW9.6_x64_TEST/Configs/_Common" -d"C:/Temp" “C:\Documents and Settings\Administrator\Local Settings\Temp\Sh01970_Waterstaf_Snake_bubbles_Vs010_0.lws” 161 162
0: INFO: Startup Directory: “S:\LW9.6_x64_TEST\Programs”
0: INFO: Process Priority: BelowNormal
0: INFO: Process is now running
0: STDOUT: WSN Layout Launcher 1.01
0: STDOUT: Using Config dir "S:/LW9.6_x64_TEST/Configs/Common\LW9-64.cfg"
0: STDOUT: Using command-line Content dir “C:/Temp”
0: STDOUT: Scene file output file prefix found: I:\Hector_Raveleijn_10\Renders_3D\Afl12\Sh01970\Waterstaf\Sh01970_Waterstaf_WaterSnake_Bubbles_Vs010
0: STDOUT: Cannot create temporary config directory S:/LW9.6_x64_TEST/Configs/_Common\tempCFG_234
0: INFO: Process exit code: 0
process: Lightwave0
0: INFO: Starting monitored managed process Lightwave0
0: INFO: Any popup windows with titles matching the regular expression “.Unable To Locate.” will be handled by pressing “OK”
0: INFO: Any stdout that matches the regular expression “Error:(.)" will be handled as appropriate
0: INFO: Any stdout that matches the regular expression "Can’t find "(.)?”…" will be handled as appropriate
0: INFO: Any stdout that matches the regular expression “Can’t open scene file.” will be handled as appropriate
0: INFO: Any stdout that matches the regular expression “.bad magic number.” will be handled as appropriate
0: INFO: Any stdout that matches the regular expression “.Unable to access the scene file.” will be handled as appropriate
0: INFO: Any stdout that matches the regular expression "(Rendering frame [0-9]+). pass ([0-9]+)[^0-9]+([0-9]+)." will be handled as appropriate
0: INFO: Any stdout that matches the regular expression “Frame completed” will be handled as appropriate
0: INFO: Stdout Handling Enabled: True
0: INFO: Popup Handling Enabled: True
0: INFO: Using Process Tree: True
0: INFO: Hiding DOS Window: True
0: INFO: Creating New Console: False
0: INFO: Render Executable: “S:\LW9.6_x32_TEST\Programs\WSN.exe”
0: INFO: Render Argument: -2 -c"S:/LW9.6_x64_TEST/Configs/_Common" -d"C:/Temp" “C:\Documents and Settings\Administrator\Local Settings\Application Data\Prime Focus\Deadline\slave\jobsData\job0” “C:\Documents and Settings\Administrator\Local Settings\Application Data\Prime Focus\Deadline\slave\jobsData\ack0”
0: INFO: Startup Directory: “S:\LW9.6_x32_TEST\Programs”
0: INFO: Process Priority: BelowNormal
0: INFO: Process is now running
0: INFO: Sending command: init
0: WARNING: Monitored managed process Lightwave0 is no longer running
Scheduler Thread - Render Thread 0 threw an error:
Scheduler Thread - An error occurred in StartJob(): Monitored managed process “Lightwave0” has exited or been terminated.
=======================================================
Error Type
at Deadline.Plugins.ScriptPlugin.StartJob(Job job)
at Deadline.Plugins.Plugin.StartJob(Job job)
at Deadline.Slaves.SlaveRenderThread.RenderCurrentTask()
Changing the bitness to none should only help the “64 bit FPrime render executable was not found in the semicolon separated list” error. Are you still getting errors like “FPrime render executable was not found in the semicolon separated list” after setting this value to none?
Also, I noticed that in the last log you posted here (viewtopic.php?f=11&t=5017#p20482), screamernet IS enabled. But let’s not worry about that for now.
On to the logs!
The logs for tasks that finished really quickly show this message before exiting:
0: STDOUT: Cannot create temporary config directory S:/LW9.6_x64_TEST/Configs/_Common\tempCFG_f0
So it looks like Deadline needs to watch for this message and fail the job if detected. I’ve attached an updated Lightwave plugin script that will detect this error message and fail the task so that it can be reattempted. To install, go to \your\repository\plugins\Lightwave and make a backup copy of Lightwave.py so that you have a rollback option. Then unzip the attached file to the same folder. Let us know if this at least helps the missing frame problem.
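For anyone curious what this kind of detection amounts to, here is a minimal standalone sketch. It is not the contents of the attached Lightwave.py and does not use the Deadline plugin API; `RenderFailed` and `check_stdout` are hypothetical names, and the only thing taken from the logs is the error text itself.

```python
import re

# Messages that mean the render never really started, even though the
# process exits with code 0. The pattern text comes from the logs above.
FATAL_PATTERNS = [
    re.compile(r"Cannot create temporary config directory"),
]


class RenderFailed(Exception):
    """Raised when the renderer's output contains a known fatal message."""


def check_stdout(lines):
    """Scan renderer output and raise RenderFailed on the first fatal line."""
    for line in lines:
        for pattern in FATAL_PATTERNS:
            if pattern.search(line):
                raise RenderFailed(line.strip())
```

The point of the design is that the exit code alone is not trusted: a task only counts as done if none of the known failure messages appeared in its output.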
Now just a note. It would appear that these issues you are seeing might be related to using a shared Lightwave installation. Based on this output, it would appear that WSN tries to create a temp folder in the config folder, and if you have a bunch of slaves trying to do this at the same time, it could result in some conflicts. Just to satisfy my curiosity, can you let me know how many slaves you have in your farm, and also why you’re choosing to use a shared Lightwave installation instead of installing it on each machine? I just get the sense that the larger the farm, the more issues will arise from using a shared installation.
I’ve watched a few jobs for which we’re certain that ScreamerNet was disabled, and I haven’t seen the error again. It may still occur on other jobs, but normally it doesn’t.
I’ve replaced the .py script, and we’ll now check whether any further problems occur.
We’re using a shared LightWave installation because this way we can change the config for all the LightWave instances on the network at once; there is only one config to maintain instead of multiple installations on separate machines.
We have an average of 65 machines rendering, is that too large?
Definitely not too large for Deadline, but my concern is that you have 65 machines sharing a single executable file, as well as a single config location. If any of the slaves lock anything here for any reason, the rest of your slaves will be out of luck for a brief amount of time. That would explain two of the problems you’re seeing:
Not being able to find a render executable when forcing 32/64 bit rendering (because a slave needs to parse the executable itself for this info). If that file is locked, then this check fails and the executable “can’t be found”. I should note that when Deadline parses the file, it opens it with shared read/write permissions, so Deadline itself isn’t locking the file. But if that file is locked for another reason, that will cause the bitness check to fail. Having Lightwave installed locally on each machine should alleviate this problem.
For plugin synchronization, you could create a PreLoad.py script for the Lightwave plugin that can be used to sync up plugins with a network location before a render begins. That way, your Lightwave plugins are always in sync: thinkboxsoftware.com/deadlin … -_Optional
WSN.exe not being able to create a temp config directory. If that config folder is locked, then creation of the temp folder fails, which results in those missing frames. With the new Lightwave.py file I sent you, Deadline should no longer mark these failed renders as complete, but note that your jobs will accumulate errors when this problem is detected, which results in wasted CPU cycles. Sure, it’s only 5-10 seconds per task when the problem is detected, but that can add up across hundreds of jobs. If you had a local config, I would expect this specific problem to go away, and thus improve your overall throughput.
That being said, using a shared config folder makes sense for ease of administration, and to be honest, I think this is what is recommended by Newtek for Lightwave network rendering. So that raises the question: Why is WSN.exe creating a temp folder in the config folder, rather than using the user’s or system’s local temp folder? Maybe this is a good question to bring up with the FPrime developers.
We’ve checked the renders, and we haven’t seen any frames that were reported as rendered but were missing. I guess the updated script did the job.
I’m going to check whether it would be possible for us to render locally.
We’re also going to talk to the FPrime developers about the config problem.
I’ve emailed the FPrime developers. This is their answer, which may be helpful:
Your diagnosis is completely correct. WSN does need to make a temp config, in order to trick Lightwave into auto-running WSN when Lightwave starts up.
You can make WSN use any directory you like (like a dedicated temp directory.) Copy the LW config files into your chosen directory and just point WSN at THAT config dir with the -c option. Annoying, yes, but it only needs to be done once. You’re right that it’d be nice if there was a command line option for specifying the temp directory directly though.
Attached is an updated Lightwave patch. This patch adds a new option to the Lightwave plugin configuration called “FPrime Use Local Config”. If enabled, Deadline will copy the contents of the Config folder to a local temp folder, and then pass that local folder to WSN.exe. Based on the info from the WSN developers, that should fix the temp config creation problem.
To install, go to \your\repository\plugins\Lightwave and make a backup copy of the following files so that you have a rollback option:
Lightwave.py
Lightwave.dlinit
Lightwave.param
Then unzip the attached file to the same folder. Note that this patch will reset your render executable settings in the Lightwave plugin configuration, so you may need to set them again.
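As a rough illustration of what the “FPrime Use Local Config” option does, here is a minimal sketch. It is not the code from the patch: the function names are hypothetical, copying only the top-level files (and so skipping any leftover tempCFG folders) is an assumption, and the argument list simply mirrors the “Render Argument” lines from the logs earlier in this thread.

```python
import os
import shutil
import tempfile


def make_local_config(shared_config_dir, local_root=None):
    """Copy the files of the shared config folder into a fresh local temp folder."""
    local_dir = tempfile.mkdtemp(prefix="LocalFPrimeConfig_", dir=local_root)
    for name in os.listdir(shared_config_dir):
        src = os.path.join(shared_config_dir, name)
        # Assumption: only plain files are needed; subfolders are skipped.
        if os.path.isfile(src):
            shutil.copy2(src, os.path.join(local_dir, name))
    return local_dir


def wsn_arguments(config_dir, content_dir, scene_file, start, end):
    """Build a WSN.exe argument list in the same shape as the logged command line."""
    return ["-3", "-c" + config_dir, "-d" + content_dir,
            scene_file, str(start), str(end)]
```

Passing the arguments as a list (for example to `subprocess.run`) avoids having to quote paths that contain spaces by hand. The local folder should be removed again once the render process has exited.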
I seem to have forgotten about this matter. My apologies.
“FPrime Use Local Config” is still set to true. Now other problems and errors seem to be occurring in the error logs:
0: INFO: Any stdout that matches the regular expression “(Rendering frame [0-9]+).* pass ([0-9]+)[^0-9]+([0-9]+).” will be handled as appropriate
0: INFO: Any stdout that matches the regular expression “Frame completed” will be handled as appropriate
0: INFO: Stdout Handling Enabled: True
0: INFO: Popup Handling Enabled: True
0: INFO: Using Process Tree: True
0: INFO: Hiding DOS Window: True
0: INFO: Creating New Console: False
0: INFO: Render Executable: “S:\LW9.6_x64_TEST\Programs\WSN.exe”
0: INFO: Using local Config for FPrime
Slave - Exception: Failed to update slaveInfo: The process cannot access the file because it is being used by another process.
0: INFO: Render Argument: -3 -c"C:/Users/Grid/AppData/Local/Prime Focus/Deadline/slave/jobsData/LocalFPrimeConfig_temph93F30" -d"H:/TeamTo_Plankton_Invasion_11" “C:\Users\Grid\AppData\Local\Temp\pla_render_106_102_RGB_Vs002_XX_g2fixed_0.lws” 133 144
0: INFO: Startup Directory: “S:\LW9.6_x64_TEST\Programs”
0: INFO: Process Priority: BelowNormal
0: INFO: Process is now running
0: STDOUT: WSN Layout Launcher 1.01
0: STDOUT: Using Config dir “C:/Users/Grid/AppData/Local/Prime Focus/Deadline/slave/jobsData/LocalFPrimeConfig_temph93F30\LW9-64.cfg”
0: STDOUT: Using command-line Content dir “H:/TeamTo_Plankton_Invasion_11”
0: STDOUT: Scene file output file prefix found: I:/TeamTo_Plankton_Invasion_11/Renders_3D/106/106-102/Beauty/pla_render_106_102_BEAUTY_rgb_
0: STDOUT: Cannot create temporary config directory C:/Users/Grid/AppData/Local/Prime Focus/Deadline/slave/jobsData/LocalFPrimeConfig_temph93F30\tempCFG_a8
Scheduler Thread - Render Thread 0 threw an error:
Scheduler Thread - Exception during render: An error occurred in RenderTasks(): Cannot create temporary config directory C:/Users/Grid/AppData/Local/Prime Focus/Deadline/slave/jobsData/LocalFPrimeConfig_temph93F30\tempCFG_a8
at Deadline.Plugins.ScriptPlugin.RenderTasks(Int32 startFrame, Int32 endFrame, String& outMessage)
=======================================================
Error Type
at Deadline.Plugins.Plugin.RenderTask(Int32 startFrame, Int32 endFrame)
at Deadline.Slaves.SlaveRenderThread.RenderCurrentTask()
Take a look at this line:
An error occurred in RenderTasks(): Cannot create temporary config directory C:/Users/Grid/AppData/Local/Prime Focus/Deadline/slave/jobsData/LocalFPrimeConfig_temph93F30\tempCFG_a8
We also have it on other slaves, for c:\documents and settings… and for c:/Users/Administrator.
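To narrow this down, it might help to test on an affected slave whether a subfolder can be created in the local config folder at all, outside of WSN. The following is only a diagnostic sketch with a made-up function name; it says nothing about why WSN itself fails.

```python
import os
import uuid


def can_create_subdir(parent_dir):
    """Try to create and remove a uniquely named subfolder.

    Returns (True, "") on success, or (False, error text) if the
    operating system refuses to create the folder.
    """
    probe = os.path.join(parent_dir, "probe_" + uuid.uuid4().hex)
    try:
        os.mkdir(probe)
    except OSError as exc:
        return False, str(exc)
    os.rmdir(probe)
    return True, ""
```

Running this as the same user account as the slave, against the folder named in the error message, would show whether the folder exists and is writable at that moment.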