Using the Arnold render plugin.
In the task window, progress goes from 0% to 100% with nothing in between.
Peak RAM usage and average RAM usage are incorrect. I am seeing 1%, while sar reports between 43% and 49% usage, as does cat /proc/meminfo:
[root@render10 ~]# sar -r 5 5
Linux 2.6.32-279.22.1.el6.x86_64 (render10) 03/14/2013 x86_64 (24 CPU)
12:24:37 PM kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit
12:24:42 PM 12445824 12151192 49.40 207264 5363696 5451368 11.09
12:24:47 PM 12445740 12151276 49.40 207264 5363724 5451368 11.09
12:24:52 PM 12445760 12151256 49.40 207264 5363768 5721568 11.64
12:24:57 PM 12447256 12149760 49.40 207264 5363768 5451368 11.09
[root@render10 ~]# cat /proc/meminfo
MemTotal: 24597016 kB
MemFree: 12445988 kB
Buffers: 207264 kB
Cached: 5363464 kB
Chris
Found another…
Under Jobs I have status active (38), but there are 20 nodes each running 2 tasks, which equals 40 tasks.
Chris
What’s your Arnold verbosity level set to? I think you need a minimum of 4 to get progress.
RAM values are only for the process that the Slave is running (and its child processes), not the RAM usage of the entire system.
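The distinction matters when comparing against system tools: sar and /proc/meminfo report machine-wide usage, while the Slave only reports its render process. As a rough sketch (plain Python, not Deadline's actual code), the system-wide figure sar shows can be recomputed from the meminfo values above:

```python
def parse_kb(text, key):
    """Return the kB value for a given key in /proc/meminfo-style text."""
    for line in text.splitlines():
        if line.startswith(key + ":"):
            return int(line.split()[1])
    return None

# Values taken from the meminfo output pasted above.
meminfo = """MemTotal:       24597016 kB
MemFree:        12445988 kB
Buffers:          207264 kB
Cached:          5363464 kB"""

total = parse_kb(meminfo, "MemTotal")
free = parse_kb(meminfo, "MemFree")
print(round(100.0 * (total - free) / total, 1))  # 49.4, matching sar's %memused
```

A per-process number (what the Slave tracks) would instead come from the render process's own /proc/PID entries, which is why the two figures can legitimately disagree.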
Probably just a minor display issue. Do you see this happen a lot? When you do see it, does it ever “fix” itself for that job?
Cheers,
Arnold verbosity is set to 4.
Just looked, and all of the RAM usage fields are currently empty.
If you look under the Slaves, the memory is reported correctly, but the CPU speed is incorrect: 2 systems (nodes 20 & 3) say 2.4GHz and the other 18 say 1.6GHz, and I have checked the 1.6GHz systems — cpuinfo reports 2.4GHz.
Again, minor… but I am doing performance tuning on the render nodes to get optimal performance, and when each frame takes 12 to 19 minutes to render, you get lots of time to uncover these minor problems…
I would like to get Pulse up and running, but I cannot see any install notes for v6?
Chris
Thanks for checking the Arnold verbosity. Can you post a render log from a job? We can check the output to make sure it’s printing out progress info, and if it is, check our stdout handlers to see if they are parsing it incorrectly.
Just to confirm, are you saying those 2 systems are reporting 2.4ghz in their cpuinfo? If so, then that means Deadline is getting this information correctly.
Setup is the same as it was for v5, so you can refer to the v5 docs:
thinkboxsoftware.com/deadlin … /#Overview
Cheers,
All nodes are 2.4GHz as reported by cpuinfo.
Only 2 were reported correctly yesterday.
Today I have 6 @ 2.4GHz and 14 @ 1.6GHz.
I saw in your other post that you’re still on beta 10. Can you upgrade to beta 15 and see if you still have these problems?
Thanks!
Interesting… I thought I was on a much higher beta than that, as I thought I started with beta 11 and then upgraded to beta 13…
OK - downloading latest beta…
Is it possible you might have just updated the repository and not the clients?
Cheers,
Had problems with Shotgun (see other post), so I upgraded everything and the upgrade resolved it. Now, while setting up the RV & Shotgun integration, I have found the Deadline and Shotgun integration is broken again, back to where I was a few weeks ago…
Both the repository and clients are now running beta 15? I’m a bit confused because in the shotgun post, you said you’re on beta 10 still…
viewtopic.php?f=86&t=9103#p39614
Note that in order to get the new ssl libraries, you would need to run the client installer on all of your machines.
Nothing has really changed, and it’s a minor display problem.
The progress bar goes from 0% straight to 100% in the task window (renders take 20-30 mins each); in the Jobs window, the task progress is correct.
CPU speed is still reading either 1.6GHz or 2.4GHz in the Slave window.
Memory usage in the task window is incorrect (peak usage 1.2MB), while the Slaves show a consistent 4.6GB usage.
Are you able to post a log from an Arnold job? We’ll need that to check our progress handling.
Can you post the contents of the cpuinfo file for a machine that this is being reported wrong for?
I’ve logged this as a bug. Sounds like something is off with our memory usage gathering (at least under Linux).
Thanks!
See attached tar file.
This has screen shot, cpuinfo and arnold log as produced from the launcher task on one of the nodes.
Enjoy
deadline.tar.gz (495 KB)
Thanks. I’ve confirmed that Deadline is pulling the correct “cpu MHz” value from the cpuinfo file. I did some reading, and I’ve learned that some processors can scale up or down as necessary, which I’m pretty sure explains what you are seeing here. Apparently, you can change the CPU governor settings to avoid this:
experts-exchange.com/OS/Linu … -Tips.html
Also, thanks for the Arnold log. I’m a bit embarrassed to say this, but it turns out our Arnold standalone plugin doesn’t have progress handling built into it like I had thought. I was thinking of our Arnold for Maya support. However, from looking at this particular log, I don’t see an obvious way to pull out overall progress. Yes, there are these lines:
55% done - 578 rays/pixel
However, there are 5 different sections of this log that go from 0 to 100%, and nothing obvious to indicate how many sections there would be in advance. We could probably guesstimate the overall progress like this:
- We go through the first section, reporting progress 0-100%.
- We go through the second section, adjusting overall progress so it appears as 51-100% (the first 0-50% is covered by the first section).
- We go through the third section, adjusting overall progress so it appears as 67-100% (the first 0-66% is covered by the first two sections).
etc…
This will result in the progress jumping around a bit, but maybe that’s better than nothing?
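The remapping described above could be sketched like this (a hypothetical helper, not the actual plugin's stdout handler), keyed off the "NN% done" lines in the log:

```python
import re

# Matches section-local progress lines like "55% done - 578 rays/pixel".
PROGRESS_RE = re.compile(r"(\d+)% done")

def overall_progress(section, line):
    """Map a section-local percentage onto an overall estimate, assuming
    `section` is the 1-based index of the current 0-100% pass and the
    total number of passes is unknown in advance."""
    m = PROGRESS_RE.search(line)
    if m is None:
        return None
    local = int(m.group(1))
    # Sections 1..section-1 are complete, so they account for
    # (section-1)/section of the bar; the current section fills the rest.
    return 100.0 * (section - 1) / section + local / float(section)

print(overall_progress(2, "55% done - 578 rays/pixel"))  # 77.5
```

With this scheme, section 2 spans 50-100% overall and section 3 spans roughly 67-100%, matching the estimates above; the bar jumps backward at each section boundary, which is the trade-off being described.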
Just in case we get more info with more verbosity though, can you run another job with verbosity set to 5 and send us the log?
Thanks!
Thanks on the cpuinfo stuff.
Yes, they were ticking over at 1600MHz and not 2400MHz… Oh joy, but at least there is a fix/workaround.
Just a follow up. Any chance we can get a log from an Arnold job with verbosity set to 5?
Also, I took a look at how we collect memory usage for the rendering process. We are currently grabbing the Resident Set value, instead of the Virtual Memory value. That being said, the difference between 1.2MB and 4.6GB is pretty substantial.
Can you get an Arnold job rendering, and then while it’s rendering and using lots of memory, go to the node it’s rendering on and get the contents of /proc/PID/stat (where PID is the process ID of the Arnold process)? If you could send us the contents of that stat file, that would be great!
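In case it helps when looking at that file, here is a rough sketch (plain Python, field numbers per the proc(5) man page, not Deadline's actual code) of pulling both values out of a stat line; the sample line is entirely made up, with values chosen to mirror a 1.2MB-vs-4.6GB gap like the one reported above:

```python
def mem_from_stat(stat_text, page_size=4096):
    """Pull vsize (bytes) and rss (bytes) from /proc/PID/stat contents.
    Per proc(5), field 23 is vsize and field 24 is rss (in pages). The
    comm field (field 2) may contain spaces, so split after the last ')'
    rather than naively on whitespace."""
    rest = stat_text.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field N sits at index N - 3.
    vsize = int(rest[23 - 3])
    rss_pages = int(rest[24 - 3])
    return vsize, rss_pages * page_size

# Hypothetical stat line: fields 4-22 zeroed out, vsize ~4.6GB,
# rss 300 pages (~1.2MB).
sample = "1234 (kick) S " + "0 " * 19 + "4939212800 300"
print(mem_from_stat(sample))  # (4939212800, 1228800)
```

If the Resident Set figure in the real stat file turns out to be that small while vsize is in the gigabytes, that would point at grabbing the wrong field (or unit) rather than a parsing bug.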
Cheers,
Sorry, been stupidly busy… Will try an Arnold job with verbosity set to 5… Expect a week’s delay, as the farm currently has 1300 Arnold frames to render, and due to a Yeti bug we cannot multi-thread the tasks, so we are looking at 1 hour a frame.