RyanIG
October 6, 2016, 2:54pm
1
Hey,
I’m having some issues rendering on a fresh Linux system. The system has 8 Nvidia 980 Tis in it, and I’ve set up GPU Affinity so that each frame is allocated two GPUs, so I have 4 slave instances running at a time.
Ex:
GPU 1 - 0,1
GPU 2 - 2,3
GPU 3 - 4,5
GPU 4 - 6,7
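For reference, the layout above is just the 8 GPU ordinals split into consecutive pairs, one pair per slave instance. A minimal sketch of that split, assuming zero-based ordinals; `partition_gpus` is a hypothetical helper for illustration, not part of Deadline:

```python
def partition_gpus(gpu_count, gpus_per_instance):
    """Split ordinals 0..gpu_count-1 into consecutive groups, one per slave instance."""
    if gpus_per_instance <= 0 or gpu_count % gpus_per_instance != 0:
        raise ValueError("gpu_count must be a positive multiple of gpus_per_instance")
    return [
        list(range(start, start + gpus_per_instance))
        for start in range(0, gpu_count, gpus_per_instance)
    ]


if __name__ == "__main__":
    # 8 GPUs, 2 per instance, as in the setup described in this post.
    for index, group in enumerate(partition_gpus(8, 2), start=1):
        print(f"Instance {index} - {','.join(str(g) for g in group)}")
```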
I’m used to looking at Deadline slave logs on Windows, not so much on Linux, but it appears as if the slaves are using all 8 GPUs across 4 frames concurrently.
This seems to be causing slaves to hang, and it also crashes MayaBatch occasionally. I’m not sure if this log is related to the issues, but using GPU affinity on Windows did resolve many errors for us before.
Screenshot of GPU Affinity / Slave instance Names.
http://i67.tinypic.com/2i1yr8k.png
Here’s the log of a slave; I can attach the full log if needed.
2016-10-06 15:22:57: 0: STDOUT: mel: mel: [Redshift] Cache path: /root/redshift/cache
2016-10-06 15:22:57: 0: STDOUT: [Redshift] Redshift Initialized
2016-10-06 15:22:57: 0: STDOUT: [Redshift] Linux Platform
2016-10-06 15:22:57: 0: STDOUT: [Redshift] Release Build
2016-10-06 15:22:57: 0: STDOUT: [Redshift] Number of CPU HW threads: 24
2016-10-06 15:22:57: 0: STDOUT: [Redshift] Total system memory: 78.59 GB
2016-10-06 15:22:57: 0: STDOUT: [Redshift] Creating CUDA contexts
2016-10-06 15:22:57: 0: STDOUT: [Redshift] CUDA init ok
2016-10-06 15:22:57: 0: STDOUT: [Redshift] Ordinals: { 0 1 2 3 4 5 6 7 }
2016-10-06 15:23:04: 0: STDOUT: [Redshift] Initializing GPUComputing module (CUDA). Ordinal 0
2016-10-06 15:23:04: 0: STDOUT: [Redshift] CUDA Ver: 8000
2016-10-06 15:23:04: 0: STDOUT: [Redshift] Device 1/8 : GeForce GTX 980 Ti
2016-10-06 15:23:04: 0: STDOUT: [Redshift] Compute capability: 5.2
2016-10-06 15:23:04: 0: STDOUT: [Redshift] Num multiprocessors: 22
2016-10-06 15:23:04: 0: STDOUT: [Redshift] PCI busID: 4, deviceID: 0, domainID: 0
2016-10-06 15:23:04: 0: STDOUT: [Redshift] Theoretical memory bandwidth: 336.480011 GB/Sec
2016-10-06 15:23:04: 0: STDOUT: [Redshift] Measured PCIe bandwidth (pinned CPU->GPU): 3.734444 GB/s
2016-10-06 15:23:04: 0: STDOUT: [Redshift] Measured PCIe bandwidth (pinned GPU->CPU): 3.164830 GB/s
2016-10-06 15:23:04: 0: STDOUT: [Redshift] Measured PCIe bandwidth (paged CPU->GPU): 3.587930 GB/s
2016-10-06 15:23:04: 0: STDOUT: [Redshift] Measured PCIe bandwidth (paged GPU->CPU): 5.033083 GB/s
2016-10-06 15:23:04: 0: STDOUT: [Redshift] Estimated GPU->CPU latency (0): 0.013842 ms
2016-10-06 15:23:04: 0: STDOUT: [Redshift] Estimated GPU->CPU latency (1): 0.013315 ms
2016-10-06 15:23:04: 0: STDOUT: [Redshift] Estimated GPU->CPU latency (2): 0.008581 ms
2016-10-06 15:23:04: 0: STDOUT: [Redshift] Estimated GPU->CPU latency (3): 0.008449 ms
2016-10-06 15:23:04: 0: STDOUT: [Redshift] New CUDA context created
2016-10-06 15:23:04: 0: STDOUT: [Redshift] Available memory: 5699.3125 MB out of 6077.3750 MB
2016-10-06 15:23:04: 0: STDOUT: [Redshift] Initializing GPUComputing module (CUDA). Ordinal 1
2016-10-06 15:23:04: 0: STDOUT: [Redshift] CUDA Ver: 8000
2016-10-06 15:23:04: 0: STDOUT: [Redshift] Device 2/8 : GeForce GTX 980 Ti
2016-10-06 15:23:04: 0: STDOUT: [Redshift] Compute capability: 5.2
2016-10-06 15:23:04: 0: STDOUT: [Redshift] Num multiprocessors: 22
2016-10-06 15:23:04: 0: STDOUT: [Redshift] PCI busID: 5, deviceID: 0, domainID: 0
2016-10-06 15:23:04: 0: STDOUT: [Redshift] Theoretical memory bandwidth: 336.480011 GB/Sec
2016-10-06 15:23:04: 0: STDOUT: [Redshift] Measured PCIe bandwidth (pinned CPU->GPU): 5.265391 GB/s
2016-10-06 15:23:04: 0: STDOUT: [Redshift] Measured PCIe bandwidth (pinned GPU->CPU): 3.529710 GB/s
2016-10-06 15:23:04: 0: STDOUT: [Redshift] Measured PCIe bandwidth (paged CPU->GPU): 3.286990 GB/s
2016-10-06 15:23:04: 0: STDOUT: [Redshift] Measured PCIe bandwidth (paged GPU->CPU): 5.071782 GB/s
2016-10-06 15:23:04: 0: STDOUT: [Redshift] Estimated GPU->CPU latency (0): 0.014596 ms
2016-10-06 15:23:04: 0: STDOUT: [Redshift] Estimated GPU->CPU latency (1): 0.014719 ms
2016-10-06 15:23:04: 0: STDOUT: [Redshift] Estimated GPU->CPU latency (2): 0.016304 ms
2016-10-06 15:23:04: 0: STDOUT: [Redshift] Estimated GPU->CPU latency (3): 0.014752 ms
2016-10-06 15:23:04: 0: STDOUT: [Redshift] New CUDA context created
2016-10-06 15:23:04: 0: STDOUT: [Redshift] Available memory: 5767.4375 MB out of 6077.7500 MB
2016-10-06 15:23:05: 0: STDOUT: [Redshift] Initializing GPUComputing module (CUDA). Ordinal 2
2016-10-06 15:23:05: 0: STDOUT: [Redshift] CUDA Ver: 8000
2016-10-06 15:23:05: 0: STDOUT: [Redshift] Device 3/8 : GeForce GTX 980 Ti
2016-10-06 15:23:05: 0: STDOUT: [Redshift] Compute capability: 5.2
2016-10-06 15:23:05: 0: STDOUT: [Redshift] Num multiprocessors: 22
2016-10-06 15:23:05: 0: STDOUT: [Redshift] PCI busID: 8, deviceID: 0, domainID: 0
2016-10-06 15:23:05: 0: STDOUT: [Redshift] Theoretical memory bandwidth: 336.480011 GB/Sec
2016-10-06 15:23:05: 0: STDOUT: [Redshift] Measured PCIe bandwidth (pinned CPU->GPU): 10.262206 GB/s
2016-10-06 15:23:05: 0: STDOUT: [Redshift] Measured PCIe bandwidth (pinned GPU->CPU): 7.233344 GB/s
2016-10-06 15:23:05: 0: STDOUT: [Redshift] Measured PCIe bandwidth (paged CPU->GPU): 2.917630 GB/s
2016-10-06 15:23:05: 0: STDOUT: [Redshift] Measured PCIe bandwidth (paged GPU->CPU): 1.262746 GB/s
2016-10-06 15:23:05: 0: STDOUT: [Redshift] Estimated GPU->CPU latency (0): 0.030813 ms
2016-10-06 15:23:05: 0: STDOUT: [Redshift] Estimated GPU->CPU latency (1): 0.018031 ms
2016-10-06 15:23:05: 0: STDOUT: [Redshift] Estimated GPU->CPU latency (2): 0.017355 ms
2016-10-06 15:23:05: 0: STDOUT: [Redshift] Estimated GPU->CPU latency (3): 0.014718 ms
2016-10-06 15:23:05: 0: STDOUT: [Redshift] New CUDA context created
2016-10-06 15:23:05: 0: STDOUT: [Redshift] Available memory: 5767.4375 MB out of 6077.7500 MB
2016-10-06 15:23:05: 0: STDOUT: [Redshift] Initializing GPUComputing module (CUDA). Ordinal 3
2016-10-06 15:23:05: 0: STDOUT: [Redshift] CUDA Ver: 8000
2016-10-06 15:23:05: 0: STDOUT: [Redshift] Device 4/8 : GeForce GTX 980 Ti
2016-10-06 15:23:05: 0: STDOUT: [Redshift] Compute capability: 5.2
2016-10-06 15:23:05: 0: STDOUT: [Redshift] Num multiprocessors: 22
2016-10-06 15:23:05: 0: STDOUT: [Redshift] PCI busID: 9, deviceID: 0, domainID: 0
2016-10-06 15:23:05: 0: STDOUT: [Redshift] Theoretical memory bandwidth: 336.480011 GB/Sec
2016-10-06 15:23:05: 0: STDOUT: [Redshift] Measured PCIe bandwidth (pinned CPU->GPU): 4.354003 GB/s
2016-10-06 15:23:05: 0: STDOUT: [Redshift] Measured PCIe bandwidth (pinned GPU->CPU): 3.055894 GB/s
2016-10-06 15:23:06: 0: STDOUT: [Redshift] Measured PCIe bandwidth (paged CPU->GPU): 1.430131 GB/s
2016-10-06 15:23:06: 0: STDOUT: [Redshift] Measured PCIe bandwidth (paged GPU->CPU): 1.638408 GB/s
2016-10-06 15:23:06: 0: STDOUT: [Redshift] Estimated GPU->CPU latency (0): 0.023706 ms
2016-10-06 15:23:06: 0: STDOUT: [Redshift] Estimated GPU->CPU latency (1): 0.024728 ms
2016-10-06 15:23:06: 0: STDOUT: [Redshift] Estimated GPU->CPU latency (2): 0.026575 ms
2016-10-06 15:23:06: 0: STDOUT: [Redshift] Estimated GPU->CPU latency (3): 0.025886 ms
2016-10-06 15:23:06: 0: STDOUT: [Redshift] New CUDA context created
2016-10-06 15:23:06: 0: STDOUT: [Redshift] Available memory: 5767.4375 MB out of 6077.7500 MB
2016-10-06 15:23:06: 0: STDOUT: [Redshift] Initializing GPUComputing module (CUDA). Ordinal 4
2016-10-06 15:23:06: 0: STDOUT: [Redshift] CUDA Ver: 8000
2016-10-06 15:23:06: 0: STDOUT: [Redshift] Device 5/8 : GeForce GTX 980 Ti
2016-10-06 15:23:06: 0: STDOUT: [Redshift] Compute capability: 5.2
2016-10-06 15:23:06: 0: STDOUT: [Redshift] Num multiprocessors: 22
2016-10-06 15:23:06: 0: STDOUT: [Redshift] PCI busID: 131, deviceID: 0, domainID: 0
2016-10-06 15:23:06: 0: STDOUT: [Redshift] Theoretical memory bandwidth: 336.480011 GB/Sec
2016-10-06 15:23:06: 0: STDOUT: [Redshift] Measured PCIe bandwidth (pinned CPU->GPU): 10.882689 GB/s
2016-10-06 15:23:06: 0: STDOUT: [Redshift] Measured PCIe bandwidth (pinned GPU->CPU): 6.997245 GB/s
2016-10-06 15:23:06: 0: STDOUT: [Redshift] Measured PCIe bandwidth (paged CPU->GPU): 1.953577 GB/s
2016-10-06 15:23:06: 0: STDOUT: [Redshift] Measured PCIe bandwidth (paged GPU->CPU): 2.272115 GB/s
2016-10-06 15:23:06: 0: STDOUT: [Redshift] Estimated GPU->CPU latency (0): 0.037778 ms
2016-10-06 15:23:06: 0: STDOUT: [Redshift] Estimated GPU->CPU latency (1): 0.030086 ms
2016-10-06 15:23:06: 0: STDOUT: [Redshift] Estimated GPU->CPU latency (2): 0.019099 ms
2016-10-06 15:23:06: 0: STDOUT: [Redshift] Estimated GPU->CPU latency (3): 0.012297 ms
2016-10-06 15:23:06: 0: STDOUT: [Redshift] New CUDA context created
2016-10-06 15:23:06: 0: STDOUT: [Redshift] Available memory: 5767.4375 MB out of 6077.7500 MB
2016-10-06 15:23:06: 0: STDOUT: [Redshift] Initializing GPUComputing module (CUDA). Ordinal 5
2016-10-06 15:23:06: 0: STDOUT: [Redshift] CUDA Ver: 8000
2016-10-06 15:23:06: 0: STDOUT: [Redshift] Device 6/8 : GeForce GTX 980 Ti
2016-10-06 15:23:06: 0: STDOUT: [Redshift] Compute capability: 5.2
2016-10-06 15:23:06: 0: STDOUT: [Redshift] Num multiprocessors: 22
2016-10-06 15:23:06: 0: STDOUT: [Redshift] PCI busID: 132, deviceID: 0, domainID: 0
2016-10-06 15:23:06: 0: STDOUT: [Redshift] Theoretical memory bandwidth: 336.480011 GB/Sec
2016-10-06 15:23:07: 0: STDOUT: [Redshift] Measured PCIe bandwidth (pinned CPU->GPU): 11.277424 GB/s
2016-10-06 15:23:07: 0: STDOUT: [Redshift] Measured PCIe bandwidth (pinned GPU->CPU): 6.180534 GB/s
2016-10-06 15:23:07: 0: STDOUT: [Redshift] Measured PCIe bandwidth (paged CPU->GPU): 1.533541 GB/s
2016-10-06 15:23:07: 0: STDOUT: [Redshift] Measured PCIe bandwidth (paged GPU->CPU): 4.110238 GB/s
2016-10-06 15:23:07: 0: STDOUT: [Redshift] Estimated GPU->CPU latency (0): 0.015370 ms
2016-10-06 15:23:07: 0: STDOUT: [Redshift] Estimated GPU->CPU latency (1): 0.015822 ms
2016-10-06 15:23:07: 0: STDOUT: [Redshift] Estimated GPU->CPU latency (2): 0.017610 ms
2016-10-06 15:23:07: 0: STDOUT: [Redshift] Estimated GPU->CPU latency (3): 0.013428 ms
2016-10-06 15:23:07: 0: STDOUT: [Redshift] New CUDA context created
2016-10-06 15:23:07: 0: STDOUT: [Redshift] Available memory: 5767.4375 MB out of 6077.7500 MB
2016-10-06 15:23:07: 0: STDOUT: [Redshift] Initializing GPUComputing module (CUDA). Ordinal 6
2016-10-06 15:23:07: 0: STDOUT: [Redshift] CUDA Ver: 8000
2016-10-06 15:23:07: 0: STDOUT: [Redshift] Device 7/8 : GeForce GTX 980 Ti
2016-10-06 15:23:07: 0: STDOUT: [Redshift] Compute capability: 5.2
2016-10-06 15:23:07: 0: STDOUT: [Redshift] Num multiprocessors: 22
2016-10-06 15:23:07: 0: STDOUT: [Redshift] PCI busID: 135, deviceID: 0, domainID: 0
2016-10-06 15:23:07: 0: STDOUT: [Redshift] Theoretical memory bandwidth: 336.480011 GB/Sec
2016-10-06 15:23:07: 0: STDOUT: [Redshift] Measured PCIe bandwidth (pinned CPU->GPU): 10.746902 GB/s
2016-10-06 15:23:07: 0: STDOUT: [Redshift] Measured PCIe bandwidth (pinned GPU->CPU): 7.103100 GB/s
2016-10-06 15:23:07: 0: STDOUT: [Redshift] Measured PCIe bandwidth (paged CPU->GPU): 3.847854 GB/s
2016-10-06 15:23:07: 0: STDOUT: [Redshift] Measured PCIe bandwidth (paged GPU->CPU): 1.555237 GB/s
2016-10-06 15:23:07: 0: STDOUT: [Redshift] Estimated GPU->CPU latency (0): 0.134678 ms
2016-10-06 15:23:08: 0: STDOUT: [Redshift] Estimated GPU->CPU latency (1): 0.046141 ms
2016-10-06 15:23:08: 0: STDOUT: [Redshift] Estimated GPU->CPU latency (2): 0.012746 ms
2016-10-06 15:23:08: 0: STDOUT: [Redshift] Estimated GPU->CPU latency (3): 0.013795 ms
2016-10-06 15:23:08: 0: STDOUT: [Redshift] New CUDA context created
2016-10-06 15:23:08: 0: STDOUT: [Redshift] Available memory: 5767.4375 MB out of 6077.7500 MB
2016-10-06 15:23:08: 0: STDOUT: [Redshift] Initializing GPUComputing module (CUDA). Ordinal 7
2016-10-06 15:23:08: 0: STDOUT: [Redshift] CUDA Ver: 8000
2016-10-06 15:23:08: 0: STDOUT: [Redshift] Device 8/8 : GeForce GTX 980 Ti
2016-10-06 15:23:08: 0: STDOUT: [Redshift] Compute capability: 5.2
2016-10-06 15:23:08: 0: STDOUT: [Redshift] Num multiprocessors: 22
2016-10-06 15:23:08: 0: STDOUT: [Redshift] PCI busID: 136, deviceID: 0, domainID: 0
2016-10-06 15:23:08: 0: STDOUT: [Redshift] Theoretical memory bandwidth: 336.480011 GB/Sec
2016-10-06 15:23:08: 0: STDOUT: [Redshift] Measured PCIe bandwidth (pinned CPU->GPU): 9.768453 GB/s
2016-10-06 15:23:08: 0: STDOUT: [Redshift] Measured PCIe bandwidth (pinned GPU->CPU): 4.299793 GB/s
2016-10-06 15:23:08: 0: STDOUT: [Redshift] Measured PCIe bandwidth (paged CPU->GPU): 1.596275 GB/s
2016-10-06 15:23:08: 0: STDOUT: [Redshift] Measured PCIe bandwidth (paged GPU->CPU): 1.582923 GB/s
2016-10-06 15:23:08: 0: STDOUT: [Redshift] Estimated GPU->CPU latency (0): 0.036990 ms
2016-10-06 15:23:08: 0: STDOUT: [Redshift] Estimated GPU->CPU latency (1): 0.010945 ms
2016-10-06 15:23:08: 0: STDOUT: [Redshift] Estimated GPU->CPU latency (2): 0.012484 ms
2016-10-06 15:23:08: 0: STDOUT: [Redshift] Estimated GPU->CPU latency (3): 0.013389 ms
2016-10-06 15:23:08: 0: STDOUT: [Redshift] New CUDA context created
2016-10-06 15:23:08: 0: STDOUT: [Redshift] Available memory: 5871.9375 MB out of 6077.7500 MB
2016-10-06 15:23:13: 0: STDOUT: [Redshift] Loading Redshift procedural extensions…
2016-10-06 15:23:13: 0: STDOUT: [Redshift] Done!
2016-10-06 15:23:13: 0: STDOUT: [Redshift] Redshift for Maya 2016
2016-10-06 15:23:13: 0: STDOUT: [Redshift] Version 2.0.50, Jul 12 2016
2016-10-06 15:23:13: 0: STDOUT: [Redshift] renderable camera = |persp
2016-10-06 15:23:13: 0: STDOUT: [Redshift] Rendering frame 2 (1/1)
2016-10-06 15:23:13: 0: STDOUT: [Redshift] Maya evaluation manager mode: parallel
So my question is this: is the above log normal, or is GPU Affinity in fact not working, with all 8 GPUs rendering concurrently across the 4 frames?
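One way to check this from a log is to compare the ordinals Redshift reports against the pair assigned to that slave instance. The sketch below assumes the log lines look like the excerpt above (an `Ordinals: { ... }` line and one `Initializing GPUComputing module (CUDA). Ordinal N` line per device); `gpus_in_log` and `affinity_respected` are hypothetical helpers, not part of Deadline or Redshift. For the log above, the reported ordinals are 0 through 7, so a check against an expected pair such as 0,1 would fail, which would be consistent with affinity not being applied to this process.

```python
import re

# Patterns based on the Redshift lines shown in the log excerpt above.
ORDINALS_RE = re.compile(r"\[Redshift\] Ordinals: \{([\d\s]*)\}")
INIT_RE = re.compile(
    r"\[Redshift\] Initializing GPUComputing module \(CUDA\)\. Ordinal (\d+)"
)


def gpus_in_log(log_text):
    """Return (declared, initialized) GPU ordinals found in a slave log."""
    match = ORDINALS_RE.search(log_text)
    declared = [int(x) for x in match.group(1).split()] if match else []
    initialized = [int(x) for x in INIT_RE.findall(log_text)]
    return declared, initialized


def affinity_respected(log_text, expected):
    """True only if the log mentions at least one GPU and none outside `expected`."""
    declared, initialized = gpus_in_log(log_text)
    used = set(declared) | set(initialized)
    return bool(used) and used <= set(expected)
```

Usage would be along the lines of `affinity_respected(open(path).read(), [0, 1])` for the first instance, where `path` is wherever that slave's log is stored.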
Any help is appreciated
Cheers
Ryan
Hey,
What exact version of Deadline are you using? We fixed a GPU override bug for Maya in Deadline 8.0.1.0:
docs.thinkboxsoftware.com/produc … .html#id64
Mike
RyanIG
October 6, 2016, 3:02pm
3
Hey Mike,
I’m currently using 8.1.4.6. I updated yesterday due to some issues I was having with the submission plugin in Maya.
http://i66.tinypic.com/2cdzpg9.png
Oh, beta version. Can we continue this conversation on the beta forum?
forums.thinkboxsoftware.com/viewforum.php?f=211
In your log you should have this line printed out, which will verify which GPU slots should be in use:
RyanIG
October 6, 2016, 3:23pm
5
Certainly Mike,
I’ve created a thread over there and have added my reply to the end of it.
Cheers
Ryan