This thread is quite technical. So for those who just need 2 different jobs (job_A, job_B) to render at the same time on a dual-GPU machine, with one GPU per job, here is how I do it, and it works:
- I have my machine foo, which is my only worker
- from foo I create 2 worker instances, foo_0 and foo_1
- in the worker instances' properties I then set the GPU affinity: GPU_0 only for instance foo_0, and GPU_1 only for instance foo_1
- I create a pool called instances that contains only foo_0 and foo_1
In the Houdini submitter, for my 2 jobs (job_A and job_B), I use the following setup:
- Pool: I select the pool instances I created previously
- Concurrent tasks: 1 (default)
- Frames per task: 9999999
- GPUs per task: 0 (default)
With this setup it works: I have one job per GPU, at full power for each job. All the other tests I've done ended with bad GPU dispatching, with jobs trying to use the same GPU.
I'm just posting my result here because the thread is quite hard to follow for non-technical people.
Cheers
As extra info, here is my test describing the problem:
- I have only one machine, named foo, so 1 worker with 2 x 3090.
- I create 2 worker instances, foo_0 and foo_1, and put them in a pool called instances.
- If I submit 2 jobs (job_A and job_B) with these options:
test_1
test_2
In both cases it doesn't work: Deadline can't dispatch the jobs properly, and they both try to use the same GPU. For test_1 it's logical, but I don't understand why test_2 isn't working. It should theoretically work.
In both scenarios GPU_0 is at full load at 1850 MHz, while GPU_1 is irregular, always oscillating between long periods at 400 MHz and short periods at 1850 MHz.
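For anyone who wants to watch the same thing on their own machine, here is a minimal sketch that polls nvidia-smi for the clock and load of each GPU. The query fields (index, clocks.sm, utilization.gpu) are standard nvidia-smi properties; the helper names parse_gpu_rows and read_gpus are mine, just for illustration.

```python
import subprocess

# nvidia-smi query: one CSV row per GPU, no header, no units.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,clocks.sm,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_gpu_rows(text):
    """Turn rows like '0, 1850, 100' into a list of dicts."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        index, clock_mhz, util_pct = [field.strip() for field in line.split(",")]
        rows.append(
            {
                "index": int(index),
                "clock_mhz": int(clock_mhz),
                "util_pct": int(util_pct),
            }
        )
    return rows


def read_gpus():
    """Run nvidia-smi once and return the parsed rows (needs an NVIDIA driver)."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_gpu_rows(out)


# Example use while both jobs are rendering:
#     for gpu in read_gpus():
#         print(gpu["index"], gpu["clock_mhz"], "MHz", gpu["util_pct"], "%")
```

If both jobs land on the same device, one GPU should stay near its full clock while the other idles most of the time.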
I think the source of the problem is Redshift's preferences.xml, which overrides everything.
C:\ProgramData\redshift\preferences.xml
This line:
preference name="SelectedComputeDevices" type="string" value="0:NVIDIA GeForce RTX 3090,"
Deadline is overridden by this. It can't think and say "oh, this GPU is busy, let's use the other one", which would be more clever. It just brute-forces GPU_0 because of this file.
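To check what a given machine has stored, a small sketch like this can read the value back. It only assumes what the line above shows: preference elements with name and value attributes, and a value made of comma-separated "ordinal:name" entries. The surrounding root element in the sample (prefs) is a placeholder, not the real file layout, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def selected_compute_devices(xml_text):
    """Return the raw SelectedComputeDevices value, or None if absent."""
    root = ET.fromstring(xml_text)
    for pref in root.iter("preference"):
        if pref.get("name") == "SelectedComputeDevices":
            return pref.get("value")
    return None


def parse_devices(value):
    """Split '0:NVIDIA GeForce RTX 3090,' into [(0, 'NVIDIA GeForce RTX 3090')]."""
    devices = []
    for entry in value.split(","):
        if not entry.strip():
            continue  # the stored value ends with a trailing comma
        ordinal, _, name = entry.partition(":")
        devices.append((int(ordinal), name.strip()))
    return devices


# Sample document; <prefs> is a placeholder root element (assumption).
SAMPLE = (
    "<prefs>"
    '<preference name="SelectedComputeDevices" type="string" '
    'value="0:NVIDIA GeForce RTX 3090," />'
    "</prefs>"
)

# For the real file, read C:\ProgramData\redshift\preferences.xml as text
# and pass its contents to selected_compute_devices().
print(parse_devices(selected_compute_devices(SAMPLE)))
```

If only ordinal 0 is listed, that matches what I see: every render goes to GPU_0 unless something else overrides it.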
So the only 2 solutions I have are either:
- set the GPU affinity on the worker instances, with foo_0 set to GPU_0 and foo_1 set to GPU_1
- use a Python pre-render script in each Redshift ROP, before submitting to Deadline, to specify which GPU to use:
hou.hscript("setenv REDSHIFT_GPUDEVICES=0; varchange REDSHIFT_GPUDEVICES") # use GPU 0
hou.hscript("setenv REDSHIFT_GPUDEVICES=1; varchange REDSHIFT_GPUDEVICES") # use GPU 1
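The two lines above can be wrapped in a small helper so the same pre-render script works for both ROPs. This is only a sketch: gpu_setenv_command and select_gpu are names I made up, and the hou import is guarded so the file can also be loaded outside Houdini, where it just returns the command without running it.

```python
try:
    import hou  # only available inside Houdini / hython
except ImportError:
    hou = None


def gpu_setenv_command(gpu_index):
    """Build the hscript command that pins Redshift to one GPU ordinal."""
    if not isinstance(gpu_index, int) or gpu_index < 0:
        raise ValueError("gpu_index must be a non-negative integer")
    return "setenv REDSHIFT_GPUDEVICES={0}; varchange REDSHIFT_GPUDEVICES".format(
        gpu_index
    )


def select_gpu(gpu_index):
    """Run the command in Houdini if possible; always return it."""
    command = gpu_setenv_command(gpu_index)
    if hou is not None:
        hou.hscript(command)
    return command
```

In the Redshift ROP of job_A the pre-render script would call select_gpu(0), and in the ROP of job_B select_gpu(1).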
But either way we are forced to brute-force the GPU assignment, because it can't be done cleverly.
Still, I'm not sure it's a Deadline issue?