I wish for the PRT loaders to work with the Include/Exclude lists in the lights. I need to optimize some of my lighting for speed, and this would help.
- Chad
This will not be possible in 1.0 - it would require significant changes in the internal design. Right now, all particles are loaded into a single container, then sorted and lit for each light. Then the same particles are sorted against the camera and rendered.
Once the particles are in that container, there is no way to know which particle came from which loader, so they cannot be included/excluded per light.
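To make the limitation concrete, here is a minimal Python sketch of the flow described above. It is an illustration only, not the renderer's actual code: `merge_loaders`, `sort_by_distance` and `render_order` are hypothetical names, particles are reduced to bare positions, and the nearest-first sort order is an assumption.

```python
import math

def merge_loaders(loaders):
    """Concatenate particle positions from every loader into one container.

    Only positions are kept, so the originating loader cannot be recovered.
    """
    return [pos for loader in loaders for pos in loader]

def sort_by_distance(particles, point):
    """Return the particles ordered by distance from a point, nearest first."""
    return sorted(particles, key=lambda p: math.dist(p, point))

def render_order(loaders, lights, camera):
    """Sort the merged particles against each light, then against the camera."""
    particles = merge_loaders(loaders)
    for light in lights:
        particles = sort_by_distance(particles, light)
        # A lighting pass would run here. It sees every particle, because
        # nothing in the container says which loader a particle came from.
    return sort_by_distance(particles, camera)
```

After `merge_loaders`, a per-light Include/Exclude list has nothing to test against, which is why supporting it would need a different container design.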
The good news: sorting of particles for lighting has been made A LOT faster. We added several sorting algorithms. Depending on what you are doing and whether you are using PCache or not, you can get a big speedup.
Will post some real world benchmarks with 50+ million particles when I have them.
For example, 2 million particles and one light used to calculate in 4.5 seconds (without the bug). The Radix Sort does the same in 2.7 seconds the first time. The second time, when you run from PCache, it finds that the particles are already sorted and needs only 0.5 seconds to sort.
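For anyone curious how a radix sort can finish early on presorted data, here is a minimal Python sketch. It assumes an LSD radix sort over 32-bit float distances with a linear "already sorted" check up front. The post does not say how the renderer implements either step, so `float_key`, `is_sorted` and `radix_sort_by_distance` are hypothetical names.

```python
import struct

def float_key(x):
    """Map a float to an unsigned 32-bit int whose integer order matches float order."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Negative floats: flip every bit. Non-negative floats: set the sign bit.
    return bits ^ 0xFFFFFFFF if bits & 0x80000000 else bits | 0x80000000

def is_sorted(keys):
    """Single linear pass: true if the keys are already in ascending order."""
    return all(a <= b for a, b in zip(keys, keys[1:]))

def radix_sort_by_distance(particles, distance):
    """Sort particles by distance(p), ascending, using 8 bits per pass."""
    keyed = [(float_key(distance(p)), p) for p in particles]
    if is_sorted([k for k, _ in keyed]):
        return [p for _, p in keyed]  # early out, no sorting passes needed
    for shift in (0, 8, 16, 24):
        buckets = [[] for _ in range(256)]
        for item in keyed:
            buckets[(item[0] >> shift) & 0xFF].append(item)
        # Reading the buckets back in order keeps each pass stable.
        keyed = [item for bucket in buckets for item in bucket]
    return [p for _, p in keyed]
```

The early-out costs one pass over the keys, while the full sort costs four bucket passes plus the key conversion, so presorted data returns much sooner.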
Cheers,
Borislav “Bobo” Petrov
Technical Director 3D VFX
Frantic Films Winnipeg
Yay. Does the speedup also occur when you have PCache and LCache, à la the camera flying around the prelit static scene?
- Chad
Yes.
When LCache is enabled, the first time you load particles they will be sorted and lit using the sort routine of your choice.
Subsequent renders from PCache+LCache should use what is in memory and thus not require sorting the particles against the light, but they would still benefit from the faster sorting of the particles against the camera.
So the most significant speedup would be when using Radix Sort or the improved threaded sort on 4 CPUs in the FIRST pass of building the caches. It would be less significant when rendering from caches with a moving camera: the particles still have to be sorted against the camera, but if the camera is moving just a little bit, the buffer would be mostly in the correct order anyway.
Radix Sort needs about 1 second to detect that 10 million particles are already sorted, while the old method would have needed 4 seconds (with 2 threads). So with 50 million particles, the sorting time should go down from 20 seconds to 5 seconds when using Radix Sort on a 64 bit machine/OS/Max with enough RAM. With 4 CPUs and FF Threaded sort, the same operation with 50 million particles should take about half the time of the old method, or 10 seconds instead of 20, and would not use additional memory.
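The 50 million figures follow from scaling the 10 million measurements linearly. Here is a small Python check of that arithmetic. Linear scaling is an assumption taken from the estimate above, not a benchmark result, and `scale_linearly` is a hypothetical helper.

```python
def scale_linearly(seconds, measured_count, target_count):
    """Extrapolate a timing to another particle count, assuming linear scaling."""
    return seconds * target_count / measured_count

MEASURED = 10_000_000  # particle count the timings were quoted for
TARGET = 50_000_000    # particle count being estimated

old_sort = scale_linearly(4.0, MEASURED, TARGET)    # old method, 2 threads
radix_sort = scale_linearly(1.0, MEASURED, TARGET)  # Radix Sort, presorted data
threaded_sort = old_sort / 2                        # FF Threaded sort on 4 CPUs, "about half"

print(f"old: {old_sort:.0f} s, radix: {radix_sort:.0f} s, threaded: {threaded_sort:.0f} s")
```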
Cheers,
Borislav “Bobo” Petrov
Technical Director 3D VFX
Frantic Films Winnipeg
(puts down current scene and moves on to testing specular shading again)
No sense bashing my head over this “Clip particles to an animated sphere after orbiting camera around static particles and lights” scene.