AWS Thinkbox Discussion Forums

Slow shadows when using Pcache

Beta 12 (not sp1 yet, will check that in a tad) / max 9 64 bit.



1 PRT loader with 10% of 50 million rendering.

1 fspot with shadow map size of 100

1 camera



First render, with Pcache off, takes 27 seconds.



Turn on Pcache and render, 26 seconds.



Render 3rd time, with now valid Pcache (182MB),

render time 37 seconds.



Bizarro! Bizarro!



At 100%, it returns 5:24 and… Well, it’s been a while, and I’m not waiting any longer. Kill max and start over.


  • Chad

Ok, at 25% of 50 million, we get…



1:20 uncached



2:55 cached



Which means the slowdown gets worse as the particle count goes up: 30% -> 120% ish. That’s probably why my 100% seemed to take forever.
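
As a quick sanity check on those percentages, the slowdown can be worked out directly from the timings quoted above (27 s vs 37 s at 10%, 1:20 vs 2:55 at 25%). The helper name `slowdown_percent` is only for illustration; with these numbers it comes out to roughly 37% and 119%.

```python
def slowdown_percent(uncached_s, cached_s):
    """Extra render time of the cached run, as a percentage of the uncached run."""
    return (cached_s - uncached_s) / uncached_s * 100.0

# Timings quoted in this thread, converted to seconds.
ten_percent = slowdown_percent(27, 37)                 # 10% of 50 million
quarter = slowdown_percent(1 * 60 + 20, 2 * 60 + 55)   # 25% of 50 million

print(f"10% of particles: {ten_percent:.0f}% slower from PCache")
print(f"25% of particles: {quarter:.0f}% slower from PCache")
```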

Thanks, Chad, I was able to repro the problem with 5M particles:



No Cache, One Spot Light 100 shadow map - 18 sec.

PCaching - 18 sec.

Rendering from PCache - 30 sec.

Rendering from PCache+LCache - 8 sec.



Something is not right when getting particles for lighting from PCache. It sits there for 10 seconds doing nothing; in that time it could load the file a couple of times…



We will investigate. If it can be fixed in a patch, we might send you a new DLL to try out.

As an aside, the fact that we’re rendering 5 million particles as a “quick test” is really humorous to me.

Logged as defect 4063.

Here is a follow up:



We investigated and figured out what is happening. It turns out that the standard library sort method takes a huge hit when the values it has to sort are already sorted (!). With 2 million particles, the sort time goes up from 4 seconds to 12+ seconds. If you move the light to a new position that causes a new sort order, the calculation is fast again. In addition, the sort is single-threaded. We had our own code that attempted multi-threaded sorting, but it had the same behavior with already-sorted particles, so the second CPU never really played a huge role in that specific case. Also, it was hard-coded to 2 CPUs instead of using all available cores.
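
To illustrate how a sort can get slower on input that is already in order, here is a minimal sketch. The post does not say which algorithm the standard library sort used, so this is an assumption: a textbook quicksort that always picks the first element as its pivot, which is one classic way to get this behavior. The function name `quicksort_comparisons` and the sizes are made up for the example; it counts comparisons instead of measuring time.

```python
import random

def quicksort_comparisons(values):
    """Sort a copy with a first-element-pivot quicksort; return (sorted_list, comparisons)."""
    data = list(values)
    comparisons = 0
    stack = [(0, len(data) - 1)]  # explicit stack avoids Python's recursion limit
    while stack:
        lo, hi = stack.pop()
        if lo >= hi:
            continue
        pivot = data[lo]  # naive pivot choice (assumed here): always the first element
        i = lo + 1        # data[lo+1:i] holds the elements smaller than the pivot
        for j in range(lo + 1, hi + 1):
            comparisons += 1
            if data[j] < pivot:
                data[i], data[j] = data[j], data[i]
                i += 1
        data[lo], data[i - 1] = data[i - 1], data[lo]  # move pivot to its final slot
        stack.append((lo, i - 2))  # elements smaller than the pivot
        stack.append((i, hi))      # elements greater than or equal to the pivot
    return data, comparisons

n = 2000
already_sorted = list(range(n))
shuffled = already_sorted[:]
random.Random(0).shuffle(shuffled)  # fixed seed so the run is repeatable

sorted_out, sorted_cost = quicksort_comparisons(already_sorted)
shuffled_out, shuffled_cost = quicksort_comparisons(shuffled)
print("already sorted:", sorted_cost, "comparisons")
print("shuffled:      ", shuffled_cost, "comparisons")
```

On already-sorted input every partition is maximally lopsided, so the comparison count grows quadratically with the number of values, while a shuffled order stays close to n log n. That matches the observation that moving the light, and therefore changing the sort order, makes the calculation fast again.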



The obvious solution to this problem is to implement a better sorting algorithm that does not have this undesired behavior. Our hope is that such an algorithm could also be faster in the general case, making Krakatoa even faster in the lighting phase. If we can implement the new sorting to better support multiple threads, then using a Quad system would allow for even higher sorting speed.
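
One possible shape for such a sort, sketched under assumptions and not taken from Krakatoa's code: split the data into one chunk per available core (rather than a hard-coded 2), sort the chunks concurrently, then merge the sorted chunks. The name `chunked_sort` is hypothetical. Python's built-in `sorted` is Timsort, which handles already-sorted input in linear time, so the degenerate case above does not arise here. Note that in CPython threads will not give a real speedup for this kind of work; the sketch only shows the structure.

```python
import heapq
import os
from concurrent.futures import ThreadPoolExecutor

def chunked_sort(values, workers=None):
    """Sort chunks concurrently, then k-way merge the sorted chunks."""
    # Size the pool from the machine instead of hard-coding 2 workers.
    workers = workers or os.cpu_count() or 1
    data = list(values)
    if not data:
        return []
    chunk_size = max(1, -(-len(data) // workers))  # ceiling division
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sorted_chunks = list(pool.map(sorted, chunks))
    # heapq.merge lazily merges any number of already-sorted sequences.
    return list(heapq.merge(*sorted_chunks))

# Stand-in values; in the renderer these would be per-particle sort keys.
print(chunked_sort([5.0, 1.5, 3.2, 0.7, 2.8, 4.1]))
```

The merge step is sequential in this sketch, so the gain from extra cores is limited to the chunk-sorting phase.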



I will keep you updated as we find out more.



Cheers,



Borislav “Bobo” Petrov

Technical Director 3D VFX

Frantic Films Winnipeg
