Stellar performance & licensing issues :)

Stoke is pretty cool, making use of ALL the resources available on my workstation.
It's so good that it even kills our network license server, and other workstations are now getting licensing issues.
One can argue that a license server shouldn't also be rendering, but I believe plenty of other users have it configured that way too.

In that case, maybe it's worth adding some limits in the UI so that some RAM & CPU are left for the system.
Also, one thing I don't get: there is a disk cache limiter (which I doubt I'd ever need), but no RAM usage limiter.

The memory limit is actually a RAM limit; it's just positioned/named oddly.

Got it. That probably explains why I am getting out-of-memory errors.

You shouldn’t see any errors, even when running into the limit (it will just unload data to a file on disk instead). Please post the errors and a description of the context they occurred in.

We found some memory leaks on the MAXScript side, working on it…

Actually, I am unable to use Stoke for larger simulations (latest builds).

My scene is pushing 10M particles emitted on one frame through a vector noise field from Ember. The entire sim is about 500 frames.
I am on Max 2013, x64, 8-core, 32 GB RAM.

I have limited the RAM cache to 16 GB.
I am saving files to disk.

The simulation never gets through.
The latest try got 185 frames done, then a Windows system message told me that my system had run out of RAM.
No messages in Max other than one saying that an error occurred and the app would close.

The system monitor tells me 3dsmax.exe ate 127 GB of RAM.

The sad fact is that there was a memory leak in the latest build.

Basically, there was a MAXScript variable used to clone the particles during simulation, and it was never garbage collected, resulting in huge memory usage. So the limit was respected by Stoke itself, but MAXScript kept adding data on top of it and never cleaned it up.
Attached is my version, in case you are using Max 2013 64 bit. Copy the DLO to C:\Program Files\Thinkbox\Stoke MX\3dsMax2013, and the MS to C:\Program Files\Thinkbox\Stoke MX\Scripts
We will release it today anyway.

The alternative is to add a ‘gc light:true’ inside the loop (I can tell you where if you want). But that would make it a bit slower.
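
For illustration, the general shape of the leak and of the 'gc light:true' workaround would look something like this in MAXScript (the loop, variables, and the save call are all made up for the sketch, not Stoke's actual internals):

    startFrame = 0
    endFrame = 500
    simulatedParticles = #() -- stand-in for the per-frame particle data
    fn saveFrameToPRT parts f = ( format "saving frame %\n" f ) -- hypothetical stub

    for f = startFrame to endFrame do
    (
        -- a per-frame MAXScript copy of the particle data; if the reference
        -- is never released, every frame's copy stays alive and memory grows
        local clonedParticles = copy simulatedParticles
        saveFrameToPRT clonedParticles f
        -- the workaround: drop the reference and run a light garbage
        -- collection pass so each iteration's copy can be freed
        clonedParticles = undefined
        gc light:true -- the real MAXScript call; it costs a bit of speed per frame
    )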

I am afraid it's not only the latest build. I switched back to the previous one and successfully killed my beasts with that too.
There's one more thing you may want to look into: today, while doing a simple sim with a 0 - 50,000 - 0 emission rate (about 10M max particle count) and just a Wind force, I got the out-of-memory error around frame 350 of 550. 3ds Max continued to work (and display progress in the UI!) but stopped saving files to disk.

I am trying the fix. Hopefully it works. Unfortunately, I am on my fourth day of struggling to get a simple sim done. PFlow would have finished it by now. But that's part of the fun with new toys :wink:

Thanks for the heads-up. You're awesome.

Here is how it was SUPPOSED to work:

*In the original Stoke, the memory cache was unlimited, and saving to disk was done sequentially. Running out of memory was entirely possible, and the disk cache had no connection with the memory cache.
*In the 003 build, you specify a Memory Limit. This is the memory where particles are stored for interactive scrubbing (like before). It defaulted internally to 1024 MB, but I overrode it with 4096 in the script. There is a second buffer which is used to dump the particles to PRTs in the background. That buffer was supposed to be half the size of the memory cache, but it was stuck at 512 MB (bug!). When simulating, the memory cache would pass particles to the saving buffer to be written to disk. Once written to disk, the data was still associated with the memory cache, so the memory cache could drop frames to free up space for more simulated particles and later restore them while scrubbing.
*Unfortunately, there was another bug where the particles passed to the memory cache were cloned via a MAXScript variable and never cleared at all. Thus, even after resetting the cache, the memory usage would stay up, or in your case cause a failure. So we fixed that in an internal build (posted here), and we also added a graphical display that shows you the state of each frame - green means in memory, blue means on disk, red means missing, gray means empty. This did not fix the 512 MB saving buffer though - even if you have an 8 GB limit, the saving buffer would still remain 512 MB.

NOTE: Due to the 512 MB limitation, it was often the case that the simulation would finish, the write buffer would become full and stop dumping PRTs to disk. In that case, you would have to use the [X] button and select the FLUSH option to force all memory to be written out to disk. Ideally, this should not be necessary in the final version someday in the future…
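
To make that accounting concrete, here is a simplified MAXScript model of the split (hypothetical names; the real logic lives inside the plugin):

    -- Simplified model of the 003 build's memory split; the names are
    -- hypothetical, the real accounting lives inside the plugin.
    memoryLimitMB   = 4096              -- scrub cache (script override of the 1024 MB default)
    intendedWriteMB = memoryLimitMB / 2 -- the write buffer was meant to be half of that...
    actualWriteMB   = 512               -- ...but a bug kept it pinned at 512 MB

    -- frames waiting to be written beyond the write buffer stay in the
    -- memory cache until forced out with [X] > FLUSH
    fn needsManualFlush pendingMB = ( pendingMB > actualWriteMB )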

Here is how it SHOULD work:
So for the next build, we are trying to provide a single memory limit and balance the sizes of the two buffers (memory cache vs. write-pending buffer) dynamically during simulation. Once a frame is written to disk, it can not only be dropped from memory - the write buffer can also grow while the memory cache shrinks. Thus you will simply specify how much memory you want used, and the rest will happen automagically without ever exceeding that amount, as efficiently as possible.
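
A rough sketch of what that balancing could look like (purely illustrative MAXScript, not the shipping implementation):

    -- Illustrative sketch of the planned single-limit balancing;
    -- hypothetical names, not the shipping implementation.
    fn balanceBuffers totalLimitMB pendingWritesMB =
    (
        -- give the write buffer what it currently needs (up to the whole
        -- limit) and let the scrub cache use whatever is left over
        local writeBudget = amin #(pendingWritesMB, totalLimitMB)
        local scrubBudget = totalLimitMB - writeBudget
        #(scrubBudget, writeBudget) -- together they never exceed totalLimitMB
    )

For example, balanceBuffers 8192 512 would reserve 512 MB for pending writes and leave 7680 MB for scrubbing; as unwritten frames pile up, the split shifts toward the write buffer.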

It sounds a bit complicated. But the patch worked, and I was able to write out my first sim (10M particles generated on one frame, simulated over 600 frames).
The second, more challenging one (emitting 1,000 - 100,000 per frame, peaking at 15M) is processing now; files are slowly making it to disk and memory stays within the limits.
It's a bit slow (10M particles pushed by wind turbulence should be fast) - I am getting about 1 min/frame (~500 MB) times.

Things are slowly taking shape.

Now I have a serious issue, and maybe with a bit of creativity and your experience we could work around it:
viewtopic.php?f=162&t=9193

I really need a workaround because the files with the 'bad' lifespan take over 1 TB here, so reprocessing them scares me.

I will simply send you our internal Krakatoa 2.1.8 build for Max 2013 which solves the Age/Lifespan format problem neatly. No need to resim anything, and it will work with streams using both Integers and Floats.

Huge thanks. You made my day. Actually… you made my night.

Check your GMail account.

Unfortunately, with this build I am still getting dropped frames.
It went through to about frame 414/600, then I got random files saved towards the end.
There's also a mess with file dates.
Same with all three sims I did. That was 10M particles through Wind with a 24 GB RAM limit.


It does not DROP frames. It just does not save them all in the original run under certain memory conditions.

You can call the FLUSH command by clicking the [X] button and selecting the “FLUSH The Memory Cache…” option. This will save all frames that were not saved by the background thread.

Basically Stoke’s Disk Caching is NOT the same as saving PRTs. The PRTs are used as a disk cache. So they don’t have to be written to disk immediately, but they are not lost. As I explained in the docs, it can skip some frames while saving if it runs out of memory for the cache buffer, but then those “skipped” frames are held in memory instead.

I know that there was a lot to read in the release notes, but this is all explained there:
viewtopic.php?f=161&t=9225

This is probably the only plugin doing this kind of asynchronous simulation and saving, so there is some learning involved.

Alternatively, there is an option in the menu that lets you save a new PRT sequence to a different location. Basically it will use both the disk cache and the memory cache to dump all simulation data to new PRT files. But I don’t recommend that, because the saving will lock Max until all files are saved, and knowing your data sizes, it can take hours.
So next time you think frames are missing, just remember to Flush the cache :wink:

I have the same problem: after the simulation (with caching to disk), some frames are left in the memory cache. Flushing [X] to disk takes hours in my situation. Maybe there is a way to save to disk frame by frame, on the fly? Something like disabling the memory cache entirely?

You can set the memory cache to 0 - this will cause the simulation to run sequentially: simulate a frame, write a frame, sim a frame, write a frame. That way it will take hours to finish the simulation :slight_smile: The idea here was to simulate as fast as possible and shift the saving to the background so you can do other things with Max in the meantime. But if your simulation is so big that it cannot fit in memory, then it is not optimal.
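
If you would rather set this from MAXScript than from the UI, it would be something along these lines - note that the property name below is my placeholder guess, so check the actual spelling first:

    -- "memoryLimitMB" is a placeholder property name - I am guessing it;
    -- list the real properties of your Stoke object with:
    --   showProperties $Stoke001
    $Stoke001.memoryLimitMB = 0 -- 0 = no cache: simulate a frame, write a frame, repeat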

What we haven't done yet is multi-thread the actual saving. Saving a PRT file involves a single thread compressing the data, so the speed of writing is quite possibly CPU-bound. We suspect that using two or more threads to ZIP multiple frames at a time and write them to disk in parallel (assuming your HDD or SSD can keep up with the I/O) would further reduce the total time from the moment you hit Simulate to the moment all frames are on disk. It is already proven that loading benefits significantly from multi-threading.

Keep in mind that if you were to simulate using PFlow and save with Krakatoa, you would get sequential behavior and possibly longer hours to save the same type of simulation. So we are attempting to make that process shorter. As I posted on the Stoke website, emitting 10,000 particles over 100 frames to produce one million particles on frame 100 takes 160 seconds using PFlow and Krakatoa, and 7.5 seconds with Stoke. Stoke still needs around 2.5 minutes to save all PRTs to disk (due to the single-threaded PRT saving), but you are free to use Max after 8 seconds. In this particular case the Memory Cache needed 2 GB to keep all particles in memory.
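
To put those numbers side by side (a trivial worked comparison using the figures quoted above): the total wall-clock time ends up similar, but the time Max is blocked differs by an order of magnitude.

    -- Worked comparison using the figures quoted above (illustrative only)
    pflowAndKrakatoaSec    = 160.0    -- Max is locked for the whole duration
    stokeSimSec            = 7.5      -- Max becomes usable again after this
    stokeBackgroundSaveSec = 2.5 * 60 -- PRT saving keeps running in the background
    format "Stoke total: % s, but Max is free after % s\n" (stokeSimSec + stokeBackgroundSaveSec) stokeSimSec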

So the new cache scheme is great for quick iterations - you run a million or a few million particles with Every Nth frame set to 10, multiple times, and check the results after each run. If you don't like the output and hit Simulate again, the cache is cleared, the previous saving is cancelled, and the PRTs are overwritten. So instead of waiting several minutes before you can inspect the result, you wait a few seconds, look, and sim again until you like the result. Then you set the count higher, set Every Nth to 1, hit Simulate and walk away. If the particles don't fit in memory, you won't even have to flush, since the sim will run out of space to cache the data and drop into the sequential simulate/save/simulate/save mode…

We will be looking at partitioning next.

Running the simulation followed by a flush is guaranteed to be no slower than synchronously writing each file immediately as the simulation generates the data for it. You are simply moving the hours waiting on the Flush button to hours waiting on the Simulate button.

Yes, the memory cache is great for testing, but for generating the final solid (and big) PRT data, needing a second user input is a waste of time - the job sits there waiting for my next action.
Memory cache: 0 - works great, thanks!

Can't wait for ANY solution :unamused: