-
Notifications
You must be signed in to change notification settings - Fork 35
Description
While thinking about how to modify the particle system to better support star-particle creation, I had some thoughts on how we could potentially improve the particle system in general.
These are mostly thoughts for an "ideal world." I'm not really suggesting that anybody should spend time doing this.
-
support different particle "types": so that we could support star particles and sink particles. (Ideally with the option for them to have different particle attributes -- but that's obviously far less important)
-
rather than organizing particles in a giant array, it would be worth reorganizing them into "batches" (or "chunks")
-
For some datatype
T
, rather than storing the values in 1 giant array, we would instead store the values in a series of chunks. -
For the sake of concreteness, a naive implementation of this might look like:
template<typename T> struct Batch{ int size; T values[CHUNK_CAPACITY]; }; template<typename T> struct ChunkedStorage{ int num_batches; Batch<T>* batches; };
In the above snippet,
CHUNK_CAPACITY
is some compile-time constant (e.g. 8, 16, 32, etc.).The control-flow would change from
for (int i =0; i < num_particles; i++) { // do work }
to
for (int batch_ind =0; batch_ind < obj.num_batches; batch_ind++) { for (int i =0; i < obj.num_batches[batch_ind].size; i++) { // do work on obj.batches[batch_ind].values[i] } }
-
The above snippets made a number of simplifications. We could make a bunch of optimizations1
-
The primary benefit is that it is much faster to remove values from this data structure (the worst-case cost is based on the max-size of a chunk) in comparison to the existing approach (the worst-case cost scales with the number of contained values). This comes up now whenever particles move between processors.
-
The obvious drawback: we are introducing complexity and we slightly increase the cost to access an arbitrary element.
-
It's not entirely obvious to me whether the extra complexity is worthwhile on GPUs
-
I imagine that most of the semantics for adding/removing values could be modeled after an unrolled linked list
-
Footnotes
-
For example, it may be better to store batch size separately from the data in a given batch (you could do away with the
Batch<T>
Class template and store the values directly within Batches). If we also pre-allocate the memory for the maximum allowed number of particles (which would probably be beneficial), you could allocate all of the memory for all batches in a single array (rather than storing pointers to each batch, you could then store the index of each batch). ↩