It’s been quite a while since I’ve posted. A lot happened in the meantime. I changed web hosts. I spent a few months helping a nascent startup. I learned I’m going to have a son.
But this post isn’t about any of that. Instead, it’s about my recent interest in data-oriented design, particularly as it applies to implementing component-based game engines. Here’s the gist:
- Accessing data stored in RAM is very expensive compared to how fast the processor can crunch it.
- Instead of accessing only the tiny bit of data needed for a particular operation, the CPU pulls in a big chunk (a cache line, usually 64 bytes) and saves it in its own small but fast-access storage, the cache.
- If the program needs data that is already cached, the CPU can access it easily (a cache hit) instead of needing to slowly retrieve it out of RAM (a cache miss).
- Given the above, your program can see potentially huge gains in speed by organizing its data in a way that maximizes cache hits.
One way that programmers can take advantage of this is to keep object collections in contiguous memory. The data of the collection can be arranged within that block in different ways, however. One is the array of structs, in which all the fields of a particular object are grouped together in memory. An alternative is the struct of arrays, in which all the values of a particular field are grouped together in memory. Depending on how the data in the collection is accessed, one layout will produce more cache hits than the other, and thus be faster even though the computation is the same.
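To make the two layouts concrete, here is a minimal sketch. The Particle type, its fields, and the total_mass function are made up for illustration; they aren’t from any particular engine. Both versions compute the same sum, but the struct of arrays version reads only the densely packed mass array, while the array of structs version drags every particle’s position through the cache along with it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Array of structs (AoS): all fields of one particle sit next to each other.
struct Particle {
    float x, y, z;
    float mass;
};

using ParticlesAoS = std::vector<Particle>;

// Struct of arrays (SoA): each field gets its own contiguous array.
struct ParticlesSoA {
    std::vector<float> x, y, z;
    std::vector<float> mass;

    std::size_t size() const { return mass.size(); }

    // Adding one object means touching every field array.
    void push_back(const Particle& p) {
        x.push_back(p.x);
        y.push_back(p.y);
        z.push_back(p.z);
        mass.push_back(p.mass);
    }

    // Reassemble one logical object from the separate field arrays.
    Particle at(std::size_t i) const {
        assert(i < size());
        return Particle{x[i], y[i], z[i], mass[i]};
    }
};

// Only `mass` is needed, but each cache line fetched also carries x, y, z.
float total_mass(const ParticlesAoS& ps) {
    float sum = 0.0f;
    for (const Particle& p : ps) sum += p.mass;
    return sum;
}

// Same computation on SoA: every byte fetched is a mass value.
float total_mass(const ParticlesSoA& ps) {
    float sum = 0.0f;
    for (float m : ps.mass) sum += m;
    return sum;
}
```

If the hot loop instead needed x, y, z, and mass together for every particle, the array of structs layout would likely be the better fit, which is why the access pattern decides.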
A downside of the struct of arrays pattern is that it’s a little inconvenient to have the members of a particular object spread out in different memory locations. Managing the different field arrays can get unwieldy. I wanted to know if there was a way to use C++ and templates to somehow generate and handle all the array juggling for me.
This was the closest I found to a solution. It uses a lot of template trickery and macros to handle everything. I wanted to avoid the need for macros, but I believe it is not possible without liberal use of tuples. The reason is that the field names of the objects cannot be specified by templates. If you are happy to settle for each “object” being a tuple and accessing fields using syntax similar to get<1>(obj), then it should be possible. But the only way to specify convenient field names is to actually write them out, and that calls for macros.
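For the curious, here is a rough sketch of what the macro-free, tuple-only route might look like. This is my own illustration, not the solution mentioned above: the SoA class and its field and push_back members are hypothetical names, it assumes C++17 (for the fold expression), and it assumes no field is bool, since std::vector<bool> does not hand out real references.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical sketch: one std::vector per field, fields addressed by index.
template <typename... Fields>
class SoA {
public:
    std::size_t size() const { return size_; }

    // Append one logical object by pushing each value onto its field's array.
    void push_back(const Fields&... values) {
        push_impl(std::index_sequence_for<Fields...>{}, values...);
        ++size_;
    }

    // The whole array for field I, for tight loops over a single field.
    template <std::size_t I>
    auto& field() { return std::get<I>(arrays_); }

    template <std::size_t I>
    const auto& field() const { return std::get<I>(arrays_); }

    // A tuple of references to the fields of object i, so that
    // std::get<1>(objs[i]) reads or writes that object's second field.
    std::tuple<Fields&...> operator[](std::size_t i) {
        assert(i < size_);
        return at_impl(i, std::index_sequence_for<Fields...>{});
    }

private:
    template <std::size_t... Is>
    void push_impl(std::index_sequence<Is...>, const Fields&... values) {
        (std::get<Is>(arrays_).push_back(values), ...);  // C++17 fold expression
    }

    template <std::size_t... Is>
    std::tuple<Fields&...> at_impl(std::size_t i, std::index_sequence<Is...>) {
        return std::tuple<Fields&...>(std::get<Is>(arrays_)[i]...);
    }

    std::tuple<std::vector<Fields>...> arrays_;  // the struct of arrays itself
    std::size_t size_ = 0;
};
```

With something like this, SoA<int, float> objs; objs.push_back(1, 2.5f); stores the int and the float in separate arrays, and std::get<1>(objs[0]) refers to the float. The array juggling is generated by the templates, but the fields are still just numbers, which is exactly the trade-off described above.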