In games we use a lot of ray casts to probe the environment around the player. This environment consists mainly of triangle meshes. Much of the research in ray casting is done in the context of ray tracing where there are 1000s of coherent (nearly parallel) rays to be calculated. Our ray casts are typically not coherent and if they originate from the same location, they’re usually in opposite directions. For game play, we typically do in the order of 100 ray casts per frame. We need the ray cast result right away, because game logic often does new ray casts based on the result of the previous ray cast. Getting ray cast results 1 frame later (as is often the case with GPU based solutions) makes the game code complex and reduces the responsiveness of the game. Memory is always tight on a game console, so we want to store our meshes in a memory efficient format.
This article compares various triangle encodings and bounding volume hierarchies to select an efficient algorithm for our use case. We will limit ourselves to finding the closest triangle hit. We focus on CPU tests, but at the end of the document we also do some testing on GPU.
Since, on modern processors memory is much slower than calculations, we expect that it will pay off to do some more calculations to decompress data. We found that, for our use case, this is not completely true. As soon as the bounding volume hierarchy / triangle encoding scheme becomes more complex, the extra calculations do not weigh up against the reduced amount of memory accesses. We found that a simple encoding in combination with SIMD instructions for hit testing gives a good balance between speed and memory requirements. We found that ray casting on GPU is faster, but only when there are many ray casts and the game code can wait (typically a frame) for the answer.
Article: PDF
Source Code: github.com