Realtime Profiler

The Problem

At The Game Assembly we use the tool "PIX" for profiling our games. It's a great tool, but it requires recording a section of gameplay and reviewing it after the fact. That gets annoying when you feel a sudden frame drop and want to find the source right away so you can probe the problem further.

With PIX, that means attaching it to the process, recording a few seconds, stopping, waiting for the capture to process, finding the problem, and then starting all over again to test the next thing.

The Fix

So, I took some free time and created this tool...

With the tool I created, you simply open the profiler tab in the engine editor and watch how the frame time is split up to find the problem!

Though it's still a fairly recent addition to the engine, it has already proved very useful as a quick debugging tool, since it gives a more digestible overview of performance rather than an in-depth graph like PIX. I believe that using the two together makes for a far smoother optimization workflow than using only PIX. You can look at the general performance of a solution in the realtime profiler and then drop down to PIX if you need a far more detailed view of the timings.

Since the tool at its core is merely an updating spreadsheet, it's also far easier for other disciplines to get an idea of what's costing performance. An artist could see that the rendering stage is taking too long after adding a far too high-poly model, or a level designer could see whether all the enemies they're placing in the scene are making the AI take too long, and so on...

Technical Information

The system has 2 ways to begin profiling something:

The Automatic way

...and the Manual way

The automatic macro uses/abuses RAII (Resource Acquisition Is Initialization) to capture the rest of the enclosing scope following the line it's placed on. This is most often what you want, and since there's no paired call to keep track of, it's very easy to use and maintain. With the manual approach you do need to make sure every push has a matching pop, but since the two calls are separate you can use it in more complicated scenarios where the start and end of a task aren't as straightforward as the bounds of a scope.
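As a rough illustration of the two approaches, here is a minimal sketch. Every name in it (SKETCH_PROFILE_SCOPE, sketch::Push, sketch::Pop, ScopeGuard) is hypothetical and made up for this example, since the profiler's real macro and function names aren't listed here. It only shows the general RAII idea: a guard object pushes in its constructor and pops in its destructor.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical names throughout: this is an illustration, not the engine's API.
namespace sketch {

using Clock = std::chrono::steady_clock;

struct OpenScope {
    const char* name;
    Clock::time_point start;
};

constexpr std::size_t kMaxDepth = 64;
OpenScope g_stack[kMaxDepth];   // fixed-size stack, no heap allocation
std::size_t g_depth = 0;        // number of currently open scopes
std::size_t g_completed = 0;    // number of scopes closed so far

// Manual way: open a scope explicitly...
void Push(const char* name) {
    assert(g_depth < kMaxDepth);
    g_stack[g_depth++] = {name, Clock::now()};
}

// ...and close the most recently opened one, returning how long it was open.
Clock::duration Pop() {
    assert(g_depth > 0);
    const OpenScope& scope = g_stack[--g_depth];
    ++g_completed;
    return Clock::now() - scope.start;
}

// Automatic way: the constructor pushes and the destructor pops,
// so the scope closes by itself when the guard goes out of scope.
class ScopeGuard {
public:
    explicit ScopeGuard(const char* name) { Push(name); }
    ~ScopeGuard() { Pop(); }
    ScopeGuard(const ScopeGuard&) = delete;
    ScopeGuard& operator=(const ScopeGuard&) = delete;
};

} // namespace sketch

// Two-step concatenation so __LINE__ is expanded before pasting,
// giving each guard on a different line a unique variable name.
#define SKETCH_CONCAT_INNER(a, b) a##b
#define SKETCH_CONCAT(a, b) SKETCH_CONCAT_INNER(a, b)
#define SKETCH_PROFILE_SCOPE(name) \
    ::sketch::ScopeGuard SKETCH_CONCAT(sketchScopeGuard_, __LINE__)(name)

// Automatic: everything after the macro until the closing brace is measured.
void UpdateEnemies() {
    SKETCH_PROFILE_SCOPE("UpdateEnemies");
    // ... work to measure ...
}

// Manual: the start and end don't have to share a C++ scope.
void BeginLevelLoad() { sketch::Push("LevelLoad"); }
void EndLevelLoad()   { sketch::Pop(); }
```

The manual pair is what you would reach for when a task starts in one function and finishes in another, at the cost of having to keep the calls balanced yourself.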

The system tracks the order in which you push and pop scopes to automatically detect where a frame starts and ends. That lets it do things like count calls per frame, sum up the time spent in a scope per frame, and more!
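One way such frame detection could work is sketched below. The assumption here (mine, not confirmed by the actual implementation) is that a frame starts when a scope is pushed while nothing else is open, and ends when the outermost scope is popped. FrameTracker, ScopeStats and their members are hypothetical names, and timestamps are passed in as plain nanosecond values to keep the example deterministic.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

namespace sketch {

// Per-scope totals for the current (or most recently completed) frame.
struct ScopeStats {
    const char* name = nullptr;
    std::uint32_t callsThisFrame = 0;
    std::uint64_t nanosThisFrame = 0;  // inclusive time, nested scopes included
};

class FrameTracker {
public:
    void Push(const char* name, std::uint64_t nowNs) {
        assert(depth_ < kMaxDepth);
        if (depth_ == 0) {
            // Nothing is open, so this push is treated as the start of a frame:
            // clear the totals left over from the previous frame.
            for (std::size_t i = 0; i < statCount_; ++i) {
                stats_[i].callsThisFrame = 0;
                stats_[i].nanosThisFrame = 0;
            }
        }
        open_[depth_].statIndex = FindOrAdd(name);
        open_[depth_].startNs = nowNs;
        ++depth_;
    }

    void Pop(std::uint64_t nowNs) {
        assert(depth_ > 0);
        --depth_;
        ScopeStats& stats = stats_[open_[depth_].statIndex];
        ++stats.callsThisFrame;
        stats.nanosThisFrame += nowNs - open_[depth_].startNs;
        if (depth_ == 0) {
            ++framesCompleted_;  // the outermost scope closed: the frame is done
        }
    }

    // Returns nullptr if no scope with that name has been seen.
    const ScopeStats* Find(const char* name) const {
        for (std::size_t i = 0; i < statCount_; ++i) {
            if (std::strcmp(stats_[i].name, name) == 0) return &stats_[i];
        }
        return nullptr;
    }

    std::uint64_t FramesCompleted() const { return framesCompleted_; }

private:
    static constexpr std::size_t kMaxDepth = 64;
    static constexpr std::size_t kMaxScopes = 256;

    struct Open {
        std::size_t statIndex = 0;
        std::uint64_t startNs = 0;
    };

    // Linear search is enough for a sketch; names are compared by content.
    std::size_t FindOrAdd(const char* name) {
        for (std::size_t i = 0; i < statCount_; ++i) {
            if (std::strcmp(stats_[i].name, name) == 0) return i;
        }
        assert(statCount_ < kMaxScopes);
        stats_[statCount_].name = name;
        return statCount_++;
    }

    Open open_[kMaxDepth];
    ScopeStats stats_[kMaxScopes];
    std::size_t depth_ = 0;
    std::size_t statCount_ = 0;
    std::uint64_t framesCompleted_ = 0;
};

} // namespace sketch
```

Because the totals are only cleared when the next frame begins, a UI could read the numbers of the last finished frame at any point in between.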

The system also taps into the PIX API and forwards all the profiler scopes there as well, so both tools share the same profiler calls without the user needing to duplicate their profiling code for PIX. If you for whatever reason don't want to use PIX with the realtime system, or the other way around, the USE_MetProfiler and USE_PIX defines can be mixed and matched as you please and the system will adapt. If both are undefined, the entire profiling system is simply compiled out and the overhead is gone for the game's release.
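The mix-and-match behaviour of the two defines could be wired up roughly like this. Only USE_MetProfiler and USE_PIX come from the text above; the SKETCH_ macros and the stub functions are hypothetical stand-ins. In a real build the PIX stubs would presumably forward to the PIX event functions (PIXBeginEvent / PIXEndEvent from pix3.h) instead of bumping a counter.

```cpp
#include <cassert>

// Comment out either line to drop that back end; comment out both and
// every profiling macro below expands to an empty statement.
#define USE_MetProfiler
#define USE_PIX

namespace sketch {

// Counters standing in for real work, so the effect of the defines is visible.
int g_realtimeEvents = 0;
int g_pixEvents = 0;

// Hypothetical stand-ins for the realtime profiler's push/pop.
inline void RealtimeBegin(const char* /*name*/) { ++g_realtimeEvents; }
inline void RealtimeEnd() {}

// Hypothetical stand-ins for the calls that would forward to PIX.
inline void PixBegin(const char* /*name*/) { ++g_pixEvents; }
inline void PixEnd() {}

} // namespace sketch

#if defined(USE_MetProfiler)
    #define SKETCH_RT_BEGIN(name) ::sketch::RealtimeBegin(name)
    #define SKETCH_RT_END()       ::sketch::RealtimeEnd()
#else
    #define SKETCH_RT_BEGIN(name) ((void)0)
    #define SKETCH_RT_END()       ((void)0)
#endif

#if defined(USE_PIX)
    #define SKETCH_PIX_BEGIN(name) ::sketch::PixBegin(name)
    #define SKETCH_PIX_END()       ::sketch::PixEnd()
#else
    #define SKETCH_PIX_BEGIN(name) ((void)0)
    #define SKETCH_PIX_END()       ((void)0)
#endif

// The user only ever writes these two; which back ends receive the event
// is decided entirely at compile time by the defines above.
#define SKETCH_PROFILE_BEGIN(name) \
    do { SKETCH_RT_BEGIN(name); SKETCH_PIX_BEGIN(name); } while (0)
#define SKETCH_PROFILE_END() \
    do { SKETCH_PIX_END(); SKETCH_RT_END(); } while (0)

void RenderFrame() {
    SKETCH_PROFILE_BEGIN("RenderFrame");
    // ... work to measure ...
    SKETCH_PROFILE_END();
}
```

With a back end disabled, its macro never evaluates the name argument, so there is nothing left at the call site for the compiler to keep.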

I've also tried to make the system as fast as possible by never allocating during normal runtime and by pooling all the objects in the system. The GUI part of the system is also built mostly on top of the profiler rather than inside it, meaning it can be swapped out or removed with relative ease if need be.
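To give an idea of what pooling without runtime allocation can look like, here is a small fixed-capacity pool with an intrusive free list. It is a generic sketch of the technique, not the profiler's actual pool; NodePool and ScopeNode are hypothetical names. All storage lives inside the pool object, so acquiring and releasing a node never touches the heap.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

namespace sketch {

// Hypothetical record for one profiled scope.
struct ScopeNode {
    const char* name = nullptr;
    unsigned long long startNs = 0;
    ScopeNode* nextFree = nullptr;  // link used only while the node is unused
};

template <std::size_t Capacity>
class NodePool {
public:
    NodePool() {
        // Thread every node onto the free list once, up front.
        for (std::size_t i = 0; i < Capacity; ++i) {
            nodes_[i].nextFree = freeList_;
            freeList_ = &nodes_[i];
        }
    }

    // Hands out an unused node in O(1), or nullptr if the pool is exhausted.
    ScopeNode* Acquire() {
        if (freeList_ == nullptr) return nullptr;
        ScopeNode* node = freeList_;
        freeList_ = node->nextFree;
        node->nextFree = nullptr;
        ++inUse_;
        return node;
    }

    // Returns a node to the pool in O(1); it becomes the next one handed out.
    void Release(ScopeNode* node) {
        assert(node != nullptr && inUse_ > 0);
        node->nextFree = freeList_;
        freeList_ = node;
        --inUse_;
    }

    std::size_t InUse() const { return inUse_; }

private:
    std::array<ScopeNode, Capacity> nodes_{};  // all storage is part of the pool
    ScopeNode* freeList_ = nullptr;
    std::size_t inUse_ = 0;
};

} // namespace sketch
```

Returning nullptr on exhaustion is just one possible policy; a profiler could equally well drop the sample or assert, as long as it doesn't fall back to allocating in the middle of a frame.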

Conclusion

I think it was an interesting little side project. I learned more about how to make sure the profiler itself wasn't eating too much performance and skewing the results, and about how to use and abuse C++ macros and RAII to profile effectively and to compile code in and out depending on context and user defines.

All in all, I think the final product turned out great!