Pretty banner! :)

Fastest Frame Rates with Optimized Rendering

Particle Shower

There are a thousand important things to optimize in a game engine, but perhaps the most important optimization we can make is the renderer. The reality is that even though we have a bunch of fun new tools available to help squeeze a few more frames out of the Flash Player, it still sucks in terms of raw graphical power. Render speed will likely remain the bottleneck for our games for the foreseeable future. It is therefore very important for a developer to understand when and how to push the renderer, and that starts with a fundamental understanding of different rendering techniques available and how best to utilize them to maximize performance.

Unscientific Process

I honestly don’t have the time to do exhaustive tests, so my main goal here is to review two basic approaches to enhancing render speed. I want nothing more in the world than for someone to pick up where I leave off and create a test suite that can be used to accurately and exhaustively compare different rendering methods. The reality is that there are enough special cases and different applications, along with personal agendas and other obstacles, that I don’t think that’s likely. If nothing else, I’m hoping someone can point out some misconceptions I might have, or, even better, show me something that clearly declares one particular rendering method superior so that we can put the debate to bed and focus on making awesome games.

Case 1: The Naive Built In Renderer

The built in renderer is the traditional DisplayObject/DisplayObjectContainer/Sprite display tree, used without any concern for squeezing maximum performance out of the rendering. The good news is that with AS3, this is good enough for lots of different game types, so render optimization doesn’t even need to be a consideration for a lot of games these days. There are some cheap tricks available to get maximum bang for our buck with this method, like scrollRect, cacheAsBitmap, and mouseEnabled = false, but at the end of the day we will hit a brick wall with the Naive Built In Renderer.
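Here is a minimal sketch of those cheap tricks on a plain display tree. The class name, the 640x480 window, and the backdrop are made up for illustration, and mouseChildren is an addition of mine that goes hand in hand with mouseEnabled; none of this is taken from the test source.

```actionscript
package {
    import flash.display.Sprite;
    import flash.geom.Rectangle;

    // Hypothetical document class; names and sizes are illustrative only.
    public class DisplayListTricks extends Sprite {
        public function DisplayListTricks() {
            var world:Sprite = new Sprite();
            addChild(world);

            // Clip the world to the visible window (assumed 640x480 here).
            world.scrollRect = new Rectangle(0, 0, 640, 480);

            // Opt the world and its children out of mouse event dispatch.
            world.mouseEnabled = false;
            world.mouseChildren = false;

            // Some vector art that never changes after it is drawn.
            var backdrop:Sprite = new Sprite();
            backdrop.graphics.beginFill(0x223344);
            backdrop.graphics.drawRect(0, 0, 640, 480);
            backdrop.graphics.endFill();

            // Cache the vector art as a bitmap. This tends to help objects
            // that only move; rotating or scaling regenerates the cache.
            backdrop.cacheAsBitmap = true;
            world.addChild(backdrop);
        }
    }
}
```

How much each of these buys you depends heavily on the content, so treat them as things to measure rather than guaranteed wins.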

Pros: Simple to use. Highly flexible. No custom code required. Oftentimes provides “good enough” performance.
Cons: Leaves a lot to be desired in terms of performance.

Case 2: CopyPixels Renderer

This technique seems to be the one heralded as the fastest general purpose renderer and the one I see paraded about the most often. The high level concept is that we have a single BitmapData that serves as the render window. Every frame we reset that BitmapData and copy all of our graphical bits onto it at their current locations. I used this technique to draw the sakura tree effect in the banner on this page, something that was simply not possible with the built in renderer. Render speeds can sometimes reach 2-3 times those of the Naive Built In Renderer. There is an example of this in the source code linked in the article.

This rendering technique exposes some fun “free” graphical effects that you can get by remembering portions of the rendered frame. It’s trivial to add a trail effect to a moving object, for example, using this method.

The greatest weakness of this method is its inflexibility. The capabilities of DisplayObject are lost completely. Anything you require aside from changing a visual object’s x and y location will require custom code to support. Additionally, you’ll have to write some code to support the render steps. Luckily it’s not that complicated.
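To make the render steps concrete, here is a rough sketch of a CopyPixels loop. It is not the code from the repository; the class name, the 640x480 canvas, the 4x4 pink particle, and the particle count are all assumptions for illustration.

```actionscript
package {
    import flash.display.Bitmap;
    import flash.display.BitmapData;
    import flash.display.Sprite;
    import flash.events.Event;
    import flash.geom.Point;

    // Hypothetical document class sketching the CopyPixels approach.
    public class CopyPixelsRenderer extends Sprite {
        private static const WIDTH:int = 640;   // assumed window size
        private static const HEIGHT:int = 480;

        private var canvas:BitmapData;          // the render window
        private var particleArt:BitmapData;     // source pixels for every particle
        private var positions:Array = [];       // one Point per particle
        private var dest:Point = new Point();   // reused to avoid per-frame allocation

        public function CopyPixelsRenderer() {
            canvas = new BitmapData(WIDTH, HEIGHT, false, 0x000000);
            // The only display object on the stage is one Bitmap showing the canvas.
            addChild(new Bitmap(canvas));

            // Placeholder art: a 4x4 opaque pink square.
            particleArt = new BitmapData(4, 4, true, 0xFFFFC0CB);

            for (var i:int = 0; i < 1000; i++) {
                positions.push(new Point(Math.random() * WIDTH, Math.random() * HEIGHT));
            }
            addEventListener(Event.ENTER_FRAME, onEnterFrame);
        }

        private function onEnterFrame(event:Event):void {
            // Hold off screen updates until the whole frame has been drawn.
            canvas.lock();

            // Step 1: reset the canvas.
            canvas.fillRect(canvas.rect, 0x000000);

            // Step 2: move each particle and copy its pixels into place.
            for each (var p:Point in positions) {
                p.y = (p.y + 1) % HEIGHT;
                dest.x = p.x;
                dest.y = p.y;
                canvas.copyPixels(particleArt, particleArt.rect, dest, null, null, true);
            }

            canvas.unlock();
        }
    }
}
```

For the trail effect, one option is to fade the previous frame instead of clearing it: swap the fillRect call for something like canvas.colorTransform(canvas.rect, new ColorTransform(0.9, 0.9, 0.9)), with flash.geom.ColorTransform imported. The 0.9 multiplier is an arbitrary choice; smaller values give shorter trails.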

Pros: Fast, most say the fastest.
Cons: Inflexible. Requires custom rendering code (though it is trivial).

Case 3: The Built In Bitmap Renderer

In messing around with different rendering techniques, I came up with a third method. To some extent this is simply a restricted set of the functionality available in Case 1. With some basic engine choices, you can significantly speed up the rendering in your engine while keeping your code somewhat easier to understand and maintain. The basic premise stems from the concept that Sprites and other DisplayObjects are fat and slow. Avoid them completely and you get a pretty fast little engine. Have a single root Sprite that represents your game world, and then attach only Bitmaps to that Sprite via addChild. If you use this technique, you can avoid resetting and redrawing a BitmapData yourself each frame. The code is easier to understand and maintain because it matches the traditional display model, just with some very specific constraints in place.

In my tests, the Built In Bitmap Renderer would often outperform the other techniques. In the version committed to the Google Project, the Built In Bitmap Renderer beats out the CopyPixels method by as much as 15%. I don’t really have a good answer for why that’s the case. I’d like to think it’s because this approach lets the renderer handle the optimization now that I’ve given it appropriately organized data.
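Here is a small sketch of that setup: one root Sprite, nothing but Bitmaps attached to it. As with the other examples, the class name, sizes, and particle art are invented for illustration and are not the committed test code. Sharing one BitmapData across all the Bitmaps is a choice I made here to keep memory down, not something the technique requires.

```actionscript
package {
    import flash.display.Bitmap;
    import flash.display.BitmapData;
    import flash.display.Sprite;
    import flash.events.Event;

    // Hypothetical document class sketching the Bitmap-only display list.
    public class BitmapOnlyRenderer extends Sprite {
        private static const WIDTH:int = 640;   // assumed window size
        private static const HEIGHT:int = 480;

        private var world:Sprite = new Sprite(); // the single root Sprite
        private var particles:Array = [];        // every child is a Bitmap

        public function BitmapOnlyRenderer() {
            addChild(world);

            // Placeholder art shared by every particle: a 4x4 pink square.
            var art:BitmapData = new BitmapData(4, 4, true, 0xFFFFC0CB);

            for (var i:int = 0; i < 1000; i++) {
                var particle:Bitmap = new Bitmap(art);
                particle.x = Math.random() * WIDTH;
                particle.y = Math.random() * HEIGHT;
                world.addChild(particle);
                particles.push(particle);
            }
            addEventListener(Event.ENTER_FRAME, onEnterFrame);
        }

        private function onEnterFrame(event:Event):void {
            // No clearing or redrawing: just move the Bitmaps and let
            // the built in renderer do the rest.
            for each (var particle:Bitmap in particles) {
                particle.y = (particle.y + 1) % HEIGHT;
            }
        }
    }
}
```

Compared with the CopyPixels version, the per-frame code shrinks to updating x and y, which is most of the maintainability argument.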

Pros: It’s a simpler concept to grasp, which makes it easier to maintain and share. One less thing to worry about.
Cons: Still inflexible.

Wrapping It Up

Download the swf to compare the two methods using the Litmus testing framework. The test code is spawning 3 particles each frame and attempting to do so at 120 frames per second. Source code for the render test is available in the repository. I hope this makes it just a tad bit easier to do all the fancy cool effects you’d like to tackle in your next game development masterpiece.

Post Metadata

Date
October 17th, 2008

Author
urbansquall
