Things Every Newbie Needs to Know About OpenGL
November 25th, 2011

Now with the popularization of things like WebGL, I’m starting to see a lot of people dip their feet into GL programming, including people who wouldn’t normally get anywhere near it like web programmers. It’s great to see this kind of high-performance and low-level control extended to the web, cell phones, and whatever beyond mere desktop PCs.

Unfortunately they’re all doing it wrong.

I mean, that’s expected from someone who’s new to it. And there’s a problem with every OpenGL “tutorial” out there in that none of them help you adjust your thinking properly to really understand what’s going on under the hood.

OpenGL is more than just drawing triangles

Granted, that’s what you’ll be doing in the end to get your visuals to your user. But more importantly, a few things happen implicitly that can be costly, and everyone really should be aware of them. Forgive me for the number of 90% true generalizations I’m about to throw out, but I feel that simplifying some of this stuff in the name of easy understanding is necessary.

OpenGL queues up commands to happen later

When you tell it to draw a triangle, rather than drawing your triangle and returning after it’s done, it actually sticks your command on the end of a big list of commands, then returns. The graphics card is gradually working through this list in the background.

Here’s the first big important takeaway from this: The graphics processor (GPU) is an entirely separate processor from your CPU and can handle itself pretty well. Ideally, you want to have it start handling all those graphics functions, then you go do something else on the CPU. Handle game logic or something. Concurrency matters here: if the CPU and GPU end up taking turns instead of working at the same time, you can lose as much as half your framerate.
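To make the “list of commands” idea concrete, here is a toy model in C. It is only a sketch: CommandQueue, queue_draw, and queue_finish are names I made up, nothing here is real OpenGL, and a real driver works through its queue in the background instead of waiting for a finish call.

```c
#include <stddef.h>

#define MAX_COMMANDS 64

/* Toy stand-in for the driver's command list (hypothetical, not OpenGL). */
typedef struct {
    int pending[MAX_COMMANDS]; /* command ids waiting to be executed */
    size_t count;              /* how many commands are queued */
    size_t executed;           /* how many commands have been run so far */
} CommandQueue;

/* Like a draw call: record the command and return right away.
   Returns 1 on success, 0 if the queue is full. */
int queue_draw(CommandQueue *q, int command_id) {
    if (q->count == MAX_COMMANDS) {
        return 0;
    }
    q->pending[q->count++] = command_id;
    return 1;
}

/* Like glFinish: does not return until every queued command has run.
   In this toy the commands are simply counted as done. */
void queue_finish(CommandQueue *q) {
    q->executed += q->count;
    q->count = 0;
}
```

The point is that queue_draw returning tells you nothing about whether the work has happened yet. Only after queue_finish can you count on it.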

Consider this simple game loop for a single-threaded game:

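void mainGameLoop(void) {
    while(running) {
        handleGameLogic();  // CPU work; the GPU has nothing queued yet.
        renderGameWorld();  // Queue up rendering.
        flipScreen();       // Wait for the GPU, then show the frame.
    }
}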

flipScreen() is your favorite platform-specific screen update function. (I personally prefer SDL_GL_SwapBuffers.)

This loop is pretty bad, because your CPU and GPU end up waiting on each other quite a lot. If you were to try to visualize processor usage over time it would look like this…

CPU : ***********-------------
GPU : ----------**************

('*' = Time utilized, '-' = Time waiting)

The GPU ends up waiting for the CPU to finish all its game logic before it can even start because its list of commands to execute is just empty that whole time. Similarly, SDL_GL_SwapBuffers or whatever equivalent you’re using probably calls glFlush or glFinish internally. glFinish causes the CPU to wait until the GPU is done before returning (glFlush on its own only pushes the queued commands along), so treat the swap as a point where the CPU can stall.

So consider this loop instead…

void mainGameLoop(void) {
    while(running) {
        renderGameWorld(); // Queue up rendering to happen.
        handleGameLogic(); // CPU works while the GPU draws.
        flipScreen();
    }
}

Now because you’ve filled the list of GPU commands, it has something to work on while the CPU crunches through game logic. And both processors are more fully used, leading to a shorter overall frame time…

CPU : ***********---
GPU : **************

These are pretty naive examples of game loops, so don’t take them as gospel or anything. Just be aware of how this stuff works under the hood. And know that when you tell it to draw a triangle, it might not actually finish drawing that triangle until glFinish or something returns.
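To put rough numbers on the two loops above, here is a small sketch in C. The function names are mine, and the model is idealized: it pretends that queuing the render commands costs the CPU nothing and that the GPU never has to wait on anything else.

```c
/* Frame time when the CPU finishes its game logic before any GPU work is
   queued: the two processors effectively run one after the other. */
double serial_frame_ms(double cpu_ms, double gpu_ms) {
    return cpu_ms + gpu_ms;
}

/* Frame time when rendering is queued first, so the GPU draws while the
   CPU runs game logic: the frame lasts as long as the slower of the two. */
double overlapped_frame_ms(double cpu_ms, double gpu_ms) {
    return cpu_ms > gpu_ms ? cpu_ms : gpu_ms;
}
```

With a made-up 11 ms of game logic and 14 ms of GPU work, serial_frame_ms(11.0, 14.0) comes out to 25 ms per frame, while overlapped_frame_ms(11.0, 14.0) comes out to 14 ms. Real frames land somewhere in between.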

The graphics system is a pipeline

And it generally goes one way. When you ask for graphics data back from the GPU, it’s like calling glFinish. It’ll cause everything you’ve done to stop and wait until the video card is done before returning.

Imagine rendering a bunch of triangles to the screen, and then taking a screenshot. The video card must finish rendering the triangles before you can get access to the image data it results in. Drivers can obviously try to be clever about how they handle dependencies of that sort, but I wouldn’t rely on them. Simply consider that asking for data back from the GPU is a potentially costly operation.

If you want to see clever ways to work around these stalls, you may look into how people set up hardware occlusion queries.

GPUs suck at normal logic

Don’t use “if” statements or “for” loops of any sort in your shaders. GPUs are terrible at that. CPUs have complicated systems set up to predict branching logic and start executing future code in a pipeline before the result of a test is known (throwing that work away if the prediction turns out to be incorrect). But GPUs are really bad at this.

Don’t get me wrong. GPUs are incredibly fast at all kinds of math. They just suck at branching. The closer you can get your shader to being a pure mathematical equation, rather than something with logical choices here and there, the better.
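Here is what “turning a choice into math” can look like, sketched in plain C so it can be run anywhere. step_f and mix_f are my own stand-ins for the GLSL built-ins step() and mix(), and the shade_* functions are hypothetical. Whether the branchless form is actually faster depends on your hardware and shader compiler, so measure before assuming.

```c
/* Stand-in for GLSL step(edge, x): 0.0 if x < edge, otherwise 1.0. */
float step_f(float edge, float x) {
    return (float)(x >= edge);
}

/* Stand-in for GLSL mix(a, b, t): linear blend from a (t = 0) to b (t = 1). */
float mix_f(float a, float b, float t) {
    return a * (1.0f - t) + b * t;
}

/* Version with a logical choice in it. */
float shade_branching(float x, float edge, float dark, float lit) {
    if (x >= edge) {
        return lit;
    }
    return dark;
}

/* Same result written as one equation: the comparison becomes a 0/1
   weight, and the weight selects between the two values. */
float shade_branchless(float x, float edge, float dark, float lit) {
    return mix_f(dark, lit, step_f(edge, x));
}
```

Both functions return the same value for the same inputs. The second one just has no control flow left in it.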

Know what’s in your video memory

Or at least what you expect to be. The driver can put stuff wherever it wants, but once you say glTexImage2D, you’re basically telling OpenGL to copy that texture image data from your system’s RAM to the video card’s own RAM. The pipe between the CPU’s RAM and the GPU’s RAM is pretty big, but it’s not infinite. Calling functions that push lots of bulk data too often can slow things down.

On the other hand, things that are in video memory can be used for rendering pretty damn fast. Vertex buffers, index buffers, textures, compiled shader programs, and so on all need to be in the graphics card’s memory to be useful. As a rule of thumb, once GL owns the memory and the only way you can access the data for it is through a GLuint associated with it instead of a pointer, it’s probably either in video memory, on its way there, or the driver is doing something clever with it (in which case you shouldn’t worry about it at this point anyway). (The GLuints I’m referring to are things like the values generated by glGenTextures, glGenBuffers, glCreateProgram, and others.)

Texture sampling is slow

This actually depends a lot on the size of the texture, filtering mechanism, and a bunch of other junk. In general, though, you should try to minimize the number of superfluous texture samples in a shader (calling texture2D in a shader is a texture sample).

The other really important thing that nobody ever tells you is this: Don’t sample from one texture and use the resulting value to sample another texture. That is, imagine you sample a color from a texture…

vec4 color = texture2D(someSampler, someTexCoords);

…and then use that color as the position in another sample…

vec4 otherColor = texture2D(someOtherSampler, color.xy);

This is called a “dependent texture read” and is potentially very slow due to the way drivers and video cards try to predict stuff. The reasoning for this is possibly a little esoteric, but the results are something you have to deal with.

Fragment shaders run for every pixel and vertex shaders run for every vertex

Most of the time you will have more fragments than vertices. It’s probably okay to make your vertex shaders more complicated than your fragment shaders. How you divide up some of the logic is up to you.
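A quick bit of arithmetic shows how lopsided this can get. The sketch below assumes a single quad covering the whole screen, one fragment per pixel, and no overdraw. The names are mine.

```c
/* A quad drawn with an index buffer has four unique vertices. */
#define QUAD_VERTICES 4UL

/* Fragment shader runs for a quad covering the whole screen, assuming one
   fragment per pixel and no overdraw. */
unsigned long fullscreen_quad_fragments(unsigned long width, unsigned long height) {
    return width * height;
}

/* How many fragment shader runs there are per vertex shader run. */
unsigned long fragments_per_vertex(unsigned long width, unsigned long height) {
    return fullscreen_quad_fragments(width, height) / QUAD_VERTICES;
}
```

At 1920x1080 that is 2,073,600 fragment shader runs against 4 vertex shader runs, or 518,400 fragments per vertex. Dense meshes narrow the gap a lot, but the fragment side usually still wins.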

There’s probably tons of other stuff

I just can’t think of it right now. Good luck!

Edit: This is also pretty useful. Found it on the opengl.org wiki: Common_Mistakes.

Normal Mapping
November 25th, 2011

So I’ve been working on my own 3D engine for a while. Finally got normal mapping working last night.

Despite a few hiccups now and then, things have been going pretty smoothly. I’ve been working on a commercial engine for a couple of years now, and it seems to have prepared me pretty well for a lot of this. Debugging can be a little hard, though, because it’s difficult to tell good mesh data from bad when it’s a gigantic list of floats. When a render goes bad, all I know starting off is that the problem is somewhere between loading the mesh and the end of the fragment shader. If I screwed up my tangent space calculations, it’s pretty hard to tell without just looking at the result and seeing if it’s off. So I’m open to suggestions for how to handle this.

I use PIX at work for DirectX stuff, but I’ve been having some issues with gDEBugger. I think it’d be perfect for what I’m doing if it worked right.

Anyway, I’ll try to update more with game dev-y stuff now that I’ve got a long weekend to hammer out features on this thing.

More Doodles 3
September 29th, 2011

More random doodles from work, because I haven’t posted in a while.

More Doodles 2
May 11th, 2011

Blorp

Humble Indie Bundle #3
April 13th, 2011

Everyone who hasn’t picked up the Humble Indie Bundle #3 is a horrible person. http://www.humblebundle.com/

(Not really a horrible person, but you must have some excuse if you didn’t. :P )

More Doodles
April 3rd, 2011

Did some more random doodling.

New Arts!
March 27th, 2011

Work doodles turned into full blown art! I should do that more often. And upload my doodles here more often. Woops.

Herp derp
January 25th, 2011

Holy crap I still have a blog that I totally forgot about. Updates incoming and stuff.

Lily source code
September 25th, 2010

I finally got around to getting the Lily game source code ready to release. There’s now a .tar.bz2 file along with the Mac and Windows binaries. ( Right here. )

It can now be compiled and played on its main platform and other variations of GNU/Linux.

On Debian-based systems (Ubuntu, etc) it’ll need these packages installed to compile in addition to the usual toolchain (someone let me know if I’m missing something)…

pkg-config libvorbis-dev libogg-dev libopenal-dev zlib1g-dev libglew1.5-dev liblua5.1-0-dev libboost-thread-dev libglu1-mesa-dev libgl1-mesa-dev libsdl1.2-dev

It uses a non-standard build system (custom Makefile), so just run “make” to build it. Windows and Mac builds are still done through VC’s compiler under WINE and cross-compiling for Mac, so good luck to whoever the hell wants to mess with that. If you can somehow manage to pull off the same build environment I have, those can be built by specifying PLATFORM=win32 or PLATFORM=macosx.

The Lua code is the best place to look for anyone interested in modding, but it’s also some of the worst code in the project. Particularly the code for the main character.

In the future I’ll look into making .deb files for Debian-based Linux distros.

On another note, I think I’m going to take the engine and do a little side project away from the main Lily project. Just to see if I can hammer out a short, simpler game in a reasonable amount of time and to make sure the engine is where I want it to be. The Lua code in the game right now is an example of what happens when I try to develop using an incomplete API (the interface for the Lua code).

So… Who wants to try Lily?
September 11th, 2010

Lily is finally coming around to a point in its development where I can actually get more people to run the game instead of just posting screenshots all the time.

For a while now I’ve been poking friends to test little aspects of the engine, but I think I can safely give it out to everyone now, at least to figure out if there are still any technical issues lingering around.

So I’m finally opening it up to testing. If you’re interested, grab a binary from here: Lily binaries.

Just grab the one with the highest number and the platform of your choice (as of this writing, that would be: Lily-gameintegration-macosx-160.zip for Mac users and Lily-gameintegration-win32-160.zip for Windows users). Only Mac and Windows versions are up now. Linux is definitely on my list (hell, it’s the main dev platform) but I have to clean up a bit of stuff in code before I’m willing to post a tarball for it.

Windows users: If it complains about missing OpenAL32.dll, you need to run oalinst.exe in the redist directory.

Mac users: Save games get stored in the app’s folder. I doubt this is what’s supposed to happen, but if someone could give me a heads up on Mac app-specific preference and settings storage, that’d be nice.

Make sure you read the important parts of the README.txt file before trying to do anything, or you’ll wonder what to do at the main screen.

And the last thing: The only bugs I’m interested in hearing about are the ones where you can’t play the game at all. Either because of crashes, or it failing to start up.

I know about all of the following:
* There are places where you can fall off the edge of the world.
* The player appears as a white box because the texture is too big (only on some graphics cards).
* Control, Alt, and Shift don’t make very good controls because of window manager keybinds.
* It’s impossible to delete saves to restart a game without actually deleting the saves directory.
* The save menu is stupid and incomprehensible.

Good luck!
