In Part 16, I showed a world rendered with three techniques -- the nearby graphics done with correctly sorted transparency, the middle range done with opaque graphics only (since sorting all the cubes is so expensive) and the distant data done with points, which also were not transparent.
I mentioned that this code "wasn't quite interactive". This week, I have a new demo for you that does run at reasonable speed. And better yet, transparency is handled through the entire range of distances.
"Screen Door" Transparency
In sorted transparency, we apply the textures in the proper order, back to front, and blend them using the alpha component. This selects how much of the background vs. how much of the new texture we want. That gives the appearance of translucent textures, as in Figure 2.
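The blend itself is simple enough to sketch. This is a minimal illustration, not code from the demo: `blend_over` and the sample colors are made-up names and values, with colors as RGB tuples in the range 0 to 1. It shows why drawing order matters when two different translucent surfaces overlap.

```python
def blend_over(dst, src, alpha):
    """Blend src over dst; alpha is the fraction of the new texture we keep."""
    return tuple(alpha * s + (1.0 - alpha) * d for s, d in zip(src, dst))

BACKGROUND = (0.0, 0.0, 0.0)
RED_GLASS = ((1.0, 0.0, 0.0), 0.5)   # (color, alpha), hypothetical values
BLUE_WATER = ((0.0, 0.0, 1.0), 0.5)

# Water is farther from the eye than glass, so back to front means water first.
back_to_front = blend_over(blend_over(BACKGROUND, *BLUE_WATER), *RED_GLASS)
front_to_back = blend_over(blend_over(BACKGROUND, *RED_GLASS), *BLUE_WATER)

print(back_to_front)  # (0.5, 0.0, 0.25)
print(front_to_back)  # (0.25, 0.0, 0.5)
```

The two results differ, which is why blended surfaces have to be sorted before they are drawn.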
Back when I first considered transparency, someone (probably Florian) mentioned the "screen door" style to me. In screen door transparency, we aren't blending at all. The alpha component is either 0% or 100%. In other words, the texture is solid, but with holes in it (like a screen door). Transparency is achieved by looking through the holes.
The advantage of this is that the surfaces don't have to be drawn in sorted order. Instead, we use "alpha testing" to leave the holes open. By not drawing pixels with zero alpha at all, we don't set the z-buffer depth at that point. A surface rendered with lots of small holes is then the same as a complex series of triangles with gaps between them. The z-buffer can handle all the overlaps, and we don't have to sort. This is much faster, especially given the amount of data we have to work with.
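To see why the order stops mattering, here is a small simulation of a single pixel. It is only a sketch under stated assumptions: `resolve_pixel` and the fragment tuples are hypothetical, and a real GPU does this per fragment in hardware rather than in a loop like this.

```python
from itertools import permutations

def resolve_pixel(fragments, background="sky"):
    """Resolve one pixel using an alpha test plus a z-buffer.

    Each fragment is (depth, color, alpha) with alpha either 0.0 or 1.0.
    """
    nearest_depth = float("inf")
    color = background
    for depth, frag_color, alpha in fragments:
        if alpha == 0.0:
            continue  # a hole: write neither color nor depth
        if depth < nearest_depth:  # ordinary z-buffer test
            nearest_depth = depth
            color = frag_color
    return color

fragments = [
    (1.0, "glass hole", 0.0),    # nearest surface, but transparent here
    (3.0, "glass streak", 1.0),  # opaque part of a second pane
    (5.0, "tree", 1.0),          # opaque scenery behind both
]

# Try every possible drawing order and collect the distinct results.
results = {resolve_pixel(order) for order in permutations(fragments)}
print(results)  # {'glass streak'}
```

Every drawing order gives the same pixel, so no sorting is needed.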
This is what Minecraft does, and you can see that in Figure 3. The glass blocks are mostly completely transparent, with opaque streaks to suggest reflections off the blocks.
I thought this was a very "Minecraft" look and I didn't want to imitate it. I realized you could use a grid instead of these streaks, but when I tried it in my own code, I didn't like the result. See Figure 4.
Not only can you see the pattern, but you can get aliasing effects when two "transparent" surfaces overlap. In fact, you can see the same effect (Moiré patterns) in real life when two layers of screen door overlap. So I kept on with my sorted transparency.
By the way, you may notice that the second-over, second-down pane of glass has a black background in it. This is some effect of transparent over transparent. It's clearly working elsewhere in this same image, and you can even see the distant scenery through two panes of glass.
I'm not sure what is happening in this one bit of overlap -- if it's a pathological case of aliasing, or some bug. I'm still looking into it. You'll see it in the demo if you look for it. It seems to happen fairly often.
A "D'oh!" Moment
I had several "D'oh!" moments this week, and one of them was about screen door transparency. I was thinking it just wasn't good enough because it looks so bad close up. However, the time to sort transparent cubes and send them all to the display each refresh cycle was killing my performance. I didn't like the look of opaque data in the distance. So I was trying screen door style transparency again and scowling at the image.
Then I realized it looks fine in the middle distance, where you can't see the pattern of holes. And so I could use screen door transparency instead of opaque for the middle distance rendering, without any loss of performance. I can also use it for the points, since they are textured the same as triangles. That completely solved the appearance issues with the Part 16 version. It now looks like Figure 5. D'oh!
I had coded all the support for rendering cubes as points back in Part 17, but hadn't added the code to turn it on in the distance. This week I did that, and checked it in to GitHub. Reader Klemens Friedl tried it and said it was even slower than cubes on his machine.
I "knew" this was impossible. The points are tiny and just a single rectangle, not the up to three rectangles of a cube. The points are a single vertex, not the up to 24 of a cube. It made absolutely no sense for it to be slower. But I tried it on my Linux box with integrated graphics (Klemens uses a laptop), and it was in fact slower!
I immediately jumped to the conclusion that points were just not supported as well as triangles on the cheap displays. I was tempted to just ignore it, since I was tired of fighting with display hardware. But then I thought "how hard could it be to draw little triangles?" It would be three vertexes instead of one, and I would need to do more work in the shaders to orient the triangle to the screen, but it seemed worth a try.
I coded this up and it was faster than points, but still slower than cubes. That really made no sense -- the cubes are made of triangles, and fewer of them. So I was stumped. I finally just put in code to count all the triangles I was putting out on the screen, and had another "D'oh!" moment... Figure 6 shows what a single chunk looks like in the demo.
As you can see, all the outside faces are visible. This is actually wrong, since the chunk will be up against other chunks that obscure it. The faces got turned on when I extracted the data from Minecraft. I was extracting a chunk at a time and so the code saw each one independently and left the outside faces visible (which they would be, if there was only the one chunk.)
What I hadn't bothered to realize is how many cubes this adds up to. Each chunk is 32 by 32 by 32, so each face is 32 by 32 or 1024 cubes. The extra five faces (or six, for buried chunks) add up to five or six thousand cubes per chunk. I am rendering 300 chunks in the test case... roughly 1.5 million extra cubes.
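The arithmetic is easy to check. The helper name `extra_boundary_cubes` is made up for this sketch, and the result is an upper estimate: it assumes every boundary position holds a solid cube and counts cubes along chunk edges once per side.

```python
CHUNK_SIZE = 32  # cubes along each edge of a chunk

def extra_boundary_cubes(exposed_sides, chunk_count, chunk_size=CHUNK_SIZE):
    """Rough count of cubes wrongly marked visible on chunk boundaries."""
    cubes_per_side = chunk_size * chunk_size
    return exposed_sides * cubes_per_side * chunk_count

print(CHUNK_SIZE * CHUNK_SIZE)       # 1024
print(extra_boundary_cubes(5, 300))  # 1536000
print(extra_boundary_cubes(6, 300))  # 1843200
```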
Then I realized what is different about points. When rendering the cubes, all the faces that point away from the eye are not drawn. But I was writing a point per cube, and the display draws them regardless of orientation. So in the demo, I was actually drawing a million more points than cubes! D'oh! again.
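The culling that cubes get for free can be sketched with a dot product. This is an illustration only: graphics hardware normally decides this from the winding order of each triangle on screen, and `visible_faces` is a hypothetical helper, not something from the demo. A face is kept when its outward normal points toward the eye.

```python
# Outward normals of the six faces of an axis-aligned cube.
FACE_NORMALS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def visible_faces(cube_center, eye, half_size=0.5):
    """Return the normals of the faces that point toward the eye."""
    visible = []
    for normal in FACE_NORMALS:
        face_center = tuple(c + half_size * n for c, n in zip(cube_center, normal))
        to_eye = tuple(e - f for e, f in zip(eye, face_center))
        if sum(n * t for n, t in zip(normal, to_eye)) > 0:
            visible.append(normal)
    return visible

print(len(visible_faces((0, 0, 0), (10, 10, 10))))  # 3
print(len(visible_faces((0, 0, 0), (10, 0, 0))))    # 1
```

At most three of the six faces survive. A point has no orientation, so nothing comparable happens unless the code does it explicitly.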
On top of that, the cubes scale down until a face is often only a pixel or so across. The points are a fixed size, and I was using size=3. So I was actually drawing 9 times as many pixels for the distant cubes, and a million more points than cubes. This was killing the performance.
The answer was smaller points, that scale with distance. And I needed to fix the world data so that all the extra cubes are removed. I did both of those things and performance improved dramatically. I'm still not sure I like the look of the points. Since they are a different size and brightness than the cubes, it's fairly noticeable when the distant scenery changes from points to cubes. See the video below, if you can't run the demo.
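One way to scale points with distance is to estimate how large a cube would appear under perspective projection. The sketch below is an assumption, not the demo's formula: `point_size_pixels`, the 60 degree field of view, the 720 pixel viewport and the one pixel minimum are all illustrative choices.

```python
import math

def point_size_pixels(distance, cube_size=1.0, fov_y_degrees=60.0,
                      viewport_height=720, min_size=1.0):
    """Approximate on-screen size of a cube at the given distance (> 0)."""
    # Distance from the eye to the image plane, measured in pixels.
    focal_length = (viewport_height / 2.0) / math.tan(math.radians(fov_y_degrees) / 2.0)
    projected = cube_size * focal_length / distance
    return max(min_size, projected)  # never smaller than one pixel

for distance in (50, 200, 800):
    print(distance, round(point_size_pixels(distance), 2))
```

Compared with a fixed size of 3, which covers 3 by 3 or 9 pixels, a distant point drawn at the minimum size covers a single pixel.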
The demo draws points with triangles, not the GL point primitive. I couldn't decide which I liked better for appearance. The points are 2 by 2 rectangles and look larger than the triangles, making the distant scenery obviously more dense. They are also slower, since they write more pixels per point than the triangle does.
My final "D'oh!" moment this week started with another "this can't be happening" situation. I had regenerated the world from the Minecraft data and turned off all the extra cubes. While I was at it, I stripped away the bottom layer of chunks, since they are mostly unused, and I wanted better performance. Unfortunately, with the new world data, the demo ran perhaps five times slower than the old world. This was clearly impossible.
I went looking for some bug in the extraction code that was turning on too many faces. I counted the output triangles again in the demo, and there were fewer with the new world, as I would expect. I turned off transparency and it made no difference. I turned off all rendering, and discovered that it was still slow! This meant the extra time was in the CPU, not the GPU.
I finally tracked this to a stupid bug. I currently get the list of chunks to display in a really heavy-handed way. I traverse all the chunk positions out from the eye, and at each one, I ask a hash table if there's a chunk there. Then I ask if the chunk is visible (intersects the view frustum.) If so, I draw it.
It turns out that way back when I wrote the hash table for chunks, I used a really stupid hash function. Many chunks have been winding up in the same hash table cell. This means the access is a linear search through all the chunks in that cell. When I stripped off the bottom layer of chunks from the world, it changed the collision pattern in the hash table. Suddenly, there were lists of chunks hundreds long! And this is on a table that is accessed thousands of times each refresh...
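The effect of a poor hash on chunk coordinates is easy to reproduce. Everything here is hypothetical: the post does not show the original hash function or the fix, so `weak_hash` is just one example of a bad choice, and `row_major_hash` is one possible alternative that assumes the world fits in a known bounding box.

```python
from collections import Counter

def weak_hash(x, y, z, buckets):
    """A deliberately poor hash: every position with the same x+y+z collides."""
    return (x + y + z) % buckets

def row_major_hash(x, y, z, buckets, origin=(-8, 0, -8), size=(16, 4, 16)):
    """Index positions row by row inside a known bounding box of chunks."""
    ox, oy, oz = origin
    _, size_y, size_z = size
    index = ((x - ox) * size_y + (y - oy)) * size_z + (z - oz)
    return index % buckets

def longest_chain(hash_fn, positions, buckets):
    """Length of the longest list a chained hash table would have to search."""
    counts = Counter(hash_fn(x, y, z, buckets) for x, y, z in positions)
    return max(counts.values())

# A hypothetical world: 16 x 4 x 16 chunk positions around the origin.
positions = [(x, y, z) for x in range(-8, 8) for y in range(4) for z in range(-8, 8)]

print(longest_chain(weak_hash, positions, 1024))       # 60
print(longest_chain(row_major_hash, positions, 1024))  # 1
```

Both tables have 1024 cells for 1024 chunks, yet the weak hash forces a search through 60 entries in its worst cell. Multiply that by thousands of lookups per refresh and the CPU cost adds up quickly.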
I hate when I find this kind of bug. I look at them and wonder how the code ever worked in the first place, and how I never noticed it before. It's like walking through an old building and having your foot go through the floor. Not reassuring.
With this fixed, and the new transparency and new points, and mixed rendering of the landscape, I'm getting very respectable frame rates on my hardware. Try the new demo below and let me know if it works for you.
For those of you who can't run the demo, there's a video here. Switch to 480p to see the small details.
Note the way that rendering of scenery in the distance changes when you move. Remember that there's no fog here, so you're seeing distant landscape at its worst.
I've rebuilt the world from the data I extracted from Minecraft. The new version no longer has the visibility set on the surface of all the chunks. This makes for significantly faster rendering. (It also looks cool underground.) The demo will run on the old version, but you should download the new one.
Download The Part 19 Demo World. Unzip it into the same directory as the demo. The directory "world" should be next to "docs" and "options.xml". Or you can edit the "worldDir" attribute in "options.xml" to point to it wherever you like.
For Windows, download The Part 19 Demo - Windows.
For Linux, download The Part 19 Demo - Linux.
As mentioned in the last update, I've stopped work on all but the OpenGL 3.3 versions of the framework. If you are running under Windows, make sure your display drivers are up to date. You may need to go to the manufacturer's driver site (Nvidia or ATI) for the latest. Both Windows Update and Dell insisted my laptop was up to date with its OpenGL 1.2 drivers, but there was a full 3.3 implementation on the ATI website.
This demo does not contain the Part 18 asteroid field (next demo, hopefully), but it does contain the compressed vertex support described in Part 17. The options.xml file is set up for small (256 meg) display cards. Increase the displayMemory and systemMemory options for better performance.
If you want to play with the three different types of rendering (sorted transparency, screen door, and points), change the options viewDistance, summaryDistance and detailDistance. The rule is that only chunks within viewDistance are displayed. Anything closer than detailDistance will use full sorted transparency. Anything farther than that and closer than summaryDistance will use screen door transparency. Anything farther than summaryDistance will use points.
So if you want everything rendered the slow, high quality way, set both summaryDistance and detailDistance to a high value like 99999. If you want everything as screen door transparency, set detailDistance to zero, and summaryDistance to 9999. If you want all points, set both detailDistance and summaryDistance to zero.
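The rule for the three distances can be written as a small function. This is a sketch of the rule as described above, not the demo's code: `render_mode` is a made-up name, and the behavior exactly at a boundary value is an assumption.

```python
def render_mode(distance, view_distance, summary_distance, detail_distance):
    """Pick how a chunk at the given distance from the eye is rendered."""
    if distance > view_distance:
        return "not drawn"
    if distance < detail_distance:
        return "sorted transparency"
    if distance < summary_distance:
        return "screen door"
    return "points"

# The three special cases from the text, for a chunk 100 units away.
print(render_mode(100, 1000, 99999, 99999))  # sorted transparency
print(render_mode(100, 1000, 9999, 0))       # screen door
print(render_mode(100, 1000, 0, 0))          # points
```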
The Source Code
Despite the long time since Part 15, there hasn't been much code growth. I've been mostly pulling my hair out over shader issues, rather than writing new code.
Mip-mapping produces reduced scale versions of the texture for use at a distance. Unfortunately, this averages together the alpha pixels and turns the black and white pattern back into a grayscale. When blended into the screen, this never produces a zero alpha value and so no pixels are discarded. It's exactly the same as drawing the original non-halftoned, non-alpha-tested textures.
On top of that, I use an OpenGL texture filtering mode that averages adjacent pixels. So even close up, when the full sized texture would be used, the black and white alpha values are being averaged to produce gray.
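The averaging problem shows up even in a tiny example. The sketch below assumes a simple 2 by 2 box filter for building the next mip level; actual drivers may filter differently, and `next_mip_level` is a hypothetical helper.

```python
def next_mip_level(alpha):
    """Halve a square alpha map by averaging each 2x2 block of texels."""
    half = len(alpha) // 2
    return [
        [
            (alpha[2 * r][2 * c] + alpha[2 * r][2 * c + 1]
             + alpha[2 * r + 1][2 * c] + alpha[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(half)
        ]
        for r in range(half)
    ]

# A 4x4 screen door pattern: alternating holes (0.0) and solid texels (1.0).
screen_door = [[float((r + c) % 2) for c in range(4)] for r in range(4)]
mip1 = next_mip_level(screen_door)

print(mip1)  # [[0.5, 0.5], [0.5, 0.5]]

# An alpha test that discards only zero alpha now discards nothing.
discarded = sum(1 for row in mip1 for a in row if a == 0.0)
print(discarded)  # 0
```

Every hole has turned into 50% gray, so the alpha test passes everywhere and the surface is blended like any other translucent texture.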
There are three reasons why this still appeared to work, despite the cubes being drawn with normal blending without sorting. First, I display all the transparent data after the opaque. So in the case where there's only a layer of glass or water over an opaque background, sorting is not necessary anyway.
Second, when glass is behind glass, the order of drawing is unimportant. You are still going to see the opaque background through two layers of glass. We would only see errors when a chunk contained water behind glass, or glass behind water. This doesn't happen much in the sample world.
Finally, the chunks are 32 across and are sorted from back to front. This takes care of some of the obvious cases like the glass dome with a tree in the center. The parts of the dome and the tree are all in separate chunks and are sorted correctly due to that.
If I do screen door transparency correctly, by turning off mip-mapping and texture filtering, it looks like Figure 8.
On the left, the alpha is halftoned with a grid pattern, which works but produces horrible Moiré patterns. Worse, as you move, the patterns change due to round-off errors and different scales and angles. So glass sparkles constantly.
On the right, I've used a random screen on the alpha pattern, which avoids the Moiré patterns, but looks even worse when you are stationary. At a distance, both styles are similar. See Figure 9.
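For readers who want to try the two styles, here is one way such alpha masks could be generated. This is an assumption about the technique, not the code behind the figures: the regular pattern uses a 2 by 2 ordered-dither (Bayer) threshold matrix, the random one compares alpha against a seeded random number, and all names are made up.

```python
import random

# 2x2 ordered-dither thresholds (a Bayer matrix), repeated across the texture.
GRID_THRESHOLDS = [[0.125, 0.625], [0.875, 0.375]]

def grid_screen(alpha, size):
    """Regular pattern: the same texels are opaque in every 2x2 cell."""
    return [[1.0 if alpha > GRID_THRESHOLDS[r % 2][c % 2] else 0.0
             for c in range(size)] for r in range(size)]

def random_screen(alpha, size, seed=0):
    """Random pattern: each texel is opaque with probability alpha."""
    rng = random.Random(seed)
    return [[1.0 if alpha > rng.random() else 0.0
             for c in range(size)] for r in range(size)]

def coverage(mask):
    """Fraction of texels that are opaque."""
    texels = [a for row in mask for a in row]
    return sum(texels) / len(texels)

print(coverage(grid_screen(0.5, 8)))    # 0.5
print(coverage(random_screen(0.5, 8)))  # varies with the seed
```

Both masks contain only 0 and 1, so they work with alpha testing. The regular one repeats every two texels, which is what makes Moiré patterns possible, while the random one trades that for noise.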
Needless to say, I don't like the way this looks at all. Glass is tolerable except when you move (then it still sparkles, even with random halftoning). Water looks terrible.
Florian has given me a list of other approaches to consider. Right now, the incorrectly-rendered transparency in the demo will do. I want to make more progress building the world. I'll get back to transparency later.