Part 45: Tales of ROMhacking: Part 4
Edit: I wanna apologize in advance if this is too slow or something. I know SA is a tech-savvy audience, but I wasn't sure what level of detail to go with. If it seems like I'm talking down to you, it's just me screwing up "know your audience".Tales of ROMhacking
I wanted this to focus on Scarboy's completely brilliant move: The LZO Decompression Hijack. Before I do though, we need some stuff in the way of context; an entire post in fact.
Let's talk graphics.
Like the text, graphics in Policenauts are stored in multiple formats for different scenes. There are more or less three major standards Policenauts uses to store graphical data, but there's a lot of one-off scenarios that use modifications on those standards and in one or two cases, completely different things.
So before we get to Konami, in general, how do we store graphics as data?
I don't think I have to get into pixels and computer monitors here, so I'm just going to leave it at we have to store RGB values for the pixels we want to render. In the simplest case, we can just do that. We'll just have a big file, delimited by spaces and ... what the hey. We'll use HTML-colors for right now to make things easy.
#000000 #000000 #000000 #000000 #000000 #000000 #000000 #000000
#000000 #000000 #000000 #000000 #000000 #000000 #000000 #000000
#000000 #000000 #000000 #000000 #000000 #000000 #000000 #000000
#000000 #000000 #000000 #FF0000 #FFFFFF #000000 #000000 #000000
#000000 #000000 #000000 #FFFFFF #FF0000 #000000 #000000 #000000
#000000 #000000 #000000 #000000 #000000 #000000 #000000 #000000
#000000 #000000 #000000 #000000 #000000 #000000 #000000 #000000
#000000 #000000 #000000 #000000 #000000 #000000 #000000 #000000
(scaled to 3200x magnification)
That is an 8x8 drawing of a 2x2 red-white checkered square, centered on a black background. It's three colors, 64 pixels, and requires about, oh 512 bytes of storage. That SUCKS for such a tiny graphic.
Ok, look, granted IN MEMORY, it's gonna be uncompressed pixel data that the game uses to draw with, but we can't actually store graphics like this - the game data would be enormous. There's quite a few different simplistic schemes we can try, and Policenauts uses a bunch of them. Let's continue this example and examine a different method for storing this.
1. Paletted Graphics
One of the biggest factors in determining how big graphical data can be is "bits per pixel" or bpp. In our HTML-hex color, uncompressed scheme, each pixel of data is presented by 6 bytes (not counting the # or space after). That's 48bpp which, like I mentioned earlier, isn't good for something that's 3 colors total.
But what if we were to introduce a palette? Since we know this graphic is only 3 colors, how about we do this instead: we replace each distinct color with some value, and have a lookup table (LUT) to tell us what color each represents. What I mean is, we start with a table like this:
0 - #000000
1 - #FF0000
2 - #FFFFFF
And store our graphic like this:
00000000
00000000
00000000
00012000
00021000
00000000
00000000
00000000
When we go to draw, we replace the individual numbers (indices) with the color in the lookup table. This alone puts us at 1 byte, or 8 bits per pixel - 64 bytes total from 512 with no information loss. So just by doing this, our "compression" ratio for graphics jumps up to an incredible 87% roughly. Note, I'm not counting the lookup table, but if I did, that's about 97 bytes, 80% compression, still decent. Hell, we can do even better and use the actual bits for the indices - 00, 01, 10 could be our indices, so we can cut it down to TWO bits per pixel! 16 bytes total! In fact, the graphic data is actually smaller than the lookup table!
...For a bullshit 3-color sample graphic.
That's why Policenauts uses it for:
1. The font file and name tags. They're four colors tops: White, light gray, dark gray, and black. Perfectly reasonable.
2. The shooting trainer scenes. Those are actually modest scenes in terms of color depth. I don't think it was appropriate for them to use paletted graphics there, and I'm not 100% sure why they did it - especially, because they had much better schemes. We'll examine why in the next section.
But let's say that 8x8 sample graphic was part of a much larger one. One that had, I dunno, 65536 colors total. Or even larger. Unfortunately, as our color depth increases, so do the number of palette entries, and eventually, we'll need so many bits to represent them that we lose the benefits of paletting. Of course, pretty much anything is better than my HTML hex example, but at a reasonable 65k (16-bit color, which is actually what Policenauts uses for the gameplay scenes - 5-bits per color channel, and 1-bit for alpha) , we need two bytes per color palette. Even indexed, it takes up some room.
Could we do better?
2. RLE (Run-Length Encoding) (note: Not exactly used by the game, but included here to introduce the third)
The crummy thing about our sample graphic is that there's a lot of useless space. Of course we could crop it, but the truth with cartoony graphics, or video game graphics - or typically anything that's non-photoreleastic - single colors do tend to dominate large regions of the graphic. It'd be cool if we could take advantage, and the simplest method is to do it in "runs". Simply put, we store rows of pixels as multipliers and colors. So instead of 8 whole bytes for (assuming 1-byte paletted):
00000000
We have:
8x0
Obviously, that "x" is wasted space, but we'll keep it for clarity sake. Our graphic now becomes:
8x0
8x0
8x0
3x0 1x1 1x2 3x0
3x0 1x2 1x1 3x0
8x0
8x0
8x0
If we get rid of superfluous x's and spaces, and assuming one byte for multiplier and one byte for color index, we have 20 bytes. That's even better than the 64 bytes before! (Note, the 2bpp example - it gets more complicated because you have to decide how to specify how many bits for the multiplier, which gets into how wide a row pixels can be. I stuck with the bytes here for easier math, but with 2bpp and ... let's say 6 bits for the multiplier for no good reason? I dunno, we're looking at maybe 14 bytes total.)
This would've been pretty great for the shooting trainer stuff, which has lots of runs of similarly colored pixels... so I have no idea in the world why Konami didn't do that. It's uncompressed paletted, and it takes up way more than it has to. (By the by? Each screen for the shooting trainer, maddeningly, uses a whole different palette with wholly different colors - even if there aren't new colors between screens. Meryl's shooting evaluation screens use "1" for white, but Redwood's use "15" for white, which made them a pain in the ass.)
Run-length encoding is a pretty common graphical "compression" routine since it's so simple, but there's one catch. Imagine an 8-8 checkered board, black and white alternating pixels:
10101010
01010101
10101010
01010101
10101010
01010101
10101010
01010101
Assuming an 8x8 paletted graphic like before, without RLE, that's 64 bytes for storage. However, let's try to RLE that:
1x1 1x0 1x1 1x0 1x1 1x0 1x1 1x0
1x0 1x1 1x0 1x1 1x0 1x1 1x0 1x1
1x1 1x0 1x1 1x0 1x1 1x0 1x1 1x0
1x0 1x1 1x0 1x1 1x0 1x1 1x0 1x1
1x1 1x0 1x1 1x0 1x1 1x0 1x1 1x0
1x0 1x1 1x0 1x1 1x0 1x1 1x0 1x1
1x1 1x0 1x1 1x0 1x1 1x0 1x1 1x0
1x0 1x1 1x0 1x1 1x0 1x1 1x0 1x1
Since there's no "runs" of colors, RLE actually hurts us quite a bit. Here, it just adds bytes and our graphic isn't compressed at all - it's actually doubled in size! That sucks, especially for video games with low color depth - they can use "dithering" (sprinkling pixels onto a different color to create the illusion of a new color without increasing color depth) which means RLE isn't going to help a lot.
Konami did something "clever" for get around this. I don't know if it's a real thing or not (please enlighten me if so!), but they did a thing that I call:
3. Contiguous/Non-Continguous Run Length Encoding
The checkerboard can screw up RLE. But if only there was some way to specify it. Like to say "hey, stop doing RLE for a minute - I have a bunch of heterogenous colors coming, and I don't need multipliers." In one weird instance, Konami does it with negative and positive numbers. It works like this:
- If you see a positive number m, assume it's the multiplier for an RLE row of pixels. Draw that color m times.
- If you see a negative number n, take its absolute value and draw the next n pixels as if it were an uncompressed graphic.
So our checkerboard graphic looks like:
-8 10101010
-8 01010101
-8 10101010
-8 01010101
-8 10101010
-8 01010101
-8 10101010
-8 01010101
That's still worse than the orginal - it's 8 extra bytes. But let's go back to our original graphic. With CNC run length encoding, it's now:
8x0
8x0
8x0
3x0 -2 1,2 3x0
3x0 -2 2,1 3x0
8x0
8x0
8x0
This is slightly better than our RLE example, and it works to get around the checkerboard problem because we're not adding a multiplier for every single color change. It basically tries to minimize the downside of RLE.
Like it or not, this was the second-most prevalent graphical storage scheme used in the game for things like, but not limited to:
- Movie overlays, like the telops and credits for the FMV scenes
- One-off full screen graphics, like the summary screen backgrounds, or the title screen
- The "Please Change Discs" screen
- The game over screen
- The end credits
For the record, when Konami did this RLE thing, instead of negative numbers, they actually just used a flag. One bit was used - if it was zero, it said "this is a run of contiguous same-colored pixels" and if it was one, it said "this is a run of non-contigious heterogenous colors." The next bit - for God knows what reason - was unused, so the next six were used for the actual number of pixels. This means that the "runs" could describe 64 pixels maximum.
Also these graphics were NOT paletted. They used 5551 BGRA scheme.
And again, there is ONE SCENE in the game, at the end of Act 2, where they use positive and negative numbers instead of the single leading bit thing. Again, no idea why.
But the most prevalent version of the graphical compression in the game?
4. Real Compression
What do I mean?
I mean heavy-ass mathmetical algorithms that take large swaths of numbers representing pixel data, and churning out a much smaller chunk of data. Something smarter, and much harder to crack. Something proprietary to Konami used to store the graphics.
I'm talking about things like lossless bit rate reduction, and stuff with all sorts of greek letters and prime numbers and all that. And we had to figure out how to do it, somehow.
We didn't, though.
Scarboy did something a lot smarter.
(to be continued)