Part 15: Tech Post 7: Doug, Mel, and the Modern World
I've just spent the past month digging into a binary that hewed to strange and arcane requirements, trying to tease out its secrets.
There's precedent for that, in literature and folklore. A lot of us have, at this point, heard of The Story of Mel. If you haven't read it, you should go read it, and if you have read it, that link is to an annotated version that explains the tricks in detail. Mel Kaye is an interesting comparison point, because Mel was working in the 50s or early 60s and was about as far from Doug Neubauer writing Solaris as we are from him. So it's interesting to see what's changed and what hasn't.
Before we go on, though, I do have to note that I don't like The Story of Mel. It's a story of the guy who writes the fastest code, and who thinks that entitles him to pick and choose his projects at will, excuses him writing impenetrable code, and relieves him of any duty to mentor his own co-workers. Programming is a craft, not a feat of athletics, and to the extent that there was method to his techniques, they should have been spread throughout the entire company. To the extent that tools existed, and that Mel could beat them, the techniques he used to beat them should have been folded into the tools.
And from a basic level of professional standards, leaving your successors the ability and knowledge to extend and modify your work once you're gone is a fundamental responsibility, one that Mel expects to be praised for shirking, and that the storyteller is willing to grant.
Despite literally wasting two weeks of the storyteller's time to reverse engineer a loop that wasn't even accomplishing anything important, because Mel just kept his tricks to himself.
Sure, stories grow in the telling, but the basic idea here is still incredibly unhealthy. We know for a fact that Atari's internal developers shared tricks with each other: the six-digit score trick was apparently the "Dave Staugus score kernel" there, after the programmer who worked it out, and once it was devised, everyone used it. It was a hardware upgrade for everyone.
The optimization techniques Mel employed only upgraded the RPC-4000 if Mel was the one writing for it.
To this day I am baffled by the idea that he's supposed to be some kind of folk hero of the open source community.
But all of that is culture. Let's talk technology.
Constraints
In Mel's time, the integrated circuit had only just been invented, and it would be approximately ten years before the technology was mature enough that you could create a CPU on a single chip. The defining feature of the computer in the Story of Mel is that it is a drum-memory computer. This means that its RAM is basically a disc drive that is always spinning and reads data when it comes into the right place. This computer would not have had a lot of RAM, but it would have had a whole lot more than the Atari 2600 did. After all, it had to store the program it was running in RAM.
(Mel complains about programs that can't modify their own code. Self-modifying code was a limited thing on the 6502 (we'll get to that in a bit), but a fundamental constraint of cartridge-based programming is that the program code is immutable. That's what "ROM" means. Modern machines forbid self-modifying code of the sort Mel deployed on the grounds that if it's allowed it opens you up to the broadest class of system-takeover malware attacks.)
But the sticky bit about drum memory is that it means that memory accesses can take a variable, and large, amount of time. Everything about Mel's tricks revolves around the fact that he wants to always have his memory's "seek time" be zero. The program's instructions, considered in execution order, are scattered throughout the memory so that the time spent running the previous instruction is exactly the amount of time it takes to seek to the next instruction.
(Mel's employers weren't complete newbies here; they had in fact mechanized the process of instruction placement. Mel could defeat it with hand-tuned code, but he could also defeat it by significant margins because he assigned the locations of the most time-critical code first, while the instruction-placement program would just go through from start to end. That this story ends with "so Mel is Just Better" instead of "so best practices for that machine changed to putting the most time-critical code at the top of your program" is the part where I start gnashing my teeth and ranting about the savagery and barbarism of earlier days.)
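The placement idea is easy to sketch in a few lines of Python. This is a toy model and not the RPC-4000: the drum size, the one-word-time read cost, and the execution times are all made-up numbers, and `wait_time`, `total_wait`, and `ideal_placement` are names invented for the illustration. It also ignores the real headache, which is that the ideal slot may already be taken by something else.

```python
DRUM_WORDS = 64  # hypothetical words per track, not the real machine's figure


def wait_time(head_pos, target_addr):
    """Word-times spent waiting for target_addr to rotate under the head."""
    return (target_addr - head_pos) % DRUM_WORDS


def total_wait(addresses, exec_times):
    """Total rotational wait for a straight-line program.

    addresses[i] is where instruction i lives on the drum and
    exec_times[i] is how many word-times it runs for after being read.
    """
    head_pos = addresses[0]  # assume the head starts at the first instruction
    waiting = 0
    for addr, exec_time in zip(addresses, exec_times):
        waiting += wait_time(head_pos, addr)
        # Reading the word costs one word-time; the drum keeps turning
        # for exec_time more while the instruction executes.
        head_pos = (addr + 1 + exec_time) % DRUM_WORDS
    return waiting


def ideal_placement(start, exec_times):
    """Put each instruction exactly where the head will be when it is needed."""
    addresses = [start]
    for exec_time in exec_times[:-1]:
        addresses.append((addresses[-1] + 1 + exec_time) % DRUM_WORDS)
    return addresses


exec_times = [3, 5, 2, 4]   # invented per-instruction timings
linear = [0, 1, 2, 3]       # program written out in address order
scattered = ideal_placement(0, exec_times)

print("linear layout waits", total_wait(linear, exec_times), "word-times")
print("scattered layout", scattered, "waits", total_wait(scattered, exec_times))
```

With these invented numbers the linear layout spends 182 word-times waiting on the drum, nearly a full revolution per instruction, while the scattered layout spends none.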
So in this sense, the Atari 2600 is a system with the unimaginable luxury of constant, and fast, memory access. This also means that your program gets to just be written in linear order.
One of the other odd things about the Atari 2600 is that past a certain point optimizing for size or speed buys you literally nothing. ROM chips always have a size that's a power of two, so if you've gotten your program down to 2KB, you gain nothing whatsoever by getting it down to 1.5KB. And since you're shackled to the television's electron gun, past a certain level of efficiency making your code faster just means that you're spending more time waiting for the CRT to catch up.
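Both plateaus can be written down as arithmetic. The sketch below is only illustrative: `rom_chip_size` and `idle_cycles` are hypothetical helper names, and the 76-cycles-per-scanline budget is the commonly documented 2600 figure (228 color clocks divided by 3) rather than something taken from this post.

```python
CYCLES_PER_SCANLINE = 76  # 228 TIA color clocks / 3; standard 2600 figure


def rom_chip_size(code_bytes):
    """Smallest power-of-two ROM size, in bytes, that holds the program."""
    size = 1
    while size < code_bytes:
        size *= 2
    return size


def idle_cycles(work_cycles):
    """Cycles left waiting on the beam after a scanline's work is done."""
    if work_cycles > CYCLES_PER_SCANLINE:
        raise ValueError("work does not fit in one scanline")
    return CYCLES_PER_SCANLINE - work_cycles


# Shrinking a 2KB program to 1.5KB still needs the same 2KB chip.
print(rom_chip_size(2048), rom_chip_size(1536))

# Shaving 10 cycles off a 70-cycle scanline just adds 10 cycles of waiting.
print(idle_cycles(70), idle_cycles(60))
```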
In modern terms: once you're I/O bound, that means your logic is fast enough. It's as true today as it was in 1977.
Mel's Tricks
High-level program organization aside, several of Mel's programming tricks are detailed in the story and are relevant to the 6502 as well.
Opcodes as constants. There's nothing stopping you from treating code as constant data and reading it as a raw value. But on the 6502, doing this is purely a dick move that makes your code more brittle. Constant data on the 6502 can be encoded directly into the instruction, and it's always faster to do so.
Self-modifying code to replace indexes. The 6502 turns out to be really bad at pointers, so this trick was very common if you weren't running off a ROM. If you're reading an address out of RAM on the 6502, that RAM has to be in the first 256 bytes of memory, and you have to index it. For loops with well-defined bounds, it is often simpler to just modify the address in the middle of an instruction to point at the new place to load from. However, while simpler, it's very rare that it's faster, because messing with memory that isn't in the first 256 bytes of RAM costs an extra cycle on every access (a plain load or store takes 4 cycles instead of 3). And the Atari 2600 laughs at you, because with only 128 bytes of RAM, all of it can be used as pointers and be accessed really quickly anyway! Sometimes constraint is a luxury all its own.
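A rough cycle tally makes the "simpler but rarely faster" point concrete. The cycle counts are the standard documented NMOS 6502 timings; the two loop bodies are hypothetical shapes chosen for illustration, and the tally leaves out the loop branch and any high-byte carry handling, which both versions need in some form.

```python
# Documented NMOS 6502 cycle counts for the instructions used below.
CYCLES = {
    "LDA abs": 4,      # load from a full 16-bit address
    "LDA (zp),Y": 5,   # load through a zero-page pointer (+1 on page cross, ignored)
    "INC abs": 6,      # read-modify-write outside the zero page
    "INY": 2,          # bump the index register
}


def per_byte_cost(ops):
    """Sum the cycles for one pass through a loop body."""
    return sum(CYCLES[op] for op in ops)


# Self-modifying version: load from the address baked into the instruction,
# then increment the low byte of that instruction's operand.
self_modifying = ["LDA abs", "INC abs"]

# Pointer version: load through a zero-page pointer and bump Y.
pointer_based = ["LDA (zp),Y", "INY"]

print("self-modifying:", per_byte_cost(self_modifying), "cycles per byte")
print("pointer-based: ", per_byte_cost(pointer_based), "cycles per byte")
```

Under those assumptions the self-modifying body costs 10 cycles per byte against 7 for the pointer version, and nearly all of the difference is the read-modify-write on an address outside the zero page.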
Self-modifying code to alter opcodes. This is the one that stumped the storyteller for two weeks. Unlike many other chips, the opcode on the 6502 is its own byte, and the data is in other bytes. You can mess with the operands, or you can mess with the opcodes, but you can't do both at once. Because of this, the trick doesn't translate, and I'm unaware of any tricks on the 6502 that involve modifying opcodes and that actually buy you anything.
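The difference in instruction layout can be shown with a little bit-twiddling. As the story tells it, Mel's loop ended when incrementing an instruction's address field overflowed into its opcode. The packed-word layout below is hypothetical (a 12-bit address field picked for convenience, not the RPC-4000's real format), and `pack` and `unpack` are invented names; the 6502 bytes are the real encoding of `LDA $12FF`.

```python
ADDR_BITS = 12  # hypothetical field width, not the RPC-4000's real layout
ADDR_MASK = (1 << ADDR_BITS) - 1


def pack(opcode, address):
    """Build a word with the opcode sitting directly above the address field."""
    return (opcode << ADDR_BITS) | (address & ADDR_MASK)


def unpack(word):
    """Split a packed word back into (opcode, address)."""
    return word >> ADDR_BITS, word & ADDR_MASK


# Packed word: the address field is at its maximum, so adding one
# carries into the opcode bits and turns the instruction into a different one.
word = pack(opcode=0b0101, address=ADDR_MASK)
bumped = word + 1
print("packed word:", unpack(word), "->", unpack(bumped))

# 6502: opcode and operand are separate bytes. Incrementing the low operand
# byte (as INC would) wraps to zero and leaves the opcode byte alone.
LDA_ABS = 0xAD
instruction = bytearray([LDA_ABS, 0xFF, 0x12])  # LDA $12FF, little-endian operand
instruction[1] = (instruction[1] + 1) & 0xFF
print("6502 bytes:", instruction.hex())
```

On the packed word the increment changes the opcode from 5 to 6 and zeroes the address; on the 6502 the same increment yields `LDA $1200`, with the opcode byte untouched.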
Doug's Tricks
See Tech Posts 2 through 5.
But here's the thing. I've done my share of 6502 assembly programming, but I'm too young to have done so professionally. It's always been as a hobby and as a curiosity.
I knew about the 6-digit score trick going in, but that was the only technique used that I knew in advance. Not a single one of those posts required more than an hour of experimentation to research.
In the Doug vs. Mel showdown, Doug wins hands down.
The View From 2016
Moore's Law hasn't broken yet. We command unimaginably sophisticated systems at every scale of deployment from handheld through datacenter, and we've had to grow our techniques to match. There's a very strong tendency to consider this a sign of weakness: a generation of John Henrys giving way to an age of power tools.
I think that's only half right. The idea of shifting from a world of hand tools to a world of industrial machinery is a useful one, but the old code that's really worth reading is less like admiring the mettle of iron men in wooden ships and more like admiring the comfortable simplicity of a piece of well-crafted old furniture. It's a genuine relief to step away from a world where one is trying to get fifteen different computers to all kind of talk to each other and point in roughly the same direction while speaking slightly different variations on five different relevant standards into a world where there is one computer doing one thing because that bit right there is a 1.
When I started digging into Solaris's code, I really expected it to feel like cracking a cipher or solving a puzzle. Instead, it felt like I was taking a tour and being directed to points of interest. The Atari 2600 is just small enough that even when it's juggling as much as Solaris did, at any given moment all the pieces are doing something relatively obvious. It was a genuine delight to see the parts of the system fit together to become the whole of the display, and I hope I've been able to get that across to you through these articles.
And if by some odd chance Doug Neubauer himself ends up reading this: Thanks. By my standards as a youth and as an adult this was a very fine work, and I'm glad I got to spend time with it.