Compute!'s Gazette Part #34 - Disassembling (Part 1)

Part 34: Disassembling (Part 1)

Disassembly for fun and ~~profit~~, Part 1

Giving a complete explanation of Assembly is definitely beyond the scope of this thread, but I am hoping that as I delve into the secrets of other people's code with the Disassembler, at least some essence of how it works will be conveyed. That said, there will be a brief overview here, which you can skip and come back to if you just want to see the code taken apart.

Machine language is actually fairly simple, as it only consists of a handful of commands, with a few variations. That doesn't mean it's easy to understand what it is doing; arguably the opposite is true. A fair amount of effort must be expended cross-checking reference material, address listings, and the like, to decipher what the code is accomplishing. But when you know that with enough effort you can figure it out, it's pretty fun to pull back that veil from the inside.

To get one thing out of the way from the start: We'll see a number of values that are limited to 0-255, and also be talking about 'bytes'. The C64 uses an 8-bit processor (the MOS 6510, a variant of the 6502), which means that one 8-bit byte is the natural unit of storage. While a few things are coded as individual bits, most values are accessed and processed one byte at a time.

To further explain how machine language and disassembling work, we need to consider how the code is stored in memory. When interpreted as machine language, the contents of memory consist of a list of instructions. Under normal operation, the processor steps through the instructions one at a time, in sequence.

The first byte of an instruction is called the opcode. Depending on what the opcode is, the next one or two bytes will contain the operand, if one is required. Every value from 0 to 255 corresponds to a potential opcode. I say 'potential' because only a certain set of opcodes is documented to work. Technically every opcode will make the processor do something, even if for the undocumented ones that something might be a jam or crash. A few even have unexpected effects that let you get away with doing two things at once if you're lucky. You really need a physical machine to explore what happens in those cases, though.

Here's a reference chart that shows all the (documented) opcodes: http://e-tradition.net/bytes/6502/6...uction_set.html

Once the opcode is read, the processor will operate on that, plus any operands. The next value in memory will be interpreted as the next opcode in sequence. You could start interpreting the numbers in memory as machine language from any location, but it will only make sense if those values actually form a meaningful sequence.

It's not difficult, then, to understand how the disassembler program works. It saves us the effort of looking up the opcodes in the chart, grabs the operands, and does some formatting to make it easy to read.
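
To make that concrete, here's a minimal sketch of the idea in Python. This is not the disassembler used in this series, just an illustration: the table only covers the seven opcodes that appear in this article, and the names `OPCODES` and `disassemble` are made up for the example.

```python
# Tiny illustrative opcode table: opcode -> (mnemonic, mode, instruction length).
# Only the opcodes that show up in this article are listed.
OPCODES = {
    0x00: ("brk", "implied", 1),
    0x20: ("jsr", "absolute", 3),
    0x4C: ("jmp", "absolute", 3),
    0xA0: ("ldy", "immediate", 2),
    0xA9: ("lda", "immediate", 2),
    0xEE: ("inc", "absolute", 3),
    0xF0: ("beq", "relative", 2),
}


def disassemble(code, start):
    """Return a list of (address, text) pairs for the byte values in `code`."""
    lines = []
    i = 0
    while i < len(code):
        address = start + i
        opcode = code[i]
        if opcode not in OPCODES:
            # Not in our tiny table: show the raw value and move on one byte.
            lines.append((address, "??? %d" % opcode))
            i += 1
            continue
        name, mode, length = OPCODES[opcode]
        if i + length > len(code):
            break  # the operand bytes run off the end of the data
        operands = code[i + 1:i + length]
        if mode == "implied":
            text = name
        elif mode == "immediate":
            text = "%s #%d" % (name, operands[0])
        elif mode == "absolute":
            # Two-byte address, low byte first.
            text = "%s %d" % (name, operands[0] + 256 * operands[1])
        else:
            # Relative: a signed offset counted from the following instruction.
            offset = operands[0] - 256 if operands[0] >= 128 else operands[0]
            text = "%s %d" % (name, address + length + offset)
        lines.append((address, text))
        i += length
    return lines


for address, text in disassemble([238, 32, 208, 238, 33, 208, 76, 0, 192], 49152):
    print(address, text)
```

A real disassembler does the same lookup with a full table of documented opcodes and every addressing mode.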

Even without a full range of 256 values, the instruction list may seem like a lot, but it boils down to just a few different actions. You can:

move values around (load to a register, store a register to memory, transfer between registers, or push/pull)
change where the program will go next (branches and jumps)
make simple changes to some values (increment, set/clear bits)
do more complex logic and math operations
...and if you really want, do nothing at all (NOP, called a 'no op').

If you look at that chart you'll see that in at least some cases, the types of commands are grouped together. Probably the most noticeable is that similar commands tend to show up in the same column; what's different about them is something called the 'mode' which is how the operand is interpreted. Detailing the modes isn't all that important, but I'll go over them for the commands we see below.

The other important thing to understand about the processor is the registers at the heart of the machine. All but one of these are 8-bit registers (the exception is the Program Counter, which is 16 bits wide because it has to hold a full address), and most of the machine language instructions affect them in some way. There are six registers, only three of which are normally interacted with.

The Accumulator (sometimes referred to as ACC) is the primary register for the processor. It's the only one that math and logical operations can be used with, and as such sees the most usage.

The X and Y Registers (also called Index Registers) can be used for temporary storage of values, but since they can't perform all the operations of the accumulator, their primary use is as loop counters. Some of the operand modes automatically add the X or Y register as the offset to an address.

That's it for the 'general' registers. Here are the other three:

The Program Counter (PC) stores the address of the next instruction to execute in memory. The PC is not manipulated directly, but can be modified as a result of an instruction like a jump.

The Status Register (SR) contains a series of single bit flags to indicate various conditions. Typically this is for testing the result of logical or arithmetic operations, but there are other bits located here as well. The SR is not used as a whole; instead, the individual bits are set, cleared, or checked by various instructions.

The Stack Register (more commonly called the Stack Pointer) is used for temporary storage of values. Values put on the stack are stored in memory, but on the 6502 the stack is fixed to the 256 bytes from $0100 to $01FF, so only a limited number of values can be saved (what the Stack Register actually holds is the position within that page where the next value will go). Using a stack allows for code to be recursive (jump back to itself), or for some other operation that needs to temporarily save the current state. In contrast to modern usage ('push/pop'), the 6502 uses the terms 'push' and 'pull' for the stack. Only the Accumulator and Status Register can be pushed directly; the processor also uses the stack itself to remember where a subroutine should return to.
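
As a rough model of how that works, here's a short Python sketch. It assumes the standard 6502 arrangement, where the stack lives in the 256 bytes from $0100 to $01FF and the register is an 8-bit offset into that page. The class name `Stack6502` and the starting value of 255 are illustrative choices for this sketch, not something taken from the C64's startup code.

```python
class Stack6502:
    """Toy model of the 6502 stack: one page of memory plus an 8-bit pointer."""

    def __init__(self):
        self.memory = [0] * 65536
        self.sp = 0xFF  # assumed starting value for this sketch

    def push(self, value):
        # Store at the current position, then move the pointer down.
        self.memory[0x0100 + self.sp] = value & 0xFF
        self.sp = (self.sp - 1) & 0xFF

    def pull(self):
        # Move the pointer back up, then read the value found there.
        self.sp = (self.sp + 1) & 0xFF
        return self.memory[0x0100 + self.sp]


stack = Stack6502()
stack.push(6)
stack.push(8)
print(stack.pull(), stack.pull())  # values come back in reverse order
```

Because the pointer is only 8 bits wide, pushing more than 256 values in this model wraps around and overwrites the oldest entries rather than spilling into the rest of memory.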

That's hopefully enough of an overview to get started. Now let's make Johnny Five very sad:

Our first look is this little gem from Fred M. Sloniker:

code:

10 data238,32,208,238,33,208,76,0,192
20 forzz=0to8:readzx:poke49152+zz,zx:next
30 printchr$(147):sys49152

The loop puts those numbers directly into memory, starting at 49152. That's a typical location to put machine language stuff, since there's a block of free memory there that normal BASIC programs won't overwrite. Of particular usefulness to our efforts, anything stored there will stay even when we load the disassembler from disk.*

*Note that in VICE, using the 'Open' command to create a disk on-the-fly seems to clear this memory, so be sure to have your disks ready if you try this in an emulator.

As there's only one line of data statements, we can expect the resultant code to be quite short, and indeed it is.

code:

start address(decimal)
? 49152 (hex=c000)
 49152  inc 53280
 49155  inc 53281
 49158  jmp---> 49152
 49161  brk
 49162  brk
 ... (rest is brk statements)

There we have it, just three instructions. These are relatively easy to understand. 'INC' is an 'increment' statement, and it works on a value in memory rather than on a register (the X and Y registers have their own increment instructions, INX and INY). A bare number as an operand means an address, i.e. a location in memory. So what we're doing here is incrementing the value at location 53280, then incrementing the value at location 53281. After that we have the 'JMP' statement, which you've probably figured out is a jump. The operand is an address which is where the program will jump to next. In this case, we loop forever.
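
If you're wondering how nine numbers in a DATA statement turn into those addresses: each of these instructions is one opcode byte followed by a two-byte address, stored low byte first. Here's the arithmetic as a small Python sketch (the helper name `absolute_operand` is just for illustration).

```python
# The nine values from the DATA statement in line 10.
data = [238, 32, 208, 238, 33, 208, 76, 0, 192]


def absolute_operand(low, high):
    """Combine a low byte and a high byte into a 16-bit address."""
    return low + 256 * high


first_inc = absolute_operand(data[1], data[2])
second_inc = absolute_operand(data[4], data[5])
jump_target = absolute_operand(data[7], data[8])

for address in (first_inc, second_inc, jump_target):
    print(address, hex(address))
```

So 32 and 208 become 32 + 208 * 256 = 53280, and 0 and 192 become 49152, which is the start of the routine itself.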

As it happens, these two locations control the screen's border color (53280) and background color (53281). So what we're doing is very rapidly cycling through them. The effect you see on the screen doesn't look like any consistent screen color, however. What's happening is that the machine language is executing so quickly that the colors are changing while the video chip is still in the process of redrawing the screen. That's why this effect only works properly when using a tight machine language loop.

Here's the equivalent in BASIC, if you want to compare. This is merely an attempt to imitate the technique used; there are faster ways to do the same thing in BASIC, but any method is still going to be noticeably slower.

code:

10 poke 53280,(peek(53280)+1 and 255)
20 poke 53281,(peek(53281)+1 and 255)
30 goto 10
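
The 'and 255' in the BASIC version is there because a byte can't hold anything above 255, so the sum has to be wrapped by hand before it is poked back. The machine language INC does that wrap on its own. Here's the same wrap-around as a quick Python sketch (`inc_byte` is a made-up name for this example).

```python
def inc_byte(value):
    """Add one to an 8-bit value, wrapping from 255 back to 0."""
    return (value + 1) & 255


value = 254
history = []
for _ in range(3):
    value = inc_byte(value)
    history.append(value)

print(history)  # [255, 0, 1]
```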

Manxome Bromide gave us this bit of code, which gives us a chance to consider how programs are stored in memory:

code:

10 rem omg goon rush
20 for i=0 to 19:read a:poke 49152+i,a:next i:sys 49152
30 data 169,6,160,8,32,30,171,238,134,2,32,228,255,240,241,169,154,76,210,255

As mentioned, the REM statement is not optional. As with the first snippet, this code is conveniently placed up at 49152. Here's the disassembly, with comments added by me.

code:

start address(decimal)
? 49152 (hex=c000)
 49152  lda   # 6             
 49154  ldy   # 8             ; pointer to where the string is 
 49156  jsr 43806             ; string out (BASIC routine)
 49159  inc 646               ; change cursor color
 49162  jsr 65508             ; get char
 49165  beq    49152
 49167  lda   # 154           ; light blue
 49169  jmp---> 65490         ; output char
 49172  brk

I should stress that the comments I add tend to be guesswork, with a few other things figured out by looking at other reference material. That's part of the fun of disassembly -- you get to try and reconstruct how the original assembly might have been written.

The first two instructions are 'LDA' and 'LDY', which stand for a 'load' of the Accumulator and the Y register respectively. The # means we're using immediate mode, which means that those literal values are being loaded to the registers. That sets the Accumulator to 6, and the Y register to 8.

The next statement is a 'JSR'. This is also a jump, like the 'JMP' instruction, but in this case it means 'Jump to Subroutine'. At some point the routine we're jumping to will return, and we'll come back right here, proceeding with the next statement in sequence.

The address we jump to is 43806. I happen to know that that is somewhere in the BASIC interpreter. After looking it up, this turns out to be an 'output a string' routine. Jumping into BASIC subroutines can save you space in your own code, though it is a sort of uncharted territory if you don't have the machine already mapped out (as has been done multiple times at this point).

Without knowing much about that string routine, we can make a good guess at what those values loaded into the Y register and Accumulator are. Almost any subroutine needs some sort of set-up, and if we're going to output a string, the routine needs to be told which string we want to print. That means those two registers are likely being used to supply an address, and knowing what this program does and the values contained, that interpretation makes sense.

The first thing to note is that because this is an 8-bit machine with a 16-bit address space, we need two registers if we want to access all of memory. Since I know what this code does, I can recognize that the address we're loading takes the Y register as the high byte: $0806, in other words. ($ indicates hexadecimal -- with the value split across registers it's a lot easier to read it as such).
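
Here's that split worked through in Python, as a sketch. The function names are made up for illustration; the only facts used are the register values from the listing (Accumulator = 6 as the low byte, Y = 8 as the high byte).

```python
def join_address(low, high):
    """Build a 16-bit address from a low byte and a high byte."""
    return (high << 8) | low


def split_address(address):
    """Break a 16-bit address into its (low, high) bytes."""
    return address & 0xFF, address >> 8


string_address = join_address(6, 8)
print(string_address, "$%04X" % string_address)  # 2054 $0806

# The same split explains the operand bytes of the JSR in the DATA statement.
print(split_address(43806))  # (30, 171)
```

Reading the values in hex makes the split obvious: $08 and $06 sit side by side as $0806, while in decimal 8 and 6 don't look anything like 2054.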

Looking at the code snippet, we didn't apparently poke anything into location $0806 (2054 decimal). How did the string that does get printed get there, and how did the programmer know it would be there (or alternately, how did I know that's what it should mean)?

As it turns out, when a new BASIC program is written, it will start at location $0801 by default. And so that first REM statement is going to be stashed right at that point. Each stored line begins with a two-byte pointer to the next line and a two-byte line number, followed by the contents of the line, so the one-byte REM keyword lands at $0805 and the text after it begins at $0806. Now, BASIC programs are not all stored as the literal typed-in characters; the representation compresses keywords, drops a few things here and there. But REM statements can be anything, so that statement actually must be stored somewhere as a literal string.

So that subroutine will output whatever string is located at the REM statement (or alternately, whatever data is located at $0806, interpreted as a string whether it is one or not).
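
To show where $0806 comes from, here's a Python sketch that builds the bytes of that first line the way BASIC stores it, under a few stated assumptions: the program starts at $0801, each line is a two-byte link to the next line, a two-byte line number, the line contents, and a zero byte at the end, and the REM keyword is stored as the single byte $8F. The function name is made up, and plain uppercase ASCII stands in for the C64's own character codes.

```python
BASIC_START = 0x0801  # assumed default start of BASIC program text
REM_TOKEN = 0x8F      # assumed one-byte token for the REM keyword


def tokenized_rem_line(line_number, comment_text):
    """Build the in-memory bytes for a line like '10 rem ...' (a sketch)."""
    text = comment_text.upper().encode("ascii")
    body = bytes([REM_TOKEN]) + text + b"\x00"
    next_line = BASIC_START + 4 + len(body)
    header = bytes([
        next_line & 0xFF, next_line >> 8,      # link to the next line
        line_number & 0xFF, line_number >> 8,  # line number
    ])
    return header + body


line = tokenized_rem_line(10, " omg goon rush")

# Read the zero-terminated string that starts at $0806.
offset = 0x0806 - BASIC_START
stored = line[offset:line.index(0, offset)]
print(stored)
```

The five bytes before the text (link, line number, REM token) are what push the string to $0806, which is why the REM has to be the very first line of the program.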

Continuing on, we see an INC statement, this time to location 646. That cycles the cursor (text printing) color.

Next we have another JSR, up to a real high place in memory (65508). This is one of a number of routines known as KERNAL routines. Unlike jumping into the BASIC interpreter, using the KERNAL routines was encouraged and was indeed the recommended approach to doing I/O in machine language. These routines were fully documented and explained in the C64 Programmer's Reference Manual; this one checks to see if a key has been pressed.

After that, we see a 'BEQ' statement. Most statements starting with 'B' are a branch, which is similar to a JMP. Branches test a particular bit in the Status Register and only jump when that condition is true. In this case, it's checking for 'equal', which really means the zero flag is set. The get char routine hands back a zero when no key is waiting, so if there had been a keypress then the result would not be equal, and we'd want to exit. But if no keys were pressed, we loop around.

This looked like another fairly short loop of code. But it takes advantage of the built-in routines to accomplish most of its work. That also makes it slightly slower than the previous one (outputting text is always slow), but it's still faster than the BASIC would be.

Incidentally, we can also look at the disassembly of those subroutines we're jumping into. I'm not going to analyze them, but here's the BASIC one. You can see that it involves effectively looping through (the string) and also calling more subroutines. The 'RTS' statement is a 'Return from Subroutine' that is required after a JSR was used to arrive at this piece of code.

code:

 43806  jsr 46215
 43809  jsr 46758
 43812  tax
 43813  ldy   # 0
 43815  inx
 43816  dex
 43817  beq    43751
 43819  lda ( 34 ),y
 43821  jsr 43847
 43824  iny
 43825  cmp   # 13
 43827  bne    43816
 43829  jsr 43749
 43832  jmp---> 43816
 43835  lda   19
 43837  beq    43842
 43839  lda   # 32
 43841  bit 7593
 43844  bit 16297
 43847  jsr 57612
 43850  and   # 255
 43852  rts
--------------------
 43853  lda   17
 43855  beq    43874
 43857  bmi    43863
 43859  ldy   # 255
 43861  bne    43867
 43863  lda   63
 43865  ldy   64
 43867  sta   57
 43869  sty   58
 43871  jmp---> 44808
 43874  lda   19
 43876  beq    43883
 43878  ldx   # 24
 43880  jmp---> 42039
type c for 43883
? -1

You can always try jumping into these routines at any time by using the SYS command. But you won't be able to set up the registers from BASIC, so what happens is somewhat unpredictable. If you type SYS 43806 you most likely will get something output as a string and not a crashed machine, but you never know.

Next time we'll get to Project Atrocity, which is long enough to let us mess around with it, and also will let us show off a few other features of the disassembler.

The Let's Play Archive

Compute!'s Gazette

by Chokes McGee
