The Let's Play Archive

Compute!'s Gazette

by Chokes McGee

Part 28: ManxomeBromide - BASIC Loaders & Machine Language Programs

Bonus Update Corner

Hex War, it turns out, is actually two programs: Hex War itself and a smaller program called Hex War Loader. Both are predominantly BASIC and quite possibly 100% BASIC.

Why two files? The issue is that the graphics chip (the VIC-II) can only "see" 16KB of RAM at a time, and 4KB of that is taken up by the character ROM overlaid on the RAM. A BASIC program normally lives in that same first 16KB, so if you wanted to use the character ROM and also wanted room for a bunch of graphical stuff, there wasn't really enough space left to do anything.

I'm going to cede the floor to FredMSloniker again, from earlier in the thread when he was talking about MLX and what I had to do to make Richtofen's Revenge work, because it's relevant here:

FredMSloniker posted:

To go into more detail: the first version of MLX stored what you were entering in the memory it would ultimately be loaded into.... this meant you had to load the program with the command 'LOAD "filename", 8, 1', then run it with 'SYS whatever'. Which was a bit inconvenient, especially if you forgot the '1' at the end of the LOAD command, which told BASIC to pay attention to the file header giving an address to load the file into. Without it, the program would be loaded into BASIC memory (starting at 2049, or $0801), which would cause all kinds of problems.

If you wanted to enter a program like that using MLX, though, you had to go through some shenanigans to load the BASIC program somewhere safe and tell the computer where to find it.

MLX had to be loaded somewhere higher up in memory so that I had room to enter in Richtofen's Revenge. Hex War needs to be loaded somewhere higher up in memory so that the graphics chip has room to play. The "shenanigans" in question involve altering some pointers that BASIC keeps around so that it isn't, strictly speaking, bound to the C64's default memory layout. The start of BASIC programs is a pointer in RAM, and there are also pointers for the end of the program code, and the start/end of normal variables, arrays, and strings. Strings in BASIC were fully garbage collected, too; it turns out you can do a lot in 8K!
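To make those pointers a little more concrete, here is a small Python model of them. The zero-page addresses (43 through 56) and the power-on values come from the standard C64 memory map rather than from this post, and the helper names (`peek16`, `poke16`, `describe`, `power_on_memory`) are made up for illustration. Treat it as a sketch, not a description of what the ROM does internally.

```python
# Zero-page addresses of the pointers BASIC V2 consults (standard C64 memory map).
# Each pointer is 16 bits, stored low byte first.
BASIC_POINTERS = {
    "start of program": 43,     # TXTTAB
    "start of variables": 45,   # VARTAB, which is also the end of the program text
    "start of arrays": 47,      # ARYTAB
    "end of arrays": 49,        # STREND
    "bottom of strings": 51,    # FRETOP
    "top of BASIC memory": 55,  # MEMSIZ
}

def peek16(mem, addr):
    """Read a little-endian 16-bit value, like PEEK(addr)+256*PEEK(addr+1)."""
    return mem[addr] | (mem[addr + 1] << 8)

def poke16(mem, addr, value):
    """Write a little-endian 16-bit value."""
    mem[addr] = value & 0xFF
    mem[addr + 1] = (value >> 8) & 0xFF

def describe(mem):
    """Return every modeled pointer by name."""
    return {name: peek16(mem, addr) for name, addr in BASIC_POINTERS.items()}

def power_on_memory():
    """Model of the pointers right after power-on, with no program loaded."""
    mem = bytearray(65536)
    poke16(mem, 43, 0x0801)
    for addr in (45, 47, 49):
        poke16(mem, addr, 0x0803)
    for addr in (51, 55):
        poke16(mem, addr, 0xA000)
    return mem

mem = power_on_memory()
print(describe(mem)["start of program"])   # 2049
mem[44] = 64                               # change only the high byte
print(describe(mem)["start of program"])   # 16385
```

Changing one byte is enough to move the whole program area, which is why the pointers are so convenient to abuse.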

But if we want a loader program, we have a problem. If we mess with BASIC's pointers, we're in trouble because we are a BASIC program. We're trying to yank the rug out from underneath ourselves! How to solve this?

To answer that, let's look at HEX WAR LOADER in its entirety:

10 DV=8:Q$=CHR$(34)
20 PRINT"{CLR}POKE 44,64:POKE 16384,0:NEW"
40 POKE 198,3:POKE 631,19:POKE 632,13:POKE 633,131

Lines 10-30 are printing out the necessary commands on-screen, with space for BASIC to reply. Line 40 is then stuffing the keyboard buffer with HOME, ENTER, and SHIFT-RUN/STOP so that after returning to BASIC you end up executing these commands. The first set of commands tells BASIC that it should be loading programs into location 16385, completely out of the graphics chip's range. But what does SHIFT-RUN/STOP do?

FredMSloniker posted:

Heck, if you stored it on tape, all they had to do was push SHIFT and the RUN/STOP key, which would automatically LOAD and RUN the next thing on the tape!

SHIFT-RUN/STOP was the same thing as typing LOAD<ENTER>RUN<ENTER>, so they used it here to save some space.
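The loader's POKEs can be replayed against a toy model of memory to see what they leave behind. This is a Python sketch, not an emulator: `mem`, `poke`, `start_of_basic`, and `pending_keys` are invented names, the NEW command is not modeled, and line 30 is left out because it isn't shown in the listing.

```python
mem = bytearray(65536)
mem[43], mem[44] = 0x01, 0x08      # default start of BASIC: $0801 (2049)

def poke(addr, value):
    """Same effect as BASIC's POKE on the model memory."""
    mem[addr] = value

def start_of_basic():
    """Where BASIC believes programs begin: PEEK(43)+256*PEEK(44)."""
    return mem[43] + 256 * mem[44]

# The command line 20 prints, once BASIC executes it:
poke(44, 64)       # high byte becomes $40, low byte stays 1 -> $4001
poke(16384, 0)     # BASIC wants a zero byte just before the program

# Line 40: three characters waiting in the keyboard buffer.
poke(198, 3)       # 198 holds the number of pending characters
poke(631, 19)      # the buffer itself starts at 631
poke(632, 13)
poke(633, 131)

KEY_NAMES = {19: "HOME", 13: "ENTER", 131: "SHIFT-RUN/STOP"}

def pending_keys():
    """Decode whatever is currently queued in the keyboard buffer."""
    return [KEY_NAMES.get(mem[631 + i], hex(mem[631 + i]))
            for i in range(mem[198])]

print(start_of_basic())            # 16385
print(start_of_basic() // 16384)   # 1, i.e. outside the first 16KB
print(pending_keys())              # ['HOME', 'ENTER', 'SHIFT-RUN/STOP']
```

HOME puts the cursor back on the printed line, ENTER makes BASIC execute it, and the third character kicks off the load.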

Can we skip the loader?

We kind of can! We go back to that post I keep quoting:

FredMSloniker posted:

If you started the machine language file, not with machine language, but with a short BASIC program that only used the SYS function to call the actual machine language, you had all of BASIC storage (just under 38k) to play with. Plus you could just load the program with 'LOAD "filename", 8' and run it with 'RUN'.

That short BASIC program, incidentally, is this:

10 SYS 2062

It's loaded into $0801 (2049) and occupies everything up to memory location 2061, so the machine language can start right at 2062. We could totally write a machine language program that has that at the front, and then a little code that copies the Hex War program into $4001, sets the pointers as needed, and then hands off control by stuffing the keyboard buffer with "RUN<ENTER>" and quitting out.

That looks like this, more or less. (I'm using xa65 here as my assembler, both because it actually ships with Ubuntu for some insane reason and because it has a directive that lets me embed other program files really easily.)

EDIT: OK, the actual code paste is way too long. See here if you actually care.

Part one is the loading address and that short BASIC program as it's saved to disk. Since the line number is a 16-bit value, we can change it to whatever we want within reason and the addresses still work out.
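The byte layout of that first part can be worked out by hand, and a few lines of Python make the arithmetic easy to check. This is a sketch under some assumptions: the function name `basic_stub` is made up, the stub is assumed to contain a space between SYS and the number (as in the listing above), and $9E is the token BASIC V2 uses for SYS.

```python
SYS_TOKEN = 0x9E   # tokenized form of the SYS keyword in BASIC V2

def basic_stub(sys_target, line_number=10, load_address=0x0801):
    """Build the bytes of a one-line 'SYS nnnn' program as saved to disk.

    Returns (prg_bytes, first_free_address).
    """
    # Line body: SYS token, a space, the decimal digits, a zero terminator.
    body = bytes([SYS_TOKEN]) + b" " + str(sys_target).encode("ascii") + b"\x00"
    # Each line starts with a pointer to the next line, then the line number.
    next_line = load_address + 2 + 2 + len(body)
    line = (next_line.to_bytes(2, "little")
            + line_number.to_bytes(2, "little")
            + body)
    end_marker = b"\x00\x00"            # a null "next line" pointer ends the program
    prg = load_address.to_bytes(2, "little") + line + end_marker
    first_free = load_address + len(line) + len(end_marker)
    return prg, first_free

prg, first_free = basic_stub(2062)
print(prg.hex())      # 01080c080a009e2032303632000000
print(first_free)     # 2062

# The line number occupies two bytes whatever its value, so the layout is unchanged.
print(basic_stub(2062, line_number=1985)[1])   # 2062
```

Drop the space after SYS and everything after it shifts down by one byte, so the target would have to become 2061.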

Then we copy the memory into place, which is one of those My First Assembler routines and is both deeply weird and second nature after some experience.

Then we set the pointers, including a few extra ones that the LOAD command did for the BASIC case.

Finally we stuff the keyboard buffer and get out.

Oh, right, and we also have to put the program in at the end.
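Since the assembly itself isn't pasted here, the following is a Python model of the steps just described, not a transcription of the real code. The function names, the choice of which pointers to set (45/46, 47/48, and 49/50 all pointed at the end of the program), and the forward copy direction are assumptions.

```python
def poke16(mem, addr, value):
    """Write a little-endian 16-bit value."""
    mem[addr] = value & 0xFF
    mem[addr + 1] = (value >> 8) & 0xFF

def copy_pages(mem, src, dst, length):
    """Copy forward in 256-byte pages, the shape an 8-bit index register forces.

    Assumes the two regions do not overlap; overlapping regions with dst > src
    would need a backward copy instead.
    """
    pages, remainder = divmod(length, 256)
    for _ in range(pages):
        for y in range(256):          # Y register runs 0..255
            mem[dst + y] = mem[src + y]
        src += 256                    # bump the high bytes of both pointers
        dst += 256
    for y in range(remainder):        # final partial page
        mem[dst + y] = mem[src + y]

def install(mem, src, length, dst=0x4001):
    """Copy the program into place, set the pointers, and queue RUN<ENTER>."""
    copy_pages(mem, src, dst, length)
    mem[dst - 1] = 0                  # zero byte just before the program
    poke16(mem, 43, dst)              # start of BASIC program
    for pointer in (45, 47, 49):      # what LOAD would have set up for us
        poke16(mem, pointer, dst + length)
    for i, ch in enumerate(b"RUN\r"):
        mem[631 + i] = ch             # keyboard buffer
    mem[198] = 4                      # four characters pending

# Toy run: 600 bytes of stand-in "program" sitting right after the loader code.
mem = bytearray(65536)
payload = bytes((i * 7) & 0xFF for i in range(600))
mem[0x0900:0x0900 + 600] = payload
install(mem, 0x0900, 600)
print(mem[0x4001:0x4001 + 600] == payload)   # True
```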

If we assemble and run this... well, it explodes. This is for two reasons.

First, because we yanked the carpet out from under ourselves, even though we were in machine language; RTS jumps back into BASIC from the SYS command that called us, and that's "that line that started at memory location 2049". Great, but BASIC programs start at 16385, thanks to our meddling. Instead of returning to the calling program we basically have to intentionally crash out. BASIC provides a routine for that, as it happens, so we replace rts with jmp ($a002) and all is well.

Then it explodes because it turns out that the command LOAD "WHATEVER",8 does a ton of extra processing on the file as it loads it. In particular, the lines of a BASIC program form a linked list, and while those addresses are saved to disk, if the program's base address has changed between time-of-save and time-of-load, BASIC goes and fixes up all the addresses. Dumping the binary at the end of our program doesn't do that.

The easiest way to fix that is to run the loader program without the last line, run those two commands it prints out by hand, and then re-SAVE the freshly loaded Hex War to disk. That saves it with a base address of 16385, and all the addresses work out fine.
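The address fix-up that LOAD performs can be sketched in Python. This is a simplified model, not the ROM routine: `relink` and `make_program` are invented names, and the model stops at the 00 00 end marker. The two token values used in the demo ($99 for PRINT, $89 for GOTO) are the standard BASIC V2 tokens.

```python
def make_program(lines, base):
    """Lay out tokenized lines as they would sit in memory starting at `base`."""
    out = bytearray()
    for number, body in lines:
        # Pointer to the next line: past this link, line number, body, terminator.
        next_line = base + len(out) + 2 + 2 + len(body) + 1
        out += next_line.to_bytes(2, "little")
        out += number.to_bytes(2, "little")
        out += body + b"\x00"
    out += b"\x00\x00"                  # null link marks the end of the program
    return out

def relink(image, base):
    """Rewrite every next-line pointer for a program now living at `base`."""
    pos = 0
    while image[pos] | image[pos + 1]:          # stop at the null link
        terminator = image.index(0, pos + 4)    # skip link and line number
        nxt = terminator + 1
        image[pos:pos + 2] = (base + nxt).to_bytes(2, "little")
        pos = nxt
    return image

lines = [
    (10, b"\x99\"HI\""),    # 10 PRINT"HI"
    (20, b"\x89 10"),       # 20 GOTO 10
]
saved_at_0801 = make_program(lines, 0x0801)
moved = relink(bytearray(saved_at_0801), 0x4001)
print(moved == make_program(lines, 0x4001))     # True
print(moved == saved_at_0801)                   # False
```

Re-SAVEing from the new location bakes the corrected pointers into the file, which is why the embedded copy then works without any fix-up code.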

Congratulations! You now have a single machine language program that, when run, transmutes itself into a BASIC program somewhere that BASIC programs aren't really intended to be and reconfigures the machine to think that all of this is perfectly normal. It's also now basically impossible to edit or examine, so this was perhaps not the greatest use of your time. But it is one file now!

It makes way more sense to do this kind of thing with machine language programs that live in inconvenient locations, really; all you need is the BASIC stub and the copy loop, and then everything else we've talked about here basically turns into something like jmp $c000. That's what got added to Astro-PANIC so that it could be run with RUN.