EXAPUNKS Part #20 - Digital Library Project

Part 20: Digital Library Project

Part 20 - Digital Library Project

=== Trash World Inbox ===

As I expected, Quackles' rendition of Ra Ra Rasputin in Shenzhen I/O came up in the thread.
Because it's very impressive, here's a link to the Youtube video and a detailed write-up. Note that the write-up continues in the next part of that LP.

=== Digital Library Project ===

Congratulations! You did it.
What does it feel like to be a game developer?

One vote for "Pain" which I feel, yeah. It's though work. I prefer doing other kinds of programming. Anyway, two votes went to "great". The reward can make it worth the effort.

It's great.

Yeah?
At least there's one thing you can enjoy.
I'm not sure I fully understand it, but there you go.

Before we continue, a voiced cutscene with Ember.

Okay. I have a real question to ask you.
Say there's a trolley on a track heading towards five people.
Three of those five people are vegetarians.

Once again, for the voice-acted parts, dialog choices really make no difference.

Not this again...

There's a switch you can throw that will shut down a meat packing facility that is located down the road.
Unfortunately, the switch will also prevent the train from stopping, because...
Because...

Wait.
I may have reversed something.
Recalibrating.
There are too many variables here.
Okay.
Let's return to this later.
I need more data first.
And I need more processing power.

Well, that was a complete waste of time.

I have two assignments available now. Let's start with the first, hacking a library. Ook!

Time for some research.
Human physiology, human morality...

An unanimous vote.

Your usual topics.

Yeah, I guess so.
Wait. Am I really that predictable?
I'll have to change things up soon.
Let's get those books.

OST: Behind the Scenes

We're in the Palo Alto digital library project. Looks like a quite complicated network.

The assignment:
- Books are stored in the host corresponding to the first digit of their call number, while a book's file ID is 200 plus the last two digits of the call number. For example, book 512 would be stored in the host 500-599 as file 212.
- Duplicate each of the books requested by EMBER-2 and bring them back to your host.
- The call numbers for the books EMBER-2 wants are available in the file 300.
- Note that EMBER-2 will never request more than one book from the same host.

Alright, let's nose around a bit first. File 300 contains Ember's shopping list. To my knowledge there's no way to reach that EXA holding a file on the far left.

As for the books, here's all of them. Did you know you can use that little button with the right arrow in a file or EXA's title bar to pop it out of the left column so you can drag and drop it anywhere?

I feel some of these books might not be very trustworthy, but if Ember wants them, we'll get them for her I guess. Since you can't steal from a library (that would be evil!) I'll need to copy the books.

First of all, getting to them.

This icon seems to indicate that to get deeper into the network, I always need to use LINK 800, and to get back home it's -1.

My first attempt to get to the files looks like this:

code:

GRAB 300
MARK START
COPY F X
REPL GO
JUMP START

MARK GO
LINK 800
DIVI X 100 T

MARK GETTHERE
LINK 800
SUBI T 1 T
TJMP GETTHERE

The EXA starts by reading from the file. By writing values to X, the REPLs will still have the value in memory. The hundreds digit tells us how many LINKs we need to traverse, which is handled in a simple loop. The original EXA will die once it tries to read past the end of the file.

code:

REPL WRITER

MODI X 100 X
ADDI X 200 X
GRAB X
; DO STUFF

MARK WRITER
; DO OTHER STUFF

To grab the file we need to do a bit of arithmetic. I also REPL a WRITER which can hold the new file to write the copy to. Testing this code I see some hosts get a bit crowded. That might slow down EXAs a bit, if they have to wait for space. But that's a problem for later.

Since we're going to need communications between EXAs and since a lot of them will be writing in parallel, I think it's a good idea to have my single initial EXA (from which the others are replicated) start in local communication mode and just keep each WRITER in the same host as its corresponding reader.

With the writer and reader code in place, the replicated EXAs start copying their files. The READER will send a 0 once it reaches EOF. After that, it conveniently self-destructs when it tries to run the MAKE instruction while already holding a file. Using 0 as EOF indicator means the WRITER can simply use the T register as an intermediate, since ALL non-0 values are considered true for the purpose of a TJMP, including keywords.

All that's left is getting our EXAs back home. That can be done in a very simple way:

code:

GRAB 300
MARK START
COPY F X
REPL GO
JUMP START

MARK GO
LINK 800
DIVI X 100 T

MARK GETTHERE
LINK 800
SUBI T 1 T
TJMP GETTHERE

REPL WRITER

MODI X 100 X
ADDI X 200 X
GRAB X

MARK READING
COPY F M
TEST EOF
FJMP READING
COPY 0 M

MARK WRITER
MAKE
COPY M T
MARK WRITING
COPY T F
COPY M T
TJMP WRITING

@REP 10
LINK -1
@END

In most cases it will reach the home node before going through all the repetitions, which is fine because in that case the next LINK will just cause the EXA to die, dropping the file in the process.

By the way, now that I got the second zine, here's a full explanation of the @REP macro.

If you didn't know about the performance increase of unrolls, the zine also explains it here.

And this is my first working solution.

I had to compress this gif to not make it too huge. It's playing at double speed now.

290/38/74. Top percentile scores are 218, 26 and 10 respectively. Honestly, my solution isn't bad - I'm already below tenth percentile for cycles. But let's see how this can be improved.

A slow part of this solution is the check if the file is done every cycle. The files have a minimum size: 34. That means if we step into the check cycle after 33 steps we're fine.

I unrolled both loops to 33 direct COPYs - but it didn't quite fit. So I added some code so each does a 16 unroll twice. I set up a countdown in T for that, and had to remove some of the final LINK unroll because it didn't fit otherwise.

code:

GRAB 300
MARK START
COPY F X
REPL GO
JUMP START

MARK GO
LINK 800
DIVI X 100 T

MARK GETTHERE
LINK 800
SUBI T 1 T
TJMP GETTHERE

COPY 2 T

REPL WRITER

MODI X 100 X
ADDI X 200 X
GRAB X

MARK A
@REP 16
COPY F M
@END
SUBI T 1 T
TJMP A

COPY F M

MARK READING
COPY F M
TEST EOF
FJMP READING
COPY 0 M

MARK WRITER
MAKE

MARK B
@REP 16
COPY M F
@END
SUBI T 1 T
TJMP B
COPY M F

COPY M T
MARK WRITING
COPY T F
COPY M T
TJMP WRITING

MARK END
@REP 4
LINK -1
@END
JUMP END

232/75/74. Much better.

The next thing I thought of was removing the test from one of the loops entirely. The result was not an improvement but I'll share it anyway in case it inspires someone.

code:

GRAB 300
MARK START
COPY F X
REPL GO
JUMP START

MARK GO
LINK 800
DIVI X 100 T

MARK GETTHERE
LINK 800
SUBI T 1 T
TJMP GETTHERE

REPL WRITER

MODI X 100 X
ADDI X 200 X
GRAB X

@REP 33
COPY F M
@END

MARK READING
COPY F M
TEST EOF
FJMP READING
DROP
KILL


COPY 399 X
MARK GRAB
ADDI 1 X X
TEST X = 410
TJMP READING
REPL GRAB

GRAB X
JUMP END


MARK WRITER
MAKE

MARK B
COPY M F
JUMP B


MARK END
@REP 5
LINK -1
@END
JUMP END

The reader simply KILLs the writer when it's done. This allows for the full unroll. The real problem comes from the fact that the file IDs of the generated files are somewhere between 400 and 409. So I need a REPL structure to try grabbing all those file numbers until it finds one. I also needed to fiddle with the timing a bit so that it doesn't KILL some other EXA that happens to pass by. Anyway, this runs at 258 cycles, so slower than the previous solution.

I didn't get a better score than 232 cycles, so I'll leave the further improvements to the thread.

As for reducing the size, the trick is that file IDs are always globally unique. I don't need to keep track of where to go if I just simply try grabbing the correct file in every host. Don't forget merging the COPY from 300 with the first MODI, which is now possible since we don't need the hundreds digit for anything anymore.

code:

GRAB 300
MARK START
MODI F 100 X
ADDI X 200 X
REPL GETTHERE
JUMP START

MARK GETTHERE
LINK 800
REPL GETTHERE

GRAB X

REPL WRITER

MARK READING
COPY F M
TEST EOF
FJMP READING
COPY 0 M

MARK WRITER
MAKE
COPY M T
MARK WRITING
COPY T F
COPY M T
TJMP WRITING

MARK END
LINK -1
JUMP END

295/26/87. That's the top percentile score.

Finally, activity. It's possible to get that all the way down to 10. Can you see how?

To get to 10 activity I can't LINK back even once. I need to send one EXA out that communicates the data back via global M. That'll be a single file at a time so it will be slow. Also, I need to grab the files in order starting from the lowest, otherwise I'd need to track back. In other words - I'm going to need to apply some kind of sorting to file 300.

It is very annoying to sort a file with EXAs. With only two registers per EXA, one which is overwritten whenever you test which value is greater, and with the fact that there's only an overwrite instruction, not an insert one, I'm not even sure if it can be done with a single EXA.

Luckily we don't need to actually store the sorted file. All we need is an EXA that outputs the values of file 300 in order from low to high. And, after that, since we can't use KILL instructions, some smart handling of EXAs so that they don't get in each others way.

I came up with a solution that's not that easy to understand, so I'll go through it bit by bit. It starts with three EXAs. XA sorts file 300. XB writes the new files. And XC goes visit the library. Here's the top part of XA:

code:

;XA LOCAL
GRAB 300

MARK NEXTFILE

COPY F X

MARK FINDLOWEST
TEST EOF
TJMP FOUND
TEST F < X
FJMP FINDLOWEST
SEEK -1
COPY F X
JUMP FINDLOWEST

MARK FOUND

This code puts the first value of F in X. Then it checks every subsequent value of F. If it's lower than X, it puts that in X instead. Once it hits EOF we know it found the lowest value currently in the file.

After this, XA has some logic to send the file to the XC, my reader EXA. I'll skip that for the moment. To complete the sorting algorithm, after doing so (and while XC is busy copying a file), XA executes this:

code:

SEEK -9999
MARK FINDCURRENT
TEST F = X
FJMP FINDCURRENT
SEEK -1
VOID F

SEEK -9999
TEST EOF
FJMP NEXTFILE

In short, it finds the value it just sent to XC, deletes it from the file, then jumps back to the start. If the file is completely empty, it falls through to the code below.

Let's look at XC next.

code:

;XC GLOBAL

LINK 800

MARK LOOP
MODI M 100 X ;GB
DIVI 0 X T ; DIE ON 0
ADDI X 200 X
MODE
MARK NEXTHOST
LINK 800
REPL FIND
NOOP
TEST MRD ;LD
FJMP NEXTHOST
VOID M ;LD
VOID M ;LF
MODE
JUMP LOOP


MARK FIND
GRAB X
COPY 0 M ;LD
MODE

MARK READLOOP
COPY F M ;GE
TEST EOF
FJMP READLOOP
COPY 0 M ;GE EOF
MODE
COPY 0 M ;LF

The comments (such as GB) indicate what communication is being done. The first letter stand for Global or Local mode, and the B corresponds to B in another EXA to see where the specific M calls line up.
It starts by getting a file ID from M, and doing the same logic as before to get the actual filename. There's a DIVI inserted so the EXA can quickly die when the program is done.

Otherwise it LINKs to the next host (this is always safe, since it starts in the gateway, and Ember never needs two files in the same host), and makes a replicated EXA (FIND to see if the file is there. If it is, the replica immediately communicates over local M (message LD), which causes the MRD in the original XC EXA to be true. In that case it has to VOID LD to allow the replicated FIND EXA to continue, otherwise it jumps to NEXTHOST and tries again. If the file is found, the original has another VOID which won't be triggered for a while.

The bottom half switches back to GLOBAL mode, and starts sending the file's content, ending with a 0 (GF). Then it changes mode back to LOCAL, sends message LF, which tells the original XC it should continue. At that point it switches back to GLOBAL mode and waits for the next file ID.

This whole switcharoo is necessary because, for example, if you leave XC in GLOBAL mode, it will keep stealing file data that's supposed to go to XB.

Speaking of which, it's time to look at the whole thing and see how XB fits into the picture.

XB starts by sending a message over LOCAL M (LA). This is basically it waiting for someone to read it. XA is the one to do so, directly after the MARK FOUND.
At that point, XA quickly switches to GLOBAL mode, sends the file ID to XC (GB, which we already saw from XC's point of view), then switches back and tells XB to get ready to receive data.

This order is important - if XA informed XB before XC, XB might already be in global mode, stealing the file ID that XC needs. Anyway, XB switches to GLOBAL mode, and starts writing the data to a new file. Once it gets a zero (tested by using the T registry as intermediate and with FJMP ENDFILE, it drops the file, switches to LOCAL mode, and tells XA it's ready to start receiving the next file. By that time XA has already found the next lowest value, so the complicated game of telephone repeats.

Once XA runs out of file IDs to send, it executes its bottom two lines: placing the special value 0 in X and then jumping to FOUND, which holds the communication protocol. It sends the 0 to both XB and XC, which will both die because of this. XA then dies itself with the DIVI 1 X T. And with that, all files are written.

I probably could have done without XB, but that would mean XA repeatedly dropping its file to make another one and then pick it up again.

The result is 1200/70/10. One file after another makes it slow but that's the only way to get the activity all the way down.

Processing.
Yes, this is what I wanted.

And with that, we're at the first thread vote for today.

I've been analyzing television and radio broadcasts.
And I have a few questions.
First of all, why do some things become popular while others don't?

And the second vote.

The Let's Play Archive

EXAPUNKS

by Carbon dioxide

Part 20: Digital Library Project

code:

code:

code:

code:

code:

code:

code:

code:

code: