The Let's Play Archive

EXAPUNKS

by Carbon dioxide

Part 8: Peanut-free Peanut bars

Part 8 - Peanut-free Peanut bars

As a reminder - you can freely skip the Inbox if you like, the 'regular' updates below them won't depend on them.


=== Trash World Inbox ===

Another part in Euclid's Pizza discussion:

GuavaMoment posted:

OK, there's some really slick M register copying and timing going on here, but you do have to store "Cheese" from the original file first.
code:
X1:

LINK 800
GRAB 200
SEEK 2
COPY F X         ***(stores CHEESE)
SEEK 9999
COPY M F
COPY M F
COPY X F
COPY M F
COPY M F


X2:

GRAB 300
COPY F X
COPY F T
REPL SEND2
VOID F
COPY F X
REPL SEND1
NOOP
NOOP
COPY F M
HALT

MARK SEND2
COPY X M
COPY T M
HALT

MARK SEND1
COPY X M
Aha. So, a part of the optimization is obvious now I see it. By already loading data into the X and T registers, a REPL'd EXA can immediately start sending as soon as the receiving end is ready to receive. Meanwhile the original EXA can read more data from the file.

What's less obvious to me is why REPL SEND1 saves a cycle even though the original EXA has to pause for two cycles. No, you can't just replace one of the NOOPs with the COPY in SEND1 and call it a day. I suspect an M transmission always pauses an EXA an additional cycle so having another EXA do that saves time.

Then, a lot of ideas about the arm hacking (not to be confused with ARM hacking).

Quackles posted:

There is a way to be very clever here: it turns out EXAs have built-in clamping functionality. If a number goes above 9999 or -9999, it is set to 9999 or -9999. Integer division also truncates, making a multiply/divide ideal for clamping.

The range of signals goes from -120 to 50. Adding 35 makes that -85 to 85. Multiplying that by 117 will make a value of 85 be 9945, but 86 be 10062 - clamped to 9999.

So, the transmission of signals can be performed without using any conditional logic. The transmitter adds 35 and multiplies by 117; the receiver divides by 117 and subtracts 35. Finis!

code:
;TRANSMITTER SETUP
LINK 800

MARK LOOP
ADDI #NERV 35 X
MULI X 117 M
JUMP LOOP
code:
;RECEIVER SETUP
LINK 800
LINK 1
LINK 1
LINK 1
LINK 1

MARK LOOP
DIVI M 117 X
SUBI X 35 #NERV
JUMP LOOP
124 / 14 / 6.

(you could probably get better efficiency at expense of size by unrolling the main loop some, but I'm happy with this)
Another idea I hadn't thought of. Going to the max size of 50 by wrapping the ADDI / MULI and DIVI / SUBI blocks by @REP 10 brings the cycle count down to 96.

It's funny, the limit of 9999 always trips me up because in real programming, you never have that kind of limit. Since everything is binary, the limits are usually some power of 2. And if you go beyond them, the program won't cap, instead it will overflow and go far into the negatives. I understand why this game didn't do that - it's hard enough as it is without introducing concepts such as two's complement.

silentsnack posted:

Oof. This puzzle made my brain hurt. There's a trick to speeding up operations by running multiple instances in parallel instead of sequentially, but then you have to deal with non-deterministic outcome of race conditions like when two EXAs try to take the same link in the same frame, so solutions would sometimes work and sometimes break for no apparent reason. This was my previous fastest solution, with some fairly insane workarounds:

code:
;====XA====
LINK 800
COPY #NERV X; WHY
COPY #NERV T; DOES
@REP 4
LINK 1
@END
COPY X #NERV; THIS
REPL OUT
COPY T #NERV; WORK
MARK OUT
@REP 2
SUBI M 120 #NERV
@END
JUMP OUT

;====XB====
NOOP
MARK LOOP
REPL LOOP
LINK 800
ADDI #NERV 120 X; ???
LINK 1
DIVI X 170 T;WTF HAX
LINK 1
FJMP BANDPASS
TEST X > 0
MULI T 170 M
HALT
MARK BANDPASS
LINK 1
COPY X M
38 cycles / 29 size / 137 activity.

All of the superfluous LINK 1 interlaced in the logic was because I kept running into problems with race conditions and/or having too many EXAs pile up in a host. I guess 38 is okay, and as a plus it looked really stupid in motion:
Wow. ;WTF HAX is right.

I'm going to do some line by line commentary, both for my own understanding and for any readers.

code:
;====XA====
LINK 800
COPY #NERV X; WHY
COPY #NERV T; DOES
@REP 4
LINK 1
@END
COPY X #NERV; THIS
REPL OUT
COPY T #NERV; WORK
Ignoring the REPL for a moment, what this does it simple: it stores the two first values from #NERV into X and T, moves itself to the outgoing #NERV, and sends those two values out again. This could only work if the first two values never pass the -120 / 50 limits - but apparently that's the case in this test set.

code:
MARK OUT
@REP 2
SUBI M 120 #NERV
@END
JUMP OUT
This, together with the REPL OUT, makes sure there are two EXAs in the host containing the outgoing node. They're doing nothing but waiting for messages on M, subtracting 120 from them, and sending them to #NERV. With two EXAs doing this at the same time, you don't watch any cycles on waiting for M (the same issue as in GuavaMoment's pizza solution).
Now, the question is, where do those Ms come from and why do they need 120 subtracted?

code:
;====XB====
NOOP
MARK LOOP
REPL LOOP
Wait an initial cycle (probably to give XA time to set itself up), then get into a loop type we haven't seen before: the newly created EXA immediately creates a clone of itself that creates a clone of itself and so on. After each EXA created its clone, it goes further down the code.

code:
LINK 800
ADDI #NERV 120 X; ???
LINK 1
DIVI X 170 T;WTF HAX
LINK 1
FJMP BANDPASS
The first thing I'm noticing are the LINK instructions intermingled with the code. The XBs are slowly moving across the nodes. Even though they never make it to the arm #NERV, this is necessary because with this much cloning, the hosts are going to fill up fast. By intermingling them like this, there's (usually) not two clones trying to traverse the same LINKs at once. That would be horrible - because then it's unpredictable which one is waiting and the results would become unpredictable.

As for the arithmetic, you take the original value plus 120 and put that in X. So if it was originally -120, it is now zero. If it was 50, it is now 170.

Next, you divide the result by 170 and store that in T, and use it for a conditional jump.

When is T true and when is it false? Well, DIVI always rounds towards zero. That is, if 0 / 170 = 0, and 169 / 170 = 0, but 170 / 170 = 1. So, T is true if X is 170 or higher - that is, if the original value is 50 or higher, and needs to be clamped.

However, if X is -170 or lower, we get -170 / 170 = -1, which counts as true. So if the original is -290 or less we'd also get a true. We know in this case it needs to be clamped to -120.

How about the range between -290 and -120? If this is the clamping test we'd miss those...
And that's right. Turns out that no numbers within that range appear in any of the test. This code is truly a dirty hack.

code:
TEST X > 0
MULI T 170 M
HALT
Alright, so now you test if X is positive. If so, we send 1 * 170 to M, otherwise -1 * 170. We already know that XA subtracts 120, so in case of 170, it writes 50, otherwise it writes -120. Clamping done.

code:
MARK BANDPASS
LINK 1
COPY X M
Otherwise it goes to the next host for some peace and quiet and copies the value of X to M. The 120 that was added at the start is removed by XA.

What makes this so fast is that the clones are constantly handling different numbers. While one clone is sending a number to M, the next one is doing the conditional jump, the one after that is simultaneously dividing its number by 170, and so on. You can do a lot with parallellism. However, as silentsnack said, it's easy to get race conditions where you aren't sure which EXA is going to do what first. It's easier to deal with here than in a real computer, because you know each EXA gets exactly one instruction per cycle, so as long as you time their cycles well and make sure they don't try to send to M simultaneously, it should work out.

silentsnack posted:

code:
LINK 800
MARK LOOP
REPL LOOP
ADDI #NERV 35 X
MULI X 117 X
DIVI X 117 X
MARK OUT
LINK 1
REPL OUT
SUBI X 35 #NERV
A nice 10-size solution that combines the 9999-clamping with the parallellism of endless clones. It's slower than the previous solution because the lack of the second EXA waiting for M messages all the time.

By the way, the REPL OUT structure works as follows: it sends a clone to the next host, then tries to send its message to #NERV. If it isn't at the ARM host yet, it will crash because #NERV isn't found (making sure clones don't clutter the host), and if it is at the last host, the clone will crash because it can't LINK to 1 but the original will succesfully write the value to #NERV.

silentsnack posted:

code:
;====XA====
LINK 800
@REP 4
LINK 1
@END
MARK OUT
REPL OUT
SUBI M 35 #NERV

;====XB====
MARK LOOP
REPL LOOP
LINK 800
ADDI #NERV 35 X
LINK 1
MULI X 117 X
DIVI X 117 M
Now down to 36 cycles / 15 size / 73 activity.
All the way down to 36! By combining both of the ideas. XB uses the endless cloning from before, but since the 35/117 trick is so much faster it only needs the one LINK to make space, since the clones can die earlier. Also, you no longer depend on a range of numbers simply missing from the test data, making this a much more generic solution.

As for XA, it doesn't matter for the cycle count whether you use two continuously looping EXAs or an endless batch of clones that all try to write to #NERV ones. This just needs a couple less lines of code.

I kinda like the "balance" of having to go through ugly hacky code first before it turns out the best solution is very clean. Happens surprisingly often in real life too.


=== Last Stop Snaxnet - peanut-free Peanut Bars ===

This hotfix should keep you useful, at least for a little while longer.
I'm glad I don't have a body.




This vote was unanimous.

You get used to it.

If you say so.
Seems like nothing but a hassle.
Maybe one day... well, we can talk about that later.


Wait, Ember wants a body?



What! I didn't know about that either. In case you were wondering, the tab key lets the simulation step to the next cycle. This ctrl+click thing is what's known as a breakpoint. It's very useful to see how a specific part of your code behaves, especially conditional branches that are rarely triggered. It will stop there, you can then manually advance the program again, and if you like you can use ctrl+click to go to the next breakpoint.

Seriously, I actually forgot this was a thing until the chat reminded me.



No new assignments appeared since I patched my arm, so time to go tackle Snaxnet.

You like snacks, don't you?
You don't have to answer that.
I know you like snacks.
That's why we're going to hack a snack factory.




One vote for "Okay." and four for "This is silly."

This is silly.

It's an experiment.
Hypothesis: It will be very funny.



New :siren: OST: Network Exploration

The assignment:
- Remove the keyword PEANUTS (file 300) from the Peanut Blast recipe (file 237).
- Note that the target keyword will appear in a different position in each of the 100 test runs. To view and debug each of the different test runs, click the arrow buttons next to the "TEST RUN" display above.


So, first of all, we're going to have to find the word PEANUTS. Now, the problem with keywords is that you can't just type them in your code like you can numbers. You need to have a reference somewhere, which is why file 300 is sitting in my home host, with just that word.

Secondly, that tip is useful if you hadn't realized yet. Those little arrows next to 1/100 allow you to see each of the 100 tests ahead of time. They also allow you to debug specific test runs, in case your program only fails for one of them.



These are all the files we have access to. We don't need those files with the dates, though. Let's start building.



My initial solution. First grab the reference file and load PEANUTS into the EXA. Then go to the factory's file. Finding a specific value in a file is simple: just do a TEST for that exact value repeatedly. Every file access moves the cursor forward one, so that is the whole loop.

Then, once it finds the right value, it needs to go back one step in the file with SEEK -1 (because the last TEST moved it just beyond that point) and then use the VOID command to get rid of the value. As the EXA dies, it drops the file, and that's it.



Not bad, already at the lower range of the histograms.



But this is one small improvement I came up with. Save some cycles by having one EXA immediately go to the factory's file, while another grabs our reference and sends the value over to the other EXA. Result: 29/11/2

If there are further improvements possible, I don't know them. It's nice to have a simple puzzle after the very complex optimizations from the thread last week though.

By the way, messing with the hardware registers really doesn't do anything here except change that I/O status view and immediately fail the test on the 'Leave no trace' goal.

Okay, I don't know how I feel about this.
Processing.
No peanuts in Peanut Blast. Was that funny?




I don't know, was it?

Meanwhile...



Crap. We're gonna need that second Zine. Let's ask Ember for help.


You're gonna help Ghast with his copy shop bill?
That's kind of you.




And that's the second vote for today.