Typos, grammar
mwenge committed Sep 7, 2024
1 parent 2f71e56 commit 4706694
Showing 7 changed files with 52 additions and 48 deletions.
Binary file modified out/IridisAlphaTheory.pdf
Binary file not shown.
36 changes: 18 additions & 18 deletions src/archaeo.tex
@@ -20,7 +20,7 @@ \chapter{A Little Archaeology}
language available to it: lots of sound waves of varying length.

Without knowing how it's actually done, it's tempting to imagine a variety of possible schemes that
-might have been used. For example, one sound wave denoting a '0' and another one denoting a '1'.
+might have been used to store data on a cassette. For example, one sound wave denoting a '0' and another one denoting a '1'.
The method that is actually used isn't very far away from such a thing but there is plenty of intricacy
layered on top, particularly in a bid to spend as little time as possible loading data from the
tape.
@@ -34,15 +34,15 @@ \chapter{A Little Archaeology}
invented what is now known as the '\icode{tap}' format for representing the beeps and bloops encoded on the
tape as a file of bits and bytes. The idea is that each byte in the file represents the length of
a pulse. It is the length of these pulses that will ultimately tell us whether we should interpret
-a value of \icode{1} or \icode{0}. When we get eights \icode{1}s and \icode{0}s we have a byte. We get enough bytes, we have a program
+a value of \icode{1} or \icode{0}. When we get eight \icode{1}s and \icode{0}s we have a byte. We get enough bytes, we have a program
we can run!

Someone, somewhere has kindly decoded the contents of the Iridis Alpha cassette tape distribution to
a \icode{tap} file for us. So we have something to dig into. This is going to be a slightly bonkers journey
into the bowels of decoding a 54KB game file from over 500KB of raw data. Every time you think you
are nearly done there will be yet another convolution to wrap your head around. But at the end of it
we will finally have our binary game file and will be ready to figure out how to decipher it into something
-approximating the original assembly langage.
+approximating the original assembly language Iridis Alpha itself was written in.

\begin{figure}[H]
{
@@ -55,8 +55,9 @@ \chapter{A Little Archaeology}

\section{The Madness Begins}

-This is what the start of the our \icode{iridis-alpha.tap} file looks like:
-
+This is what the start of our \icode{iridis-alpha.tap} file looks like. Note that this is the contents of the
+start of the \icode{tap} file rather than the data on the tape itself. The data on the tape is represented by the white
+cells in our diagram:

\begin{figure}[H]
{
@@ -122,12 +123,11 @@ \section{The Madness Begins}
After the header information described above, each byte in the \icode{tap} file
represents the pulse length or duration of a single sound emitted by the tape.
A pair of sounds taken together represent a single bit, i.e. a \icode{0} or a
-\icode{1}. A medium length sound followed by a short one represents a \icode{1},
-a short one followed by a medium one represents a \icode{0}.
+\icode{1}. A medium length sound followed by a short length represents a \icode{1},
+a short length followed by a medium length represents a \icode{0}.

The following table shows us whether we should consider a byte on the tape to
-represent short, medium, or long
-duration sound:
+represent a short, medium, or long duration sound:

\begin{figure}[H]
{
@@ -148,14 +148,14 @@ \section{The Madness Begins}

\end{adjustbox}

-}\caption{Values for short, medium, and long pulses. For example, ny byte value on the \icode{tap} file between \icode{\$24} and
+}\caption{Values for short, medium, and long pulses. For example, any byte value on the \icode{tap} file between \icode{\$24} and
\icode{\$36} would be considered a 'Short' pulse.}
\end{figure}
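As a rough illustration of the table above, the following Python sketch sorts a \icode{tap} byte into one of the three pulse classes. Only the 'Short' range (\icode{\$24} to \icode{\$36}) comes from the caption. The 'Medium' and 'Long' ranges, and the names \icode{PULSE\_RANGES} and \icode{classify\_pulse}, are assumptions made for the example, so treat them as placeholders for the real table values.

```python
# Pulse-length ranges as inclusive (low, high) byte values.
PULSE_RANGES = {
    "S": (0x24, 0x36),  # Short: the range quoted in the caption
    "M": (0x37, 0x49),  # Medium: ASSUMED range, not taken from the text
    "L": (0x4A, 0x64),  # Long: ASSUMED range, not taken from the text
}


def classify_pulse(value):
    """Return 'S', 'M' or 'L' for a tap byte, or '?' if it fits no range."""
    for name, (low, high) in PULSE_RANGES.items():
        if low <= value <= high:
            return name
    return "?"


if __name__ == "__main__":
    # A few byte values that appear in the hex dump of the tap file.
    for byte in (0x30, 0x42, 0x56):
        print(f"${byte:02X} -> {classify_pulse(byte)}")
```

Keeping the ranges in one table means a different set of thresholds can be tried without touching the classification logic.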

Remarkably the first 27,000 or so pulses on the Iridis Alpha tape are nothing but short sounds (values between \icode{\$2F} and
\icode{\$31}) so cannot be interpreted as anything. It's not until the 27,157th byte in the tape that we start to encounter real data:

-\begin{lstlisting}[caption=Data finally gets started at \icode{56 41} in the first line above.,basicstyle=\tiny,escapechar=\%]
+\begin{lstlisting}[caption=Data finally gets started at `\icode{56 41}' in the first line above.,basicstyle=\scriptsize\ttfamily,escapechar=\%]
00006a10: 3031 3031 3156 4144 3130 4231 4243 3130 01011VAD10B1BC10
00006a20: 4231 4131 4244 312f 4057 4032 4231 4231 B1A1BD1/@W@2B1B1
00006a30: 4244 312f 4231 4231 4243 3142 3055 4144 BD1/B1B1BC1B0UAD
@@ -173,7 +173,7 @@ \section{The Madness Begins}
\end{lstlisting}

You get a sense of how wasteful, or ahem redundant, this encoding\index{encoding} scheme is when you learn that these twenty pulses are
-required to give us a single byte. The table below shows how we interpret them to construct a series of 1s and 0s.
+required to give us a single byte. The table below shows how we interpret them to construct a series of \icode{1}s and \icode{0}s.

\begin{figure}[H]
{
@@ -202,7 +202,7 @@ \section{The Madness Begins}
\end{adjustbox}

}\caption{Interpretation of the first 20 meaningful bytes\, creating a byte. The parity bit at the end is a \icode{\$00}
-if there are an odd number of 1s and \icode{\$01} if there are an even number of 1s. \icode{10010001} has an odd number of 1s. }
+if there are an odd number of \icode{1}s, and \icode{\$01} if there are an even number of \icode{1}s. \icode{10010001} has an odd number of \icode{1}s. }
\end{figure}
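The interpretation in the table above can be sketched in Python. The twenty bytes are copied from the hex dump, starting at \icode{56 41}. The medium-then-short and short-then-medium rules and the parity rule come from the text. The 'Medium' and 'Long' byte ranges, the reading of a long pulse followed by a medium one as a start-of-byte marker, and the function names are assumptions made for this sketch.

```python
def classify_pulse(value):
    """Classify a tap byte as a Short, Medium or Long pulse."""
    if 0x24 <= value <= 0x36:
        return "S"  # range quoted in the text
    if 0x37 <= value <= 0x49:
        return "M"  # ASSUMED range
    if 0x4A <= value <= 0x64:
        return "L"  # ASSUMED range
    return "?"


# Meaning of each pair of pulses.
PAIR_MEANING = {
    ("L", "M"): "start",  # ASSUMED: marks the beginning of a byte
    ("M", "S"): "1",      # from the text
    ("S", "M"): "0",      # from the text
}


def decode_frame(pulses):
    """Decode 20 pulses into (data bits in arrival order, parity_ok)."""
    classes = [classify_pulse(value) for value in pulses]
    pairs = list(zip(classes[0::2], classes[1::2]))
    symbols = [PAIR_MEANING[pair] for pair in pairs]
    if len(symbols) != 10 or symbols[0] != "start":
        raise ValueError("expected a marker pair followed by nine bit pairs")
    bits = "".join(symbols[1:9])
    parity = symbols[9]
    # Parity is 0 for an odd number of 1s and 1 for an even number.
    expected = "0" if bits.count("1") % 2 == 1 else "1"
    return bits, parity == expected


# The first twenty meaningful bytes from the hex dump.
FIRST_FRAME = bytes.fromhex("5641 4431 3042 3142 4331 3042 3141 3142 4431 2f40")

if __name__ == "__main__":
    bits, parity_ok = decode_frame(FIRST_FRAME)
    print(bits, "parity ok" if parity_ok else "parity error")
```

The sketch reports the bits in the order they arrive on the tape and makes no claim about how they are finally assembled into a byte value.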

We can visualize the twenty bytes as a square sound wave. When reading the tape the C64 would interpret these sound pulses as long, short,
@@ -338,7 +338,7 @@ \section{After Our First Real Byte}
what it does later. \\
} \\
\midrule
-Checksum & \icode{E4} & How is this calculated? \\
+Checksum & \icode{E4} & I don't know how this is calculated! \\
\addlinespace
\bottomrule
\end{tabular}
@@ -382,7 +382,7 @@ \section{After Our First Real Byte}

\end{tikzpicture}
\end{adjustbox}
-}\caption{The data we've read in so far. The unshaded section is machine code.}
+}\caption{The second part of our program in machine code. The unshaded section is the machine code.}
\end{figure}

\begin{figure}[H]
@@ -407,7 +407,7 @@ \section{After Our First Real Byte}
\icode{00 00 00 00 00 8B E3 AE 02 } \\
} & This is the rest of the machine code of the program to execute. \\
\midrule
-Checksum & \icode{53} & How is this calculated? \\
+Checksum & \icode{53} & I still don't know how this is calculated! \\
\addlinespace
\bottomrule
\end{tabular}
@@ -879,7 +879,7 @@ \section{A Loader for your Loader}
\begin{adjustbox}{width=9cm,center}
\surface{archaeo/tap-full.png}
\end{adjustbox}
-}\caption[]{All the data that has been read from the tape. The four chunks of game data are in green\, the third is only a sliver. The
+}\caption[]{All the data that has been read from the tape. The four chunks of game data are in green\, the second is only a sliver. The
relative sizes of the red data (the MegaSave loader which is only actually 200 or so bytes long) and the green data (representing over
50,000 bytes of game data) illustrates how efficient the MegaSave loader's storage is by comparison with the default.}
\end{figure}
@@ -997,7 +997,7 @@ \section{Putting an End to the Madness}
\end{lstlisting}

This routine\index{routine} does two things: it turns off the tape deck and tells the C64 to execute the code at a different
-location (\icode{MainControlLoop\index{MainControlLoop}}) the next time it wakes up and wonders what to do. Until then it
+location (\icode{MainControlLoop\index{MainControlLoop}}) the next time that it wakes up and wonders what to do. Until then it
goes into a loop executing \icode{LoopUntilExexcutes} over and over again.

The C64 isn't getting out of bed and pondering its future over coffee once a day like the rest of us. It wakes up hundreds of times every second
14 changes: 7 additions & 7 deletions src/binary.tex
@@ -34,7 +34,7 @@ \chapter{We Need to Talk About Binary}
up the blob in some way that allows some order to be imposed. Some way
of segmenting a string of 1s and 0s that is both useful to us as the
programmers and an efficient way of directing the computer to make effective
-use of the amorphous blob fed into it.
+use of the amorphous binary goo fed into it.

Trial and error has eventually arrived at an optimal arrangement, one which
you have definitely heard of: this is the \icode{byte}. The idea of the byte
@@ -44,8 +44,8 @@ \chapter{We Need to Talk About Binary}

Any scheme that reduces the number of items we have to deal with is a boon,
but is there any rhyme or reason to choosing 8 as our magic number instead
-of say 7, or 12? Believe it or not, the number 8 was chosen almost solely
-after much experimentation with others because it proved the easiest and
+of say 7, or 12? Believe it or not, the number 8 was chosen
+(after much experimentation with others) almost solely because it proved the easiest and
most convenient for people to deal with when understanding how they would
make computers work.

@@ -56,7 +56,7 @@ \chapter{We Need to Talk About Binary}
important when it comes down to it. It is humans doing the important part
of the computing: making the big decisions, figuring out where things go
and how they should work. The computer is just a glorified bit shuffler
-for which everything is on or off and there is no bigger picture. Humans
+for which everything is on or off; it has no concept of a bigger picture. Humans
need to be able to at least intuit some of this shuffling with a mental
model of what happens when one set of ones and zeros is clashed with another.
Using the magic number of 8 as the denominator for batches of bits enables
@@ -327,7 +327,7 @@ \chapter{We Need to Talk About Binary}
we could have chosen 7 or 9 or 13 but how practical would this really be when
the actual value assigned to any individual bit is always going to be a power of
2. Even if we could never precisely intuit the reasons as to why, a magic number
-that is also the power of 2, e.g. 8,4,16,64, is always going to make things easier
+that is also the power of 2, e.g. 8, 4, 16, 64... is always going to make things easier
over the long haul even if we can't articulate them.
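To make the point about powers of 2 concrete, here is a small Python sketch that lists the value carried by each bit position in a byte and adds up a bit pattern by hand. The names \icode{place\_values} and \icode{bits\_to\_int} are made up for this illustration.

```python
BITS_PER_BYTE = 8

# The value carried by each bit position, from the rightmost bit upwards.
place_values = [2 ** position for position in range(BITS_PER_BYTE)]

# With every bit switched on, a byte holds the sum of all place values.
largest_byte = sum(place_values)


def bits_to_int(bits):
    """Add up the place values of the bits that are set, e.g. '00000101' -> 5."""
    total = 0
    for position, bit in enumerate(reversed(bits)):
        if bit == "1":
            total += 2 ** position
    return total


if __name__ == "__main__":
    print(place_values)
    print(largest_byte)
    print(bits_to_int("00000101"))
```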

So now some jargon we might have heard in the past is starting to make sense. The
@@ -336,9 +336,9 @@ \chapter{We Need to Talk About Binary}
it does so using a single byte as its fundamental building block. This is true
whether we are talking about the \icode{AND} operation or storing a value for use
later on. When we give the C64 a value it's always a single byte. The subsequent
-ear of 16-bit computing (e.g. the Amiga) took this a step further by introducing
+era of 16-bit computing (e.g. the Amiga) took this a step further by introducing
a two-byte block as its basic unit of currency. 32-bit and 64-bit computing are the
-lingua franc of modern processors - no matter how small the value we're using it
+\textit{lingua franca} of modern processors - no matter how small the value we're using it
will be managed as a 4-byte or 8-byte value respectively by 32-bit and 64-bit
computer architectures.

25 changes: 14 additions & 11 deletions src/disassembly.tex
@@ -2,28 +2,31 @@ \chapter{Some Disassembly Required}
\label{sec:disassembly}
\lstset{style=6502Style}

-We've reached the point where the game has started to execute. We just saw a snippet of code that turned off the
-tape recorder and prompted the C64 to run a routine\index{routine} a called \icode{MainControlLoop\index{MainControlLoop}}. Though perhaps a little
-cryptic, and you shouldn't expect to understand it yet, this code is not exactly what the machine 'saw'. Instead it read and executed something far more puzzling looking:
+We've reached the point where the game has started to execute. We just saw a
+snippet of code that turned off the tape recorder and prompted the C64 to run a
+routine\index{routine} called \icode{MainControlLoop\index{MainControlLoop}}.
+Though perhaps a little cryptic, and you shouldn't expect to understand it yet,
+this code is not exactly what the machine 'saw'. Instead it read and executed
+something far more puzzling-looking:


\begin{lstlisting}[caption=The first piece of machine code that is executed in Iridis Alpha.,escapechar=\%]
78A9 408D 1903 A900 8D18 03A9 108D 04DD A900 8D05
DDA9 7F8D 0DDD A981 8D0D DDA9 198D 0EDD 584C 3508
\end{lstlisting}

-That's right it executed a stream of bytes. A stream of bytes commonly referred to as 'machine code'. Each of these
+That's right, it executed a stream of bytes. A stream of bytes commonly referred to as \textit{machine code}. Each of these
bytes is meaningful to the C64 whether individually, or taken in pairs, or even in groups of three. It can comprehend them
as instructions to carry out that will shuffle data around in its memory and ultimately result in a game that
can be played.

Before we can dig into the internals of how Iridis Alpha works we have to convert all of the machine code we've
loaded into memory in the previous chapter into something we have a chance of reading and understanding. This
process is called disassembly and here we're going to explain how it is done and along the way gain a little
-basic understanding of the human-readable language, called 6502 Assembly Language, that we convert the machine
+basic understanding of the human-readable language, called '6502 Assembly Language', that we convert the machine
code back into.

-The process is called disassembly simply because it is the exact reverse of the process that was originally followed
+The process is called \textit{disassembly} simply because it is the exact reverse of the process that was originally followed
to generate the data on the tape from the assembly language written by Jeff Minter in the first place. Programs that
do this are referred to as 'assemblers'. They assemble the instructions written by the programmer into machine code
that the C64 can execute. As self-appointed disassemblers we are going to turn it back into assembly language.
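To give a feel for what a disassembler does, here is a deliberately tiny one in Python that handles only the five opcodes appearing in the listing above (\icode{SEI}, \icode{CLI}, \icode{LDA} immediate, \icode{STA} absolute and \icode{JMP} absolute). It is a sketch for illustration, not the tool used to produce the book's disassembly, and the names \icode{OPCODES} and \icode{disassemble} are invented here. Two-byte operands are read low byte first, which is how the 6502 stores addresses.

```python
# opcode -> (mnemonic, number of operand bytes, operand format)
OPCODES = {
    0x78: ("SEI", 0, ""),
    0x58: ("CLI", 0, ""),
    0xA9: ("LDA", 1, "#${:02X}"),  # load an immediate one-byte value
    0x8D: ("STA", 2, "${:04X}"),   # store to a two-byte address
    0x4C: ("JMP", 2, "${:04X}"),   # jump to a two-byte address
}


def disassemble(code):
    """Turn a bytes object into a list of assembly-language lines."""
    lines = []
    index = 0
    while index < len(code):
        mnemonic, size, operand_format = OPCODES[code[index]]
        operand = code[index + 1:index + 1 + size]
        if size == 0:
            lines.append(mnemonic)
        else:
            # Operands are stored low byte first (little-endian).
            value = int.from_bytes(operand, "little")
            lines.append(f"{mnemonic} {operand_format.format(value)}")
        index += 1 + size
    return lines


# The forty bytes of machine code from the listing above.
MACHINE_CODE = bytes.fromhex(
    "78A9 408D 1903 A900 8D18 03A9 108D 04DD A900 8D05"
    "DDA9 7F8D 0DDD A981 8D0D DDA9 198D 0EDD 584C 3508"
)

if __name__ == "__main__":
    for line in disassemble(MACHINE_CODE):
        print(line)
```

A real disassembler needs the full opcode table and some way of telling code from data, but the core loop is the same: read an opcode, look up how many operand bytes follow, and print the result.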
@@ -124,8 +127,8 @@ \chapter{Some Disassembly Required}
in memory that you give to it. As we can see such addresses are not one byte long, but two bytes. Are they
ever more than two bytes long? No, and for a very simple reason. The C64 can only understand addresses
that are at most two bytes long and this is what ultimately limits it to 64KB of memory. The largest address it can understand
-is therefore \icode{\$FFFF} - the largest value that can be expressed by 2 bytes. Which translates to 65,536.
-65,536 bytes is 64KB of RAM.
+is therefore \icode{\$FFFF} - the largest value that can be expressed by 2 bytes. Which translates to 65,535.
+Including zero, this allows us 65,536 bytes, which is 64KB of RAM.
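The arithmetic behind the 64KB limit can be checked in a few lines of Python. The variable names are chosen just for this illustration.

```python
ADDRESS_BYTES = 2   # a C64 address is two bytes long
BITS_PER_BYTE = 8

# The largest value two bytes can hold: every one of the 16 bits set.
largest_address = 2 ** (ADDRESS_BYTES * BITS_PER_BYTE) - 1

# Addresses run from 0 up to and including the largest, so there is one more
# address than the largest address value.
address_count = largest_address + 1

if __name__ == "__main__":
    print(f"largest address: ${largest_address:04X} = {largest_address}")
    print(f"addressable bytes: {address_count} = {address_count // 1024}KB")
```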

When we look at the disassembly of the STA instructions we see something quite puzzling:

@@ -145,7 +148,7 @@ \chapter{Some Disassembly Required}

Shouldn't we have expected \icode{8D 19 03} to translate to \icode{STA \$1903} rather than
\icode{STA \$0319}? Why are the numbers back to front like that? The reason is due to something
-you may have heard described with a word before that you've never fully understood and maybe never
+you may have heard described with a word that you've never fully understood and maybe never
dared to question. The machine code
stores the address \icode{0319} as \icode{1903} because the 6502 CPU in the C64 expects to read
addresses with the second half of the number first. When we read numbers we expect it to start
@@ -250,7 +253,7 @@ \section{Important Concept Number One: High Bytes and Low Bytes}

Writing values to a pair of adjacent addresses in memory like this so that they can
be subsequently interpreted as yet another address to get something from or do something with is a very common
-pattern\index{pattern} in programming 6502 CPUs such as the C64's and we will encounter it a *lot* in this book.
+pattern\index{pattern} in programming 6502 CPUs such as the C64's and we will encounter it a \textit{lot} in this book.

It's a strange sort of indirection when you first attempt to understand it. Instead of storing actual values
at an address, we're storing an \textit{address} in the address. If you are familiar with other programming languages
@@ -308,7 +311,7 @@ \section{Important Concept Number Two: Interrupts}
of using labels in our code is that we no longer need to worry about the number of the addresses anymore. The label will do the job
for us. It does mean we have to know what the syntax \icode{\#<MainControlLoopInterruptHandler\index{MainControlLoopInterruptHandler}} means though. What it means is:
if \icode{MainControlLoopInterruptHandler\index{MainControlLoopInterruptHandler}} lives at \icode{\$6B3E} the \icode{\#<} decorators refer to the \icode{\$3E} part of the
-address, so we're actually saying \icode{LDA \#\$3E}, i.e. load the value \$3E into the 'Accumulator'. Similarly the syntax
+address, so we're actually saying \icode{LDA \#\$3E}, i.e. load the value \icode{\$3E} into the 'Accumulator'. Similarly the syntax
\icode{\#>MainControlLoopInterruptHandler\index{MainControlLoopInterruptHandler}} refers to the \icode{\$6B} part of the address.
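What the \icode{\#<} and \icode{\#>} decorators pick out can be mimicked in Python with a mask and a shift. The address \icode{\$6B3E} is the one used in the text. The function names \icode{low\_byte} and \icode{high\_byte} are invented for this sketch and are not part of any assembler.

```python
def low_byte(address):
    """The part of a two-byte address selected by '#<': the lower 8 bits."""
    return address & 0xFF


def high_byte(address):
    """The part of a two-byte address selected by '#>': the upper 8 bits."""
    return (address >> 8) & 0xFF


HANDLER_ADDRESS = 0x6B3E  # where the text supposes the handler lives

if __name__ == "__main__":
    low = low_byte(HANDLER_ADDRESS)
    high = high_byte(HANDLER_ADDRESS)
    print(f"#< gives ${low:02X}, #> gives ${high:02X}")
    # Putting the two halves back together recovers the original address.
    print(f"rebuilt: ${(high << 8) | low:04X}")
```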

\begin{figure}[H]
2 changes: 1 addition & 1 deletion src/planets.tex
@@ -35,7 +35,7 @@ \chapter{Making Planets for Nigel}
\begin{adjustbox}{width=10cm,center}
\surface{planets/planet1Charset_Random_Step1.png}
\end{adjustbox}
-}\caption[]{\textbf{Step One}: Add the sea across the entire surface\index{surface} of the planet\index{planet}.}
+}\caption[]{\textbf{Step One}: Add the sea across the entire surface\index{surface} of the planet\index{planet}, 1024 bytes long.}
\end{figure}

\begin{figure}[H]
2 changes: 1 addition & 1 deletion src/preface.tex
@@ -14,7 +14,7 @@ \chapter*{About This Book}
reverse-engineering process in the opening chapters because I think it is an interesting exercise in and of itself to go from
a binary blob to a set of fully commented source code that provides insight into the inner workings of the game.
\hyperref[sec:archaeo]{\textcolor{blue}{A Little Archaeology}} describes how we extract the game binary from the cassette
-tape it was originally distributed. \hyperref[sec:disassembly]{\textcolor{blue}{Some Disassembly Required}} shows you how to go from
+tape on which it was originally distributed. \hyperref[sec:disassembly]{\textcolor{blue}{Some Disassembly Required}} shows you how to go from
a very long list of bytes to a full source code listing. Hopefully you are here because you enjoy this kind of gory detail too.

If you are just interested in learning about the mechanics of the game itself you can flick straight to \hyperref[sec:first16]{\textcolor{blue}{The First 16 Milliseconds}},