Basic Concepts

This document covers the most basic concepts that you will need to know. It serves as a companion to the official spec at dcpu.com/dcpu-16. If you already understand the official spec, everything in this chapter should be review, so feel free to skip around.

Numbers

Binary

It is not necessary to understand binary completely to start out. All that you need to know is that binary is the only way computers think, and that they don't "know" how any piece of data should be interpreted beyond what the programmer tells them.

Binary is merely a way to represent numerical values, so essentially everything that the computer knows boils down to one value or another.

Digits in binary are called bits, and the smallest piece of data the DCPU deals with is 16 bits long, which means that it can be any one of 65,536 (2^16) values, and nothing else. This 16-bit piece of data is called a word.

Writing sixteen 1s and 0s every time you want to talk about a value would get tedious fast, so programmers often use hexadecimal to represent binary values.

Hexadecimal

Hexadecimal, or simply hex, is a way to write numbers, just like binary and decimal. It is used because it maps cleanly onto binary: each hex digit is exactly 4 bits of information. Decimal has no such direct correspondence.

Hexadecimal digits can be any one of 16 values, represented by the numbers 0-9 and the letters a-f. The hex digit "a" is equivalent to the decimal number 10, b=11, etc, up to f=15. In order to denote that a given number is written in hex, the prefix 0x is used.

0x12f4 is equal to 4852 in decimal or 0001 0010 1111 0100 in binary.
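
To see that hex and decimal are just two spellings of the same value, here is a small sketch. It assumes your assembler accepts both decimal and 0x-prefixed literals, which the examples later in this document rely on anyway.

```dasm16
set a, 0x12f4   ; a holds the bits 0001 0010 1111 0100
set b, 4852     ; b holds exactly the same bits
sub a, b        ; a is now 0, because the two literals are the same value
```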

In decimal, when you get up to 9 the next digit is 0 and you increment the next column to the left. In hex, after 9 comes a, then b, etc. When you get to f, that's when you go back to 0 and increment the next column.

Decimal:	Hex:
   8 		 8
   9		 9
  10		 a		<--- 
  11		 b
  12		 c
  13		 d
  14		 e
  15		 f
  16		10		<---
  17		11
  18		12

Just as the highest number you can write in 4 digits of decimal is 9999, the highest integer you can represent in 4 digits of hex is 0xffff, which is 65,535. Since we're counting 0 as a value, that's 65,536 values, which matches the number of values a word can hold.

Assembly Code

Assembly consists of "simple" computations which are executed one at a time, in order. These are called instructions. Each instruction contains an operation and one or two values.

	set a, 12
	add a, 4
	
	set b, 2
	sub a, b

The example above shows several different instructions, using the set, add, and sub operations. As you can probably guess, add and sub perform arithmetic on integers. Set copies a value to another place. In this example we store two values in two registers, called a and b. Registers are like little pockets of memory on the processor which can be accessed quickly.

Machine Code

As with many programming languages, assembly needs to be compiled (strictly speaking, assembled) into machine code before the processor can run it. The translation from assembly to machine code is more direct than in other programming languages because each assembly instruction corresponds to exactly one machine code instruction, and vice versa. It is a one-to-one translation.

	b401 9402 8c21 0403

The above block of machine code, written in hexadecimal, is equivalent to the assembly code shown before. I've divided the machine code into blocks of 4 hexadecimal digits for human readability. Each block of 4 digits represents one word (16 bits).

Instructions in DCPU-Assembly take up between 1 and 3 words each, and in this example each instruction is one word. b401 is equivalent to set a, 12.

I have shown you the above block of machine code to illustrate a point, but it is not important for you to know specifically what machine code something translates into. If you would like to see how your code is being interpreted, the online emulator dcpu.ru is good because it shows line-by-line what machine code is generated by your assembly.
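
If you are curious how b401 comes out of set a, 12, here is a hand-assembled sketch of the same four instructions. It assumes the word layout aaaaaabbbbbooooo from version 1.7 of the official spec, where small literals from -1 to 30 are packed into the a field. It also uses the dat directive to place raw words in the program, which is an assembler convention rather than part of the spec, so your assembler may spell it differently.

```dasm16
; word layout: aaaaaa bbbbb ooooo  (a = source value, b = destination, o = opcode)
dat 0xb401   ; 101101 00000 00001 -> a = literal 12, b = register A, o = SET
dat 0x9402   ; 100101 00000 00010 -> a = literal 4,  b = register A, o = ADD
dat 0x8c21   ; 100011 00001 00001 -> a = literal 2,  b = register B, o = SET
dat 0x0403   ; 000001 00000 00011 -> a = register B, b = register A, o = SUB
```

The processor only ever sees the words, so this block should behave exactly like the assembly version.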

Instructions

DCPU assembly differs from real-world assembly languages in that its operations are all designed to be easy for humans to understand, whereas modern processors have instruction sets that are designed to be generated by compilers and are not very human-friendly.

	set a, 12

The first part of an instruction is the opcode, which tells the processor which operation to perform. Each operation expects either 1 or 2 values, which represent the data that will be manipulated.

	add a, 4

This instruction is like the mathematical equation a = a + 4. The first value is overwritten with the result of the operation. The second value is never altered. Here, 4 is added to the value in the a register and the result is stored back in the a register.

	set b, 2
	sub a, b

Here, b is loaded with a value and then subtracted from a. As before, the result is stored in a and b is left untouched.
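
Putting the pieces together, here is the same sequence again with the register contents written out after each step. The values in the comments are just the arithmetic worked by hand.

```dasm16
set a, 12   ; a = 12
add a, 4    ; a = 16
set b, 2    ; a = 16, b = 2
sub a, b    ; a = 14, b = 2
```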

I will not list every possible opcode here, because it would get tedious. Instead, check out the official documentation at dcpu.com/dcpu-16.

Registers

Registers are little pieces of memory on the processor which are used to store values for immediate use. The a and b from the previous example are registers. The DCPU has 8 general purpose registers and 4 special purpose registers.

General Purpose Registers

Because the DCPU is made to be fun and simplified for humans, every register is one word and can be manipulated directly, except IA which I'll talk about later. Actual CPU architectures can have differently sized registers which are reserved for specific things, and which are full of little rules and tricks.

A, B, C
X, Y, Z
I, J 		; these two are sort of unique

Each of the general purpose registers can be used for whatever the programmer wants, but I and J are unique in that there are two operations which affect them directly.

I and J: the iterators

sti is an opcode which means "set then increment". It is like set, but after it sets the value it also increments the i and j registers, even if they weren't used in the operation. Likewise, std means "set then decrement".

sti a, 2 	; sets a to 2 and then increments I and J
std b, 1 	; sets b to 1 and then decrements I and J
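
One place the iterators come in handy is copying a run of words from one spot in memory to another. This is only a sketch: the addresses 0x1000 and 0x2000 are made up for illustration, and it uses a register in square brackets to mean "the memory word at the address held in that register", which is an addressing form from the official spec.

```dasm16
set i, 0x1000   ; i = address to copy from (hypothetical)
set j, 0x2000   ; j = address to copy to (hypothetical)
sti [j], [i]    ; copy word 0x1000 to 0x2000, then i = 0x1001, j = 0x2001
sti [j], [i]    ; copy word 0x1001 to 0x2001, then i = 0x1002, j = 0x2002
```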

There are also 4 other registers which have special meanings.

PC: Program Counter AKA Instruction Pointer

The most important special register to understand at first is PC. This is the program counter, or instruction pointer, and it holds the address in memory of the next instruction to run. As the processor reads an instruction, pc is automatically advanced past it, by 1 to 3 words depending on how long the instruction is, and then the instruction at the new location is executed.
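
Here is a small sketch of how pc moves through a program. It assumes the program is loaded at address 0 and that literals outside the range -1 to 30 are stored in an extra word after the instruction, as described in the official spec.

```dasm16
set a, 1        ; address 0, one word
set b, 0x1000   ; addresses 1-2, two words because the literal needs its own word
set c, 2        ; address 3, one word
```

After the second instruction pc is 3, not 2, because pc always moves past the whole instruction.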

SP: Stack Pointer

Like pc, the stack pointer sp is used to keep track of a location in memory that the processor uses for other operations. In this case it is part of the stack, which is essentially an area of memory that values can be quickly stored in and retrieved from, but only in sequential order. We'll talk about the stack more later.

IA: Interrupt Address

This is another pointer which we'll talk about later. Essentially, a programmer can define a subroutine which gets run when hardware wants to talk to the DCPU or when the program itself triggers an interrupt. ia points to the location in memory where that subroutine starts.

EX: Excess Register

The excess register is used when an operation overflows the range of values that fit in 16 bits. This is a very luxurious feature of the DCPU, because "real" architectures often have only 1 bit for overflow, and it is shared between different uses. EX provides 16 whole bits of extra information when an operation overflows, making the DCPU-16 almost a 32-bit processor.
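
A short sketch of ex in action. The values in the comments follow the definitions of add and mul in the official spec: add sets ex to 1 on overflow, and mul stores the upper 16 bits of the product in ex.

```dasm16
set a, 0xffff
add a, 1        ; a = 0x0000, ex = 0x0001 (the sum 0x10000 does not fit in a word)

set b, 0x8000
mul b, 4        ; b = 0x0000, ex = 0x0002 (the full product is 0x20000)
```

Most arithmetic operations overwrite ex, so read it right after the instruction you care about.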

Memory

We can use registers to store information temporarily, but when we want something to be more durable we put it in memory so that we can refer to it later. We refer to it by remembering the address in memory that it is stored in.

Addresses

The DCPU has 2^16 words of memory. Conveniently, a word is also 16 bits. This means that there are enough values in 1 word to refer to each word of memory distinctly. This value is called its address, and basically is the index of that specific word. The first word's address in memory is 0, the next one 1, etc, all the way up to 0xffff.

We can store data in arbitrary addresses by using the square brackets [ and ].

set a, 4
sub a, 12

set [0x6656], a

;... later

add [0x6656], 36

The number inside the brackets is interpreted as an address, and the value of memory at that address is used for the operation.
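
Here is the example again with the values worked out by hand, plus one extra line that reads the result back into a register.

```dasm16
set a, 4
sub a, 12          ; a = 0xfff8, since 4 - 12 wraps around below zero
set [0x6656], a    ; the word at address 0x6656 is now 0xfff8
add [0x6656], 36   ; the word at 0x6656 is now 0x001c (28), and a is untouched
set b, [0x6656]    ; b = 28
```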

Above, the address 0x6656 is chosen randomly, but you can refer to specific parts of your program by using labels.

Labels

While you are writing instructions, you can define a label to refer to a specific place in the code. A label is translated by the compiler into the address of the next instruction after the label definition.

	set a, 1
	
:label
	add a, 1
	set pc, label

Labels are defined by prefixing the label with a colon. Some compilers support putting the colon after the label, which is how real assembly works, but not all DCPU compilers will recognize those labels at the moment.

In the example above, label would translate to the number 1. The first instruction, set a, 1 is encoded into the 0th word of memory, and add a, 1 is encoded into the 1st word. When set pc, label is compiled, it will be translated to set pc, 1.

By accessing pc directly, you change the program flow. This example will run in a continuous loop because the instruction pointer keeps getting set to the same value.

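Label definitions themselves are not translated into machine code, so the above example would compile into only three words, provided the compiler packs the small label address into the instruction word itself. A compiler that always stores label addresses in a separate word would produce four.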
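
Labels are not only for jumping. Here is a sketch that uses a label to name a word of storage instead of picking an address by hand. It assumes your assembler supports the dat directive for placing a raw word in the program, which is a common convention but not part of the official spec. The label names are made up for illustration.

```dasm16
:loop
	add [counter], 1   ; add 1 to the word stored at counter
	set pc, loop       ; jump back and do it again

:counter
	dat 0              ; one word of storage, starting at 0
```

The compiler replaces counter with whatever address the dat word ends up at, so you never need to know the number yourself.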

The Stack

One of the special registers that we saw earlier was the stack pointer. The stack pointer stores an address in memory.

Basically, the stack is a group of values that starts at the end of memory and "grows" backwards. When you add an item onto the stack the stack pointer is decreased by 1 and the value is stored in [sp].

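Reading and writing to the stack is also cheap. According to the cycle counts in the official spec, push and pop cost no extra cycles, whereas an address written out as a literal, such as [0x6656], costs one extra cycle.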

set push, 8
set push, b

set a, pop
set b, pop

The key words push and pop act like registers in that you can set them and read values from them, but they aren't. They automatically adjust sp and read/write [sp].

You put values on the stack by setting push, and you get them back using pop. In this example, a is set to whatever b was, and b is set to 8. Values are retrieved in the opposite order that they were put in.
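
The last-in, first-out order makes it easy to swap two registers. In this sketch the sp values in the comments assume sp starts out at 0, which is how emulators typically set it up.

```dasm16
set a, 1
set b, 2

set push, a   ; sp goes from 0x0000 to 0xffff, word 0xffff = 1
set push, b   ; sp goes to 0xfffe, word 0xfffe = 2

set a, pop    ; a = 2, sp goes back to 0xffff
set b, pop    ; b = 1, sp goes back to 0x0000
```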

There are two other key words related to the stack, peek and pick.

You can access the value at the stack pointer without modifying the stack pointer using peek. It is exactly like [sp]. You can access values near the stack pointer by using pick and a number.

set push, 8
set push, 2

set a, peek   	; set a to 2
set a, pick 1 	; set a to 8
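
Because peek is just another name for [sp], you can also write to it. A small sketch, with the values worked out by hand:

```dasm16
set push, 8
set push, 2

add peek, 1     ; the top of the stack is now 3, sp does not move
set a, pick 1   ; a = 8, the value one word further down
set b, pop      ; b = 3
```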

Comments

When writing any sort of program, it is good practice to leave comments alongside the code in order to describe your intent. This will make your code easier to understand later, if it is not immediately obvious what you are trying to do.

Comments are added to DCPU-16 assembly with the semi-colon ; character.

set a, 16
sub a, 2

; shl means shift-left,
; and it can be used to
; multiply integers by
; a power of 2.

shl a, 5

; in this case, 2^5 is 32
; so that was like:
; mul a, 32

You don't need to add comments when your code is obvious, but as you can see the intent of the above example might not have been clear without the comment.