Improve first chapter #27

Open
wants to merge 9 commits into
base: master

Owner

@0xAX 0xAX commented Jun 9, 2024

This PR provides significant rework for the first chapter.

@0xAX 0xAX requested a review from klaudiagrz as a code owner June 9, 2024 11:32
@0xAX 0xAX self-assigned this Jun 9, 2024
@0xAX 0xAX force-pushed the chapter-1-update branch 2 times, most recently from 42a39fc to ce606fa Compare June 16, 2024 15:49
hello/README.md Outdated
@Deckloins

Hi! I started working on rewording the first article, but then saw you already had a big head start.
If you want, we could work together to fine-tune the translation.

content/asm_1.md Outdated
- `rsi` - Used to pass `2nd` argument to a function.
- `rdx` - Used to pass `3rd` argument to a function.

There is more details related to the Linux `x86_64` calling conventions but the description above should be enough for now. Knowing the meaning and the way of use of these registers we can return to the code. What do we need to write a `hello world` program? Usually we just pass a `hello world` string to a library function like [printf](https://en.wikipedia.org/wiki/Printf) or so. But these functions usually goes from a [standard library](https://en.wikipedia.org/wiki/Standard_library) of a programming languages we are using. Assembly does not have a standard library. What to do in this case? Well, we have at least the two following approaches:
Collaborator

Suggested change
There is more details related to the Linux `x86_64` calling conventions but the description above should be enough for now. Knowing the meaning and the way of use of these registers we can return to the code. What do we need to write a `hello world` program? Usually we just pass a `hello world` string to a library function like [printf](https://en.wikipedia.org/wiki/Printf) or so. But these functions usually goes from a [standard library](https://en.wikipedia.org/wiki/Standard_library) of a programming languages we are using. Assembly does not have a standard library. What to do in this case? Well, we have at least the two following approaches:
This knowledge about Linux `x86_64` calling conventions should be enough for now. Knowing the meaning and the usage of the registers, we can return to the code. What do we need to write a `hello world` program? Usually, we just pass a `hello world` string to a library function like [printf](https://en.wikipedia.org/wiki/Printf). But this function usually goes from a [standard library](https://en.wikipedia.org/wiki/Standard_library) of a programming language we use. Assembly does not have a standard library. What to do in this case? Well, we have at least two options:

Collaborator

this function usually goes from...

I don't like "goes from", can we replace it with another verb?

content/asm_1.md Outdated
- Link our assembly program with C standard library and use [printf](https://man7.org/linux/man-pages/man3/printf.3.html) or any other function that may help us to write a text to the [standard output](https://en.wikipedia.org/wiki/Standard_streams).
- Use the operating system API

We will go through the second way. Each operating system provides an interface that a user level application may use to interact with the operating system. Usually the functions of this API are called `system calls`. Linux kernel also provides set of system calls to interact with it. The full list of system calls with the respective numbers for the Linux `x86_64` could be found [here](https://github.com/torvalds/linux/blob/master/arch/x86/entry/syscalls/syscall_64.tbl). Looking in this table, we may see:
Collaborator

Maybe it's good to mention why we do not explain the first option?
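
One way to make the contrast between the two options concrete is a small C sketch. It is only an illustration, assuming Linux with glibc: `hello_via_libc` and `hello_via_syscall` are hypothetical names, and `syscall()` is itself a thin libc helper that hands the call number and arguments to the kernel.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes syscall() in <unistd.h> on glibc */
#endif
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

static const char msg[] = "hello, world!";   /* 13 bytes, no newline */

/* Option 1: use the C standard library. printf formats and buffers the text;
   libc issues the write system call for us when the buffer is flushed. */
static int hello_via_libc(void)
{
    int printed = printf("%s", msg);
    fflush(stdout);
    return printed;            /* characters printed, or negative on error */
}

/* Option 2: use the operating system API directly.
   SYS_write is the system call number (1 on Linux x86_64). */
static long hello_via_syscall(void)
{
    return syscall(SYS_write, (long)STDOUT_FILENO, msg, sizeof msg - 1);
}
```

Both functions should put the same 13 bytes on standard output. The difference is only in who prepares the system call, which may be a useful hook for explaining why the chapter follows the second option.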

0xAX added 7 commits July 17, 2024 17:04
Signed-off-by: Alexander Kuleshov <[email protected]>
@0xAX 0xAX force-pushed the chapter-1-update branch 3 times, most recently from eb6f132 to 4b8ead1 Compare July 17, 2024 12:14
content/asm_1.md Outdated
```

Ok, CPU performs some operations, arithmetical and etc... But where can it get data for this operations? The first answer in memory. However, reading data from and storing data into memory slows down the processor, as it involves complicated processes of sending the data request across the control bus. Thus CPU has own internal memory storage locations called registers:
The information about the given system call could be found in manual pages. To get information about the `sys_write` system_call we can execute the following command in terminal:
Collaborator

Suggested change
The information about the given system call could be found in manual pages. To get information about the `sys_write` system_call we can execute the following command in terminal:
You can find information about the given system call in manual pages. To get information about the `sys_write` system call, run the following command in the terminal:

Collaborator

Manual pages? Maybe we could add a link?
https://en.wikipedia.org/wiki/Man_page

content/asm_1.md Outdated

In another words we just make a call of `sys_write` syscall. Take a look on `sys_write`:
which is basically a wrapper around the `sys_write` system call provided by the standard C library. Usually the set of arguments of the system call and the wrapper function is the same. So we safely may assume that the `sys_write` system call is defined like that:
Collaborator

Suggested change
which is basically a wrapper around the `sys_write` system call provided by the standard C library. Usually the set of arguments of the system call and the wrapper function is the same. So we safely may assume that the `sys_write` system call is defined like that:
This function is a wrapper around the `sys_write` system call provided by the standard C library. Usually, the system call and the wrapper function take the same set of arguments, so we may safely assume that the `sys_write` system call is defined like this:

* `rdi` - used to pass 1st argument to functions
* `rsi` - pointer used to pass 2nd argument to functions
```C
ssize_t write(int fd, const void buf[.count], size_t count);
Collaborator

Suggested change
ssize_t write(int fd, const void buf[.count], size_t count);
size_t write(int fd, const void buf[.count], size_t count);

content/asm_1.md Outdated
Comment on lines 195 to 199
The function expects the following three arguments:

* `fd` - The file descriptor where to write data.
* `buf` - The pointer to the buffer from which data will be send to the output.
* `count` - The number of bytes to be written from the buffer to the file specified by the file descriptor from the first argument.
Collaborator

Suggested change
The function expects the following three arguments:
* `fd` - The file descriptor where to write data.
* `buf` - The pointer to the buffer from which data will be send to the output.
* `count` - The number of bytes to be written from the buffer to the file specified by the file descriptor from the first argument.
The function expects three arguments:
* `fd` - The file descriptor that specifies where to write data.
* `buf` - The pointer to the buffer from which data is sent to the output.
* `count` - The number of bytes written from the buffer to the file specified by the file descriptor from the first argument.
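
The three arguments can be tried out from C before moving to assembly. The sketch below is an assumption-laden illustration, not part of the chapter: `write_through_pipe` is a hypothetical helper that writes `count` bytes of `buf` into a pipe (so `fd` is the pipe's write end) and reads them back, which shows that `count` alone decides how much of the buffer is sent.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes the first `count` bytes of `buf` into a pipe, then reads them back
   into `out`. Returns the number of bytes read back, or -1 on error. */
static ssize_t write_through_pipe(const void *buf, size_t count,
                                  char *out, size_t out_size)
{
    int fds[2];                       /* fds[0] = read end, fds[1] = write end */
    if (pipe(fds) != 0)
        return -1;

    /* write(fd, buf, count): where to write, what to write, how many bytes */
    ssize_t written = write(fds[1], buf, count);
    close(fds[1]);

    ssize_t got = -1;
    if (written >= 0)
        got = read(fds[0], out, out_size);
    close(fds[0]);
    return got;
}
```

Calling it with the buffer `"hello, world!"` and a `count` of `5` should deliver only `hello` to the reader, while a `count` of `13` delivers the whole string.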

content/asm_1.md Outdated
* `buf` - The pointer to the buffer from which data will be send to the output.
* `count` - The number of bytes to be written from the buffer to the file specified by the file descriptor from the first argument.

Now we can understand that the first four lines of the assembly code basically do the two following things:
Collaborator

Suggested change
Now we can understand that the first four lines of the assembly code basically do the two following things:
Now we can understand that the first four lines of the assembly code do two things:

content/asm_1.md Outdated
Comment on lines 203 to 204
- Specify the number of the system call (the `sys_write` in our example) that we are going to call.
- Specify the arguments of the `sys_write` system call.
Collaborator

Suggested change
- Specify the number of the system call (the `sys_write` in our example) that we are going to call.
- Specify the arguments of the `sys_write` system call.
- Specify the number of the system call (the `sys_write` in our example) that we will call.
- Specify the arguments of the `sys_write` system call.

content/asm_1.md Outdated
* `fd` - file descriptor. Can be 0, 1 and 2 for standard input, standard output and standard error
* `buf` - points to a character array, which can be used to store content obtained from the file pointed to by fd.
* `count` - specifies the number of bytes to be written from the file into the character array
Check the system call table we can know that the `sys_write` system call has the number - `1`. Since the `rax` register should contain the number of the system call that we are going to call, we put `1` into it. After this we put `1` to the `rdi` register. That will be the first argument of the `sys_write`. In our case we want to write the `hello world` string in the terminal, so we put `1` which specifies [standard output](https://en.wikipedia.org/wiki/Standard_streams). The next step is to prepare the second argument of the `sys_write` system call. In our case we pass the address of the `msg` constant to the `rsi` register. At the last but not least step we should specify the length of data we want to write. The length of the `hello, world!` string is `13` bytes, so we pass it to the `rdx` register.
Collaborator

Suggested change
Check the system call table we can know that the `sys_write` system call has the number - `1`. Since the `rax` register should contain the number of the system call that we are going to call, we put `1` into it. After this we put `1` to the `rdi` register. That will be the first argument of the `sys_write`. In our case we want to write the `hello world` string in the terminal, so we put `1` which specifies [standard output](https://en.wikipedia.org/wiki/Standard_streams). The next step is to prepare the second argument of the `sys_write` system call. In our case we pass the address of the `msg` constant to the `rsi` register. At the last but not least step we should specify the length of data we want to write. The length of the `hello, world!` string is `13` bytes, so we pass it to the `rdx` register.
By checking the system call table, we know the `sys_write` system call has the number `1`. Since the `rax` register should contain the system call number, we put `1` into it. Then, we put `1` in the `rdi` register. That will be the first argument of `sys_write`. We want to write the `hello world` string in the terminal, so we put `1` which specifies [standard output](https://en.wikipedia.org/wiki/Standard_streams). The next step is to prepare the second argument of the `sys_write` system call. In our case, we pass the address of the `msg` constant to the `rsi` register. Last but not least, we should specify the length of data we want to write. The length of the `hello, world!` string is `13` bytes, so we pass it to the `rdx` register.
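
The register setup described in this paragraph can also be written as a C sketch with inline assembly, which may help readers who know C better than NASM. It assumes GCC or Clang extended asm on Linux `x86_64`; `raw_write` and `print_hello` are hypothetical names, and other platforms fall back to the libc `syscall()` helper.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes syscall() in <unistd.h> on glibc */
#endif
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Issues sys_write with the same register assignment as the assembly code:
   rax = system call number, rdi = 1st, rsi = 2nd, rdx = 3rd argument. */
static long raw_write(long fd, const void *buf, size_t count)
{
#if defined(__x86_64__) && defined(__linux__)
    long ret;
    __asm__ volatile (
        "syscall"
        : "=a"(ret)                  /* the result comes back in rax */
        : "a"(1L),                   /* rax = 1, the number of sys_write */
          "D"(fd),                   /* rdi = file descriptor */
          "S"(buf),                  /* rsi = address of the buffer */
          "d"(count)                 /* rdx = number of bytes to write */
        : "rcx", "r11", "memory");   /* the syscall instruction clobbers rcx and r11 */
    return ret;                      /* bytes written, or a negative errno value */
#else
    /* Assumption: on other platforms, let libc issue the call instead. */
    return syscall(SYS_write, fd, buf, count);
#endif
}

/* Same values as in the chapter: fd 1 (standard output), msg, 13 bytes. */
static long print_hello(void)
{
    static const char msg[] = "hello, world!";
    return raw_write(1, msg, sizeof msg - 1);
}
```

On success `print_hello()` should return `13`, the number of bytes passed in `rdx`.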

content/asm_1.md Outdated

So we know that `sys_write` syscall takes three arguments and has number one in syscall table. Let's look again to our hello world implementation. We put 1 to rax register, it means that we will use sys_write system call. In next line we put 1 to rdi register, it will be first argument of `sys_write`, 1 - standard output. Then we store pointer to msg at rsi register, it will be second buf argument for sys_write. And then we pass the last (third) parameter (length of string) to rdx, it will be third argument of sys_write. Now we have all arguments of the `sys_write` and we can call it with syscall function at 11 line. Ok, we printed "Hello world" string, now need to do correctly exit from program. We pass 60 to rax register, 60 is a number of exit syscall. And pass also 0 to rdi register, it will be error code, so with 0 our program must exit successfully. That's all for "Hello world". Quite simple :) Now let's build our program. For example we have this code in hello.asm file. Then we need to execute following commands:
As all parameters of the `sys_write` system call is ready, now we can to call the system call itself. It could be done with the `syscall` instruction. That already should print the `hello, world!` string in our terminal. But if you will build and run only these instructions, you will see the [segmentation fault](https://en.wikipedia.org/wiki/Segmentation_fault) error. The problem is that we need to exit properly from the program. To do that, we have to call the `sys_exit` system call. We need to do the same - fill the `rax` with the number of the `sys_exit` system call and fill the respective registers with the parameters needed for this system call. Let's take a look at the system call [table](https://github.com/torvalds/linux/blob/master/arch/x86/entry/syscalls/syscall_64.tbl):
Collaborator

Suggested change
As all parameters of the `sys_write` system call is ready, now we can to call the system call itself. It could be done with the `syscall` instruction. That already should print the `hello, world!` string in our terminal. But if you will build and run only these instructions, you will see the [segmentation fault](https://en.wikipedia.org/wiki/Segmentation_fault) error. The problem is that we need to exit properly from the program. To do that, we have to call the `sys_exit` system call. We need to do the same - fill the `rax` with the number of the `sys_exit` system call and fill the respective registers with the parameters needed for this system call. Let's take a look at the system call [table](https://github.com/torvalds/linux/blob/master/arch/x86/entry/syscalls/syscall_64.tbl):
As all parameters of the `sys_write` system call are ready, we can now call the system call itself. We can do it with the `syscall` instruction, which should already print the `hello, world!` string in our terminal. However, if you build and run only these instructions, you will see the [segmentation fault](https://en.wikipedia.org/wiki/Segmentation_fault) error. The problem is that we need to exit properly from the program. To do that, we have to call the `sys_exit` system call. We need to do the same - fill the `rax` with the number of the `sys_exit` system call and fill the respective registers with the parameters for this system call. Let's take a look at the [system call table](https://github.com/torvalds/linux/blob/master/arch/x86/entry/syscalls/syscall_64.tbl):

Collaborator

if you build and run only these instructions

what instructions?

content/asm_1.md Outdated
60 common exit sys_exit
```

We may see that the number of this system call is `60`, so we put this value into the `rax` register. According to the [exit](https://www.man7.org/linux/man-pages/man2/exit.2.html) documentation, this system call expects to get a single argument which is a exit status code. We expect that our program terminates successfully let's just put `0` to the `rdi` register. Our program is ready. Now let's build our program with the following commands:
Collaborator

Suggested change
We may see that the number of this system call is `60`, so we put this value into the `rax` register. According to the [exit](https://www.man7.org/linux/man-pages/man2/exit.2.html) documentation, this system call expects to get a single argument which is a exit status code. We expect that our program terminates successfully let's just put `0` to the `rdi` register. Our program is ready. Now let's build our program with the following commands:
The system call number is `60`, so we load it into the `rax` register. The [exit](https://www.man7.org/linux/man-pages/man2/exit.2.html) docs say it needs one argument: an exit status code. To indicate success, we put `0` in the `rdi` register. That’s it — our program is ready! Let’s build it with these commands:

content/asm_1.md Outdated
ld -o hello hello.o
```

After this we should have an executable file named `hello`. Let's execute it:
Collaborator

Suggested change
After this we should have an executable file named `hello`. Let's execute it:
Now we should have an executable file named `hello`. Let's run it:
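
To show what running the program looks like from the outside, here is a hedged C sketch of the whole flow: a child process makes the same two system calls as the assembly program (`sys_write`, then `sys_exit`), and the parent reads back the exit status, which is what a shell would report after `./hello`. The names `hello_then_exit` and `run_and_get_status` are hypothetical, and the sketch assumes a Unix-like system with `fork` and the libc `syscall()` helper.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes syscall() in <unistd.h> on glibc */
#endif
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child side: the same two system calls as the assembly program. */
static void hello_then_exit(int status)
{
    static const char msg[] = "hello, world!";
    syscall(SYS_write, 1L, msg, sizeof msg - 1);   /* sys_write(1, msg, 13) */
    syscall(SYS_exit, (long)status);               /* sys_exit(status) */
}

/* Parent side: run the child and return the status it exited with,
   or -1 if the child could not be started or did not exit normally. */
static int run_and_get_status(int status)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        hello_then_exit(status);
        _exit(127);                  /* only reached if sys_exit failed */
    }

    int wstatus = 0;
    if (waitpid(pid, &wstatus, 0) < 0)
        return -1;
    return WIFEXITED(wstatus) ? WEXITSTATUS(wstatus) : -1;
}
```

With `0` passed as the status, `run_and_get_status(0)` should return `0`, matching the value the chapter puts in `rdi` before the final `syscall`.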


3 participants