Improve first chapter #27
base: master
42a39fc
to
ce606fa
Compare
Hi! I started working on rewording the first article, but then saw you already had a big head start.
- `rsi` - Used to pass `2nd` argument to a function. | ||
- `rdx` - Used to pass `3rd` argument to a function. | ||
There is more details related to the Linux `x86_64` calling conventions but the description above should be enough for now. Knowing the meaning and the way of use of these registers we can return to the code. What do we need to write a `hello world` program? Usually we just pass a `hello world` string to a library function like [printf](https://en.wikipedia.org/wiki/Printf) or so. But these functions usually goes from a [standard library](https://en.wikipedia.org/wiki/Standard_library) of a programming languages we are using. Assembly does not have a standard library. What to do in this case? Well, we have at least the two following approaches: |
There is more details related to the Linux `x86_64` calling conventions but the description above should be enough for now. Knowing the meaning and the way of use of these registers we can return to the code. What do we need to write a `hello world` program? Usually we just pass a `hello world` string to a library function like [printf](https://en.wikipedia.org/wiki/Printf) or so. But these functions usually goes from a [standard library](https://en.wikipedia.org/wiki/Standard_library) of a programming languages we are using. Assembly does not have a standard library. What to do in this case? Well, we have at least the two following approaches: | |
This knowledge about Linux `x86_64` calling conventions should be enough for now. Knowing the meaning and the usage of the registers, we can return to the code. What do we need to write a `hello world` program? Usually, we just pass a `hello world` string to a library function like [printf](https://en.wikipedia.org/wiki/Printf). But this function usually goes from a [standard library](https://en.wikipedia.org/wiki/Standard_library) of a programming language we use. Assembly does not have a standard library. What to do in this case? Well, we have at least two options: |
this function usually goes from...
I don't like "goes from", can we replace it with another verb?
- Link our assembly program with C standard library and use [printf](https://man7.org/linux/man-pages/man3/printf.3.html) or any other function that may help us to write a text to the [standard output](https://en.wikipedia.org/wiki/Standard_streams). | ||
- Use the operating system API | ||
We will go through the second way. Each operating system provides an interface that a user level application may use to interact with the operating system. Usually the functions of this API are called `system calls`. Linux kernel also provides set of system calls to interact with it. The full list of system calls with the respective numbers for the Linux `x86_64` could be found [here](https://github.com/torvalds/linux/blob/master/arch/x86/entry/syscalls/syscall_64.tbl). Looking in this table, we may see: |
Maybe it's good to mention why we do not explain the first option?
Signed-off-by: Alexander Kuleshov <[email protected]>
eb6f132
to
4b8ead1
Compare
content/asm_1.md
Outdated
``` | ||
Ok, CPU performs some operations, arithmetical and etc... But where can it get data for this operations? The first answer in memory. However, reading data from and storing data into memory slows down the processor, as it involves complicated processes of sending the data request across the control bus. Thus CPU has own internal memory storage locations called registers: | ||
The information about the given system call could be found in manual pages. To get information about the `sys_write` system_call we can execute the following command in terminal: |
The information about the given system call could be found in manual pages. To get information about the `sys_write` system_call we can execute the following command in terminal: | |
You can find information about the given system call in manual pages. To get information about the `sys_write` system call, run the following command in the terminal: |
Manual pages? Maybe we could add a link?
https://en.wikipedia.org/wiki/Man_page
content/asm_1.md
Outdated
In another words we just make a call of `sys_write` syscall. Take a look on `sys_write`: | ||
which is basically a wrapper around the `sys_write` system call provided by the standard C library. Usually the set of arguments of the system call and the wrapper function is the same. So we safely may assume that the `sys_write` system call is defined like that: |
which is basically a wrapper around the `sys_write` system call provided by the standard C library. Usually the set of arguments of the system call and the wrapper function is the same. So we safely may assume that the `sys_write` system call is defined like that: | |
This function is a wrapper around the `sys_write` system call provided by the standard C library. Usually, the set of arguments of the system call and the wrapper function is the same. So we may safely assume that the `sys_write` system call is defined like this: |
* `rdi` - used to pass 1st argument to functions | ||
* `rsi` - pointer used to pass 2nd argument to functions | ||
```C | ||
ssize_t write(int fd, const void buf[.count], size_t count); |
ssize_t write(int fd, const void buf[.count], size_t count); | |
size_t write(int fd, const void buf[.count], size_t count); |
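A note on the return type, offered as a sketch rather than a definitive answer: the `write(2)` man page declares the wrapper as returning `ssize_t`, a signed type, because the call returns `-1` on error. The C snippet below shows why the sign matters. The helper name `write_checked` is hypothetical and only used for illustration.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of the chapter): wraps write(2) and
 * reports 0 on a full write, 1 on a short write, -1 on error. */
static int write_checked(int fd, const void *buf, size_t count)
{
    /* The result must be signed so that the -1 error value is visible. */
    ssize_t n = write(fd, buf, count);

    if (n < 0)
        return -1;                 /* write failed, errno is set */

    /* A successful write may still transfer fewer bytes than requested. */
    return ((size_t)n == count) ? 0 : 1;
}
```

If the result were stored in an unsigned `size_t`, the `n < 0` check could never be true and the error case would go unnoticed.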
content/asm_1.md
Outdated
The function expects the following three arguments: | ||
* `fd` - The file descriptor where to write data. | ||
* `buf` - The pointer to the buffer from which data will be send to the output. | ||
* `count` - The number of bytes to be written from the buffer to the file specified by the file descriptor from the first argument. |
The function expects the following three arguments: | |
* `fd` - The file descriptor where to write data. | |
* `buf` - The pointer to the buffer from which data will be send to the output. | |
* `count` - The number of bytes to be written from the buffer to the file specified by the file descriptor from the first argument. | |
The function expects three arguments: | |
* `fd` - The file descriptor that specifies where to write data. | |
* `buf` - The pointer to the buffer from which data is sent to the output. | |
* `count` - The number of bytes written from the buffer to the file specified by the file descriptor from the first argument. |
content/asm_1.md
Outdated
* `buf` - The pointer to the buffer from which data will be send to the output. | ||
* `count` - The number of bytes to be written from the buffer to the file specified by the file descriptor from the first argument. | ||
Now we can understand that the first four lines of the assembly code basically do the two following things: |
Now we can understand that the first four lines of the assembly code basically do the two following things: | |
Now we can understand that the first four lines of the assembly code do two things: |
content/asm_1.md
Outdated
- Specify the number of the system call (the `sys_write` in our example) that we are going to call. | ||
- Specify the arguments of the `sys_write` system call. |
- Specify the number of the system call (the `sys_write` in our example) that we are going to call. | |
- Specify the arguments of the `sys_write` system call. | |
- Specify the number of the system call (the `sys_write` in our example) that we will call. | |
- Specify the arguments of the `sys_write` system call. |
content/asm_1.md
Outdated
* `fd` - file descriptor. Can be 0, 1 and 2 for standard input, standard output and standard error | ||
* `buf` - points to a character array, which can be used to store content obtained from the file pointed to by fd. | ||
* `count` - specifies the number of bytes to be written from the file into the character array | ||
Check the system call table we can know that the `sys_write` system call has the number - `1`. Since the `rax` register should contain the number of the system call that we are going to call, we put `1` into it. After this we put `1` to the `rdi` register. That will be the first argument of the `sys_write`. In our case we want to write the `hello world` string in the terminal, so we put `1` which specifies [standard output](https://en.wikipedia.org/wiki/Standard_streams). The next step is to prepare the second argument of the `sys_write` system call. In our case we pass the address of the `msg` constant to the `rsi` register. At the last but not least step we should specify the length of data we want to write. The length of the `hello, world!` string is `13` bytes, so we pass it to the `rdx` register. |
Check the system call table we can know that the `sys_write` system call has the number - `1`. Since the `rax` register should contain the number of the system call that we are going to call, we put `1` into it. After this we put `1` to the `rdi` register. That will be the first argument of the `sys_write`. In our case we want to write the `hello world` string in the terminal, so we put `1` which specifies [standard output](https://en.wikipedia.org/wiki/Standard_streams). The next step is to prepare the second argument of the `sys_write` system call. In our case we pass the address of the `msg` constant to the `rsi` register. At the last but not least step we should specify the length of data we want to write. The length of the `hello, world!` string is `13` bytes, so we pass it to the `rdx` register. | |
By checking the system call table, we know the `sys_write` system call has the number `1`. Since the `rax` register should contain the system call number, we put `1` into it. Then, we put `1` in the `rdi` register. That will be the first argument of `sys_write`. We want to write the `hello world` string in the terminal, so we put `1` which specifies [standard output](https://en.wikipedia.org/wiki/Standard_streams). The next step is to prepare the second argument of the `sys_write` system call. In our case, we pass the address of the `msg` constant to the `rsi` register. Last but not least, we should specify the length of data we want to write. The length of the `hello, world!` string is `13` bytes, so we pass it to the `rdx` register. |
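The register setup described in this paragraph maps one-to-one onto the generic `syscall(2)` wrapper from glibc, which may help readers who know C better than assembly. A minimal sketch, assuming Linux `x86_64` and glibc; the function name `raw_write` is hypothetical:

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: issues sys_write without going through write(2).
 * On Linux x86_64 the wrapper ends up with:
 *   rax = SYS_write (1), rdi = fd, rsi = buf, rdx = count
 * and then executes the `syscall` instruction. */
static ssize_t raw_write(int fd, const void *buf, size_t count)
{
    return (ssize_t)syscall(SYS_write, fd, buf, count);
}
```

Calling `raw_write(1, "hello, world!", 13)` corresponds to the four `mov` instructions followed by `syscall` in the assembly version: file descriptor `1` is standard output and `13` is the length of the string in bytes.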
content/asm_1.md
Outdated
So we know that `sys_write` syscall takes three arguments and has number one in syscall table. Let's look again to our hello world implementation. We put 1 to rax register, it means that we will use sys_write system call. In next line we put 1 to rdi register, it will be first argument of `sys_write`, 1 - standard output. Then we store pointer to msg at rsi register, it will be second buf argument for sys_write. And then we pass the last (third) parameter (length of string) to rdx, it will be third argument of sys_write. Now we have all arguments of the `sys_write` and we can call it with syscall function at 11 line. Ok, we printed "Hello world" string, now need to do correctly exit from program. We pass 60 to rax register, 60 is a number of exit syscall. And pass also 0 to rdi register, it will be error code, so with 0 our program must exit successfully. That's all for "Hello world". Quite simple :) Now let's build our program. For example we have this code in hello.asm file. Then we need to execute following commands: | ||
As all parameters of the `sys_write` system call is ready, now we can to call the system call itself. It could be done with the `syscall` instruction. That already should print the `hello, world!` string in our terminal. But if you will build and run only these instructions, you will see the [segmentation fault](https://en.wikipedia.org/wiki/Segmentation_fault) error. The problem is that we need to exit properly from the program. To do that, we have to call the `sys_exit` system call. We need to do the same - fill the `rax` with the number of the `sys_exit` system call and fill the respective registers with the parameters needed for this system call. Let's take a look at the system call [table](https://github.com/torvalds/linux/blob/master/arch/x86/entry/syscalls/syscall_64.tbl): |
As all parameters of the `sys_write` system call is ready, now we can to call the system call itself. It could be done with the `syscall` instruction. That already should print the `hello, world!` string in our terminal. But if you will build and run only these instructions, you will see the [segmentation fault](https://en.wikipedia.org/wiki/Segmentation_fault) error. The problem is that we need to exit properly from the program. To do that, we have to call the `sys_exit` system call. We need to do the same - fill the `rax` with the number of the `sys_exit` system call and fill the respective registers with the parameters needed for this system call. Let's take a look at the system call [table](https://github.com/torvalds/linux/blob/master/arch/x86/entry/syscalls/syscall_64.tbl): | |
As all parameters of the `sys_write` system call are ready, we can now call the system call itself. We can do it with the `syscall` instruction, which should already print the `hello, world!` string in our terminal. However, if you build and run only these instructions, you will see the [segmentation fault](https://en.wikipedia.org/wiki/Segmentation_fault) error. The problem is that we need to exit properly from the program. To do that, we have to call the `sys_exit` system call. We need to do the same - fill the `rax` with the number of the `sys_exit` system call and fill the respective registers with the parameters for this system call. Let's take a look at the [system call table](https://github.com/torvalds/linux/blob/master/arch/x86/entry/syscalls/syscall_64.tbl): |
if you build and run only these instructions
what instructions?
content/asm_1.md
Outdated
60 common exit sys_exit | ||
``` | ||
We may see that the number of this system call is `60`, so we put this value into the `rax` register. According to the [exit](https://www.man7.org/linux/man-pages/man2/exit.2.html) documentation, this system call expects to get a single argument which is a exit status code. We expect that our program terminates successfully let's just put `0` to the `rdi` register. Our program is ready. Now let's build our program with the following commands: |
We may see that the number of this system call is `60`, so we put this value into the `rax` register. According to the [exit](https://www.man7.org/linux/man-pages/man2/exit.2.html) documentation, this system call expects to get a single argument which is a exit status code. We expect that our program terminates successfully let's just put `0` to the `rdi` register. Our program is ready. Now let's build our program with the following commands: | |
The system call number is `60`, so we load it into the `rax` register. The [exit](https://www.man7.org/linux/man-pages/man2/exit.2.html) docs say it needs one argument: an exit status code. To indicate success, we put `0` in the `rdi` register. That’s it — our program is ready! Let’s build it with these commands: |
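To make the exit status visible, here is a small C sketch (assuming Linux and glibc) that performs `sys_exit` in a child process and lets the parent read back the status code. The helper name `exit_status_of_child` is hypothetical and not part of the chapter.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks, calls sys_exit(status) in the child and
 * returns the exit status seen by the parent, or -1 on failure. */
static int exit_status_of_child(int status)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* On x86_64: rax = SYS_exit (60), rdi = status. */
        syscall(SYS_exit, status);
        _exit(127);                /* not reached if sys_exit succeeds */
    }

    int wstatus = 0;
    if (waitpid(pid, &wstatus, 0) != pid)
        return -1;

    return WIFEXITED(wstatus) ? WEXITSTATUS(wstatus) : -1;
}
```

The value placed in `rdi` is what a shell reports in `$?` after the program finishes, so `0` signals success.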
content/asm_1.md
Outdated
ld -o hello hello.o | ||
``` | ||
After this we should have an executable file named `hello`. Let's execute it: |
After this we should have an executable file named `hello`. Let's execute it: | |
Now we should have an executable file named `hello`. Let's run it: |
This PR provides significant rework for the first chapter.