I tried to make the code of the compiler and vlib as simple and readable as possible. One of V's goals is to be open to developers with different levels of experience in compiler development. Compilers don't need to be black boxes full of magic that only a few people understand.
The compiler itself is located in `vlib/compiler/`. It's a module that can be used by other applications.
The main files are:

- `v.v` and `vlib/compiler/main.v`. The entry point.
  - V figures out the build mode.
  - Constructs the compiler object (`struct V`).
  - Creates a list of .v files that need to be parsed.
  - Creates a parser object for each file and runs `parse()` on them (this should work concurrently in the future). The parser emits C or x64 code directly. For performance reasons, there are no intermediate steps (no AST or Assembly code generation).
  - If the parsing is successful, a single C file is generated by merging the output from the parsers and carefully arranging all definitions (C is a single pass language, so definitions have to appear before they are used).
  - Finally, a C compiler is called to compile this C file and generate an executable or a library.
- `parser.v`. The core of the compiler. This is the largest file (~3.5k loc). The `parse()` method asks the scanner to generate a list of tokens for the file it needs to parse. Then it simply goes through all the tokens one by one.

  In V, objects can be used before declaration, so there are 2 passes. During the first pass, it only looks at declarations and skips function bodies. It memorizes all function signatures, types, consts, etc. During the second pass it looks at function bodies and generates C (e.g. `cgen('if ($expr) {')`) or machine code (e.g. `gen.mov(EDI, 1)`).

  The formatter is embedded in the parser. Correctly formatted tokens are emitted as they are parsed. This allowed us to simplify the compiler and avoid duplication, but slowed it down a bit. In the future, this will be fixed with build flags and separate binaries for C generation, machine code generation, and formatting. This way there will be no unnecessary branching and function calls.
- `scanner.v`. The scanner's job is to parse a list of characters and convert them to tokens. It also takes care of string interpolation, which is a mess at the moment.
- `token.v`. This is simply a list of all tokens, their string values, and a couple of helper functions.
- `table.v`. V creates one table object that is shared by all parsers. It contains all types, consts, and functions, as well as several helpers to search for objects by name, register new objects, modify types' fields, etc.
- `cgen.v`. The small `Cgen` struct helps generate C code. It's also shared by all parsers. It has a couple of functions that allow going back and setting something that was previously unknown (like with `a := 0` => `int a = 0;`). Some of these functions are hacky and need improvements and simplifications.
- `fn.v`. Handles declaring and calling normal and async functions and methods. This file is about 1000 lines of code, and has some complex logic. It needs to be cleaned up and simplified a bit.
- `json.v`. Defines the JSON code generation. This file will be removed once V supports comptime code generation, and it will be possible to do this using the language's tools.
- `x64/` is the directory with all the machine code generation logic. It's not released yet. Obviously this is the most complex part of the compiler. It defines a set of functions that translate assembly instructions to machine code, and it builds complicated binaries from scratch, byte by byte. It manually builds all headers, segments, sections, symtable, relocations, etc. Right now it only has basic support for the x64 platform/Mach-O format, and it can only generate `.o` files, which then have to be linked with `lld`.
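The two passes described for `parser.v` can be sketched roughly as follows. This is a minimal illustration in Python, not V's actual code: the toy grammar (`fn name() { callee() }`), the `Table` class, and every function name here are assumptions made up for the example. It shows a scanner producing tokens, a first pass that records function names while skipping bodies, and a second pass that emits C directly without building an AST.

```python
import re

# Hypothetical toy grammar, NOT V's real one: fn NAME ( ) { NAME ( ) ... }
TOKEN_RE = re.compile(r"\s*([A-Za-z_]\w*|[(){}])")


def scan(src):
    """Scanner: convert a string of characters into a flat list of tokens."""
    tokens, pos = [], 0
    src = src.rstrip()
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class Table:
    """Symbol table shared by both passes (only function names here)."""

    def __init__(self):
        self.fns = set()


def skip_body(tokens, i):
    """With tokens[i] == '{', return the index just past the matching '}'."""
    depth = 0
    while True:
        if tokens[i] == "{":
            depth += 1
        elif tokens[i] == "}":
            depth -= 1
            if depth == 0:
                return i + 1
        i += 1


def parse(tokens, table, first_pass, out):
    """Walk the tokens one by one; emit C into `out` on the second pass."""
    i = 0
    while i < len(tokens):
        if tokens[i] != "fn":
            raise SyntaxError(f"expected 'fn', got {tokens[i]!r}")
        name = tokens[i + 1]
        i += 4  # skip: fn NAME ( )
        if first_pass:
            table.fns.add(name)       # memorize the declaration
            i = skip_body(tokens, i)  # do not look inside the body
            continue
        out.append(f"void {name}(void) {{")
        i += 1  # consume '{'
        while tokens[i] != "}":
            callee = tokens[i]
            if callee not in table.fns:
                raise NameError(f"unknown function: {callee}")
            out.append(f"  {callee}();")
            i += 3  # skip: NAME ( )
        out.append("}")
        i += 1  # consume '}'


def compile_source(src):
    tokens = scan(src)
    table, out = Table(), []
    parse(tokens, table, first_pass=True, out=out)
    parse(tokens, table, first_pass=False, out=out)
    # Prototypes go first so the C compiler sees every name before its use.
    protos = [f"void {name}(void);" for name in sorted(table.fns)]
    return "\n".join(protos + out)
```

Calling `compile_source("fn run() { helper() } fn helper() { }")` succeeds even though `helper` is declared after it is used, because the first pass has already registered it in the table.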
The rest of the directories are vlib modules: `builtin/` (strings, arrays, maps), `time/`, `os/`, etc. Their documentation is pretty clear.
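The `cgen.v` entry above mentions going back to set something that was previously unknown, as with `a := 0` becoming `int a = 0;`. One way such backpatching could work is sketched below in Python. The `CGen` class, its methods (`mark`, `insert_at`, `end_line`), and the `infer_c_type` helper are hypothetical names invented for this illustration, and the type inference only covers integer and double-quoted string literals; V's real `Cgen` struct may work differently.

```python
class CGen:
    """Hypothetical output buffer, loosely modelled on the Cgen description."""

    def __init__(self):
        self.lines = []  # finished lines of C
        self.cur = ""    # the line currently being built

    def gen(self, code):
        """Append code to the current line."""
        self.cur += code

    def mark(self):
        """Remember the current offset so it can be patched later."""
        return len(self.cur)

    def insert_at(self, pos, code):
        """Go back and insert code at a previously marked offset."""
        self.cur = self.cur[:pos] + code + self.cur[pos:]

    def end_line(self):
        self.lines.append(self.cur)
        self.cur = ""


def infer_c_type(literal):
    # Assumption for this sketch: only two kinds of literal exist.
    if literal.isdigit():
        return "int"
    if literal.startswith('"'):
        return "char*"
    raise TypeError(f"cannot infer a type for {literal!r}")


def gen_var_decl(cg, name, literal):
    """Emit C for `name := literal` without knowing the type up front."""
    pos = cg.mark()      # the type is still unknown here
    cg.gen(f"{name} = ")
    cg.gen(literal)      # "parsing" the expression reveals its type
    cg.insert_at(pos, infer_c_type(literal) + " ")
    cg.gen(";")
    cg.end_line()
```

With `cg = CGen()`, calling `gen_var_decl(cg, "a", "0")` leaves `cg.lines` equal to `["int a = 0;"]`: the name and value are written first, and the type is patched in front once it is known.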
(provided by @spytheman)
(If you don't already have a GitHub account, please create one. Your GitHub username will be referred to later as 'YOUR_GITHUB_USERNAME'. Change it accordingly in the steps below.)
1. Clone https://github.com/vlang/v in a folder, say nv (`git clone https://github.com/vlang/v nv`).
2. `cd nv`
3. `git remote add pullrequest git@github.com:YOUR_GITHUB_USERNAME/v.git`
   (NOTE: this is your own forked repo of https://github.com/vlang/v. After this, we just do normal git operations such as `git pull` and so on.)
4. When finished with a feature/bugfix, you can: `git checkout -b fix_alabala`
5. `git push pullrequest`
   (NOTE: the pullrequest remote was set up in step 3.)
6. On GitHub's web interface, go to https://github.com/vlang/v/pulls. Here the UI shows a nice dialog with a button to make a new pull request based on the newly pushed branch. (Example dialog: https://url4e.com/gyazo/images/364edc04.png)
7. After making your pull request (aka PR), you can continue to work on the branch; just do step 5 when you have more commits.
8. If there are merge conflicts, or a branch lags too far behind V's master, you can do the following:
   - `git checkout master`
   - `git pull`
   - `git checkout fix_alabala`
   - `git rebase master` (solve conflicts and do `git rebase --continue`)
   - `git push pullrequest -f`

The point of the above steps is to never push directly to the main V repository, only to your own fork. Since your local master branch tracks the main V repository's master, `git checkout master; git pull --rebase origin master` works as expected (this is actually used by `v up`) and can always do so cleanly. Git is very flexible, so there may be simpler/easier ways to accomplish the same thing.