An efficient and simple static Python to C++ source-to-source compiler.
The motive behind this project comes from there being almost no native transpilers for Python to other languages. At some point in the process, they eventually rely on some interpreter or Python wrapper of some sort (if you want a Python static compiler which is already in stable versions, check out the mypy / mypyc projects).
The only project I managed to find which doesn't do this is the transpyle project, but unfortunately it appears to have been abandoned for a few years now. It also does not include all the Python features, which is what I plan on including in this project.
In the end, this compiler will take Python code and output native (no weird wrapper classes) C++ code. This will give it peak efficiency and will offer other benefits, such as porting Python code to C++ projects, or vice versa.
Open a command prompt and run:
python -m pip install -r requirements.txt
g++
is the C++ compiler that ComPy uses under the hood to turn the transpiled C++
code into a running executable. C++ can't run on its own, it needs to be compiled
to an executable in order for the OS to understand how to read and run it. Therefore,
g++
acts as a "translator" to turn our readable C++ code into computer-readable
machine/binary code.
If you are on Windows and don't have g++
installed, then you can install it using
the MinGW Project. You should be able to
download and install it via the Cygwin installer, by
following the instructions on the MinGW downloads page under the "Cygwin" section.
After that, make sure to add g++
to your Windows PATH so that you can access and run
g++
from anywhere. You can verify this by opening a CMD shell and running:
g++ --version
... which should display your installed version of the compiler.
If you are on Ubuntu, then you can easily install g++
using the following commands:
sudo apt-get update
sudo apt-get install g++
upx
is a tool used by ComPy to compress the output executable, making the final file
size much smaller. This can make it easier to transfer the executable between computers,
or to upload it online quicker. Do note that by using the upx
compression flag when
compiling with ComPy, the resulting executable will be marginally slower. This is because
the upx
tool will have to decompress all the code at the start, before running it.
In order to install upx
, go to the UPX GitHub Releases page
and download the version that matches your system. For example, if you are using
32-bit Windows, then you should download the upx-VERSION-win32.zip
file, which is
labelled accordingly as UPX - X86 Win32 version
in the Asset/Description table.
After that, you should unzip the file, take out the upx.EXTENSION
(for Windows,
this will be a .exe
) and put it somewhere that you can add to the PATH. After
updating the PATH, make sure that it is properly installed by opening a CMD shell
and running:
upx -V
... to display the version number.
In order to achieve the feat of compiling a duck-typed language, ComPy leverages the aid of type annotations. These annotations must be used on all annotatable names (objects)- variables, functions (return types), and function arguments. However, only the first time an object is initialized, the annotation must be used. This is similar to a lower level langauge:
// This is a snippet of C code
// You must use the type when initializing
int my_var = 1;
// However, when updating the value, no type is needed
my_var = 5;
// Usage of the object also requires no type annotations
printf("%i", my_var);
The above code initializes an integer variable with the name
my_var
, assigns it the initial value of 1
, then later
replaces this value with the number 5
. Afterwards, it prints
the value of my_var
to the screen. If we were to reimplement
this code in the ComPy syntax subset, it would look like so:
# Use type hint on initialization
my_var: int = 1
# No need for type hint afterwards
my_var = 5
# Again, usage does not need type hints
print(my_var)
Help menu, describes all command-line arguments:
usage: compy.py [-h] [-o OUTPUT] [-l LINKS] [-g] [-c] [-dg] [-dt] [-di] file
positional arguments:
file The file to compile
optional arguments:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
The file to output the ASM code to
-l LINKS, --links LINKS
Links the ported libraries to the executable (seperate with the ; character)
-g, --compile Compiles the output to an executable (you must have g++ installed and on the PATH)
-c, --compress Compresses the output executable (you must have UPX installed and on the PATH)
-dg, --debug-gui Opens the debugging GUI, mainly used to display information about the AST
-dt, --debug-text Prints out more logging information, mainly the AST tree in text form
-di, --debug-image Renders the AST as an image
Basic compilation (transpilation) of Python code:
python compy.py examples\test_code.py
Compile the Python code in examples\test_code.py
and output
the C++ code to the file examples\test_code.cpp
:
python compy.py -o examples\test_code.cpp examples\test_code.py
Compile the Python code to a native executable (requires you to have g++
installed!):
python compy.py -g examples\test_code.py
This section will primarily explain how "ported objects" work, and how you can implement your own. Let's start with the defenition- in ComPy, a "ported object" (or "port", as it might be called) is a snippet of code in the native langauge (C++), which has the capability to be interacted with via the Python code. What this all means in the end is that you can write Python code, and you can "inject" snippets of native C++ wherever you'd like.
The benefit of this is that we can now not only interact with our Python code by transpiling it into C++, but we can now do the opposite by taking C++ code and "turning it into Python" (it's not actually transpiling the native code to Python, it simply allows for the transpiler to link these objects together again when it's time to transpile back to native code).
That's a lot of explanation- let's see some examples. If you
view examples/example_port.py
, you'll see an example ported
library. One of the objects in that library is an example
"addition" operator function called add
:
def add(number_one: int, number_two: int) -> int:
"""
Adds two numbers.
"""
A few lines down, we then add it to the object storage like so:
ported_objs: Dict[str, PyPortFunctionSignature] = {
"add": PyPortFunctionSignature(
function=add,
code="return number_one + number_two;"
)
}
In the above snippet, we assigned a Python function with the
name add
to a ported function. This ported function has a
reference to the function we defined earlier, also named
add
(in the function
parameter). Now, ComPy knows the
function's signature (arguments and return type). Finally,
the last thing we need to specify is what the ported
function does in native code. We specify this in the code
parameter, by writing the code that the function will run.
Finally, in the test script examples\test_code.py
, we call
the function (the line above it is so that PyCharm ignores it,
as we am technically calling a function that does not exist):
# noinspection PyUnresolvedReferences
c = add(c, b)
Now, when this code segment is compiled, ComPy spits out the following snippet to the output (the valid C++ code equivalent):
int add(int number_one, int number_two){return number_one + number_two;}
...
c = add(c,b);