This repo consists of a set of canonical hello world programs in multiple languages and a GitHub action that compiles them and summarizes the size information.
Everything is open source - inspect how the data is generated by the GitHub Action by looking at the action definition and inspecting the logs.
Round 4 was carried out on August 27, 2024. Binaries are here.
Language | Size (kB) | Compiler | Notes |
---|---|---|---|
Assembly | 2 | Microsoft Macro Assembler Version 14.40.33813.0 | |
Zig | 9 | 0.14.0-dev.1307+849c31a6c | |
C | 134 | Microsoft C/C++ Optimizing Compiler Version 19.40.33813 for x64 | |
Nim | 181 | Nim Compiler Version 2.0.8 [Windows: amd64] | |
C++ | 208 | Microsoft C/C++ Optimizing Compiler Version 19.40.33813 for x64 | |
V | 222 | V 0.4.7 426205e | |
Rust | 234 | 1.80.1 (3f5fd8dd4 2024-08-06) | |
D | 551 | DMD64 D Compiler v2.109.1 | |
C# | 945 | 9.0.100-preview.7.24407.12 | |
Scala | 1439 | Scala 3.2.2 Scala-native 0.4.10 | |
Crystal | 1994 | 1.13.2 [879ec12] | |
Go | 2174 | go version go1.23.0 windows/amd64 | |
F# | 3207 | 9.0.100-preview.7.24407.12 | |
Dart | 4620 | Dart SDK version: 3.5.1 (stable) (Tue Aug 13 21:02:17 2024 +0000) on "windows_x64" | |
Swift | 6397 | Swift version 5.10.1 (swift-5.10.1-RELEASE) | Includes: swiftCore.dll, vcruntime140.dll, vcruntime140_1.dll, msvcp140.dll |
Java | 6553 | Oracle GraalVM 22.0.2+9.1 | Includes vcruntime140.dll |
Kotlin | 6577 | kotlinc-jvm 1.8.10 | Includes vcruntime140.dll. AOT-compiled with GraalVM, same version as the Java benchmark |
Haskell | 11388 | The Glorious Glasgow Haskell Compilation System, version 9.6.2 | |
- The program should be compiled ahead of time. Languages that are jitted/interpreted by default should be compiled ahead of time the canonical way.
- The program should print Hello World the idiomatic way, using the standard library that comes with the language. Ideally, use the same snippet found in the official introductory tutorial for the language.
- The program should be compiled using the default compiler settings.
- You can enable optimizations if optimizations are not enabled by default. If the compiler has a master switch "enable optimizations", that switch should be used. You can set the optimization preference to size optimizations, provided they don't change semantics.
- The program should run on a vanilla OS install. Ideally, it should link non-OS dependencies statically. If that's not possible, the reported size will include the size of all dynamic libraries not provided by a vanilla OS install.
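
The last rule amounts to simple arithmetic: the reported number is the executable plus every non-OS library that has to ship with it. The sketch below illustrates that sum. It is not the script the GitHub Action runs; the function name `reported_size_kb`, the file names, and the choice of 1 kB = 1024 bytes rounded down are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path
from typing import Iterable


def reported_size_kb(executable: Path, bundled_libraries: Iterable[Path] = ()) -> int:
    """Size of the executable plus every library that must ship with it, in whole kB."""
    total_bytes = sum(p.stat().st_size for p in (executable, *bundled_libraries))
    return total_bytes // 1024  # assumption: 1 kB = 1024 bytes, rounded down


if __name__ == "__main__":
    # Hypothetical files standing in for a real build output.
    with tempfile.TemporaryDirectory() as tmp:
        exe = Path(tmp) / "hello.exe"
        dll = Path(tmp) / "runtime.dll"
        exe.write_bytes(b"\0" * 2048)
        dll.write_bytes(b"\0" * 1024)
        print(reported_size_kb(exe))         # statically linked case: executable only
        print(reported_size_kb(exe, [dll]))  # dynamic case: the DLL counts toward the total
```
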
The motivation for these rules is simple - canonical hello world and canonical compiler settings measure the canonical user experience. All of the measured languages have ways to produce a smaller Hello World, e.g. disabling textual backtraces in Go, not using the standard library in C, or messing with linker switches for any of these.
But - there is a reason why the language/compiler developers chose the defaults they chose - the defaults match what the language and standard library advertise. It is therefore an objective measure. If you disagree with what the defaults entail, please take it up with the compiler/language maintainers. This repo is not a place for this discussion.
Without these rules in place, this repo would just become a race to the bottom. We know someone made Rust compile to 464 bytes. We know C# can compile to a couple of kilobytes. Such results are also not interesting because these solutions don't deliver on the language or standard library promises anymore. They're merely art projects. There's a lot of gray area in between, but the gray area is just a place for disputes.
The conclusion you can make is that Hello World done in a canonical way in language X compiled with canonical settings compiles to Y bytes. That's all there is to it.
While it may appear that writing to console is a simple problem, it is actually quite complex. Writing can fail if the output is redirected to a file and there's no space on the target file system. On Windows, we have to deal with the active code page of the console. Some languages do console writing through stream abstractions.
Writing to console is typically customized by the standard library to match the general developer expectations in the given language.
For example, while the size of the Zig executable is impressive, if we swap the "Hello, World!" string with "Kŕdeľ ďatľov učí koňa žrať kôru", on my machine the Zig program would print the following text: "K┼òde─╛ ─Åat─╛ov u─ì├¡ ko┼êa ┼╛ra┼Ñ k├┤ru." (blasting UTF-8 somewhere that doesn't talk UTF-8) whereas C# (and possibly others) would write "Krdel datlov ucí kona zrat kôru." because they would do a lossy conversion to the console code page first. This is fine as long as it matches the developer expectations.
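
The two behaviors can be reproduced outside any of the measured languages. The sketch below assumes the console uses code page 437, which is a guess that happens to be consistent with the garbled sample quoted above. The lossy conversion is only an approximation built on Unicode decomposition; the real C# runtime relies on the operating system's own conversion tables, not on this logic. The function names are made up for illustration.

```python
import unicodedata

TEXT = "Kŕdeľ ďatľov učí koňa žrať kôru"


def _fits(text: str, code_page: str) -> bool:
    """True if every character of `text` exists in the code page."""
    try:
        text.encode(code_page)
        return True
    except UnicodeEncodeError:
        return False


def raw_utf8_on_console(text: str, code_page: str = "cp437") -> str:
    """What a console using `code_page` shows when it is handed raw UTF-8 bytes."""
    return text.encode("utf-8").decode(code_page)


def lossy_to_code_page(text: str, code_page: str = "cp437") -> str:
    """Keep characters the code page has; strip accents from the rest, else use '?'."""
    out = []
    for ch in text:
        candidate = ch
        if not _fits(candidate, code_page):
            decomposed = unicodedata.normalize("NFD", ch)
            candidate = "".join(c for c in decomposed if not unicodedata.combining(c))
        out.append(candidate if candidate and _fits(candidate, code_page) else "?")
    return "".join(out)


if __name__ == "__main__":
    print(raw_utf8_on_console(TEXT))  # the "blast UTF-8 at the console" behavior
    print(lossy_to_code_page(TEXT))   # the "convert to the code page first" behavior
```

Neither function is free in a compiled binary: the first needs no tables at all, while the second needs character data and conversion logic, which is part of the size difference the table measures.
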
Similarly, not all languages treat error conditions the same. If the OS is in a low memory situation, it may not be possible to convert the text to console code page. If stdout is redirected to a file and the disk is full, the standard library may need to raise an exception. Raising an exception may require printing a backtrace. All of these "cost" some size on disk. Some languages and standard libraries care about these things more than others. It doesn't come for free.
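
To make the error-handling cost concrete, here is a minimal sketch of a Hello World that actually notices a failed write. It is illustrative only and does not reflect what any of the measured standard libraries do internally; `say_hello` and the `FullDisk` stand-in stream are hypothetical names introduced for this example.

```python
import errno
import io
import sys


def say_hello(stream) -> int:
    """Write the greeting and return a process exit code."""
    try:
        stream.write("Hello, World!\n")
        stream.flush()  # with buffered output, the failure may only surface here
    except OSError as exc:
        # For example ENOSPC when stdout is redirected to a file on a full disk.
        sys.stderr.write(f"write failed: {exc}\n")
        return 1
    return 0


class FullDisk:
    """Hypothetical stand-in for a stream whose target has no space left."""

    def write(self, text: str) -> int:
        raise OSError(errno.ENOSPC, "No space left on device")

    def flush(self) -> None:
        pass


if __name__ == "__main__":
    print(say_hello(io.StringIO()))  # healthy stream: exit code 0
    print(say_hello(FullDisk()))     # failing stream: exit code 1, message on stderr
```

Every branch here (the exception type, the message formatting, the separate error stream) has an equivalent that a compiled language has to carry in its binary if its standard library promises the same behavior.
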
The C++ sample can be smaller if we use `printf` instead of streams. Similarly, the C sample can be smaller if we use `puts` instead of `printf`. In the same way, the C# sample can be smaller if we invoke `puts` from libc instead of `System.Console`. As explained in the rules section, we measure the canonical thing, under default compiler settings. It would not be possible to draw a line otherwise. A chart where most things compile to < 1 kB, with a "Rust" program that mostly consists of inline assembly to achieve that size, doesn't make for an interesting chart. Even C# would be in the range of a couple of kB. It would be a boring chart and capture very little of reality.