\documentclass{article}
\usepackage{epigraph}
\usepackage{tikz}
\usepackage{graphicx}
\usepackage{float}
\usepackage[utf8]{inputenc}
\usepackage{pmboxdraw}
\usepackage{amsmath}
\usepackage{amsthm}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{bussproofs}
\usepackage{mathtools}
\usepackage{verbatim}
\usepackage{dsfont}
\usepackage{hyperref}
\usepackage{trfrac}
\usepackage{listings}
\usepackage{xcolor}
\usepackage{fancyvrb}
\definecolor{codegray}{rgb}{0.5,0.5,0.5}
\definecolor{codegreen}{rgb}{0,0.6,0} % referenced by commentstyle below
\definecolor{backcolour}{rgb}{0.98,0.98,0.97}
\lstdefinestyle{mystyle}{
backgroundcolor=\color{backcolour},
commentstyle=\color{codegreen},
numberstyle=\tiny\color{codegray},
basicstyle=\ttfamily\footnotesize,
breakatwhitespace=false,
breaklines=true,
captionpos=b,
keepspaces=true,
numbers=left,
numbersep=5pt,
showspaces=false,
showstringspaces=false,
showtabs=false,
tabsize=2
}
\lstset{style=mystyle}
\hypersetup{colorlinks=true, linkcolor=black}
\theoremstyle{definition}
\newtheorem{definition}{Definition}
\theoremstyle{theorem}
\newtheorem{theorem}{Theorem}
\newcommand{\note}[1]{\textcolor{red}{\textit{#1}}}
\newcommand\level{\mathit{level}}
\newcommand\Ra{\Rightarrow}
\newcommand\ra{\rightarrow}
\newcommand\rab{\rightarrow_{\beta}}
\newcommand\dup{\mathrel{\mathtt{dup}}}
\newcommand\code{\mathtt}
\newcommand\SV{\SaveVerb}
\newcommand\UV{\UseVerb}
\newcommand\size{\mathit{size}}
\newcommand\uses{\mathit{uses}}
\newcommand\atlev{ \mathit{atLevel}}
\newcommand\fstsc{ \mathit{SC}_1}
\newcommand \sndsc{ \mathit{SC}_2}
\newcommand \thdsc{ \mathit{SC}_3}
\newcommand \liff{ \longleftrightarrow}
\pagenumbering{arabic}
\begin{document}
\title{Moonad: A Peer-to-Peer Operating System}
\author{
Victor Maia\\
\texttt{[email protected]}
\and
John Burnham\\
\texttt{[email protected]}
}
\date{November 2019}
\maketitle
\setlength\epigraphwidth{.9\textwidth}
\setlength\epigraphrule{0pt}
\epigraph{\itshape But why, some say, the Moon? Why choose this as our goal? And
they may well ask, why climb the highest mountain? Why, 35 years ago, fly the
Atlantic? Why does Rice play Texas?}{---John F. Kennedy, \textit{Address at
Rice University on the Nation's Space Effort}}
\begin{abstract}
We present \textbf{Moonad}, a peer-to-peer operating system. Its machine
language, \textbf{FM-Net}, consists of a massively parallel, beta-optimal
graph-reduction system based on interaction nets. Its purely functional
user-facing language, \textbf{Formality} features a powerful, elegant
type-system capable of stating and proving theorems about its own programs
through inductive $\lambda$-encodings. Online interactions are made through
\textbf{Bitlog}, a token-less data-only blockchain, compatible with both private
and decentralized backends and featuring extensible smart contract universes. We
present two examples of such universes: \textbf{Ethos}, a universe with native
eWASM execution, and \textbf{Phos}, a universe with native FM-Net execution.
On top of those, we describe a front-end interface, Moonad, where users
can create, deploy and use decentralized applications in an immutable namespace,
transact in a trustless code and proof marketplace, and exchange digital assets.
\end{abstract}
\newpage
\tableofcontents
\newpage
\section{Introduction}
Computing is a field in its infancy. While humans have performed manual
computation with fingers, pencil, abacus or quipu since time immemorial, it has
been only 75 years since the first general-purpose computing machine was
switched on. Given how ubiquitous computers are in our lives today, it can be
easy to forget this, but there are hundreds of millions still living who are
older than the computer. If a generation in human terms is 25 years, computer
users today are at most third-generation.
Accordingly, it should not be surprising that computing systems we use in
practice ---devices, operating systems, programming languages--- lag behind
(sometimes far behind) theoretical or experimental systems. Consider that the
most popular operating system in the world, Android, was designed in the late
2000s, based on a kernel from the 1990s (Linux), which itself was based on a
design from the 1970s. There simply hasn't been enough time for all the good
ideas to be discovered and to percolate into common use.
For example, most electronic authentication on the web is still done with
passwords (sometimes stored in plain-text!), despite the fact that this
contributes to tens of billions of dollars lost per year to identity theft and
that secure systems for authentication based on public key cryptography have
existed for at least the past 30 years.
In a related area, most electronic commerce is mediated by centralized trusted
counterparties, despite the fact that this regularly leads to data breaches that
compromise the security of millions of people. Decentralized systems, such as
cryptographic ledgers using byzantine-fault-tolerant chains of signed
transactions and trustless distributed computing platforms have been under
intense development for the better part of the past decade, but have yet to gain
widespread adoption.
There has been tremendous theoretical progress in programming language and
compiler design that is barely exploited. We have substructural type systems
that can eliminate resource management errors (like segmentation faults) and
ensure memory safety. We have proof languages capable of unifying the field of
mathematics and programming as one (the Curry-Howard correspondence). In
principle, we could have systems software that is formally proven to be
error-free, and proof assistants that are both user-friendly enough to be taught
in grade-school math class and performant enough to be used on any platform.
Contrast this theoretical progress with our practical record: not long ago, we
had a single, small malicious package cause the entire web ecosystem to
collapse, twice. More than 300,000 sites online today are vulnerable to SQL
injections. Bugs like Heartbleed and TheDAO cause enormous monetary losses. A
single CPU exploit has slowed down nearly every processor in the world by 30\%.
Faulty software kills people, routinely causing airplanes to crash and hospitals
to stop operating.
We believe this mismatch between theory and practice is due to presentation: the
technology exists; we need to build the user experience. Moonad aims to assemble
many of these amazing technologies into a decentralized operating system that
"just works"; to build computing systems that are fast, friendly and safe.
\section{Formality: An efficient programming proof language}
\subsection{Why a proof language?}
A proof language is a programming language with a type system that is
sufficiently expressive to state and prove mathematical theorems about its own
programs. The Curry-Howard correspondence states that there is a structural
congruence between computer programs and proofs in formal logic. While in theory
this implies that software developers and mathematicians are engaged in
substantially the same activity, in practice there are many barriers preventing
the unification of the two fields. Programming languages with type systems
expressive enough to take advantage of Curry-Howard, such as Haskell, Agda and
Coq generally have substantial runtime overhead, which makes them challenging to
use, particularly at the systems programming level. On the other hand, systems
languages such as Rust, Nim or ATS, have shown that type systems can greatly
enhance usability and correctness of low-level software, but either have type
systems too weak to fully exploit Curry-Howard (Rust, Nim), or introduce
syntactic complexity that limits adoption (ATS). In other words, there are lots
of slow languages with great type systems and lots of fast languages with lousy
type systems.
As programmers and as mathematicians, we don't want to compromise. On one side,
mathematicians have the power of rigor: there is nothing more undeniably correct
than a mathematical proof. Yet ironically, those proofs are often written,
checked and reviewed by humans in an extremely error-prone process. On the
other side, programmers have the power of automation: there is nothing more
capable of performing a huge set of tasks without making a single mistake than a
computer program. Yet, equally ironically, most of those programs are written in
an unrigorous, bug-ridden fashion. Formality lies at the intersection of the
two fields: it is a programming language in which developers can implement
everyday algorithms and data structures, but it is also a proof language, in
which mathematicians can propose and prove theorems.
Types can be seen as a language of specifications that can be mechanically
checked by the compiler, allowing you to enforce arbitrary guarantees on your
program's behavior. For example, in an untyped language, you could write an
algorithm such as:
\begin{lstlisting}
// Returns the element at index `idx` of an array `arr`
function get_at(i, arr) {
... code here ...
}
\end{lstlisting}
But this has no guarantees. You could call it with nonsensical arguments such as
\verb|get_at("foo", 42)|, and the compiler would do nothing to stop you, leading
to (possibly catastrophic) runtime errors. Traditional typed languages improve
the situation, allowing you to "enrich" that definition with simple or even
polymorphic types, such as:
\begin{lstlisting}
function get_at<A>(i : Num, arr : Array<A>) : A {
... code here ...
}
\end{lstlisting}
This has some guarantees. For example, it is impossible to call \verb|get_at|
with a \verb|String| index, preventing a wide class of runtime errors. But you
can still, for example, call \verb|get_at| with an index out of bounds. In fact,
a similar error caused the catastrophic Heartbleed vulnerability. Formality
types are arbitrarily expressive, allowing you to capture all demands you'd
expect from \verb|get_at|:
\begin{lstlisting}
get_at*L : {A: Type, i: Fin(L), arr: Vector(A, L)} -> [x: A ~ At(A,L,i,arr,x)]
... implementation here...
\end{lstlisting}
Here, we use a bound, \verb|L|, to enforce that the index is not larger than the
size of the array, and we also demand that the returned element, \verb|x: A|, is
exactly the one at index \verb|i|. This specification is complete, in the
mathematical sense. It is not only impossible to get a runtime error by calling
\verb|get_at|, but it is also impossible to write \verb|get_at| incorrectly to
begin with. Note that this is different from, for example, asserts or bound
checking. Here, the compiler statically reasons about the flow of your program,
ensuring that it can't go wrong, at zero runtime cost.
In a similar fashion, you could use types to ensure that the balance of a
smart-contract never violates certain invariants, or that an airplane controller
won't push its nose down to a crashing position. Of course, you don't need to
write such precise types. If your software doesn't demand security, you could go
all the way down to fully untyped programs. The point is that the ability to
express properties so precisely is immensely valuable, especially when
developing critical software that demands all the security one can afford.
Some readers might object here that while formal proofs can ensure that a
program's implementation matches its specification, it cannot eliminate errors
inherent to the specification. This is true, but, in many cases, specifications
are much smaller than implementations. For example,
\begin{lstlisting}
spec : Type
[
// Demands a List function such that
sort : List(Word) -> List(Word),
// It returns the same list in ascending order
And(is_same(xs, sort(ys)), is_ascending(sort(ys)))
]
\end{lstlisting}
The spec above demands a List function such that it returns the same list in
ascending order. Writing and auditing this specification is considerably easier
than implementing a complete sort function. As such, formal verification can be
seen as an optimization of human auditing: instead of verifying a vast amount of
code, you audit a small specification and let the compiler do the rest.
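To make this concrete, the specification above can be restated as an executable property. The following is a hypothetical Python sketch (not Formality; the names \verb|is_same|, \verb|is_ascending| and \verb|sort_spec| are ours): auditing the short property is far easier than auditing a full sorting algorithm.

```python
# Hypothetical Python rendering of the sort specification above.
# These names are illustrative, not part of Formality.

def is_same(xs, ys):
    # ys is a permutation of xs (same elements, same multiplicities)
    return sorted(xs) == sorted(ys)

def is_ascending(ys):
    # every element is <= its successor
    return all(a <= b for a, b in zip(ys, ys[1:]))

def sort_spec(sort, xs):
    # the property every correct `sort` must satisfy
    ys = sort(xs)
    return is_same(xs, ys) and is_ascending(ys)

assert sort_spec(sorted, [5, 2, 9, 2])        # a correct sort passes
assert not sort_spec(lambda xs: [], [5, 2])   # dropping elements fails
```

The difference is that Formality checks this property statically, for all inputs, while the Python version can only test it on particular inputs.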
\subsection{Overview of Formality}
Formality is a new proof language designed to achieve 3 main goals:
\begin{enumerate}
\item Efficiency: it should be as fast as possible.
\item Simplicity: it should be implemented in as few lines of code as possible.
\item Accessibility: it should be usable by non-experts as much as possible.
\end{enumerate}
Existing proof languages such as Agda, Coq, Lean, Isabelle and Idris don't
address those goals with sufficient priority. They are often very slow, have big
implementations, and demand expert knowledge from their users. Formality achieves
efficiency by compiling to an efficient higher-order machine featuring optimal
beta-reduction, no garbage-collection and massive parallelism. It achieves
simplicity by using as few core primitives as possible: in fact, every algorithm
and data-structure can be represented with just lambdas, including inductive
datatypes. It achieves accessibility by developing a syntax that is familiar to
regular developers, drawing inspiration from popular languages like Python, Rust
and TypeScript.
Formality is a dependently typed programming language based on extrinsic type
theory: its terms are just annotated versions of the Elementary Affine
Lambda-Calculus (EALC), which is itself a terminating subset of the
lambda-calculus with great computational characteristics. In particular,
EALC is compatible with the most efficient version of the optimal reduction
algorithm [citation]. This is important because, while asymptotically optimal,
most implementations of the algorithm carry an extra, significant constant
overhead caused by a book-keeping machinery. This made them too slow in
practice, decreasing interest in the area. By relying on EALC, we can avoid the
book-keeping overhead, making Formality much faster than the alternatives. Below is a
table comparing the amount of native operations (graph rewrites) required to
reduce $\lambda$-terms in other implementations and Formality:
\hfill \newline
\begin{tabular}{ c | c | c | c | c | c}
Term & GeomOpt & GeomImpl & YALE & IntComb & Formality \\\hline
22II & 204 & 56 & 38 & 66 & 21\\
222II & 789 & 304 & 127 & 278 & 50\\
3II & 75 & 17 & 17 & 32 & 9\\
33II & 649 & 332 & 87 & 322 & 49\\
322II & 7055 & 4457 & 383 & 3268 & 85\\
223II & 1750 & 1046 & 213 & 869 & 69\\
44II & 3456 & 2816 & 148 & 2447 & 89\\
\end{tabular}
\\ \hfill \newline
TODO: citations
Not only does Formality require orders of magnitude fewer native operations,
but its native operations are also much simpler.
Unlike most proof languages, Formality doesn't include a datatype system: not
only is such a system very complex to implement, which goes against our second
goal (simplicity), but it can also cause more serious issues. For example, it was
discovered that type preservation does not hold in Coq due to its treatment of
coinductive datatypes [citation], and both Coq and Agda were found to be
incompatible with Homotopy Type Theory due to the K-axiom, used by native
pattern-matching on dependent datatypes [citation]. Instead, Formality relies on
lambda encodings, an alternative to a datatype system that uses plain lambdas to
encode arbitrary data. While attractive, this approach wasn't adopted in
practical type theories for several reasons:
\begin{enumerate}
\item There is no encoding that provides both recursion and fast pattern-matching.
\begin{enumerate}
\item Church encodings provide recursion, but pattern-matching is slow $O(n)$.
\item Scott encodings provide fast pattern-matching, but no recursion.
\item Alternative encodings that provide both consume too much memory.
\end{enumerate}
\item We cannot prove \verb|0 != 1| with the usual definition of \verb|!=|.
\item Induction is not derivable.
\end{enumerate}
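The first trade-off can be observed in any untyped functional language. Below is a hypothetical Python sketch (our own names, not Formality code) of both encodings of natural numbers: a Church numeral is its own fold, so recursion is free but reaching the predecessor walks the whole number, while a Scott numeral is its own pattern-match, so case analysis is constant-time but recursion must come from elsewhere.

```python
# Hypothetical sketch of the Church/Scott trade-off, in Python.

# Church encoding: n is the fold  lambda s, z: s(s(...s(z)))
church_zero = lambda s, z: z
church_succ = lambda n: (lambda s, z: s(n(s, z)))

def church_to_int(n):
    # recursion is "free": just apply the fold
    return n(lambda x: x + 1, 0)

# Scott encoding: a number picks one of two branches.
scott_zero = lambda on_succ, on_zero: on_zero
scott_succ = lambda n: (lambda on_succ, on_zero: on_succ(n))

def scott_pred(n):
    # O(1) pattern-match: the predecessor is handed to us directly
    return n(lambda pred: pred, scott_zero)

two = church_succ(church_succ(church_zero))
assert church_to_int(two) == 2

s2 = scott_succ(scott_succ(scott_zero))
assert scott_pred(s2)(lambda p: "succ", "zero") == "succ"
```

Formality's solution, described next, uses both encodings at once: Church-style for recursion and Scott-style for data.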
Formality solves the first issue by simply using Church encodings for recursion
and Scott encodings for data, the second issue by collapsing universes, and the
last issue by combining self-types and type-level recursion [citation]. It is
well-known that, in most proof languages, collapsing universes and type-level
recursion can lead to logical paradoxes, but the remarkable coincidence is that
the same feature that allows the language to be compatible with optimal
reductions, Elementary Affine Logic, also makes its type-theory sound in the
presence of those [proof]. This allows the theory to be remarkably simple and
powerful.
\subsection{EALC: Elementary Affine Lambda Calculus}
The Elementary Affine Lambda Calculus is a subset of the
$\lambda$-calculus which is both terminating and compatible with the
book-keeping free optimal reduction algorithm computed by FM-Net. Its terms are
defined by the following syntax:
\begin{definition} EALC terms
\SV{variable}|v|
\SV{lambdaTerm}|v => M|
\SV{lambdaApp}|M(N)|
\SV{boxedTerm}|#M|
\SV{boxedDup}|dup n = M; N|
Let $\mathbb{V}$ be an infinite set of variables $\{v_0,v_1,...,v_n\}$.
Let $\mathtt{v} \in \mathbb{V}$.
Then the EALC terms $\code{M,N}$ are defined recursively as:
\begin{align*}
\mathtt{M,N} &\Coloneqq \UV{variable} \tag*{(variable)}\\
&\,\vert\, \UV{lambdaTerm}\tag*{(lambda term)}\\
&\,\vert\, \UV{lambdaApp}\tag*{(lambda application)}\\
&\,\vert\, \UV{boxedTerm}\tag*{(boxed term)}\\
&\,\vert\, \UV{boxedDup}\tag*{(boxed duplication)}\\
\end{align*}
\end{definition}
The first 3 constructors are the usual $\lambda$-calculus. The last two
introduce boxes, which just wrap terms, and duplications, which copy the
contents of a box.
\subsubsection{EALC reduction}
Substitution of a variable in an EALC term is defined:
\begin{definition} EALC substitution
\begin{enumerate}
\item \verb|x[y <- M]| is
\begin{enumerate}
\item if $\code{x} = \code{y}$ then $\code{M}$
\item if $\code{x} \neq \code{y}$ then $\code{x}$
\end{enumerate}
\item \verb|(x => M)[y <- N]| is \verb|(x => M[y <- N])|
\item \verb|(M(N))[x <- P]| is \verb|(M [x <-P])(N [x <- P])|
\item \verb|(#M)[x <- N]| is \verb|#(M[x <- N])|
\item \verb|(dup x = M; N)[y <- P]| is
\verb|(dup x = M[y <- P]; N[y <- P])|
\end{enumerate}
\end{definition}
The EALC has the following reduction rules:
\begin{definition} Single-step EALC reduction $\rab$
Let \verb|M[x <- N]| denote the substitution of $x$ by $N$ in $M$, as defined above.
\SV{r1}|((x) => a)(b)|
\SV{r2}|a[x <- b]|
\SV{r3}|dup x = #a; b|
\SV{r4}|b[x <- a]|
\SV{r5}|((dup x = a; b) c)|
\SV{r6}|dup x = a; (b c)|
\SV{r7}|dup x = (dup y = a; b); c|
\SV{r8}|dup y = a; dup x = b; c|
\begin{enumerate}
\item application: $\UV{r1} \rab \UV{r2}$
\item duplication: $\UV{r3} \rab \UV{r4}$
\item appdup-swap: $\UV{r5} \rab \UV{r6}$
\item dupdup-swap: $\UV{r7} \rab \UV{r8}$
\end{enumerate}
\end{definition}
The first rule is the usual beta-reduction. The second rule allows us to copy
the contents of a box. The last two are just permutations, allowing inner
\verb|dup|s to be exposed. An EALC term is said to be stratified when it
satisfies the following conditions:
\begin{definition} EALC stratification condition
\begin{enumerate}
\item Lambda-bound variables can only be used at most once.
\item There must be exactly 0 boxes between a variable bound by a lambda and its
occurrence.
\item There must be exactly 1 box between a variable bound by a duplication and its
occurrence.
\end{enumerate}
\end{definition}
The first condition says that lambdas must be affine. The second condition says
that lambdas can't substitute inside boxes. The third condition says that you
can't unbox a term, or add boxes to it, by duplicating it. Stratified EALC
terms are strongly normalizing.
Proof: TODO
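The reduction rules above are simple enough to prototype directly. The following is a hypothetical Python sketch (the tuple representation and all names are ours, not FM-Net): terms are tagged tuples, and \verb|step| applies whichever of the four rules matches at the root. Substitution here is naively capture-unaware, which is acceptable only under the assumption that all bound variables have distinct names.

```python
# Hypothetical EALC interpreter sketch. Terms are tuples:
# ("var", x), ("lam", x, body), ("app", f, a),
# ("box", t), ("dup", x, boxed, body).

def subst(term, y, val):
    # naive substitution: assumes all bound names are distinct
    tag = term[0]
    if tag == "var":
        return val if term[1] == y else term
    if tag == "lam":
        return ("lam", term[1], subst(term[2], y, val))
    if tag == "app":
        return ("app", subst(term[1], y, val), subst(term[2], y, val))
    if tag == "box":
        return ("box", subst(term[1], y, val))
    if tag == "dup":
        return ("dup", term[1], subst(term[2], y, val), subst(term[3], y, val))

def step(term):
    """One reduction step at the root, or None if no rule applies."""
    if term[0] == "app" and term[1][0] == "lam":   # application
        _, (_, x, body), arg = term
        return subst(body, x, arg)
    if term[0] == "dup" and term[2][0] == "box":   # duplication
        _, x, (_, boxed), body = term
        return subst(body, x, boxed)
    if term[0] == "app" and term[1][0] == "dup":   # appdup-swap
        _, (_, x, a, b), c = term
        return ("dup", x, a, ("app", b, c))
    if term[0] == "dup" and term[2][0] == "dup":   # dupdup-swap
        _, x, (_, y, a, b), c = term
        return ("dup", y, a, ("dup", x, b, c))
    return None

# (x => x)(#y) reduces to #y in one application step:
assert step(("app", ("lam", "x", ("var", "x")), ("box", ("var", "y")))) \
    == ("box", ("var", "y"))
```

A stratification checker would be a similar traversal, counting the boxes crossed between each binder and its occurrence.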
\subsection{Formality Type Theory}
Formality Type Theory (FM-TT) extends the EALC with type annotations.
\subsubsection{Syntax}
FM-TT's syntax is defined as follows:
\begin{definition} FM-TT terms
\SV{variable}|v|
\SV{universe}|Type|
\SV{lambdaType}|(v : M) -> N|
\SV{lambdaTerm}|(v : M) => N|
\SV{lambdaApp}|M(N)|
\SV{boxedType}|!M|
\SV{boxedTerm}|#M|
\SV{boxedDup}|dup n = M; N|
\SV{selfType}|$v M|
\SV{selfTerm}|new(M) N|
\SV{selfElim}|%M|
\begin{align*}
\mathtt{M,N} &\Coloneqq \UV{variable} \tag*{(variable)}\\
&\,\vert\, \UV{universe}\tag*{(universe)}\\
&\,\vert\, \UV{lambdaType}\tag*{(lambda type, dependent product)}\\
&\,\vert\, \UV{lambdaTerm}\tag*{(lambda term)}\\
&\,\vert\, \UV{lambdaApp}\tag*{(lambda application)}\\
&\,\vert\, \UV{boxedType}\tag*{(boxed type)}\\
&\,\vert\, \UV{boxedTerm}\tag*{(boxed term)}\\
&\,\vert\, \UV{boxedDup}\tag*{(boxed duplication)}\\
&\,\vert\, \UV{selfType}\tag*{(self type)}\\
&\,\vert\, \UV{selfTerm}\tag*{(self term)}\\
&\,\vert\, \UV{selfElim}\tag*{(self elimination)}\\
\end{align*}
\end{definition}
For simplicity, we extend our notation with the following syntax sugars:
\begin{definition} FM-TT syntax sugars
\begin{itemize}
\item \verb|(x0 : A0, x1 : A1, ...) => t| for \\
\verb|(x0 : A0) => (x1 : A1) => ... => t|
\item \verb|f(x,y,z)| for \verb|f(x)(y)(z)|
\item When \verb|x| isn't used in \verb|B|, \verb|A -> B| for \verb|(x : A) -> B|
\item We will also sometimes omit the types of lambda-bound arguments when they
can be inferred, writing \verb|(x) => t| instead of \verb|(x : A) => t|
\end{itemize}
\end{definition}
Substitution of a variable in an FM-TT term is defined:
\begin{definition} FM-TT substitution
\begin{enumerate}
\item \verb|x[y <- M]| is
\begin{enumerate}
\item if $\code{x} = \code{y}$ then $\code{M}$
\item if $\code{x} \neq \code{y}$ then $\code{x}$
\end{enumerate}
\item \verb|Type[y <- M]| is \verb|Type|
\item \verb|((x : M) -> N)[y <- P]| is \verb|(x : M[y <- P]) -> N[y <- P]|
\item \verb|((x : M) => N)[y <- P]| is \verb|(x : M[y <- P]) => N[y <- P]|
\item \verb|(M(N))[x <- P]| is \verb|(M [x <- P])(N [x <- P])|
\item \verb|(!M)[x <- N]| is \verb|!(M[x <- N])|
\item \verb|(#M)[x <- N]| is \verb|#(M[x <- N])|
\item \verb|(dup x = M; N)[y <- P]| is
\verb|(dup x = M[y <- P]; N[y <- P])|
\item \verb|($v M)[x <- N]| is \verb|$v M[x <- N]|
\item \verb|(new(M)N)[x <- P]| is \verb|new(M[x <- P])N[x <- P]|
\item \verb|(%M)[x <- N]| is \verb|%(M[x <- N])|
\end{enumerate}
\end{definition}
\subsubsection{Typing Rules}
FM-TT's typing rules are:
\begin{definition} FM-TT typing rules
\SV{v1}|(x : T)|
\SV{v2}|x : T|
\[ \trfrac[\small(variable)]
{\UV{v1} \in \Gamma}
{\Gamma \vdash \UV{v2}}
\]
\SV{tt}|Type : Type|
\[ \trfrac[\small(type-in-type)]
{\varnothing}
{\Gamma \vdash \UV{tt}}
\]
\SV{lT1}|A : Type|
\SV{lT2}|x : A|
\SV{lT3}|B : Type|
\SV{lT4}|(x : A) -> B : Type|
\[ \trfrac[\small(lambda type)]
{\Gamma \vdash \UV{lT1}\quad \Gamma,\UV{lT2}\vdash\UV{lT3}}
{\Gamma \vdash \UV{lT4}}
\]
\SV{lt1}|A : Type|
\SV{lt2}|x : A|
\SV{lt3}|t : B|
\SV{lt4}|(x : A) => t : (x : A) -> B |
\[ \trfrac[\small(lambda term)]
{\Gamma \vdash \UV{lt1}\quad \Gamma,\UV{lt2}\vdash\UV{lt3}}
{\Gamma \vdash \UV{lt4}}
\]
\SV{la1}|f : (x : A) -> B|
\SV{la2}|a : A|
\SV{la3}|f(a) : B[x <- a] |
\[ \trfrac[\small(lambda application)]
{\Gamma \vdash \UV{la1}\quad \Gamma\vdash\UV{la2}}
{\Gamma \vdash \UV{la3}}
\]
\SV{bT1}|A : Type|
\SV{bT2}|!A : Type|
\[ \trfrac[\small(boxed type)]
{\Gamma \vdash \UV{bT1}}
{\Gamma \vdash \UV{bT2}}
\]
\SV{bt1}|t : A|
\SV{bt2}|#t : !A|
\[ \trfrac[\small(boxed term)]
{\Gamma \vdash \UV{bt1}}
{\Gamma \vdash \UV{bt2}}
\]
\SV{bd1}|t : !A|
\SV{bd2}|x : A|
\SV{bd3}|u : B|
\SV{bd4}| dup x = t; u : B|
\[ \trfrac[\small(boxed duplication)]
{\Gamma \vdash \UV{bd1}\quad \Gamma,\UV{bd2}\vdash\UV{bd3}}
{\Gamma \vdash \UV{bd4}}
\]
\SV{sT1}|x : $x A|
\SV{sT2}|A : Type |
\SV{sT3}|$x A : Type|
\[ \trfrac[\small(self type)]
{\Gamma,\UV{sT1}\vdash\UV{sT2}}
{\Gamma \vdash \UV{sT3}}
\]
\SV{st1}|t : A[x <- t]|
\SV{st2}|$x A : Type|
\SV{st3}|new($x A) t : $x A|
\[ \trfrac[\small(self term)]
{\Gamma \vdash \UV{st1}\quad \Gamma\vdash\UV{st2}}
{\Gamma \vdash \UV{st3}}
\]
\SV{se1}|t : $x A|
\SV{se2}|t : A[x <- t]|
\[ \trfrac[\small(self elimination)]
{\Gamma \vdash \UV{se1}}
{\Gamma \vdash \UV{se2}}
\]
\end{definition}
\subsubsection{Induction}
Those primitives allow FM-TT to be a simple type-theory featuring inductive
datatypes.
\begin{theorem} Induction can be proven in FM-TT
\end{theorem}
\begin{proof} \verb|nat_ind| in:
\begin{lstlisting}
Nat : Type
$ self
( P : Nat -> Type
, succ : (r : Nat, i : P(r)) -> P(succ(r))
, zero : P(zero)
) -> P(self)
succ : (n : Nat) -> Nat
new(Nat) (P, succ, zero) => succ(n, (%n)(P, succ, zero))
zero : Nat
new(Nat) (P, succ, zero) => zero
nat_ind :
( n : Nat
, P : Nat -> Type
, s : (n : Nat, i : P(n)) -> P(succ(n))
, z : P(zero)) -> P(n)
(n) => %n
\end{lstlisting}
\end{proof}
This defines an inductive lambda-encoded datatype, \verb|Nat|, as its own
induction scheme. Mutual recursion is needed because the induction scheme of any
datatype refers to its constructors, which in turn refer to the defined datatype.
Self types are used because the type returned by the induction scheme can refer
to the term being induced. Together, those two components allow us to define
inductive datatypes with only ordinary lambdas. To induce on \verb|n : Nat| you
don't need any additional proof; all you have to do is eliminate its
self-type with \verb|%n|. Note that this \verb|Nat| is complicated by the fact
it is recursive (Church encoded). We could have a non-recursive (Scott encoded)
\verb|Nat| as:
\begin{lstlisting}
Nat Type
$ self
( ~P : Nat -> Type
, succ : (n : Nat) -> P(succ(n))
, zero : P(zero)
) -> P(self)
succ(n : Nat) -> Nat
new(Nat) (~P, succ, zero) => succ(n)
zero Nat
new(Nat) (~P, succ, zero) => zero
nat_ind(n : Nat
, ~P : Nat -> Type
, s : (n : Nat, i : P(n)) -> P(succ(n))
, z : P(zero)) -> P(n)
(n) => %n
\end{lstlisting}
This version is simpler and has the benefit of constant-time pattern-matching.
Even simpler types include \verb|Bool|, \verb|Unit| and \verb|Empty|:
\begin{lstlisting}
Bool Type
$ self
( ~P : Bool -> Type
, true : P(true)
, false : P(false)
) -> P(self)
true Bool
new(Bool) (~P, true, false) => true
false Bool
new(Bool) (~P, true, false) => false
bool_induction(b : Bool, ~P : Bool -> Type, t : P(true), f : P(false)) -> P(b)
(b) => (%b)
Unit Type
$ self
( ~P : Unit -> Type
, unit : P(unit)
) -> P(self)
unit Unit
new(Unit) (~P, unit) => unit
unit_induction(u : Unit, ~P : Unit -> Type, x : P(unit)) -> P(u)
(u) => (%u)
Empty Type
$ self
( ~P : Empty -> Type
) -> P(self)
empty_induction(e : Empty, ~P : Empty -> Type) -> P(e)
(e) => (%e)
\end{lstlisting}
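Because a lambda-encoded value is its own eliminator, case analysis erases to plain function application. The following hypothetical Python shadow of the \verb|Bool| encoding above (with the dependent motive \verb|~P| erased, since Python is untyped) makes this visible:

```python
# Untyped Python shadow of the self-typed Bool above.
# The dependent motive ~P is erased; what remains of bool_induction
# is just applying the boolean to its two branches.
true  = lambda on_true, on_false: on_true
false = lambda on_true, on_false: on_false

def bool_case(b, if_true, if_false):
    # the computational content of bool_induction
    return b(if_true, if_false)

unit = lambda on_unit: on_unit  # the single Unit constructor

assert bool_case(true, "yes", "no") == "yes"
assert bool_case(false, "yes", "no") == "no"
assert unit("done") == "done"
```

What Python cannot express is the dependent part: that the two branches may have types \verb|P(true)| and \verb|P(false)| depending on the scrutinee. That is exactly what self types recover.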
Notice how \verb|bool_induction| is equivalent to the elimination
principle of the \verb|Bool| datatype in Coq or Agda, except that in Formality
this was obtained without needing a complex datatype system implemented as
a native language feature. Formality's type theory is capable of expressing any
inductive datatype seen on traditional proof languages. For example,
\verb|Vector| (a list with statically known length) can also be encoded as
its dependent elimination:
\begin{lstlisting}
Vector(~A : Type, len : Nat) -> Type
$self
( ~P : (len : Nat) -> Vector(A, len) -> Type
, vcons :
(~len : Nat
, x : A
, xs : Vector(A, len)
) -> P(succ(len), vcons(~A, ~len, x, xs))
, vnil : P(zero, vnil(~A))
) -> P(len, self)
vcons(~A : Type
, ~len : Nat
, head : A
, tail : Vector(A, len)
) -> Vector(A, succ(len))
new(Vector(A, succ(len))) (~P, vcons, vnil) => vcons(~len, head, tail)
vnil(~A : Type) -> Vector(A, zero)
new(Vector(A, zero)) (~P, vcons, vnil) => vnil
\end{lstlisting}
And an identity type for propositional equality is just the J axiom wrapped by a
self type:
\begin{lstlisting}
Id(A : Type, a : A, b : A) -> Type
$self
( ~P : (b : A, eq : Id(A, a, b)) -> Type
, refl : P(a, refl(~A, ~a))
) -> P(b, self)
refl(~A : Type, ~a : A) -> Id(A, a, a)
new(Id(A, a, a)) (~P, refl) => refl
\end{lstlisting}
Formality is also able to express datatypes that traditional proof languages
can't. For example, intervals, i.e., booleans that can only be eliminated if
both provided cases are equal, can be written as:
\begin{lstlisting}
Interval Type
$self
( ~P : (i : Interval) -> Type
, I0 : P(i0)
, I1 : P(i1)
, SG : I0 == I1
) -> P(self)
i0 Interval
new(Interval) (~P, i0, i1, sg) => i0
i1 Interval
new(Interval) (~P, i0, i1, sg) => i1
\end{lstlisting}
And we can easily prove troublesome theorems such as $\mathit{true} \neq \mathit{false}$:
\begin{lstlisting}
true_isnt_false(e : Id(Bool, true, false)) -> Empty
(%e)(~{b, e.b} (%e.b)(~{b}Type, Unit, Empty), unit)
\end{lstlisting}
This much type-level power comes at a cost: if Formality were based on the
conventional lambda calculus, it would not be sound. However, since it is based
on EALC, a terminating untyped language, its normalization is completely
independent of types, so it is impossible to derive paradoxes by exploiting
non-terminating computations. A good exercise is attempting to write $\lambda x.
(x\ x)\ \lambda x.(x\ x)$ in EALC, which is impossible.
The tradeoff is that EALC is considerably less computationally powerful than the
lambda calculus, imposing severe restrictions on how terms can be duplicated.
In practice, however, this limitation isn't as onerous as might be expected.
Formality is still capable of modeling any algorithm that can be implemented in
a Turing-complete language, up to some finite bound. Since all physical computing
systems also model Turing-complete computation up to a finite time or space bound,
this means that Formality has the same level of practical expressivity as
common languages like C, JavaScript, or Haskell. Unlike those languages,
programming in Formality involves a resource-usage discipline that can be
burdensome to new users. We expect to develop advanced compiler tools for
automatic inference of boxed terms and types to ease the language acquisition
process.
Formality is currently being implemented in JavaScript and Agda; the latter will
contain the formalization of Formality's consistency argument.
\subsection{Formality Lang: Numeric primitives and syntax sugar}
Formality Lang adds a series of user-friendly syntax sugars on top of EALC and
FM-TT, making the language much more usable. It includes syntaxes for
datatype definitions and pattern-matching, loops, recursion, branching, and so
on.
Formality Lang also adds native pairs, numbers, and numeric operations. This is
done for efficiency reasons: native pairs use a third of the space of
lambda-encoded pairs, and native numbers can be stored unboxed in interaction
net nodes, using orders of magnitude less space than lambda-encoded numbers.
Lastly, Formality Lang allows you to mark type information for erasure before
runtime. This reduces overhead during execution.
\subsubsection{Type Erasure}
TODO: Explain erasure
\subsubsection{Let}
Formality includes \verb|let| expressions, which allow you to give local names
to terms.
\begin{lstlisting}
main : Output
let hello = "Hello, world!"
print(hello)
\end{lstlisting}
\verb|let| expressions can be nested arbitrarily.
\begin{lstlisting}
main : Output
let output =
let hello = "Hello, world!"
print(hello)
output
\end{lstlisting}
\subsubsection{Words}
The type of a native number is \verb|Word|. Words are unsigned integers of 32
bits:
\begin{lstlisting}
main : Word
1900
\end{lstlisting}
They can also be written in hexadecimal:
\begin{lstlisting}
main : Word
0x76C
\end{lstlisting}
And in binary:
\begin{lstlisting}
main : Word
0b11101101100
\end{lstlisting}
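All three bases denote the same machine word; the three literals above are one and the same value. This is easy to check in JavaScript, whose numeric literals use the same base prefixes:

```javascript
// 1900 written in decimal, hexadecimal, and binary is the same Word.
const dec = 1900;
const hex = 0x76C;
const bin = 0b11101101100;
console.log(dec === hex && hex === bin); // true
```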
They include many built-in operations:
\hfill \newline
\begin{tabular}{ c | c | c}
name & syntax & JavaScript equivalent \\\hline
addition & \verb|x + y| & \verb|(x + y) >>> 0|\\
subtraction & \verb|x - y| & \verb|(x - y) >>> 0|\\
multiplication & \verb|x * y| & \verb|(x * y) >>> 0|\\
division & \verb|x / y| & \verb|(x / y) >>> 0|\\
modulus & \verb|x % y| & \verb|(x % y) >>> 0|\\
exponentiation & \verb|x ^ y| & \verb|(x ** y) >>> 0|\\
bitwise-and & \verb|x .& y| & \verb|x & y|\\
bitwise-or & \verb!x .| y! & \verb!x | y!\\
bitwise-xor & \verb|x .^ y| & \verb|x ^ y|\\
bitwise-not & \verb|.!(y)| & \verb|~y|\\
bitwise-right-shift & \verb|x .>> y| & \verb|x >>> y|\\
bitwise-left-shift & \verb|x .<< y| & \verb|x << y|\\
greater-than & \verb|x .> y| & \verb|x > y ? 1 : 0|\\
less-than & \verb|x .< y| & \verb|x < y ? 1 : 0|\\
equals & \verb|x .= y| & \verb|x === y ? 1 : 0|\\
float-addition & \verb|x +f y| & \verb|x + y|\\
float-subtraction & \verb|x -f y| & \verb|x - y|\\
float-multiplication & \verb|x *f y| & \verb|x * y|\\
float-division & \verb|x /f y| & \verb|x / y|\\
float-modulus & \verb|x %f y| & \verb|x % y|\\
float-exponentiation & \verb|x ^f y| & \verb|x ** y|\\
uint-to-float & \verb|.f(y)| & -\\
float-to-uint & \verb|.u(y)| & -\\
\end{tabular}
\\ \hfill \newline
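The \verb|>>> 0| in the JavaScript equivalents above means that integer results wrap around to unsigned 32 bits. A small sketch of that behavior, assuming \verb|Word| follows these equivalents exactly:

```javascript
// 32-bit unsigned wraparound, as in the table's JavaScript equivalents.
const add = (x, y) => (x + y) >>> 0;
const sub = (x, y) => (x - y) >>> 0;
console.log(add(0xFFFFFFFF, 1)); // 0          (overflow wraps)
console.log(sub(0, 1));          // 4294967295 (underflow wraps)
```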
There is no operator precedence: operators associate to the right. That means
\verb|3 * 10 + 1| is parsed as \verb|3 * (10 + 1)|. If you want the
multiplication to occur first, you must use explicit parentheses:
\begin{lstlisting}
main : Word
(3 * 10) + 1
\end{lstlisting}
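The right-associated parse is easy to check against ordinary arithmetic; in JavaScript terms, the two groupings give:

```javascript
// Formality parses 3 * 10 + 1 as 3 * (10 + 1); explicit parentheses
// restore the usual grouping.
console.log(3 * (10 + 1)); // 33  (Formality's default parse)
console.log((3 * 10) + 1); // 31  (with explicit parentheses)
```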
\subsubsection{If-Then-Else}
\verb|if| allows branching with a \verb|Word| condition.
\hfill \newline
\begin{tabular}{ c | c }
syntax & description \\\hline
\verb|if n: a else: b| & If \verb|n| is \verb|0|, evaluates to \verb|b|; otherwise, evaluates to \verb|a|
\end{tabular}
\\ \hfill \newline
Usage is straightforward:
\begin{lstlisting}
main : Output
let age = 30
if age .< 18:
print("boring teenager")
else:
print("respect your elders!")
\end{lstlisting}
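Since the condition is a \verb|Word|, \verb|if| is essentially a zero test. A JavaScript analogue of the example above, assuming (per the table) that comparison operators yield \verb|0| or \verb|1| and that any nonzero condition selects the then-branch:

```javascript
// if n: a else: b  ~  n !== 0 ? a : b, with comparisons yielding 0/1.
const lt = (x, y) => (x < y ? 1 : 0);          // the .< operator
const ifWord = (n, a, b) => (n !== 0 ? a : b);

const age = 30;
console.log(ifWord(lt(age, 18), "boring teenager", "respect your elders!"));
// "respect your elders!"
```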
\subsubsection{Pairs}
Native pairs store two elements of possibly different types.
\hfill \newline
\begin{tabular}{ c | c }
syntax & description \\\hline
\verb|[x : A, B(x)]| & The type of a pair\\
\verb|[a, b]| & Creates a pair with elements \verb|a| and \verb|b|\\
\verb|fst(p) | & Extracts the first element of a pair\\
\verb|snd(p) | & Extracts the second element of a pair\\
\verb|get [a, b] = p ... | &Extracts both elements of a pair\\
\end{tabular}
\\ \hfill \newline
Note that the type of a pair is \verb|[x : A, B(x)]|, because the type of the
second element can depend on the value of the first. When it doesn't, you can
write just \verb|[:A, B]| instead. Using pairs is straightforward. Examples:
\paragraph{Creating:}
\begin{lstlisting}
main : [:Word, Word]
[1, 2]
\end{lstlisting}
\paragraph{Extracting the first element:}
\begin{lstlisting}
main : Word
let pair = [1, 2]
fst(pair)
\end{lstlisting}
\paragraph{Extracting both elements:}
\begin{lstlisting}
main : Word
let pair = [1, 2]
get [a,b] = pair
a + b
\end{lstlisting}
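In JavaScript terms, native pairs behave like two-element arrays, and \verb|get| corresponds to destructuring. A sketch of the four operations (the \verb|fst|/\verb|snd| helpers here are illustrative):

```javascript
// fst/snd project a pair; `get [a, b] = p` is destructuring assignment.
const fst = (p) => p[0];
const snd = (p) => p[1];

const pair = [1, 2];
const [a, b] = pair;     // get [a, b] = pair
console.log(fst(pair));  // 1
console.log(snd(pair));  // 2
console.log(a + b);      // 3
```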
\paragraph{Nesting to the left:}
\begin{lstlisting}
import Base@0