-
Notifications
You must be signed in to change notification settings - Fork 17
/
white_paper.tex
823 lines (647 loc) · 37.5 KB
/
white_paper.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
%%% COMPILE WITH XELATEX, NOT PDFLATEX
\documentclass[letterpaper]{article}
\author{L.M Goodman}
\date{September 2, 2014}
\title{Tezos --- a self-amending crypto-ledger \\ White paper}
%\usepackage[utf8]{inputenc}
%%\setlength{\parskip}{\baselineskip}
\usepackage{amsfonts}
\usepackage{listings}
\usepackage{color}
\usepackage{courier}
\usepackage{epigraph}
\usepackage{fontspec}
\usepackage{newunicodechar}
\usepackage{graphicx}
\usepackage{siunitx}
\usepackage{url}
\usepackage[hidelinks]{hyperref}
\newcommand\change[1]{\textcolor{red}{#1}}
%\epigraphfontsize{\small\itshape}
\setlength\epigraphwidth{4.6cm}
\setlength\epigraphrule{0pt}
%\DeclareUnicodeCharacter{42793}{\tz{}}
%\newunicodechar{⚓}{\anchor}
%Ꜩ
\usepackage{url}
\lstset{basicstyle=\footnotesize\ttfamily,breaklines=true}
\newcommand{\tz}{{\fontspec{DejaVu Sans} \small{ꜩ}}}
\begin{document}
\maketitle
\change{Changes between the original paper and our current implementation are indicated in red.}
\epigraph{\emph{``Our argument is not flatly circular,
but something like it.''}}
{--- \textup{Willard van Orman Quine}}
\begin{abstract}
We present Tezos, a generic and self-amending crypto-ledger. Tezos can
instantiate any blockchain based ledger. The operations of a regular blockchain
are implemented as a purely functional module abstracted into a shell
responsible for network operations. Bitcoin, Ethereum, Cryptonote, etc. can all
be represented within Tezos by implementing the proper interface to the network
layer.
Most importantly, Tezos supports meta upgrades: the protocols can evolve by
amending their own code. To achieve this, Tezos begins with a seed protocol
defining a procedure for stakeholders to approve amendments to the protocol,
\emph{including} amendments to the voting procedure itself. This is not unlike
philosopher Peter Suber's Nomic\cite{Nomic}, a game built around a fully
introspective set of rules.
In addition, Tezos's seed protocol is based on a pure proof-of-stake system
and supports Turing complete smart contracts. Tezos is implemented in OCaml,
a powerful functional programming language offering speed, an unambiguous
syntax and semantic, and an ecosystem making Tezos a good candidate for formal
proofs of correctness.
Familiarity with the Bitcoin protocol and basic cryptographic primitives are
assumed in the rest of this paper.
\end{abstract}
\newpage
\tableofcontents
\newpage
\section{Introduction}
In the first part of this paper, we will discuss the concept of abstract
blockchains and the implementation of a self-amending crypto-ledger.
In the second part, we will describe our proposed seed protocol.
\section{Self-amending cryptoledger}
A blockchain protocol can be decomposed into three distinct protocols:
\begin{itemize}
\item[-] The network protocol discovers blocks and broadcasts transactions.
\item[-] The transaction protocol specifies what makes a transaction valid.
\item[-] The consensus protocol forms consensus around a unique chain.
\end{itemize}
Tezos implements a generic network shell. This shell is agnostic to the
transaction protocol and to the consensus protocol. We refer to the transaction
protocol and the consensus protocol together as a ``blockchain protocol''. We
will first give a mathematical representation of a blockchain protocol and then
describe some of the implementation choices
in Tezos.
\subsection{Mathematical representation}
A blockchain protocol is fundamentally a monadic implementation of concurrent
mutations of a global state. This is achieved by defining ``blocks'' as
operators acting on this global state. The free monoid of blocks acting on the
genesis state forms a tree structure. A global, canonical, state is defined as
the minimal leaf for a specified ordering.
This suggests the following abstract representation:
\begin{itemize}
\item[-]Let $(\mathbf{S},\leq)$ be a totally ordered, countable, set of possible
states.
\item[-]Let $\oslash \notin \mathbf{S}$ represent a special, invalid, state.
\item[-]Let $\mathbf{B} \subset \mathbf{S}^{\mathbf{S} \cup \{\oslash\}}$ be the
set of blocks. The set of \emph{valid} blocks is
$\mathbf{B} \cap \mathbf{S}^{\mathbf{S}}$.
\end{itemize}
The total order on $\mathbf{S}$ is extended so that
$\forall s \in \mathbf{S}, \oslash < s$.
This order determines which leaf in the block tree is considered to be the
canonical one. Blocks in $\mathbf{B}$ are seen as operators acting on the state.
All in all, any blockchain protocol\footnote{GHOST is an approach which orders
the leafs based on properties of the tree. Such an approach is problematic for
both theoretical and practical reasons. It is almost always better to emulate it
by inserting proofs of mining in the main chain.} (be it Bitcoin, Litecoin,
Peercoin, Ethereum, Cryptonote, etc) can be fully determined by the tuple:
$$\left(\mathbf{S},\leq,\oslash,
\mathbf{B} \subset \mathbf{S}^{\mathbf{S} \cup \{\oslash\}}\right)$$
The networking protocol is fundamentally identical for these blockchains.
``Mining'' algorithms are but an emergent property of the network,
given the incentives for block creation.
In Tezos, we make a blockchain protocol introspective
by letting blocks act on the protocol itself.
We can then express the set of protocols recursively as
$$\mathcal{P} = \left\{\left(\mathbf{S},\leq,\oslash,\mathbf{B} \subset
\mathbf{S}^{(\mathbf{S} \times \mathcal{P})\cup \{\oslash\}} \right)\right\}$$
\subsection{The network shell}
This formal mathematical description doesn't tell us \emph{how} to build the
block tree. This is the role of the network shell, which acts as an interface
between a gossip network and the protocol.
The network shell works by maintaining the best chain known to the client. It is
aware of three type of objects. The first two are transactions and blocks, which
are only propagated through the network if deemed valid. The third are
protocols, OCaml modules used to amend the existing protocol. They will be
described in more details later on. For now we will focus on transaction and
blocks.
The most arduous part of the network shell is to protect nodes against
denial-of-service attacks.
\subsubsection{Clock}
Every block carries a timestamp visible to the network shell. Blocks that appear
to come from the future are buffered if their timestamps are within a few
minutes of the system time and rejected otherwise. The protocol design must
tolerate reasonable clock drifts in the clients and must assume that timestamps
can be falsified.
\subsubsection{Chain selection algorithm}
The shell maintains a single chain rather than a full tree of blocks. This chain
is only overwritten if the client becomes aware of a strictly better chain.
Maintaining a tree would be more parsimonious in terms of network communications
but would be susceptible to denial-of-service attacks where an attacker produces
a large number of low-scoring but valid forks.
Yet, it remains possible for a node to lie about the score of a given
chain, a lie that the client may only uncover after having processed a
potentially large number of blocks. However, such a node can be subsequently
ignored.
Fortunately, a protocol can have the property that low scoring chains exhibit a
low rate of block creation. Thus, the client would only consider a few blocks of
a ``weak'' fork before concluding that the announced score was a lie.
\subsubsection{Network level defense}
In addition, the shell is ``defensive''.
It attempts to connect to many peers across various IP ranges. It detects
disconnected peers and bans malicious nodes.
To protect against certain denial of service attacks, the protocol provides the
shell with context dependent bounds on the size of blocks and transactions.
\subsection{Functional representation}
\subsubsection{Validating the chain}
We can efficiently capture almost all the genericity
of our abstract blockchain structure with the following OCaml types.
To begin with, a block header is defined as:
\lstset{
language=[Objective]Caml
}
\begin{lstlisting}
type raw_block_header = {
pred: Block_hash.t;
header: Bytes.t;
operations: Operation_hash.t list;
timestamp: float;
}
\end{lstlisting}
We are purposefully not typing the header field more strongly so it can
represent arbitrary content. However, we do type the fields necessary for the
operation of the shell. These include the hash of the preceding block, a list of
operation hashes and a timestamp. In practice, the operations included in a
block are transmitted along with the blocks at the network level. Operations
themselves are represented as arbitrary blobs.
\begin{lstlisting}
type raw_operation = Bytes.t
\end{lstlisting}
The state is represented with the help of a \textbf{Context} module which
encapsulates a disk-based immutable key-value store. The structure of a
key-value store is versatile and allows us to efficiently represent a wide
variety of states.
\begin{lstlisting}
module Context = sig
type t
type key = string list
val get: t -> key -> Bytes.t option Lwt.t
val set: t -> key -> Bytes.t -> t Lwt.t
val del: t -> key -> t Lwt.t
(*...*)
end
\end{lstlisting}
To avoid blocking on disk operations, the functions use the asynchronous monad
Lwt\cite{LWT}. Note that the operations on the context are purely functional:
\textbf{get} uses the \textbf{option} monad rather than throwing an exception
while \textbf{set} and \textbf{del} both return a new \textbf{Context}.
The \textbf{Context} module uses a combination of memory caching and disk
storage to efficiently provide the appearance of an immutable store.
We can now define the module type of an arbitrary blockchain protocol:
\begin{lstlisting}
type score = Bytes.t list
module type PROTOCOL = sig
type operation
val parse_block_header : raw_block_header -> block_header option
val parse_operation : Bytes.t -> operation option
val apply :
Context.t ->
block_header option ->
(Operation_hash.t * operation) list ->
Context.t option Lwt.t
val score : Context.t -> score Lwt.t
(*...*)
end
\end{lstlisting}
We no longer compare states directly as in the mathematical model, instead we
project the \textbf{Context} onto a list of bytes using the \textbf{score}
function. List of bytes are ordered first by length, then by
lexicographic order. This is a fairly generic structure, similar to the one used
in software versioning, which is quite versatile in representing various
orderings.
Why not define a comparison function within the protocol modules? First off it
would be hard to enforce the requirement that such a function represent a
\emph{total} order. The score projection always verifies this (ties can be
broken based on the hash of the last block). Second, in principle we need
the ability to compare states across distinct protocols. Specific protocol
amendment rules are likely to make this extremely unlikely to ever happen,
but the network shell does not know that.
The operations \textbf{parse\_block\_header} and \textbf{parse\_operation} are
exposed to the shell and allow it to pass fully typed operations and blocks to
the protocol but also to check whether these operations and blocks are
well-formed, before deciding to relay operations or to add blocks to the local
block tree database.
The apply function is the heart of the protocol:
\begin{itemize}
\item[-]When it is passed a block header and the associated list of operations,
it computes the changes made to the context and returns a modified copy.
Internally, only the difference is stored, as in a versioning system,
using the block's hash as a version handle.
\item[-]When it is only passed a list of operations, it greedily attempts
to apply as many operations as possible. This function is not necessary for the
protocol itself but is of great use to miners attempting to form valid blocks.
\end{itemize}
\subsubsection{Amending the protocol}
Tezos' most powerful feature is its ability to implement protocol capable
of self-amendment. This is achieved by exposing two procedures functions to the
protocol:
\begin{itemize}
\item[-] \textbf{set\_test\_protocol} which replaces the protocol
used in the testnet with a new protocol (typically one that has been adopted
through a stakeholder voter).
\item[-] \textbf{promote\_test\_protocol} which replaces the current
protocol with the protocol currently being tested
\end{itemize}
These functions transform a Context by changing the associated protocol.
The new protocol takes effect when the following block is applied to the chain.
\begin{lstlisting}
module Context = sig
type t
(*...*)
val set_test_protocol: t -> Protocol_hash.t Lwt.t
val promote_test_protocol: t -> Protocol_hash.t -> t Lwt.t
end
\end{lstlisting}
The \textbf{protocol\_hash} is the \textbf{sha256} hash of a tarball of
\textbf{.ml} and \textbf{.mli} files. These files are compiled on the
fly. They have access to a small standard library but are sandboxed
and may not make any system call.
These functions are called through the \textbf{apply} function of the protocol
which returns the new \textbf{Context}.
Many conditions can trigger a change of protocol. In its simplest version,
a stakeholder vote triggers a change of protocol. More complicated rules
can be progressively voted in. For instance, if the stakeholder desire they
may pass an amendment that will require further amendments to provide a
computer checkable proof that the new amendment respects certain properties.
This is effectively an algorithmic check of ``constitutionality''.
\subsubsection{RPC}
In order to make the GUI building job's easier, the protocol exposes a JSON-RPC
API. The API itself is described by a json schema indicating the types of the
various procedures. Typically, functions such as \textbf{get\_balance} can
be implemented in the RPC.
\begin{lstlisting}
type service = {
name : string list ;
input : json_schema option ;
output : json_schema option ;
implementation : Context.t -> json -> json option Lwt.t
}
\end{lstlisting}
The name is a list of string to allow namespaces in the procedures. Input and
output are optionally described by a json schema.
Note that the call is made on a given context which is typically a recent ancestor
of the highest scoring leaf. For instance, querying the context six blocks above
the highest scoring leaf displays the state of the ledger with six confirmations.
The UI itself can be tailored to a specific version of the protocol, or generically
derived from the JSON specification.
\section{Seed protocol}
Much like blockchains start from a genesis hash, Tezos starts with a seed
protocol. This protocol can be amended to reflect virtually any blockchain based
algorithm.
\subsection{Economy}
\subsubsection{Coins}
There are initially $\num{10000000000}$ (ten billion) coins \change{(the initial extent of the token supply will be the number of tokens issued during the crowdsale and not specifically “10 billion”, which was merely a placeholder. This change in size has no effect on the principal at hand)}, divisible up to two
decimal places \change{(for the sake of precision we may in actuality be using eight digits after the decimal)}. We suggest that a single coin be referred to as a ``tez''
and that the smallest unit simply as a cent. We also suggest to use the
symbol \tz~(\verb!\ua729!, ``Latin small letter tz'') to represent a tez.
Therefore 1 cent = \tz\num{0.01} = one hundredth of a tez.
\subsubsection{Mining and signing rewards}
\paragraph{Principle}
We conjecture that the security of any decentralised currency requires
to incentivize the participants with a pecuniary reward \change {(we are in the process of finalizing the rewards schedule at the moment)}. As explained in the
position paper, relying on transaction costs alone suffers from a tragedy of the
commons. In Tezos, we rely on the combination of a bond and a reward.
Bonds are one year \change {(bonds will now only last a single cycle, given the high opportunity cost and little benefit to security of extending the bonding period past one cycle)} security deposits purchased by miners \change {(endorsers will also be required to purchase bonds)}.
In the event of a double signing, these bonds are forfeited.
After a year \change {(cycle)}, the miners \change {(and endorsers)} receive a reward along with their bond to compensate
for their opportunity cost. The security is primarily being provided by the
value of the bond and the reward need only be a small percentage of that value.
The purpose of the bonds is to diminish the amount of reward needed, and perhaps
to use the loss aversion effect to the network's advantage.
\paragraph{Specifics}
In the seed protocol, mining a block offers a reward of \tz\num{512} and
requires a bond of \tz\num{1536}. Signing a block offers a reward of
$32\Delta T^{-1}$ tez where $\Delta T$ is the time interval in minutes between
the block being signed and its predecessor. There are up to 16 signatures per block
and signing requires no bond.\change{These numbers were based on a supply of 10 billions
tokens and we will tweak them accordingly. We may increase the number of signatures
per block as well as we've found in simulations it can strongly increase
the difficulty of forks.}
Thus, assuming a mining rate of one block per minute, about 8\% of the initial
money mass should be held in the form of safety bonds after the first year.
\change{subject to change based on the adjustment of the parameters above}.
The reward schedule implies at most a 5.4\% \emph{nominal} inflation
rate \change {(the total block rewards will still start at about 5\% per year, but we may add an asymptotic cap to the total number of tokens. We think it’s irrelevant when the governance model is aligned with the token holder’s interest, but it is important to some people so we’re reluctantly considering it)}. \emph{Nominal} inflation is neutral, it neither enriches nor
impoverishes anyone\footnote{In contrast, Bitcoin's mining inflation impoverishes
Bitcoin holders as a whole, and central banking enriches the financial
sector at the expense of savers}.
Note that the period of a year is determined from the block's timestamps, not
from the number of blocks. This is to remove uncertainty as to the length of
the commitment made by miners.
\paragraph{Looking forward}
The proposed reward gives miners a 33\% return on their bond \change {(we're currently revising these parameters but will soon finalize a method that makes sense for all parties)}.
This return needs to be high in the early days as miners and signers commit
to hold a potentially volatile asset for an entire year \change {(bonds will only last for a cycle and not a full year)}.
However, as Tezos matures, this return could be gradually lowered to the
prevailing interest rate. A nominal rate of inflation below 1\% could safely be
achieved, though it's not clear there would be any point in doing so.
\subsubsection{Lost coins}
In order to reduce uncertainty regarding the monetary mass, addresses
showing no activity for over a year (as determined by timestamps)
are destroyed along with the coins they contain \change {(inactive addresses will no longer lose their funds after one year as initially proposed in the white paper, they will only lose their staking rights until they become active again. What it means is that if an address is inactive, it will not be selected to create blocks (which would slow down the consensus algorithm), and it will not be allowed to vote until it is reactivated (to avoid uncertainty about participation rate)}.
\subsubsection{Amendment rules}
Amendments are adopted over election cycles lasting $N = 2^{17} = \num{131072}$
blocks each. Given the one minute block interval, this is about three
calendar months. The election cycle is itself divided in four quarters of
$2^{15} = \num{32768}$ blocks. This cycle is relatively short to encourage early
improvements, but it is expected that further amendments will increase the
length of the cycle \change {(protocol upgrade votes will be much more frequent in the first year in order to allow for rapid iteration. As a security measure, the Tezos foundation will have a veto power expiring after twelve months, until we rule out any kinks in the voting procedure)}. Adoption requires a certain quorum to be met. This quorum
starts at $Q = 80\%$ but dynamically adapts to reflect the average
participation. This is necessary if only to deal with lost coins.
\paragraph{First quarter}
Protocol amendments are suggested by submitting the hash of a tarball of
\verb!.ml! and \verb!.mli! files representing a new protocol. Stakeholders may
approve of any number of these protocols. This is known as ``approval voting'',
a particularly robust voting procedure.
\paragraph{Second quarter}
The amendment receiving the most approval in the first quarter is now subject to
a vote. Stakeholders may cast a vote for, against or can choose to explicitly
abstain. Abstentions count towards the quorum.
\paragraph{Third quarter} If the quorum is met (including explicit abstentions),
and the amendment received $80\%$ of yays, the amendment is approved and
replaces the test protocol. Otherwise, it is rejected.
Assuming the quorum reached was $q$, the minimum quorum $Q$ is updated as such:
$$Q \leftarrow 0.8 Q + 0.2 q.$$
The goal of this update is to avoid lost coins causing the voting procedure to
become stuck over time. The minimum quorum is an exponential moving average of
the quorum reached over each previous election.
\paragraph{Fourth quarter} Assuming the amendment was approved, it will have
been running in the testnet since the beginning of the third quarter.
The stakeholders vote a second time to confirm they wish to promote the test
protocol to the main protocol. This also requires the quorum to be met and an
$80\%$ supermajority.
We deliberately chose a conservative approach to amendments. However,
stakeholders are free to adopt amendments loosening or tightening this policy
should they deem it beneficial
\subsection{Proof-of-stake mechanism}
\subsubsection{Overview}
Our proof-of-stake mechanism is a mix of several ideas, including
Slasher\cite{Slasher}, chain-of-activity\cite{CoA}, and proof-of-burn.
The following is a brief overview of the algorithm, the components of which
are explained in more details below.
Each block is mined by a random stakeholder (the miner) and includes
multiple signatures of the previous block provided by random
stakeholders (the signers). Mining and signing both offer a small reward but
also require making a one year \change{(unbonding will happen after one cycle, and not one year as initially suggested. Prolonging the period longer than a cycle did not really improve security at the cost of immobilizing a lot of capital)} safety deposit to be
forfeited in the event of a double mining or double signing.
The protocol unfolds in cycles of \num{2048} blocks. At the beginning of each
cycle, a random seed is derived from numbers that block miners chose and committed
to in the penultimate cycle, and revealed in the last. Using this random seed,
a follow the coin strategy is used to allocate mining rights and signing rights
to a specific address for the next cycle. See figure \ref{fig:pos_figure}.
\begin{figure}[b!]
\centering
\includegraphics[width=0.8\textwidth]{pos_figure.eps}
\caption{Four cycles of the proof-of-stake mechanism}
\label{fig:pos_figure}
\end{figure}
\subsubsection{Clock}
The protocol imposes minimum delays between blocks. In principle, each block
can be mined by any stakeholder. However, for a given block, each stakeholder
is subject to a random minimum delay. The stakeholder receiving the highest
priority may mine the block one minute after the previous block. The
stakeholder receiving the second highest priority may mine the block two
minutes after the previous block, the third, three minutes, and so on.
This guarantees that a fork where only a small fraction of stakeholder
contribute will exhibit a low rate of block creation. If this weren't
the case, a CPU denial of service attacks would be possible by
tricking nodes into verifying a very long chain claimed to have a very high
score.
\subsubsection{Generating the random seed}
Every block mined carries a hash commitment to a random number chosen by the
miner. These numbers must be revealed in the next cycle under penalty of
forfeiting the safety bond. This harsh penalty is meant to prevent selective
withholding of the numbers which could be used to attack the entropy of the seed.
Malicious miners in the next cycle could attempt to censor such reveals, however
since multiple numbers may be revealed in a single block, they are very unlikely
to succeed.
All the revealed numbers in a cycle are combined in a hash list and the seed is
derived from the root using the \verb!scrypt! key derivation function. The key
derivation should be tuned so that deriving the seed takes on the order of a
fraction of a percent of the average validation time for a block on a typical
desktop PC.
\subsubsection{Follow-the-coin procedure}
In order to randomly select a stakeholder, we use a follow the coin procedure.
\paragraph{Principle}
The idea is known in bitcoin as follow-the-satoshi. The procedures works
``as-if'' every satoshi ever minted had a unique serial number. Satoshis are
implicitly ordered by creation time, a random satoshi is drawn and tracked
through the blockchain. Of course, individual cents are not tracked directly.
Instead, rules are applied to describe what happens when inputs are combined and
spent over multiple output.
In the end, the algorithm keeps track of a set of intervals associated with each
key. Each interval represents a ``range'' of satoshis.
Unfortunately, over time, the database becomes more and more fragmented,
increasing bloat on the client side.
\paragraph{Coin Rolls}
We optimize the previous algorithm by constructing large ``coin rolls'' made up
of \num{10000} tez. There are thus about one million rolls in existence. A
database maps every roll to its current owner.
Each address holds a certain set of specific rolls as well as some loose change.
When we desire to spend a fraction of a full roll, the roll is broken and
its serial number is sent in a LIFO queue of rolls, a sort of ``limbo''. Every
transaction is processed in a way that minimizes the number of broken rolls.
Whenever an address holds enough coins to form a roll, a serial number is pulled
from the queue and the roll is formed again.
The LIFO priority ensures that an attacker working on a secret fork cannot
change the coins he holds by shuffling change between accounts.
A slight drawback of this approach is that stake is rounded down to the
nearest integer number of rolls. However, this provides a massive improvement
in efficiency over the follow-the-satoshi approach.
While the rolls are numbered, this approach does not preclude the use of
fungibility preserving protocols like Zerocash. Such protocols can use
the same ``limbo'' queue technique.
\paragraph{Motivation}
This procedure is functionally different from merely drawing a random address
weighted by balance.
Indeed, in a secretive fork, a miner could attempt to control the generation of
the random seed and to assign itself signing and minting rights by creating the
appropriate addresses ahead of time. This is much harder to achieve if rolls
are randomly selected, as the secretive fork cannot fake ownership of certain
rolls and must thus try to preimage the hash function applied to the seed to
assign itself signing and minting rights.
Indeed, in a cycle of length $N=\num{2048}$, someone holding a fraction $f$ of
the rolls will receive on average $f N$ mining rights, and the effective
fraction received, $f_0$ will have a standard deviation of
$$\sqrt{\frac{1}{N}}\sqrt{\frac{1-f}{f}}.$$
If an attacker can perform a brute-force search through $W$ different seeds,
then his expected advantage is at most\footnote{this is a standard bound
on the expectation of the maximum of W normally distributed variable}
$$\left(\sqrt{\frac{2\log(W)}{N}}\sqrt{\frac{1-f}{f}}\right)fN$$
blocks. For instance, an attacker controlling $f = 10\%$ of the rolls should
expect to mine about $205$ blocks per cycle. In a secret fork where he attempts
to control the seed, assuming he computed over a trillion hashes, he could
assign itself about $302$ blocks, or about $14.7\%$ of the blocks. Note that:
\begin{itemize}
\item[-] The hash from which the seed is derived is an expensive key derivation
function, rendering brute-force search impractical.
\item[-] To make linear gains in blocks mined, the attacked needs to expend a
quadratically exponential effort.
\end{itemize}
\subsubsection{Mining blocks}
The random seed is used to repeatedly select a roll. The first roll selected
allows its stakeholder to mine a block after one minute, the second one after
two minutes --- and so on.
When a stakeholder observes the seed and realizes he can mint a high priority
block in the next cycle, he can make a security deposit.
To avoid a potentially problematic situation were no stakeholder made a
safety deposit to mine a particular block, after a 16 minutes delay, the
block may be mined without a deposit.
Bonds are implicitly returned to their buyers immediately in any chain
where they do not mine the block.
\subsubsection{Signing blocks}
As it is, we almost have a working proof of stake system.
We could define a chain's weight to be the number of blocks.
However, this would open the door to a form of selfish mining.
We thus introduce a signing scheme. While a block is being minted, the random
seed is used to randomly assign 16 signing rights to 16 rolls.
The stakeholders who received signing rights observe the blocks being minted and
then submit signatures of that blocks. Those signatures are then included in
the next block, by miners attempting to secure their parent's inclusion in the
blockchain.
The signing reward received by signers is inversely proportional to the time
interval between the block and its predecessor.
Signer thus have a strong incentive to sign what they genuinely believe to be
the best block produced at one point. They also have a strong incentive to agree
on which block they will sign as signing rewards are only paid if the block ends
up included in the blockchain.
If the highest priority block isn't mined (perhaps because the miner isn't
on line), there could be an incentive for signers to wait for a while, just
in case the miner is late. However, other signers may then decide to sign the
best priority block, and a new block could include those signatures, leaving out
the holdouts. Thus, miners are unlikely to follow this strategy.
Conversely, we could imagine an equilibrium where signers panic and start
signing the first block they see, for fear that other signers will do so and
that a new block will be built immediately. This is however a very contrived
situation which benefits no one. There is no incentive for signers to think this
equilibrium is likely, let alone to modify the code of their program to act
this way. A malicious stakeholder attempting to disrupt the operations would only
hurt itself by attempting to follow this strategy, as others would be unlikely
to follow suit.
\subsubsection{Weight of the chain}
The weight is the number of signatures.
\subsubsection{Denunciations}
In order to avoid the double minting of a block or the double signing of a
block, a miner may include in his block a denunciation.
This denunciation takes the form of two signatures. Each minting signature
or block signature signs the height of the block, making the proof of malfeasance quite concise.
While we could allow anyone to denounce malfeasance, there is really no point to
allow anyone else beyond the block miner. Indeed, a miner can
simply copy any proof of malfeasance and pass it off as its own
discovery.\footnote{A zero-knowledge proof would allow anyone to benefit from
denouncing malfeasance, but it's not particularly clear this carries much
benefit.}
Once a party has been found guilty of double minting or double signing,
the safety bond is forfeited.
\subsection{Smart contracts}
\subsubsection{Contract type}
In lieu of unspent outputs, Tezos uses stateful accounts. When those
accounts specify executable code, they are known more generally as
contracts. Since an account is a type of contract (one with no
executable code), we refer to both as "contracts" in full generality.
Each contract has a ``manager", which in the case of an account is
simply the owner. If the contract is flagged as spendable, the manager
may spend the funds associated with the contract. In addition, each
contract may specify the hash of a public key used to sign or
mine blocks in the proof-of-stake protocol. The private key may or
may not be controlled by the manager.
Formally, a contract is represented as:
\begin{lstlisting}
type contract = {
counter: int; (* counter to prevent repeat attacks *)
manager: id; (* hash of the contract's manager public key *)
balance: Int64.t; (* balance held *)
signer: id option; (* id of the signer *)
code: opcode list; (* contract code as a list of opcodes *)
storage: data list; (* storage of the contract *)
spendable: bool; (* may the money be spent by the manager? *)
delegatable: bool; (* may the manager change the signing key? *)
}
\end{lstlisting}
The handle of a contract is the hash of its initial content. Attempting
to create a contract whose hash would collide with an existing contract
is an invalid operation and cannot be included in a valid block.
Note that data is represented as the union type.
\begin{lstlisting}
type data =
| STRING of string
| INT of int
\end{lstlisting}
where \verb!INT! is a signed 64-bit integer and string is an array of
up to \num{1024} bytes. The storage capacity is limited to \num{16384} bytes,
counting the integers as eight bytes and the strings as their length.
\subsubsection{Origination}
The origination operation may be used to create a new contract, it specifies
the code of the contract and the initial content of the contract's storage. If
the handle is already the handle of an existing contract, the origination is
rejected (there is no reason for this to ever happen, unless by mistake or
malice).
A contract needs a minimum balance of $\tz~\num{1}$ to remain active. If the
balance falls below this number, the contract is destroyed.
\subsubsection{Transactions}
A transaction is a message sent from one contract to another contract, this
messages is represented as:
\begin{lstlisting}
type transaction = {
amount: amount; (* amount being sent *)
parameters: data list; (* parameters passed to the script *)
(* counter (invoice id) to avoid repeat attacks *)
counter: int;
destination: contract hash;
}
\end{lstlisting}
Such a transaction can be sent from a contract if signed using the manager's key
or can be sent programmatically by code executing in the contract. When the
transaction is received, the amount is added to the destination contract's
balance and the destination contract's code is executed. This code can make use
of the parameters passed to it, it can read and write the contract's storage,
change the signature key and post transactions to other contracts.
The role of the counter is to prevent replay attacks. A transaction is only
valid if the contract's counter is equal to the transaction's counter. Once a
transaction is applied, the counter increases by one, preventing the transaction
from being reused.
The transaction also includes the block hash of a recent block that the client
considers valid. If an attacker ever succeeds in forcing a long reorganization
with a fork, he will be unable to include such transactions, making the fork
obviously fake. This is a last line of defense, TAPOS is a great system to
prevent long reorganizations but not a very good system to prevent short term
double spending.
The pair (account\_handle, counter) is roughly the equivalent of an unspent
output in Bitcoin.
\subsubsection{Storage fees}
Since storage imposes a cost on the network, a minimum fee of \tz~1 is assessed
for each byte increase in the storage. For instance, if after the execution of
a transaction, an integer has been added to the storage and ten characters have
been appended to an existing string in the storage, then \tz~18 will be withdrawn
from the contract's balance and destroyed.
\subsubsection{Code}
The language is stack based, with high level data types and primitives and strict
static type checking. Its design is inspired by Forth, Scheme, ML and Cat.
A full specification of the instruction set is available in\cite{language}.
This specification gives the complete instruction set, type system and semantics
of the language. It is meant as a precise reference manual, not an easy introduction.
\subsubsection{Fees}
So far, this system is similar to the way Ethereum handles transaction. However,
we differ in the way we handle fees. Ethereum allows arbitrarily long programs
to execute by requiring a fee that increases linearly with the program's
executing time. Unfortunately, while this does provide an incentive for one
miner to verify the transaction, it does not provide such an incentive to other
miners, who must also verify this transaction. In practice, most of the
interesting programs that can be used for smart contracts are very short.
Thus, we simplify the construction by imposing a hard cap on the number of steps
we allow the programs to run for.
If the hard cap proves too tight for some programs, they can break the execution
in multiple steps and use multiple transactions to execute fully. Since Tezos is
amendable, this cap can be changed in the future, or advanced primitives can be
introduced as new opcodes.
If the account permits, the signature key may be changed by issuing a signed
message requesting the change.
\section{Conclusion}
We feel we've built an appealing seed protocol. However, Tezos' true potential
lies in putting the stakeholders in charge of deciding on a protocol that they
feel best serves them.
\bibliographystyle{plain}
\bibliography{biblio}
\end{document}