A PYTHON INTERPRETER
Python is a programming language, while CPython is its primary
implementation, written in C (and Python - yeah, I know).
A Python interpreter: "interpreter" is a generic term in the
Python context. It could mean one of the following:
i) The Python REPL
ii) Running Python code from start to finish.
In this mini-study, the term "INTERPRETER" refers to
the final step in the execution of a Python program.
Before the interpreter assumes control, the lexer, parser,
and compiler act on the source text to convert it into
structured code objects that contain instructions the
interpreter understands.
NB: Python has a compiling step
The interpreter takes the code object and follows
the instructions.
NB: The compilation step in Python's execution process does
relatively less work (and the interpreter does more) than in a
compiled language.
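The compile-then-interpret split can be seen from Python itself.
A minimal sketch using the built-ins compile() and exec(); the
source string and the "<example>" filename are made up for
illustration:

```python
# Source text that has not been executed yet.
source = "result = 2 + 3"

# compile() runs the lexer, parser, and compiler and returns a code object.
code_obj = compile(source, "<example>", "exec")
print(type(code_obj).__name__)  # code

# exec() hands the code object to the interpreter, which follows
# the instructions inside it.
namespace = {}
exec(code_obj, namespace)
print(namespace["result"])  # 5
```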
BUILDING THE INTERPRETER
Virtual machines are software that emulates a physical
computer. The Python interpreter is a **virtual machine**
that works like a stack machine: it manipulates stack(s)
to perform its operations (as opposed to a register machine,
which reads from and writes to particular memory locations).
why?
The interpreter is a bytecode interpreter. It takes a set of
instructions called **bytecode** as input.
The lexer, parser, and compiler generate code objects.
Each code object contains a set of instructions to be
executed - bytecode - and other information that the
interpreter requires.
code object = bytecode + other information
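To make "bytecode + other information" concrete, here is a small
sketch that pokes at the code object attached to a function. The
function add_three is a made-up example; the co_* attributes are
standard attributes of code objects:

```python
def add_three(x):
    return x + 3

code_obj = add_three.__code__

# The bytecode itself: raw instructions stored as bytes.
print(type(code_obj.co_code).__name__)  # bytes

# Some of the "other information" the interpreter requires.
print(code_obj.co_name)      # add_three
print(code_obj.co_varnames)  # ('x',)
print(code_obj.co_consts)    # constants referenced by the bytecode
```

The exact contents of co_code and co_consts vary between Python
versions, so no output is shown for them.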
In building the interpreter, the operations are
represented as methods on an Interpreter class, e.g.
LOAD_VALUE, ADD_TWO_VALUES, PRINT_NUMBER, etc. Each of
these operations acts on a stack (represented as a list
object inside the interpreter).
LOAD_VALUE: pushes an element onto the stack.
ADD_TWO_VALUES: pops two elements from the stack, adds them
together, and pushes the result onto the stack.
PRINT_NUMBER: pops an element from the stack and prints
it.
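The three operations above can be sketched as follows. This is a
toy, not CPython's implementation: the run() method and the
(name, argument) tuple format for instructions are assumptions
made for this example.

```python
class Interpreter:
    def __init__(self):
        self.stack = []  # the stack, represented as a plain list

    def LOAD_VALUE(self, number):
        self.stack.append(number)

    def ADD_TWO_VALUES(self):
        first = self.stack.pop()
        second = self.stack.pop()
        self.stack.append(first + second)

    def PRINT_NUMBER(self):
        print(self.stack.pop())

    def run(self, instructions):
        # Hypothetical driver: look up each operation by name and call it.
        for name, argument in instructions:
            method = getattr(self, name)
            if argument is None:
                method()
            else:
                method(argument)


# Made-up instruction list for "print(7 + 5)".
program = [
    ("LOAD_VALUE", 7),
    ("LOAD_VALUE", 5),
    ("ADD_TWO_VALUES", None),
    ("PRINT_NUMBER", None),
]
Interpreter().run(program)  # prints 12
```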
VARIABLES
Variables require an instruction for storing the value of
a variable, STORE_NAME; an instruction for retrieving the
value of a variable, LOAD_NAME; and a mapping of variable
names to values.
The mapping is achieved by defining an environment instance
variable - represented as a Python dictionary - on the
Interpreter class (this is a hack for now).
STORE_NAME: pops an element from the stack and stores it in
the environment under the appropriate key.
LOAD_NAME: given a key, gets the associated value and pushes it
onto the stack.
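A sketch of STORE_NAME and LOAD_NAME on the same kind of toy
Interpreter class. The environment attribute follows the
description above; the run() method and the tuple-based
instruction format are assumptions for illustration.

```python
class Interpreter:
    def __init__(self):
        self.stack = []
        self.environment = {}  # maps variable names to values

    def LOAD_VALUE(self, number):
        self.stack.append(number)

    def STORE_NAME(self, name):
        # Pop the value and remember it under the given name.
        self.environment[name] = self.stack.pop()

    def LOAD_NAME(self, name):
        # Look the name up and push its value.
        self.stack.append(self.environment[name])

    def ADD_TWO_VALUES(self):
        right = self.stack.pop()
        left = self.stack.pop()
        self.stack.append(left + right)

    def run(self, instructions):
        # Hypothetical driver: each instruction is (name, *arguments).
        for name, *arguments in instructions:
            getattr(self, name)(*arguments)


# Roughly: a = 1; b = 2; a + b
program = [
    ("LOAD_VALUE", 1), ("STORE_NAME", "a"),
    ("LOAD_VALUE", 2), ("STORE_NAME", "b"),
    ("LOAD_NAME", "a"), ("LOAD_NAME", "b"),
    ("ADD_TWO_VALUES",),
]
vm = Interpreter()
vm.run(program)
print(vm.environment)  # {'a': 1, 'b': 2}
print(vm.stack)        # [3]
```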
REAL PYTHON BYTECODE
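In older versions of CPython (before 3.6), one byte represented
each instruction and two bytes represented its argument (such
as the index of a name or constant). Since CPython 3.6, every
instruction takes two bytes: one for the opcode and one for the
argument. That is why the bytecode indices in the listing below
advance in steps of two. Some instructions are also followed by
hidden inline cache entries, which would explain the jump from
index 10 to 14 after COMPARE_OP.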
Python source
-------------
1  def cond():
2      x = 3
3      if x < 5:
4          return 'yes'
5      else:
6          return 'no'
Equivalent bytecode for **cond**
--------------------------------
  1           0 RESUME                   0

  2           2 LOAD_CONST               1 (3)
              4 STORE_FAST               0 (x)

  3           6 LOAD_FAST                0 (x)
              8 LOAD_CONST               2 (5)
             10 COMPARE_OP               2 (<)
             14 POP_JUMP_IF_FALSE        1 (to 18)

  4          16 RETURN_CONST             3 ('yes')

  6     >>   18 RETURN_CONST             4 ('no')
Each line has up to five columns, explained thus:
first column - line number in the Python source
second column - bytecode index
third column - the instruction itself
fourth column - argument to the instruction
fifth column - hint as to what the argument means
A ">>" marker before the bytecode index flags an instruction
that is the target of a jump.
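A listing like the one above comes from the standard dis module.
A short sketch; the exact instructions and indices depend on the
Python version in use, so expect differences from the listing in
these notes:

```python
import dis


def cond():
    x = 3
    if x < 5:
        return 'yes'
    else:
        return 'no'


# Human-readable listing with the columns described above.
dis.dis(cond)

# The same data, one Instruction object at a time.
for instr in dis.get_instructions(cond):
    print(instr.offset, instr.opname, instr.arg, instr.argrepr)
```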
Conditionals and loops in Python rely on jumping from
one instruction index to another.
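Jumping boils down to overwriting the index of the next
instruction. A toy sketch of the idea; the instruction names
COMPARE_LESS, JUMP, and EMIT are invented for this example and
are not real CPython opcodes:

```python
def run(instructions):
    """Run a toy program and return the list of emitted values."""
    stack, output = [], []
    pointer = 0  # index of the next instruction to execute
    while pointer < len(instructions):
        name, argument = instructions[pointer]
        pointer += 1
        if name == "LOAD_VALUE":
            stack.append(argument)
        elif name == "COMPARE_LESS":
            right, left = stack.pop(), stack.pop()
            stack.append(left < right)
        elif name == "POP_JUMP_IF_FALSE":
            if not stack.pop():
                pointer = argument  # conditional jump
        elif name == "JUMP":
            pointer = argument      # unconditional jump
        elif name == "EMIT":
            output.append(stack.pop())
    return output


# Roughly: if 3 < 5: emit 'yes' else: emit 'no'
program = [
    ("LOAD_VALUE", 3),         # 0
    ("LOAD_VALUE", 5),         # 1
    ("COMPARE_LESS", None),    # 2
    ("POP_JUMP_IF_FALSE", 7),  # 3
    ("LOAD_VALUE", "yes"),     # 4
    ("EMIT", None),            # 5
    ("JUMP", 9),               # 6
    ("LOAD_VALUE", "no"),      # 7
    ("EMIT", None),            # 8
]
print(run(program))  # ['yes']
```

A loop works the same way, except that the jump target is an
earlier index.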
FRAMES
The Python interpreter is a stack machine that performs operations
by pushing values onto and popping them off the stack.
Frames are another layer of complexity housed in the Python
interpreter. They are created and destroyed on the fly
during program execution.
A frame is associated with each module, function call
and class definition.
While each frame has exactly one code object associated with it,
a code object can have multiple frames (for example, one per
call of a recursive function).
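The "one code object, many frames" relationship can be checked
with a recursive function. A sketch using inspect.currentframe();
the countdown function is a made-up example:

```python
import inspect


def countdown(n, frames):
    # Record the frame created for this particular call.
    frames.append(inspect.currentframe())
    if n > 0:
        countdown(n - 1, frames)


frames = []
countdown(2, frames)

print(len(frames))  # 3 (one frame per call)
# Every frame points at the same code object.
print(all(f.f_code is countdown.__code__ for f in frames))  # True
# But the frames themselves are distinct objects.
print(len({id(f) for f in frames}))  # 3
```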
There are also three types of stacks to consider:
- frames live on the **call stack** (think of an exception traceback.
Each line starting with "File "<stdin>", line number, ..."
corresponds to a frame on the call stack).
- the **data stack** is the one we've been manipulating in our
makeshift interpreter.
- the **block stack** is used for certain kinds of control flow:
looping and exception handling.
NB: Each frame on the call stack has its own DATA and BLOCK stack.
The bytecode instruction RETURN_VALUE, which corresponds
to the **return** statement in the code, is used to pass values
between frames.
i) the top value is popped off the data stack of the
top frame on the call stack.
ii) the entire top frame is then popped off the call stack.
iii) the value from (i) above is pushed onto the data
stack of the next frame.
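The three steps above can be sketched with a bare-bones Frame
class and a list standing in for the call stack. The class, the
frame names, and the return_value() helper are all hypothetical:

```python
class Frame:
    def __init__(self, name):
        self.name = name
        self.data_stack = []  # each frame has its own data stack


def return_value(call_stack):
    # i) pop the top value off the data stack of the top frame
    value = call_stack[-1].data_stack.pop()
    # ii) pop the entire top frame off the call stack
    call_stack.pop()
    # iii) push the value onto the data stack of the next frame
    call_stack[-1].data_stack.append(value)


call_stack = [Frame("caller"), Frame("callee")]
call_stack[-1].data_stack.append("yes")  # value the callee will return

return_value(call_stack)
print([frame.name for frame in call_stack])  # ['caller']
print(call_stack[-1].data_stack)             # ['yes']
```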
EXAMINING THE INTERPRETER
VirtualMachine: The highest level of the interpreter.
It manages the call stack and the mapping of instructions to
operations. Only one instance is created each time the program
is run.
Frame: An instance has one code object, manages the namespaces,
and holds a reference to the calling frame and the last bytecode
instruction executed.
Function: Used in place of a Python function so we can
control the creation of new frames. Every function call
creates a new frame.
Block: Wraps the three attributes of blocks (required so our
interpreter can run real Python code).
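A skeleton of how VirtualMachine and Frame might fit together,
going only by the descriptions above. All attribute and method
names here are assumptions, and Function and Block are left out:

```python
class Frame:
    def __init__(self, code_obj, global_names, local_names, prev_frame):
        self.code_obj = code_obj          # the one code object this frame runs
        self.global_names = global_names  # namespaces managed by the frame
        self.local_names = local_names
        self.prev_frame = prev_frame      # the calling frame, or None
        self.last_instruction = 0         # last bytecode instruction executed
        self.data_stack = []
        self.block_stack = []


class VirtualMachine:
    def __init__(self):
        self.frames = []   # the call stack
        self.frame = None  # the frame currently executing

    def push_frame(self, frame):
        self.frames.append(frame)
        self.frame = frame

    def pop_frame(self):
        self.frames.pop()
        self.frame = self.frames[-1] if self.frames else None


vm = VirtualMachine()
module_code = compile("x = 1", "<example>", "exec")
module_frame = Frame(module_code, {}, {}, prev_frame=None)
vm.push_frame(module_frame)
print(len(vm.frames))            # 1
vm.pop_frame()
print(vm.frame is None)          # True
```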