Overlord for Lispers
Take a moment to think about how strange defvar must seem coming from other programming languages.
To review: there are two ways to define a global variable (defparameter and defvar). The difference between them is how they behave when evaluated again in the same session: defvar evaluates its initial value form only if the variable is not already bound.
(All examples are in the overlord-user package, which uses Overlord as well as Alexandria and Serapeum.)
(defparameter *table-1* (dict :x 1))
(gethash :x *table-1*) => 1
(defparameter *table-1* (dict :x 2))
;; The new value overrides the old value.
(gethash :x *table-1*) => 2
(defvar *table* (dict :x 1))
(gethash :x *table*) => 1
(defvar *table* (dict :x 2))
;; The old value is preserved.
(gethash :x *table*) => 1
Why do we need defvar? To hold state that we expect to change during the program’s run. And why do we need defvar to behave the way it does – not to re-evaluate its form? To avoid destroying the state we have built up in the course of development when we re-load our source files.
(If you think about it, defvar summarizes a lot of what is distinctive about development in Lisp.)
But there’s something familiar about the behavior of defvar. It’s a bit like a target in a build system. Specifically, it’s a bit like an order-only dependency in a Makefile.
In Make, order-only dependencies are the canonical way to depend on the existence of a directory. If the directory does not exist, it is created; if the directory already exists, nothing happens.
You need a directory to exist before you can start writing to files in it. A variable defined with defvar that holds a hash table, so that you can write to the keys of that table later in your program, is morally the same thing.
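The directory half of this analogy can be tried in standard Common Lisp. The following is a minimal sketch, not Overlord code: the helper name ensure-dir and the scratch location under /tmp/ are assumptions made for illustration.

```lisp
;; Hypothetical helper, not part of Overlord: an "order-only" directory step.
(defun ensure-dir (dir)
  "Create directory DIR if it is missing. Return true only if it was created."
  ;; ENSURE-DIRECTORIES-EXIST returns two values; the second is true
  ;; when at least one directory had to be created.
  (nth-value 1 (ensure-directories-exist dir)))

;; Illustrative scratch location (assumes a Unix-like /tmp/).
;; The trailing slash marks the pathname as a directory.
(defparameter *scratch-dir*
  (merge-pathnames
   (format nil "order-only-demo-~36R/"
           (random (expt 36 8) (make-random-state t)))
   #p"/tmp/"))

(ensure-dir *scratch-dir*) ; first call: the directory is created
(ensure-dir *scratch-dir*) ; second call: nothing happens
```

Like defvar, the second call leaves whatever is already there alone.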
State in the Lisp image is analogous to state in the file system, and we can tame it the same way we tame the file system: with a sufficiently expressive build system.
(Note that Overlord also builds files: the point is that state in the Lisp system and state in the file system have enough in common that the difference can be hidden behind a single protocol. The build system doesn't have to know or care if a target is a variable or a file.)
The most annoying thing about defvar is that if you change the definition, nothing happens – the variable is not updated. Now consider overlord:define-var-once.
(define-var-once *table* (dict :x 1))
(gethash :x *table*) => 1
(incf (gethash :x *table*))
(define-var-once *table* (dict :x 1))
(gethash :x *table*) => 2
(define-var-once *table* (dict :x 3))
(gethash :x *table*) => 3
A define-var-once form behaves the same as defvar – unless the definition is changed. If the definition changes, the variable is considered out of date.
For defvar, the variable is considered out of date under only one condition:
- the variable is unbound
But for define-var-once, the variable is considered out of date under either of two conditions:
- the variable is unbound
- the definition in the define-var-once form has changed
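To make the two conditions concrete, here is a minimal sketch in plain Common Lisp. It is not how Overlord implements define-var-once; the macro name sketch-define-var-once and the table *last-definitions* are hypothetical, and the "definition" is compared simply as a quoted form.

```lisp
;; Hypothetical bookkeeping, not Overlord's: remember the form each
;; variable was last initialized from.
(defvar *last-definitions* (make-hash-table :test 'eq)
  "Maps a variable name to the form it was last initialized from.")

(defmacro sketch-define-var-once (name form)
  "Re-evaluate FORM into NAME if NAME is unbound or FORM has changed."
  `(progn
     (defvar ,name)                     ; proclaim special, leave unbound
     (when (or (not (boundp ',name))    ; condition 1: unbound
               ;; condition 2: the definition differs from the recorded one
               (not (equal ',form (gethash ',name *last-definitions*))))
       (setf ,name ,form
             (gethash ',name *last-definitions*) ',form))
     ',name))

(sketch-define-var-once *counter* (list 1))
(incf (first *counter*))                    ; state built up: (2)
(sketch-define-var-once *counter* (list 1)) ; same definition: still (2)
(sketch-define-var-once *counter* (list 3)) ; changed definition: now (3)
```

The sequence at the end mirrors the *table* example above: re-evaluating an unchanged definition keeps the accumulated state, while editing the definition resets it.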
Would it be useful to be able to add other conditions? Consider a variable that reads in data from a file.
(defvar *file-lines*
  (let ((file (asdf:system-relative-pathname :my-project "my-file.txt")))
    (lines (read-file-into-string file))))
Using defvar saves effort: the file will only be read in once. But what if the file changes? Then we have to specifically ask Lisp to re-evaluate the definition.
This is a job for overlord:define-target-var.
(defparameter *my-file*
  (asdf:system-relative-pathname :my-project "my-file.txt"))
(define-target-var *file-lines*
    (lines (read-file-into-string *my-file*))
  (depends-on *my-file*))
Now we have a variable that will be re-evaluated under three conditions:
- if the variable is unbound
- if the definition changes
- if the file found at *my-file* changes
A target can be rebuilt by re-evaluating its definition, but it can also be rebuilt by name with build:
(overlord:build '*file-lines*)
The new definition of *file-lines* is now much more precise. But there is an obvious flaw: what if you change the name of the file? Then *file-lines* will be out of date, because it will contain the data from the wrong file.
Here’s a better version:
(defconfig +my-file+
  (asdf:system-relative-pathname :my-project "my-file.txt"))
(define-target-var *file-lines*
    (lines (read-file-into-string +my-file+))
  (depends-on '+my-file+)
  (depends-on +my-file+))
Notice the new dependency. We now have a variable that will be re-evaluated under four conditions:
- if the variable is unbound
- if the definition changes
- if the file (initially) found at +my-file+ changes
- if the value of the variable +my-file+ changes
The quotation mark in the depends-on form creates a dependency on the variable itself, rather than on the value of the variable. Variables defined with Overlord can depend on other variables.
(You should use a +cage+ instead of *earmuffs* when defining variables with defconfig, for reasons that are outside the scope of a tutorial.)
Up to this point we have pretended that depends-on is declarative.
Consider a situation slightly similar to the above. In your project you have a file that holds a list of other files. You want to read all those files into a single string.
(defconfig +my-file+
  (asdf:system-relative-pathname :my-project "my-file.txt"))
(define-target-var *file-lines*
    (lines (read-file-into-string +my-file+))
  (depends-on '+my-file+)
  (depends-on +my-file+))
(define-target-var *big-string*
    (with-output-to-string (s)
      (dolist (file *file-lines*)
        ;; Pass the stream explicitly so the text is collected in S.
        (write-string
         (read-file-into-string
          (file file))
         s)))
  (depends-on '*file-lines*)
  ;;; !!!
  (dolist (file *file-lines*)
    (depends-on (uiop:parse-unix-namestring file))))
You can see that depends-on is actually just a function: not a keyword, not a macro; just a function.
(You can even call depends-on outside of the definition of a target. This records a dependency for the current package; yes, packages are targets too.)
Warning: the above example is not idiomatic. If you have a list of targets, instead of using dolist, you should just call depends-on; it automatically flattens its arguments.
(depends-on (mapcar #'uiop:parse-unix-namestring *file-lines*))
At this point we need to stop and address a question: how does Overlord find your files? This is the subject of a separate article. Short answer: if your package and system have different names, you may need to use overlord:set-package-system to associate them.
Now that we know how file names are resolved, we are ready to introduce file-target and defpattern.
Up to this point, we’ve been comparing Overlord to Make, but now that we are actually building files, it must be admitted that Overlord does not use the Make model; it is actually based on Redo. This is why it doesn't matter where you define your dependencies: unlike Make (or ASDF, for that matter) Overlord, being based on Redo, is naturally recursive.
Here is an example from the documentation of the Apenwarr implementation of Redo. It does a good job of showing how a Redo-style system differs from the Make model. In this case we are compiling a C file. As a side effect of the compilation, a list of header files is written to disk. We then read the file and depend on the header files it contains.
(defpattern c-object-file (:in in :out out) ()
  (depends-on in)
  (let ((in.d (path-join in (extension "d")))
        (in.c (path-join in (extension "c"))))
    (cmd "gcc -MD -MF" in.d "-c -o" out in.c)
    (let ((deps (read-file-into-string in.d)))
      (depends-on (mapcar #'uiop:parse-unix-namestring (lines deps))))))
(file-target my-prog (:path "myprog" :out out)
  (let ((deps (list #p"a.o" #p"b.o")))
    (depends-on
     (loop for dep in deps
           collect (pattern-from 'c-object-file dep)))
    (cmd "gcc -o" out deps)))
The :path argument to file-target is optional; if you omit it the path is derived by downcasing (actually, case-inverting) the name of the target.
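"Case-inverting" can be read as the usual Lisp convention for mapping symbol names to file names. The sketch below assumes that rule (all-uppercase becomes lowercase, all-lowercase becomes uppercase, mixed case is left alone); the function invert-case is hypothetical and is not taken from Overlord, whose exact rule may differ.

```lisp
;; Hypothetical helper illustrating one common case-inversion rule.
(defun invert-case (string)
  "Swap the case of STRING if it is uniformly cased; leave mixed case alone."
  ;; Only characters that have both cases take part in the decision,
  ;; so digits and hyphens are ignored.
  (let ((cased (remove-if-not #'both-case-p string)))
    (cond ((every #'upper-case-p cased) (string-downcase string))
          ((every #'lower-case-p cased) (string-upcase string))
          (t string))))

(invert-case "MY-PROG") ; => "my-prog"
(invert-case "my-prog") ; => "MY-PROG"
(invert-case "MyProg")  ; => "MyProg"
```

Under the default reader settings a symbol written as my-prog has the name "MY-PROG", so under this rule it would map to the file name "my-prog", which is why the result usually looks like plain downcasing.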
Note that both patterns and files must have names. In the case of a pattern the name is bound as a class; in the case of a file it is bound as a global lexical (or a special variable, if the name has *earmuffs*).
Giving things names is the cost of doing everything inside Lisp. In one Lisp image there could be multiple projects using the same relative pathnames, or defining different ways to build files based on their extension. Names are necessary to differentiate them.
cmd is a DSL for shell commands. You should use it when possible, as it ensures that the command is run in the right directory, even in a multi-threaded build.
defpattern defines a class. Equivalently, you could define the class using defclass and specialize some generic functions. But you should use defpattern if you can, since only patterns defined using defpattern will be considered out of date if the definition changes.
The syntax of file-target is complex, since it needs to address different scenarios:
- Is the input file name hard-coded?
- Should creating the file be atomic?
Creating files atomically is preferable – which is why there is syntactic support for it – but it is, unfortunately, sometimes impractical. In particular, surprisingly many command-line tools insist on generating an output file under a name derived from the input file and do not accept options to redirect to a different file or to stdout.
The build script for a file target is not required to update the file it builds (unless the file does not exist). The utility write-file-if-changed takes a string, or an array of bytes, and writes them out to a file only if they are different from the file's existing contents.
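The idea behind write-file-if-changed can be sketched in standard Common Lisp. This is not Overlord's implementation: the names file-contents-or-nil and sketch-write-file-if-changed are hypothetical, and the sketch handles only strings, not byte arrays.

```lisp
;; Hypothetical helpers, for illustration only.
(defun file-contents-or-nil (path)
  "Return the contents of PATH as a string, or NIL if PATH does not exist."
  (with-open-file (in path :if-does-not-exist nil)
    (when in
      (let* ((buffer (make-string (file-length in)))
             (end (read-sequence buffer in)))
        ;; FILE-LENGTH may overestimate the character count for
        ;; multi-byte encodings, so trim to what was actually read.
        (subseq buffer 0 end)))))

(defun sketch-write-file-if-changed (string path)
  "Write STRING to PATH unless PATH already contains exactly STRING.
Return true if the file was written."
  (unless (equal string (file-contents-or-nil path))
    (with-open-file (out path :direction :output
                              :if-exists :supersede
                              :if-does-not-exist :create)
      (write-string string out))
    t))
```

Because an unchanged file is never rewritten, its modification time stays the same, so targets that depend on it are not rebuilt needlessly.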
You can depend on whether a directory exists using a directory-exists target. As long as the directory exists, the target is always considered up to date. If the directory does not exist, the target is considered out of date. Building the target simply creates the directory.
The stamp used for a file is a tuple of its last modification time and its size (according to stat). Using the size alongside the timestamp gets us most of the practical benefit of using file hashes, but is far cheaper.
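A stamp of this shape is easy to compute with standard functions. The sketch below is an approximation, not Overlord's code: file-stamp is a hypothetical name, and it uses file-write-date and file-length rather than calling stat directly.

```lisp
;; Hypothetical helper: a (modification-time . size) stamp for a file.
(defun file-stamp (path)
  "Return a cons of PATH's write date and its size in bytes, or NIL if
PATH does not exist."
  (when (probe-file path)
    (cons (file-write-date path)
          ;; Open as a byte stream so FILE-LENGTH counts bytes.
          (with-open-file (in path :element-type '(unsigned-byte 8))
            (file-length in)))))

(defun stamp-changed-p (old new)
  "True if two stamps differ in either the time or the size."
  (not (equal old new)))
```

file-write-date has a resolution of one second, so two writes within the same second share a timestamp; comparing sizes as well catches most such edits.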
You can construct file digest prerequisites using file-digest.
Oracles let you depend on specific pieces of the Lisp environment or the OS environment. They are short pieces of data that are essentially self-describing, like the value of *print-base* or PATH. The trick is to store the value (or a hash of the value) as its own stamp.
Oracles are prerequisites, but not targets: you can depend on them but they cannot be "built".
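The "value as its own stamp" trick can be illustrated without Overlord. In this minimal sketch the table *oracle-stamps* and the function variable-oracle-changed-p are hypothetical names; the code only shows the comparison, not how Overlord stores or schedules oracles.

```lisp
;; Hypothetical bookkeeping: the last value seen for each variable.
(defvar *oracle-stamps* (make-hash-table :test 'eq)
  "Maps a variable name to the value recorded the last time it was checked.")

(defun variable-oracle-changed-p (symbol)
  "Return true if SYMBOL's current value differs from the recorded one
(or if nothing was recorded yet), then record the current value."
  (let ((current (symbol-value symbol)))
    (multiple-value-bind (old seen) (gethash symbol *oracle-stamps*)
      ;; The value itself serves as the stamp.
      (setf (gethash symbol *oracle-stamps*) current)
      (or (not seen)
          (not (equal old current))))))
```

There is nothing to build here: the check can only report that the environment differs from what was recorded, which is why an oracle is a prerequisite and not a target.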
You can depend on the value of Lisp variables.
Lisp variable oracles are for depending on reader control variables, like *print-base* or *read-default-float-format*. If you want to depend on a variable you defined, you should use defconfig and depend directly on the variable.
You can depend on the value of an OS environment variable, like CC or PATH.
You can depend on the declared version of an ASDF system.
You can depend on the version of a particular Quicklisp dist.
You can depend on whether a particular feature is present in *features*.
You can wrap any named function as an oracle.
Configurations are global Lisp bindings. They have some of the qualities of variables (they can be rebound) and some of the qualities of constants (they are evaluated at compile time, not load time).
Configurations are fundamental to Overlord. Most target-defining forms implicitly depend on a configuration that holds a copy of the definition. (This is how Overlord detects redefinitions.)
Configurations with dependencies are evaluated, like simple configurations, at compile time. But they also have dependencies, so if their dependencies are out of date at load time – or whenever the form is re-evaluated – they get rebuilt anyway.
Configurations with dependencies are mostly useful to move expensive computations from load time to compile time.