-
Notifications
You must be signed in to change notification settings - Fork 0
Examples
The examples below show a few of the myriad ways streams can be exploited, as well as a few ways they can trip the unwary user. All the examples are drawn from published sources; it is instructive to compare the Scheme versions to the originals in other languages.
Infinite streams As a simple illustration of infinite streams, consider this definition of the natural numbers:
(define nats (stream-cons 0 (stream-map add1 nats))) The recursion works because it is offset by one from the initial stream-cons. Another sequence that uses the offset trick is this definition of the fibonacci numbers:
(define fibs (stream-cons 1 (stream-cons 1 (stream-map + fibs (stream-cdr fibs))))) Yet another sequence that uses the same offset trick is the Hamming numbers, named for the mathematician and computer scientist Richard Hamming, defined as all numbers that have no prime factors greater than 5; in other words, Hamming numbers are all numbers expressible as 2i·3j·5k, where i, j and k are non-negative integers. The Hamming sequence starts with 1 2 3 4 5 6 8 9 10 12 and is computed starting with 1, taking 2, 3 and 5 times all the previous elements with stream-map, then merging sub-streams and eliminating duplicates.
(define hamming (stream-cons 1 (stream-unique = (stream-merge < (stream-map (lsec * 2) hamming) (stream-map (lsec * 3) hamming) (stream-map (lsec * 5) hamming))))) It is possible to have an infinite stream of infinite streams. Consider the definition of power-table:
(define power-table (stream-of (stream-of (expt m n) (m in (stream-from 1))) (n in (stream-from 2)))) which evaluates to an infinite stream of infinite streams:
(stream (stream 1 4 9 16 25 ...) (stream 1 8 27 64 125 ...) (stream 1 16 81 256 625 ...) ...) But even though it is impossible to display power-table in its entirety, it is possible to select just part of it:
(stream->list 10 (stream-ref power-table 1)) ⇒ (1 8 27 64 125 216 343 512 729 1000) This example clearly shows that the elements of a stream are computed lazily, as they are needed; (stream-ref power-table 0) is not computed, even when its successor is displayed, since computing it would enter an infinite loop.
Chris Reade shows how to calculate the stream of prime numbers according to the sieve of Eratosthenes, using a method that eliminates multiples of the sifting base with addition rather than division:
(define primes (let () (define-stream (next base mult strm) (let ((first (stream-car strm)) (rest (stream-cdr strm))) (cond ((< first mult) (stream-cons first (next base mult rest))) ((< mult first) (next base (+ base mult) strm)) (else (next base (+ base mult) rest))))) (define-stream (sift base strm) (next base (+ base base) strm)) (define-stream (sieve strm) (let ((first (stream-car strm))> (rest (stream-cdr strm))) (stream-cons first (sieve (sift first rest))))) (sieve (stream-from 2)))) A final example of infinite streams is a functional pearl from Jeremy Gibbons, David Lester and Richard Bird that enumerates the positive rational numbers without duplicates:
(define rats (stream-iterate (lambda (x) (let* ((n (floor x)) (y (- x n))) (/ (- n -1 y)))) 1)) Backtracking via the stream of successes Philip Wadler describes the stream of successes technique that uses streams to perform backtracking search. The basic idea is that each procedure returns a stream of possible results, so that its caller can decide which result it wants; an empty stream signals failure, and causes backtracking to a previous choice point. The stream of successes technique is useful because the program is written as if to simply enumerate all possible solutions; no backtracking is explicit in the code.
The Eight Queens puzzle, which asks for a placement of eight queens on a chessboard so that none of them attack any other, is an example of a problem that can be solved using the stream of successes technique. The algorithm is to place a queen in the first column of a chessboard; any column is satisfactory. Then a queen is placed in the second column, in any position not held in check by the queen in the first column. Then a queen is placed in the third column, in any position not held in check by the queens in the first two columns. And so on, until all eight queens have been placed. If at any point there is no legal placement for the next queen, backtrack to a different legal position for the previous queens, and try again.
The chessboard is represented as a stream of length m, where there are queens in the first m columns, each position in the stream representing the rank on which the queen appears in that column. For example, stream 4 6 1 5 2 8 3 7 represents the following chessboard:
Chessboard
Two queens at column i row j and column m row n check each other if their columns i and m are the same, or if their rows j and n are the same, or if they are on the same diagonal with i + j = m + n or i – j = m – n. There is no need to test the columns, because the placement algorithm enforces that they differ, so the check? procedure tests if two queens hold each other in check.
(define (check? i j m n) (or (= j n) (= (+ i j) (+ m n)) (= (- i j) (- m n)))) The algorithm walks through the columns, extending position p by adding a new queen in row n with (stream-append p (stream n)). Safe? tests if it is safe to do so, using the utility procedure stream-and.
(define (stream-and strm) (let loop ((strm strm)) (cond ((stream-null? strm) #t) ((not (stream-car strm)) #f) (else (loop (stream-cdr strm))))))
(define (safe? p n) (let* ((len (stream-length p)) (m (+ len 1))) (stream-and (stream-of (not (check? (car ij) (cadr ij) m n)) (ij in (stream-zip (stream-range 1 m) p)))))) Procedure (queens m) returns all the ways that queens can safely be placed in the first m columns.
(define (queens m) (if (zero? m) (stream (stream)) (stream-of (stream-append p (stream n)) (p in (queens (- m 1))) (n in (stream-range 1 9)) (safe? p n)))) To see the first solution to the Eight Queens problem, say
(stream->list (stream-car (queens 8)))
To see all 92 solutions, say
(stream->list (stream-map stream->list (queens 8))) There is no explicit backtracking in the code. The stream-of expression in queens returns all possible streams that satisfy safe?; implicit backtracking occurs in the recursive call to queens.
Generators and co-routines It is possible to model generators and co-routines using streams. Consider the task, due to Carl Hewitt, of determining if two trees have the same sequence of leaves:
(same-fringe? = '(1 (2 3)) '((1 2) 3)) ⇒ #t
(same-fringe? = '(1 2 3) '(1 (3 2))) ⇒ #f
The simplest solution is to flatten both trees into lists and compare them element-by-element:
(define (flatten tree) (cond ((null? tree) '()) ((pair? (car tree)) (append (flatten (car tree)) (flatten (cdr tree)))) (else (cons (car tree) (flatten (cdr tree))))))
(define (same-fringe? eql? tree1 tree2) (let loop ((t1 (flatten tree1)) (t2 (flatten tree2))) (cond ((and (null? t1) (null? t2)) #t) ((or (null? t1) (null? t2)) #f) ((not (eql? (car t1) (car t2))) #f) (else (loop (cdr t1) (cdr t2)))))) That works, but requires time to flatten both trees and space to store the flattened versions; if the trees are large, that can be a lot of time and space, and if the fringes differ, much of that time and space is wasted.
Hewitt used a generator to flatten the trees one element at a time, storing only the current elements of the trees and the machines needed to continue flattening them, so same-fringe? could stop early if the trees differ. Dorai Sitaram presents both the generator solution and a co-routine solution, which both involve tricky calls to call-with-current-continuation and careful coding to keep them synchronized.
An alternate solution flattens the two trees to streams instead of lists, which accomplishes the same savings of time and space, and involves code that looks little different than the list solution presented above:
(define-stream (flatten tree) (cond ((null? tree) stream-null) ((pair? (car tree)) (stream-append (flatten (car tree)) (flatten (cdr tree)))) (else (stream-cons (car tree) (flatten (cdr tree))))))
(define (same-fringe? eql? tree1 tree2) (let loop ((t1 (flatten tree1)) (t2 (flatten tree2))) (cond ((and (stream-null? t1) (stream-null? t2)) #t) ((or (stream-null? t1) (stream-null? t2)) #f) ((not (eql? (stream-car t1) (stream-car t2))) #f) (else (loop (stream-cdr t1) (stream-cdr t2)))))) Note that streams, a data structure, replace generators or co-routines, which are control structures, providing a fine example of how lazy streams enhance modularity.
A pipeline of procedures Another way in which streams promote modularity is enabling the use of many small procedures that are easily composed into larger programs, in the style of unix pipelines, where streams are important because they allow a large dataset to be processed one item at a time. Bird and Wadler provide the example of a text formatter. Their example uses right-folds:
(define (stream-fold-right f base strm) (if (stream-null? strm) base (f (stream-car strm) (stream-fold-right f base (stream-cdr strm)))))
(define (stream-fold-right-one f strm) (stream-match strm ((x) x) ((x . xs) (f x (stream-fold-right-one f xs))))) Bird and Wadler define text as a stream of characters, and develop a standard package for operating on text, which they derive mathematically (this assumes the line-separator character is a single #\newline):
(define (breakon a) (stream-lambda (x xss) (if (equal? a x) (stream-append (stream (stream)) xss) (stream-append (stream (stream-append (stream x) (stream-car xss))) (stream-cdr xss)))))
(define-stream (lines strm) (stream-fold-right (breakon #\newline) (stream (stream)) strm))
(define-stream (words strm) (stream-filter stream-pair? (stream-fold-right (breakon #\space) (stream (stream)) strm)))
(define-stream (paras strm) (stream-filter stream-pair? (stream-fold-right (breakon stream-null) (stream (stream)) strm)))
(define (insert a) (stream-lambda (xs ys) (stream-append xs (stream a) ys)))
(define unlines (lsec stream-fold-right-one (insert #\newline)))
(define unwords (lsec stream-fold-right-one (insert #\space)))
(define unparas (lsec stream-fold-right-one (insert stream-null))) These versatile procedures can be composed to count words, lines and paragraphs; the normalize procedure squeezes out multiple spaces and blank lines:
(define countlines (compose stream-length lines))
(define countwords (compose stream-length stream-concat (lsec stream-map words) lines))
(define countparas (compose stream-length paras lines))
(define parse (compose (lsec stream-map (lsec stream-map words)) paras lines))
(define unparse (compose unlines unparas (lsec stream-map (lsec stream-map unwords))))
(define normalize (compose unparse parse)) More useful than normalization is text-filling, which packs as many words onto each line as will fit.
(define (greedy m ws) (- (stream-length (stream-take-while (rsec <= m) (stream-scan (lambda (n word) (+ n (stream-length word) 1)) -1 ws))) 1))
(define-stream (fill m ws) (if (stream-null? ws) stream-null (let* ((n (greedy m ws)) (fstline (stream-take n ws)) (rstwrds (stream-drop n ws))) (stream-append (stream fstline) (fill m rstwrds)))))
(define linewords (compose stream-concat (lsec stream-map words)))
(define textparas (compose (lsec stream-map linewords) paras lines))
(define (filltext m strm) (unparse (stream-map (lsec fill m) (textparas strm)))) To display filename in lines of n characters, say:
(stream-for-each display (filltext n (file->stream filename))) Though each operator performs only a single task, they can be composed powerfully and expressively. The alternative is to build a single monolithic procedure for each task, which would be harder and involve repetitive code. Streams ensure procedures are called as needed.
Persistent data Queues are one of the fundamental data structures of computer science. In functional languages, queues are commonly implemented using two lists, with the front half of the queue in one list, where the head of the queue can be accessed easily, and the rear half of the queue in reverse order in another list, where new items can easily be added to the end of a queue. The standard form of such a queue holds that the front list can only be null if the rear list is also null:
(define queue-null (cons '() '())
(define (queue-null? obj) (and (pair? obj) (null? (car obj))))
(define (queue-check f r) (if (null? f) (cons (reverse r) '()) (cons f r)))
(define (queue-snoc q x) (queue-check (car q) (cons x (cdr q))))
(define (queue-head q) (if (null? (car q)) (error "empty queue: head") (car (car q))))
(define (queue-tail q) (if (null? (car q)) (error "empty-head: tail") (queue-check (cdr (car q)) (cdr q)))) This queue operates in amortized constant time per operation, with two conses per element, one when it is added to the rear list, and another when the rear list is reversed to become the front list. Queue-snoc and queue-head operate in constant time; queue-tail operates in worst-case linear time when the front list is empty.
Chris Okasaki points out that, if the queue is used persistently, its time-complexity rises from linear to quadratic since each persistent copy of the queue requires its own linear-time access. The problem can be fixed by implementing the front and rear parts of the queue as streams, rather than lists, and rotating one element from rear to front whenever the rear list is larger than the front list:
(define queue-null (cons stream-null stream-null))
(define (queue-null? x) (and (pair? x) (stream-null (car x))))
(define (queue-check f r) (if (< (stream-length r) (stream-length f)) (cons f r) (cons (stream-append f (stream-reverse r)) stream-null)))
(define (queue-snoc q x) (queue-check (car q) (stream-cons x (cdr q))))
(define (queue-head q) (if (stream-null? (car q)) (error "empty queue: head") (stream-car (car q))))
(define (queue-tail q) (if (stream-null? (car q)) (error "empty queue: tail") (queue-check (stream-cdr (car q)) (cdr q)))) Memoization solves the persistence problem; once a queue element has moved from rear to front, it need never be moved again in subsequent traversals of the queue. Thus, the linear time-complexity to access all elements in the queue, persistently, is restored.
Reducing two passes to one The final example is a lazy dictionary, where definitions and uses may occur in any order; in particular, uses may precede their corresponding definitions. This is a common problem. Many programming languages allow procedures to be used before they are defined. Macro processors must collect definitions and emit uses of text in order. An assembler needs to know the address that a linker will subsequently give to variables. The usual method is to make two passes over the data, collecting the definitions on the first pass and emitting the uses on the second pass. But Chris Reade shows how streams allow the dictionary to be built lazily, so that only a single pass is needed. Consider a stream of requests:
(define requests (stream '(get 3) '(put 1 "a") ; use follows definition '(put 3 "c") ; use precedes definition '(get 1) '(get 2) '(put 2 "b") ; use precedes definition '(put 4 "d"))) ; unused definition We want a procedure that will display cab, which is the result of (get 3), (get 1), and (get 2), in order. We first separate the request stream into gets and puts:
(define (get? obj) (eq? (car obj) 'get))
(define-stream (gets strm) (stream-map cadr (stream-filter get? strm)))
(define-stream (puts strm) (stream-map cdr (stream-remove get? strm))) Now, run-dict inserts each element of the puts stream into a lazy dictionary, represented as a stream of key/value pairs (an association stream), then looks up each element of the gets stream with stream-assoc:
(define-stream (run-dict requests) (let ((dict (build-dict (puts requests)))) (stream-map (rsec stream-assoc dict) (gets requests))))
(define (stream-assoc key dict) (cond ((stream-null? dict) #f) ((equal? key (car (stream-car dict))) (stream-car dict)) (else (stream-assoc key (stream-cdr dict))))) Dict is created in the let, but nothing is initially added to it. Each time stream-assoc performs a lookup, enough of dict is built to satisfy the lookup, but no more. We are assuming that each item is defined once and only once. All that is left is to define the procedure that inserts new items into the dictionary, lazily:
(define-stream (build-dict puts) (if (stream-null? puts) stream-null (stream-cons (stream-car puts) (build-dict (stream-cdr puts))))) Now we can run the requests and print the result:
(stream-for-each display (stream-map cadr (run-dict requests))) The (put 4 "d") definition is never added to the dictionary because it is never needed.
Pitfalls Programming with streams, or any lazy evaluator, can be tricky, even for programmers experienced in the genre. Programming with streams is even worse in Scheme than in a purely functional language, because, though the streams are lazy, the surrounding Scheme expressions in which they are embedded are eager. The impedance between lazy and eager can occasionally lead to astonishing results. Thirty-two years ago, William Burge warned:
Some care must be taken when a stream is produced to make sure that its elements are not really a list in disguise, in other words, to make sure that the stream elements are not materialized too soon.
For example, a simple version of stream-map that returns a stream built by applying a unary procedure to the elements of an input stream could be defined like this:
(define-stream (stream-map proc strm) ;wrong! (let loop ((strm strm)) (if (stream-null? strm) stream-null (stream-cons (proc (stream-car strm)) (loop (stream-cdr strm)))))) That looks right. It properly wraps the procedure in stream-lambda, and the two legs of the if both return streams, so it type-checks. But it fails because the named let binds loop to a procedure using normal lambda rather than stream-lambda, so even though the first element of the result stream is lazy, subsequent elements are eager. Stream-map can be written using stream-let:
(define-stream (stream-map proc strm) (stream-let loop ((strm strm)) (if (stream-null? strm) stream-null (stream-cons (proc (stream-car strm)) (loop (stream-cdr strm)))))) Here, stream-let assures that each element of the result stream is properly delayed, because each is subject to the stream-lambda that is implicit in stream-let, so the result is truly a stream, not a “list in disguise.” Another version of this procedure was given previously at the description of define-stream.
Another common problem occurs when a stream-valued procedure requires the next stream element in its definition. Consider this definition of stream-unique:
(define-stream (stream-unique eql? strm) ;wrong! (stream-match strm (() strm) ((_) strm) ((a b . _) (if (eql? a b) (stream-unique eql? (stream-cdr strm)) (stream-cons a (stream-unique eql? (stream-cdr strm))))))) The (a b . _) pattern requires the value of the next stream element after the one being considered. Thus, to compute the nth element of the stream, one must know the n+1st element, and to compute the n+1st element, one must know the n+2nd element, and to compute…. The correct version, given above in the description of stream-drop-while, only needs the current stream element.
A similar problem occurs when the stream expression uses the previous element to compute the current element:
(define (nat n) (stream-ref (stream-let loop ((s (stream 0))) (stream-cons (stream-car s) (loop (stream (add1 (stream-car s)))))) n)) This program traverses the stream of natural numbers, building the stream as it goes. The definition is correct; (nat 15) evaluates to 15. But it needlessly uses unbounded space because each stream element holds the value of the prior stream element in the binding to s.
When traversing a stream, it is easy to write the expression in such a way that evaluation requires unbounded space, even when that is not strictly necessary. During the discussion of SRFI-40, Joe Marshall created this infamous procedure:
(define (times3 n) (stream-ref (stream-filter (lambda (x) (zero? (modulo x n))) (stream-from 0)) 3)) (times3 5) evaluates to 15 and (times3 #e1e9) evaluates to three billion, though it takes a while. In either case, times3 should operate in bounded space, since each iteration mutates the promise that holds the next value. But it is easy to write times3 so that it does not operate in bounded space, as the follies of SRFI-40 showed. The common problem is that some element of the stream (often the first element) is bound outside the expression that is computing the stream, so it holds the head of the stream, which holds the second element, and so on. In addition to testing the programmer, this procedure tests the stream primitives (it caught several errors during development) and also tests the underlying Scheme system (it found a bug in one implementation).
Laziness is no defense against an infinite loop; for instance, the expression below never returns, because the odd? predicate never finds an odd stream element.
(stream-null? (stream-filter odd? (stream-from 0 2))) Ultimately, streams are defined as promises, which are implemented as thunks (lambda with no arguments). Since a stream is a procedure, comparisons such as eq?, eqv? and equal? are not meaningful when applied to streams. For instance, the expression (define s ((stream-lambda () stream-null))) defines s as the null stream, and (stream-null? s) is #t, but (eq? s stream-null) is #f. To determine if two streams are equal, it is necessary to evaluate the elements in their common prefixes, reporting #f if two elements ever differ and #t if both streams are exhausted at the same time.
(define (stream-equal? eql? xs ys) (cond ((and (stream-null? xs) (stream-null? ys)) #t) ((or (stream-null? xs) (stream-null? ys)) #f) ((not (eql? (stream-car xs) (stream-car ys))) #f) (else (stream-equal? eql? (stream-cdr xs) (stream-cdr ys))))) It is generally not a good idea to mix lazy streams with eager side-effects, because the order in which stream elements are evaluated determines the order in which the side-effects occur. For a simple example, consider this side-effecting version of strm123:
(define strm123-with-side-effects (stream-cons (begin (display "one") 1) (stream-cons (begin (display "two") 2) (stream-cons (begin (display "three") 3) stream-null)))) The stream has elements 1 2 3. But depending on the order in which stream elements are accessed, "one", "two" and "three" could be printed in any order.
Since the performance of streams can be very poor, normal (eager) lists should be preferred to streams unless there is some compelling reason to the contrary. For instance, computing pythagorean triples with streams
(stream-ref (stream-of (list a b c) (n in (stream-from 1)) (a in (stream-range 1 n)) (b in (stream-range a n)) (c is (- n a b)) (= (+ (* a a) (* b b)) (* c c))) 50) is about two orders of magnitude slower than the equivalent expression using loops.
(do ((n 1 (+ n 1))) ((> n 228)) (do ((a 1 (+ a 1))) ((> a n)) (do ((b a (+ b 1))) ((> b n)) (let ((c (- n a b))) (if (= (+ (* a a) (* b b)) (* c c)) (display (list a b c)))))))