Wednesday, September 22, 2010

SICP Lecture 2B: Compound Data

Structure and Interpretation of Computer Programs
Compound Data
Covers Text Section 2.1: Introduction to Data Abstraction

You can download the video lecture or stream it on MIT's OpenCourseWare site. It's also available on YouTube.

This lecture is presented by Professor Harold Abelson

Chapter 1 of SICP was all about procedures using primitive data, and how to build higher and higher abstractions using procedures. The next chapter will be all about combining primitive data into compound data abstractions.

The fundamental means of combining primitive data in Scheme is to glue two things together into a pair. We use the cons function to combine two things together to make a pair. Once you've combined them you need something to get the two separate parts back. To do that you use the car and cdr functions.

(cons x y) - constructs a pair whose first part is x and second part is y.

(car p) - selects the first part of the pair p.

(cdr p) - selects the second part of the pair p.

The convention for drawing diagrams of pairs is called box-and-pointer notation.




The formal way of describing the behavior of cons, car, and cdr, and how they interact with one another is called the axiom for pairs:

For any x and y:
(car (cons x y)) is x
(cdr (cons x y)) is y

Representing rational numbers

The text defines a system for representing rational numbers using pairs.
(define (make-rat n d) (cons n d))

(define (numer x) (car x))
(define (denom x) (cdr x))

These functions are called a constructor (make-rat) and selectors (numer and denom). Once we have the simple data abstraction in place, we can use the abstraction to define other procedures that work with rational numbers using the rules of rational number arithmetic.
(define (add-rat x y)
(make-rat (+ (* (numer x) (denom y))
(* (numer y) (denom x)))
(* (denom x) (denom y))))

(define (sub-rat x y)
(make-rat (- (* (numer x) (denom y))
(* (numer y) (denom x)))
(* (denom x) (denom y))))

(define (mul-rat x y)
(make-rat (* (numer x) (numer y))
(* (denom x) (denom y))))

(define (div-rat x y)
(make-rat (* (numer x) (denom y))
(* (denom x) (numer y))))

(define (print-rat x)
(newline)
(display (numer x))
(display "/")
(display (denom x)))

Data abstraction separates the use of a data object from its representation. You could define add-rat and the other procedures without using data abstraction by directly using cons, car, and cdr. The point of isolating the use from the representation is that you have to capture the idea of a conceptual entity.


Compound Data

Like procedure definition, the real power of compound data is that we can use it to create building blocks that can themselves be used to create more complex data structures. Using the same method we used for creating rational numbers, we could also create a set of functions for representing points in a plane. From those points we can build line segments, from line segments we can build two-dimensional shapes. We could keep going and build as many abstractions on top of each other as we need to represent any system we want.

Closure
- the means of combination in the system are such that when you put things together with them you can then put those things together using the same means. The system is closed. (This use of the word "closure" is in the same sense as it's used in abstract algebra. The Lisp community also uses the same word for an unrelated concept.)

Once you can combine two things, the closure property allows you to combine as many things as you want.



We'll see later in the text that this is how lists are formed in Scheme.


What are pairs really?

The last section of the lecture shows that pairs can be build out of nothing at all. If Scheme didn't have primitive functions cons, car, and cdr we could define them ourselves.
(define (cons a b)
(lambda (pick)
(cond ((= pick 1) a)
((= pick 2) b))))

(define (car x) (x 1))
(define (cdr x) (x 2))

In this implementation cons returns a procedure, not a pair. The definitions for car and cdr take the procedure supplied by cons and apply it to 1 and 2 respectively. If this system satisfies the axiom for pairs that we saw earlier, then it's a perfectly reasonable implementation. In the lecture, mechanical substitution is used to prove that this system does satisfy the axiom for pairs.


Related:

For links to all of the SICP lecture notes and exercises that I've done so far, see The SICP Challenge.

No comments: