15-150 Fall 2014
Stephen Brookes
LECTURE 1
Tuesday, August 26
Functional
programming
!
!
!
SML
What is SML?
A functional programming language
computation = evaluation
A typed language
only well-typed expressions are evaluated
A polymorphic typed language
well-typed expressions have a most general type
A call-by-value language
function calls evaluate their argument
Features
referential transparency
!
safe substitution for equivalent code
simple compositional reasoning
mathematical foundations
!
use
math/logic to prove correctness
induction to analyze recursive code and data
functions are values
can
be used as data in lists, tuples, ...
!
! and as argument or result of other functions
Referential
transparency
for types and values
The type of an expression depends only on
the types of its sub-expressions
The value of an expression depends only on
the values of its sub-expressions
safe substitution,
compositional reasoning
Equivalence
Expressions of type int are equivalent
if they evaluate to the same integer
Expressions of type int list are equivalent
if they evaluate to the same list of integers
Functions of type int -> int are equivalent
if they map equivalent arguments to equivalent results
(also called extensional equivalence)
Equivalence is a form of semantic equality
Equivalence
21 + 21 is equivalent to 42
[2,4,6] is equivalent to [1+1, 2+2, 3+3]
fn x => x+x is equivalent to fn y => 2*y
21 + 21 = 42
fn x => x+x = fn y => 2*y
(fn x => x+x) (21 + 21) = (fn y => 2*y) 42
We use = for equivalence
Dont confuse with = in ML
Equivalence
For every type t there is a notion of
equivalence for expressions of that type
We usually just use
When necessary we use
Our examples so far illustrate:
=int
=int list
=int -> int
=
=t
Compositionality
In any functional program, replacing an
expression by an equivalent expression
produces an equivalent program
The key to
compositional reasoning
about programs
Parallelism
Expression evaluation has no side-effects
can evaluate independent code in parallel
evaluation order has no effect on value
Parallel evaluation may be faster than sequential
Learn to exploit parallelism!
Principles
Expressions must be well-typed.
Well-typed expressions don't go wrong.
Every function needs a specification.
Well-specified programs are easy to understand.
Every specification needs a proof.
Well-proven programs do the right thing.
Those are my principles,
and if you don't like them...
well, I have others.
Principles
Large programs should be designed as modules.
Well-interfaced code is easier to maintain.
Data structures algorithms.
Good choice of data structure leads to better code.
Exploit parallelism.
Parallel code may run faster.
Strive for simplicity.
Programs should be as simple
as possible, but no simpler.
sum
fun sum [ ] = 0
| sum (x::L) = x + sum(L)
sum : int list -> int
sum [1,2,3] = 6
For all values L : int list,
sum L = the sum of the integers in L
sum [v1, , vn] = v1 + + vn
type
value
spec
sum
fun sum [ ] = 0
| sum (x::L) = x + sum(L)
equational
sum [1,2,3]
reasoning
= 1 + sum [2,3]
= 1 + (2 + sum [3])
= 1 + (2 + (3 + sum [ ]))
= 1 + (2 + (3 + 0))
= 6.
count
fun count [ ] = 0
| count (r::R) = (sum r) + (count R)
count [[1,2,3], [1,2,3]] = 12
For all values R : (int list) list,
count : (int list) list -> int
type
value
spec
count R = the sum of the ints in the lists of R.
count [L1, , Ln] = sum L1 + + sum Ln
count
Since
equational
sum [1,2,3] = 6
reasoning
and
count [[1,2,3], [1,2,3]]
= sum[1,2,3] + sum [1,2,3]
it follows that
count [[1,2,3], [1,2,3]]
= 6+6
= 12
tail recursion
fun sum [ ] = 0
| sum (x::L) = x + sum(L)
The definition of sum is not tail-recursive
Can define a tail recursive helper function
sum that uses an accumulator
sum : int list -> int
sum : int list * int -> int
This is a general technique. But why bother?!
Sometimes tail recursive version is more efficient.
sum
fun sum ([ ], a) = a
| sum (x::L, a) = sum (L, x+a)
sum ([1,2,3], 4) = 10
For all L:int list and a:int,
sum : int list * int -> int
sum (L, a) = sum(L)+a
type
value
spec
Sum
fun sum ([ ], a) = a
| sum (x::L, a) = sum (L, x+a)
fun Sum L = sum (L, 0)
Sum : int list -> int
Sum and sum are extensionally equivalent
For all L:int list, Sum L = sum L.
Hence...
fun count [ ] = 0
| count (r::R) = (sum r) + (count R)
fun Count [ ] = 0
| Count (r::R) = (Sum r) + (Count R)
Count and count are extensionally equivalent
For all R: (int list) list, Count R = count R.
Evaluation
fun sum [ ] = 0
| sum (x::L) = x + sum(L)
sum (1::[2,3]) =>*
=>* !
means!
finitely many steps
x:1, L:[2,3]
(x + sum(L))
=>* 1 + sum [2,3]
=>* 1 + (2 + sum [3])
=>* 1 + (2 + (3 + sum [ ]))
=>* 1 + (2 + (3 + 0))
=>* 1 + (2 + 3)
=>* 1 + 5
=>* 6
pattern of recursive calls,!
x:1, L:[2,3]
is a list of value bindings
order of arithmetic operations
Evaluation
count [[1,2,3], [1,2,3]]
=>* sum [1,2,3] + count [[1,2,3]]
=>* 6 + count [[1,2,3]]
=>* 6 + (sum [1,2,3] + count [ ])
=>* 6 + (6 + count [ ])
=>* 6 + (6 + 0)
=>* 6 + 6
=>* 12
Analysis
(details later!)
code!
fragment
time!
proportional to
sum(L), Sum(L)
length of L
count(R), Count(R)
sum of lengths of
lists in R
(tail recursion
doesnt help)
These functions do sequential evaluation
Potential for
parallelism
+ is associative
+ is commutative
The order in which we combine additions
doesnt affect the result
Exploiting parallelism
with map
fun parcount R = sum (map sum R)
parcount [[1,2,3], [4,5], [6,7,8]]
=>* sum (map sum [[1,2,3], [4,5], [6,7,8]])
=>* sum [sum [1,2,3], sum [4,5], sum [6,7,8]]
parallel evaluation of sum[1,2,3], sum[4,5] and sum [6,7,8]
=>* sum [6, 9, 21]
=>* 36
Analysis
Let R be a row list of length k,
with each row a list of length n/k
If we have enough parallel processors,
parcount R takes time proportional to k + n/k
computes each row sum, in parallel!
then!
adds the row sums
Recall: count R takes time proportional to n
With n=240, k=12,
k + n/k is 32, a 4-fold speedup over 240.
Exploiting parallelism
with map and reduce
fun parcount R = reduce (op +) (map sum R)
parcount [[1,2,3], [4,5], [6,7,8]]
=>* reduce (op +) (map sum [[1,2,3], [4,5], [6,7,8]])
=>* reduce (op +) [sum [1,2,3], sum [4,5], sum [6,7,8]]
=>* reduce (op +) [6, 9, 21]
=>* 36
For k rows of length n/k,!
time is proportional to log k + n/k
Can we do any better?
Try other ways to compute sums of row sums,
using map and reduce
Better ways to exploit parallelism?
How can we tell?
work and spam
We will introduce techniques for analysing
work (sequential runtime)
span (= optimal parallel runtime)
of functional code...
(thats how we did the runtime calculations earlier)
reduce (op +) [v1,,vk] has !
work O(k) and span O(log k)
sum [v1,,vk] has !
work O(k) and span O(k)
Themes
!
!
!
functional programming
correctness, termination, and performance
types, specifications and proofs
evaluation, equivalence and referential transparency
compositional reasoning
exploiting parallelism
!
Objectives
Write functional programs
Write specifications, and use rigorous
techniques to prove correctness
Learn techniques for analyzing sequential
and parallel running time
Choose data structures and exploit
parallelism to improve efficiency
Structure code using abstract types and
modules, with clear interfaces
Notes
Lab tomorrow
Homework 1 out tonight
Homework 1 due Tuesday, 2 Sept
Course policy: no cheating!
Do your own work!
See course website, piazza, for details.
Always attend class! Then read the notes!