Algorithms and implementations
Yiannis N. Moschovakis
UCLA and University of Athens
Tarski Lecture 1, March 3, 2008
What is an algorithm?
▶ Basic aim: to “define” (or represent) algorithms in set theory,
in the same way that we represent real numbers (Cantor,
Dedekind) and random variables (Kolmogorov) by
set-theoretic objects
▶ Which set-theoretic objects represent algorithms?
▶ When do two set-theoretic objects represent the same
algorithm? (The algorithm identity problem)
▶ In what way are algorithms effective?
▶ . . . and do it so that the basic results about algorithms can be
established rigorously (and naturally)
▶ . . . and there should be some applications!
Yiannis N. Moschovakis: Algorithms and implementations 1/25
Plan for the lectures
Lecture 1. Algorithms and implementations
Discuss the problem and some ideas for solving it
Lecture 2. English as a programming language
Applications to Philosophy of language (and linguistics?)
synonymy and faithful translation ∼ algorithm identity
Lecture 3. The axiomatic derivation of absolute lower bounds
Applications to complexity (joint work with Lou van den Dries)
Do not depend on pinning down algorithm identity
Lectures 2 and 3 are independent of each other and mostly
independent of Lecture 1
I will oversimplify, but: All lies are white (John Steel)
Outline of Lecture 1
Slogan: The theory of algorithms is the theory of recursive equations
(1) Three examples
(2) Machines vs. recursive definitions
(3) Recursors
(4) Elementary algorithms
(5) Implementations
Notation:
N = {0, 1, 2, . . .}
a ≥ b ≥ 1, a = bq + r , 0 ≤ r < b
=⇒ q = iq(a, b), r = rem(a, b)
gcd(a, b) = the greatest common divisor of a and b
a ⊥ b ⇐⇒ gcd(a, b) = 1 (a and b are coprime)
The Euclidean algorithm ε
For a, b ∈ N, a ≥ b ≥ 1,
ε : gcd(a, b) = if (rem(a, b) = 0) then b else gcd(b, rem(a, b))
cε (a, b) = the number of divisions needed to compute gcd(a, b) using ε
Complexity of the Euclidean
If a ≥ b ≥ 2, then cε (a, b) ≤ 2 log2 (a)
Proofs of the correctness and the upper bound are by induction on
max(a, b)
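The recursive equation for ε can be read off directly as executable code. A minimal Python sketch (the function name and the division counter are illustrative, not the lecture's notation) computing gcd(a, b) together with cε(a, b):

```python
def gcd_euclid(a, b):
    """Euclidean algorithm on N, relative to the remainder primitive rem.

    Returns (gcd(a, b), number of divisions used), for a >= b >= 1.
    """
    assert a >= b >= 1
    r = a % b                    # one call to rem, i.e. one division
    if r == 0:
        return b, 1
    g, c = gcd_euclid(b, r)      # the recursive call gcd(b, rem(a, b))
    return g, c + 1
```

For example, `gcd_euclid(231, 165)` uses 3 divisions, comfortably within the 2 log2(a) bound.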
What is the Euclidean algorithm?
ε : gcd(a, b) = if (rem(a, b) = 0) then b else gcd(b, rem(a, b))
▶ It is an algorithm on N, from (relative to) the remainder
function rem, and it computes gcd : N² → N
▶ It is needed to make precise the optimality of the Euclidean:
Basic Conjecture
For every algorithm α which computes on N from rem the greatest
common divisor function, there is a constant r > 0 such that for
infinitely many pairs a ≥ b ≥ 1,
cα (a, b) ≥ r log2 (a)
Sorting (alphabetizing)
Given an ordering ≤ on a set A and any u = ⟨u0, . . . , un−1⟩ ∈ Aⁿ,
sort(u) = the unique, sorted (non-decreasing) rearrangement
v = ⟨uπ(0), uπ(1), . . . , uπ(n−1)⟩,
where π : {0, . . . , n − 1} → {0, . . . , n − 1} is a permutation
head(⟨u0, . . . , un−1⟩) = u0
tail(⟨u0, . . . , un−1⟩) = ⟨u1, . . . , un−1⟩
⟨x⟩ ∗ ⟨u0, . . . , un−1⟩ = ⟨x, u0, . . . , un−1⟩ (prepend)
|⟨u0, . . . , un−1⟩| = n (the length of u)
h1(u) = the first half of u
h2(u) = the second half of u
The mergesort algorithm σm
sort(u) = if (|u| ≤ 1) then u else merge(sort(h1 (u)), sort(h2 (u)))
merge(v, w) = w                          if |v| = 0,
              v                          else, if |w| = 0,
              ⟨v0⟩ ∗ merge(tail(v), w)   else, if v0 ≤ w0,
              ⟨w0⟩ ∗ merge(v, tail(w))   otherwise.
(1) If v , w are sorted, then merge(v , w ) = sort(w ∗ v )
(2) The sorting and merging function satisfy these equations
(3) merge(v , w ) can be computed using no more than
|v| + |w| ∸ 1 comparisons
(4) sort(u) can be computed by σm using no more than
|u| log2 (|u|) comparisons (|u| > 1)
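The σm system translates directly into code. A Python sketch (helper names are illustrative) computing sort(u) while counting the comparisons made by merge:

```python
def merge(v, w, count):
    """Merge two sorted lists; count[0] tracks comparisons (≤ |v| + |w| - 1)."""
    if not v:                           # |v| = 0
        return w
    if not w:                           # |w| = 0
        return v
    count[0] += 1                       # one comparison: v0 <= w0 ?
    if v[0] <= w[0]:
        return [v[0]] + merge(v[1:], w, count)
    return [w[0]] + merge(v, w[1:], count)

def mergesort(u, count=None):
    """The sigma_m system: sort the halves h1(u), h2(u), then merge."""
    if count is None:
        count = [0]
    if len(u) <= 1:
        return u, count[0]
    mid = len(u) // 2
    v, _ = mergesort(u[:mid], count)    # sort(h1(u))
    w, _ = mergesort(u[mid:], count)    # sort(h2(u))
    return merge(v, w, count), count[0]
```

On an 8-element list the comparison count stays within the 8 · log2(8) = 24 bound.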
What is the mergesort algorithm?
sort(u) = if (|u| ≤ 1) then u else merge(sort(h1 (u)), sort(h2 (u)))
merge(v, w) = w                          if |v| = 0,
              v                          else, if |w| = 0,
              ⟨v0⟩ ∗ merge(tail(v), w)   else, if v0 ≤ w0,
              ⟨w0⟩ ∗ merge(v, tail(w))   otherwise.
cσm(u) = the number of comparisons needed to compute sort(u) using σm
cσm(u) ≤ |u| log2(|u|) (|u| > 0)
▶ It is an algorithm from the ordering ≤ and the functions
head(u), tail(u), |u|, . . .
▶ It is needed to make precise the optimality of σm:
For every sorting algorithm σ from ≤, head, tail, . . ., there is
an r > 0 and infinitely many sequences u such that
cσ(u) ≥ r |u| log2(|u|) (well known)
The Gentzen Cut Elimination algorithm
Every proof d of the Gentzen system for Predicate Logic can be
transformed into a cut-free proof γ(d) with the same conclusion
γ(d) = if T1 (d) then f1 (d)
else if T2 (d) then f2 (γ(τ (d)))
else f3 (γ(σ1 (d)), γ(σ2 (d)))
▶ It is a recursive algorithm from natural syntactic primitives,
very similar in logical structure to the mergesort
▶ Main Fact: |γ(d)| ≤ e(ρ(d), |d|), where |d| is the length of
the proof d, ρ(d) is its cut-rank, and
e(0, k) = k, e(n + 1, k) = 2^e(n,k)
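The bound uses the iterated exponential e, whose recursion is one line of code. A minimal Python sketch:

```python
def e(n, k):
    """Iterated exponential: e(0, k) = k, e(n + 1, k) = 2 ** e(n, k)."""
    return k if n == 0 else 2 ** e(n - 1, k)
```

The tower grows very fast: e(1, 3) = 8, e(2, 3) = 256, and e(3, 2) is already 2^16 = 65536.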
The infinitary Gentzen algorithm
If we add the ω-rule to the Gentzen system for Peano arithmetic,
then cuts can again be eliminated by an extension of the finitary
Gentzen algorithm
γ ∗ (d) = if T1 (d) then f1 (d)
else if T2 (d) then f2 (γ ∗ (τ (d)))
else if T3 (d) then f3 (γ ∗ (σ1 (d)), γ ∗ (σ2 (d)))
else f4 (λ(n)γ ∗ (ρ(n, d))),
where f4 is a functional embodying the ω-rule
▶ Again |γ∗(d)| ≤ e(ρ(d), |d|), where cut-ranks and lengths of
infinite proofs are ordinals, e(α, β) is defined by ordinal
recursion, and so every provable sentence has a cut-free proof
of length less than
ε0 = the least ordinal > 0 and closed under α ↦ ω^α
Abstract machines (computation models)
[Diagram: the input function carries x ∈ X to a state s ∈ S; the
transition function σ drives s through successive states until a
terminal state z ∈ T is reached, and output(z) = f(x).]
A machine m : X ⇝ Y is a tuple (S, input, σ, T , output) such that
1. S is a non-empty set (of states)
2. input : X → S is the input function
3. σ : S → S is the transition function
4. T is the set of terminal states, T ⊆ S
5. output : T → Y is the output function
m(x) = output(σ^n(input(x))),
where n = the least number such that σ^n(input(x)) ∈ T
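The definition m(x) = output(σ^n(input(x))) is itself a procedure for running a machine. A Python sketch (parameter names are illustrative), instantiated with the Euclidean read as a machine whose states are pairs (a, b):

```python
def run_machine(input_fn, sigma, is_terminal, output_fn, x):
    """m(x) = output(sigma^n(input(x))) for the least n with a terminal state.

    Loops forever if no iterate of sigma reaches a terminal state.
    """
    s = input_fn(x)
    while not is_terminal(s):
        s = sigma(s)
    return output_fn(s)

# The Euclidean as a machine: states (a, b); terminal when b divides a.
g = run_machine(
    input_fn=lambda xy: xy,
    sigma=lambda s: (s[1], s[0] % s[1]),
    is_terminal=lambda s: s[0] % s[1] == 0,
    output_fn=lambda s: s[1],
    x=(231, 165),
)
```

Here the run visits (231, 165), (165, 66), (66, 33) and outputs 33.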
Infinitary algorithms are not machines
▶ It is useful to think of the infinitary Gentzen “effective
procedure” as an algorithm
▶ There are applications of infinitary algorithms (in Lecture 2)
▶ Machines are special algorithms which implement finitary
algorithms
▶ The relation between an (implementable) algorithm and its
implementations is interesting
Which machine is the Euclidean?
ε : gcd(a, b) = if (rem(a, b) = 0) then b else gcd(b, rem(a, b))
▶ Must specify a set of states, an input function, a transition
function, etc.
▶ This can be done in many ways, generally called
implementations of the Euclidean
▶ The choice of a “natural” (abstract) implementation is
irrelevant for the correctness and the log upper bound of the
Euclidean, which are derived directly from the recursive
equation above and apply to all implementations
▶ Claim: ε is completely specified by the equation above
Which machine is the mergesort algorithm?
sort(u) = if (|u| ≤ 1) then u else merge(sort(h1 (u)), sort(h2 (u)))
merge(v, w) = w                          if |v| = 0,
              v                          else, if |w| = 0,
              ⟨v0⟩ ∗ merge(tail(v), w)   else, if v0 ≤ w0,
              ⟨w0⟩ ∗ merge(v, tail(w))   otherwise.
▶ Many (essentially) different implementations:
sequential (with specified orders of evaluation), parallel, . . .
▶ The correctness and n log2(n) upper bound are derived
directly from a (specific reading of) these recursive equations
• They should apply to all implementations of the mergesort
▶ Claim: σm is completely specified by the system above
▶ Task: Define σm, define implementations, prove •
Slogans and questions
▶ Algorithms compute functions from specific primitives
▶ They are specified by systems of recursive equations
▶ An algorithm is (faithfully modeled by) the semantic content
of a system of recursive equations
▶ Machines are algorithms, but not all algorithms are machines
▶ Some algorithms have machine implementations
▶ An algorithm codes all its implementation-independent
properties
▶ What is the relation between an algorithm and its implementations?
. . . or between two implementations of the same algorithm?
Main slogan
The theory of algorithms is the theory of recursive equations
(Skip non-deterministic algorithms and fairness)
Monotone recursive equations
▶ A complete poset is a partially ordered set D = (Field(D), ≤D)
in which every directed set has a least upper bound
▶ Standard example:
(X * Y) = the set of all partial functions f : X * Y
▶ A function f : D → E is monotone if x ≤D y =⇒ f(x) ≤E f(y)
(f : X * Y is monotone, viewed as a function from X to Y ∪ {⊥})
▶ For every monotone f : D → D on a complete poset D, the
equation x = f(x) has a least solution
▶ Complete posets (domains) are the basic objects studied in
Scott’s Denotational Semantics for programming languages
▶ Much of this work can be viewed as a refinement of
Denotational Semantics (which interprets programs by algorithms)
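The least-solution fact can be made concrete by Kleene iteration: start from the everywhere-undefined partial function ⊥ and apply a monotone operator repeatedly. A Python sketch (the dict representation of partial functions and the factorial operator are illustrative assumptions, not the lecture's notation):

```python
def lfp_approx(F, rounds):
    """Kleene iteration bottom, F(bottom), F(F(bottom)), ... for a monotone
    operator F on partial functions N -> N, represented as dicts
    (an absent key means 'undefined')."""
    f = {}                     # bottom: the everywhere-undefined function
    for _ in range(rounds):
        f = F(f)               # each round extends f, by monotonicity
    return f

# The factorial equation f(n) = if n = 0 then 1 else n * f(n - 1),
# read as a monotone operator on partial functions:
def F(f):
    g = {0: 1}
    for n, v in f.items():
        g[n + 1] = (n + 1) * v
    return g
```

After 6 rounds the approximation is defined on 0, . . . , 5 with the factorial values; the union over all rounds is the least fixed point.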
Recursors
[Diagram: a recursor α : X ⇝ W, with transition map τα on the
solution space Dα and output map α0 into W]
A recursor α : X ⇝ W is a tuple α = (α0 , α1 , . . . , αk ) such that
1. X is a poset, W is a complete poset
2. D1 , . . . , Dk are complete posets, Dα = D1 × · · · × Dk ,
the solution space of α
3. αi : X × Dα → Di is monotone (i = 1, . . . , k)
4. τα (x, ~d) = (α1 (x, ~d), . . . , αk (x, ~d)) is the transition function,
τα : X × D α → D α
5. α0 : X × D1 × · · · × Dk → W is monotone, the output map
α(x) = α0(x, ~d) for the least solution ~d of ~d = τα(x, ~d)
We write α(x) = α0 (x, ~d) where {~d = τα (x, ~d)}
Recursor isomorphism
Two recursors
α = (α0, α1, . . . , αk), α′ = (α′0, α′1, . . . , α′m) : X ⇝ W
are isomorphic (α ≃ α′) if
(1) k = m (same number of parts)
(2) There is a permutation π : {1, . . . , k} → {1, . . . , k} and poset
isomorphisms ρi : Di → D′π(i) (i = 1, . . . , k) such that . . .
(the order of the equations in the system ~d = τα(x, ~d) does
not matter)
Isomorphic recursors α, α′ : X ⇝ W compute the same function
α = α′ : X → W
Machines or recursors?
With each machine m = (S, input, σ, T , output) : X ⇝ Y we
associate the tail recursor
rm(x) = p(input(x)) where
{p = λ(s)[if (s ∈ T ) then output(s) else p(σ(s))]}
▶ m and rm compute the same partial function: rm = m : X * Y
▶ Theorem (with V. Paschalis) The map m ↦ rm respects
isomorphisms, m ≃ m′ ⇐⇒ rm ≃ rm′
▶ The question is one of choice of terminology
(because the mergesort system is also needed)
▶ Yuri Gurevich has argued that algorithms are machines
(and of a very specific kind)
▶ Jean-Yves Girard has also given similar arguments
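The tail recursor can be sketched by reading its single recursive equation literally: p is the least solution of p(s) = output(s) if s ∈ T, else p(σ(s)). A Python sketch (names illustrative), tried on the Euclidean machine states from above:

```python
def tail_recursor(sigma, is_terminal, output_fn):
    """Build the tail recursor r_m: p is the least solution of
       p(s) = output(s) if s in T, else p(sigma(s))."""
    def p(s):
        if is_terminal(s):
            return output_fn(s)
        return p(sigma(s))       # one unfolding of the recursive equation
    return p

# Euclidean machine states (a, b); p(input(x)) computes gcd.
p = tail_recursor(
    sigma=lambda s: (s[1], s[0] % s[1]),
    is_terminal=lambda s: s[0] % s[1] == 0,
    output_fn=lambda s: s[1],
)
```

Each unfolding of p mirrors one transition step of m, so p(input(x)) and m(x) agree wherever either is defined.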
Elementary (first order) algorithms
Algorithms which compute partial functions from given partial functions
(Partial, pointed) algebra M = (M, 0, 1, ΦM )
where 0, 1 ∈ M, Φ is a set of function symbols (the vocabulary),
and ΦM = {φM}φ∈Φ, with φM : M^nφ * M for each φ ∈ Φ
Nε = (N, 0, 1, rem), the Euclidean algebra
Nu = (N, 0, 1, S, Pd), the unary numbers
Nb = (N, 0, 1, Parity, iq2 , (x 7→ 2x), (x 7→ 2x + 1)), the binary numbers
A∗ = (A∗ , 0, 1, ≤, head, tail, . . .), the mergesort algebra, with 0, 1 ∈ A∗
Standard model-theoretic notions must be mildly adapted, for
example for (partial) subalgebras:
U ⊆p M ⇐⇒ {0, 1} ⊆ U ⊆ M and for all φ, φU ⊆ φM
Recursive (McCarthy) programs of M = (M, 0, 1, ΦM )
Explicit Φ-terms (with partial function variables and conditionals)
A :≡ 0 | 1 | vi | φ(A1 , . . . , An ) | pni (A1 , . . . , An )
| if (A = 0) then B else C
Recursive program (only ~xi, p1, . . . , pK occur in each part Ai):

A :   pA(~x0) = A0
      p1(~x1) = A1
      · · ·
      pK(~xK) = AK          (A0 : the head, (A1, . . . , AK) : the body)

▶ What is the semantic content of the system A?
The recursor of a program in M
A :   pA(~x0) = A0
      p1(~x1) = A1
      · · ·
      pK(~xK) = AK

r(A, M)(~x) = den(A0, M)(~x, ~p) where
{p1 = λ(~x1)den(A1, M)(~x1, ~p), . . . , pK = λ(~xK)den(AK, M)(~xK, ~p)}
r(A, M) is not exactly the algorithm expressed by A in M.
For example, if A : pA (~x ) = A0 (~x ) has empty body, then
r(A, M)(~x ) = den(A0 , M)(~x ) where { }
is just the function defined on M by A0
(which may involve much explicit computation)
The problem of defining implementations
van Emde Boas:
Intuitively, a simulation of [one class of computation
models] M by [another] M 0 is some construction which
shows that everything a machine Mi ∈ M can do on
inputs x can be performed by some machine Mi0 ∈ M 0 on
the same inputs as well;
We will define a reducibility relation α ≤r β and call a machine m
an implementation of α if α ≤r rm
(where rm is the recursor representation of the machine m)
Recursor reducibility
Suppose α, β : X ⇝ W (e.g., β = rm where m : X ⇝ W is a machine):
A reduction of α to β is any monotone mapping
π : X × Dα → Dβ
such that the following three conditions hold, for every x ∈ X and
every d ∈ Dα :
(R1) τβ (x, π(x, d)) ≤ π(x, τα (x, d)).
(R2) β0 (x, π(x, d)) ≤ α0 (x, d).
(R3) α(x) = β(x).
▶ α ≤r β if a reduction exists
▶ m implements α if α ≤r rm
Implementations of elementary algorithms
Theorem (with Paschalis)
For any recursive program A in an algebra M, the standard
implementation of A is an implementation of r(A, M)
. . . Uniformly enough, so that (with the full definitions), the
standard implementation of A implements the elementary
algorithm expressed by A in M
. . . And this is true of all familiar implementations of recursive
programs
. . . so that the basic (complexity and resource use) upper and lower
bounds established from the program A hold of all
implementations of A
And for the applications to complexity theory, we work directly
with the recursive equations of A