Concepts in Programming Languages-Pages-2
Concepts in Programming Languages-Pages-2
{ int x = 2;
{ int y = 3; inner
outer
block x = y+2; block
}
}
In this section of code, there are two blocks. Each block begins with a left brace, {,
and ends with a right brace, }. The outer block begins with the first left brace and
ends with the last right brace. The inner block is nested inside the outer block. It
begins with the second left brace and ends with the first right brace. The variable
162
7.1 Block-Structured Languages 163
x is declared in the outer block and the variable y is declared in the inner block.
A variable declared within a block is said to be local to that block. A variable de-
clared in an enclosing block is said to be global to the block. In this example, x is
local to the outer block, y is local to the inner block, and x is global to the inner
block.
C, Pascal, and ML are all block-structured languages. In-line blocks are delineated
by { . . . } in C, begin...end in Pascal, and let...in...end in ML. The body of a procedure
or function is also a block in each of these languages.
Storage management mechanisms associated with block structure allow functions
to be called recursively.
The versions of Fortran in widespread use during the 1960s and 1970s were not
block structured. In historical Fortran, every variable, including every parameter
of each procedure (called a subroutine in Fortran) was assigned a fixed-memory
location. This made it impossible to call a procedure recursively, either directly or
indirectly. If Fortran procedure P calls Q, Q calls R, and then R attempts to call P, the
second call to P is not allowed. If P were called a second time in this call chain, the
second call would write over the parameters and return address for the first call. This
would make it impossible for the call to return properly.
Block-structured languages are characterized by the following properties:
local variables, which are stored on the stack in the activation record associated
with the block
parameters to function or procedure blocks, which are also stored in the activation
record associated with the block
global variables, which are declared in some enclosing block and therefore must
be accessed from an activation record that was placed on the run-time stack
before activation of the current block.
164 Scope, Functions, and Storage Management
Stack
Program
Counter
Environment Heap
Pointer
It may seem surprising that most complications arise in connection with access to
global variables. However, this is really a consequence of stack-based memory man-
agement: The stack is used to make it easy to allocate and access local variables. In
placing local variables close at hand, a global variable may be buried on the stack
under any number of activation records.
implementation will be simple and direct. More efficient methods for doing many of
the things described in this chapter, tailored for specific languages, may be found in
compiler books.
A Note about C
The C programming language is designed to make C easy to compile and execute,
avoiding several of the general scoping and memory management techniques de-
scribed in this chapter. Understanding the general cases considered here will give
C programmers some understanding of the specific ways in which C is simpler than
other languages. In addition, C programmers who want the effect of passing func-
tions and their environments to other functions may use the ideas described in this
chapter in their programs.
Some commercial implementations of C and C++ actually do support function
parameters and return values, with preservation of static scope by use of closures.
(We will discuss closures in Section 7.4.) In addition, the C++ Standard Template
Library (covered in Subsection 9.4.3) provides a form of function closure as many
programmers find function arguments and return values useful.
{ int x=0;
int y=x+1;
{ int z=(x+y)*(x-y);
};
};
166 Scope, Functions, and Storage Management
Space for global Space for global Space for global Space for global
variables variables variables variables
Space for z
Execution time
Figure 7.2. Stack grows and shrinks during program execution.
We can visualize this by using a sequence of figures of the stack. As in Figure 7.1, the
stack is shown growing downward in memory in Figure 7.2.
A simple optimization involves combining small nested blocks into a single block.
For the preceding example program, this would save the run time spent in pushing
and popping the inner block for z, as z could be stored in the same activation record
as that of x and y. However, because we plan to illustrate the general case by using
small examples, we do not use this optimization in further discussion of stack stor-
age allocation. In all of the program examples we consider, we assume that a new
activation record is allocated on the run-time stack each time the program enters a
block.
The number of locations that need to be allocated at run time depends on the
number of variables declared in the block and their types. Because these quantities
are known at compile time, the compiler can determine the format of each activation
record and store this information as part of the compiled code.
Intermediate Results
In general, an activation record may also contain space for intermediate results.
These are values that are not given names in the code, but that may need to be saved
temporarily. For example, the activation record for this block,
{ int z = (x+y)*(x-y);
}
Space for z
because the values of subexpressions x+y and x-y may have to be evaluated and stored
somewhere before they are multiplied.
On modern computers, there are enough registers that many intermediate re-
sults are stored in registers and not placed on the stack. However, because register
7.2 In-Line Blocks 167
{ int x = . . . ;
{ int y = . . . ;
{ int x = . . . ;
... .
};
};
};
In this example, the inner declaration of x hides the outer one. The inner block is
called a hole in the scope of the outer declaration of x, as the outer x cannot be
accessed within the inner block. This example shows that lifetime does not coincide
with scope because the lifetime of the outer x includes time when inner block is being
executed, but the scope of the outer x does not include the scope of the inner one.
we consider the declaration of f one block and the declaration of g another block
inside the outer block. If this code is not inside some other construct, then these
blocks will both end at the end of the program.
When an ML expression contains declarations as part of the let-in-end construct,
we consider the declarations to be part of the same block. For example, consider this
example expression:
168 Scope, Functions, and Storage Management
This expression contains a block, beginning with let and ending with end. This block
contains two declarations, functions g and h, and one expression, h(x), calling one
of these functions. The construct let . . . in . . . end is approximately equivalent to
{ . . . ; . . . } in C. The main syntactic difference is that declarations appear between
the keywords let and in, and expressions using these declarations appear between
keywords in and end. Because the declarations of functions g and h appear in the
same block, the names g and h will be given values in the same activation record.
Control link
Local variables
Intermediate results
Control link
Local variables
Intermediate results
Environment
Pointer
When a new activation record is added to the stack, the control link of the new
activation record is set to the previous value of the environment pointer, and the
environment pointer is updated to point to the new activation record. When an
activation record is popped off the stack, the environment pointer is reset by following
the control link from the activation record.
The code for pushing and popping activation records from the stack is generated
by the compiler at compile time and becomes part of the compiled code for the
program. Because the size of an activation record can be determined from the text
of the block, the compiler can generate code for block entry that is tailored to the
size of the activation record.
When a global variable occurs in an expression, the compiler must generate code
that will find the location of that variable at run time. However, the compiler can
compute the number of blocks between the current block and the block where the
variable is declared; this is easily determined from the program text. In addition, the
relative position of each variable within its block is known at compile time. Therefore,
the compiler can generate lookup code that follows a predetermined number of links
Example 7.1
{ int x=0;
int y=x+1;
{ int z=(x+y)*(x-y);
};
};
When the expressions x+y and x-y are evaluated during execution of this code, the
run-time stack will have activation records for the inner and outer blocks as shown
below:
Control link
x 0
y 1
Control link
z -1
x+y 1
x-y -1
Environment
Pointer
170 Scope, Functions, and Storage Management
On a register-based machine, the machine code generated by the compiler will find
the variables x and y, load each into registers, and then add the two values. The code
for loading x uses the environment pointer to find the top of the current activation,
then computes the location of x by adding 1 to the location stored in the control
link of the current activation record. The compiler generates this code by analyzing
the program text at compile time: The variable x is declared one block out from
the current block, and x is the first variable declared in the block. Therefore, the
control link from the current activation record will lead to the activation record
containing x, and the location of x will be one location down from the top of that
block. Similar steps can be used to find y at the second location down from the top
of its activation record. Although the details may vary from one compiler to the
next, the main point is that the compiler can determine the number of control links
to follow and the relative location of the variable within the correct block from the
source code. In particular, it is not necessary to store variable names in activation
records.
The difference between a procedure and a function is that a function has a return
value but a procedure does not. In most languages, functions and procedures may
have side effects. However, a procedure has only side effects; a procedure call is
a statement and not an expression. Because functions and procedures have many
characteristics in common, we use the terms almost interchangeably in the rest of
this chapter. For example, the text may discuss some properties of functions, and
then a code example may illustrate these properties with a procedure. This should
remind you that the discussion applies to functions and procedures in many program-
ming languages, whether or not the language treats procedures as different from
functions.
Control link
Return address
Return-result addr
Parameters
Local variables
Intermediate results
Environment
Pointer
The activation record associated with a function (see Figure 7.4) must contain
space for the following information:
This information may be stored in different orders and in different ways in different
language implementations. Also, as mentioned earlier, many compilers perform op-
timizations that place some of these values in registers. For concreteness, we assume
that no registers are used and that the six components of an activation record are
stored in the order previously listed.
Although the names of variables are eliminated during compilation, we often
draw activation records with the names of variables in the stack. This is just to make
it possible for us to understand the figures.
Example 7.2
We can see how function activation records are added and removed from the run-time
stack by tracing the execution of the familiar factorial function:
Suppose that some block contains the expression fact(3)+1, leading to a call of fact(3).
Let us assume that the activation record of the calling block contains a location that
will be used to store the value of fact(3) before computing fact(3)+1.
The next activation record that is placed on the stack will be an activation record
for the call fact(3). In this activation record, shown after the list,
the control link points to the activation record of the block containing the call
fact(3),
the return-result link points to the location in the activation record of the calling
block that has been allocated for the intermediate value fact(3) of the calling
expression fact(3)+1,
the actual parameter value 3 is placed in the location allocated for the formal
parameter n,
a location is allocated for the intermediate value fact(n-1) that will be needed
when n>0.
fact(3)
Control link
Return-result addr
n 3
fact(n-1)
Environment
Pointer
After this activation record is allocated on the stack, the code for factorial is executed.
Because n>0, there is a recursive call fact(2). This leads to a recursive call fact(1), which
results in a series of activation records, as shown in the subsequent figure.
Note that in each of the lower activation records, the return-result address points
to the space allocated in the activation record above it. This is so that, on return
from fact(1), for example, the return result of this call can be stored in the activation
record for fact(2). At that point, the final instruction from the calculation of fact(2)
will be executed, multiplying local variable n by the intermediate result fact(1).
The final illustration of this example shows the situation during return from fact(2)
when the return result of fact(2) has been placed in the activation record of fact(3),
but the activation record for fact(2) has not yet been popped off the stack.
The identifiers x and y are formal parameters of the procedure p. The actual param-
eters in the call to p are z and 4*z+1.
The way that actual parameters are evaluated and passed to the function depends
on the programming language and the kind of parameter-passing mechanisms it uses.
The main distinctions between different parameter-passing mechanisms are
the time that the actual parameter is evaluated
the location used to store the parameter value.
Most current programming languages evaluate the actual parameters before execut-
ing the function body, but there are some exceptions. (One reason that a language or
174 Scope, Functions, and Storage Management
Semantics of Pass-by-Value
In pass-by-value, the actual parameter is evaluated. The value of the actual parameter
is then stored in a new location allocated for the function parameter. For example,
consider this function definition and call:
If the parameter is passed by value and y is an integer variable, then this code has
the same meaning as the following ML code:
As you can see from the type, the value passed to the function f is an integer. The
integer is the R-value of the actual parameter y, as indicated by the expression !y
in the call. In the body of f, a new integer location is allocated and initialized to the
R-value of y.
If the value of y is 0 before the call, then the value of f(!y) is 1 because the function
f increments the parameter and returns its value. However, the value of y is still 0
after the call, because the assignment inside the body of f changes the contents of
only a temporary location.
Semantics of Pass-by-Reference
In pass-by-reference, the actual parameter must have an L-value. The L-value of the
actual parameter is then bound to the formal parameter. Consider the same function
definition and call used in the explanation of pass-by-value:
If the parameter is passed by reference and y is an integer variable, then this code
has the same meaning as the following ML code:
As you can see from the type, the value passed to the function f is an integer reference
(L-value).
If the value of y is 0 before the call, then the value of f(!y) is 1 because the function
f increments the parameter and returns its value. However, unlike the situation for
pass-by-value, the value of y is 1 after the call because the assignment inside the body
of f changes the value of the actual parameter.
Example 7.3
Here is an example, written in an Algol-like notation, that combines pass-by-
reference and pass-by-value:
This code, which treats L- and R-values explicitly, shows that for pass-by-reference
we pass an L-value, the integer reference z. For pass-by-value, we pass an R-value,
the contents !z of z. The pass-by-value is assigned a new temporary location.
With y passed by value as written, z is assigned the value 2. If y is instead passed
by reference, then x and y are aliases and z is assigned the value 1.
Example 7.4
Here is a function that tests whether its two parameters are aliases:
function (y,z){
y := 0;
z :=0;
y := 1;
if z=1 then y :=0; return 1 else y :=0; return 0
}
If y and z are aliases, then setting y to 1 will set z to 1 and the function will return 1.
Otherwise, the function will return 0. Therefore, a call f(x,x) will behave differently
if the parameters are pass-by-value than if the parameters are pass-by-reference.
Static Scope: A global identifier refers to the identifier with that name that is
declared in the closest enclosing scope of the program text.
Dynamic Scope: A global identifier refers to the identifier associated with the
most recent activation record.
7.3 Functions and Procedures 177
These definitions can be tricky to understand, so be sure to read the examples be-
low carefully. One important difference between static and dynamic scope is that
finding a declaration under static scope uses the static (unchanging) relationship
between blocks in the program text. In contrast, dynamic scope uses the actual
sequence of calls that are executed in the dynamic (changing) execution of the
program.
Although most current general-purpose programming languages use static scope
for declarations of variables and functions, dynamic scoping is an important con-
cept that is used in special-purpose languages and for specialized constructs such as
exceptions.
Example 7.5
The difference between static and dynamic scope is illustrated by the following code,
which contains two declarations of x:
int x=1;
function g(z) = x+z;
function f(y) = {
int x = y+1;
return g(y*x)
};
f(3);
The call f(3) leads to a call g(12) inside the function f. This causes the expression
x+z in the body of g to be evaluated. After the call to g, the run-time stack will
contain activation records for the outer declaration of x, the invocation of f, and the
invocation of g, as shown in the following illustration.
outer block x 1
f(3) y 3
x 4
g(12) z 12
At this point, two integers named x are stored on the stack, one from the outer block
declaring x and one from the local declaration of x inside f. Under dynamic scope,
the identifier x in the expression x+z will be interpreted as the one from the most
178 Scope, Functions, and Storage Management
Control link
Access link
Return address
Return-result addr
Parameters
Local variables
Intermediate results
Environment
Pointer
recently created activation record, namely x=4. Under static scope, the identifier x in
x+z will refer to the declaration of x from the closest program block, looking upward
from the place that x+z appears in the program text. Under static scope, the relevant
declaration of x is the one in the outer block, namely x=1.
Example 7.6
Let us look at the activation records and access links for the example code from
Example 7.5, treating each ML declaration as a separate block.
Figure 7.6 shows the run-time stack after the call to g inside the body of f. As
always, the control links each point to the preceding activation record on the stack.
The control links are drawn on the left here to leave room for the access links
on the right. The access link for each block points to the activation record of the
closest enclosing block in the program text. Here are some important points about
7.3 Functions and Procedures 179
outer block x 1
control link
access link
g …
control link
access link
f …
this illustration, which follows our convention that we begin a new block for each
ML declaration.
As for in-line blocks, the compiler can determine how many access links to follow and
where to find a variable within an activation record at compile time. These properties
are easily determined from the structure of the source code.
To summarize, the control link is a link to the activation record of the previous
(calling) block. The access link is a link to the activation record of the closest enclosing
block in program text. The control link depends on the dynamic behavior of program
whereas the access link depends on only the static form of the program text. Access
links are used to find the location of global variables in statically scoped languages
with nested blocks at run time.
Access links are needed only in programming languages in which functions may
be declared inside functions or other nested blocks. In C, in which all functions are
declared in the outermost global scope, access links are not needed.
the first call to f in the body of g is a tail call, as the return value of g is exactly the
return value of the call to f. The second call to f in the body of g is not a tail call
because g performs a computation involving the return value of f before g returns.
A function f is tail recursive if all recursive calls in the body of f are tail calls to f.
Example 7.7
Here is a tail recursive function that computes factorial:
More specifically, for any positive integer n, tlfact(n,a) returns n!. We can see that
tlfact is a tail recursive function because the only recursive call in the body of tlfact
is a tail call.
The advantage of tail recursion is that we can use the same activation record for
all recursive calls. Consider the call tlfact(3,1). Figure 7.7 shows the parts of each
activation record in the computation that are relevant to the discussion.
7.3 Functions and Procedures 181
control
return result
n 3
a 1
control
return result
n 2
a 3
control
return result
n 1
a 6
After the third call terminates, it passes its return result back to the second call,
which then passes the return result back to the first call. We can simplify this process
by letting the third call return its result to the activation that made the original call,
tlfact(3,1). Under this alternative execution, when the third call finishes, we can pop
the activation record for the second call as well, as we will no longer need it. In fact,
because the activation record for the first call is no longer needed when the second
call begins, we can pop the first activation record from the stack before allocating
the second. Even better, instead of deallocating the first and then allocating a second
activation record identical to the first, we can use one activation record for all three
calls.
Figure 7.8 shows how the same activation record can be used used three times
for three successive calls to tail recursive tlfact. The figure shows the contents of this
activation record for each call. When the first call, tlfact(3,1), begins, an activation
record with parameters (1,3) is created. When the second call, tlfact(2,3), begins,
the parameter values are changed to (2,3). Tail recursion elimination reuses a single
activation record for all recursive calls, using assignment to change the values of the
function parameters on each call.
An activation record for itfact looks just like an activation record for tlfact . If
we look at the values of n and a on each iteration of the loop, we find that they
change in exactly the same way is for tail recursive calls to tlfact. The two func-
tions compute the same result by exactly the same sequence of instructions. Put an-
other way, tail recursion elimination compiles tail recursive functions into iterative
loops.
In a language with first-class functions and static scope, a function value is generally
represented by a closure, which is a pair consisting of a pointer to function code and
a pointer to an activation record.
Here is an example ML function that requires a function argument:
The map function take a function f and a list m as arguments, applying f to each
element of m in turn. The result of map(f, m) is the list of results f(x) for elements x of
the list m. This function is useful in many programming situations in which lists are
used. For example, if we have a list of expiration times for a sequence of events and
we want to increment each expiration time, we can do this by passing an increment
function to map.
We will see why closures are necessary by considering interactions between static
scoping and function arguments and return values. C and C++ do not support clo-
sures because of the implementation costs involved. However, the implementation
of objects in C++ and other languages is related to the implementation of function
7.4 Higher-Order Functions 183
values discussed in this chapter. The reason is that a closure and an object both
combine data with code for functions.
Although some C programmers may not have much experience with passing
functions as arguments to other functions, there are many situations in which this
can be a useful programming method. One recognized use for functions as function
arguments comes from an influential software organization concept. In systems pro-
gramming, the term upcall refers to a function call up the stack. In an important
paper called “The Structuring of Systems Using Upcalls,” (ACM Symp. Operating
Systems Principles, 1985) David Clark describes a method for arranging the functions
of a system into layers. This method makes it easier to code, modularize, and reason
about the system. As in the network protocol stack, higher layers are clients of the
services provided by lower layers. In a layered file system, the file hierarchy layer
is built on the vnode, which is in turn built over the inode and disk block layers. In
Clark’s method, which has been widely adopted and used, higher levels pass handler
functions into lower levels. These handler functions are called when the lower level
needs to notify the higher level of something. These calls to a higher layer are called
upcalls. This influential system design method shows the value of language support
for passing functions as arguments.
Example 7.8
An example program with two declarations of a variable x and a function f passed
to another function g is used to illustrate the main issues:
val x = 4;
fun f(y) = x*y;
fun g(h) = let val x=7 in h(3) + x;
g(f);
In this program, the body of f contains a global variable x that is declared out-
side the body of f. When f is called inside g, the value of x must be retrieved from
the activation record associated with the outer block. Otherwise the body of f would
be evaluated with the local variable x declared inside g, which would violate static
scope. Here is the same program written with C-like syntax (except for the type
expression int →int) for those who find this easier to read:
184 Scope, Functions, and Storage Management
{ int x = 4;
{ int f(int y) {return x*y;}
{ int g(int → int h) {
int x=7;
return h(3) + x;
}
g(f);
}}}
The C-like version of the code reflects a decision, used for simplicity throughout this
book, to treat each top-level ML declaration as the beginning of a separate block.
We can see the variable-lookup problem by looking at the run-time stack after
the call to f from the invocation of g.
x 4 Code
for f
f
g
Code
g(f) h for g
x 7
h(3) y 3
This simplified illustration shows only the data contained in each activation record.
In this illustration, the expression x*y from the body of f is shown at the bottom, the
activation record associated with the invocation of f (through formal parameter h of
g). As the illustration shows, the variable y is local to the function and can therefore
be found in the current activation record. The variable x is global, however, and
located several activation records above the current one. Because we find global
variables by following access links, the access link of the bottom activation record
should allow us to reach the activation record at the top of the illustration.
When functions are passed to functions, we must set the access link for the activa-
tion record of each function so that we can find the global variables of that function
correctly. We cannot solve this problem easily for our example program without
extending some run-time data structure in some way.
Use of Closures
The standard solution for maintaining static scope when functions are passed to
functions or returned as results is to use a data structure called a closure. A closure is
a pair consisting of a pointer to function code and a pointer to an activation record.
Because each activation record contains an access link pointing to the record for the
7.4 Higher-Order Functions 185
x 4
Code
for f
access
f
access
Code
g
for g
g(f )
access
h
x 7
closest enclosing scope, a pointer to the scope in which a function is declared also
provides links to activation records for enclosing blocks.
When a function is passed to another function, the actual value that is passed is
a pointer to a closure. The following steps are used for calling a function, given a
closure:
We can see how this solves the variable-access problem for functions passed to func-
tions by drawing the activation records on the run-time stack when the program in
Figure 7.8 is executed. These are shown in Figure 7.9.
We can understand Figure 7.9 by walking through the sequence of run-time steps
that lead to the configuration shown in the figure.
As described in step 5, the closure for f allows the code executed at run time to find
the activation record containing the global declaration of x.
When we can pass functions as arguments, the access links within the stack form a
tree. The structure is not linear, as the activation record corresponding to the function
call h(3) has to skip the intervening activation record for g(f) to find the necessary
x. However, all the access links point up. Therefore, it remains possible to allocate
and deallocate activation records by use of a stack (last allocated, first deallocated)
discipline.
Given two function arguments f and g, function compose returns the function
7.4 Higher-Order Functions 187
Example 7.9
In this example code, a “counter” is a function that has a stored, private integer value.
When called with a specific integer increment, the counter increments its internal
value and returns the new value. This new value becomes the stored private value
for the next call. The followng ML function, make counter, takes an integer argument
and returns a counter initialized to this integer value:
Function make counter allocates a local variable count, initialized to the value of
the parameter init. Function make counter then returns a function that, when called,
increments count’s value by parameter inc, and then returns the new value of count.
The types and values associated with these declarations, as printed by the com-
piler, are
Here is the same program example written in a C-like notation for those who prefer
this syntax:
1
mk counter(1) access
init 1
count
counter
c(2)
access
inc 2
Code for
counter
Figure 7.10. Activation records for function closure returned from function.
If we trace the storage allocation associated with this compilation and execution, we
can see that the stack discipline fails. Specifically, it is necessary to save an activation
record that would be popped off the stack if we follow the standard last allocated–first
deallocated policy.
Figure 7.10 shows that the records are allocated and deallocated in execution of
the code from Example 7.9. Here are the sequence of steps involved in producing
the activation records shown in Figure 7.10:
Closures are used to set the access pointer when a function returned from a nested
scope is called.
When functions are returned from nested scopes, activation records do not obey
a stack discipline. The activation record associated with a function call cannot
be deallocated when the function returns if the return value of the function is
another function that may require this activation record.
The second point here is illustrated by the preceding example: The activation record
for the call to make counter cannot be deallocated when make counter returns, as this
activation record is needed by the function count returned from make counter.
Activation records of functions and procedures contain access (or static scoping)
links that point to the activation record associated with the immediately enclosing
block.
Functions passed as parameters or returned as results must be represented as
closures, consisting of function code together with a pointer to the correct lexical
environment.
When a function returns a function that relies on variables declared in a nested
scope, static scoping requires a deviation from stack discipline: An activation
record may need to be retained until the function value (closure) is no longer in
use by the program.
EXERCISES
7.1 Activation Records for In-Line Blocks
You are helping a friend debug a C program. The debugger gdb, for example, lets
you set breakpoints in the program, so the program stops at certain points. When
the program stops at a breakpoint, you can examine the values of variables. If you
want, you can compile the programs given in this problem and run them under a
debugger yourself. However, you should be able to figure out the answers to the
questions by thinking about how activation records work.
(a) Your friend comes to you with the following program, which is supposed to
calculate the absolute value of a number given by the user:
1: int main( )
2: {
3: int num, absVal;
4:
5: printf(“Absolute Value\n”);
6: printf(“Please enter a number:”);
7: scanf(“%d”,&num);
8:
9: if (num <= 0)
10: {
11: int absVal = num;
12: }
13: else
14: {
15: int absVal = -num;
16: }
17:
18: printf(“The absolute value of %d is %d.\n \n”,num,absVal);
19:
20: return 0;
21: }
Your friend complains that this program just doesn’t work and shows you some
sample output:
cardinal: ˜ > gcc -g test.c
cardinal: ˜ > ./a.out
Absolute Value
Please enter a number: 5
The absolute value of 5 is 4.
Your friend swears that the computer must be broken, as it is changing the ad-
dress of a program variable. Using the information provided by the debugger,
explain to your friend what the problem is.
(b) Your explanation must not have been that good, because your friend does not
believe you. Your friend brings you another program:
1: int main( )
2: {
3: int num;
4:
5: printf(“Absolute Value\n”);
6: printf(“Please enter a number:”);
7: scanf(“%d”,&num);
8:
9: if (num >= 0)
10: {
11: int absVal = num;
12: }
13: else
14: {
15: int absVal = -num;
16: }
17:
18: {
19: int absVal;
20: printf(“The absolute value of %d is %d.\n\n”,num, absVal);
21: }
22: }
(c) Imagine that line 17 of the program in part (b) was split into three lines:
17a: {
17b: ...
17c: }
Write a single line of code to replace the ... that would guarantee this program
would NEVER be right. You may not declare functions or use the word absVal
in this line of code.
(d) Explain why the change you made in part (c) breaks the program in part (b).
fun middle1(l) =
let fun len(nil) = 0
| len(x : : l) = 1+len(l)
and get(n, nil) = raise Empty
| get(n, x : : l) = if n=1 then x else get(n-1, l)
in
194 Scope, Functions, and Storage Management
fun middle2(l) =
let fun m(x, nil) = raise Empty
| m(nil,x : : l) = x
| m([y],x : : l) = x
| m(y::(z : : l1),x : : l2) = m(l1, l2)
in
m(l, l)
end;
Assume that both are compiled and executed by a compiler that optimizes use of
activation records or “stack frames.”
(a) Describe the approximate running time and space requirements of middle1
for a list of length n. Just count the number of calls to each function and the
maximum number of activation records that must be placed on the stack at
any time during the computation.
(b) Describe the approximate running time and space requirements of middle2 for
a list of length n. Just count the number of calls to m and the maximum number
of activation records that must be placed on the stack at any time during the
computation.
(c) Would an iterative algorithm with two pointers, one moving down the list twice
as fast as the other, be significantly more or less efficient than middle2? Explain
briefly in one or two sentences.
The code that makes up the body of power is intended to calculate xy and place the
result in z. However, depending on the actual parameters, power may not behave
correctly for certain combinations of parameter-passing methods. For simplicity,
we only consider call-by-value and call-by-reference.
(a) Assume that a and c are assignable integer variables with distinct L-values.
Which parameter-passing methods make c = aa after a call power(a, a, c). You
may assume that the R-values of a and c are nonnegative integers.
(b) Suppose that a and c are formal parameters to some procedure P and that the
preceding expression power(a, a, c) is evaluated inside the body of P. If a and c
are passed to P by reference and become aliases, then what parameter-passing
Exercises 195
method(s) will make c = aa after a call power(a, a, c)? If, after the call, c = aa ,
does that mean that power actually performed the correct calculation?
7.6 Pass-by-Value-Result
In pass-by-value-result, also called call-by-value-result and copy-in/copy-out, pa-
rameters are passed by value, with an added twist. More specifically, suppose a
function f with a pass-by-value-result parameter u is called with actual parameter
v. The activation record for f will contain a location for formal parameter u that is
initialized to the R-value of v. Within the body of f, the identifier u is treated as an
assignable variable. On return from the call to f, the actual parameter v is assigned
the R-value of u.
The following pseudo-Algol code illustrates the main properties of pass-by-
value-result:
var x : integer;
x := 0;
procedure p(value-result y : integer)
begin
y := 1;
x := 0;
end;
p(x);
With pass-by-value-result, the final value of x will be 1: Because y is given a new
location distinct from x, the assignment to x does not change the local value of y.
When the function returns, the value of y is assigned to the actual parameter x.
If the parameter were passed by reference, then x and y would be aliases and the
assignment to x would change the value of y to 0. If the parameter were passed by
value, the assignment to y in the body of p would not change the global variable x
and the final value of x would also be 0.
Translate the preceding program fragment into ML (or pseudo-ML if you pre-
fer) in a way that makes the operations on locations, and the differences between
L-values and R-values, explicit. Your solution should have the form
196 Scope, Functions, and Storage Management
val x = ref 0;
fun p(y : int ref) =
...;
p(x);
Note that, in ML, like C and unlike Algol or Pascal, a function may be called as a
procedure.
procedure pass ( x, y );
integer x, y; // types of the formal parameters
begin
x := x + 1;
y := x + 1;
x := y;
i := i + 1
end
i := 1;
pass (i, i);
print i
end
(a) pass-by-value
(b) pass-by-reference
(c) pass-by-value-result
(b) Under dynamic scoping, what is the value of x + f(x) in this code? For each line
in which x is used, state which value is used for x and explain why these are the
appropriate values under dynamic scoping.
Some languages, including Lisp and Scheme, have a way to construct and evaluate
expressions at run time. Constructing programs at run time is useful in certain kinds
of problems, such as symbolic mathematics and genetic algorithms.
The following program evaluates the string bound to s, inside the scope of two
declarations:
let s = read text from user( ) in
let x = 5 and y = 3 in eval s end
end
If s were bound to 1+x*y then eval would return 16. Assume that eval is a special
language feature and not simply a library function.
(a) The “unused variable” optimization and the “eval” construct are not compat-
ible. The identifiers x and y do not appear in the body of the inner let (the
part between in and end), yet an optimizing compiler cannot eliminate them
because the eval might need them. In addition to the values 5 and 3, what in-
formation does the language implementation need to store for eval that would
not be needed in a language without eval?
(b) A clever compiler might look for eval in the scope of the let. If eval does not
appear, then it may be safe to perform the optimization. The compiler could
eliminate any variables that do not appear in the scope of the let declaration.
Does this optimization work in a statically scoped language? Why or why not?
(c) Does the optimization suggested in part (b) work in a dynamically scoped
language? Why or why not?
198 Scope, Functions, and Storage Management
record is numbered at the far left. In each activation record, place the number of
the activation record of the statically enclosing scope in the slot labeled “access
link.” The first two are done for you. Also use activation record numbers for
the environment pointer part of each closure pair. Write the values of local
variables and function parameters directly in the activation records.
val x = 5;
fun f(y) =
let val z = [1, 2, 3] (* declare list *)
fun g(w) = w+x+y (* declare local function *)
in
g (* return local function *)
end;
val h = let val x=7 in f(3) end;
h(2);
(a) Write the type of each of the declared identifiers (x, f, and h).
(b) Because this code involves a function that returns a function, activation records
cannot be deallocated in a last-in/first-out (LIFO) stacklike manner. Instead,
let us just assume that activation records will be garbage collected at some
point. Under this assumption, the activation record for the call f in the expres-
sion for h will still be available when the call h(2) is executed.
Fill in the missing information in the following illustration of the run-time
stack after the call to h at the end of this code fragment. Remember that func-
tion values are represented by closures and that a closure is a pair consisting
of an environment (pointer to an activation record) and compiled code.
200 Scope, Functions, and Storage Management
In this figure, a bullet (•) indicates that a pointer should be drawn from
this slot to the appropriate closure, compiled code, or list cell. Because the
pointers to activation records cross and could become difficult to read, each
activation record is numbered at the far left. In each activation record, place
the number of the activation record of the statically enclosing scope in the slot
labeled “access link.” The first two are done for you. Also use activation record
numbers for the environment pointer part of each closure pair. Write the values
of local variables and function parameters directly in the activation records.
Activation Records Heap Data and Closures Compiled Code
(1)
access link (0)
x
(2)
access link (1) ( ), •
f • code for f
(3)
access link ( ) ( ), •
h •
(4)
access link ( ) code for g
x
(5)
f(3) access link ( )
y 1 • 2 • 3
z •
g
(6)
h(2) access link ( )
w
(c) What is the value of this expression? Explain which numbers are added
together and why.
(d) If there is another call to h in this program, then the activation record for
this closure cannot be garbage collected. Using the definition of garbage given
in the Lisp chapter (see Chapter 3), explain why, as long as h is reachable,
mark-and-sweep will fail to collect some garbage that will never be accessed
by the program.
(b) What is the value of this expression under static scope? Briefly explain how
your stack diagram from part (a) allows you to find which value of myop to use.
(c) What would be the value if dynamic scope were used? Explain how the stack
diagram would be different from the one you have drawn in part (a) and briefly
explain why.
7.15 Closures
In ANSI C, we can pass and return pointers to functions. Why does the implemen-
tation of this language not require closures?
fun fact(n) =
let f(n, g) = if n=0 then g(1)
else let h(i) = g(i*n)
in
f(n-1, h)
end
in
f(n, fn x => x)
end
This question asks you to consider the problem of applying the ordinary tail recur-
sion elimination optimization to this function.
(a) Fill in the missing information in the following outline of the activation records
resulting from the unoptimized execution of f(2, fn x => x). You may want to
draw closures or other data.
(b) What makes this function more difficult to optimize than the other examples
discussed in this chapter? Explain.
Tail-recursive version
fact’(n,a) = (if n=0 then a else fact’(n-1,(n*a)))
factB (n) = fact’(n,1)