0% found this document useful (0 votes)
28 views7 pages

Stack Machine Handout

The document describes a stack machine designed for code generation and execution, utilizing a simple stack-based architecture for calculations and procedure calls. It outlines the machine's memory organization, including the stack, heap, and code space, along with instructions for manipulating the stack and executing operations. Additionally, it covers the management of local variables, branching, and procedure calls, emphasizing the use of activation records for maintaining state during execution.

Uploaded by

asdfasdf2123
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
28 views7 pages

Stack Machine Handout

The document describes a stack machine designed for code generation and execution, utilizing a simple stack-based architecture for calculations and procedure calls. It outlines the machine's memory organization, including the stack, heap, and code space, along with instructions for manipulating the stack and executing operations. Additionally, it covers the management of local variables, branching, and procedure calls, emphasizing the use of activation records for maintaining state during execution.

Uploaded by

asdfasdf2123
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

Stack Machine

Ian J. Hayes
February 14, 2025

Instead of generating code for a (complex) real ma- 1 Calculating using a stack
chine, we generate code for a simple stack machine,1
and then run the stack machine programs using an em- A basic operation is to push (load) a value onto the
ulator program (written in Java).2 The machine has a stack. This adds an extra value to the stack, (which be-
memory consisting of an array of words (not bytes). comes the current top of the stack), leaving the existing
The memory is divided into three areas: values unchanged, i.e., the old top of stack becomes
the second top of stack, the second top of stack be-
the stack that is used both for calculating the values of comes the third top of stack, etc. There is an instruction
expressions (see §1) as well as storing activation LOAD CON c, which pushes the constant c onto the
records for every active procedure call; an activa- stack, e.g., LOAD CON 42 pushes the value 42 onto
tion record contains the variables declared locally the stack. There are also instructions to load (push) val-
to a procedure (see §2) as well as other informa- ues from variables onto the stack and store a value from
tion like the return address from the procedure call the top of the stack into a variable (see §2).
(see §4); A unary operator, like NEGATE, takes the value on
top of the stack applies the operation, e.g., negates the
the heap that is used for dynamic allocation of ob- value, and replaces the top of stack with the new (e.g.,
jects3 (see §6); negated) value. The unary operator instructions are
NEGATE and NOT; see Appendix A for details.
the code space that is used for storing machine in- A binary operator, like ADD, takes the top two val-
structions. ues from the stack and replaces them by a single value
which is the result of the operation applied to the two
The stack and the heap share the one contiguous area values, e.g., their sum. The size of the stack is reduced
of memory. The stack grows up from the bottom of this by one word. For example, to evaluate 3 − 2 one can
area (address 0) and the heap grows down from the top use the following sequence of instructions:5
of this area. The machine has four special registers:
LOAD CON 3 // Push 3 onto the stack
stack pointer (sp) that contains the address of top of LOAD CON 2 // Push 2 onto the stack
the stack plus 1, i.e., the address into which the NEGATE // Negate the top of stack
next push will take place; ADD // Replace the top two values
// on the stack with their sum
stack limit (limit) that contains the address of the up- The above sequence as a whole leaves one extra value
per limit to which the stack can grow, which is on the top of the stack, namely 1. Note that there is no
also the bottom of the heap; subtract instruction, and hence one needs a sequence of
NEGATE and ADD to implement subtraction.
frame pointer (fp) that contains the address of the For comparison operators, like LESS, the second top of
stack frame (activation record) for the current pro- stack (first argument) is compared with the top of stack
cedure (or the main program);4 and (second argument), and these two values are replaced
by a boolean value, which is true (represented by 1) if
program counter (pc) that contains the address of the the first argument is less than the second argument, and
next machine instruction to be executed. false (represented by 0) if the first argument is greater
than or equal to the second argument. For example, to
See [3, §7.1] or [4, §7.1] for a discussion of memory evaluate 3 < 2 one can use
organisation for runtime execution. LOAD CON 3 // Push 3 onto the stack
1 Using stacks for evaluating expressions was pioneered by Aus-
LOAD CON 2 // Push 2 onto the stack
tralian philosopher Charles Hamblin [1]. The Burroughs B5000 de-
LESS // Replace the top two values
signed by Bob Barton was the first commercial stack machine [2]. // on the stack with boolean
2 The machine is based on the Itty Bitty Stack Machine as de-

scribed Pittman and Peters, The Art of Compiler Design, 1992. which will leave the top of stack containing false (0).
3 The heap should not be be confused with the heap data structure The binary operator instructions are DIV, MPY, ADD,
used in heap sort. XOR, OR, AND, EQUAL, LESS, and LESSEQ.
4 The main program is considered to be a (special) procedure, and

the variables declared in the main program are considered local to it. 5 Of course one could optimise this to LOAD CON 1.

1
2 Stack machine instructions (February 14, 2025)

The left column gives instruction addresses starting from zero. Each instruction takes one word, except
LOAD CON which requires two words.
0 LOAD CON 3 // Push the offset of x (i.e., 3) onto the stack
2 LOAD FRAME // Replace the top of stack with the current value of x
3 LOAD CON 1 // Push 1 onto the stack
5 ADD // Replace the top two values on the stack with their sum (i.e., x + 1)
6 LOAD CON 4 // Push the offset of y (i.e., 4) onto the stack
8 STORE FRAME // Store the second top of stack (x + 1) at the offset given by the top of stack (y)

Figure 1: Instruction sequence for assignment statement y := x + 1

2 Local Variables ative to branch backwards. For example, Fig. 2 gives


an instruction sequence to implement an “if” statement
Each procedure activation (i.e., call) allocates its local which assigns to y the absolute value of x. The branch
variables in its activation record, which is an area of offset (10) for the first branch (at location 8) is the dif-
memory within the stack. An activation record is also ference between its target address (the “else” part at
referred to as a stack frame. The start of the activa- location 19) and the address of the instruction after the
tion record for the most recently called procedure is branch (9). The branch offset (6) for the second branch
addressed by the machine’s frame pointer (fp). Local (at location 18) is the difference between its target ad-
variables are accessed by an offset relative to the frame dress (the instruction following the whole “if” state-
pointer. ment at location 25) and the address of the instruction
The instruction LOAD FRAME expects the top of after the branch (19). The second branch is uncondi-
stack to contain the offset from the frame pointer of a tional, BR. It expects a single argument, the branch
local variable. The top of stack is replaced by the con- offset, on top of the stack.
tents of the location whose address is the frame pointer Note how instructions in 0–5 in Fig. 2 form the code
plus the offset. to evaluate x < 0; those in 9–15 form the code for the
The instruction STORE FRAME expects the top of assignment statement y := −x, and within those 9–12
stack to contain an offset (from the frame pointer) and form the code to evaluate −x; and those in 19–24 form
the second top of stack to contain a value to be stored. the code for the assignment statement y := x.
The value (second top of stack) is stored in the loca-
tion whose address is the frame pointer plus the offset
(top of stack). Both values are popped from the stack. 4 Procedure call and return
For example, if local variable x is at offset 3 and y is
at offset 4 from the frame pointer, then the assignment Fig. 3 contains an example PL0 program which defines
y := x + 1 can be implemented by the instruction se- a procedure C and local to C a (recursive) procedure
quence in Fig. 1. Note how the instructions in loca- fact. Because (at this stage) PL0 does not include pa-
tions 0–5 form the instruction sequence for evaluating rameters to procedures, values are passed to and from
the expression x + 1. procedures using variables non-local to the procedure.

4.1 Stack frames


3 Branching
Each time a procedure is entered via a call, the machine
Control structures such as “if” statements and “while” needs to keep track of information such as the return
loops are implemented using the branch instruction. address and the local variables for that call. This in-
The conditional branch instruction, BR FALSE, ex- formation is stored in the call’s activation record. Be-
pects two arguments on the top of the stack. The cause recursive calls of procedures are allowed, there
first argument (second top of stack) should contain a may be multiple calls of a single procedure active and
boolean. The second argument (top of stack) contains each call (not just each procedure) needs a separate ac-
a branch offset relative to the address of the instruc- tivation record. Because procedure calls and returns
tion immediately following the branch instruction. If follow a last called is first returned structure it is appro-
the first argument is false (0) the branch is taken, oth- priate to use a (last-in first-out) stack to store the activa-
erwise execution continues with the instruction imme- tion records. The machine uses the same stack for both
diately after the branch. The branch offset can be neg- storing activation records and evaluating expressions.
Ian J. Hayes (February 14, 2025) 3

Branch offsets are relative to the address of the instruction immediately following the branch.
0 LOAD CON 3 // Push the offset of x (i.e., 3) onto the stack
2 LOAD FRAME // Replace the top of stack with the value of x
3 LOAD CON 0 // Push 0 onto the stack
5 LESS // Calculate x < 0
6 LOAD CON 10 // Load branch offset = 19-9 to “else” part
8 BR FALSE // Branch if x ≥ 0 to “else” part
“then” part 9 LOAD CON 3 // Push the offset of x (i.e., 3) onto the stack
11 LOAD FRAME // Replace the top of stack with the value of x
12 NEGATE // Negate the top of stack
13 LOAD CON 4 // Push the offset of y (i.e., 4) onto the stack
15 STORE FRAME // Store −x in y
16 LOAD CON 6 // Load branch offset = 25-19 to exit
18 BR // Branch unconditionally to end of “if” statement
“else” part 19 LOAD CON 3 // Push the offset of x (i.e., 3) onto the stack
21 LOAD FRAME // Replace the top of stack with the value of x
22 LOAD CON 4 // Push the offset of y (i.e., 4) onto the stack
24 STORE FRAME // Store x in y
exit 25 ... //

Figure 2: Instruction sequence for if x < 0 then y := −x else y := x

When a procedure is entered via a call, a new activation 4.2 Calling a procedure
record is “pushed” onto the stack. The activation record
To call a procedure the following sequence takes place:
is also called a stack frame and the stack machine has
a frame pointer (fp) which is always set to the address 1. any parameters to the procedure are pushed onto
of the base of the stack frame for the currently active the stack;6
procedure call. Fig. 4 gives the layout of a stack frame.
Each activation record contains: 2. the static link is pushed onto the stack (see §5);
a static link (at offset 0 from fp) also called an access 3. the current frame pointer is pushed onto the stack
link, that is used for accessing non-local variables to create the dynamic link;
– see §5 or [4, §7.3.2] or [3, §7.3.5];
4. the frame pointer is set so that it contains the ad-
a dynamic link (at offset 1) also called a control link, dress of the start of the new stack frame (i.e., the
that contains the address of the activation record address of the location containing the static link);
of the calling procedure (so the frame pointer can
be restored on return from the procedure) – see 5. the current value of the program counter (which
Sections 4.2 and 4.3 or [4, §7.3.1] or [3, §7.2.2]; is the address of the instruction after the call) is
pushed onto the stack to form the return address;
a return address (at offset 2) which is the address of
the instruction to execute on return from the pro- 6. the program counter is set to the address of the
cedure (used for restoring the program counter on procedure and execution of the body of the proce-
return); dure begins; and
local variables which are allocated sequentially on 7. space is allocated on the stack for any local vari-
the stack with addresses relative to the frame ables (using instruction ALLOC STACK).
pointer starting from offset 3; and
The CALL instruction implements steps 3–6 of the
parameters which are allocated sequentially on the above sequence; its first step is to pop the address of
stack with negative addresses relative to the frame the called procedure from the stack for later use in step
pointer. 6.
See [3, §7.2] or [4, §7.3] for a discussion of stack-based 6 The observant reader may have noticed that PL0 procedures do

runtime environments. not (at this stage) have parameters.


4 Stack machine instructions (February 14, 2025)

Stack frame
var x: int; // first argument to C
y: int; // second argument to C of calling proc.
res: int; // result from C

....
procedure C() = Zero or more
var n: int; // argument to fact −1 parameters
f: int; // result from fact
procedure fact() = fp 0 Static link
begin
// assume 0 <= n 1 Dynamic Link
if n = 0 then 2 Return address
f := 1
// f = n! 3

Top of stack
Zero or more
else // 0 < n

....
begin local variables
n := n - 1;
call fact();
// f = n! Stack used for
n:= n + 1; // restore n expression eval.
f := f * n
// f = n! Figure 4: Stack frame
end
end; // fact
quence takes place:
begin // C
n := x; 1. the program counter is set to the return address in
call fact(); the current activation record;
// f = x!
2. the frame pointer is set to the dynamic link;
res := f;
n := y; 3. the stack pointer is set so that all the space used
call fact(); by the stack frame (but not parameters) is popped
// f = y! from the stack;
res := res / f;
// res = x! / y! 4. execution continues at the instruction addressed
n := x - y; by the (restored) program counter; and
call fact(); 5. after return the calling procedure handles deallo-
// f = (x-y)! cating any parameters (using instruction DEAL-
res := res / f LOC STACK).
// res = x! / (y! * (x-y!))
end; // C The RETURN instruction handles steps 1–4 above.
begin // main program Note that there is no need to explicitly deallocate the
read x; local variables as setting the stack pointer in step 3
read y; achieves this.
call C(); Note that the dynamic links form a chain back through
write res all the active procedure calls and ending at the stack
end frame for the main program. The dynamic link of the
main program never needs to be accessed. Note that,
for consistency, the form of the main program’s stack
Figure 3: Example recursive program frame is the same as for any procedure, but its static
and dynamic links are not used.

4.3 Returning from a procedure 5 Non-local variables


On returning from a procedure its activation record is PL0 allows procedure definitions to be declared locally
“popped” from the stack. To return the following se- within a procedure. We say the local procedure def-
Ian J. Hayes (February 14, 2025) 5
inition is directly statically enclosed (or enclosed for −−
short) within the outer procedure. To allow access to
the variables of an enclosing procedure, the stack frame −−
for a procedure includes a static link, that contains the

Main
0
address of the base of the stack frame for the directly
enclosing procedure. To be more precise in the con- x 2
text of recursive calls on procedures, the static link is
y 3
the base of the stack frame for the most recent acti-
vation (call) of the directly enclosing procedure. Note Static link
that the enclosing procedure can, but does not have to
Dynamic link
be, the calling procedure. For the example program in
Fig. 3 the main program calls procedure C, which then Return address

C
calls procedure fact, which as it is recursive may call
n 0
itself again. During any call on procedure fact the
static link for that call addresses the stack frame for the f 1
call on C, and C’s static link addresses the stack frame
Static link
for the main program.

fact
Variables local to the main program are considered Dynamic link
non-local to procedures C and fact, and variables lo-
Return address
cal to procedure C are inaccessible (do not exist really)
for the main program and are considered non-local to Static link
the procedure fact.
Dynamic link

fact
Fig. 5 illustrates the state of the stack for the program
in Fig. 3 after the main program has read x as 2 from Return address
the input and then read y as 3 and has called C, which
fp Static link
sets n to 2 and calls fact, which sets n to 1 and (recur-
sively) calls fact, which sets n to 0 and (recursively) Dynamic link

fact
calls fact, which has set f to 1 and is about to return.
Return address
To access the non-local variable n from procedure
fact the static link of the (current) call of fact is
accessed to get the address of the stack frame of the Figure 5: Multiple stack frames: the arrows on the left
enclosing procedure C and then the offset (i.e., 3) of represent static links and those on the right dynamic
the variable n within that frame is added to that to give links.
the address of the non-local variable n.
To access the variable x of the main program from
procedure fact — not that procedure fact currently • the TO LOCAL converts the absolute address to
does — fact’s static link is accessed to give the frame an address relative to the frame pointer by sub-
of the directly enclosing procedure, C, and then C’s tracting the frame pointer, so that the address can
static link is accessed to give the stack frame of the now be used as if it is local to the current frame.
grand-parent procedure, the main program. The off-
set (i.e., 3) of the variable x within the main program’s The static links form a chain and hence the program
stack frame is then added to this to give the address of can also access great-grand-parents, etc. The end of
the variable. The corresponding code is given in Fig. 6: the chain is always the stack frame for the main pro-
gram. The static link of the main program never needs
• the instruction ZERO pushes 0 onto the stack; to be accessed. The example in Fig. 6 handles a dif-
• the LOAD FRAME then pushes the value at off- ference in static levels of 2. To handle a difference of
set 0, i.e., the value of the static link (that con- 3 an additional LOAD ABS is required. In general, if
tains the address of the stack frame for C), onto the difference in levels is n, where n ≥ 1, then n − 1
the stack; LOAD ABS are required.
• the LOAD ABS replaces the top of stack with the
value contained at the absolute address currently 6 Other instructions
in the top of stack, i.e., the value of the static link
for C (that contains the address of the stack frame The instructions LOAD MULTI and
for the main program); and STORE MULTI can be used to load and store
6 Stack machine instructions (February 14, 2025)
// Push the static link for the current frame (See §3.) The offset may be negative to branch
// i.e. the address of frame for procedure C backwards.
ZERO
BR FALSE The top of the stack gives the location to
LOAD FRAME
branch to if the second top of stack is false (0),
// Replace with static link of next outer frame
otherwise no branch is taken. The top two ele-
// i.e. the address of the frame for main program
ments are popped from the stack. The location to
LOAD ABS
be branched to is an offset relative to the current
// Add offset of x
program counter which addresses the instruction
LOAD CON 3
following the branch. (See §3.) The offset may be
ADD
negative to branch backwards.
// Convert to an offset from fp
TO LOCAL COPY Copy the object, of size contained in the top of
// Load the value of x stack, from the address in the third top of stack to
LOAD FRAME the address in the second top of stack. All three
values are popped from the stack.
Figure 6: Nonlocal access to x from fact CALL Call a procedure whose absolute address is
popped from the stack and remembered. It is as-
sumed that the static link was the second top (now
multiple words onto the stack, and COPY can copy
top) of stack; it remains there to form the first
multiple words from one address to another. Copy is
word of the activation record. The call pushes the
not used in the base compiler but can be used for data
current frame pointer to form the dynamic link
types like structures, records, etc.
and then sets the the frame pointer so that it ad-
The instruction BOUND checks whether a value is in dresses the static link. Finally it pushes the current
bounds. It is used for checking that values are in a given program counter value to form the return address
subrange for subrange types and can be used for check- before branching to the start of the called proce-
ing array indices. dure. (See §4.)
The instruction ALLOC HEAP is used to allocate a
block of words within the heap. The heap is a separate RETURN Exit a procedure restoring the frame
area of memory used for dynamic allocation of objects pointer from the dynamic link and the program
(“new” in Java). There is no deallocation mechanism counter from the return address. The stack pointer
or garbage collection implemented for the machine. is set so that the complete stack frame starting
The instruction STOP stops execution of the stack ma- from the static link is popped from the stack. (See
chine interpreter. It can be used to stop the machine in §4.)
exceptional circumstances. ALLOC STACK The top of stack contains the num-
ber of words to be allocated. This value is popped
from the stack and that many words are allocated
A Stack Machine Instructions on top of the stack. (The current implementation
of the stack machine initialises all words to a non-
All instructions are one word in size, except sense value hexadecimal 0x80808080, which is a
LOAD CON, which is two words. The address of the large negative number.)
next instruction to execute is stored in the machine’s DEALLOC STACK The top of stack is popped to
program counter. Note that during the execution of an give the number of words to be deallocated
instruction the program counter has already been incre- (popped) from the stack.
mented so that it addresses the instruction following the
POP Discard the top element on the stack.
one being executed. The code for the stack machine
emulator may be found in class StackMachine in DUP Duplicate the top of stack.
package machine; it is the ultimate reference for what SWAP Swap the top two words on the stack.
the stack machine actually does.
ADD Add the top two words on the stack, replacing
NO OP No operation. them with their sum.

BR The top of stack is popped to give the location to MPY Multiply the top two words on the stack, replac-
branch to. The location to be branched to is an off- ing them with their product.
set relative to the current program counter which DIV Divide the second top of stack by the top of stack,
addresses the instruction following the branch. replacing them with their quotient.
Ian J. Hayes (February 14, 2025) 7
OR Bitwise ‘or’ the top two words on the stack, re- the top of stack and the frame pointer. The top two
placing them with the result. values on the stack are popped.
AND Bitwise ‘and’ the top two words on the stack, LOAD FRAME The top of the stack is replaced with
replacing them with the result. the contents of the memory location whose ad-
XOR Bitwise ‘exclusive or’ the top two words on the dress is given by the sum of the top of the stack
stack, replacing them with the result. and the frame pointer.
ZERO Push the value zero (or false for booleans) onto
EQUAL Compare the top two words on the stack, re-
the stack.
placing them with ‘true’ (1) if they are equal, or
‘false’ (0) otherwise. ONE Push the value one (or true for booleans) onto
the stack.
LESS Compare the top two words on the stack, replac-
ing them with ‘true’ (1) if second top is less than ALLOC HEAP The top of stack gives the size of an
the top, or ‘false’ (0) otherwise. object to be allocated within the heap. The top of
stack is replaced with the absolute address of the
LESSEQ Compare the top two words on the stack, re-
allocated object, unless insufficient heap memory
placing them with ‘true’ (1) if second top is less
is available, in which case the machine gives an
than or equal to the top, or ‘false’ (0) otherwise.
exception and halts.
NOT Bitwise complement the top word on the stack.
LOAD MULTI The top of stack contains a count of
NEGATE Replace the top of the stack with its integer the number of words to load and the second top
negation. of stack an offset from the frame pointer, which
READ Read an integer value from input (one per line) gives the address from which to push words onto
and push onto the stack.7 the stack. These two parameters are popped and
count words are pushed onto the stack starting
WRITE Pop the stack and write it to output (one per from the word at the address.
line).7
STORE MULTI The top of stack contains a count of
BOUND The top of stack contains an upper bound, the number of words to store and the second top
the second top of stack contains a lower bound, of stack an offset from the frame pointer, which
and the third top of the stack contains a value to gives the address to which to store words from
be bounds checked; these are all popped. If the the stack. These two parameters are popped and
value to be checked is not within the given bounds count words from the stack are stored starting at
(inclusive) then value false is pushed on top of the the word with address + count − 1 and decreas-
stack, else the value true is pushed on top of the ing.
stack.
STOP Stop execution of the machine. The top of the
TO GLOBAL The top of stack contains an offset stack is popped to get an exit code. There are error
(from the frame pointer) which is converted to message associated with different exit codes.
an absolute (global) address by adding the frame
pointer to it.
TO LOCAL The top of stack contains an absolute ad- References
dress which is converted to an address relative to
[1] Charles L. Hamblin.
the frame pointer (fp), i.e., top of stack is replaced An addressless coding
by top of stack minus the fp. scheme based on mathematical notation. In Pro-
ceedings of the First Australian Conference on
LOAD CON c Push the constant c (which may be Computing and Data Processing, June 1957.
negative) onto the stack. This instruction is two
words long. [2] Robert S. Barton. A new approach to the functional
design of a digital computer. In Proceedings of the
LOAD ABS The top of the stack is replaced with the
Western Joint Computer Conference. ACM, 1961.
contents of the memory location whose address is
given by the top of the stack. [3] A. V. Aho, M. S. Lam, R. Sethi, and J. D. Ull-
STORE FRAME The value of the second top of the man. Compilers: Principles, Techniques, & Tools.
stack is stored at the address given by the sum of Addison-Wesley, second edition, 2007.
7 The instructions READ and WRITE would normally be con- [4] K. C. Louden. Compiler Construction: Principles
sidered too high-level as machine instructions, but they simplify the and Practice. PWS, 1997.
handling of input and output.

You might also like