Three Address Code Part
Three Address Code Part
I have scoured the Internet searching and searching for information, as complete as possible, about code 3.
directions to develop my university projects and, unfortunately, I wasn't lucky.
Before looking at those big sheets of code, I will start to give some theory to allow understanding of the
topics. This information does not belong to any book but is the way I know it. I recommend not to
take it as something unique, I am sure there are more concepts and techniques. My goal is to present something easy to
understand this very complex topic and make it a little easier for them.
Introduction
Compiler and its phases
A compiler is nothing more than a program that allows interaction between a programmer and a computer.
I describe it as casually as possible.
A compiler allows a program to be coded without errors and translates a programmer's language to it.
language that computers understand "1 and 0".
It is composed of two phases:
Analysis.
Synthesis.
Analysis
Lexical analysis (scanner): Breaks down all the code into lexical units called tokens which form the
words of our language. It doesn't matter what each word says, only what type of word it is. Here we
also check if each symbol is part of the language.
Syntactic analysis (parser): It gathers all the words from the lexical analyzer and creates a grammatical structure that
Recognize the correct order of the words. For that, it is necessary for a defined grammar to exist.
Semantic analysis: It allows giving meaning to the existing grammatical structure. In all grammar there can be
different contexts which are considered as situations in which a certain action should be taken. In
this phase checks if a variable being accessed has already been declared previously, if repeated variables are declared, it
validates coherences between data types, etc. These are issues that comply with grammar but make no sense.
In the aforementioned phases, the symbol table is constructed and errors are checked. Keep in mind that
logical errors have a lot to do with the programmer's abstraction, so it is not possible that a
compiler can detect them.
Synthesis
In this phase, the translation from high-level code to low-level code such as assembly or even
Translation to intermediate code (three-address code): In this part, we will focus all the study, it consists
in transforming all high-level code to an equivalent but simpler one. This means that the language will be
less intelligent in the sense that there is no control over what is being executed but it facilitates translation to
assembly code.
Code optimization: Since a grammar is responsible for generating the three-address code, this
it has the flaw that generates inefficient code at runtime and uses more resources than necessary. This
The phase consists of optimizing and improving the performance of the program. Generally, better results are achieved.
Translation to assembly: It is the step from three-address code to assembly, in fact, this phase is quite
trivial since there is much similarity between three-address code and assembly.
Translation to binary code: It is the step from assembly code to binary. This part is out of reach of
this blog but it is worth mentioning that it is very related to how the arithmetic logic unit (ALU) works
our team as well as the RAM.
For these topics, I recommend the dragon book; it goes in-depth.
Let's get started then.
Three-address code
It's called that way because it only allows referencing 3 memory addresses at the same time. This means
Expressions like (a+3)*(4-5)/2 cannot be operated as a single instruction. Therefore, it must be
decompose the expression into simpler expressions and we must use temporary variables to store
the results as long as we have not used them. For notation purposes, we will denote the temporal variables with
a "t_n" where "n" represents a subscript.
Let's put into practice what we've said, first let's suppose that at this moment there are no functions nor
the classes. We are only in a global environment with a structured language and all variables are global:
For the expression x=(a+3)*(4-5)/2 we are going to transform it into three-address code:
t_1=a+3;
t_2=4-5;
t_3 = t_1 * t_2;
t_4 = t_3 / 2
x=t_4;
As you can see, I have used temporaries to store the result of simple expressions and use them.
later. This allows us to perform more complex operations with a simple language. I am assuming that
we have a precedence that favors multiplication and division over addition and subtraction; we are also
associating the operations from left to right. It is interesting to observe that we always create a new temporary.
when we could reuse them; this is because our parser is not smart enough to
Identifying that would only complicate the programming and we will see that it does not allow for code optimization.
The million-dollar question would be, what is grammar going to be useful for? My answer would be, the idea is that it allows one to generate oneself.
Three-address code from a high-level language. Let's see how the grammar looks then:
In the previous grammar, we assumed that there is already a function called gen_temp that generates the temporaries for us.
they are more than concatenated strings of text with a number XD.
As can be seen so far, the topic is not complex, and in the upcoming posts, we will talk more about this topic. My
The objective is to cover everything related to three-address code for object-oriented high-level languages.
It is important to keep in mind that the three-address code is not aware of what it is
doing, nor does it know about the existence of variables, classes, statements, etc. It simply executes all the code
that is encountered along the way and its behavior should be exactly like that of a language of
high level.
Logical expressions
A logical expression is understood as one that can only return two types of values: true or false.
Logical operations, although they do not share the same characteristics as numerical operations,
they must go along with the arithmetic expressions. However, logical or boolean operations have lower
precedence over arithmetic operations.
Simple prepositions.
Compound prepositions.
Simple prepositions
They are called those that only depend on the result of a logical proposition. They are divided into:
For the following assignments, we are going to demonstrate the simple prepositions with their respective translation.
three-address code
1. bool x=true; //Boolean literal
2. bool y=false; //Boolean literal
3. bool z=4>20;//Relational expressions
4. bool w=a+1==10; //Equality expression
5. bool N = 7 != 5; // Equality expression
Remember that all these expressions result in true or false values and then assign them to the
values to the variables that are on the left side of the assignment, that is, the id that is found at
the left side of the "=." It is important to emphasize that it is impossible for expressions that are not to occur.
binary or unary. For example, it is impossible for '1<2>5' to occur since they would operate by associativity.
the first two operands and it would give us false and then it would attempt to compare a boolean with an integer that is
"false>5" which is a semantic error. Such types of expressions cannot be joined by symbols of
equality nor relational. A valid case could be '4==5==true' however it makes no sense.
Before solving the aforementioned expressions, I will introduce additional notation to the code of 3.
directions.
We have the logical operators (>, >=, <=, <, ==, !=)
We have the 'goto label' that allows us to jump to the line of code where the label is located.
named.
We have "Label:" which is the label named as such.
We have the evaluation of a condition through the 'if' statement.
Next, I will solve the expressions I mentioned earlier:
1. bool x=true;
x=1;
2. bool y=false;
y=0;
3. bool z=4>20;
if(4>20) goto L1;
goto L2;
L1: z=1;
goto L3;
L2: z=0;
L3:
4.w=a+1==10;
a + 1;
if(t_1==10) goto L1;
goto L2;
L1 w=1;
goto L3;
L2: w=0;
L3:
5. N=7!=5;
if(7!=5) goto L1;
goto L2;
L1 N=1;
goto L3;
N=0;
L3:
As you may have noticed, I never used true or false but rather 1 or 0. It is also important to emphasize that in no case...
I have used more than 3 references per instruction.
Compound prepositions
Compound propositions are those that contain two or more propositions within the same expression. These
Expressions also return a true or false value. The operators that allow composing simple propositions are:
&&, || or ! or also and, or or not.
The operators "and, or, and not" do not exist.in the three-address code but can be simulated through a control
in the labels and 'goto'.
Another very important point to consider is that the idea is to execute the smallest amount possible of
instructions to make the program more efficient.
X Y Result
0 0 0
0 1 1
1 0 1
1 1 1
As you can see, as long as one of the two prepositions is true, the result will be
true and otherwise it will be false. Therefore, we could prevent the proposition from being executed.
completely composed if any of them is true.
if(n<10) goto
goto L2;
L2:if(y==10) goto L3;
goto L4;
L4: x=0;
goto L5;
L1:L3: x=1;
L5:
bool w = a + n / 4 >= 10 || x + 1 == 10 || x != 0;
t_1=n/4;
a + t_1
if(t_2>=10)goto L1;
goto L2;
L2
t_3=x+1;
if(t_3==10)goto L3;
goto L4;
L4:if(x!=0)goto L5;
goto L6;
w=1;
goto L7;
L6:w=0;
L7:
Let's start with the instruction AND
According to the following table we will see:
X Y Result
0 0 0
0 1 0
1 0 0
1 1 1
As can be seen, the result will be false if one of the two propositions is false. Therefore
We could avoid executing the compound preposition completely if any of them is false.
Asig->id = ExpOr{
if(ExpOr.type=="int"||(ExpOr.type=="bool"&&ExpOr.v==""&&ExpOr.f=="")){
write(id.val, "=", ExpOr.val);
}
else{
write(ExpOr.v, ":");
write(id.val,"=1");
tmp=genTemp();
write("goto ",tmp);
write(ExpOr.f, ":");
write(id.val,"=0");
write(tmp, ":");
}
}
ExpOr->ExpOr{
if(ExpOr1.type!='bool')
abort("ErrorLine X, position X, types are not consistent for the operator ||");
if(ExpOr1.v==""){
ExpOr1.v=genEtiq();
ExpOr1.f=genEtiq();
write("if(", ExpOr1.val, "==1)goto", ExpOr1.v);
write("goto ", ExpOr1.f);
}
write(ExpOr.f, ":")
} ExpAnd{
if(ExpAnd.type != 'bool')
abort("ErrorLine X, position X, the types are not consistent for the operator ||");
ExpOr.v=concat(ExpOr1.v, ":", ExpAnd.v);
ExpOr.f = ExpAnd.f;
ExpOr.val
bool
}
|ExpAnd{ExpOr.val=ExpAnd.val;ExpOr.tipo=ExpAnd.tipo;ExpOr.v=ExpAnd.v;
ExpOr.f=ExpAnd.f
ExpAnd->ExpAnd{
if(ExpAnd1.type != 'bool')
abort("ErrorLine X, position X, the types are not coherent for the operator&&");
if(ExpAnd1.v==""){
genEtiq();
ExpAnd1.f=genEtiq();
write("if(", ExpAnd1.val, "==1)goto", ExpAnd1.v);
write("goto ", ExpAnd1.f);
}
write(ExpAnd.v, ":")
}&& ExpIg{
if(ExpIg.tipo!='bool')
abort("LineError X, position X, the types are not coherent for the operator &&");
ExpAnd.f=concat(ExpAnd1.f, ":", ExpIg.f);
ExpAnd.v=ExpIg.v;
ExpAnd.val
bool
}
|ExpIg{ExpAnd.val=ExpIg.val;ExpAnd.tipo=ExpIg.tipo;ExpAnd.v=ExpIg.v; ExpAnd.f=ExpIg.f}
ExpIg->ExpIg == ExpRe{
if(ExpIg1.type!='int'&&ExpRe.type!='int')
abort("ErrorLine X, position X, types are not consistent for operator==");
ExpIg.v=genEtiq();
ExpIg.f=genEtiq();
write("if(", ExpIg1.val, "==", ExpRe, ")goto ", ExpIg.v);
write("goto ", ExpIg.f);
ExpIg.val=””;
bool
}
ExpIg != ExpRe{
if(ExpIg1.type != 'int' && ExpRe.type != 'int')
abort("ErrorLine X, position X, the types are not consistent for the operator !=");
ExpIg.v=genEtiq();
ExpIg.f=genEtiq();
write("if(", ExpIg1.val, "!=" , ExpRe, ") goto ", ExpIg.v);
write("goto ", ExpIg.f);
ExpIg.val
bool
}
ExpRe{ExpIg.val=ExpRe.val;ExpIg.tipo=ExpRe.tipo;ExpIg.v=ExpRe.v; ExpIg.f=ExpRe.f}
ExpRe->ExpRe> ExpSum{
if(ExpRe1.type!='int'&&ExpSum.type!='int')
abort("ErrorLine X, position X, the types are not consistent for the operator>");
ExpRe.v=genEtiq();
ExpRe.f=genEtiq();
write("if(", ExpRe1.val, ">", ExpSum, ")goto ", ExpRe.v);
write("goto ",ExpRe.f);
ExpRe.val
bool
}
|ExpRe >= ExpSum|{
if(ExpRe1.type!='int'&&ExpSum.type!='int')
abort("LineError X, position X, types are not consistent for the operator >=")
ExpRe.v = genEtiq();
ExpRe.f=genEtiq();
write("if(",ExpRe1.val,">=",ExpSum,")goto ",ExpRe.v);
write("goto ", ExpRe.f);
ExpRe.val
bool
}
|ExpRe < ExpSum{
if(ExpRe1.type!='int'&&ExpSum.type!='int')
abort("LineError X, position X, types are not coherent for the operator <");
ExpRe.v=genEtiq();
ExpRe.f=genEtiq();
write("if(\",ExpRe1.val,\" < \",ExpSum,\" )goto \",ExpRe.v);
write("goto ",ExpRe.f);
ExpRe.val
bool
}
|ExpRe <= ExpSum
{
if(ExpRe1.type!='int'&& ExpSum.type!='int')
abort("ErrorLine X, position X, types are not coherent for operator <=")
ExpRe.v=genEtiq();
ExpRe.f=genEtiq();
write("if(", ExpRe1.val, "<=", ExpSum, ")goto ", ExpRe.v);
write("goto ", ExpRe.f);
ExpRe.val
bool
}
|ExpSum{ExpRe.val=ExpSum.val;ExpRe.tipo=ExpSum.tipo;ExpRe.v=ExpSum.v; ExpRe.f=ExpSum.f}
ExpSum->ExpSum + ExpPor{
ExpSum .val=genTemp();
if(ExpSum.type!='int'&&ExpPor.type!='int')
abort("ErrorLine X, position X, types are not consistent for operator+");
write(ExpSum.val,"=",ExpSum 1.val,"+",ExpPor.val);
ExpSum.v
int
}
|ExpSum -ExpPor{
ExpSum .val=genTemp();
if(ExpSum.type!='int'&&ExpPor.type!='int')
abort("ErrorLine X, position X, types are not consistent for the operator -");
write(ExpSum.val, "=", ExpSum 1.val, "-", ExpPor.val);
ExpSum.v
int
}
ExpPor{ExpSum .val=ExpPor.val;ExpSum .tipo=ExpPor.tipo;ExpSum .v=ExpPor.v; ExpSum .f=ExpPor.f}
ExpPor->ExpPor * ExpUn{
genTemp();
if(ExpPor1.type!='int'&&ExpUn.type!='int')
abort("ErrorLine X, position X, the types are not coherent for operator *");
write(ExpPor.val, '=', ExpPor1.val, '*', ExpUn.val);
ExpPor.v
int
}
|ExpPor/ExpUn{
genTemp();
if(ExpPor1.type!='int'&&ExpUn.type!='int')
abort("LineError X, position X, types are not coherent for the operator/");
write(ExpPor.val,"=",ExpPor1.val,"/",ExpUn.val);
ExpPor.v
int
}
|ExpPor%ExpUn{
genTemp();
if (ExpPor1.type != 'int' && ExpUn.type != 'int')
abort("ErrorLine X, position X, the types are not coherent for the operator/");
write(ExpPor.val, "=", ExpPor1.val, "/", ExpUn.val);
ExpPor.v
int
}
EpxUn->!ExpSimple{EpxUn.val=ExpSimple.val;EpxUn.tipo=ExpSimple.tipo;EpxUn.v=ExpSimple.f;
EpxUn.f=ExpSimple.v}
SimpleExpExpSimple.val
ExpSimple->(ExpOr){ExpSimple.val=ExpOr.val;ExpSimple.tipo=ExpOr.tipo;ExpSimple.v=ExpOr.v;
ExpSimple.f=ExpOr.f
Id{ExpSimple.val=Id.val;ExpSimple.tipo=tablaSimbolos.getTipo(Id.val);ExpSimple.v=””; ExpSimple.f=””}
|Num{ExpSimple.val=Num.val;ExpSimple.tipo=”int”;ExpSimple.v=””; ExpSimple.f=””}
true{ExpSimple.val=”1″;ExpSimple.tipo=”bool”;ExpSimple.v=””; ExpSimple.f=””}
false{ExpSimple.val=”0″;ExpSimple.tipo=”bool”;ExpSimple.v=””; ExpSimple.f=””}
If statement
A condition is validated and if true, the body of the instruction is executed, let's see an example
simple
int x=0;
int y=5;
if(x==0){
y = y + 5 * 2;
x = y / 5;
}
y = y + 1;
x=0;
y=5;
if(x==0)goto L1;
goto L2;
t_1=5*2;
y + t_1
y=t_2;
x=y/5;
L2:y=y+1
As you can see, I used a lot of what we discussed in the previous post. The only difference that can be perceived...
the result of the condition is not stored in any variable but is only used for
manipulate the flow of execution. That means that the fact that we evaluate a condition in an 'if'
that does not alter any variable, it only changes the way the program will be executed.
if(condition) goto Lv
goto Lf
Lv:
instructions
<instructions>
If they are observant, they will realize that the content of the fake label will execute regardless of what it is.
the result of the condition of the if and that is correct for this case since if it is false it will simply not execute it
what is inside the if but the rest will do it anyway that means the instruction "y=y+1" always
it is executed for this case. In the following statement, two exclusive blocks can be made, that is, if not
one block is executed then the other one is executed.
If-Else Statement
The condition is validated; if it is true, the body of the 'if' is executed and the code of the 'else' body is omitted.
Otherwise, the code in the body of the if does not execute, but that of the body of the 'else' does. Let's see the following
example:
int x=0;
int y=5;
if(x==0){
y = y + 5 * 2;
x = y / 5;
]else{
y = y + 1;
}
We see that it is quite similar to the previous example, but now we do not want it to increase by one, and if the
condition becomes true.
x=0;
y=5;
if(x==0)goto L1;
goto L2;
t_1=5*2;
t_2=y+t_1;
y=t_2;
x = y / 5;
goto L3;
L2:y=y+1
L3:
What changes is that at the end of the body of the 'if' a break tag is used that will cause the code to be skipped.
from 'else'. Therefore, the general form of the three-address code for the 'if' statement is as follows:
<instructions>
The exit:
Now we will see how we can play with jump tags to generate loops.
While statement
The "while" statement is a kind of "if", with the only difference being that at the end of the body of the if it returns to
evaluate the condition and if it turns out true, execute it again and then repeat the cycle until it is false
condition. For example, if we want there to be a counter:
int x=0;
int y=20;
while(x<10 && y>=0){
x=x+1;
y=y-1;
}
x=0;
y=20;
return jump label
if(x<10) goto L2;
goto L3;
if(y>=0)goto L4;
goto L5;
L4:if the condition is true
x=x+1;
y=y-1;
goto L1;//jump to reevaluate the condition
L3if it is false
Note that we use a compound condition, as we saw the operator 'and' does not exist in code 3.
directions, therefore we have to play around a bit with the labels and the jumps to
simulate it. In the end, we reach the part where the condition is true, and there the body of the
Then there must be a jump that takes us back to the beginning of the instruction to reevaluate the
condition. If in the end it is false, then it kicks us out.
<instructions>
Some variants:
Instruction until
The same thing happens as with the while, with the difference that the body of the instruction is executed until it is true.
instruction.
The return: if(condition) goto Lv;
goto Lf;
Lf:
<instructions>
goto The return;
Lv:
<instructions>
do-while instruction
This variant first executes the body of the instruction and finally checks if the condition is true; this
guarantees that the body of the statement will be executed at least once. The way it should be expressed in
Three-address code is as follows:
The return:
<instructions>
if(condition) goto The return;
goto
instructions
do{
instructions
}while(condition);
Yes, in case the same functionality is desired, that is to say, it executes the body and then validates the condition.
but it would like to present itself like this
do while(condition) {
<instructions>
}
goto L1;
return: if(condition) goto Lv;
goto Lf;
L1:Lv:
<body>
goto Lreturn;
Lf:
<other code>
The code may seem absurd but remember that it must be generated through grammar and would be
the only way to do it. We cannot express it as the previous three-address code since when it comes to
after finishing recognizing the body of the do-while, we can no longer recognize the condition because we already
we passed by there.
Instruction for
It is another variant of the while, for this case we are going to present it just as C# does.
for(<assignment>;<condition>;<update>){<body>}
variable assignment
Lreturn: if(condition) goto Lv;
goto Lf;
The update:
<update>
goto Lreturn;
<instructions>
goto Lactualizar;
<instructions>
It still seems absurd, but it is generated in that way since it is a grammar, and this compels...
generate inefficient code but when optimizing code many things can be corrected.
Switch statement
This instruction retrieves the value of a variable and evaluates it against a set of defined cases and, in the first
coincidence, execute the code associated with that case. If, at any moment, none of the cases match
An alternative case may exist that is executed at that moment.
switch(id){
case case_1:
<body_1>
case case_2:
<body_2>
case case_3:
<body_3>
.
.
.
case case_n:
<body_n>
defult:
<default_body>
}
if(id!=case_1) goto L1
if(id!=case2)goto L2;
L2:if(id!=case3)goto L3;
L3:if(id!=case4)goto Ldef;
Ledf:
<body_def>
If it were desired to execute as Java does, in that the first matching case starts executing
all bodies of each remaining case unless it is found with a brake:
goto Lx;
<body_1>
L2:<body_2>
L3:<body_3>
<body_def>
goto Lfin;
Lx:
if(id!=case_1) goto L1
if(id!=case2)goto L2;
if(id!=case3)goto L3;
if(id!=case4)goto Ldef;
Lfin:
It is expressed in that way because, as always, it is generated through a grammar and its structure.
the one described above was convenient.
Next, we will see a simple example:
Example
We want to create a loop that increments a global variable by one as long as its value is odd, if in
In some case, the value goes through '4' which converts a global boolean variable to true.
We will do it with colorsXD we will use green for the for part, blue for the if part, and black for the
restaurant
x=0;
contador=1;
n=1;
contador=1;
if(counter<100) goto L2;
goto L3;
counter + 1;
counter=t1;
goto L1;
t2=counter%2;
if(t2==0)goto L5;
goto L6;
x + 1;
x=t3;
if(x!=4) goto L7;
goto L8;
L7:n=1;
L8:n=0;
L6:
t4=y+1;
y=t4;
goto L4;
L3: