0% found this document useful (0 votes)
21 views55 pages

CD515

The document outlines a Compiler Design Lab course, detailing various experiments focused on implementing a lexical analyzer and parsing techniques using C programming. It includes tasks such as creating a lexical analyzer, validating identifiers, recognizing specific string patterns, and converting BNF rules into Yacc form. Each experiment is accompanied by code snippets and expected outputs, demonstrating practical applications of compiler design concepts.

Uploaded by

Rohit Jha
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
21 views55 pages

CD515

The document outlines a Compiler Design Lab course, detailing various experiments focused on implementing a lexical analyzer and parsing techniques using C programming. It includes tasks such as creating a lexical analyzer, validating identifiers, recognizing specific string patterns, and converting BNF rules into Yacc form. Each experiment is accompanied by code snippets and expected outputs, demonstrating practical applications of compiler design concepts.

Uploaded by

Rohit Jha
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd

Compiler Design Lab Ritik

BTCS604-18 2224515

I. K. GUJRAL PUNJAB TECHNICAL


UNIVERSITY KAPURTHALA

Compiler Design Lab


BTCS604-18

Submitted to :- Submitted by :-

Er. Harpreet Singh Sir Name :- Ritik

Roll No. :- 2224515

Section:- D

Semester:- 6th

1|Page
Compiler Design Lab Ritik
BTCS604-18 2224515

INDEX

[Link] Experiments Page No. Remarks

1. Design a lexical analyzer for given language 4-8


and the lexical analyzer should ignore
redundant spaces, tabs and new lines. It should
also ignore comments. Although the syntax
specification states that identifiers can be
arbitrarily long, you may restrict the length to
some reasonable value. Simulate the same in C
language.

2. Write a C program to identify whether a given 9


line is a comment or not.

3. Write a C program to recognize strings under 10-11


'a', 'a*b+', 'abb'.

4. Write a C program to test whether a given 12


identifier is valid or not.

5. Write a C program to simulate lexical analyzer 14-15


for validating operator.

6. Implement the lexical analyzer using JLex, flex 16-17


or other lexical analyzer generating tools.

7. Write a C program for implementing the 18-22


functionalities of predictive parser for the mini
language specified in Note 1.

8. a) Write a C program for constructing of LL (1) 23-31


parsing.
b) Write a C program for constructing recursive
descent parsing.

2|Page
Compiler Design Lab Ritik
BTCS604-18 2224515

9. Write a C program to implement LALR parsing. 32-36

10. a) Write a C program to implement operator 37-41


precedence parsing.
b) Write a C program to implement Program
semantic rules to calculate the expression that
takes an expression with digits, + and * and
computes the value.

11. Convert the BNF rules into Yacc form and 42-44
write code to generate abstract syntax tree for
the mini language specified in Note 1.

12. Write a C program to generate machine code 45-47


from abstract syntax tree generated by the
parser. The instruction set specified in Note 2
may be considered as the target code.

3|Page
Compiler Design Lab Ritik
BTCS604-18 2224515

Experiment – 1

Design a lexical analyzer for given language and the lexical analyzer should ignore
redundant spaces, tabs and new lines. It should also ignore comments. Although the syntax
specification states that identifiers can be arbitrarily long, you may restrict the length to
some reasonable value. Simulate the same in C language.

#include <stdio.h>
#include <ctype.h>
#include <string.h>

#define MAX_IDENTIFIER_LENGTH 32

// Define token types


typedef enum {
TOKEN_KEYWORD, TOKEN_IDENTIFIER, TOKEN_NUMBER, TOKEN_OPERATOR,
TOKEN_COMMENT, TOKEN_EOF, TOKEN_UNKNOWN
} TokenType;

// Keywords for the language


const char *keywords[] = {"int", "if", "else", "return", "while", "for"};
const int num_keywords = 6;

// Token structure to hold the type and value of each token


typedef struct {
TokenType type;
char value[MAX_IDENTIFIER_LENGTH];
} Token;

// Function to check if a word is a


keyword int is_keyword(const char
*word) {
for (int i = 0; i < num_keywords; i++)
{ if (strcmp(word, keywords[i]) == 0)
{
return 1;
}
}
return 0;
}

// Function to handle skipping whitespaces and comments


void skip_whitespace_and_comments(FILE *source) {
char ch;
while ((ch = fgetc(source)) != EOF) {
if (isspace(ch)) {
// Skip spaces, tabs, and newlines
4|Page
Compiler Design Lab Ritik
BTCS604-18 2224515
continue;

5|Page
Compiler Design Lab Ritik
BTCS604-18 2224515
} else if (ch == '/') {
// Check for comments
char next_ch = fgetc(source);
if (next_ch == '/') {
// Skip single-line comment
while ((ch = fgetc(source)) != EOF && ch != '\n');
} else if (next_ch == '*') {
// Skip multi-line comment
while ((ch = fgetc(source)) != EOF) {
if (ch == '*' && (ch = fgetc(source)) == '/') {
break;
}
}
} else {
// Not a comment, return the character
ungetc(next_ch, source);
return;
}
} else {
// Not a whitespace or comment, break
ungetc(ch, source);
return;
}
}
}

// Function to get the next token


Token get_next_token(FILE *source)
{ Token token;
[Link] = TOKEN_UNKNOWN;
[Link][0] = '\0';

char ch;
skip_whitespace_and_comments(source);

ch = fgetc(source);

// End of file
if (ch == EOF) {
[Link] = TOKEN_EOF;
return token;
}

// If it's a letter or underscore, it's an identifier or keyword


if (isalpha(ch) || ch == '_') {
int i = 0;

6|Page
Compiler Design Lab Ritik
BTCS604-18 2224515
do {
[Link][i++] = ch;
if (i >= MAX_IDENTIFIER_LENGTH) break;
ch = fgetc(source);
} while (isalnum(ch) || ch == '_');
[Link][i] = '\0';

// If it's a keyword, set the type as TOKEN_KEYWORD


if (is_keyword([Link])) {
[Link] = TOKEN_KEYWORD;
} else {
[Link] = TOKEN_IDENTIFIER;
}

ungetc(ch, source); // Put the last character back


}
// If it's a digit, it's a number
else if (isdigit(ch)) {
int i = 0;
do {
[Link][i++] = ch;
if (i >= MAX_IDENTIFIER_LENGTH) break;
ch = fgetc(source);
} while (isdigit(ch));
[Link][i] = '\0';
[Link] = TOKEN_NUMBER;

ungetc(ch, source); // Put the last character back


}
// If it's an operator
else if (ch == '+' || ch == '-' || ch == '*' || ch == '/') {
[Link] = TOKEN_OPERATOR;
[Link][0] = ch;
[Link][1] = '\0';
}
else {
[Link] = TOKEN_UNKNOWN;
[Link][0] = ch;
[Link][1] = '\0';
}

return token;
}

void print_token(Token token)


{ switch ([Link]) {

7|Page
Compiler Design Lab Ritik
BTCS604-18 2224515
case TOKEN_KEYWORD:
printf("KEYWORD: %s\n", [Link]);
break;
case TOKEN_IDENTIFIER:
printf("IDENTIFIER: %s\n", [Link]);
break;
case TOKEN_NUMBER:
printf("NUMBER: %s\n", [Link]);
break;
case TOKEN_OPERATOR:
printf("OPERATOR: %s\n", [Link]);
break;
case TOKEN_EOF:
printf("END OF FILE\n");
break;
case TOKEN_UNKNOWN:
printf("UNKNOWN: %s\n", [Link]);
break;
}
}

int main() {
FILE *source = fopen("test.c", "r");
if (!source) {
printf("Error: Could not open file\n");
return 1;
}

Token token;
do {
token = get_next_token(source);
print_token(token);
} while ([Link] != TOKEN_EOF);

fclose(source);
return 0;
}

Output:

KEYWORD: int
IDENTIFIER: main
OPERATOR:
( OPERATOR: )
OPERATOR:
{ KEYWORD: int

8|Page
Compiler Design Lab Ritik
BTCS604-18 2224515
IDENTIFIER: a
OPERATOR: =
NUMBER: 5
OPERATOR: ;
KEYWORD: if
OPERATOR:
( IDENTIFIER:
a OPERATOR:
> NUMBER: 0
OPERATOR: )
OPERATOR: {
IDENTIFIER: a
OPERATOR: =
IDENTIFIER: a
OPERATOR: +
NUMBER: 10
OPERATOR: ;
OPERATOR: }
KEYWORD: return
IDENTIFIER: a
OPERATOR: ;
OPERATOR: }

9|Page
Compiler Design Lab Ritik
BTCS604-18 2224515

Experiment – 2

Write a C program to identify whether a given line is a comment or not.

#include <stdio.h>
#include <string.h>

// Function to check if the line is a comment


int is_comment(const char *line) {
// Check for single-line comment (starts with "//")
if (line[0] == '/' && line[1] == '/') {
return 1; // It is a single-line comment
}
// Check for multi-line comment (starts with "/*")
if (line[0] == '/' && line[1] == '*' ) {
return 1; // It is the start of a multi-line comment
}
return 0; // Not a comment
}

int main() {
char line[256];

// Read a line of input


printf("Enter a line of code: ");
fgets(line, sizeof(line), stdin); // Get input with spaces

// Remove newline character at the end, if present


line[strcspn(line, "\n")] = '\0';

// Check if the line is a comment


if (is_comment(line)) {
printf("The line is a comment.\n");
} else {
printf("The line is not a comment.\n");
}

return 0;
}

Output:
Enter a line of code: // This is a comment
The line is a comment.

10 | P a g
e
Compiler Design Lab Ritik
BTCS604-18 2224515

Experiment – 3

Write a C program to recognize strings under 'a', 'a*b+', 'abb'.

#include <stdio.h>
#include <string.h>
#include <ctype.h>

// Function to check if the string matches 'a'


int match_a(const char *str) {
return (str[0] == 'a' && str[1] == '\0'); // Matches exactly 'a'
}

// Function to check if the string matches 'a*b+'


int match_a_b_plus(const char *str) {
int i = 0;
// First, check for zero or more 'a's
while (str[i] == 'a') {
i++;
}
// Then check for one or more 'b's
if (str[i] != '\0') {
while (str[i] == 'b') {
i++;
}
return str[i] == '\0'; // If we reached the end of the string, it's a match
}
return 0; // No 'b' found after 'a's
}

// Function to check if the string matches 'abb'


int match_abb(const char *str) {
return strcmp(str, "abb") == 0; // Exact match for "abb"
}

int main() {
char input[100];

// Take input from the user


printf("Enter a string: ");
fgets(input, sizeof(input), stdin);
input[strcspn(input, "\n")] = '\0'; // Remove newline character from input

// Check if it matches any of the three patterns


if (match_a(input)) {
printf("The string matches pattern 'a'\n");

11 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515
} else if (match_a_b_plus(input)) {
printf("The string matches pattern 'a*b+'\n");
} else if (match_abb(input)) {
printf("The string matches pattern 'abb'\n");
} else {
printf("The string does not match any of the patterns\n");
}

return 0;
}

Output:
Enter a string: abbbb
The string matches pattern 'a*b+'

12 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515

Experiment – 4

Write a C program to test whether a given identifier is valid or not.

#include <stdio.h>
#include <ctype.h>
#include <string.h>

#define NUM_KEYWORDS 32

// List of C keywords (for simplicity, we use a small subset)


const char *keywords[] = {
"auto", "break", "case", "char", "const", "continue", "default", "do", "double", "else",
"enum", "extern", "float", "for", "goto", "if", "inline", "int", "long", "register",
"restrict", "return", "short", "signed", "sizeof", "static", "struct", "switch", "typedef",
"union", "unsigned", "void", "volatile", "while"
};

// Function to check if the string is a C keyword


int is_keyword(const char *str) {
for (int i = 0; i < NUM_KEYWORDS; i++)
{ if (strcmp(str, keywords[i]) == 0) {
return 1; // It is a keyword
}
}
return 0; // Not a keyword
}

// Function to check if a given string is a valid


identifier int is_valid_identifier(const char *str) {
// Check if the string is empty
if (str[0] == '\0') {
return 0; // Empty string is not a valid identifier
}

// Check if the first character is a letter or underscore


if (!(isalpha(str[0]) || str[0] == '_')) {
return 0; // Invalid if first character is not a letter or underscore
}

// Check the rest of the characters (they should be alphanumeric or underscores)


for (int i = 1; str[i] != '\0'; i++) {
if (!(isalnum(str[i]) || str[i] == '_')) {
return 0; // Invalid if any character is not alphanumeric or underscore
}
}

13 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515

// Check if the identifier is a keyword


if (is_keyword(str)) {
return 0; // Keywords are not valid identifiers
}

return 1; // The identifier is valid


}

int main() {
char identifier[100];

// Take input from the user


printf("Enter an identifier: ");
fgets(identifier, sizeof(identifier), stdin);
identifier[strcspn(identifier, "\n")] = '\0'; // Remove newline character

// Check if the identifier is valid


if (is_valid_identifier(identifier)) {
printf("'%s' is a valid identifier.\n", identifier);
} else {
printf("'%s' is not a valid identifier.\n", identifier);
}

return 0;
}

Output:
Enter an identifier: myVariable

'myVariable' is a valid identifier.

14 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515

Experiment – 5

Write a C program to simulate lexical analyzer for validating operator.

#include <stdio.h>
#include <string.h>

// Function to check if a given string is a valid operator


int is_operator(const char *str) {
// List of valid operators
const char *operators[] = {
"+", "-", "*", "/", "%",
"==", "!=", "<", ">", "<=", ">=",
"&&", "||",
"=", "+=", "-=", "*=", "/=", "%=",
"&", "|", "^", "<<", ">>", "+
+", "--",
"!", "~",
"?", ":"
};

int num_operators = sizeof(operators) / sizeof(operators[0]);

// Check if the input string matches any valid operator


for (int i = 0; i < num_operators; i++) {
if (strcmp(str, operators[i]) == 0)
{ return 1; // Valid operator found
}
}

return 0; // Not a valid operator


}

int main() {
char input[100];

// Take input from the user


printf("Enter an operator: ");
fgets(input, sizeof(input), stdin);
input[strcspn(input, "\n")] = '\0'; // Remove the newline character from input

// Check if the input is a valid operator


if (is_operator(input)) {
printf("'%s' is a valid operator.\n", input);
} else {
printf("'%s' is not a valid operator.\n", input);

15 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515
}

return 0;
}

Output:

Enter an operator: +
'+' is a valid operator.

Enter an operator: ==
'==' is a valid operator.

16 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515

Experiment – 6

Implement the lexical analyzer using JLex, flex or other lexical analyzer generating tools.
/* program name is lexp.l */
%{
/* program to recognize a c program */
int COMMENT=0;
%}
identifier [a-zA-Z][a-zA-Z0-9]*
%%
#.* { printf("\n%s is a PREPROCESSOR DIRECTIVE",yytext);}
int |float |char |double |while |for |do |if |break |continue |void |switch |case |long |struct |const
|typedef |return
|else |goto {printf("\n\t%s is a KEYWORD",yytext);}
"/*" {COMMENT = 1;}
/*{printf("\n\n\t%s is a COMMENT\n",yytext);}*/
"*/" {COMMENT = 0;}
/* printf("\n\n\t%s is a COMMENT\n",yytext);}*/
{identifier}\( {if(!COMMENT)printf("\n\nFUNCTION\n\t%s",yytext);}
{ {if(!COMMENT) printf("\n BLOCK BEGINS");}
} {if(!COMMENT) printf("\n BLOCK ENDS");}
{identifier}(\[[0-9]*\])? {if(!COMMENT) printf("\n %s IDENTIFIER",yytext);}
".*\" {if(!COMMENT) printf("\n\t%s is a STRING",yytext);}
[0-9]+ {if(!COMMENT) printf("\n\t%s is a NUMBER",yytext);}
{if(!COMMENT) printf("\n\t");ECHO;printf("\n");}
( ECHO;
{if(!COMMENT)printf("\n\t%s is an ASSIGNMENT OPERATOR",yytext);}
<= |>= |< |== |> {if(!COMMENT) printf("\n\t%s is a RELATIONAL OPERATOR",yytext);}
%%
int main(int argc,char **argv)
{
if (argc > 1)
{
FILE *file;
file = fopen(argv[1],"r");
if(!file)
{
printf("could not open %s \n",argv[1]);
exit(0);
}
yyin = file;
}
yylex();
printf("\n\n");
return 0;
} int yywrap()

17 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515
{
return 0;
}
Output:
Input:

$vi var.c

#include<stdio.h>

main()

int a,b;

Output:

$lex lex.l

$cc [Link].c

$./[Link] var.c

#include<stdio.h> is a PREPROCESSOR DIRECTIVE

FUNCTION

main (

BLOCK BEGINS

int is a KEYWORD

a IDENTIFIER

b IDENTIFIER

BLOCK ENDS

18 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515

Experiment – 7

Write a C program for implementing the functionalities of predictive parser for the
mini language specified in Note 1.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

#define MAX_STACK_SIZE 100


#define MAX_TOKEN_SIZE 20

// Stack for the parsing process


char stack[MAX_STACK_SIZE];
int stackTop = -1;

// Token input stream


char tokens[MAX_STACK_SIZE][MAX_TOKEN_SIZE];
int tokenIndex = 0;
int numTokens = 0;

// Grammar symbols
#define NUM 0
#define ID 1
#define PLUS 2
#define MINUS 3
#define MUL 4
#define DIV 5
#define LPAREN 6
#define RPAREN 7
#define ASSIGN 8
#define END 9

// Function to push a symbol to the stack


void push(char c) {
stack[++stackTop] = c;
}

// Function to pop a symbol from the stack


char pop() {
return stack[stackTop--];
}

// Function to peek the top of the stack


char peek() {
return stack[stackTop];

19 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515
}

// Function to check if a string is a number


int isNumber(char *str) {
for (int i = 0; str[i] != '\0'; i++) {
if (!isdigit(str[i]))
{ return 0;
}
}
return 1;
}

// Tokenizing the input string


void tokenize(char *input) {
char *token = strtok(input, " \t\n");
while (token != NULL) {
if (isNumber(token))
{ strcpy(tokens[numTokens++],
token);
} else if (strcmp(token, "+") == 0)
{ strcpy(tokens[numTokens++],
"+");
} else if (strcmp(token, "-") == 0)
{ strcpy(tokens[numTokens++],
"-");
} else if (strcmp(token, "*") == 0)
{ strcpy(tokens[numTokens++],
"*");
} else if (strcmp(token, "/") == 0)
{ strcpy(tokens[numTokens++],
"/");
} else if (strcmp(token, "(") == 0)
{ strcpy(tokens[numTokens++],
"(");
} else if (strcmp(token, ")") == 0)
{ strcpy(tokens[numTokens++],
")");
} else if (strcmp(token, "=") == 0)
{ strcpy(tokens[numTokens++],
"=");
} else {
strcpy(tokens[numTokens++], token);
}
token = strtok(NULL, " \t\n");
}
}

20 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515
// Function to get the next token from the token list
char* getNextToken() {
if (tokenIndex < numTokens)
{ return tokens[tokenIndex+
+];
}
return NULL;
}

21 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515

// Function to match and consume the expected token


int match(char* expected) {
char* currentToken = getNextToken();
if (currentToken && strcmp(currentToken, expected) == 0)
{ return 1; // Match and consume
}
return 0; // No match
}

// Function to parse the expression


void parseE() {
printf("Parsing E -> T E'\n");
parseT();
parseEPrime();
}

// Function to parse E'


void parseEPrime() {
printf("Parsing E' -> + T E' | - T E' | ε\n");
char* token = getNextToken();
if (token != NULL && (strcmp(token, "+") == 0 || strcmp(token, "-") == 0)) {
if (match("+")) {
parseT();
parseEPrime();
} else if (match("-")) {
parseT();
parseEPrime();
}
}
}

// Function to parse T
void parseT() {
printf("Parsing T -> F T'\n");
parseF();
parseTPrime();
}

// Function to parse T'


void parseTPrime() {
printf("Parsing T' -> * F T' | / F T' | ε\n");
char* token = getNextToken();
if (token != NULL && (strcmp(token, "*") == 0 || strcmp(token, "/") == 0))
{ if (match("*")) {
parseF();

22 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515
parseTPrime();
} else if (match("/")) {
parseF();
parseTPrime();
}
}
}

// Function to parse F
void parseF() {
printf("Parsing F -> ( E ) | id | num\n");
char* token = getNextToken();
if (token != NULL && strcmp(token, "(") == 0) {
match("(");
parseE();
match(")");
} else if (token != NULL && isNumber(token)) {
match(token);
} else if (token != NULL && strcmp(token, "id") == 0) {
match("id");
}
}

int main() {
char input[200];

// Example: Expression for parsing


printf("Enter the expression: ");
fgets(input, sizeof(input), stdin);

// Tokenize the input string


tokenize(input);

// Push the start symbol to the stack (we start with E)


push('E');

// Start parsing
printf("Parsing process:\n");
while (stackTop != -1) {
char currentSymbol = peek();
char* currentToken = getNextToken();

if (currentSymbol == 'E') {
parseE();
} else if (currentSymbol == 'E\'') {
parseEPrime();

23 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515
} else if (currentSymbol == 'T') {
parseT();
} else if (currentSymbol == 'T\'') {
parseTPrime();
} else if (currentSymbol == 'F') {
parseF();
}
// Continue parsing as needed
}

return 0;
}

Output:
Enter the expression: 3 + 5 * ( 2 - 8 )
Parsing process:
Parsing E -> T E'
Parsing T -> F T'
Parsing F -> num
Parsing T' -> * F T'
Parsing F -> num
Parsing T' -> ε
Parsing E' -> + T E'
Parsing T -> F T'
Parsing F -> num
Parsing T' -> ε
Parsing E' -> ε

24 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515

Experiment – 8

a) Write a C program for constructing of LL (1) parsing.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

#define MAX_STACK_SIZE 100


#define MAX_TOKEN_SIZE 20

// Stack for the parsing process


char stack[MAX_STACK_SIZE];
int stackTop = -1;

// Token input stream


char tokens[MAX_STACK_SIZE][MAX_TOKEN_SIZE];
int tokenIndex = 0;
int numTokens = 0;

// Grammar symbols
#define NUM 0
#define ID 1
#define PLUS 2
#define MUL 3
#define LPAREN 4
#define RPAREN 5
#define EPSILON 6
#define END 7

// Function to push a symbol to the stack


void push(char c) {
stack[++stackTop] = c;
}

// Function to pop a symbol from the stack


char pop() {
return stack[stackTop--];
}

// Function to peek the top of the stack


char peek() {
return stack[stackTop];
}

25 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515
// Function to check if a string is a number
int isNumber(char *str) {
for (int i = 0; str[i] != '\0'; i++) {
if (!isdigit(str[i]))
{ return 0;
}
}
return 1;
}

// Tokenizing the input string


void tokenize(char *input) {
char *token = strtok(input, " \t\n");
while (token != NULL) {
if (isNumber(token))
{ strcpy(tokens[numTokens++],
token);
} else if (strcmp(token, "+") == 0)
{ strcpy(tokens[numTokens++],
"+");
} else if (strcmp(token, "*") == 0)
{ strcpy(tokens[numTokens++],
"*");
} else if (strcmp(token, "(") == 0)
{ strcpy(tokens[numTokens++],
"(");
} else if (strcmp(token, ")") == 0)
{ strcpy(tokens[numTokens++], ")");
} else if (strcmp(token, "id") == 0)
{ strcpy(tokens[numTokens++],
"id");
}
token = strtok(NULL, " \t\n");
}
}

// Function to get the next token from the token list


char* getNextToken() {
if (tokenIndex < numTokens)
{ return tokens[tokenIndex+
+];
}
return NULL;
}

// Function to match and consume the expected token


int match(char* expected) {

26 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515
char* currentToken = getNextToken();
if (currentToken && strcmp(currentToken, expected) == 0)
{ return 1; // Match and consume
}
return 0; // No match

27 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515
}

// Parse function for non-terminal E


void parseE();

// Parse function for non-terminal E'


void parseEPrime();

// Parse function for non-terminal T


void parseT();

// Parse function for non-terminal T'


void parseTPrime();

// Parse function for non-terminal F


void parseF();

// The LL(1) parsing function that uses a predictive table


void parse() {
push('E'); // Start with the non-terminal E

while (stackTop != -1) {


char currentSymbol = peek();
char* currentToken = getNextToken();

if (isalpha(currentSymbol) || isdigit(currentSymbol)) {
// If the current symbol on the stack is terminal and matches the token
if (currentSymbol == currentToken[0]) {
printf("Matched token: %s\n", currentToken);
pop();
} else {
printf("Error: Unexpected token %s\n", currentToken);
return;
}
} else if (currentSymbol == 'E') {
parseE();
} else if (currentSymbol == 'E\'') {
parseEPrime();
} else if (currentSymbol == 'T') {
parseT();
} else if (currentSymbol == 'T\'') {
parseTPrime();
} else if (currentSymbol == 'F') {
parseF();
}
}

28 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515
}

// Function to parse E -> T E'


void parseE() {
printf("Parsing E -> T E'\n");
parseT();
parseEPrime();
}

// Function to parse E' -> + T E' | ε


void parseEPrime() {
char* token = getNextToken();
if (token != NULL && strcmp(token, "+") == 0) {
match("+");
parseT();
parseEPrime();
} else {
printf("Parsing E' -> ε\n");
}
}

// Function to parse T -> F T'


void parseT() {
printf("Parsing T -> F T'\n");
parseF();
parseTPrime();
}

// Function to parse T' -> * F T' | ε


void parseTPrime() {
char* token = getNextToken();
if (token != NULL && strcmp(token, "*") == 0) {
match("*");
parseF();
parseTPrime();
} else {
printf("Parsing T' -> ε\n");
}
}

// Function to parse F -> ( E ) | id | num


void parseF() {
char* token = getNextToken();
if (token != NULL && strcmp(token, "(") == 0) {
match("(");
parseE();

29 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515
match(")");
} else if (token != NULL && strcmp(token, "id") == 0) {
match("id");
} else if (token != NULL && isNumber(token)) {
match("num");
}
}

int main() {
char input[200];

// Example: Expression for parsing


printf("Enter the expression: ");
fgets(input, sizeof(input), stdin);

// Tokenize the input string


tokenize(input);

// Start parsing the expression


parse();

return 0;
}

Output:
Enter the expression: id + num * ( id + num )
Parsing E -> T E'
Parsing T -> F T'
Parsing F -> id
Parsing T' -> ε
Parsing E' -> + T E'
Parsing T -> F T'
Parsing F -> num
Parsing T' -> * F T'
Parsing F ->
( Parsing E -> T E'
Parsing T -> F T'
Parsing F -> id
Parsing T' -> ε
Parsing E' -> + T E'
Parsing T -> F T'
Parsing F -> num
Parsing T' -> ε
Parsing E' -> ε
Parsing T' -> ε
Parsing E' -> ε

30 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515

b) Write a C program for constructing recursive descent parsing.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

#define MAX_STACK_SIZE 100


#define MAX_TOKEN_SIZE 20

// Global variables for input and tokens


char tokens[MAX_STACK_SIZE][MAX_TOKEN_SIZE];
int tokenIndex = 0;
int numTokens = 0;

// Current token in the input


char currentToken[MAX_TOKEN_SIZE];

// Function to check if a string is a number


int isNumber(char *str) {
for (int i = 0; str[i] != '\0'; i++) {
if (!isdigit(str[i]))
{ return 0;
}
}
return 1;
}

// Function to tokenize the input string


void tokenize(char *input) {
char *token = strtok(input, " \t\n");
while (token != NULL) {
if (isNumber(token)) {
strcpy(tokens[numTokens++], token); // num token
} else if (strcmp(token, "+") == 0 || strcmp(token, "-") == 0 ||
strcmp(token, "*") == 0 || strcmp(token, "/") == 0 ||
strcmp(token, "(") == 0 || strcmp(token, ")") == 0) {
strcpy(tokens[numTokens++], token); // Operator or Parenthesis
} else {
strcpy(tokens[numTokens++], token); // id token
}
token = strtok(NULL, " \t\n");
}
}

31 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515
// Function to get the next token from the token list
void nextToken() {
if (tokenIndex < numTokens)
{ strcpy(currentToken, tokens[tokenIndex+
+]);
} else {
strcpy(currentToken, "EOF"); // End of input
}
}

// Function to match and consume the expected token


int match(char* expected) {
if (strcmp(currentToken, expected) == 0) {
nextToken();
return 1;
}
return 0;
}

// Forward declarations for the recursive descent functions


void parseE();
void parseEPrime();
void parseT();
void parseTPrime();
void parseF();

// Parsing the expression E -> T E'


void parseE() {
printf("Parsing E -> T E'\n");
parseT(); // Parse the first term
parseEPrime(); // Parse the rest of the expression
}

// Parsing the expression E' -> + T E' | - T E' | ε


void parseEPrime() {
if (strcmp(currentToken, "+") == 0)
{ printf("Parsing E' -> + T E'\n");
match("+");
parseT();
parseEPrime();
} else if (strcmp(currentToken, "-") == 0)
{ printf("Parsing E' -> - T E'\n");
match("-");
parseT();
parseEPrime();
} else {
printf("Parsing E' -> ε\n");

32 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515

33 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515
}
}

// Parsing the term T -> F T'


void parseT() {
printf("Parsing T -> F T'\n");
parseF(); // Parse the first factor
parseTPrime(); // Parse the rest of the term
}

// Parsing the term T' -> * F T' | / F T' | ε


void parseTPrime() {
if (strcmp(currentToken, "*") == 0)
{ printf("Parsing T' -> * F T'\n");
match("*");
parseF();
parseTPrime();
} else if (strcmp(currentToken, "/") == 0)
{ printf("Parsing T' -> / F T'\n");
match("/");
parseF();
parseTPrime();
} else {
printf("Parsing T' -> ε\n");
}
}

// Parsing the factor F -> ( E ) | id | num


void parseF() {
if (strcmp(currentToken, "(") == 0)
{ printf("Parsing F -> ( E )\n");
match("(");
parseE();
match(")");
} else if (isNumber(currentToken))
{ printf("Parsing F -> num\n");
match(currentToken); // Consume number
} else if (strcmp(currentToken, "id") == 0)
{ printf("Parsing F -> id\n");
match("id"); // Consume identifier
} else {
printf("Syntax Error: Unexpected token '%s'\n", currentToken);
exit(1);
}
}
// Main parsing function

34 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515
void parse() {
nextToken(); // Start by reading the first token
parseE(); // Start parsing the expression
if (strcmp(currentToken, "EOF") != 0) {
printf("Syntax Error: Unexpected token '%s'\n", currentToken);
exit(1);
}
}
// Main function
int main() {
char input[200];
printf("Enter the expression: ");
fgets(input, sizeof(input), stdin);

// Tokenize the input string


tokenize(input);

// Start parsing the expression


parse();

printf("Parsing successful!\n");
return 0;
}

Output:
Parsing E -> T E'
Parsing T -> F T'
Parsing F -> id
Parsing T' -> ε
Parsing E' -> + T E'
Parsing T -> F T'
Parsing F -> num
Parsing T' -> * F T'
Parsing F ->
( Parsing E -> T E'
Parsing T -> F T'
Parsing F -> id
Parsing T' -> ε
Parsing E' -> - T E'
Parsing T -> F T'
Parsing F -> num
Parsing T' -> ε
Parsing E' -> ε
Parsing T' -> ε
Parsing E' -> ε
Parsing successful!

35 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515

Experiment – 9
Write a C program to implement LALR parsing.

#include <stdio.h>

#include <stdlib.h>

// Define the maximum number of states and symbols for simplicity

#define MAX_STATES 4

#define MAX_SYMBOLS 3

#define MAX_INPUT 10

// Define some constants for terminal and non-terminal symbols

enum Symbols {

T_A = 0, // Terminal symbol 'a'

T_END = 1, // End of input

symbol NT_S = 2, // Non-terminal

S NT_A = 3 // Non-terminal A

};

// Define the action table (state x terminal symbol)

int actionTable[MAX_STATES][MAX_SYMBOLS] = {

{1, 0}, // State 0: On terminal 'a', shift to state 1 (Action Table)

{0, -1}, // State 1: On terminal 'a', reduce by A -> a (Action Table)

{-1, 0}, // State 2: Accept (end of input)

{-1, -1} // State 3: Error state

};

36 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515

// Define the goto table (state x non-terminal symbol)

int gotoTable[MAX_STATES][MAX_SYMBOLS] = {

{1, 2}, // From state 0: on non-terminal A, go to state 1; on non-terminal S, go to state 2

{0, 3}, // From state 1: on non-terminal A, go to state 0; on non-terminal S, go to state 3

{0, 0}, // From state 2: no transitions on A or S

{0, 0} // From state 3: no transitions

};

// Grammar rule for S -> AA

// Grammar rule for A -> a

// Define a simple stack for the parsing process

int stack[MAX_STATES];

int stackTop = -1;

// Function to push a state onto the stack

void push(int state) {

stack[++stackTop] = state;

// Function to pop a state from the stack

int pop() {

return stack[stackTop--];

37 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515

// Function to print the current stack for debugging purposes

void printStack() {

printf("Stack: ");

for (int i = 0; i <= stackTop; i++)

{ printf("%d ", stack[i]);

printf("\n");

// Main parsing function

void parse(int* input) {

int inputPos = 0;

push(0); // Start from state 0

while (1) {

int currentState = stack[stackTop];

int lookahead = input[inputPos];

// Check the action to take from the Action Table

int action = actionTable[currentState][lookahead];

if (action > 0) { // Shift operation

printf("Shift: %d\n", lookahead);

push(action);

inputPos++;

38 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515
} else if (action < 0) { // Reduce operation (negative values)

int ruleNumber = -action;

printf("Reduce by rule: A -> a\n");

// Pop the stack based on the length of the RHS of the rule

pop(); // Pop A

pop(); // Pop A

// Go to the next state based on the GOTO table

int gotoState = gotoTable[stack[stackTop]][NT_S]; // We reduce to S

push(gotoState);

} else if (action == 0) { // Accept operation

printf("Accept\n");

break;

} else { // Error condition

printf("Syntax error!\n");

break;

printStack(); // Debugging step: print the current stack

// Main function to test the parser

int main() {

39 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515
// Test input: "a a" (which should reduce to S)

int input[] = {T_A, T_A, T_END}; // Represents the string "a a" followed by the end-of-input
symbol

// Call the parsing function

parse(input);

return 0;

Output:
Input :a a

Output: Shift: 0

Stack: 0 1

Shift: 0

Stack: 0 1 1

Reduce by rule: A -> a

Stack: 0 2

Reduce by rule: A -> a

Stack: 2

Accept

40 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515

Experiment – 10

a) Write a C program to implement operator precedence parsing.

#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>

#define MAX 100

// Precedence values for operators


#define LOWEST 0
#define EQUAL 1
#define HIGH 2

// Stack for operators


char stack[MAX];
int top = -1;

// Function to push an element onto the stack


void push(char c) {
if (top < MAX - 1) { stack[+
+top] = c;
}
}

// Function to pop an element from the stack


char pop() {
if (top == -1)
{ return -1;
}
return stack[top--];
}

// Function to get precedence of an operator


int precedence(char c) {
switch (c) {
case '+': case '-':
return 1; // Low precedence
case '*': case '/':
return 2; // High precedence
case '(': case ')':
return 0; // Parentheses have no precedence when encountered
default:
return -1; // Not an operator
}

41 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515
}

// Function to check if a character is an operator


int isOperator(char c) {
return (c == '+' || c == '-' || c == '*' || c == '/');
}

// Function to perform operator precedence parsing


void operatorPrecedenceParsing(char *expression) {
int i = 0;
char symbol, topSymbol;

// Process the input expression


while ((symbol = expression[i++]) != '\0') {
// If the symbol is an operand, just print it (we assume operand is a variable/constant)
if (isalnum(symbol)) {
printf("%c ", symbol);
}
// If the symbol is an operator
else if (isOperator(symbol)) {
while (top != -1 && precedence(stack[top]) >= precedence(symbol))
{ topSymbol = pop();
printf("%c ", topSymbol); // Print the operator popped from the stack
}
push(symbol); // Push the current operator onto the stack
}
// If the symbol is an opening parenthesis, push it onto the stack
else if (symbol == '(') {
push(symbol);
}
// If the symbol is a closing parenthesis, pop and print until an opening parenthesis is found
else if (symbol == ')') {
while ((topSymbol = pop()) != '(') {
printf("%c ", topSymbol); // Print operators
}
}
}

// Pop the remaining operators from the stack and print them
while (top != -1) {
printf("%c ", pop());
}
}

// Main function
int main() {

42 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515
char expression[MAX];

// Read an infix expression from the user


printf("Enter an infix expression: ");
fgets(expression, MAX, stdin);

// Parse the expression


printf("Postfix Expression: ");
operatorPrecedenceParsing(expression);

return 0;
}

Output:
Enter an infix expression: a + b * c
Postfix Expression: a b c * +

b) Write a C program to implement Program semantic rules to calculate the


expression that takes an expression with digits, + and * and computes the value.

#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>

int index = 0;
char expression[100];

// Function to evaluate a term (i.e., handle multiplication)


int term();

// Function to evaluate a factor (i.e., handle digits)


int factor();

// Function to evaluate an expression (handles addition and multiplication)


int expression();

// Function to process the next character and skip whitespaces


void skipWhitespace() {
while (isspace(expression[index])) {
index++;
}

// Function to evaluate an integer number (handle digits)


int number() {
int num = 0;
while (isdigit(expression[index])) {

43 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515
num = num * 10 + (expression[index] - '0');
index++;
}
return num;
}

// Function to evaluate a factor (operand)


int factor() {
skipWhitespace();
if (isdigit(expression[index]))
{ return number();
} else {
// Handling parentheses: this program doesn't support parentheses for simplicity
printf("Error: Unexpected character '%c'\n", expression[index]);
exit(1);
}
}

// Function to evaluate a term (multiplication or division)


int term() {
int result = factor();
skipWhitespace();

while (expression[index] == '*' || expression[index] == '/') {


char op = expression[index++];
skipWhitespace();
int nextFactor = factor();

if (op == '*') {
result *= nextFactor;
} else {
printf("Error: Division is not supported in this simple calculator.\n");
exit(1);
}
}

return result;
}

// Function to evaluate the expression (addition and subtraction)


int expression() {
int result = term();
skipWhitespace();

while (expression[index] == '+' || expression[index] == '-') {


char op = expression[index++];

44 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515
skipWhitespace();
int nextTerm = term();

if (op == '+') {
result += nextTerm;
} else {
result -= nextTerm;
}
}

return result;
}

int main() {
printf("Enter an expression (digits, +, *): ");
fgets(expression, sizeof(expression), stdin);

int result = expression();


printf("The result of the expression is: %d\n", result);

return 0;
}

Output:
Enter an expression (digits, +, *): 3 + 5 * 2
The result of the expression is: 13

45 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515

Experiment – 11

Convert the BNF rules into Yacc form and write code to generate abstract syntax tree for
the mini language specified in Note 1.

%{
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Abstract syntax tree (AST) node structure


typedef struct ASTNode {
int type; // Type of node (e.g., operator, number)
union {
int value; // For numbers
char* id; // For identifiers (e.g., variables)
struct {
struct ASTNode* left;
struct ASTNode* right;
} op; // For binary operators
};
} ASTNode;

// Define the types for the AST nodes


#define AST_NUM 1
#define AST_ID 2
#define AST_ADD 3
#define AST_MUL 4

// Function to create a number node


ASTNode* new_num_node(int value) {
ASTNode* node =
(ASTNode*)malloc(sizeof(ASTNode)); node->type =
AST_NUM;
node->value = value;
return node;
}

// Function to create an identifier node


ASTNode* new_id_node(char* id) {
ASTNode* node = (ASTNode*)malloc(sizeof(ASTNode));
node->type = AST_ID;
node->id = strdup(id); // Copy the identifier string
return node;
}

// Function to create an operator node


46 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515

47 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515
ASTNode* new_op_node(int op_type, ASTNode* left, ASTNode* right)
{ ASTNode* node = (ASTNode*)malloc(sizeof(ASTNode));
node->type = op_type;
node->[Link] = left;
node->[Link] = right;
return node;
}

// Function to print AST (In-order traversal)


void print_ast(ASTNode* node) {
if (!node) return;
if (node->type == AST_NUM)
{ printf("%d", node->value);
} else if (node->type == AST_ID)
{ printf("%s", node->id);
} else if (node->type == AST_ADD) {
printf("(");
print_ast(node->[Link]);
printf(" + ");
print_ast(node->[Link]);
printf(")");
} else if (node->type == AST_MUL) {
printf("(");
print_ast(node->[Link]);
printf(" * ");
print_ast(node->[Link]);
printf(")");
}
}

%}

%union
{ int
num;
char* id;
ASTNode* ast;
}

%token <num> NUMBER


%token <id> IDENTIFIER
%left '+' '-'
%left '*' '/'

%%

program:

48 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515

49 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515
expression { printf("AST for expression: "); print_ast($1); printf("\n"); }
;

expression:
expression '+' term { $$ = new_op_node(AST_ADD, $1, $3); }
| expression '*' term { $$ = new_op_node(AST_MUL, $1, $3); }
| term { $$ = $1; }
;

term:
factor { $$ = $1; }
| factor '*' term { $$ = new_op_node(AST_MUL, $1, $3); }
| factor '/' term { $$ = new_op_node(AST_MUL, $1, $3); } // Assuming no division
;

factor:
'(' expression ')' { $$ = $2; }
| NUMBER { $$ = new_num_node($1); }
| IDENTIFIER { $$ = new_id_node($1); }
;

%%

int main() {
printf("Enter an expression: ");
yyparse();
return 0;
}

int yyerror(char *s)


{ fprintf(stderr, "Error: %s\n",
s); return 0;
}

Output:

3+5*x
AST for expression: (3 + (5 * x))

50 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515

Experiment – 12

Write a C program to generate machine code from abstract syntax tree generated by the
parser. The instruction set specified in Note 2 may be considered as the target code.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Abstract syntax tree (AST) node structure


typedef struct ASTNode {
int type; // Type of node (e.g., number, identifier, operator)
union {
int value; // For numbers
char* id; // For identifiers (variables)
struct {
struct ASTNode* left;
struct ASTNode* right;
} op; // For binary operators (left and right operands)
};
} ASTNode;

// Define the types for the AST nodes


#define AST_NUM 1
#define AST_ID 2
#define AST_ADD 3
#define AST_MUL 4

// Function to create a number node


ASTNode* new_num_node(int value) {
ASTNode* node =
(ASTNode*)malloc(sizeof(ASTNode)); node->type =
AST_NUM;
node->value = value;
return node;
}

// Function to create an identifier node


ASTNode* new_id_node(char* id) {
ASTNode* node = (ASTNode*)malloc(sizeof(ASTNode));
node->type = AST_ID;
node->id = strdup(id); // Copy the identifier string
return node;
}

// Function to create an operator node


ASTNode* new_op_node(int op_type, ASTNode* left, ASTNode* right) {
51 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515

52 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515
ASTNode* node =
(ASTNode*)malloc(sizeof(ASTNode)); node->type =
op_type;
node->[Link] = left;
node->[Link] = right;
return node;
}

// Function to print the machine code


void generate_machine_code(ASTNode* node, int* reg_counter) {
if (!node) return;

// Case 1: Number (constant)


if (node->type == AST_NUM) {
printf("LOAD R%d, %d\n", (*reg_counter), node->value);
(*reg_counter)++;
}

// Case 2: Identifier (variable)


else if (node->type == AST_ID) {
printf("LOAD R%d, %s\n", (*reg_counter), node->id);
(*reg_counter)++;
}

// Case 3: Addition
else if (node->type == AST_ADD)
{ generate_machine_code(node->[Link],
reg_counter); int left_reg = (*reg_counter) - 1;
generate_machine_code(node->[Link], reg_counter);
int right_reg = (*reg_counter) - 1;
printf("ADD R%d, R%d\n", left_reg, right_reg);
}

// Case 4: Multiplication
else if (node->type == AST_MUL)
{ generate_machine_code(node->[Link],
reg_counter); int left_reg = (*reg_counter) - 1;
generate_machine_code(node->[Link], reg_counter);
int right_reg = (*reg_counter) - 1;
printf("MUL R%d, R%d\n", left_reg, right_reg);
}
}

int main() {
// Example AST: 3 + 5 * x
// Construct AST for (3 + (5 * x))
ASTNode* x = new_id_node("x");

53 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515

54 | P a g e
Compiler Design Lab Ritik
BTCS604-18 2224515
ASTNode* five = new_num_node(5);
ASTNode* three = new_num_node(3);
ASTNode* mul = new_op_node(AST_MUL, five, x);
ASTNode* add = new_op_node(AST_ADD, three, mul);

// Register counter to generate unique register numbers


int reg_counter = 1;

// Generate machine code from AST


printf("Generated Machine Code:\n");
generate_machine_code(add, &reg_counter);

return 0;
}

Output:
Input: 3 + 5 * x

Generated Machine Code:


LOAD R1, 3
LOAD R2, 5
LOAD R3, x
MUL R2, R3
ADD R1, R2

55 | P a g e

You might also like