Compiler Design — Some Important Lab Programs

6 min readNov 23, 2022

Practice some of the commonly used Programs from Compiler Design

1. To write a C Program to Compute the first and follow sets of given grammar.

Description: FIRST and FOLLOW are two functions associated with grammar that help us fill in the entries of an M-table.

FIRST ()− It is a function that gives the set of terminals that begin the strings derived from the production rule.

A symbol c is in FIRST (α) if and only if α ⇒ cβ for some sequence β of grammar symbols.

FOLLOW() — A terminal symbol a is in FOLLOW (N) if and only if there is a derivation from the start symbol S of the grammar such that S ⇒ αNαβ, where α and β are a (possibly empty) sequence of grammar symbols. In other words, a terminal c is in FOLLOW (N) if c can follow N at some point in a derivation.

#include<stdio.h>
#include<math.h>
#include<string.h>
#include<ctype.h>
#include<stdlib.h>
int n,m=0,p,i=0,j=0;
char a[10][10],f[10];
void follow(char c);
void first(char c);
int main(){
int i,z;
char c,ch;

printf("Enter the no of prooductions:\n");
scanf("%d",&n);
printf("Enter the productions:\n");
for(i=0;i<n;i++)
scanf("%s%c",a[i],&ch);
do{
m=0;
printf("Enter the elemets whose fisrt & follow is to be found:");
scanf("%c",&c);
first(c);
printf("First(%c)={",c);
for(i=0;i<m;i++)
printf("%c",f[i]);
printf("}\n");
strcpy(f," ");
//flushall();
m=0;
follow(c);
printf("Follow(%c)={",c);
for(i=0;i<m;i++)
printf("%c",f[i]);
printf("}\n");
printf("Continue(0/1)?");
scanf("%d%c",&z,&ch);
}while(z==1);
return(0);
}
void first(char c)
{
int k;
if(!isupper(c))
f[m++]=c;
for(k=0;k<n;k++)
{
if(a[k][0]==c)
{
if(a[k][2]=='$')
follow(a[k][0]);
else if(islower(a[k][2]))
f[m++]=a[k][2];
else first(a[k][2]);
}
}
}
void follow(char c)
{
if(a[0][0]==c)
f[m++]='$';
for(i=0;i<n;i++)
{
for(j=2;j<strlen(a[i]);j++)
{
if(a[i][j]==c)
{
if(a[i][j+1]!='\0')
first(a[i][j+1]);
if(a[i][j+1]=='\0' && c!=a[i][0])
follow(a[i][0]);
}
}
}
}

2. To eliminate left recursion and left factoring from the given grammar.

Description: A Grammar G (V, T, P, S) is left recursive if it has a production in the form.

A → A α |β.

The above Grammar is left recursive because the left of production is occurring at a first position on the right side of production. It can eliminate left recursion by replacing a pair of production with

A → βA′

A → αA′|ϵ

Elimination of Left Recursion

Left Recursion can be eliminated by introducing new non-terminal A such that.

This type of recursion is also called Immediate Left Recursion.

In Left Recursive Grammar, the expansion of A will generate Aα, Aαα, Aααα at each step, causing it to enter into an infinite loop.

#include<stdio.h> 
#include<string.h>
void leftfactoring(void);
void leftrecursion(void);
char gram[20],part1[20],part2[20],modifiedGram[20],newGram[20],tempGram[20]; 
int i,j=0,k=0,l=0,pos;
char input[100],ll[50],r[50],temp[10],tempprod[20],productions[25][50];
int flag=0,consumed=0,p;
int main() 
{ 
 printf("choose 1: left recursion 2. left factoring\n");
 scanf("%d",&p);
 printf("Enter Production : ");
scanf("%1s->%s",ll,r);
 switch(p)
 {
case 1: leftrecursion();
 break;
 case 2: strcpy(gram,r);
 leftfactoring();
 break;
 }
}
void leftrecursion()
{
 while(sscanf(r+consumed,"%[^|]s",temp) == 1 && consumed <= strlen(r)) 
{
 if(temp[0] == ll[0]) 
{
 flag = 1;
 sprintf(productions[i++],"%s->%s%s'\0",ll,temp+1,ll);
 }
 else
 sprintf(productions[i++],"%s'->%s%s'\0",ll,temp,ll);
 consumed += strlen(temp)+1;
 }
 if(flag == 1)
{
 sprintf(productions[i++],"%s'->?\0",ll);
 printf("The productions after eliminating Left Recursion are:\n");
 for(j=0;j<i;j++)
 printf("%s\n",productions[j]);
 }
 else
 printf("The Given Grammar has no Left Recursion");
}

void leftfactoring()
{
 printf("hello");
 for(i=0;gram[i]!='|';i++,j++) 
 part1[j]=gram[i]; 
 part1[j]='\0'; 
 for(j=++i,i=0;gram[j]!='\0';j++,i++) 
 part2[i]=gram[j]; 
 part2[i]='\0'; 
 for(i=0;i<strlen(part1)||i<strlen(part2);i++) 
 { 
 if(part1[i]==part2[i]) 
 { 
 modifiedGram[k]=part1[i]; 
 k++; 
 pos=i+1; 
 } 
 } 
 for(i=pos,j=0;part1[i]!='\0';i++,j++)
{ 
 newGram[j]=part1[i]; 
 } 
 newGram[j++]='|'; 
 for(i=pos;part2[i]!='\0';i++,j++)
{ 
 newGram[j]=part2[i]; 
 } 
 modifiedGram[k]='X'; 
 modifiedGram[++k]='\0'; 
 newGram[j]='\0'; 
 printf("\n A->%s",modifiedGram); 
 printf("\n X->%s\n",newGram); 
}

3. To check for the validity of the input string using a predictive parser

Description: A predictive parser is a recursive descent parser with no backtracking or backup. It is a top-down parser that does not require backtracking. At each step, the choice of the rule to be expanded is made upon the next terminal symbol.
Consider

A -> A1 | A2 | ... | An

If the non-terminal is to be further expanded to ‘A’, the rule is selected based on the current input symbol ‘a’ only.

Predictive Parser Algorithm :

Make a transition diagram(DFA/NFA) for every rule of grammar.
Optimize the DFA by reducing the number of states, yielding the final transition diagram.
Simulate the string on the transition diagram to parse a string.
If the transition diagram reaches an accept state after the input is consumed, it is parsed.

#include<stdio.h>
#include<conio.h>
#include<string.h>
char prol[7][10]={"S","A","A","B","B","C","C"};
char pror[7][10]={"A","Bb","Cd","aB","@","Cc","@"};
char prod[7][10]={"S->A","A->Bb","A->Cd","B->aB","B->@","C->Cc","C->@"};
char first[7][10]={"abcd","ab","cd","a@","@","c@","@"};
char follow[7][10]={"$","$","$","a$","b$","c$","d$"};
char table[5][6][10];
int numr(char c)
{
switch(c){
case 'S': return 0;
case 'A': return 1;
case 'B': return 2;
case 'C': return 3;
case 'a': return 0;
case 'b': return 1;
case 'c': return 2;
case 'd': return 3;
case '$': return 4;
}
return(2);
}
void main()
{
int i,j,k;
for(i=0;i<5;i++)
for(j=0;j<6;j++)
strcpy(table[i][j]," ");
printf("\nThe following is the predictive parsing table for the following grammar:\n"); for(i=0;i<7;i++)
printf("%s\n",prod[i]);
printf("\nPredictive parsing table is\n");
fflush(stdin);
for(i=0;i<7;i++){
k=strlen(first[i]);
for(j=0;j<10;j++)
if(first[i][j]!='@')
strcpy(table[numr(prol[i][0])+1][numr(first[i][j])+1],prod[i]);
}
for(i=0;i<7;i++){
if(strlen(pror[i])==1)
{
if(pror[i][0]=='@')
{
k=strlen(follow[i]);
for(j=0;j<k;j++)
strcpy(table[numr(prol[i][0])+1][numr(follow[i][j])+1],prod[i]);
}
}
}
strcpy(table[0][0]," ");
strcpy(table[0][1],"a");
strcpy(table[0][2],"b");
strcpy(table[0][3],"c");
strcpy(table[0][4],"d");
strcpy(table[0][5],"$");
strcpy(table[1][0],"S");
strcpy(table[2][0],"A");
strcpy(table[3][0],"B");
strcpy(table[4][0],"C");
printf("\n--------------------------------------------------------\n");
for(i=0;i<5;i++)
for(j=0;j<6;j++){
printf("%-10s",table[i][j]);
if(j==5)
printf("\n--------------------------------------------------------\n");
}
getch();
}

4. To Stimulate the calculator using Lex and YACC Tool

Description: This is a simple calculator program implemented using Lex and Yacc. The idea is simple: create a program that takes as input the text of an arithmetic expression (for instance, the string "5 + 4"), and prints out the value of that expression ("9").

There’s a lot that has to happen in order to make this work: first, the program has to split the input up into its individual pieces: this is a process called lexing, so that it can take an expression "5 + 4" and turn it into a number (5), the + operator (addition), and another number (4). Each of these is described by a regular expression codified in calculator.l.

Then, it has to put these pieces together into a coherrent interpretation: a number plus another number is the sum of the two numbers, so that 5, +, 4 is interpreted as 9. This is described by a context-free grammar codified in calculator.y.

How to run it?
To run this program, you will need lex and yacc installed. These are installed by default on MacOS, and a guide for installation on Windows can be found in this Stack Overflow answer thread.

Easy version, if make is installed:

Clone the repository
Open the project directory in your shell
Run make
Run the program using ./calculator

Slightly harder version, if make is not installed:

Clone the repository
Open the project directory in your shell
Generate the lexer, using lex calculator.l
Generate the parser, using yacc calculator.y
Compile the program, using gcc y.tab.c -ly -ll -lm -o calculator (feel free to use a different compiler)
Run the program using ./calculator

// calculator.i

%{
    #include <stdio.h>
    extern int yylval;

%}

%%
[0-9]+      { yylval = atoi(yytext); return NUMBER; }
"+"         { return PLUS; }
"-"         { return MINUS; }
"*"         { return TIMES; }
"/"|"÷"     { return DIVIDE; }
\n          { return EOL; }
[ \t\r]     {}

%%

int yywrap(void){ return 1; }

// declarations
%{
    #include <stdio.h>
    int yylex();
    int yyerror();
%}

%%
// rules
%token NUMBER PLUS MINUS TIMES DIVIDE EOL;
%left PLUS MINUS;
%left TIMES DIVIDE;
%start statements;

statements  :   statement statements
            |   statement
            ;

statement   :   expression EOL              { printf("= %d\n", $1); }

expression  :   NUMBER                      { $$ = $1; printf("number: %d\n", $$); }
            |   expression TIMES expression { $$ = $1 * $3; printf("*: %d\n", $$); }
            |   expression PLUS expression  { $$ = $1 + $3; printf("+: %d\n", $$); }
            ;

%%
// programs
#include "lex.yy.c"

int main() {
  yyparse();
  return 1;
}

Compiler Design — Some Important Lab Programs

1. To write a C Program to Compute the first and follow sets of given grammar.

2. To eliminate left recursion and left factoring from the given grammar.

3. To check for the validity of the input string using a predictive parser

4. To Stimulate the calculator using Lex and YACC Tool

Written by Praseeda Saripalle