Syllabus
SUBJECT CODE   SUBJECT TITLE     CORE/ELECTIVE   CREDITS (L T P C)
CSE 306        Compiler Design   C               3 0 2 4
Books of Study
1. Compilers: Principles, Techniques and Tools, Alfred V. Aho, Monica S. Lam, Ravi Sethi and
Jeffrey D. Ullman, 2nd Edition, Pearson Education, 2007.
Books of References
2. Optimizing Compilers for Modern Architectures: A Dependence-based Approach, Randy
Allen, Ken Kennedy, Morgan Kaufmann Publishers, 2002.
3. Advanced Compiler Design and Implementation, Steven S. Muchnick, Morgan Kaufmann
Publishers - Elsevier Science, India, Indian Reprint 200
4. Engineering a Compiler, Keith D. Cooper and Linda Torczon, Morgan Kaufmann Publishers
Elsevier Science, 2004.
5. Crafting a Compiler with C, Charles N. Fischer, Richard J. LeBlanc, Pearson Education, 2008.
The Course covers
o Compiler Basics
o Lexical Analysis
o Syntax Analysis
o Semantic Analysis
o Runtime environments
o Code Generation
o Code Optimization
Distribution of Marks
Assessment Tool        Conducting Marks   Converting Marks   Final Conversion
Internal, Theory                                             30
  Mid-term-I           25                 10
  Mid-term-II          25                 10
  CLA 1                30                 5
  CLA 2                30                 5
Internal, Practical                                          20
  Lab Performance      10                 10
  Observation note     10                 10
Total 50
Total 50
Playing it safe in CSE-306
If you follow these 4 simple rules during the class, you'll make
sure that you do well in the course:
1. Attend every theory and lab class.
2. Read the course material (textbook sections assigned +
slides).
3. Submit everything (Assignments, Quizzes, Exams) on time -
don't be late.
4. Don't cheat.
Basics of Compiler Design
Translators
What do you understand by the term translator?
A translator is a program that converts a program written in a source language into an
equivalent program in a target language.
The source language can be a low-level language like assembly language or a high-level
language like C, C++, Java, FORTRAN, and so on.
The target language can be a low-level language (assembly language) or a machine
language (the set of instructions executed directly by a CPU).
Translators
o Compiler
o Interpreter
o Assembler
o Cross-Compiler
o Language Translator / Source to source translator / Language Converter
o Language Rewriter
o Decompiler
o Compiler-Compiler
o Linker
o Loader
What is a Compiler
A program written in a high-level language is known as the source program, and the program
converted into a low-level language is known as the object (or target) program.
Interpreter
Difference between compiler and interpreter
Hybrid Compiler
A hybrid compiler combines compilation and interpretation. Java language
processors work this way: a Java source program is first compiled into an
intermediate form called bytecode, and the bytecode is then interpreted by a
virtual machine.
Other different Translators
Loader: A program that accepts linked modules as input and loads them into main memory
for execution.
It copies modules from secondary memory to main memory. It may also replace virtual
addresses with physical addresses. The functions of the linker and the loader may overlap.
Language processing system
Pre-processor: A source program may be divided into modules that are
stored in separate files; the preprocessor collects them into a single
source program. The preprocessor may also expand shorthands, called
macros, into source language statements.
E.g. #include <math.h>, #define PI 3.14
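The macro expansion described above can be illustrated with a minimal sketch. It assumes the PI macro is meant to hold 3.14; CIRCLE_AREA and circle_area are hypothetical names introduced only for this example.

```c
#include <assert.h>   /* the preprocessor replaces this line with the header's text */

#define PI 3.14                          /* object-like macro: plain textual replacement */
#define CIRCLE_AREA(r) (PI * (r) * (r))  /* function-like macro (hypothetical name) */

/* After preprocessing, the return statement reads:
       return (3.14 * (radius) * (radius));
   The compiler proper never sees the names PI or CIRCLE_AREA. */
double circle_area(double radius)
{
    assert(radius >= 0.0);
    return CIRCLE_AREA(radius);
}
```

With GCC, running the compiler with the -E option stops after preprocessing and prints the expanded source, which is a convenient way to inspect what the preprocessor did.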
Steps involved in the analysis of a source program
The linker combines the relocatable code with library files and
other relocatable object files, and the loader loads the resulting
code into memory for execution.
The main steps in compiling C programs are preprocessing, compilation, and linking.
Reference: https://jsommers.github.io/cbook/programstructure.html
Structure of Compiler
Compilers bridge the gap between high-level languages and machine
hardware.
• A compiler is responsible for:
1. Finding errors in the syntax of the program.
2. Generating correct and efficient object code.
3. Run-time organization.
4. Formatting the output according to linker or assembler conventions.
Structure of Compiler
Analysis Part
• Lexical Analysis
• Syntax Analysis
• Determines the operations implied by the source program, which are recorded in a tree structure
called the syntax tree.
• Breaks up the source code into basic pieces while storing information in the symbol table.
Synthesis Part (Back End, machine dependent)
• Code Optimization
• Code Generation
• Constructs the target code from the syntax tree and from the information in the symbol table.
• Here, code optimization improves the efficiency of the generated code so that it uses as few resources as possible.
Lexical Analysis
Lexical Analysis- Example
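As an illustration of how a scanner breaks source text into tokens, here is a minimal sketch in C. The token kinds, the Token struct, and the functions next_token and count_tokens are hypothetical names chosen for this example; a real scanner would also recognize keywords, multi-character operators, comments, and so on.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef enum { TOK_ID, TOK_NUM, TOK_OP, TOK_END } TokenKind;

typedef struct {
    TokenKind kind;
    const char *start;   /* first character of the lexeme */
    size_t length;       /* number of characters in the lexeme */
} Token;

/* Scan one token starting at *cursor and advance *cursor past it. */
Token next_token(const char **cursor)
{
    assert(cursor != NULL && *cursor != NULL);
    const char *p = *cursor;
    while (isspace((unsigned char)*p)) p++;          /* skip white space */

    Token tok = { TOK_END, p, 0 };
    if (*p == '\0') {
        /* end of input: keep TOK_END */
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        tok.kind = TOK_ID;                           /* identifier */
        while (isalnum((unsigned char)*p) || *p == '_') p++;
    } else if (isdigit((unsigned char)*p)) {
        tok.kind = TOK_NUM;                          /* integer constant */
        while (isdigit((unsigned char)*p)) p++;
    } else {
        tok.kind = TOK_OP;                           /* any other single character */
        p++;
    }
    tok.length = (size_t)(p - tok.start);
    *cursor = p;
    return tok;
}

/* Count the tokens in a source string, excluding the end marker. */
size_t count_tokens(const char *source)
{
    size_t n = 0;
    while (next_token(&source).kind != TOK_END) n++;
    return n;
}
```

For the assumed sample input "position = initial + rate * 60", the sketch yields seven tokens: three identifiers, three operators, and one number.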
Syntax Analysis
• It takes the list of tokens produced by lexical analysis.
• These tokens are then arranged in a tree-like structure (the syntax tree) that reflects
the program structure.
• Also known as parsing.
In syntax analysis, the tokens are grouped together and checked to see whether they form a valid
sequence as defined by the programming language.
A context-free grammar specifies the rules, or productions, for identifying constructs that are
valid in a programming language.
Syntax Analysis- Example
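One common way to check a token sequence against a context-free grammar is recursive descent, where each grammar rule becomes one function. The sketch below is an assumption-laden illustration, not the method from the slides: the grammar (expr -> term { '+' term }, term -> factor { '*' factor }, factor -> NUMBER | '(' expr ')') and all names are hypothetical, and it reads characters directly instead of using a separate scanner. It evaluates the expression rather than building a tree, to keep the code short.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    const char *pos;   /* next unread character */
    bool ok;           /* cleared on the first syntax error */
} Parser;

static long parse_expr(Parser *ps);

static void skip_spaces(Parser *ps)
{
    while (isspace((unsigned char)*ps->pos)) ps->pos++;
}

/* factor -> NUMBER | '(' expr ')' */
static long parse_factor(Parser *ps)
{
    skip_spaces(ps);
    if (isdigit((unsigned char)*ps->pos)) {
        long value = 0;
        while (isdigit((unsigned char)*ps->pos))
            value = value * 10 + (*ps->pos++ - '0');
        return value;
    }
    if (*ps->pos == '(') {
        ps->pos++;
        long value = parse_expr(ps);
        skip_spaces(ps);
        if (*ps->pos == ')') ps->pos++;
        else ps->ok = false;            /* missing closing parenthesis */
        return value;
    }
    ps->ok = false;                     /* neither a number nor '(' */
    return 0;
}

/* term -> factor { '*' factor } */
static long parse_term(Parser *ps)
{
    long value = parse_factor(ps);
    for (;;) {
        skip_spaces(ps);
        if (*ps->pos != '*') return value;
        ps->pos++;
        value *= parse_factor(ps);
    }
}

/* expr -> term { '+' term } */
static long parse_expr(Parser *ps)
{
    long value = parse_term(ps);
    for (;;) {
        skip_spaces(ps);
        if (*ps->pos != '+') return value;
        ps->pos++;
        value += parse_term(ps);
    }
}

/* Returns true and stores the value if the whole input is a valid expression. */
bool parse(const char *text, long *result)
{
    assert(text != NULL && result != NULL);
    Parser ps = { text, true };
    long value = parse_expr(&ps);
    skip_spaces(&ps);
    if (*ps.pos != '\0') ps.ok = false; /* trailing input that fits no rule */
    if (ps.ok) *result = value;
    return ps.ok;
}
```

The nesting of the function calls mirrors the shape of the syntax tree: because parse_term is called from parse_expr, multiplication binds tighter than addition.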
Semantic Analysis
• It validates the syntax tree by applying the semantic rules of the source language.
• It performs type checking, scope resolution, checking of variable declarations, and so on.
• It decorates the syntax tree with data types, values, and so on.
In semantic analysis, we check whether the syntactically correct statements are meaningful.
For example, the statement ‘x = y + 2;’ in the source program is not meaningful if, say, x is
the name of a function or array and y is a variable of type float. The statement is
syntactically acceptable under the productions of the context-free grammar used in syntax
analysis, but it fails semantic analysis because the types of x and y are
not compatible.
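The check described in this example can be sketched as follows. Everything here is hypothetical and simplified: the Symbol struct, the enum names, and the function assignment_is_valid are invented for illustration, and the rule that a float value may not be stored in an int target is an assumption of this sketch, not a rule of C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { KIND_VARIABLE, KIND_ARRAY, KIND_FUNCTION } SymbolKind;
typedef enum { TYPE_INT, TYPE_FLOAT } BaseType;

typedef struct {
    const char *name;
    SymbolKind kind;
    BaseType type;
} Symbol;

/* Checks a statement of the form "target = source + 2".
   The statement has already passed syntax analysis; this function
   only decides whether it is meaningful. */
bool assignment_is_valid(const Symbol *target, const Symbol *source)
{
    assert(target != NULL && source != NULL);

    /* A function or array name cannot be assigned to. */
    if (target->kind != KIND_VARIABLE) return false;

    /* The operand of '+' must be a plain variable in this sketch. */
    if (source->kind != KIND_VARIABLE) return false;

    /* Assumed rule: no implicit narrowing from float to int. */
    if (target->type == TYPE_INT && source->type == TYPE_FLOAT) return false;

    return true;
}
```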
Intermediate Code Generation
• The program is translated to a simple machine independent intermediate language.
• Register allocation of variables is done in this phase.
Code Optimization
• It aims to reduce the execution time of the program.
• It produces more efficient code.
• It is an optional phase.
Typical optimizations include:
• Removing unreachable code.
• Getting rid of unused variables.
• Eliminating multiplication by 1 and addition of 0.
• Moving statements whose results do not change inside a loop (loop-invariant code) out of the loop.
• Common sub-expression elimination.
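One of the listed optimizations, eliminating multiplication by 1 and addition of 0, can be sketched on a tiny intermediate representation. The Instr struct, the opcode names, and simplify_identities are hypothetical and exist only for this illustration; real compilers use richer instruction forms.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { OP_ADD, OP_MUL, OP_COPY } Opcode;

/* Three-address instruction of the form: dest = src op constant.
   For OP_COPY the constant is ignored and the meaning is: dest = src. */
typedef struct {
    Opcode op;
    const char *dest;
    const char *src;
    int constant;
} Instr;

/* Rewrites "src + 0" and "src * 1" into plain copies.
   Returns the number of instructions that were rewritten. */
size_t simplify_identities(Instr *code, size_t count)
{
    assert(code != NULL || count == 0);
    size_t rewritten = 0;
    for (size_t i = 0; i < count; i++) {
        int is_add_zero = code[i].op == OP_ADD && code[i].constant == 0;
        int is_mul_one  = code[i].op == OP_MUL && code[i].constant == 1;
        if (is_add_zero || is_mul_one) {
            code[i].op = OP_COPY;   /* the arithmetic has no effect */
            rewritten++;
        }
    }
    return rewritten;
}
```

A later pass could then remove the copies themselves by replacing each use of dest with src, which is why simple rewrites like this one are usually run together with other optimizations.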
Code Generation
• The target program is generated in the machine language of the target architecture.
• Memory locations are selected for each variable.
• Instructions are chosen for each operation.
• Individual tree nodes are translated into sequences of machine language instructions.
Symbol Table
• It stores the identifiers recognized during lexical analysis.
• Type and scope information is added during syntax and semantic analysis.
• It is also used for liveness analysis during optimization.
• This information is used in code generation to decide which instructions to use.
• A symbol table is a data structure used by the compiler to record and collect
information about source program constructs such as variable names, together with their
attributes (for example, the name, type, and scope of a variable, and the storage
allocated for it).
• A symbol table should be designed so that the compiler can locate the record for each
name quickly and can store and retrieve data from that record quickly.
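A hash table is one common way to meet the fast-lookup requirement. The following is a minimal sketch under stated assumptions: the names Entry, SymbolTable, symtab_insert, and symtab_lookup are hypothetical, entries come from a fixed-size pool instead of dynamic allocation, and the table must be zero-initialized before use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUCKETS 64
#define MAX_SYMBOLS 256

typedef struct Entry {
    const char *name;        /* identifier as found by the scanner */
    const char *type;        /* filled in during later analysis */
    int scope_level;         /* 0 = global, 1 = first nested block, ... */
    struct Entry *next;      /* next entry in the same bucket */
} Entry;

typedef struct {
    Entry *buckets[BUCKETS];
    Entry pool[MAX_SYMBOLS];
    size_t used;
} SymbolTable;

static size_t hash_name(const char *name)
{
    size_t h = 5381;
    while (*name) h = h * 33 + (unsigned char)*name++;
    return h % BUCKETS;
}

/* Returns the most recently inserted entry for name, or NULL. */
Entry *symtab_lookup(SymbolTable *table, const char *name)
{
    for (Entry *e = table->buckets[hash_name(name)]; e != NULL; e = e->next)
        if (strcmp(e->name, name) == 0) return e;
    return NULL;
}

/* Adds an entry at the head of its bucket, so that an inner
   declaration is found before an outer one with the same name. */
Entry *symtab_insert(SymbolTable *table, const char *name,
                     const char *type, int scope_level)
{
    assert(table->used < MAX_SYMBOLS);
    Entry *e = &table->pool[table->used++];
    size_t b = hash_name(name);
    e->name = name;
    e->type = type;
    e->scope_level = scope_level;
    e->next = table->buckets[b];
    table->buckets[b] = e;
    return e;
}
```

A table declared as SymbolTable table = {0}; starts empty. Inserting "rate" twice, first as a global float and then as an int at scope level 1, makes a lookup of "rate" return the int entry.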
Mini QUIZ-4 & 5:
https://bit.ly/2Cit03a
Error Handler
• It detects and reports errors during many phases.
• Error detection and error reporting are important functions of the compiler.
• Examples: invalid character sequences in scanning, invalid token sequences in parsing, and
type and scope errors in semantic analysis.
1. In the lexical analysis phase, errors can occur due to misspelled tokens, unrecognized
characters, etc. These are mostly typing errors.
2. In the syntax analysis phase, errors can occur due to syntactic violations of the language.
3. In the intermediate code generation phase, errors can occur due to incompatible operand
types for an operator.
4. In the code optimization phase, errors can occur during control flow analysis due to
unreachable statements.
5. In the code generation phase, errors can occur due to incompatibility with the computer
architecture during the generation of machine code. For example, a constant created by the
compiler may be too large to fit in a word of the target machine.
6. In the symbol table, errors can occur during bookkeeping due to multiple
declarations of an identifier with ambiguous attributes.
Structure of Compiler Example
Compiler Construction Tools
• Compiler construction tools were introduced after computers became widespread.
• They are also known as compiler-compilers, compiler generators, or translator-writing systems.
• These tools may use sophisticated algorithms or specialized languages for specifying and
implementing a component of the compiler.
Parser Generators: Parser generators automatically produce syntax analyzers (parsers)
from a grammatical description of a programming language. Unix has a parser generator
called YACC.
Syntax analyzers used to be among the most difficult components to develop; with these tools
they are now easier to develop and implement.
• Automatic Code Generator: This tool takes intermediate code as input and
produces machine language as output.
• It is capable of fetching data from various storage locations, such as registers, static
memory, and the stack.
• The basic technique here is ‘template matching’.
Code-generator generators produce a code generator from a collection of rules for
translating each operation of the intermediate language into the machine language for a
target machine.
PROGRAMMING LANGUAGE BASICS
The Static/Dynamic Distinction
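If a language uses a policy that allows the compiler to decide an issue, then the language is
said to use a static policy, or the issue is said to be decided at compile time. On the
other hand, a policy that only allows a decision to be made when the program executes is
said to be a dynamic policy, or to require a decision at run time.
For example, C uses static scope: the declaration that a name refers to can be determined
from the program text alone.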
Environments and States
As a program runs, changes can affect either the values of data elements or the interpretation
of the names for that data. The environment maps names to locations in the store, and the
state maps locations to their values. For example, the execution of an assignment
such as x = y + 1 changes the value denoted by the name x. More specifically, the assignment
changes the value in whatever location is denoted by x.
Example:
Output: 13 11 13 11
Static Scope and Block Structure
The scope rules for C are based on program structure; the scope of a declaration is determined implicitly by where
the declaration appears in the program. Later languages, such as C++, Java, and C#, also provide explicit control
over scopes through the use of keywords like public, private, and protected.
A block is a grouping of declarations and statements. C uses braces { and } to delimit a block; some other
languages use begin and end instead.
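Block structure and static scope can be seen in a short sketch. The function name sum_visible_x is hypothetical; the point is only that each use of x refers to the declaration in the innermost enclosing block, which can be determined by reading the program text.

```c
#include <assert.h>

int sum_visible_x(void)
{
    int x = 1;          /* outer declaration of x */
    int total = 0;
    {
        int x = 10;     /* inner declaration hides the outer x in this block */
        total += x;     /* refers to the inner x: adds 10 */
    }
    total += x;         /* the inner block has ended: refers to the outer x, adds 1 */
    assert(total == 11);
    return total;
}
```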
Explicit Access Control
Classes and structures introduce a new scope for their members. If p is an object of a class with a field (member) x, then the use
of x in p.x refers to field x in the class definition. The scope of a member declaration x in a class C extends to any subclass C',
except if C' has a local declaration of the same name x.
Through the use of keywords like public, private, and protected, object-oriented languages such as C++ or Java provide
explicit control over access to member names in a superclass. These keywords support encapsulation by restricting access.
Dynamic Scope
Technically, any scoping policy is dynamic if it is based on factor(s) that can be known only when the program executes. The
term dynamic scope, however, usually refers to the following policy: a use of a name x refers to the declaration of x in the most
recently called procedure with such a declaration.
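C itself is statically scoped, so the policy above can only be simulated in C. The sketch below is such a simulation under stated assumptions: the live declarations of the name x are kept on an explicit stack, and all function names are hypothetical. The function read_x has no declaration of x, so the value it sees depends on which procedure called it.

```c
#include <assert.h>

#define MAX_BINDINGS 16

/* Stack of live declarations of the name x; the top is the most recent. */
static int x_bindings[MAX_BINDINGS];
static int x_depth = 0;

static void declare_x(int value)
{
    assert(x_depth < MAX_BINDINGS);
    x_bindings[x_depth++] = value;
}

static void leave_scope_of_x(void)
{
    assert(x_depth > 0);
    x_depth--;
}

/* A use of x: refers to the most recently made declaration still alive. */
static int use_x(void)
{
    assert(x_depth > 0);
    return x_bindings[x_depth - 1];
}

/* Has no declaration of x of its own. */
static int read_x(void)
{
    return use_x();
}

/* Declares its own x, then calls read_x. */
static int call_with_local_x(void)
{
    declare_x(1);
    int seen = read_x();
    leave_scope_of_x();
    return seen;
}

int demo_dynamic_scope(void)
{
    declare_x(2);                            /* outermost declaration of x */
    int from_nested = call_with_local_x();   /* read_x sees x = 1 */
    int from_top = read_x();                 /* read_x sees x = 2 */
    leave_scope_of_x();
    return from_nested * 10 + from_top;
}
```

Under static scope, read_x would refer to the same declaration of x on both calls; under the simulated dynamic policy the two calls see different declarations.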
Parameter Passing Mechanisms
All programming languages have a notion of a procedure, but they can differ in how these procedures get their arguments.
In call-by-value, the actual parameter is evaluated (if it is an expression) or copied (if it is a variable). The value is placed in the
location belonging to the corresponding formal parameter of the called procedure. This method is used in C and Java.
In call-by-reference, the address of the actual parameter is passed to the callee as the value of the corresponding formal parameter.
Uses of the formal parameter in the code of the callee are implemented by following this pointer to the location indicated by the
caller. Changes to the formal parameter thus appear as changes to the actual parameter.
Call by value: when called with swap(x, y), x and y remain the same.
Call by reference: when called with swap(&x, &y), where x and y are int values, a points to x and b points
to y, so the values pointed to are swapped, not the pointers themselves.
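Call by value

    void swap(int a, int b)
    {
        int temp;
        temp = a;
        a = b;
        b = temp;
    }

    int main(void)
    {
        int x = 10, y = 5;
        swap(x, y);   /* a and b are copies of x and y */
        return 0;
    }

    Caller: x = 10, y = 5 (unchanged after the call)
    Callee: a = 10, b = 5 on entry (only these copies are swapped)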
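Call by reference

    void swap(int *a, int *b)
    {
        int temp;
        temp = *a;
        *a = *b;
        *b = temp;
    }

    int main(void)
    {
        int x = 10, y = 5;
        swap(&x, &y);   /* a holds the address of x, b the address of y */
        return 0;
    }

    Caller: x = 10 at address 1400, y = 5 at address 1500
    Callee: a = 1400, b = 1500, so *a is x and *b is y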
Aliasing
There is an interesting consequence of call-by-reference parameter passing or its simulation, as in Java, where references to
objects are passed by value. It is possible that two formal parameters can refer to the same location; such variables are said to
be aliases of one another. As a result, any two variables, which may appear to take their values from two distinct formal
parameters, can become aliases of each other.
Example: Suppose a is an array belonging to a procedure p, and p calls another procedure q(x, y) with a call q(a, a). Suppose
also that parameters are passed by value, but that array names are really references to the location where the array is stored, as in
C or similar languages. Now, x and y have become aliases of each other. The important point is that if within q there is an
assignment x[10] = 2, then the value of y[10] also becomes 2.
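The example can be written out in C as a minimal sketch. The procedure names p and q follow the text; the array size of 11 and the return values are assumptions added so that the effect can be observed.

```c
#include <assert.h>

/* x and y are passed by value, but each value is the address of an array. */
static int q(int x[], int y[])
{
    x[10] = 2;       /* a write through x ... */
    return y[10];    /* ... is visible through y when both name the same array */
}

int p(void)
{
    int a[11] = {0};         /* a[10] starts as 0 */
    int seen = q(a, a);      /* x and y become aliases of each other */
    assert(seen == 2 && a[10] == 2);
    return seen;
}
```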
Mini QUIZ-6:
https://bit.ly/2XQXbpP