
Compiler Construction
CS 606
Sohail Aslam
Lecture 1

Course Organization
▪ General course information
▪ Homework and project information

General Information
Instructor: Sohail Aslam
Lectures: 45
Text: Compilers: Principles, Techniques, and Tools, by Aho, Sethi, and Ullman

Work Distribution
▪ Theory
• homework = 20%
• exams = 40%
▪ Practice
• build a compiler = 40%

Project Implementation
▪ Source language: subset of Java
▪ Generated code: Intel x86 assembly
▪ Implementation language: C++
▪ Six programming assignments

Why Take this Course
Reason #1: understand compilers and languages
▪ understand the code structure
▪ understand language semantics
▪ understand the relation between source code and generated machine code
▪ become a better programmer

Why Take this Course
Reason #2: nice balance of theory and practice
▪ Theory
• mathematical models: regular expressions, automata, grammars, graphs
• algorithms that use these models
▪ Practice
• apply theoretical notions to build a real compiler
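
As a small illustration of how one of the models named above (a finite automaton) turns into compiler code, here is a minimal sketch of a recognizer for identifiers and integer literals. The token classes, state names, and the functions `step` and `classify` are hypothetical and chosen only for this example; they are not taken from the course project.

```cpp
#include <cctype>
#include <string>

// Hypothetical token classes for this sketch.
enum class Token { Identifier, Integer, Invalid };

// DFA states: InIdent and InInt are accepting, Dead is the error sink.
enum class State { Start, InIdent, InInt, Dead };

// One transition of the automaton on input character c.
State step(State s, char c) {
    const unsigned char u = static_cast<unsigned char>(c);
    switch (s) {
    case State::Start:
        if (std::isalpha(u) || c == '_') return State::InIdent;
        if (std::isdigit(u)) return State::InInt;
        return State::Dead;
    case State::InIdent:
        return (std::isalnum(u) || c == '_') ? State::InIdent : State::Dead;
    case State::InInt:
        return std::isdigit(u) ? State::InInt : State::Dead;
    case State::Dead:
        return State::Dead;
    }
    return State::Dead;
}

// Run the automaton over the whole string and map the final state to a token.
Token classify(const std::string& text) {
    State s = State::Start;
    for (char c : text) s = step(s, c);
    if (s == State::InIdent) return Token::Identifier;
    if (s == State::InInt) return Token::Integer;
    return Token::Invalid;
}
```

For instance, `classify("count1")` yields `Token::Identifier`, while `classify("1count")` falls into the dead state and yields `Token::Invalid`.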

Why Take this Course
Reason #3: programming experience
▪ write a large program that manipulates complex data structures
▪ learn more about C++ and Intel x86

What Are Compilers?
▪ Translate information from one representation to another
▪ Usually, the information is a program

Examples
▪ Typical compilers:
• VC, VC++, GCC, javac
• FORTRAN, Pascal, VB(?)
▪ Translators:
• Word to PDF
• PDF to PostScript

In This Course
We will study typical compilation:
▪ from programs written in high-level languages to low-level object code and machine code

Typical Compilation
High-level source code → Compiler → Low-level machine code

Source Code
int expr( int n )
{
    int d;
    d = 4*n*n*(n+1)*(n+1);
    return d;
}

Source Code
▪ Optimized for human readability
▪ Matches human notions of grammar
▪ Uses named constructs such as variables and procedures

Assembly Code
.globl _expr
_expr:
    pushl %ebp
    movl %esp,%ebp
    subl $24,%esp
    movl 8(%ebp),%eax
    movl %eax,%edx
    leal 0(,%edx,4),%eax
    movl %eax,%edx
    imull 8(%ebp),%edx
    movl 8(%ebp),%eax
    incl %eax
    imull %eax,%edx
    movl 8(%ebp),%eax
    incl %eax
    imull %eax,%edx
    movl %edx,-4(%ebp)
    movl -4(%ebp),%edx
    movl %edx,%eax
    jmp L2
    .align 4
L2:
    leave
    ret

Assembly Code
▪ Optimized for hardware
▪ Consists of machine instructions
▪ Uses registers and unnamed memory locations
▪ Much harder for humans to understand
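
To make "registers and unnamed memory locations" concrete, the following sketch traces the body of the `_expr` listing one instruction at a time in C++. It assumes the instruction order obtained by reading the slide's two columns top to bottom, left column first; the variables `eax`, `edx`, and `slot` (standing in for `-4(%ebp)`) and the function name `expr_trace` are hypothetical names used only here. The stack-frame prologue and epilogue are omitted, and inputs are assumed small enough that the arithmetic does not overflow.

```cpp
#include <cstdint>

// The source-level function from the Source Code slide.
int expr(int n) {
    int d;
    d = 4 * n * n * (n + 1) * (n + 1);
    return d;
}

// Hand-written trace of the listing body: one statement per instruction.
int expr_trace(int n) {
    std::int32_t eax, edx, slot;
    eax = n;          // movl 8(%ebp),%eax      load argument n
    edx = eax;        // movl %eax,%edx
    eax = edx * 4;    // leal 0(,%edx,4),%eax   4*n via address arithmetic
    edx = eax;        // movl %eax,%edx
    edx = edx * n;    // imull 8(%ebp),%edx     4*n*n
    eax = n;          // movl 8(%ebp),%eax
    eax = eax + 1;    // incl %eax              n+1
    edx = edx * eax;  // imull %eax,%edx        4*n*n*(n+1)
    eax = n;          // movl 8(%ebp),%eax
    eax = eax + 1;    // incl %eax              n+1, recomputed
    edx = edx * eax;  // imull %eax,%edx        4*n*n*(n+1)*(n+1)
    slot = edx;       // movl %edx,-4(%ebp)     store d on the stack
    edx = slot;       // movl -4(%ebp),%edx     reload d
    eax = edx;        // movl %edx,%eax         return value goes in eax
    return eax;
}
```

Note how the named variable `d` survives only as the stack slot, and how `n+1` is computed twice because nothing in this unoptimized code remembers the first result.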

How to Translate
Correctness: the generated machine code must execute precisely the same computation as the source code

How to Translate
▪ Is there a unique translation? No!
▪ Is there an algorithm for an “ideal translation”? No!
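
A minimal sketch of why the translation is not unique, using the expression from the Source Code slide: both functions below compute the same value, but the second does so with fewer multiplications. The names `expr_direct`, `expr_factored`, and `agree_up_to` are hypothetical, and the comparison assumes inputs small enough that `int` arithmetic does not overflow.

```cpp
// Straightforward evaluation, left to right, as written in the source.
int expr_direct(int n) {
    return 4 * n * n * (n + 1) * (n + 1);
}

// An alternative evaluation: compute n*(n+1) once, double it, and square.
int expr_factored(int n) {
    int t = n * (n + 1);  // shared subexpression
    int u = t + t;        // 2*n*(n+1)
    return u * u;         // (2*n*(n+1))^2 == 4*n*n*(n+1)*(n+1)
}

// Hypothetical helper: true if both versions agree on every n in 0..limit.
bool agree_up_to(int limit) {
    for (int n = 0; n <= limit; ++n) {
        if (expr_direct(n) != expr_factored(n)) return false;
    }
    return true;
}
```

Checking a finite range such as `agree_up_to(100)` is evidence, not proof; the correctness requirement applies to every input the source program accepts.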

How to Translate
▪ Translation is a complex process
▪ Source language and generated code are very different
▪ Need to structure the translation
