Documenti di Didattica
Documenti di Professioni
Documenti di Cultura
EMBEDDED PROGRAMS
The Program Translation Process:
Figure 5.8 highlights the role of assemblers and linkers in the compilation process. This process is often hidden
from us by compilation commands that do everything required to generate an executable program. As the
figure shows, most compilers do not directly generate machine code, but instead create the instruction-level
program in the form of human-readable assembly language.
Generating assembly language rather than binary instructions frees the compiler writer from details
extraneous to the compilation process, which includes the instruction format as well as the exact addresses of
instructions and data. The assemblers job is to translate symbolic assembly language statements into bit-level
representations of instructions known as object code. The assembler takes care of instruction formats and does
part of the job of translating labels into addresses. However, since the program may be built from many files,
the final steps in determining the addresses of instructions and data are performed by the linker, which
produces an executable binary file. That file may not necessarily be located in the CPUs memory, however,
unless the linker happens to create the executable directly in RAM. The program that brings the program into
memory for execution is called a loader.
Representation of programs:
We develop models for programs that are more general than source code. Why not use the source code
directly? First, there are many different types of source codeassembly languages, C code, and so onbut we
can use a single model to describe all of them. Once we have such a model, we can perform many useful
analyses on the model more easily than we could on the source code.
Before we are able to draw the data flow graph for this code we need to modify it slightly. There are
two assignments to the variable xit appears twice on the left side of an assignment.We need to rewrite the
code in single-assignment form, in which a variable appears only once on the left side. Since our specification
is C code, we assume that the statements are executed sequentially, so that any use of a variable refers to its
latest assigned value.
In this case, x is not reused in this block (presumably it is used elsewhere), so we just have to eliminate
the multiple assignment to x. The result is shown in Figure 5.3, where we have used the names x1 and x2 to
distinguish the separate uses of x. an introduction to the data flow graph, we use two types of nodes in the
graphround nodes denote operators and square nodes represent values.The value nodes may be either inputs
to the basic block, such as a and b,or variables assigned to within the block, such as w and x1. The data flow
graph for our single-assignment code is shown in Figure 5.4.
The data flow graph is generally drawn in the form shown in Figure 5.5. Here, the variables are not explicitly
represented by nodes. Instead, the edges are labeled with the variables they represent. As a result, a variable
can be represented by more than one edge. However, the edges are directed and all the edges for a variable
must come from a single source. We use this form for its simplicity and compactness.
Building a CDFG for a while loop is straightforward, as shown in Figure 5.7.The while loop consists of both a
test and a loop body, each of which we know how to represent in a CDFG.We can represent for loops by
remembering that, in C, a for loop is defined in terms of a while loop. The following for loop
While the CPU can work only in binary, it can do so at a very high speed. However, it is quite tedious
and slow for humans to deal with 0s and 1s in order to program the computer. A program that consists of 0s
and 1s is called machine language.
In the early days of the computer, programmers coded program in machine language. Eventually,
assembly language were developed which provides mnemonics for the machine code instructions, plus other
features which made programming faster and less prone to error. The term mnemonic is frequently use in
computer science and engineering literature to refer to codes and abbreviations that are relatively easy to
remember.
Assembly languages programs must be translated into machine code by a program called an assembler.
Assembly language is referred to as a low-level language because it deals directly with the internal structure of
the CPU. To program in assembly language, the programmer must know all the registers of the CPU and the
size of the each, as well as other details.
Today, one can use many different programming languages, such as C, C++, JAVA and numerous
others. These languages are called high-level languages because the programmer does not have to be
concerned with the internal details of the CPU. Whereas an assembler is used to translate an assembly
language program into machine code (sometimes also called object code or opcode for operation code), high
level languages are translated into machine code by a program called a compiler.
For instance, to write a program in C, one must use a C compiler to translate the program into machine
language.
ORG 0H
MOV A, #0H
; Load A Into A
ADD A,R5
; Add Contents Of R5 To A
; Now A=A+R5
ADD A,R7
; Add Contents Of R7 To A
; Now A=A=R7
ADD A,#12H
END
In the above program, it consists of a series of statements, or lines, which are assembly language
instructions such as ADD and MOV etc. or statements called directives. While instruction tell the CPU what to
do, directives (also called pseudo-instructions) give directions to the assembler.
For example, in the above program while the MOV and ADD instructions are commands to the CPU,
ORG and END are directives to assembler. ORG tells the assembler to place the opcode at memory location 0
while END indicates to the assembler the end of the source code. In other words, one is for the start of the
program and the other one for the end of the program.
An assembly language instruction consists of four fields:
[label:]
mnemonics
[operands]
[;comment]
named variables but also intermediate results such as (c - d). The code for the expression can be built by walking the
data flow graph.
The temporary variables for the intermediate values and final result have been named w, x , y , and z. To
generate code, we walk from the trees root (where z, the final result, is generated) by traversing the nodes in post
order. During the walk, we generate instructions to cover the operation at every node. The path is presented below
The nodes are numbered in the order in which code is generated. Since every node in the data flow graph
corresponds to an operation that is directly supported by the instruction set, we simply generate an instruction at
every node. Since we are making an arbitrary register assignment, we can use up the registers in order starting with
r1. The resulting ARM code follows:
; operator 1 (+)
ADR r4,a ; get address for a
MOV r1,[r4] ; load a
ADR r4,b ; get address for b
MOV r2,[r4] ; load b
ADD r3,r1,r2 ; put w into r3
; operator 2 ()
ADR r4,c ; get address for c
MOV r4,[r4] ; load c
ADR r4,d ; get address for d
MOV r5,[r4] ; load d
SUB r6,r4,r5 ; put x into r6
; operator 3 (*)
MUL r7,r6,#5 ; operator 3, puts y into r7
; operator 4 (+)
ADD r8,r7,r3 ; operator 4, puts z into r8